Homework 2.1
In this problem, you will be using a
collection of student scores that is similar to what we used in the
lessons. Please download grades.json from the Download Handout link and
import it into your local mongo database as follows:
mongoimport -d students -c grades < grades.json
The dataset contains 4 scores for each of 200 students.
First, let’s confirm your data is intact; the number of documents should be 800.
> use students
> db.grades.count()
800
This next query, which uses the aggregation framework that we have not taught yet, will tell you the student_id with the highest average score:
> db.grades.aggregate(
{'$group':{'_id':'$student_id', 'average':{$avg:'$score'}}}
,{'$sort':{'average':-1}}, {'$limit':1})
Note: Aggregation requires MongoDB 2.2 or above. The answer, deep in the resulting document, should be student_id 164 with an average of approximately 89.3.
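Since the aggregation framework has not been taught at this point, it may help to see what the pipeline computes. The following is only an in-memory sketch of the same three stages ($group with $avg, $sort, $limit); the function name top_average and the sample documents are made up for illustration and are not taken from grades.json.

```python
from collections import defaultdict


def top_average(grades):
    """Return (student_id, average) for the student with the highest mean score."""
    scores = defaultdict(list)
    for doc in grades:
        # Equivalent of $group on student_id: collect each student's scores.
        scores[doc["student_id"]].append(doc["score"])
    # Equivalent of $avg: one mean per student.
    averages = {sid: sum(vals) / len(vals) for sid, vals in scores.items()}
    # Equivalent of $sort by average descending followed by $limit 1.
    best = max(averages, key=averages.get)
    return best, averages[best]


# Made-up sample documents (hypothetical data).
sample = [
    {"student_id": 1, "type": "exam", "score": 80.0},
    {"student_id": 1, "type": "quiz", "score": 90.0},
    {"student_id": 2, "type": "exam", "score": 70.0},
    {"student_id": 2, "type": "quiz", "score": 60.0},
]
print(top_average(sample))
```

On the real collection the server does this work; the sketch only mirrors the logic on a Python list.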
Now it’s your turn to analyze the data set. Find all exam scores greater than or equal to 65, and sort those scores from lowest to highest. What is the student_id of the lowest exam score above 65?
Homework 2.2
Write a program in the language of your choice that will remove the grade of type "homework" with the lowest score for each student from the dataset that you imported in HW 2.1. Since each document is one grade, it should remove one document per student.
Hint/spoiler: If you select homework grade-documents, sort by student and then by score, you can iterate through and find the lowest score for each student by noticing a change in student id. As you notice that change of student_id, remove the document.
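The hint above can be sketched without a database. This is a minimal illustration that works on a plain list of dictionaries; the function name lowest_homework_ids and the sample documents are hypothetical, and the sort here is ascending by score so that the first document seen for each student is the one to remove.

```python
def lowest_homework_ids(grades):
    """Return the _id of the lowest-scoring homework document for each student."""
    homework = [doc for doc in grades if doc["type"] == "homework"]
    # Sort by student, then by score ascending, so the first document seen
    # for each student is that student's lowest homework score.
    homework.sort(key=lambda doc: (doc["student_id"], doc["score"]))
    to_remove = []
    previous_student = None
    for doc in homework:
        if doc["student_id"] != previous_student:
            # The student_id changed: this is the new student's lowest score.
            to_remove.append(doc["_id"])
            previous_student = doc["student_id"]
    return to_remove


# Made-up sample documents (hypothetical data).
sample = [
    {"_id": "a", "student_id": 0, "type": "homework", "score": 14.8},
    {"_id": "b", "student_id": 0, "type": "homework", "score": 63.9},
    {"_id": "c", "student_id": 0, "type": "exam", "score": 54.6},
    {"_id": "d", "student_id": 1, "type": "homework", "score": 44.3},
    {"_id": "e", "student_id": 1, "type": "homework", "score": 21.3},
]
print(lowest_homework_ids(sample))
```

Against a real collection, each returned _id would then be passed to a removal call, one document per student.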
Initially, the collection will have 800 documents. To confirm you are on the right track, here are some queries to run after you process the data and put it into the grades collection:
***********************
import pymongo
import sys

# Copyright 2014
# Author: JJFT

# connect to the db on standard port
connection = pymongo.Connection("mongodb://localhost", safe=True)
db = connection.students      # attach to db
collection = db.grades        # specify the collection

print db
print collection

query = {'type': 'homework'}

try:
    # Sorted by student, then by score descending, so the second homework
    # seen for a student is the lower of the two.
    cursor = collection.find(query).sort([('student_id', pymongo.ASCENDING),
                                          ('score', pymongo.DESCENDING)])
    estAnt = ""
    scoAnt = 0
    for item in cursor:
        estudiante = item['student_id']
        score = item['score']
        if (estAnt == estudiante and score < scoAnt):
            print(estudiante, ',', scoAnt, ',', item["_id"])
            db.grades.remove({"_id": item["_id"]})
            estAnt = ""
            scoAnt = 0
        else:
            estAnt = estudiante
            scoAnt = score
except:
    # print ("Error trying to read collection:" + sys.exc_info()[0])
    print ("Error trying to read collection:")

Let us count the number of grades we have:
> db.grades.count()
600
Now let us find the student who holds the 101st best grade across all grades:
> db.grades.find().sort({'score':-1}).skip(100).limit(1)
{ "_id" : ObjectId("50906d7fa3c412bb040eb709"),
"student_id" : 100,
"type" : "homework",
"score" : 88.50425479139126 }
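The sort/skip/limit chain in that query is a common way to pick the n-th item of a ranking. As a rough in-memory illustration (the function name nth_best and the sample data are made up), skipping 100 documents and taking one corresponds to slicing a sorted list at index 100:

```python
def nth_best(grades, n):
    """Return the document with the n-th highest score (n starts at 1), or None."""
    # Equivalent of sort({'score': -1}).
    ranked = sorted(grades, key=lambda doc: doc["score"], reverse=True)
    # Equivalent of skip(n - 1).limit(1).
    page = ranked[n - 1:n]
    return page[0] if page else None


# Made-up sample: 200 documents with scores 0.0 through 199.0 (hypothetical data).
sample = [{"student_id": i, "score": float(i)} for i in range(200)]
print(nth_best(sample, 101))
```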
Now let us sort the students by student_id and score, while also displaying the type, and then see what the top five docs are:
> db.grades.find({},{'student_id':1, 'type':1, 'score':1, '_id':0})
.sort({'student_id':1, 'score':1}).limit(5)
{ "student_id" : 0, "type" : "quiz", "score" : 31.95004496742112 }
{ "student_id" : 0, "type" : "exam", "score" : 54.6535436362647 }
{ "student_id" : 0, "type" : "homework", "score" : 63.98402553675503 }
{ "student_id" : 1, "type" : "homework", "score" : 44.31667452616328 }
{ "student_id" : 1, "type" : "exam", "score" : 74.20010837299897 }
To verify that you have completed this task correctly, provide the identity of the student with the highest average in the class, using the following query that uses the aggregation framework. The answer will appear in the _id field of the resulting document.
> db.grades.aggregate(
{'$group':{'_id':'$student_id', 'average':{$avg:'$score'}}}
,{'$sort':{'average':-1}}, {'$limit':1})
Enter the student ID below. Please enter just the number, with no spaces, commas or other characters.
54
Homework 2.3
Blog User Sign-up and Login
Download hw2-3.zip or hw2-3.tar and unpack. You should see three files at the highest level: blog.py, userDAO.py and sessionDAO.py. There is also a views directory which contains the templates for the project.
The project roughly follows the model/view/controller paradigm. userDAO.py and sessionDAO.py comprise the model. blog.py is the controller. The templates comprise the view.
If everything is working properly, you should be able to start the blog by typing:
python blog.py
Note that this project requires the following Python modules to be installed on your computer: cgi, hmac, re, datetime, random, json, sys, string, hashlib, bson, urllib, urllib2, pymongo, and bottle. A typical Python installation will already have most of these installed except pymongo and bottle.
If you have python-setuptools installed, the command "easy_install" makes this simple. Any other missing packages will show up when validate.py is run, and can be installed in a similar fashion.
$ easy_install pymongo bottle
If you go to http://localhost:8082, you should see the message “this is a placeholder for the blog”.
Here are some URLs that must work when you are done.
http://localhost:8082/signup
http://localhost:8082/login
http://localhost:8082/logout
When you log in or sign up, the blog will redirect to http://localhost:8082/welcome, and that page must work properly, welcoming the user by username.
We have removed two pymongo statements from userDAO.py and marked the area where you need to work with XXX. You should not need to touch any other code. The pymongo statements that you are going to add will add a new user upon sign-up and validate a login by retrieving the right user document.
The blog stores its data in the blog database in two collections, users and sessions. Here are two example docs for a username ‘erlichson’ with password ‘fubar’. You can insert these if you like, but you don’t need to.
> db.users.find()
{ "_id" : "erlichson",
"password" : "d3caddd3699ef6f990d4d53337ed645a3804fac56207d1b0fa44544db1d6c5de,YCRvW" }
>
> db.sessions.find()
{ "_id" : "wwBfyRDgywSqeFKeQMPqVHPizaWqdlQJ",
"username" : "erlichson" }
>
*************************
# Validates a user login. Returns user record or None
def validate_login(self, username, password):

    user = None
    try:
        # XXX HW 2.3 Students Work Here
        # you will need to retrieve right document from the users collection.
        # find_one returns None when no document matches the _id.
        user = self.users.find_one({'_id': username})
        print "This space intentionally left blank."
    except:
        print "Unable to query database for user"

    if user is None:
        print "User not in database"
        return None

    salt = user['password'].split(',')[1]

    if user['password'] != self.make_pw_hash(password, salt):
        print "user password is not a match"
        return None

    # Looks good
    return user

# creates a new user in the users collection
def add_user(self, username, password, email):
    password_hash = self.make_pw_hash(password)

    user = {'_id': username, 'password': password_hash}
    if email != "":
        user['email'] = email

    try:
        # XXX HW 2.3 Students work here
        # You need to insert the user into the users collection.
        # Don't over think this one, it's a straight forward insert.
        self.users.insert(user)
        print "This space intentionally left blank."
    # DuplicateKeyError is a subclass of OperationFailure, so it is caught first.
    except pymongo.errors.DuplicateKeyError as e:
        print "oops, username is already taken"
        return False
    except pymongo.errors.OperationFailure:
        print "oops, mongo error"
        return False

    return True
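validate_login relies on the stored password having the form "hash,salt": it splits off the salt, re-hashes the submitted password with it, and compares the result. The body of make_pw_hash is not shown in this post, so the version below is only a sketch under assumptions: SHA-256 as the hash, a five-letter random salt, and the helper names make_salt and password_matches are all hypothetical.

```python
import hashlib
import random
import string


def make_salt(length=5):
    """Return a random string of ASCII letters to use as a salt (hypothetical helper)."""
    return "".join(random.choice(string.ascii_letters) for _ in range(length))


def make_pw_hash(password, salt=None):
    """Return 'hexdigest,salt'. SHA-256 is an assumption made for this sketch."""
    if salt is None:
        salt = make_salt()
    digest = hashlib.sha256((password + salt).encode("utf-8")).hexdigest()
    return digest + "," + salt


def password_matches(stored, candidate):
    """Re-hash the candidate with the stored salt and compare, as validate_login does."""
    salt = stored.split(",")[1]
    return stored == make_pw_hash(candidate, salt)


stored = make_pw_hash("fubar")
print(password_matches(stored, "fubar"))
print(password_matches(stored, "wrong"))
```

Keeping the salt next to the hash is what lets the login check recompute the same value without storing the password itself.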
Once you have the project working, the following steps should work:
- go to http://localhost:8082/signup
- create a user
- go to http://localhost:8082/logout
- now log in at http://localhost:8082/login
OK, now it’s time to validate that you got it all working.
There was one additional program that should have been downloaded in the project called validate.py.
python validate.py
For those who are using MongoHQ, MongoLab or want to run the webserver on a different host or port, there are some options to the validation script that can be exposed by running
python validate.py -help
If you got it right, it will provide a validation code for you to enter into the box below. Enter just the code, no spaces. Note that your blog must be running when you run the validator.