Sunday, November 23, 2014

MongoDB for developers 3/8. Schema design. Homeworks

Homework 3.1

Download the students.json file from the Download Handout link and import it into your local Mongo instance with this command:
 
$ mongoimport --host localhost --port 27017 --db school --collection students \
    --file "students.json" --drop --stopOnError
or
$ mongoimport -d school -c students < students.json

This dataset holds the same type of data as last week's grade collection, but it's modeled differently. You might want to start by inspecting it in the Mongo shell.
Write a program in the language of your choice that will remove the lowest homework score for each student. Since there is a single document for each student containing an array of scores, you will need to update the scores array and remove the homework.
Remember, just remove a homework score. Don't remove a quiz or an exam!
Hint/spoiler: With the new schema, this problem is a lot harder and that is sort of the point. One way is to find the lowest homework in code and then update the scores array with the low homework pruned. 

import pymongo
import sys


# Copyright 2014
# Author: JJFT

# connect to the db on the standard port
# (MongoClient acknowledges writes by default, so safe=True is not needed)
connection = pymongo.MongoClient("mongodb://localhost")

db = connection.school           # attach to the db
collection = db.students         # specify the collection

# only students that have at least one homework score
query = {'scores.type': 'homework'}

try:
    cursor = collection.find(query)
    for doc in cursor:
        idd = doc['_id']
        # find the lowest homework score; None means "none seen yet"
        lowest = None
        for scr in doc['scores']:
            if scr['type'] == "homework":
                if lowest is None or scr['score'] < lowest:
                    lowest = scr['score']
        print idd, lowest
        # match on type as well as score, so that a quiz or an exam with
        # the same score is never pulled
        cond = {'_id': idd}
        update = {'$pull': {'scores': {'type': "homework", 'score': lowest}}}
        collection.update(cond, update, multi=False)

except Exception:
    print "Error trying to update collection:", sys.exc_info()[0]
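
The hint's other route, computing the pruned array in code before writing it back, can be tried without a database. What follows is only a sketch: `drop_lowest_homework` is a hypothetical helper name, the sample scores are made up, and the commented-out `$set` update at the end is an assumption about how the result might be written back.

```python
def drop_lowest_homework(scores):
    """Return a copy of scores without the single lowest homework entry.

    Quizzes and exams are never removed. If there is no homework at all,
    an unchanged copy is returned.
    """
    homework = [s for s in scores if s['type'] == 'homework']
    if not homework:
        return list(scores)
    lowest = min(homework, key=lambda s: s['score'])
    pruned = list(scores)
    pruned.remove(lowest)      # removes only the first entry equal to lowest
    return pruned


# made-up sample document, shaped like the ones in students.json
student = {
    '_id': 0,
    'scores': [
        {'type': 'exam', 'score': 10.0},
        {'type': 'quiz', 'score': 20.0},
        {'type': 'homework', 'score': 30.0},
        {'type': 'homework', 'score': 70.0},
    ],
}

new_scores = drop_lowest_homework(student['scores'])
print(new_scores)

# Assumed write-back with pymongo (not run here):
# collection.update({'_id': student['_id']}, {'$set': {'scores': new_scores}})
```

Unlike `$pull`, which removes every array element matching its condition, this version drops exactly one entry even when two homeworks share the lowest score.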

To confirm you are on the right track, here are some queries to run after you process the data with the correct answer shown:
Let us count the number of students we have:
 
> use school
> db.students.count() 
200

Let's see what Tamika Schildgen's record looks like:
 
> db.students.find( { _id : 137 } ).pretty( )
{
 "_id" : 137,
 "name" : "Tamika Schildgen",
 "scores" : [
  {
   "type" : "exam",
   "score" : 4.433956226109692
  },
  {
   "type" : "quiz",
   "score" : 65.50313785402548
  },
  {
   "type" : "homework",
   "score" : 89.5950384993947
  }
 ]
}
To verify that you have completed this task correctly, provide the identity (in the form of their _id) of the student with the highest average in the class, using the following query, which relies on the aggregation framework. The answer will appear in the _id field of the resulting document.
 
> db.students.aggregate(
     { '$unwind' : '$scores' }
   , { '$group' : { '_id' : '$_id' , 'average' : { $avg : '$scores.score' } } }
   , { '$sort' : { 'average' : -1 } } , { '$limit' : 1 } ) 
 
Answer: 13
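
To see what that pipeline computes, here is a rough in-memory equivalent in Python. It is an illustration only: `highest_average` is a hypothetical helper and the two sample students are made up, so its result is not the homework answer.

```python
def highest_average(students):
    """Mimic $unwind + $group with $avg + $sort (-1) + $limit (1)."""
    averages = []
    for doc in students:
        values = [s['score'] for s in doc['scores']]    # $unwind, then collect
        if values:                                       # empty arrays yield nothing
            averages.append({'_id': doc['_id'],
                             'average': sum(values) / float(len(values))})
    averages.sort(key=lambda d: d['average'], reverse=True)    # $sort: -1
    return averages[0] if averages else None                    # $limit: 1


# made-up sample data
students = [
    {'_id': 1, 'scores': [{'type': 'exam', 'score': 50.0},
                          {'type': 'quiz', 'score': 70.0}]},
    {'_id': 2, 'scores': [{'type': 'exam', 'score': 90.0},
                          {'type': 'quiz', 'score': 80.0}]},
]

best = highest_average(students)
print(best['_id'])
```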

Homework 3.2

Making your blog accept posts
In this homework you will be enhancing the blog project to insert entries into the posts collection. After this, the blog will work. It will allow you to add blog posts with a title, body and tags and have them added to the posts collection properly.
We have provided the code that creates users and allows you to log in (the assignment from last week). To get started, please download hw3-2and3-3.zip from the Download Handout link and unpack it. You will be using these files for this homework and for HW 3.3.
The areas where you need to add code are marked with XXX. You need only touch the BlogPostDAO.py file. There are three locations for you to add code for this problem. Scan that file for XXX to see where to work. 


    # inserts the blog entry and returns a permalink for the entry
    def insert_entry(self, title, post, tags_array, author):
        print "inserting blog entry", title, post

        # fix up the permalink to not include whitespace

        exp = re.compile('\W') # match anything not alphanumeric
        whitespace = re.compile('\s')
        temp_title = whitespace.sub("_",title)
        permalink = exp.sub('', temp_title)

        # Build a new post
        post = {"title": title,
                "author": author,
                "body": post,
                "permalink":permalink,
                "tags": tags_array,
                "comments": [],
                "date": datetime.datetime.utcnow()}

        # now insert the post
        try:
            # XXX HW 3.2 Work Here to insert the post
            self.posts.insert(post)
            print "Inserting the post"
        except:
            print "Error inserting post"
            print "Unexpected error:", sys.exc_info()[0]

        return permalink

    # returns an array of num_posts posts, reverse ordered
    def get_posts(self, num_posts):

        cursor = []         # Placeholder so blog compiles before you make your changes

        # XXX HW 3.2 Work here to get the posts
        # sort newest first (-1 is descending) so the posts come back reverse ordered
        cursor = self.posts.find().sort('date', -1).limit(num_posts)

        l = []

        for post in cursor:
            post['date'] = post['date'].strftime("%A, %B %d %Y at %I:%M%p") # fix up date
            if 'tags' not in post:
                post['tags'] = [] # fill it in if its not there already
            if 'comments' not in post:
                post['comments'] = []

            l.append({'title':post['title'], 'body':post['body'], 'post_date':post['date'],
                      'permalink':post['permalink'],
                      'tags':post['tags'],
                      'author':post['author'],
                      'comments':post['comments']})

        return l

    # find a post corresponding to a particular permalink
    def get_post_by_permalink(self, permalink):

        post = None
        # XXX Work here to retrieve the specified post
        post = self.posts.find_one({'permalink':permalink})

        if post is not None:
            # fix up date
            post['date'] = post['date'].strftime("%A, %B %d %Y at %I:%M%p")

        return post
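
The two small transformations in this file, building the permalink and formatting the date, can be tried on their own. This is a minimal sketch: `make_permalink` and `format_post_date` are hypothetical helper names that repeat the expressions used in insert_entry and get_posts, and the sample inputs are for illustration only.

```python
import datetime
import re


def make_permalink(title):
    """Turn a post title into a permalink, the way insert_entry does."""
    whitespace = re.compile(r'\s')
    non_word = re.compile(r'\W')      # anything that is not a letter, digit or _
    return non_word.sub('', whitespace.sub('_', title))


def format_post_date(date):
    """Format a datetime the way get_posts does for display."""
    return date.strftime("%A, %B %d %Y at %I:%M%p")


print(make_permalink("This is a blog post title"))   # This_is_a_blog_post_title
print(make_permalink("Hello, World!"))               # Hello_World
print(format_post_date(datetime.datetime(2012, 11, 10, 6, 42, 55)))
```

Note that titles differing only in punctuation, such as "Hello, World!" and "Hello World", end up with the same permalink under this scheme.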

As a reminder, to run your blog you type
python blog.py
To play with the blog you can navigate to the following URLs
http://localhost:8082/
http://localhost:8082/signup
http://localhost:8082/login
http://localhost:8082/newpost
You will be proving that it works by running our validation script as follows:
python validate.py
You need to run this in a separate terminal window while your blog is running and while the database is running. It makes connections to both to determine if your program works properly. Validate connects to localhost:8082 and expects that mongod is running on localhost on port 27017.
As before, validate will take some optional arguments if you want to run mongod on a different host or use an external webserver.
This project requires Python 2.7. The code is not 3.0 compliant.
Ok, once you get the blog posts working, validate.py will print out a validation code for HW 3.2.
Please enter it below, exactly as shown with no spaces.

Homework 3.3

Making your blog accept comments
In this homework you will add code to your blog so that it accepts comments. You will be using the same code as you downloaded for HW 3.2.
Once again, the area where you need to work is marked with an XXX in the BlogPostDAO.py file. There is one location. You don't need to figure out how to retrieve comments for this homework because the code you wrote in 3.2 already pulls the entire blog post (unless you specifically projected to eliminate the comments) and we gave you the code that pulls them out of the JSON document.
This assignment has fairly little code, but it's a little more subtle than the previous assignment because you are going to be manipulating an array within the Mongo document. For the sake of clarity, here is a document out of the posts collection from a working project.
 {
 "_id" : ObjectId("509df76fbcf1bf5b27b4a23e"),
 "author" : "erlichson",
 "body" : "This is a blog entry",
 "comments" : [
  {
   "body" : "This is my comment",
   "author" : "Andrew Erlichson"
  },
  {
   "body" : "Give me liberty or give me death.",
   "author" : "Patrick Henry"
  }
 ],
 "date" : ISODate("2012-11-10T06:42:55.733Z"),
 "permalink" : "This_is_a_blog_post_title",
 "tags" : [
  "cycling",
  "running",
  "swimming"
 ],
 "title" : "This is a blog post title"
}
Note that you add comments in this blog from the blog post detail page, which appears at
http://localhost:8082/post/post_slug
where post_slug is the permalink. For the sake of eliminating doubt, the permalink for the example blog post above is http://localhost:8082/post/This_is_a_blog_post_title 
    # add a comment to a particular blog post
    def add_comment(self, permalink, name, email, body):

        comment = {'author': name, 'body': body}

        if (email != ""):
            comment['email'] = email

        try:
            # this is here so the code runs before you fix the next line
            last_error = {'n':-1}
            # XXX HW 3.3 Work here to add the comment to the designated post
            
            # keep the server's reply so that 'n' reports the documents updated
            last_error = self.posts.update({'permalink':permalink},
                   {"$push" : { "comments" : comment} },multi=False)

            return last_error['n']          # return the number of documents updated

        except:
            print "Could not update the collection, error"
            print "Unexpected error:", sys.exc_info()[0]
            return 0
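
What the `$push` in add_comment does to the stored document can be mimicked on a plain dictionary. This is a sketch only: `push_comment` is a hypothetical helper, not part of pymongo or of the blog code, and the post below is a trimmed copy of the example document.

```python
def push_comment(post, name, email, body):
    """Append a comment to post['comments'], like $push on a matched post."""
    comment = {'author': name, 'body': body}
    if email != "":
        comment['email'] = email          # email is stored only when given
    # like $push, create the array when the field is missing
    post.setdefault('comments', []).append(comment)
    return post


post = {
    'permalink': 'This_is_a_blog_post_title',
    'comments': [{'body': 'This is my comment', 'author': 'Andrew Erlichson'}],
}

push_comment(post, 'Patrick Henry', '', 'Give me liberty or give me death.')
print(len(post['comments']))
```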
You will run validate.py to check your work, much like the last problem. validate.py will run through and check the requirements of HW 3.2 and then will check to make sure it can add blog comments, as required by this problem, HW 3.3. It checks the web output as well as the database documents.
python validate.py
Once you have the validation code, please copy and paste in the box below, no spaces.
