Basic Application & Schema Design

Alvin Richards - Technical Director, 10gen Inc. @jonnyeight

Schema design is easy!
- Data as Objects in code
Common patterns
- Single table inheritance
- One-to-Many & Many-to-Many
- Trees
- Queues

Use MongoDB with your language

10gen Supported Drivers
- Ruby, Python, Perl, PHP, JavaScript
- Java, C/C++, C#, Scala
- Erlang, Haskell
Object Data Mappers
- Morphia - Java
- Mongoid, MongoMapper - Ruby
Community Drivers
- F#, Smalltalk, Clojure, Go, Groovy

So today's example will use...

Design your objects in your code - Java using Driver

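import com.mongodb.*;
import java.util.*;

// Get a connection to the database, then a collection
// ("posts" is an assumed collection name; the slide omits it)
DBCollection coll = new Mongo().getDB("blogs").getCollection("posts");

// Create the Object
Map<String, Object> obj = new HashMap<String, Object>();
obj.put("author", "Herg");
obj.put("text", "Destination Moon");
obj.put("date", new Date());

// Insert the object into MongoDB
coll.insert(new BasicDBObject(obj));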

Design your objects in your code - Java using Object Data Mapper
// Use Morphia annotations
@Entity
class Blog {
    @Id String author;
    @Indexed Date date;
    String text;
}

Design your objects in your code - Java using Object Data Mapper
// Create the data store
Datastore ds = new Morphia().createDatastore(new Mongo(), "blogs");

// Create the Object (assumes Blog declares a matching constructor)
Blog entry = new Blog("Herg", new Date(), "Destination Moon");

// Insert object into MongoDB
ds.save(entry);

RDBMS            MongoDB
Table            Collection
Row(s)           JSON Document
Index            Index
Join             Embedding & Linking
Partition        Shard
Partition Key    Shard Key

Schema Design Relational Database

Schema Design MongoDB

Design Session
Design documents that simply map to your application
> post = {author: "Herg",
          date: ISODate("2011-09-18T09:56:06.298Z"),
          text: "Destination Moon",
          tags: ["comic", "adventure"]}
> db.posts.save(post)

Find the document

> db.posts.find()
{ _id: ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author: "Herg",
  date: ISODate("2011-09-18T09:56:06.298Z"),
  text: "Destination Moon",
  tags: [ "comic", "adventure" ] }

Notes:
- ID must be unique, but can be anything you'd like
- MongoDB will generate a default ID if one is not supplied

Add an index, find via index

Secondary index for author
// 1 means ascending, -1 means descending
> db.posts.ensureIndex({author: 1})
> db.posts.find({author: 'Herg'})
{ _id: ObjectId("4c4ba5c0672c685e5e8aabf3"),
  date: ISODate("2011-09-18T09:56:06.298Z"),
  author: "Herg",
  ... }

Examine the query plan

> db.posts.find({author: "Herg"}).explain()
{
  "cursor" : "BtreeCursor author_1",
  "nscanned" : 1,
  "nscannedObjects" : 1,
  "n" : 1,
  "millis" : 5,
  "indexBounds" : {
    "author" : [ [ "Herg", "Herg" ] ]
  }
}

Query operators

Conditional operators:
$ne, $in, $nin, $mod, $all, $size, $exists, $type, ...
$lt, $lte, $gt, $gte

// find posts with any tags
> db.posts.find({tags: {$exists: true}})

Regular expressions:

// posts where author starts with h
> db.posts.find({author: /^h/i })

Counting:

// number of posts written by Herg
> db.posts.find({author: "Herg"}).count()

Extending the Schema

> new_comment = {author: "Kyle", date: new Date(), text: "great book"}

> db.posts.update(
    {text: "Destination Moon"},
    {"$push": {comments: new_comment},
     "$inc": {comments_count: 1}})

Extending the Schema

> db.posts.find({_id: ObjectId("4c4ba5c0672c685e5e8aabf3")})
{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author : "Herg",
  date : ISODate("2011-09-18T09:56:06.298Z"),
  text : "Destination Moon",
  tags : [ "comic", "adventure" ],
  comments : [
    { author : "Kyle",
      date : ISODate("2011-09-19T09:56:06.298Z"),
      text : "great book" }
  ],
  comments_count: 1 }

Extending the Schema

// create index on nested documents:
> db.posts.ensureIndex({"comments.author": 1})
> db.posts.find({"comments.author": "Kyle"})

// find last 5 posts:
> db.posts.find().sort({date: -1}).limit(5)

// most commented post:
> db.posts.find().sort({comments_count: -1}).limit(1)

When sorting, check if you need an index

Common Patterns


Single Table Inheritance - RDBMS

shapes table

id  type    area  radius  length  width
1   circle  3.14  1
2   square  4             2
3   rect    10            5       2



Single Table Inheritance MongoDB

> db.shapes.find()
{ _id: "1", type: "circle", area: 3.14, radius: 1}
{ _id: "2", type: "square", area: 4, length: 2}
{ _id: "3", type: "rect", area: 10, length: 5, width: 2}

missing values not stored!

// find shapes where radius > 0
> db.shapes.find({radius: {$gt: 0}})

// create index
> db.shapes.ensureIndex({radius: 1}, {sparse: true})

index only values present!
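
What "index only values present" means can be illustrated without a database. The following is a minimal sketch in plain JavaScript over an in-memory array, not MongoDB's actual index structure; `buildSparseIndex` is a hypothetical helper introduced only for this illustration.

```javascript
// In-memory stand-in for the shapes collection shown above.
const shapes = [
  { _id: "1", type: "circle", area: 3.14, radius: 1 },
  { _id: "2", type: "square", area: 4, length: 2 },
  { _id: "3", type: "rect", area: 10, length: 5, width: 2 },
];

// Hypothetical helper: maps each value of `field` to the _ids of the
// documents that actually contain that field.
function buildSparseIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    if (!(field in doc)) continue; // sparse: documents without the field get no entry
    const ids = index.get(doc[field]) || [];
    ids.push(doc._id);
    index.set(doc[field], ids);
  }
  return index;
}

const radiusIndex = buildSparseIndex(shapes, "radius");
console.log(radiusIndex.size);   // number of distinct radius values indexed
console.log(radiusIndex.get(1)); // _ids of the shapes whose radius is 1
```

Only the circle carries a radius, so the square and the rectangle contribute nothing to the index.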

One to Many
One to Many relationships can specify
- degree of association between objects
- containment
- life-cycle

One to Many
- Embedded Array
  - $slice operator to return subset of comments
  - some queries harder, e.g. find latest comments across all blogs

blogs: { author : "Herg",
         date : ISODate("2011-09-18T09:56:06.298Z"),
         comments : [
           { author : "Kyle",
             date : ISODate("2011-09-19T09:56:06.298Z"),
             text : "great book" }
         ]}
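
The trade-off of the embedded array can be sketched in plain JavaScript over in-memory data (this is not driver code). Returning the last few comments of one blog is a simple slice, which is roughly what a negative `$slice` does, while "latest comments across all blogs" has to gather every array first. The second blog, the extra comments, and both helper names are invented for illustration.

```javascript
// In-memory stand-ins for blog documents with embedded comments.
const blogs = [
  { author: "Herg", comments: [
    { author: "Kyle", date: new Date("2011-09-19T09:56:06.298Z"), text: "great book" },
    { author: "Jim", date: new Date("2011-09-21T08:00:00.000Z"), text: "agreed" },
  ]},
  { author: "Kyle", comments: [
    { author: "Herg", date: new Date("2011-09-20T12:00:00.000Z"), text: "thanks" },
  ]},
];

// Last n comments of a single blog (n is assumed to be positive).
function lastComments(blog, n) {
  return blog.comments.slice(-n);
}

// Latest n comments across all blogs: collect every embedded array,
// then sort newest first. This is the "harder" query from the slide.
function latestAcrossBlogs(allBlogs, n) {
  const all = [];
  for (const blog of allBlogs) all.push(...blog.comments);
  return all.sort((a, b) => b.date - a.date).slice(0, n);
}

console.log(lastComments(blogs[0], 1).map((c) => c.author));
console.log(latestAcrossBlogs(blogs, 2).map((c) => c.author));
```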

One to Many
- Normalized (2 collections)
  - most flexible
  - more queries

blogs: { _id: 1000,
         author: "Herg",
         date: ISODate("2011-09-18T09:56:06.298Z"),
         comments: [ {comment: 1} ]}

comments: { _id: 1,
            blog: 1000,
            author: "Kyle",
            date: ISODate("2011-09-19T09:56:06.298Z")}

> blog = db.blogs.findOne({text: "Destination Moon"});
> db.comments.find({blog: blog._id});

One to Many - patterns

- Embedded Array / Array Keys
- Normalized

Many - Many
- Product can be in many categories
- Category can have many products

Many - Many
products: { _id: 10,
            name: "Destination Moon",
            category_ids: [ 20, 30 ] }

categories: { _id: 20,
              name: "adventure",
              product_ids: [ 10, 11, 12 ] }

categories: { _id: 21,
              name: "movie",
              product_ids: [ 10 ] }

// All categories for a given product
> db.categories.find({product_ids: 10})

Alternative: keep the references on the product only

products: { _id: 10,
            name: "Destination Moon",
            category_ids: [ 20, 30 ] }

categories: { _id: 20,
              name: "adventure"}

// All products for a given category
> db.products.find({category_ids: 20})

// All categories for a given product
> product = db.products.findOne({_id: some_id})
> db.categories.find({_id: {$in: product.category_ids}})
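
The two lookups can be traced on in-memory data. This is a sketch in plain JavaScript, not driver code: category 30, the second product, and both helper names are invented for illustration. It shows why one direction needs a single query and the other needs two steps.

```javascript
// In-memory stand-ins; category 30 and product 11 are invented for illustration.
const products = [
  { _id: 10, name: "Destination Moon", category_ids: [20, 30] },
  { _id: 11, name: "Another title", category_ids: [20] },
];
const categories = [
  { _id: 20, name: "adventure" },
  { _id: 30, name: "comic" },
];

// One step: match products whose category_ids array contains the id.
function productsInCategory(categoryId) {
  return products.filter((p) => p.category_ids.includes(categoryId));
}

// Two steps: load the product, then fetch the categories whose _id
// appears in its category_ids array (the $in query above).
function categoriesForProduct(productId) {
  const product = products.find((p) => p._id === productId);
  if (!product) return [];
  return categories.filter((c) => product.category_ids.includes(c._id));
}

console.log(productsInCategory(20).map((p) => p.name));
console.log(categoriesForProduct(10).map((c) => c.name));
```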

Hierarchical information

Full Tree in Document

{ comments: [
    { author: "Kyle", text: ...,
      replies: [
        { author: "James", text: ..., replies: [] }
      ]}
  ]
}

Pros: Single Document, Performance, Intuitive
Cons: Hard to search, Partial Results, 16MB limit
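
"Hard to search" can be made concrete: once the whole document is loaded, finding a comment at arbitrary depth means walking the tree in application code. A minimal sketch in plain JavaScript; `findByAuthor` is a hypothetical helper and the comment texts are placeholders.

```javascript
// The full-tree document from the slide, with placeholder texts.
const post = {
  comments: [
    { author: "Kyle", text: "...", replies: [
      { author: "James", text: "...", replies: [] },
    ]},
  ],
};

// Depth-first walk over comments and their nested replies.
function findByAuthor(comments, author) {
  const hits = [];
  for (const comment of comments) {
    if (comment.author === author) hits.push(comment);
    hits.push(...findByAuthor(comment.replies || [], author));
  }
  return hits;
}

console.log(findByAuthor(post.comments, "James").length);
```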

Array of Ancestors



- Store all Ancestors of a node

{ _id: "a" }
{ _id: "b", thread: [ "a" ], replyTo: "a" }
{ _id: "c", thread: [ "a", "b" ], replyTo: "b" }
{ _id: "d", thread: [ "a", "b" ], replyTo: "b" }
{ _id: "e", thread: [ "a" ], replyTo: "a" }
{ _id: "f", thread: [ "a", "e" ], replyTo: "e" }

// find all threads where "b" is in
> db.msg_tree.find({thread: "b"})

// find all messages that replied directly to "b"
> db.msg_tree.find({replyTo: "b"})

// find all ancestors of f:
> threads = db.msg_tree.findOne({_id: "f"}).thread
> db.msg_tree.find({_id: {$in: threads}})
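
The slide shows the stored documents but not how the thread array gets filled in. One plausible approach, sketched here in plain JavaScript with an in-memory Map, is to copy the parent's ancestors and append the parent itself when a reply is stored. `addMessage` and `descendantsOf` are hypothetical helpers, not part of any MongoDB API.

```javascript
// In-memory stand-in for the msg_tree collection, keyed by _id.
const messages = new Map();

// Store a message; a reply inherits its parent's ancestors plus the parent.
function addMessage(id, replyTo) {
  const msg = { _id: id };
  if (replyTo !== undefined) {
    const parent = messages.get(replyTo);
    if (!parent) throw new Error("unknown parent: " + replyTo);
    msg.thread = [...(parent.thread || []), parent._id];
    msg.replyTo = replyTo;
  }
  messages.set(id, msg);
  return msg;
}

// Every message that has `id` somewhere among its ancestors.
function descendantsOf(id) {
  return [...messages.values()].filter((m) => (m.thread || []).includes(id));
}

addMessage("a");
addMessage("b", "a");
addMessage("c", "b");
addMessage("d", "b");
addMessage("e", "a");
addMessage("f", "e");

console.log(messages.get("f").thread);
console.log(descendantsOf("b").map((m) => m._id));
```

The cost of this layout is paid at write time: each insert reads the parent once, and every later ancestor or subtree lookup is a single query on the array.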

Trees as Paths
Store hierarchy as a path expression
- Separate each node by a delimiter, e.g. /
- Use text search to find parts of a tree

{ comments: [
    { author: "Kyle", text: "initial post", path: "" },
    { author: "Jim", text: "Jim's comment", path: "jim" },
    { author: "Kyle", text: "Kyle's reply to Jim", path: "jim/kyle" }
  ]
}

// Find the conversations Jim was part of
> db.posts.find({"comments.path": /^jim/i})
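
One caveat with a bare prefix match such as `/^jim/i` is that it also matches sibling nodes whose name merely starts with the same letters. The sketch below, in plain JavaScript on in-memory data, compares the loose prefix test with a delimiter-aware one. The "Jimmy" comment and the `inSubtree` helper are invented for illustration.

```javascript
// The comments from the slide plus one invented sibling ("jimmy").
const comments = [
  { author: "Kyle", text: "initial post", path: "" },
  { author: "Jim", text: "Jim's comment", path: "jim" },
  { author: "Kyle", text: "Kyle's reply to Jim", path: "jim/kyle" },
  { author: "Jimmy", text: "unrelated comment", path: "jimmy" },
];

// Loose match: any path beginning with "jim", as in the regex on the slide.
const loose = comments.filter((c) => /^jim/i.test(c.path));

// Delimiter-aware match: the node itself or anything below "<root>/".
function inSubtree(path, root) {
  const p = path.toLowerCase();
  const r = root.toLowerCase();
  return p === r || p.startsWith(r + "/");
}
const strict = comments.filter((c) => inSubtree(c.path, "jim"));

console.log(loose.map((c) => c.path));
console.log(strict.map((c) => c.path));
```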

Queue

- Need to maintain order and state
- Ensure that updates are atomic

// "jobs" is an assumed collection name; the slide text does not preserve it
> db.jobs.save({ inprogress: false, priority: 1, ... });

// find highest priority job and mark as in-progress
> job = db.jobs.findAndModify({
    query: {inprogress: false},
    sort: {priority: -1},
    update: {$set: {inprogress: true, started: new Date()}},
    new: true})

{ inprogress: true, priority: 1, started: ISODate("2011-09-18T09:56:06.298Z") ... }
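
The find-and-mark step can be traced on in-memory data to see which job each worker would receive. This plain JavaScript sketch only mirrors the selection and update logic; it is single-threaded and therefore says nothing about atomicity, which in MongoDB comes from the server applying the query and update as one operation. `claimNextJob` and the sample jobs are invented for illustration.

```javascript
// In-memory stand-in for a job queue; the sample jobs are invented.
const jobs = [
  { _id: 1, inprogress: false, priority: 1 },
  { _id: 2, inprogress: false, priority: 5 },
  { _id: 3, inprogress: true, priority: 9 },
];

// Pick the highest-priority job that is not in progress, mark it,
// and return the updated job (like new: true), or null if none is left.
function claimNextJob(queue) {
  const candidates = queue
    .filter((job) => !job.inprogress)
    .sort((a, b) => b.priority - a.priority);
  if (candidates.length === 0) return null;
  const job = candidates[0];
  job.inprogress = true;
  job.started = new Date();
  return job;
}

const first = claimNextJob(jobs);
const second = claimNextJob(jobs);
const third = claimNextJob(jobs);
console.log(first._id, second._id, third);
```

Job 3 has the highest priority but is already in progress, so it is never handed out again.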


- Schema design is different in MongoDB
- Basic data design principles stay the same
- Focus on how the application manipulates data
- Rapidly evolve schema to meet your requirements
- Enjoy your new freedom, use it wisely :-)
