Write Concern
- w (write): determines whether or not you wait for the write operation to be acknowledged. With w = 1 you wait for the acknowledgement.
- j (journal, the on-disk log of the operations on the data): with j = 1, getLastError waits until the journal commits to disk.
- w = 0, j = 0: fire and forget.
- w = 1, j = 0: wait for a simple acknowledgement from mongod that it received the write. This is the default.
- w = 0 or 1, j = 1: wait until the write is committed to the journal. Use this to be sure the write has reached disk before you go on.
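As a rough illustration, the three combinations above can be written as connection string options. The sketch below only builds the URI text with the standard library and does not connect to anything. `write_concern_uri` is a hypothetical helper, and it assumes the connection string spells these options `w` and `journal`; check the driver documentation for your version.

```python
from urllib.parse import urlencode

def write_concern_uri(host, w, journal):
    """Build a MongoDB URI carrying the w and journal options (hypothetical helper)."""
    options = urlencode({"w": w, "journal": str(journal).lower()})
    return "mongodb://{}/?{}".format(host, options)

# The three combinations from the notes above.
fire_and_forget = write_concern_uri("localhost:27017", 0, False)
acknowledged = write_concern_uri("localhost:27017", 1, False)
journaled = write_concern_uri("localhost:27017", 1, True)

print(journaled)  # mongodb://localhost:27017/?w=1&journal=true
```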
Network Errors
The Pymongo Driver
The documentation for the current driver is always at api.mongodb.org/python/current
The most important takeaways are that you should be using pymongo.MongoClient() to connect to a standalone server, or if you're connecting to a replica set, an even better option is pymongo.MongoReplicaSetClient().
Which of the following are valid, supported ways to connect to a server with pymongo?
Introduction to Replication
MongoDB's solution is to build a replica set. A replica set is a group of nodes running mongod that mirror each other's data. There is one primary node and the others are secondaries; which node is primary can change dynamically.
Your application and its driver stay connected to the primary node and write to it (you can only write to the primary). If the primary goes down, the remaining nodes hold an election to choose a new primary, which requires a strict majority of the original nodes.
The minimum number of nodes to build a replica set is three, and you can have an arbiter node to decide which one will be primary in case of a tie.
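The strict majority rule can be checked with a little arithmetic. This is a minimal sketch of the counting only, not of the real election protocol; `votes_needed` and `can_elect_primary` are names made up for the illustration.

```python
def votes_needed(total_voters):
    """Strict majority: more than half of all configured voting members."""
    return total_voters // 2 + 1

def can_elect_primary(total_voters, reachable_voters):
    """True if the nodes that can still see each other form a strict majority."""
    return reachable_voters >= votes_needed(total_voters)

for total in (2, 3, 4, 5):
    tolerated = total - votes_needed(total)
    print(total, "voters: need", votes_needed(total), "votes, tolerate", tolerated, "failure(s)")
```

With three nodes, two votes are enough, so the set survives the loss of one node. With two or four nodes an even split cannot reach a majority, which is why an arbiter is added to sets with an even number of members.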
Replica Set Elections
Types of Replica set nodes:
- Regular: holds the data and is the most common type of node. It can be a primary or a secondary.
- Arbiter: holds no data and is used only for voting purposes. If you have an even number of replica set nodes, make sure there is an arbiter node so that a strict majority can elect a node as primary.
- Delayed/Regular: stays a set amount of time behind the other nodes so that data can be recovered quickly. It cannot be a primary but can vote in elections. Its priority is set to zero.
- Hidden: often used for analytics.
Write Consistency
Creating a Replica Set
1) Create the nodes of Replica set:
mongod --port 27017 --dbpath "/var/lib/mongodb/data/rs1" --replSet group1 --logpath "log_1.log" --oplogSize 200 --fork --smallfiles
mongod --port 27018 --dbpath "/var/lib/mongodb/data/rs2" --replSet group1 --logpath "log_2.log" --oplogSize 200 --fork --smallfiles
mongod --port 27019 --dbpath "/var/lib/mongodb/data/rs3" --replSet group1 --logpath "log_3.log" --oplogSize 200 --fork --smallfiles
2) Initialise the replica set by creating a variable that holds the configuration and passing it to rs.initiate() from the shell.
config = {
"_id" : "group1",
"members" : [
//node 1
{ "_id" : 1 , "host" : "localhost:27017"}
//node 2
,{ "_id" : 2 , "host" : "localhost:27018"}
//node 3
,{ "_id" : 3 , "host" : "localhost:27019"}
]
};
rs.initiate(config);
To check the status of the replica set:
rs.status();
To allow reads from a secondary node, run the following on that secondary:
rs.slaveOk()
rs.isMaster()
rs.stepDown()
rs.help()
Replica Set Internals
Failover and Rollback
What happens if a node comes back up as a secondary after a period of being offline and the oplog has looped on the primary?
Connecting to a Replica Set from Pymongo
"mongodb://localhost:27018",
"mongodb://localhost:27019"
replicaSet="rs1",
w=1, j=True)
What happens when the failover occurs
db.test.insert({'x':1})
Detecting Failover
Proper Handling of Failover
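import sys
import time
import pymongo

# `test` is assumed to be a collection obtained from the replica set connection

# let's do some inserting
for i in range(0,1000000):
    for retries in range(0,3):
        doc = {'i':i}
        try:
            test.insert(doc)
            print("Inserted " + str(i))
            break
        except pymongo.errors.DuplicateKeyError:
            print("Duplicate key error")
            break
        except Exception:
            # probably a failover in progress: wait and try again
            print(sys.exc_info()[0])
            print("Retrying...")
            time.sleep(5)
    time.sleep(.5)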
doc = {'i':i}
for retries in range(0,3):
    try:
        test.insert(doc)
        print("Inserted " + str(i))
        break
    except pymongo.errors.DuplicateKeyError:
        print("Duplicate key error")
        break
    except Exception:
        print(sys.exc_info()[0])
        print("Retrying...")
        time.sleep(5)
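The retry pattern above can be tried out without a running replica set. The following is only a sketch under stated assumptions: `insert_with_retry`, `DuplicateKey` and `flaky_insert` are hypothetical names, `DuplicateKey` stands in for pymongo.errors.DuplicateKeyError, and the fake insert simulates two failed attempts during an election.

```python
import time

class DuplicateKey(Exception):
    """Stand-in for pymongo.errors.DuplicateKeyError in this sketch."""

def insert_with_retry(insert, doc, retries=3, delay=5.0, sleep=time.sleep):
    """Try insert(doc) up to `retries` times and return a short status string."""
    for attempt in range(retries):
        try:
            insert(doc)
            return "inserted"
        except DuplicateKey:
            # An earlier attempt already reached the server, so stop retrying.
            return "duplicate"
        except Exception:
            # Probably a transient error such as a failover: wait and retry.
            sleep(delay)
    return "gave up"

# A fake insert that fails twice (as if an election were in progress), then succeeds.
calls = {"n": 0}

def flaky_insert(doc):
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("no primary available")

# The sleep is replaced by a no-op so the example runs instantly.
result = insert_with_retry(flaky_insert, {"i": 1}, sleep=lambda seconds: None)
print(result, "after", calls["n"], "attempts")  # inserted after 3 attempts
```

Treating the duplicate key error as a stop condition matters: if the first attempt reached the primary but the acknowledgement was lost, the retry would otherwise insert the same document twice.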
Write concern revisited
- w = 1: wait until the primary node has applied the write.
- w = 2: wait until two nodes have applied the write (in a three-node set).
- w = 3: wait until all three nodes have applied the write (in a three-node set).
- j = 1: wait until the primary node commits the write to its journal on disk.
- wtimeout (milliseconds): how long you are willing to wait for the writes to be acknowledged by the secondaries (it can be set in the drivers).
These values can be set:
- on the connection
- on the collection, inside the driver
- in the configuration of the replica set itself, as default values
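How w and wtimeout interact can be pictured with a toy model. The sketch below is not how the server is implemented; it only assumes that each node acknowledges the write after some delay, and `write_acknowledged` and the delay values are made up for the illustration.

```python
def write_acknowledged(w, ack_delays_ms, wtimeout_ms):
    """Return True if at least w nodes acknowledge within wtimeout_ms.

    ack_delays_ms lists, per node, how long after the write that node
    acknowledges it (the primary is usually the fastest).
    """
    if w < 1:
        return True   # nothing to wait for
    if w > len(ack_delays_ms):
        return False  # fewer nodes than w: the concern cannot be satisfied
    # The write concern is met when the w-th fastest node has acknowledged.
    w_th_ack = sorted(ack_delays_ms)[w - 1]
    return w_th_ack <= wtimeout_ms

delays = [2, 40, 900]  # primary, fast secondary, lagging secondary (hypothetical values)
for w in (1, 2, 3):
    print("w =", w, "->", write_acknowledged(w, delays, wtimeout_ms=500))
```

In this model w = 1 and w = 2 succeed, while w = 3 times out because the lagging secondary needs longer than the 500 ms allowed.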
Read preferences
- primary: always read from the primary.
- secondary: always read from a secondary. If no secondary is available, the read fails.
- secondaryPreferred: read from a secondary, and fall back to the primary if there is none.
- primaryPreferred: read from the primary, and fall back to a secondary if there is none.
- nearest: read from the nearest node.
- By tagging: you can assign tags to nodes in order to name them.
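The fallback behaviour of the modes above can be sketched as a small selection function. This is a toy model, not the driver's actual selection algorithm: `choose_node` is a hypothetical helper, the ping times are invented, and it uses the standard MongoDB mode names (primary, secondary, primaryPreferred, secondaryPreferred, nearest).

```python
def choose_node(mode, nodes):
    """Pick a node name for a read, or None if the mode cannot be satisfied.

    nodes maps a node name to (role, ping_ms); role is "primary" or "secondary".
    """
    def nearest(role=None):
        # Lowest ping among the nodes with the requested role (any role if None).
        candidates = [(ping, name) for name, (r, ping) in nodes.items()
                      if role is None or r == role]
        return min(candidates)[1] if candidates else None

    if mode == "primary":
        return nearest("primary")
    if mode == "secondary":
        return nearest("secondary")
    if mode == "primaryPreferred":
        return nearest("primary") or nearest("secondary")
    if mode == "secondaryPreferred":
        return nearest("secondary") or nearest("primary")
    if mode == "nearest":
        return nearest()
    raise ValueError("unknown read preference: " + mode)

nodes = {  # hypothetical three-node replica set
    "localhost:27017": ("primary", 12),
    "localhost:27018": ("secondary", 3),
    "localhost:27019": ("secondary", 8),
}
print(choose_node("primary", nodes))             # localhost:27017
print(choose_node("secondaryPreferred", nodes))  # localhost:27018
```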
Implications of replication
- Seed list: so that the driver can still reach the replica set after an election when the primary goes down.
- Write concern: the idea of waiting for some number of nodes to acknowledge the writes through the w parameter, and the j parameter, which lets you wait or not for the primary node to commit that write to disk. There is also a wtimeout parameter, which is how long you are going to wait to see that your write replicated to other members of the replica set.
- Read preferences
- Errors: errors can always happen, because of transient situations such as a failover, because of network errors, or because of violations such as breaking a unique key constraint.
Introduction to Sharding
Building a Sharded Environment
Implications of Sharding
- Every document needs to include the shard key.
- The shard key is immutable: you cannot change the shard key inside the document.
- You need an index that starts with the shard key, and it cannot be a multikey index.
- When you do an update, you must specify the shard key or set multi to true.
- No shard key in a query means a scatter-gather operation, which could be expensive.
- You can't have a unique key (a unique index) unless it is also part of the shard key.
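The difference between a targeted query and a scatter-gather one can be sketched with a toy router. This is only an illustration of the idea, not how mongos is implemented; `route` and the chunk map are hypothetical.

```python
def route(query, shard_key, chunk_map):
    """Return the list of shards a query has to visit.

    chunk_map maps a shard key value to the shard that owns it.
    """
    if shard_key in query:
        # The shard key is present: only one shard needs to be asked.
        return [chunk_map[query[shard_key]]]
    # No shard key: every shard must be asked (scatter-gather).
    return sorted(set(chunk_map.values()))

chunk_map = {"toeguy": "shard0", "alice": "shard1"}  # hypothetical ownership
print(route({"username": "toeguy"}, "username", chunk_map))     # ['shard0']
print(route({"visible_to": "friends"}, "username", chunk_map))  # ['shard0', 'shard1']
```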
Sharding + Replication
Choosing a Shard Key
- Sufficient cardinality: so that MongoDB can distribute the documents across the shards.
- Avoid hotspots in writes: a monotonically increasing shard key sends all the writes to one specific shard, so pick a key that does not.
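The write hotspot can be made visible with a small simulation. The sketch assumes simple range-based placement with two made-up boundaries; `range_shard` and the key sequences are hypothetical and only meant to show the effect.

```python
from collections import Counter

def range_shard(key, boundaries):
    """Range partitioning: index of the first boundary greater than key."""
    for shard, upper in enumerate(boundaries):
        if key < upper:
            return shard
    return len(boundaries)

boundaries = [400, 800]  # three ranges: below 400, 400 to 799, 800 and above

# New documents with an ever-increasing key, for example a timestamp or counter.
increasing_keys = range(1000, 1300)
# New documents whose key values are spread over the whole key space.
spread_keys = [(i * 37) % 1200 for i in range(300)]

hot = Counter(range_shard(k, boundaries) for k in increasing_keys)
spread = Counter(range_shard(k, boundaries) for k in spread_keys)

print("increasing key:", dict(hot))  # every write lands on the last shard
print("spread key:", dict(spread))   # writes reach all three shards
```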
{'username':'toeguy', 'posttime':ISODate("2012-12-02T23:12:23Z"), "randomthought": "I am looking at my feet right now", 'visible_to':['friends','family', 'walkers']}
Thinking about the tradeoffs of shard key selection, select the true statements below.