How to configure MongoDB Java driver MongoOptions for production use?

Accepted answer
Score: 163

Updated for driver version 2.9:

  • autoConnectRetry simply means the driver will automatically attempt to reconnect to the server(s) after unexpected disconnects. In production environments you usually want this set to true.

  • connectionsPerHost is the number of physical connections a single Mongo instance (it's a singleton, so you usually have one per application) can establish to a mongod/mongos process. At the time of writing the Java driver will eventually establish this number of connections even if the actual query throughput is low (in other words, you will see the "conn" statistic in mongostat rise until it hits this number per app server).

    There is no need to set this higher than 100 in most cases, but this setting is one of those "test it and see" things. Do note that you will have to set this low enough that the total number of connections to your server does not exceed

    db.serverStatus().connections.available

    In production we currently have this at 40.

  • connectTimeout. As the name suggests, the number of milliseconds the driver will wait before a connection attempt is aborted. Set the timeout to something long (15-30 seconds) unless there's a realistic, expected chance this will get in the way of otherwise successful connection attempts. Normally, if a connection attempt takes longer than a couple of seconds, your network infrastructure isn't capable of high throughput.

  • maxWaitTime. Number of milliseconds a thread will wait for a connection to become available in the connection pool; an exception is raised if this does not happen in time. Keep the default.

  • socketTimeout. Standard socket timeout value, in milliseconds. Set to 60 seconds (60000).

  • threadsAllowedToBlockForConnectionMultiplier. Multiplier for connectionsPerHost that denotes the number of threads that are allowed to wait for connections to become available if the pool is currently exhausted. This is the setting that will cause the "com.mongodb.DBPortPool$SemaphoresOut: Out of semaphores to get db connection" exception. The driver throws this exception once the number of waiting threads exceeds connectionsPerHost multiplied by threadsAllowedToBlockForConnectionMultiplier. For example, if connectionsPerHost is 10 and this value is 5, up to 50 threads can block before the aforementioned exception is thrown.

    If you expect big peaks in throughput that could temporarily cause large queues, increase this value. We have it at 1500 at the moment for exactly that reason. If your query load consistently outpaces the server, you should just improve your hardware/scaling situation accordingly.

  • readPreference. (UPDATED, 2.8+) Used to determine the default read preference; replaces "slaveOk". Set up a ReadPreference through one of the class factory methods. A full description of the most common settings can be found at the end of this post.

  • w. (UPDATED, 2.6+) This value determines the "safety" of the write. When this value is -1 the write will not report any errors, regardless of network or database errors. WriteConcern.NONE is the appropriate predefined WriteConcern for this. If w is 0 then network errors will make the write fail but mongo errors will not. This is typically referred to as "fire and forget" writes and should be used when performance is more important than consistency and durability. Use WriteConcern.NORMAL for this mode.

    If you set w to 1 or higher the write is considered safe. Safe writes perform the write and follow it up with a request to the server to make sure the write succeeded, or to retrieve an error value if it did not (in other words, the driver sends a getLastError() command after you write). Note that the connection is reserved until this getLastError() command has completed. As a result of that and the additional command, the throughput will be significantly lower than for writes with w <= 0. With a w value of exactly 1, MongoDB guarantees the write succeeded (or verifiably failed) on the instance you sent the write to.

    In the case of replica sets you can use higher values for w, which tell MongoDB to send the write to at least "w" members of the replica set before returning (or, more accurately, to wait for the replication of your write to "w" members). You can also set w to the string "majority", which tells MongoDB to perform the write to the majority of replica set members (WriteConcern.MAJORITY). Typically you should set this to 1 unless you need raw performance (-1 or 0) or replicated writes (>1). Values higher than 1 have a considerable impact on write throughput.

  • fsync. Durability option that forces mongo to flush to disk after each write when enabled. I've never had any durability issues related to a write backlog, so we have this set to false (the default) in production.

  • j *(NEW 2.7+)*. Boolean that, when set to true, forces MongoDB to wait for a successful journaling group commit before returning. If you have journaling enabled you can enable this for additional durability. Refer to http://www.mongodb.org/display/DOCS/Journaling to see what journaling gets you (and thus why you might want to enable this flag).
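
The sizing rules from the connectionsPerHost and threadsAllowedToBlockForConnectionMultiplier bullets above come down to simple arithmetic, which the following minimal sketch spells out. It uses no driver classes; the class and method names are hypothetical, and the figures of 800 available server connections and 8 app servers are made-up illustration values, not numbers from this answer.

```java
public final class PoolSizing {
    private PoolSizing() {}

    /**
     * Number of threads that may wait for a pooled connection before the
     * driver starts throwing the "Out of semaphores" exception.
     */
    public static int maxBlockedThreads(int connectionsPerHost, int multiplier) {
        return connectionsPerHost * multiplier;
    }

    /**
     * Upper bound for connectionsPerHost on each app server so that all app
     * servers together stay within the connections the server has available
     * (the value reported by db.serverStatus().connections.available).
     */
    public static int maxConnectionsPerHost(int availableOnServer, int appServers) {
        if (appServers <= 0) {
            throw new IllegalArgumentException("appServers must be positive");
        }
        return availableOnServer / appServers;
    }

    public static void main(String[] args) {
        // Example from the text: 10 connections, multiplier 5.
        System.out.println(maxBlockedThreads(10, 5));      // 50
        // Production values quoted in the text: 40 connections, multiplier 1500.
        System.out.println(maxBlockedThreads(40, 1500));   // 60000
        // Hypothetical budget: 800 available connections shared by 8 app servers.
        System.out.println(maxConnectionsPerHost(800, 8)); // 100
    }
}
```

In practice you would leave some headroom below the computed bound for shells, monitoring tools, and other clients that also connect to the server.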

ReadPreference

The ReadPreference class allows you to configure which mongod instances queries are routed to if you are working with replica sets. The following options are available:

  • ReadPreference.primary() : All reads go to the repset primary member only. Use this if you require all queries to return consistent (the most recently written) data. This is the default.

  • ReadPreference.primaryPreferred() : All reads go to the repset primary member if possible, but may query secondary members if the primary node is not available. As such, reads become eventually consistent if the primary becomes unavailable, but only then.

  • ReadPreference.secondary() : All reads go to secondary repset members and the primary member is used for writes only. Use this only if you can live with eventually consistent reads. Additional repset members can be used to scale up read performance, although there are limits to the number of (voting) members a repset can have.

  • ReadPreference.secondaryPreferred() : All reads go to secondary repset members if any of them are available. The primary member is used exclusively for writes unless all secondary members become unavailable. Other than the fallback to the primary member for reads, this is the same as ReadPreference.secondary().

  • ReadPreference.nearest() : Reads go to the nearest repset member available to the database client. Use only if eventually consistent reads are acceptable. The nearest member is the member with the lowest latency between the client and the various repset members. Since busy members will eventually have higher latencies, this should also automatically balance read load, although in my experience secondary(Preferred) seems to do so better if member latencies are relatively consistent.

Note: All of the above have tag-enabled versions of the same method, which return TaggableReadPreference instances instead. A full description of replica set tags can be found here: Replica Set Tags
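
Putting the options and the read preference together, a configuration along the lines described in this answer might look like the sketch below. It assumes the 2.9 Java driver is on the classpath and sets the options through the public fields of MongoOptions; the class name MongoFactory and the localhost:27017 address are hypothetical placeholders. The values are the ones quoted above, not universal recommendations, so test them against your own workload.

```java
import com.mongodb.Mongo;
import com.mongodb.MongoOptions;
import com.mongodb.ReadPreference;
import com.mongodb.ServerAddress;

import java.net.UnknownHostException;

public final class MongoFactory {
    private MongoFactory() {}

    // Hypothetical helper: builds the one Mongo instance the application shares.
    public static Mongo create() throws UnknownHostException {
        MongoOptions options = new MongoOptions();

        options.autoConnectRetry = true;      // reconnect after unexpected disconnects
        options.connectionsPerHost = 40;      // physical connections per mongod/mongos
        options.connectTimeout = 15000;       // ms; long enough not to abort valid attempts
        options.socketTimeout = 60000;        // ms; 60 seconds
        // maxWaitTime is deliberately left at the driver default.
        options.threadsAllowedToBlockForConnectionMultiplier = 1500; // absorbs load peaks

        options.w = 1;          // safe writes, acknowledged by the instance written to
        options.fsync = false;  // do not force a flush to disk after each write
        options.j = false;      // set to true to wait for the journal group commit

        // primary() is the default; swap in secondaryPreferred() or nearest()
        // only if eventually consistent reads are acceptable.
        options.readPreference = ReadPreference.primary();

        // Placeholder address; replace with your own mongod/mongos host.
        return new Mongo(new ServerAddress("localhost", 27017), options);
    }
}
```

Because the Mongo instance owns the connection pool, create it once at startup and reuse it for the lifetime of the application.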
