<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title><![CDATA[MongoHQ Blog]]></title>
  <link href="http://blog.mongohq.com/atom.xml" rel="self"/>
  <link href="http://blog.mongohq.com/"/>
  <updated>2013-05-15T17:00:38-05:00</updated>
  <id>http://blog.mongohq.com/</id>
  <author>
    <name><![CDATA[MongoHQ]]></name>
    
  </author>
  <generator uri="http://octopress.org/">Octopress</generator>

  
  <entry>
    <title type="html"><![CDATA[Get Database Alerts - Anywhere]]></title>
    <link href="http://blog.mongohq.com/blog/2013/05/14/database-alerts-anywhere/"/>
    <updated>2013-05-14T15:18:00-05:00</updated>
    <id>http://blog.mongohq.com/blog/2013/05/14/database-alerts-anywhere</id>
    <content type="html"><![CDATA[<p>We are excited to announce MongoHQ Alerts, a feature that
pushes real-time alerts about your database environment. Typical
events include replica set step-downs and periods of high lock percentage.
Real-time notifications on the status of your
database environment have been a much-requested feature.</p>

<p><img src="http://blog.mongohq.com/images/for_posts/2013-05-14-database-alerts-anywhere/mongohq-alerts-view.png" alt="MongoHQ Alerts" /></p>

<p>Real-time alerts give you a better
understanding of growth and scaling in staging and early production.</p>

<p><img src="http://blog.mongohq.com/images/for_posts/2013-05-14-database-alerts-anywhere/mongohq-alert-view.png" alt="MongoHQ Alerts" /></p>

<h2>Get alerts on the Go</h2>

<p>With this feature, we enabled multiple methods of message delivery. You can:</p>

<ul>
<li>View alerts on the website via the &#8220;Alert&#8221; icon</li>
<li>Receive an E-mail when an alert is triggered</li>
<li>Receive a PagerDuty notification when an alert is triggered</li>
</ul>


<p><img src="http://blog.mongohq.com/images/for_posts/2013-05-14-database-alerts-anywhere/mongohq-alert-settings.png" alt="MongoHQ Alerts" /></p>

<p>By default, all alerts are viewable via
<a href="https://app.mongohq.com">app.mongohq.com</a>.  To have alerts delivered,
click the &#8216;Alerts&#8217; icon and choose &#8220;Settings&#8221;.  All delivered alerts
will be metered to 1 unique alert every 12 hours.</p>

<p>When you get an alert on the go, remember that MongoHQ&#8217;s site is built to be
responsive on mobile devices. Get the alert, and handle it
immediately, wherever you are.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[MongoDB Indexing Best Practices]]></title>
    <link href="http://blog.mongohq.com/blog/2013/05/06/mongodb-indexing-best-practices/"/>
    <updated>2013-05-06T09:07:00-05:00</updated>
    <id>http://blog.mongohq.com/blog/2013/05/06/mongodb-indexing-best-practices</id>
    <content type="html"><![CDATA[<p>Going “Best Practice” on any topic is an expansive statement. I will
give it a go with high-level anti-patterns and best-practices. Some
best-practices will be general statements about the performance of
MongoDB.</p>

<h2>Indexing Constraints</h2>

<p>The following constraints are in effect as of the MongoDB 2.4.x release.
There is talk of adding features in later versions that will remove most
of these limitations. Even if these limitations are removed, following these
constraints will continue to be the best practice for scaling, and will
lead to better performance.</p>

<ul>
<li>MongoDB can only use 1 index per query</li>
<li>MongoDB can only use one “multi-value” operator per query (e.g. $nin, $in, $nor, $gte, $gt, $lt, $lte, $near, $sort). Yes, that does include sorting. However, you can use a range query and sort on the same field effectively.</li>
<li>MongoDB indexes must include the multi-value operator as the last used field in the index.</li>
<li>RAM is fast; disk is slow. Have enough RAM for your MongoDB indexes.</li>
</ul>


<p>Below are example queries, one that breaks the constraints above, and
one that shows a solution:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class='js'><span class='line'><span class="c1">// A Bad query with conflicting `time` and `user_id`</span>
</span><span class='line'><span class="nx">db</span><span class="p">.</span><span class="nx">coll</span><span class="p">.</span><span class="nx">find</span><span class="p">({</span><span class="nx">action</span><span class="o">:</span> <span class="s2">&quot;run-process&quot;</span><span class="p">,</span> <span class="nx">time</span><span class="o">:</span> <span class="p">{</span><span class="nx">$gte</span><span class="o">:</span> <span class="nx">ISODate</span><span class="p">(</span><span class="s1">&#39;2013-05-06&#39;</span><span class="p">),</span> <span class="nx">$lt</span><span class="o">:</span> <span class="nx">ISODate</span><span class="p">(</span><span class="s1">&#39;2013-05-07&#39;</span><span class="p">)}}).</span><span class="nx">sort</span><span class="p">({</span><span class="nx">user_id</span><span class="o">:</span> <span class="o">-</span><span class="mi">1</span><span class="p">})</span>
</span><span class='line'>
</span><span class='line'><span class="c1">// A better query using date &#39;bucketing&#39;</span>
</span><span class='line'><span class="nx">db</span><span class="p">.</span><span class="nx">coll</span><span class="p">.</span><span class="nx">find</span><span class="p">({</span><span class="nx">action</span><span class="o">:</span> <span class="s2">&quot;run-process&quot;</span><span class="p">,</span> <span class="nx">day_bucket</span><span class="o">:</span> <span class="nx">ISODate</span><span class="p">(</span><span class="s1">&#39;2013-05-06&#39;</span><span class="p">)}).</span><span class="nx">sort</span><span class="p">({</span><span class="nx">user_id</span><span class="o">:</span> <span class="o">-</span><span class="mi">1</span><span class="p">})</span>
</span></code></pre></td></tr></table></div></figure>


<p>When optimizing queries, move logic from the query into the schema.
Instead of using the query to bound a date field, we are using a date
bucket to move that date logic to the schema. In the example above, we
remove the $gte and $lt, and replace them with a bucket value representing
the entire day.</p>
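<p>For the bucketed query to work, each document needs its bucket value written at insert time. A minimal sketch of one way to do that, assuming buckets are truncated to midnight UTC; <code>toDayBucket</code> and <code>withDayBucket</code> are hypothetical helper names, and only the <code>day_bucket</code> field name comes from the query above:</p>

```javascript
// Hypothetical helper: truncate a timestamp to midnight UTC so that every
// event from the same day shares one exact-match value.
function toDayBucket(date) {
  return new Date(Date.UTC(date.getUTCFullYear(), date.getUTCMonth(), date.getUTCDate()));
}

// Hypothetical helper: copy an event and attach its day_bucket before inserting it.
function withDayBucket(event) {
  var copy = {};
  Object.keys(event).forEach(function (key) { copy[key] = event[key]; });
  copy.day_bucket = toDayBucket(event.time);
  return copy;
}

var event = withDayBucket({
  action: "run-process",
  user_id: 42,
  time: new Date("2013-05-06T14:30:00Z")
});

console.log(event.day_bucket.toISOString()); // 2013-05-06T00:00:00.000Z
```

<p>With the bucket stored on the document, the date condition becomes an equality match, which leaves the sort as the only multi-value operation in the query.</p>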

<p>These indexing constraints will set you free. By following these
constraints, you will model data and build a database that can be scaled
in a sharded/distributed environment. These constraints require you to
be creative with your schemas.</p>

<h2>Too Many Indexes</h2>

<p>Everyone thinks of the new indexes to add. Not everyone thinks of
indexes to be removed.</p>

<p>Recently, I helped a customer optimize his database. Write lock on the
database was running consistently at 95%. CPU was spiking consistently,
and making for a poor experience. We looked at the indexes, and
determined the customer had too many indexes (126 non-_id indexes). 24
of the indexes were on one collection. For each insert/update which
modified keys, MongoDB was writing the document, and updating 24
indexes. This caused the excessive write lock.</p>

<p>The first action was to delete all of his indexes (do not try this on a
database larger than 15 GB). Immediately after deleting the indexes,
write lock fell to 0%. A tradeoff for removing indexes was going to disk
for queries. However, disk reads were faster with no indexes than when
performing writes against too many indexes.</p>

<p>Once we removed these indexes, we turned on system profiling in MongoHQ
from the Database > Admin page. We let the database run for about 6
hours, and then ran a short analysis on the system.profile. In the
analysis, we found 6 essential indexes. After adding the 6 essential
indexes, performance was drastically better. In summary, we removed 124
indexes, and added 6 back as required.</p>

<p>For the sake of RAM and write lock, be precise with your indexes.</p>

<h2>Best Practices</h2>

<ul>
<li>Know thy constraints above</li>
<li>Learn to use explain: <a href="http://blog.mongohq.com/blog/2013/02/05/explaining-explain/">explain.explain() - Understanding mongo query behavior</a></li>
<li>Limit the number of indexes on a collection. The first index will not hurt performance. The 24th index will hurt performance. It is up to you to know what is acceptable.</li>
<li>Keep track of your indexes and queries. MongoDB does not have an index usage counter. If you think one would be valuable, go here to vote! <a href="https://jira.mongodb.org/browse/SERVER-2227">https://jira.mongodb.org/browse/SERVER-2227</a></li>
</ul>

]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[MongoDB Atlanta 2013]]></title>
    <link href="http://blog.mongohq.com/blog/2013/04/23/mongodb-atlanta-2013/"/>
    <updated>2013-04-23T09:16:00-05:00</updated>
    <id>http://blog.mongohq.com/blog/2013/04/23/mongodb-atlanta-2013</id>
    <content type="html"><![CDATA[<p><img src="http://f.cl.ly/items/2c3c0X0F1o0g1e1x1f1z/BIi2nanCAAMX0-C.jpeg"></p>

<p>The MongoHQ Team is in Atlanta today, sponsoring and presenting at MongoDB Atlanta. This conference, in its third year at <a href="http://www.gtri.gatech.edu/">Georgia Tech Research Institute</a> (GTRI), has a host of great speakers, including our own Chris Winslett. I&#8217;m sure the southern food and hospitality will be enjoyed as well. Check out some of the great talks that are happening today:</p>

<h3>MongoDB Index Constraints and Creative Schemas</h3>

<p>The first step to understanding scaling with MongoDB is understanding the constraints of the system. We will build out the constraints of MongoDB indexing. Then, we will talk about how to use these constraints to optimize schema. <em>Presented by: Chris Winslett, <a href="http://www.mongohq.com">MongoHQ</a></em>.</p>

<h3>Developer Happiness Through MongoDB</h3>

<p>MongoDB is the ideal data store for modern web apps. By providing freedom of choice, a sense of creative control, and a strong adaptability to change, MongoDB makes developers happier. Problems that are tedious or unnecessarily complex to model with a relational database are instead a joy to solve with a schema-less, document-oriented structure. Those factors along with a suite of built-in tools like full-text search and aggregations contribute to an excellent experience for web developers. <em>Presented by: Luigi Montanez, <a href="http://www.upworthy.com">Upworthy</a></em>.</p>

<h3>Hash-based Sharding in MongoDB 2.4</h3>

<p>In version 2.4, MongoDB introduces hash-based sharding, a new option for distributing data in sharded collections. Hash-based sharding and range-based sharding present different advantages for MongoDB users deploying large scale systems. In this talk, we&#8217;ll provide an overview of this new feature and discuss when to use hash-based sharding or range-based sharding. <em>Presented by: Kelly Stirman, Director of Product Marketing, <a href="http://www.10gen.com">10gen</a></em>.</p>

<p>The team at MongoHQ is excited to be a part of this great event, and we are looking forward to participating in the upcoming MongoSF and MongoNYC events as well.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Converting Data to Metrics - MongoDB Analytics Part 2]]></title>
    <link href="http://blog.mongohq.com/blog/2013/04/05/converting-data-to-metrics-mongodb-analytics-part-2/"/>
    <updated>2013-04-05T14:36:00-05:00</updated>
    <id>http://blog.mongohq.com/blog/2013/04/05/converting-data-to-metrics-mongodb-analytics-part-2</id>
    <content type="html"><![CDATA[<p>In part one of &#8220;<a href="http://blog.mongohq.com/blog/2013/01/29/constructing-analytics-with-mongohq/">First Steps of an Analytics Platform with MongoDB</a>&#8221;, we discussed how to build an efficient logging portion of an analytics system using time buckets or time dimension cubes.  The next logical step is summating the data from the logs into cacheable values.  Toward the end, we showed a simple common example:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
</pre></td><td class='code'><pre><code class='js'><span class='line'><span class="nx">db</span><span class="p">.</span><span class="nx">events</span><span class="p">.</span><span class="nx">aggregate</span><span class="p">(</span>
</span><span class='line'>  <span class="p">{</span><span class="nx">$match</span><span class="o">:</span> <span class="p">{</span><span class="nx">time_bucket</span><span class="o">:</span> <span class="s2">&quot;2013-01-month&quot;</span><span class="p">}},</span>
</span><span class='line'>  <span class="p">{</span><span class="nx">$unwind</span><span class="o">:</span> <span class="s2">&quot;$time_bucket&quot;</span><span class="p">},</span>
</span><span class='line'>  <span class="p">{</span><span class="nx">$project</span><span class="o">:</span> <span class="p">{</span><span class="nx">time_bucket_event</span><span class="o">:</span> <span class="p">{</span><span class="nx">$concat</span><span class="o">:</span> <span class="p">[</span><span class="s2">&quot;$time_bucket&quot;</span><span class="p">,</span> <span class="s2">&quot;/&quot;</span><span class="p">,</span> <span class="s2">&quot;$event&quot;</span><span class="p">]}}},</span>
</span><span class='line'>  <span class="p">{</span><span class="nx">$group</span><span class="o">:</span> <span class="p">{</span><span class="nx">_id</span><span class="o">:</span> <span class="s2">&quot;$time_bucket_event&quot;</span><span class="p">,</span> <span class="nx">event_count</span><span class="o">:</span> <span class="p">{</span><span class="s2">&quot;$sum&quot;</span><span class="o">:</span> <span class="mi">1</span><span class="p">}}}</span>
</span><span class='line'><span class="p">)</span>
</span></code></pre></td></tr></table></div></figure>


<p>Let&#8217;s take a step back and look at the different options: Aggregation Framework or Map Reduce?</p>

<h2>Aggregation Framework or Map Reduce?</h2>

<p>Since the 2.2.x release, the aggregation framework has been the de facto MongoDB number cruncher.  If you are familiar with a SQL db&#8217;s <code>group by</code> functions, you will be at home with its functions.  Performance-wise, the Aggregation Framework smokes MapReduce, <a href="http://stackoverflow.com/questions/13908438/is-mongodb-aggregation-framework-faster-that-map-reduce">like not even close</a>.</p>

<p>Unless the data manipulation functions you require are missing from the current version, choose the Aggregation Framework.  For instance, some types of statistical analysis may be difficult with the Aggregation Framework.  Averages are easy.  Any type of deviation calculation will require two aggregation calls: one for averaging and the other for the deviation calculation.</p>
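<p>To see why a deviation needs two passes, here is a plain JavaScript sketch of the arithmetic rather than an aggregation pipeline. The function names and sample values are illustrative assumptions; the point is that the second pass cannot start until the first pass has produced the mean:</p>

```javascript
// Pass 1: compute the average of the values.
function average(values) {
  var sum = values.reduce(function (acc, v) { return acc + v; }, 0);
  return sum / values.length;
}

// Pass 2: use the average from pass 1 to compute the
// population standard deviation.
function standardDeviation(values) {
  var mean = average(values);
  var squaredDiffs = values.reduce(function (acc, v) {
    return acc + (v - mean) * (v - mean);
  }, 0);
  return Math.sqrt(squaredDiffs / values.length);
}

var sample = [2, 4, 4, 4, 5, 5, 7, 9];
console.log(average(sample));           // 5
console.log(standardDeviation(sample)); // 2
```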

<p>In sharded MongoDB environments with aggregation framework, you will get the benefit of distributed processing at the data node level as you did with map reduces.  Thus, each data node will return the summated results, and the <code>mongos</code> will concatenate and process the results returned from data nodes.</p>

<h2>Getting started with Aggregation Framework</h2>

<p>Aggregation framework is a series of actions, also known as a &#8220;pipeline&#8221;.  This pipeline is processed in order and each action can filter or manipulate data.  For instance, if you wanted to filter data and concat a variable called <code>name</code>:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
</pre></td><td class='code'><pre><code class='js'><span class='line'><span class="nx">db</span><span class="p">.</span><span class="nx">my_collection</span><span class="p">.</span><span class="nx">aggregate</span><span class="p">([</span>
</span><span class='line'>  <span class="p">{</span><span class="nx">$match</span><span class="o">:</span> <span class="p">{</span><span class="nx">first_name</span><span class="o">:</span> <span class="s2">&quot;Clark&quot;</span><span class="p">}},</span>
</span><span class='line'>  <span class="p">{</span><span class="nx">$project</span><span class="o">:</span> <span class="p">{</span><span class="nx">_id</span><span class="o">:</span> <span class="s2">&quot;$id&quot;</span><span class="p">,</span> <span class="nx">full_name</span><span class="o">:</span> <span class="p">{</span><span class="nx">$concat</span><span class="o">:</span> <span class="p">[</span><span class="s2">&quot;$first_name&quot;</span><span class="p">,</span> <span class="s2">&quot; &quot;</span><span class="p">,</span> <span class="s2">&quot;$last_name&quot;</span><span class="p">]}}}</span>
</span><span class='line'><span class="p">])</span>
</span></code></pre></td></tr></table></div></figure>


<p>The above will return a full name for all individuals with first_name &#8220;Clark&#8221;.  For a complete list of the functions with the aggregation framework, see the <a href="http://docs.mongodb.org/manual/reference/aggregation/">10gen Aggregation Framework Reference</a>.</p>

<h2>Processing and Caching</h2>

<p>Repeat after me: cache aggregation queries.  While running aggregation queries is fast, it is best not to run these commands for every application action.  As with most activities, running this command once is fast.  Running this command 1000s of times per second will deteriorate performance in the rest of the stack. Also, these analytics systems form the backbone of big data; the data is small now, but in production these systems grow quickly.</p>

<p>So, we will run these aggregation commands once, and cache the results.  For best results on application performance, try running these actions asynchronously.  Trigger the Aggregation Framework through a background worker that will run and save the updated values to your aggregation collection.</p>

<p>You will need to run two commands to save the output: 1) the aggregation command and 2) the upsert command.  Upsert is a good fit here: it updates the cached value, or inserts it if it does not exist.</p>
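<p>The second step could be sketched as follows. This is only an illustration under stated assumptions: <code>FakeCollection</code> is a small in-memory stand-in so the example runs without a database, <code>cacheAggregation</code> is a hypothetical helper name, and the result rows are assumed to have the <code>_id</code> and <code>event_count</code> fields produced by the <code>$group</code> stage shown earlier. Against a real collection you would pass the collection itself and rely on its own <code>update</code> with <code>{upsert: true}</code>:</p>

```javascript
// In-memory stand-in for a collection (an assumption for this sketch only).
// It supports just enough of update(query, update, options) to show an upsert.
function FakeCollection() {
  this.docs = {};
}

FakeCollection.prototype.update = function (query, update, options) {
  var existing = this.docs[query._id];
  if (!existing && !(options && options.upsert)) {
    return; // nothing matched and upsert was not requested
  }
  var doc = existing || {_id: query._id};
  Object.keys(update.$set).forEach(function (key) {
    doc[key] = update.$set[key];
  });
  this.docs[query._id] = doc;
};

// Hypothetical helper: write each aggregation result row into the cache,
// updating the row if it exists and inserting it if it does not.
function cacheAggregation(results, cache) {
  results.forEach(function (row) {
    cache.update(
      {_id: row._id},
      {$set: {event_count: row.event_count}},
      {upsert: true}
    );
  });
}

var cache = new FakeCollection();
cacheAggregation([{_id: "2013-01-month/login", event_count: 10}], cache); // inserts
cacheAggregation([{_id: "2013-01-month/login", event_count: 12}], cache); // updates
console.log(cache.docs["2013-01-month/login"].event_count); // 12
```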

<h2>Measurement and Summary</h2>

<p>After storing data in a best practice way, summating and caching data is the next step for any analytics platform.  There is a right way and many wrong ways, and simple algorithm changes can yield good performance gains.  While you are measuring your ability to run these analytics systems, you should also measure the performance of these summation queries.  Build the measurement of the queries while you are building the query &#8211; perhaps in the same background job.</p>

<p>I can guarantee your summation jobs will behave differently with 50 GB of data versus 200 MB of data in a development environment. If you do not measure these summation jobs, you will wake up one day with performance problems and no historical record.  How you got to that point is unknowable without metrics from the beginning.  Furthermore, knowing what is &#8220;good&#8221; is also a shot in the dark.  Measure.  Measure.  Measure.</p>

<p>Go forth, take your data, and summate.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[MongoDB 2.4.1 maintenance window complete]]></title>
    <link href="http://blog.mongohq.com/blog/2013/03/27/mongodb-2-dot-4-1-maintenance-window-complete/"/>
    <updated>2013-03-27T17:20:00-05:00</updated>
    <id>http://blog.mongohq.com/blog/2013/03/27/mongodb-2-dot-4-1-maintenance-window-complete</id>
    <content type="html"><![CDATA[<p>The maintenance window to upgrade MongoDB for all of our shared environments has been completed.</p>

<h3>Why the forced upgrade?</h3>

<p>This emergency upgrade was required because of a vulnerability affecting older versions of MongoDB (MongoDB 2.0 and MongoDB 2.2).  The vulnerability (<a href="https://jira.mongodb.org/browse/SERVER-9124">SERVER-9124</a>) was caused by deficiencies in JavaScript sandboxing that made it possible for system commands to be run and, if properly exploited, could have resulted in access to other data on the host.</p>

<p>Working with 10gen and understanding the core issue, we made the decision to move quickly to upgrade all of our shared database plans, starting with large databases and finishing, today, with our sandbox/free plans.</p>

<h3>Affected environments</h3>

<p>The vast majority of our hosts (including all custom deployments) are not at risk from this vulnerability. We&#8217;ve isolated Mongo processes on most of our shared hosts in such a way that user data is isolated and individual database vulnerabilities are contained. Some of our older shared hosts haven&#8217;t yet been migrated to the new containers, and we&#8217;re now speeding up our efforts to replace those. Sandbox database hosts are, by definition, more susceptible to problems like this since multiple users share a single Mongo process, so we will continue to aggressively upgrade Mongo versions on these.</p>

<h3>Email notification</h3>

<p>We received reports of some customers not receiving notifications of the upgrade. We are very sorry about this and have resolved the issue. We strive to provide an excellent customer experience and proper communication is an absolute must.</p>

<p>Thank you again for your patience. We have talked to many of you today as we worked through this upgrade together, but if you are having any additional issues or have questions, we&#8217;d love to help. You can reach our team at: <strong>support@mongohq.com</strong>.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[MongoDB 2.4.1 now available on MongoHQ]]></title>
    <link href="http://blog.mongohq.com/blog/2013/03/23/mongodb-2-dot-4-1-now-available/"/>
    <updated>2013-03-23T17:10:00-05:00</updated>
    <id>http://blog.mongohq.com/blog/2013/03/23/mongodb-2-dot-4-1-now-available</id>
    <content type="html"><![CDATA[<p>MongoDB 2.4.1 has been released and it is an important update to MongoDB 2.4.0, addressing secondary index issues with MongoDB replica sets. This should only affect any replica sets that were upgraded and then had a member re-synced or added after the upgrade to MongoDB 2.4.0.</p>

<p>We have updated MongoHQ to include MongoDB 2.4.1 and you can safely update your replica sets. If you are using 2.4.0 and have a single/standalone instance of MongoDB, this release does not affect you.</p>

<p>To upgrade, visit the MongoHQ version selector on <a href="http://new.mongohq.com">MongoHQ.com</a> or <a href="http://bridge.mongohq.com/signup">sign up for a MongoHQ account</a> and try out MongoDB 2.4.1.</p>

<p>If you are interested in finding out more information about this issue, view <a href="http://jira.mongodb.org/browse/SERVER-9087">SERVER-9087</a> on the MongoDB Jira site.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[MongoDB 2.4 and the new version selector]]></title>
    <link href="http://blog.mongohq.com/blog/2013/03/19/mongodb-2-dot-4-and-the-new-version-selector/"/>
    <updated>2013-03-19T10:43:00-05:00</updated>
    <id>http://blog.mongohq.com/blog/2013/03/19/mongodb-2-dot-4-and-the-new-version-selector</id>
    <content type="html"><![CDATA[<p>Today, 10gen announced the final production release of MongoDB 2.4. As we have talked about in previous blog posts, this release includes a number of great new features, including: full-text search (beta), a new javascript engine, better performance on counts, hashed shard keys, and atomic fixed-length arrays. There are a number of smaller fixes and improvements as well &#8230; a really solid release.</p>

<p>Along with this, at MongoHQ we are rolling out a new feature that allows you to choose the version of MongoDB you use, simply by selecting a dropdown in the MongoHQ web interface.</p>

<p>This is how it works:</p>

<ol>
<li><p>Log into <a href="https://new.mongohq.com/">MongoHQ</a> and select the database you&#8217;d like to upgrade. (If you don&#8217;t have an account, you can <a href="http://bridge.mongohq.com/signup">create one at MongoHQ.com</a>)</p></li>
<li><p>Click on the &#8220;Admin&#8221; tab for the selected database, scroll down and choose the version of MongoDB that you want to run.</p><p><img src="http://blog.mongohq.com/images/for_posts/2013-03-19-mongodb-2-dot-4-and-the-new-version-selector/version_dropdown.png" alt="Select the version of MongoDB you want to run" /></p></li>
<li><p>Click the &#8220;Change Version&#8221; button.</p></li>
</ol>


<p>That&#8217;s it. Now, the magic happens. Ok, just kidding &#8230; but clicking the &#8220;Change Version&#8221; button will initiate a job to change the version of your MongoDB database. This is what happens:</p>

<h3>Upgrading Standalone/Single instances</h3>

<p>For people using our shared single/standalone MongoDB instances, the update process is simple. Once initiated, we will update the version of MongoDB and perform a brief restart of your database. It should take, at most, 5-10 seconds to complete. From there, you can resume normal database operations.</p>

<p><img src="http://blog.mongohq.com/images/for_posts/2013-03-19-mongodb-2-dot-4-and-the-new-version-selector/single_instance_overview.png" alt="Upgrading single server MongoDB instances" /></p>

<h3>Upgrading Replica Sets</h3>

<p>Upgrading replica sets is a bit different, so the system will perform what we call a &#8220;rolling upgrade&#8221;. This allows you to migrate to a different version of MongoDB with little to no downtime.</p>

<p>The rolling upgrade will first update all the secondary members and arbiters of the cluster one by one, changing the version, restarting and safely bringing the secondary processes back online. After all of this has happened and has been verified as successful, the system will trigger a state change of your primary to secondary and will allow MongoDB to promote a new primary.</p>

<p>Once the new primary has taken over in the cluster, the system will upgrade the old primary (now secondary) to the version of MongoDB you selected. That will complete the upgrade process.</p>

<p><img src="http://blog.mongohq.com/images/for_posts/2013-03-19-mongodb-2-dot-4-and-the-new-version-selector/rs_job_overview.png" alt="Upgrading replica set MongoDB instances" /></p>

<p><strong>Please Note:</strong> For replica sets, ensure that your driver is properly configured to handle MongoDB replica set state changes, otherwise you may have to restart your application. If you have questions about this, we can help! Just send us a note at: <strong>support@mongohq.com</strong>.</p>

<h3>Supporting our Providers</h3>

<p>Of course, if you are running MongoDB with MongoHQ on our providers, like: <a href="http://support.mongohq.com/partners/heroku.html">Heroku</a>, <a href="http://support.mongohq.com/partners/cloudbees.html">CloudBees</a>, <a href="http://support.mongohq.com/partners/appfog.html">AppFog</a>, <a href="http://support.mongohq.com/partners/appharbor.html">AppHarbor</a>, Engine Yard, Nodejitsu and others, you can use this same version selecting functionality to run MongoDB 2.4 as well.</p>

<h3>MongoDB Full-Text Search</h3>

<p>Since full-text search is considered beta for the 2.4 release of MongoDB, we do not enable it by default when you upgrade to MongoDB 2.4. If you would like to try it on larger data sets, please contact us at: support@mongohq.com and we will work with you to make this happen.</p>

<p>We hope you enjoy this great new feature and that it gives you the flexibility and control to manage the versions and features of MongoDB that you need to use.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[MongoDB 2.4 Fixed Length Arrays & Release Candidate 2]]></title>
    <link href="http://blog.mongohq.com/blog/2013/03/10/mongodb-2-dot-4-release-candidate-2/"/>
    <updated>2013-03-10T08:49:00-05:00</updated>
    <id>http://blog.mongohq.com/blog/2013/03/10/mongodb-2-dot-4-release-candidate-2</id>
    <content type="html"><![CDATA[<p>The MongoDB Experimental Plans are updated and running MongoDB 2.4 Release
Candidate 2.  One of the new features in MongoDB 2.4 is fixed length
arrays.  It will make schema design a little easier.</p>

<h2>Fixed Length Array in MongoDB</h2>

<p>With MongoDB 2.4, one of the cool features is atomic fixed length arrays.  To use them, combine the <code>$push</code>
and <code>$slice</code> functionality.  Previously, to have fixed length arrays, you would need to run two separate commands:
a <code>find</code> and an in-place <code>$set</code> of the entire array.</p>

<p>Now, you can create fixed length arrays with <code>{$push: {"field_name": {$each: ["array_of_elements"], $slice: -4}}}</code>, where
field_name is the attribute on the document, and slice is the length of the array (always negative or 0).  Essentially, the
example above will:</p>

<ul>
<li>Push to the right side of the array</li>
<li>Keep the 4 elements on the right side of the array</li>
</ul>
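<p>The two steps above can be simulated in plain JavaScript to make the behavior concrete. This is a sketch of the semantics only, not how the server implements them; <code>pushWithSlice</code> is a hypothetical helper name:</p>

```javascript
// Hypothetical helper that mimics {$push: {field: {$each: each, $slice: slice}}}
// on an in-memory array. In MongoDB 2.4, slice must be zero or negative.
function pushWithSlice(array, each, slice) {
  // Step 1: push every element of `each` onto the right side of the array.
  var combined = array.concat(each);
  // Step 2: keep only the right-most |slice| elements; zero empties the array.
  return slice === 0 ? [] : combined.slice(slice);
}

var values = [1, 2, 3, 4];
values = pushWithSlice(values, [5], -4);
console.log(values); // [ 2, 3, 4, 5 ]
```

<p>The oldest element falls off the left side each time a new one is pushed, so the array never grows past four elements.</p>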


<p><strong>An Example</strong></p>

<p>We are going to record the last 5 readings for a river gauge:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
</pre></td><td class='code'><pre><code class='js'><span class='line'><span class="o">&gt;</span> <span class="nx">doc</span> <span class="o">=</span> <span class="nx">db</span><span class="p">.</span><span class="nx">gauges</span><span class="p">.</span><span class="nx">insert</span><span class="p">({</span><span class="nx">_id</span><span class="o">:</span> <span class="nx">ObjectId</span><span class="p">(</span><span class="s2">&quot;513cc4c663c0a2303e1ec093&quot;</span><span class="p">),</span> <span class="nx">name</span><span class="o">:</span> <span class="s2">&quot;Niagra River&quot;</span><span class="p">})</span>
</span><span class='line'><span class="o">&gt;</span> <span class="nx">db</span><span class="p">.</span><span class="nx">gauges</span><span class="p">.</span><span class="nx">update</span><span class="p">({</span> <span class="nx">_id</span><span class="o">:</span> <span class="nx">ObjectId</span><span class="p">(</span><span class="s2">&quot;513cc4c663c0a2303e1ec093&quot;</span><span class="p">)},</span> <span class="p">{</span><span class="nx">$push</span><span class="o">:</span> <span class="p">{</span><span class="nx">lastest_gauge_readings</span><span class="o">:</span> <span class="p">{</span><span class="nx">$each</span><span class="o">:</span> <span class="p">[{</span><span class="nx">cubic_meter_second</span><span class="o">:</span> <span class="mi">5796</span><span class="p">,</span> <span class="nx">at</span><span class="o">:</span> <span class="k">new</span> <span class="nb">Date</span><span class="p">()}],</span> <span class="nx">$slice</span><span class="o">:</span> <span class="o">-</span><span class="mi">5</span><span class="p">}}})</span>
</span><span class='line'><span class="o">&gt;</span> <span class="nx">db</span><span class="p">.</span><span class="nx">gauges</span><span class="p">.</span><span class="nx">find</span><span class="p">()[</span><span class="mi">0</span><span class="p">]</span>
</span><span class='line'><span class="p">{</span>
</span><span class='line'>  <span class="s2">&quot;_id&quot;</span> <span class="o">:</span> <span class="nx">ObjectId</span><span class="p">(</span><span class="s2">&quot;513cc4c663c0a2303e1ec093&quot;</span><span class="p">),</span>
</span><span class='line'>  <span class="s2">&quot;lastest_gauge_readings&quot;</span> <span class="o">:</span> <span class="p">[</span>
</span><span class='line'>    <span class="p">{</span>
</span><span class='line'>      <span class="s2">&quot;cubic_meter_second&quot;</span> <span class="o">:</span> <span class="mi">5796</span><span class="p">,</span>
</span><span class='line'>      <span class="s2">&quot;at&quot;</span> <span class="o">:</span> <span class="nx">ISODate</span><span class="p">(</span><span class="s2">&quot;2013-03-10T17:46:00.810Z&quot;</span><span class="p">)</span>
</span><span class='line'>    <span class="p">}</span>
</span><span class='line'>  <span class="p">],</span>
</span><span class='line'>  <span class="s2">&quot;name&quot;</span> <span class="o">:</span> <span class="s2">&quot;Niagra River&quot;</span>
</span><span class='line'><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<p>As you can see, the <code>$push</code> syntax has changed: it now takes a more verbose, more functional form.  In the example above, <code>$push</code> appends
each element of the <code>$each</code> array to the end (right side) of the array field.  Then <code>$slice</code> keeps only the 5 rightmost elements.  Below, we run
the same update 5 more times with different values:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
<span class='line-number'>25</span>
<span class='line-number'>26</span>
<span class='line-number'>27</span>
<span class='line-number'>28</span>
<span class='line-number'>29</span>
<span class='line-number'>30</span>
<span class='line-number'>31</span>
<span class='line-number'>32</span>
</pre></td><td class='code'><pre><code class='js'><span class='line'><span class="o">&gt;</span> <span class="nx">db</span><span class="p">.</span><span class="nx">gauges</span><span class="p">.</span><span class="nx">update</span><span class="p">({</span> <span class="nx">_id</span><span class="o">:</span> <span class="nx">ObjectId</span><span class="p">(</span><span class="s2">&quot;513cc4c663c0a2303e1ec093&quot;</span><span class="p">)},</span> <span class="p">{</span><span class="nx">$push</span><span class="o">:</span> <span class="p">{</span><span class="nx">lastest_gauge_readings</span><span class="o">:</span> <span class="p">{</span><span class="nx">$each</span><span class="o">:</span> <span class="p">[{</span><span class="nx">cubic_meter_second</span><span class="o">:</span> <span class="mi">5800</span><span class="p">,</span> <span class="nx">at</span><span class="o">:</span> <span class="k">new</span> <span class="nb">Date</span><span class="p">()}],</span> <span class="nx">$slice</span><span class="o">:</span> <span class="o">-</span><span class="mi">5</span><span class="p">}}})</span>
</span><span class='line'><span class="o">&gt;</span> <span class="nx">db</span><span class="p">.</span><span class="nx">gauges</span><span class="p">.</span><span class="nx">update</span><span class="p">({</span> <span class="nx">_id</span><span class="o">:</span> <span class="nx">ObjectId</span><span class="p">(</span><span class="s2">&quot;513cc4c663c0a2303e1ec093&quot;</span><span class="p">)},</span> <span class="p">{</span><span class="nx">$push</span><span class="o">:</span> <span class="p">{</span><span class="nx">lastest_gauge_readings</span><span class="o">:</span> <span class="p">{</span><span class="nx">$each</span><span class="o">:</span> <span class="p">[{</span><span class="nx">cubic_meter_second</span><span class="o">:</span> <span class="mi">5960</span><span class="p">,</span> <span class="nx">at</span><span class="o">:</span> <span class="k">new</span> <span class="nb">Date</span><span class="p">()}],</span> <span class="nx">$slice</span><span class="o">:</span> <span class="o">-</span><span class="mi">5</span><span class="p">}}})</span>
</span><span class='line'><span class="o">&gt;</span> <span class="nx">db</span><span class="p">.</span><span class="nx">gauges</span><span class="p">.</span><span class="nx">update</span><span class="p">({</span> <span class="nx">_id</span><span class="o">:</span> <span class="nx">ObjectId</span><span class="p">(</span><span class="s2">&quot;513cc4c663c0a2303e1ec093&quot;</span><span class="p">)},</span> <span class="p">{</span><span class="nx">$push</span><span class="o">:</span> <span class="p">{</span><span class="nx">lastest_gauge_readings</span><span class="o">:</span> <span class="p">{</span><span class="nx">$each</span><span class="o">:</span> <span class="p">[{</span><span class="nx">cubic_meter_second</span><span class="o">:</span> <span class="mi">6000</span><span class="p">,</span> <span class="nx">at</span><span class="o">:</span> <span class="k">new</span> <span class="nb">Date</span><span class="p">()}],</span> <span class="nx">$slice</span><span class="o">:</span> <span class="o">-</span><span class="mi">5</span><span class="p">}}})</span>
</span><span class='line'><span class="o">&gt;</span> <span class="nx">db</span><span class="p">.</span><span class="nx">gauges</span><span class="p">.</span><span class="nx">update</span><span class="p">({</span> <span class="nx">_id</span><span class="o">:</span> <span class="nx">ObjectId</span><span class="p">(</span><span class="s2">&quot;513cc4c663c0a2303e1ec093&quot;</span><span class="p">)},</span> <span class="p">{</span><span class="nx">$push</span><span class="o">:</span> <span class="p">{</span><span class="nx">lastest_gauge_readings</span><span class="o">:</span> <span class="p">{</span><span class="nx">$each</span><span class="o">:</span> <span class="p">[{</span><span class="nx">cubic_meter_second</span><span class="o">:</span> <span class="mi">6200</span><span class="p">,</span> <span class="nx">at</span><span class="o">:</span> <span class="k">new</span> <span class="nb">Date</span><span class="p">()}],</span> <span class="nx">$slice</span><span class="o">:</span> <span class="o">-</span><span class="mi">5</span><span class="p">}}})</span>
</span><span class='line'><span class="o">&gt;</span> <span class="nx">db</span><span class="p">.</span><span class="nx">gauges</span><span class="p">.</span><span class="nx">update</span><span class="p">({</span> <span class="nx">_id</span><span class="o">:</span> <span class="nx">ObjectId</span><span class="p">(</span><span class="s2">&quot;513cc4c663c0a2303e1ec093&quot;</span><span class="p">)},</span> <span class="p">{</span><span class="nx">$push</span><span class="o">:</span> <span class="p">{</span><span class="nx">lastest_gauge_readings</span><span class="o">:</span> <span class="p">{</span><span class="nx">$each</span><span class="o">:</span> <span class="p">[{</span><span class="nx">cubic_meter_second</span><span class="o">:</span> <span class="mi">6300</span><span class="p">,</span> <span class="nx">at</span><span class="o">:</span> <span class="k">new</span> <span class="nb">Date</span><span class="p">()}],</span> <span class="nx">$slice</span><span class="o">:</span> <span class="o">-</span><span class="mi">5</span><span class="p">}}})</span>
</span><span class='line'><span class="o">&gt;</span> <span class="nx">db</span><span class="p">.</span><span class="nx">gauges</span><span class="p">.</span><span class="nx">find</span><span class="p">()[</span><span class="mi">0</span><span class="p">]</span>
</span><span class='line'><span class="p">{</span>
</span><span class='line'>  <span class="s2">&quot;_id&quot;</span> <span class="o">:</span> <span class="nx">ObjectId</span><span class="p">(</span><span class="s2">&quot;513cc4c663c0a2303e1ec093&quot;</span><span class="p">),</span>
</span><span class='line'>  <span class="s2">&quot;lastest_gauge_readings&quot;</span> <span class="o">:</span> <span class="p">[</span>
</span><span class='line'>    <span class="p">{</span>
</span><span class='line'>      <span class="s2">&quot;cubic_meter_second&quot;</span> <span class="o">:</span> <span class="mi">5800</span><span class="p">,</span>
</span><span class='line'>      <span class="s2">&quot;at&quot;</span> <span class="o">:</span> <span class="nx">ISODate</span><span class="p">(</span><span class="s2">&quot;2013-03-10T17:48:31.508Z&quot;</span><span class="p">)</span>
</span><span class='line'>    <span class="p">},</span>
</span><span class='line'>    <span class="p">{</span>
</span><span class='line'>      <span class="s2">&quot;cubic_meter_second&quot;</span> <span class="o">:</span> <span class="mi">5960</span><span class="p">,</span>
</span><span class='line'>      <span class="s2">&quot;at&quot;</span> <span class="o">:</span> <span class="nx">ISODate</span><span class="p">(</span><span class="s2">&quot;2013-03-10T17:48:37.268Z&quot;</span><span class="p">)</span>
</span><span class='line'>    <span class="p">},</span>
</span><span class='line'>    <span class="p">{</span>
</span><span class='line'>      <span class="s2">&quot;cubic_meter_second&quot;</span> <span class="o">:</span> <span class="mi">6000</span><span class="p">,</span>
</span><span class='line'>      <span class="s2">&quot;at&quot;</span> <span class="o">:</span> <span class="nx">ISODate</span><span class="p">(</span><span class="s2">&quot;2013-03-10T17:48:42.308Z&quot;</span><span class="p">)</span>
</span><span class='line'>    <span class="p">},</span>
</span><span class='line'>    <span class="p">{</span>
</span><span class='line'>      <span class="s2">&quot;cubic_meter_second&quot;</span> <span class="o">:</span> <span class="mi">6200</span><span class="p">,</span>
</span><span class='line'>      <span class="s2">&quot;at&quot;</span> <span class="o">:</span> <span class="nx">ISODate</span><span class="p">(</span><span class="s2">&quot;2013-03-10T17:48:45.917Z&quot;</span><span class="p">)</span>
</span><span class='line'>    <span class="p">},</span>
</span><span class='line'>    <span class="p">{</span>
</span><span class='line'>      <span class="s2">&quot;cubic_meter_second&quot;</span> <span class="o">:</span> <span class="mi">6300</span><span class="p">,</span>
</span><span class='line'>      <span class="s2">&quot;at&quot;</span> <span class="o">:</span> <span class="nx">ISODate</span><span class="p">(</span><span class="s2">&quot;2013-03-10T17:48:49.247Z&quot;</span><span class="p">)</span>
</span><span class='line'>    <span class="p">}</span>
</span><span class='line'>  <span class="p">],</span>
</span><span class='line'>  <span class="s2">&quot;name&quot;</span> <span class="o">:</span> <span class="s2">&quot;Niagra River&quot;</span>
</span><span class='line'><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<p>With a value of <em>-n</em>, the <code>$slice</code> modifier keeps only the <em>n</em> rightmost (most recently pushed) elements of the array.</p>
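
<p>The behaviour of <code>$each</code> and <code>$slice</code> shown above can be modelled in plain JavaScript. This is only a sketch to illustrate the semantics, not how MongoDB implements them; the function name <code>pushEachSlice</code> is made up for this example, and only the reading values from the session above are used.</p>

```javascript
// Plain-JavaScript model of {$push: {field: {$each: items, $slice: -n}}}.
// Illustration only: pushEachSlice is a hypothetical helper, not a MongoDB API.
function pushEachSlice(current, items, slice) {
  // $each: append every new element to the end of the existing array
  var combined = (current || []).concat(items);
  // $slice: 0 empties the array, -n keeps the last n elements
  return slice === 0 ? [] : combined.slice(slice);
}

// Replay the six readings pushed in the shell session above
var readings = [];
[5796, 5800, 5960, 6000, 6200, 6300].forEach(function (value) {
  readings = pushEachSlice(readings, [{ cubic_meter_second: value }], -5);
});

// The oldest reading (5796) has been dropped; the five newest remain
console.log(readings.map(function (r) { return r.cubic_meter_second; }));
```

<p>The array never grows past five elements, which is what makes this pattern useful for "latest <em>n</em>" fields.</p>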

<h3>Sorted Fixed Length Arrays in MongoDB</h3>

<p>If you like Redis&#8217;s sorted-set functionality (<code>zscore</code> and friends) and want something similar in MongoDB, the new <code>$push</code> modifiers in MongoDB 2.4 bring sorted arrays to MongoDB.</p>

<p><strong>Common scenarios for this include:</strong></p>

<ul>
<li>Content management system - related posts, top <em>n</em> referring products</li>
<li>E-commerce storefront - top <em>n</em> referring products</li>
<li>Social Aggregation - top <em>n</em> retweeted status updates</li>
</ul>


<p>Sorted arrays are a nice way to cache ranked values in the document itself, instead of running a sort query each time they are needed.  Prefer them to sort queries against entire collections.  We will continue to use our river gauge example on the Niagara River:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='js'><span class='line'><span class="o">&gt;</span> <span class="nx">db</span><span class="p">.</span><span class="nx">gauges</span><span class="p">.</span><span class="nx">update</span><span class="p">({</span> <span class="nx">_id</span><span class="o">:</span> <span class="nx">ObjectId</span><span class="p">(</span><span class="s2">&quot;513cc4c663c0a2303e1ec093&quot;</span><span class="p">)},</span> <span class="p">{</span><span class="nx">$push</span><span class="o">:</span> <span class="p">{</span><span class="nx">highest_gauge_readings</span><span class="o">:</span> <span class="p">{</span><span class="nx">$each</span><span class="o">:</span> <span class="p">[{</span><span class="nx">cubic_meter_second</span><span class="o">:</span> <span class="mi">5796</span><span class="p">,</span> <span class="nx">at</span><span class="o">:</span> <span class="k">new</span> <span class="nb">Date</span><span class="p">()}],</span> <span class="nx">$sort</span><span class="o">:</span> <span class="p">{</span><span class="nx">cubic_meter_second</span><span class="o">:</span> <span class="mi">1</span><span class="p">},</span> <span class="nx">$slice</span><span class="o">:</span> <span class="o">-</span><span class="mi">5</span><span class="p">}}})</span>
</span></code></pre></td></tr></table></div></figure>


<p>Above, we use <code>$sort</code> to specify the key(s) to sort by, then <code>$slice</code> to keep the last 5 elements of the sorted array.  The session below shows that push again, followed by four more:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
<span class='line-number'>25</span>
<span class='line-number'>26</span>
<span class='line-number'>27</span>
<span class='line-number'>28</span>
<span class='line-number'>29</span>
<span class='line-number'>30</span>
<span class='line-number'>31</span>
<span class='line-number'>32</span>
</pre></td><td class='code'><pre><code class='js'><span class='line'><span class="o">&gt;</span> <span class="nx">db</span><span class="p">.</span><span class="nx">gauges</span><span class="p">.</span><span class="nx">update</span><span class="p">({</span> <span class="nx">_id</span><span class="o">:</span> <span class="nx">ObjectId</span><span class="p">(</span><span class="s2">&quot;513cc4c663c0a2303e1ec093&quot;</span><span class="p">)},</span> <span class="p">{</span><span class="nx">$push</span><span class="o">:</span> <span class="p">{</span><span class="nx">highest_gauge_readings</span><span class="o">:</span> <span class="p">{</span><span class="nx">$each</span><span class="o">:</span> <span class="p">[{</span><span class="nx">cubic_meter_second</span><span class="o">:</span> <span class="mi">5796</span><span class="p">,</span> <span class="nx">at</span><span class="o">:</span> <span class="k">new</span> <span class="nb">Date</span><span class="p">()}],</span> <span class="nx">$sort</span><span class="o">:</span> <span class="p">{</span><span class="nx">cubic_meter_second</span><span class="o">:</span> <span class="mi">1</span><span class="p">},</span> <span class="nx">$slice</span><span class="o">:</span> <span class="o">-</span><span class="mi">5</span><span class="p">}}})</span>
</span><span class='line'><span class="o">&gt;</span> <span class="nx">db</span><span class="p">.</span><span class="nx">gauges</span><span class="p">.</span><span class="nx">update</span><span class="p">({</span> <span class="nx">_id</span><span class="o">:</span> <span class="nx">ObjectId</span><span class="p">(</span><span class="s2">&quot;513cc4c663c0a2303e1ec093&quot;</span><span class="p">)},</span> <span class="p">{</span><span class="nx">$push</span><span class="o">:</span> <span class="p">{</span><span class="nx">highest_gauge_readings</span><span class="o">:</span> <span class="p">{</span><span class="nx">$each</span><span class="o">:</span> <span class="p">[{</span><span class="nx">cubic_meter_second</span><span class="o">:</span> <span class="mi">5900</span><span class="p">,</span> <span class="nx">at</span><span class="o">:</span> <span class="k">new</span> <span class="nb">Date</span><span class="p">()}],</span> <span class="nx">$sort</span><span class="o">:</span> <span class="p">{</span><span class="nx">cubic_meter_second</span><span class="o">:</span> <span class="mi">1</span><span class="p">},</span> <span class="nx">$slice</span><span class="o">:</span> <span class="o">-</span><span class="mi">5</span><span class="p">}}})</span>
</span><span class='line'><span class="o">&gt;</span> <span class="nx">db</span><span class="p">.</span><span class="nx">gauges</span><span class="p">.</span><span class="nx">update</span><span class="p">({</span> <span class="nx">_id</span><span class="o">:</span> <span class="nx">ObjectId</span><span class="p">(</span><span class="s2">&quot;513cc4c663c0a2303e1ec093&quot;</span><span class="p">)},</span> <span class="p">{</span><span class="nx">$push</span><span class="o">:</span> <span class="p">{</span><span class="nx">highest_gauge_readings</span><span class="o">:</span> <span class="p">{</span><span class="nx">$each</span><span class="o">:</span> <span class="p">[{</span><span class="nx">cubic_meter_second</span><span class="o">:</span> <span class="mi">5600</span><span class="p">,</span> <span class="nx">at</span><span class="o">:</span> <span class="k">new</span> <span class="nb">Date</span><span class="p">()}],</span> <span class="nx">$sort</span><span class="o">:</span> <span class="p">{</span><span class="nx">cubic_meter_second</span><span class="o">:</span> <span class="mi">1</span><span class="p">},</span> <span class="nx">$slice</span><span class="o">:</span> <span class="o">-</span><span class="mi">5</span><span class="p">}}})</span>
</span><span class='line'><span class="o">&gt;</span> <span class="nx">db</span><span class="p">.</span><span class="nx">gauges</span><span class="p">.</span><span class="nx">update</span><span class="p">({</span> <span class="nx">_id</span><span class="o">:</span> <span class="nx">ObjectId</span><span class="p">(</span><span class="s2">&quot;513cc4c663c0a2303e1ec093&quot;</span><span class="p">)},</span> <span class="p">{</span><span class="nx">$push</span><span class="o">:</span> <span class="p">{</span><span class="nx">highest_gauge_readings</span><span class="o">:</span> <span class="p">{</span><span class="nx">$each</span><span class="o">:</span> <span class="p">[{</span><span class="nx">cubic_meter_second</span><span class="o">:</span> <span class="mi">2500</span><span class="p">,</span> <span class="nx">at</span><span class="o">:</span> <span class="k">new</span> <span class="nb">Date</span><span class="p">()}],</span> <span class="nx">$sort</span><span class="o">:</span> <span class="p">{</span><span class="nx">cubic_meter_second</span><span class="o">:</span> <span class="mi">1</span><span class="p">},</span> <span class="nx">$slice</span><span class="o">:</span> <span class="o">-</span><span class="mi">5</span><span class="p">}}})</span>
</span><span class='line'><span class="o">&gt;</span> <span class="nx">db</span><span class="p">.</span><span class="nx">gauges</span><span class="p">.</span><span class="nx">update</span><span class="p">({</span> <span class="nx">_id</span><span class="o">:</span> <span class="nx">ObjectId</span><span class="p">(</span><span class="s2">&quot;513cc4c663c0a2303e1ec093&quot;</span><span class="p">)},</span> <span class="p">{</span><span class="nx">$push</span><span class="o">:</span> <span class="p">{</span><span class="nx">highest_gauge_readings</span><span class="o">:</span> <span class="p">{</span><span class="nx">$each</span><span class="o">:</span> <span class="p">[{</span><span class="nx">cubic_meter_second</span><span class="o">:</span> <span class="mi">7500</span><span class="p">,</span> <span class="nx">at</span><span class="o">:</span> <span class="k">new</span> <span class="nb">Date</span><span class="p">()}],</span> <span class="nx">$sort</span><span class="o">:</span> <span class="p">{</span><span class="nx">cubic_meter_second</span><span class="o">:</span> <span class="mi">1</span><span class="p">},</span> <span class="nx">$slice</span><span class="o">:</span> <span class="o">-</span><span class="mi">5</span><span class="p">}}})</span>
</span><span class='line'><span class="o">&gt;</span> <span class="nx">db</span><span class="p">.</span><span class="nx">gauges</span><span class="p">.</span><span class="nx">find</span><span class="p">()[</span><span class="mi">0</span><span class="p">]</span>
</span><span class='line'><span class="p">{</span>
</span><span class='line'>  <span class="s2">&quot;_id&quot;</span> <span class="o">:</span> <span class="nx">ObjectId</span><span class="p">(</span><span class="s2">&quot;513cc4c663c0a2303e1ec093&quot;</span><span class="p">),</span>
</span><span class='line'>  <span class="s2">&quot;highest_gauge_readings&quot;</span> <span class="o">:</span> <span class="p">[</span>
</span><span class='line'>    <span class="p">{</span>
</span><span class='line'>      <span class="s2">&quot;cubic_meter_second&quot;</span> <span class="o">:</span> <span class="mi">2500</span><span class="p">,</span>
</span><span class='line'>      <span class="s2">&quot;at&quot;</span> <span class="o">:</span> <span class="nx">ISODate</span><span class="p">(</span><span class="s2">&quot;2013-03-10T17:59:59.315Z&quot;</span><span class="p">)</span>
</span><span class='line'>    <span class="p">},</span>
</span><span class='line'>    <span class="p">{</span>
</span><span class='line'>      <span class="s2">&quot;cubic_meter_second&quot;</span> <span class="o">:</span> <span class="mi">5600</span><span class="p">,</span>
</span><span class='line'>      <span class="s2">&quot;at&quot;</span> <span class="o">:</span> <span class="nx">ISODate</span><span class="p">(</span><span class="s2">&quot;2013-03-10T17:59:51.695Z&quot;</span><span class="p">)</span>
</span><span class='line'>    <span class="p">},</span>
</span><span class='line'>    <span class="p">{</span>
</span><span class='line'>      <span class="s2">&quot;cubic_meter_second&quot;</span> <span class="o">:</span> <span class="mi">5796</span><span class="p">,</span>
</span><span class='line'>      <span class="s2">&quot;at&quot;</span> <span class="o">:</span> <span class="nx">ISODate</span><span class="p">(</span><span class="s2">&quot;2013-03-10T17:59:11.096Z&quot;</span><span class="p">)</span>
</span><span class='line'>    <span class="p">},</span>
</span><span class='line'>    <span class="p">{</span>
</span><span class='line'>      <span class="s2">&quot;cubic_meter_second&quot;</span> <span class="o">:</span> <span class="mi">5900</span><span class="p">,</span>
</span><span class='line'>      <span class="s2">&quot;at&quot;</span> <span class="o">:</span> <span class="nx">ISODate</span><span class="p">(</span><span class="s2">&quot;2013-03-10T17:59:46.215Z&quot;</span><span class="p">)</span>
</span><span class='line'>    <span class="p">},</span>
</span><span class='line'>    <span class="p">{</span>
</span><span class='line'>      <span class="s2">&quot;cubic_meter_second&quot;</span> <span class="o">:</span> <span class="mi">7500</span><span class="p">,</span>
</span><span class='line'>      <span class="s2">&quot;at&quot;</span> <span class="o">:</span> <span class="nx">ISODate</span><span class="p">(</span><span class="s2">&quot;2013-03-10T18:00:05.345Z&quot;</span><span class="p">)</span>
</span><span class='line'>    <span class="p">}</span>
</span><span class='line'>  <span class="p">],</span>
</span><span class='line'>  <span class="s2">&quot;name&quot;</span> <span class="o">:</span> <span class="s2">&quot;Niagra River&quot;</span>
</span><span class='line'><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>
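
<p>To make the order of operations explicit (append, then sort, then slice), here is a rough plain-JavaScript model of the update above. It is a sketch of the semantics only, not MongoDB code: <code>pushEachSortSlice</code> is a hypothetical helper, and the final reading of 6100 is an invented value added to show what happens once the array is full.</p>

```javascript
// Plain-JavaScript model of
// {$push: {field: {$each: items, $sort: {key: direction}, $slice: -n}}}.
// Illustration only: pushEachSortSlice is a hypothetical helper, not a MongoDB API.
function pushEachSortSlice(current, items, sortKey, direction, slice) {
  // $each: append the new elements
  var combined = (current || []).concat(items);
  // $sort: order by the given key (1 for ascending, -1 for descending)
  combined.sort(function (a, b) { return direction * (a[sortKey] - b[sortKey]); });
  // $slice: 0 empties the array, -n keeps the last n elements after sorting
  return slice === 0 ? [] : combined.slice(slice);
}

function values(list) {
  return list.map(function (r) { return r.cubic_meter_second; });
}

// The five readings that appear in the output above, in the order they were pushed
var highest = [];
[5796, 5900, 5600, 2500, 7500].forEach(function (value) {
  highest = pushEachSortSlice(highest, [{ cubic_meter_second: value }], 'cubic_meter_second', 1, -5);
});
console.log(values(highest)); // [ 2500, 5600, 5796, 5900, 7500 ]

// A sixth, invented reading pushes the lowest value (2500) out of the array
highest = pushEachSortSlice(highest, [{ cubic_meter_second: 6100 }], 'cubic_meter_second', 1, -5);
console.log(values(highest)); // [ 5600, 5796, 5900, 6100, 7500 ]
```

<p>Because the sort is ascending and the slice is negative, the array always holds the five highest readings seen so far.</p>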


<h2>Other Features</h2>

<p>If you are looking for a comprehensive list of MongoDB features, take a look at <a href="https://jira.mongodb.org/secure/ReleaseNote.jspa?projectId=10380&amp;version=11892">MongoDB 2.4 Release Notes</a>. To begin using a Sandbox database with MongoHQ, <a href="https://www.mongohq.com/signup">Sign Up</a> and create a Sandbox database.  To upgrade any paid databases to 2.4.0-rc2, please send an email to support@mongohq.com
with your database name.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Experience with CircleCI]]></title>
    <link href="http://blog.mongohq.com/blog/2013/03/08/experience-with-circleci/"/>
    <updated>2013-03-08T00:00:00-06:00</updated>
    <id>http://blog.mongohq.com/blog/2013/03/08/experience-with-circleci</id>
    <content type="html"><![CDATA[<p>We are now using <a href="http://circleci.com">CircleCI</a> to automatically test and deploy our NodeJS services.
The business-marketing term for this is &#8220;Continuous Integration&#8221;.  With simple tools like CircleCI,
Heroku, and our own managed database services, continuous integration is the norm for us, not the exception.
The lynchpin of the process is CircleCI. It is simple to set up, but there are a few tricks to
be aware of.</p>

<h2>What we had</h2>

<p>Before CircleCI, we used a self-hosted Jenkins instance. Jenkins works well.  To deploy with Jenkins,
we ran two independent projects: one for testing code, and one for packaging and publishing the resulting asset.</p>

<p>This system only built assets for the master branch.  All commands for testing, packaging, publishing were
stored directly in Jenkins instead of being under version control.</p>

<p>Our setup was OK, but we found the time spent nursing our Jenkins environment painful, and life can be easier.</p>

<h2>Goals</h2>

<ul>
<li>testing, packaging, publishing steps <em>SHOULD</em> be stored in code</li>
<li>sensitive credentials should <em>NOT</em> be stored in code</li>
<li>CI should simply call test, package, publish steps</li>
<li>CI should publish assets for master, stage, and dev branches</li>
</ul>


<h2>Testing, packaging, and publishing steps</h2>

<p>Being fans of <strong>#badassrockstartech</strong>, we are using <a href="http://gruntjs.com">Grunt</a>,
which is a modular Make system. We use Grunt to run our testing, packaging, and publishing steps.</p>

<p>Because Grunt plugins are distributed through npm, we can use third-party code to accomplish
common tasks (e.g. transpiling CoffeeScript into JavaScript) and define our own tasks (e.g.
packaging and publishing to S3).</p>

<p>By moving the logic of testing, packaging, and publishing out of Jenkins and into
Grunt, configuring CircleCI becomes very simple and intuitive.</p>

<h2>Don&#8217;t commit credentials</h2>

<p>We moved our commands to publish to S3 into Grunt, but did not want to commit our
credentials. This is the one thing we want to keep hard-coded in our CI settings,
separate from our code.</p>

<p>To accomplish this, I assumed a simple <code>export AWS_SECRET_KEY=1234</code> command
hard-coded into CircleCI&#8217;s web interface would do the trick, but due to the
<a href="https://circleci.com/docs/environment-variables">way CircleCI executes your commands</a>,
you cannot alter environment variables this way.</p>

<p>I reached out to Circle for help, and David Lowe at Circle responded within a few
hours with a solution.</p>

<blockquote><p>We should be adding a secure credential storage mechanism soon, but for now, you will need to work around it.</p>

<p>You&#8217;ve got the right idea, but unfortunately, &#8216;export AWS_SECRET_KEY&#8217; won&#8217;t quite work because each command runs in a separate shell. Instead, if you want to do this from the UI, you need to do something like:</p>

<p><code>echo 'export AWS_SECRET_KEY=1234' &gt;&gt; ~/.circlerc</code></p>

<p>The .circlerc file is sourced before each test and deploy command.</p></blockquote>

<p>This works perfectly for our process. We hard-code our sensitive data bits into
the pre-dependency section of the web UI.</p>
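
<p>On the code side, a task can then pick the credential up from the environment instead of from a committed file. The following is a minimal sketch, not our actual publish code: <code>readCredentials</code> is a hypothetical helper, and <code>AWS_SECRET_KEY</code> is simply the variable name used in the example above.</p>

```javascript
// Hypothetical helper: collect required settings from an environment object
// and fail loudly when one is missing, so a misconfigured build stops early.
function readCredentials(env, names) {
  var missing = names.filter(function (name) { return !env[name]; });
  if (missing.length !== 0) {
    throw new Error('Missing environment variables: ' + missing.join(', '));
  }
  var credentials = {};
  names.forEach(function (name) { credentials[name] = env[name]; });
  return credentials;
}

// Inside a real task this would be called with process.env, which CircleCI
// populates after sourcing ~/.circlerc. A literal object is used here so the
// sketch runs anywhere.
var credentials = readCredentials({ AWS_SECRET_KEY: '1234' }, ['AWS_SECRET_KEY']);
console.log(Object.keys(credentials));
```

<p>Checking for the variable up front means a missing entry in <code>~/.circlerc</code> shows up as a clear error rather than as a failed upload later in the build.</p>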

<h2>Simply call test, package, and publish steps</h2>

<p>Now that our tasks are defined and our build server configured with credentials,
the only thing left is to do all the CI work.</p>

<p>The web UI is an easy way to specify commands to run, but you get more options
and flexibility in defining a circle.yml file to tell CircleCI what to do.</p>

<p>This is our circle.yml file.</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
</pre></td><td class='code'><pre><code class='yaml'><span class='line'><span class="c1">## Tests</span>
</span><span class='line'><span class="l-Scalar-Plain">test</span><span class="p-Indicator">:</span>
</span><span class='line'>  <span class="l-Scalar-Plain">override</span><span class="p-Indicator">:</span>
</span><span class='line'>    <span class="p-Indicator">-</span> <span class="l-Scalar-Plain">grunt test</span>
</span><span class='line'>
</span><span class='line'><span class="c1">## Deployment options</span>
</span><span class='line'><span class="l-Scalar-Plain">deployment</span><span class="p-Indicator">:</span>
</span><span class='line'>  <span class="l-Scalar-Plain">default</span><span class="p-Indicator">:</span>
</span><span class='line'>    <span class="l-Scalar-Plain">branch</span><span class="p-Indicator">:</span> <span class="p-Indicator">[</span><span class="nv">master</span><span class="p-Indicator">,</span> <span class="nv">stage</span><span class="p-Indicator">,</span> <span class="nv">dev</span><span class="p-Indicator">]</span>
</span><span class='line'>    <span class="l-Scalar-Plain">commands</span><span class="p-Indicator">:</span>
</span><span class='line'>      <span class="p-Indicator">-</span> <span class="l-Scalar-Plain">grunt clean coffee package publish:&quot;$CIRCLE_BRANCH&quot;</span>
</span></code></pre></td></tr></table></div></figure>


<h2>Conclusion</h2>

<p>Being a service company ourselves, we value offloading tasks to others so we
can focus more on what we&#8217;re good at. CircleCI integrates seamlessly with GitHub,
allows us to customize what we need, and gives us a very pretty and intuitive UI
with very little effort.</p>

<p>I&#8217;m happy to hear they just <a href="http://blog.circleci.com/so-we-raised-a-bunch-of-money/">raised a bunch of money</a>,
and I&#8217;m excited to see how they continue to make the world of continuous
integration easier.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Now available: $100 SSD replica sets, beta autoscaling]]></title>
    <link href="http://blog.mongohq.com/blog/2013/02/27/now-available-available-SSD-replica-sets-beta-autoscaling/"/>
    <updated>2013-02-27T06:06:00-06:00</updated>
    <id>http://blog.mongohq.com/blog/2013/02/27/now-available-available-SSD-replica-sets-beta-autoscaling</id>
    <content type="html"><![CDATA[<p>We&#8217;ve been working hard on our underlying database infrastructure for the last several months in silence (mostly). Today, we&#8217;re releasing the first piece of a much larger set of tools: SSD backed replica sets at both $100/mo and $500/mo price levels.</p>

<p>SSDs are by far the best way to run most Mongo deployments and we&#8217;ve been experimenting with various disk setups for a little over a year now (there are a shocking number of SSD options out there). Today&#8217;s release represents the future of our hosted service: each database runs in an isolation capsule with dedicated memory, IO capacity and CPU capacity. The memory/CPU/disk resources are tuned to the most common Mongo workloads we&#8217;ve seen across our 50,000 hosted databases in the last three years.</p>

<h2>The new databases</h2>

<p>New databases are available at two price levels. The first is a &#8220;just going into production&#8221; $100/mo replica set database:</p>

<ul>
<li>4GB of included SSD storage ($25/GB/mo for extra storage)</li>
<li>400MB of dedicated RAM per member</li>
</ul>


<p>The second level offers a per-GB price break, starting at $500/mo for a replica set DB:</p>

<ul>
<li>25GB of included SSD storage ($20/GB/mo for extra storage)</li>
<li>2.5GB of dedicated RAM per member</li>
</ul>


<h2>Beta autoscaling</h2>

<p>We&#8217;re also inviting some customers to participate in a private beta test of our database autoscaling. MongoDB loves memory and IO, so we&#8217;re rolling out live resource upgrades to help customer databases stay healthy during periods of rapid growth. Adding RAM and IO (vertical scaling) is nearly instantaneous and usually a much better option than forcing your data to work well with sharding. Want in? Just add one of these new DBs to your account and let us know: support@mongohq.com</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[MongoDB 2.4.0-rc0 available on MongoHQ]]></title>
    <link href="http://blog.mongohq.com/blog/2013/02/22/mongodb-2-dot-4-0-rc0-available-on-mongohq/"/>
    <updated>2013-02-22T14:10:00-06:00</updated>
    <id>http://blog.mongohq.com/blog/2013/02/22/mongodb-2-dot-4-0-rc0-available-on-mongohq</id>
    <content type="html"><![CDATA[<p>The future is now!  Are you ready to upgrade your staging Replica Set to MongoDB 2.4.0-rc0 in preparation for the 2.4.0 release?  Or, <em>if you are ready</em>, we can even upgrade your production database.</p>

<p><strong>To upgrade, please send your database name to <a href="http://support.mongohq.com/support/new_request.html?referer=2.4.0-rc0-upgrade">support@mongohq.com</a>.</strong></p>

<p>Run MongoDB 2.4.0-rc0 on MongoHQ today.  On any MongoHQ paid plan, you can use the new 2.4.x features (full-text search, updated aggregation features, and newly optimized geospatial indexes, all backed by the V8 engine).  For a refresher of MongoDB 2.4.0 features, take a look at:</p>

<ul>
<li><a href="http://docs.mongodb.org/manual/release-notes/2.4/">MongoDB 2.4.0 Release Notes</a></li>
<li><a href="http://blog.mongohq.com/blog/2013/01/17/explore-mongodb-2-dot-3-on-mongohq/">Explore MongoDB 2.3 on MongoHQ</a></li>
<li><a href="http://blog.mongohq.com/blog/2013/01/22/first-week-with-mongodb-2-dot-4-development-release/">MongoDB and Full Text Search: My First Week With MongoDB 2.4 Development Release</a></li>
</ul>


<p><strong>To upgrade, please send your database name to <a href="http://support.mongohq.com/support/new_request.html?referer=2.4.0-rc0-upgrade">support@mongohq.com</a>.</strong>  We will upgrade your database as quickly as possible.  The upgrade process will require a quick restart of your database; then, you will be off and running. MongoDB 2.4.0 will become the default version when the production release is made available.</p>

<p><strong>To test MongoDB 2.4.0-rc0 in a non-production environment</strong>, our Sandbox plans called &#8220;Dharma Experimental&#8221; run the latest release candidate by default.  We treat our Sandbox plans like free bread; please, have as many as you like, and keep testing.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[MongoHQ Teammates Participate in the Mercedes Marathon]]></title>
    <link href="http://blog.mongohq.com/blog/2013/02/21/mongohq-teammates-participate-in-the-mercedes-marathon/"/>
    <updated>2013-02-21T13:40:00-06:00</updated>
    <id>http://blog.mongohq.com/blog/2013/02/21/mongohq-teammates-participate-in-the-mercedes-marathon</id>
    <content type="html"><![CDATA[<p>This past weekend, Kristine Toone and Dusty Hall competed in the <a href="http://mercedesmarathon.com/">Mercedes Marathon</a> in Birmingham, AL.  Kristine completed the 1/2 marathon (13.1 miles), and Dusty completed the full 26.2 miles.  The weather was in the 20s, which according to Kristine &#8220;makes it more exciting.&#8221;</p>

<p>Kristine - Mile 7
<img src="http://blog.mongohq.com/images/for_posts/2013-02-21-mongohq-teammates-participate-in-the-mercedes-marathon/running.jpg" alt="Kristine - Mile 7" /></p>

<p>Kristine Toone with her husband, Brian, and children, Analise and Josiah.
<img src="http://blog.mongohq.com/images/for_posts/2013-02-21-mongohq-teammates-participate-in-the-mercedes-marathon/group-shot.jpg" alt="Kristine &amp; Family" /></p>

<p>Dusty with daughters Mary Margo and Dottie
<img src="http://blog.mongohq.com/images/for_posts/2013-02-21-mongohq-teammates-participate-in-the-mercedes-marathon/sign.jpg" alt="Dusty &amp; Family" /></p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Behind the Curtain of MongoHQ: A Gotcha with Select in Python]]></title>
    <link href="http://blog.mongohq.com/blog/2013/02/19/behind-the-curtain-of-mongohq-a-gotcha-with-select-in-python/"/>
    <updated>2013-02-19T14:27:00-06:00</updated>
    <id>http://blog.mongohq.com/blog/2013/02/19/behind-the-curtain-of-mongohq-a-gotcha-with-select-in-python</id>
    <content type="html"><![CDATA[<p><em>We generally spend our time talking about databases, but occasionally run into fun technical challenges that seem worth sharing. Here&#8217;s something one of our newest team members (Paul Rubin) recently learned.</em></p>

<p>One of our upcoming features requires low level, high performance networking (more about the actual feature later&#8230;). We originally prototyped the feature with Python, and exercised the new tool with a LOT of open TCP connections, which caused some weird crashes. The crashes seemed random; it took us time to uncover the underlying snag.  The fix turned out to be fairly easy, as you might expect.</p>

<p>The snag we encountered was due to a Linux limitation which is poorly documented and not widely known.  The limitation is in the underlying Linux <code>select()</code> system call and as such, it applies to all programs (in whatever language) that use <code>select</code>.</p>

<p>The simplest way to listen to several sockets or pipes concurrently is with the <code>select</code> system call.  In Python, <code>select</code> takes 3 arrays of file or socket objects to know what to listen to.  The Python documentation does not mention that the library translates the arrays to bit vectors indexed by numeric file descriptor, a step implemented in C as part of the system interface. The bit vectors have a fixed size set at compile time (<code>FD_SETSIZE</code>), 1024 bits by default.  Even if your <code>select()</code> call is listening on only one socket, if the socket&#8217;s <code>.fileno()</code> is 1024 or higher, <code>select()</code> cannot handle the connection and will trigger a runtime error.</p>
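
<p>A minimal Python sketch of that limit, assuming a typical Linux build where the bit vector holds 1024 descriptors. The helper name <code>select_accepts</code> and the constant <code>ASSUMED_FD_SETSIZE</code> are made up for illustration. CPython checks the descriptor number before handing the set to the kernel, so a bare integer is enough to reproduce the failure:</p>

```python
import select
import socket

# Assumption: the select() bit vector holds 1024 descriptors on this platform
# (true for typical Linux builds; adjust if your system differs).
ASSUMED_FD_SETSIZE = 1024


def select_accepts(fd):
    """Return True if select.select() will accept this descriptor number."""
    try:
        select.select([fd], [], [], 0)  # timeout 0: poll once, never block
    except ValueError:
        # CPython raises ValueError when the number does not fit in the bit vector
        return False
    return True


if __name__ == "__main__":
    left, right = socket.socketpair()
    # A freshly opened socket has a small descriptor number and is accepted.
    print(left.fileno(), select_accepts(left.fileno()))
    # A descriptor number at the limit is rejected, open or not.
    print(ASSUMED_FD_SETSIZE, select_accepts(ASSUMED_FD_SETSIZE))
    left.close()
    right.close()
```

<p>The point of the sketch is that the number of sockets you pass in does not matter; only the numeric value of the descriptor does.</p>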

<p>The solution is to use <code>select.epoll()</code> instead of <code>select.select()</code>. <code>Epoll</code> does not have the 1024 file descriptor limitation and, as a bonus, is more efficient than <code>select</code>.  When listening to a large number of sockets, <code>Epoll</code> is quicker because the library does not linearly scan a large bitmap result searching for sockets with available data. In Python, this does not matter since building the return array is likely to be much slower than scanning the bitmap, but high-performance server implementations should take this into account.</p>
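
<p>As a rough illustration of the switch (Linux only, since <code>select.epoll()</code> is not available everywhere), here is a small sketch. The function name <code>wait_for_events</code> is hypothetical, and the sketch creates and closes an epoll object on every call for simplicity; a real server would keep one around:</p>

```python
import select
import socket


def wait_for_events(sockets, timeout=1.0):
    """Return the sockets that have pending events, using epoll instead of select."""
    # epoll reports numeric descriptors, so remember which socket owns each one.
    by_fd = {sock.fileno(): sock for sock in sockets}
    poller = select.epoll()
    try:
        for fd in by_fd:
            poller.register(fd, select.EPOLLIN)  # interested in readable data
        events = poller.poll(timeout)            # list of (fd, event mask) pairs
    finally:
        poller.close()
    return [by_fd[fd] for fd, _mask in events]


if __name__ == "__main__":
    reader, writer = socket.socketpair()
    writer.sendall(b"ping")
    for sock in wait_for_events([reader, writer]):
        print("ready:", sock.fileno())
    reader.close()
    writer.close()
```

<p>Unlike <code>select.select()</code>, nothing here depends on how large the descriptor numbers are.</p>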

<h2>The Tradeoff</h2>

<p>In Python, <code>epoll</code> events give OS-level numeric file descriptor numbers rather than mapping to the associated Python socket or file objects. Mapping the events manually can be tricky in situations when sockets are opened and closed in multiple application locations.  OS-level file descriptors can be reused after being closed, so the mapping must be fresh.</p>
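
<p>One way to keep that mapping fresh is to route every open and close through a single object. The sketch below is an assumption about how you might structure it, not the code we run; the class name <code>EpollRegistry</code> and its methods are invented for illustration:</p>

```python
import select
import socket


class EpollRegistry:
    """Keeps the descriptor-to-socket mapping in step with epoll registrations."""

    def __init__(self):
        self._epoll = select.epoll()
        self._by_fd = {}

    def add(self, sock):
        fd = sock.fileno()
        self._epoll.register(fd, select.EPOLLIN)
        self._by_fd[fd] = sock

    def discard(self, sock):
        # Forget the descriptor BEFORE closing the socket, so a later socket
        # that reuses the same number is never confused with this one.
        fd = sock.fileno()
        if fd in self._by_fd:
            self._epoll.unregister(fd)
            del self._by_fd[fd]
        sock.close()

    def ready(self, timeout=0):
        """Return the registered sockets that currently have pending events."""
        events = self._epoll.poll(timeout)
        return [self._by_fd[fd] for fd, _mask in events if fd in self._by_fd]


if __name__ == "__main__":
    registry = EpollRegistry()
    reader, writer = socket.socketpair()
    registry.add(reader)
    writer.sendall(b"ping")
    print([sock.fileno() for sock in registry.ready(1.0)])
    registry.discard(reader)
    writer.close()
```

<p>The catch is the one described above: this only works if no other part of the application closes a registered socket behind the registry&#8217;s back.</p>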

<p>The classic article about high-concurrency server implementation is &#8220;The C10K problem&#8221; by Dan Kegel <a href="http://www.kegel.com/c10k.html">http://www.kegel.com/c10k.html</a>. It is a bit out of date by now, but still worth reading for anyone working in this area.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Using Ruby &amp; Resque with MongoDB?  Improve your performance with a single gem.]]></title>
    <link href="http://blog.mongohq.com/blog/2013/02/13/using-ruby-and-resque-with-mongodb-improve-your-performance-with-a-single-gem/"/>
    <updated>2013-02-13T09:11:00-06:00</updated>
    <id>http://blog.mongohq.com/blog/2013/02/13/using-ruby-and-resque-with-mongodb-improve-your-performance-with-a-single-gem</id>
    <content type="html"><![CDATA[<p>Who should win: your web visitors or your background workers?
Hopefully, you will not have to answer this question &#8230; or if you are
planning for exponential growth, hopefully you will.</p>

<p>We have seen a pattern recently with MongoDB, Ruby, and Resque (the
background processor).  Resque behavior combined with the Ruby driver
behavior prevents background worker performance from scaling linearly.
When Resque forks, the Ruby driver creates a new connection to the
database.  This translates into one new connection per processed job.
If you are running 1000s of jobs per second, MongoDB will run
less than optimally.</p>

<p>The quickest solution is the &#8216;resque-jobs-per-fork&#8217; gem.  As the name
implies, Resque will run multiple jobs per fork.  Instead of a 1-to-1
job to connection pattern, you will have 500-to-1 or 1000-to-1 job to
connection pattern.</p>
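
<p>The arithmetic behind that ratio, sketched in Python purely for illustration. It assumes each fork opens exactly one new MongoDB connection, and the function name <code>connections_opened</code> is made up:</p>

```python
import math


def connections_opened(jobs, jobs_per_fork=1):
    """New connections created while processing `jobs` jobs.

    Assumes every fork opens exactly one MongoDB connection and then
    processes `jobs_per_fork` jobs before exiting.
    """
    return math.ceil(jobs / jobs_per_fork)


if __name__ == "__main__":
    for per_fork in (1, 500, 1000):
        print(per_fork, "jobs per fork:", connections_opened(1000, per_fork), "connections")
```

<p>For 1000 jobs, that is 1000 new connections at one job per fork, and 2 or 1 once each fork handles 500 or 1000 jobs.</p>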

<h2>The Cause</h2>

<p>As simple as it sounds, your connection pattern to your MongoDB affects
your customer’s experience with your product.  Good connection patterns
consist of long running, persistent connections.  When looking at the
logs, you should only see a &#8220;connection accepted from&#8221; every 10 - 15
seconds, even on large scale deployments.  Most of the time, poor
connection patterns exist from the beginning of an application&#8217;s
development, but they only become evident due to performance issues.
These performance issues arise as databases grow in size and the
application grows in usage.</p>

<p>With MongoDB, poor connection behavior of an application is exposed due
to the following constraints: 1) per connection memory overhead and 2)
read lock causing slow authentication.</p>

<h2>Per Connection Memory Overhead</h2>

<p>As of the 2.0 branch, each connection in MongoDB allocates 1MB of RAM.
Before 2.0, it was dependent on system &#8216;stack size&#8217; settings.  The
following code in MongoDB shows the per connection memory usage
algorithm:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
</pre></td><td class='code'><pre><code class='c'><span class='line'><span class="k">static</span> <span class="k">const</span> <span class="kt">size_t</span> <span class="n">STACK_SIZE</span> <span class="o">=</span> <span class="mi">1024</span><span class="o">*</span><span class="mi">1024</span><span class="p">;</span> <span class="c1">// if we change this we need</span>
</span><span class='line'><span class="c1">// to update the warning</span>
</span><span class='line'>
</span><span class='line'><span class="k">struct</span> <span class="n">rlimit</span> <span class="n">limits</span><span class="p">;</span>
</span><span class='line'><span class="n">verify</span><span class="p">(</span><span class="n">getrlimit</span><span class="p">(</span><span class="n">RLIMIT_STACK</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">limits</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">);</span>
</span><span class='line'><span class="k">if</span> <span class="p">(</span><span class="n">limits</span><span class="p">.</span><span class="n">rlim_cur</span> <span class="o">&gt;</span> <span class="n">STACK_SIZE</span><span class="p">)</span> <span class="p">{</span>
</span><span class='line'>  <span class="n">pthread_attr_setstacksize</span><span class="p">(</span><span class="o">&amp;</span><span class="n">attrs</span><span class="p">,</span> <span class="p">(</span><span class="n">DEBUG_BUILD</span>
</span><span class='line'>    <span class="o">?</span> <span class="p">(</span><span class="n">STACK_SIZE</span> <span class="o">/</span> <span class="mi">2</span><span class="p">)</span>
</span><span class='line'>    <span class="o">:</span> <span class="n">STACK_SIZE</span><span class="p">));</span>
</span><span class='line'><span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">limits</span><span class="p">.</span><span class="n">rlim_cur</span> <span class="o">&lt;</span> <span class="mi">1024</span><span class="o">*</span><span class="mi">1024</span><span class="p">)</span> <span class="p">{</span>
</span><span class='line'>  <span class="n">warning</span><span class="p">()</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;Stack size set to &quot;</span> <span class="o">&lt;&lt;</span> <span class="p">(</span><span class="n">limits</span><span class="p">.</span><span class="n">rlim_cur</span><span class="o">/</span><span class="mi">1024</span><span class="p">)</span> <span class="o">&lt;&lt;</span> <span class="s">&quot;KB. We &quot;</span>
</span><span class='line'>            <span class="s">&quot;suggest 1MB&quot;</span> <span class="o">&lt;&lt;</span> <span class="n">endl</span><span class="p">;</span>
</span><span class='line'><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<p><a href="https://github.com/mongodb/mongo/blob/master/src/mongo/util/net/message_server_port.cpp#L78">mongo/util/net/message_server_port.cpp#L78</a></p>

<p>As with most everything in life, one of anything is practically nothing
(think cars in traffic). 1 MB is a rounding error of modern RAM sizes.
However, a modern large scale application consists of many components
with many requests. These 1000s of operations per second could turn into
major RAM usage if implemented incorrectly.</p>

<p>Given MongoDB&#8217;s reliance on good RAM usage, deficient RAM usage can
quickly ruin performance.</p>
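
<p>To put rough numbers on it, here is a small Python sketch of the arithmetic. It counts only the 1MB per-connection stack from the MongoDB snippet shown earlier and treats that as an upper bound; the function name <code>stack_reserved_gib</code> is invented for illustration:</p>

```python
# Bytes of stack per connection thread, taken from STACK_SIZE in the MongoDB snippet.
STACK_SIZE = 1024 * 1024


def stack_reserved_gib(open_connections):
    """Upper bound, in GiB, on stack memory set aside for open connections."""
    return open_connections * STACK_SIZE / 1024 ** 3


if __name__ == "__main__":
    for connections in (1, 100, 2048, 10000):
        print(connections, "connections:", round(stack_reserved_gib(connections), 3), "GiB")
```

<p>One connection really is a rounding error; a couple of thousand held open at once is memory that MongoDB can no longer spend on your data.</p>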

<h2>Write Lock &amp; Slow Authentication</h2>

<p>When using authentication, each connection and authentication action is
a database query.  If a database is under heavy write load, the
authentication will be as slow as the rest of your queries.  Thus, when you
connect rapidly to a database in a high-write environment, each
authentication has to wait its turn behind the other locks on your database.</p>

<p>As with Memory Overhead, these deficiencies are not typically noticed
until the application grows in data and usage.</p>

<h2>Resque&#8217;s Forking Code &amp; Ruby Driver&#8217;s Reconnect</h2>

<p>Resque uses process forking to spawn new workers for each job
<a href="https://github.com/defunkt/resque/blob/master/lib/resque/worker.rb#L137">resque/blob/master/lib/resque/worker.rb#L137</a>.</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">if</span> <span class="no">Kernel</span><span class="o">.</span><span class="n">respond_to?</span><span class="p">(</span><span class="ss">:fork</span><span class="p">)</span>
</span><span class='line'>  <span class="no">Kernel</span><span class="o">.</span><span class="n">fork</span> <span class="o">&amp;</span><span class="n">block</span> <span class="k">if</span> <span class="n">will_fork?</span>
</span><span class='line'><span class="k">else</span>
</span></code></pre></td></tr></table></div></figure>


<p>Ruby driver reconnects <a href="https://github.com/mongodb/mongo-ruby-driver/blob/master/lib/mongo/util/pool.rb#L240">mongo-ruby-driver/blob/master/lib/mongo/util/pool.rb#L240</a>.</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="k">if</span> <span class="n">socket</span><span class="o">.</span><span class="n">pid</span> <span class="o">!=</span> <span class="no">Process</span><span class="o">.</span><span class="n">pid</span>
</span><span class='line'>  <span class="vi">@sockets</span><span class="o">.</span><span class="n">delete</span><span class="p">(</span><span class="n">socket</span><span class="p">)</span>
</span><span class='line'>  <span class="k">if</span> <span class="n">socket</span>
</span><span class='line'>    <span class="n">socket</span><span class="o">.</span><span class="n">close</span> <span class="k">unless</span> <span class="n">socket</span><span class="o">.</span><span class="n">closed?</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'>  <span class="n">checkout_new_socket</span>
</span><span class='line'><span class="k">else</span>
</span></code></pre></td></tr></table></div></figure>


<p>The Ruby Mongo Driver does this because managing connections between
parent and child processes in Ruby is a beast.  The fool-proof method is
to re-initialize connections on each fork.</p>

<h2>Resque &amp; Ruby : The Effect</h2>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="nb">require</span> <span class="s1">&#39;rubygems&#39;</span>
</span><span class='line'><span class="nb">require</span> <span class="s1">&#39;mongo&#39;</span>
</span><span class='line'>
</span><span class='line'><span class="vi">@conn</span> <span class="o">=</span> <span class="no">Mongo</span><span class="o">::</span><span class="no">Connection</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="s2">&quot;localhost&quot;</span><span class="p">,</span> <span class="mi">27017</span><span class="p">,</span> <span class="ss">:pool_size</span> <span class="o">=&gt;</span> <span class="mi">10</span><span class="p">,</span>
</span><span class='line'><span class="ss">:pool_timeout</span> <span class="o">=&gt;</span> <span class="mi">5</span><span class="p">)</span>
</span><span class='line'><span class="vi">@db</span>   <span class="o">=</span> <span class="vi">@conn</span><span class="o">[</span><span class="s1">&#39;resque_connection_test&#39;</span><span class="o">]</span>
</span><span class='line'><span class="vi">@coll</span> <span class="o">=</span> <span class="vi">@db</span><span class="o">[</span><span class="s1">&#39;users&#39;</span><span class="o">]</span>
</span><span class='line'>
</span><span class='line'><span class="nb">puts</span> <span class="vi">@db</span><span class="o">.</span><span class="n">command</span><span class="p">({</span><span class="n">getLastError</span><span class="p">:</span> <span class="mi">1</span><span class="p">})</span>
</span><span class='line'>
</span><span class='line'><span class="mi">1</span><span class="o">.</span><span class="n">upto</span><span class="p">(</span><span class="mi">10</span><span class="p">)</span> <span class="k">do</span>
</span><span class='line'>  <span class="n">client</span> <span class="o">=</span> <span class="nb">fork</span> <span class="k">do</span>
</span><span class='line'>    <span class="nb">puts</span> <span class="vi">@db</span><span class="o">.</span><span class="n">command</span><span class="p">({</span><span class="n">getLastError</span><span class="p">:</span> <span class="mi">1</span><span class="p">})</span>
</span><span class='line'>  <span class="k">end</span>
</span><span class='line'>
</span><span class='line'>  <span class="no">Process</span><span class="o">.</span><span class="n">wait</span><span class="p">(</span><span class="n">client</span><span class="p">)</span>
</span><span class='line'><span class="k">end</span>
</span></code></pre></td></tr></table></div></figure>


<p>The code above mimics the effects of a standard Resque worker.  Each
“puts” prints a different “connectionId”, thus each fork establishes a
new MongoDB connection. If you are watching the MongoDB logs, you will
see 11 lines containing “connection accepted from.”</p>

<h2>Debugging In MongoHQ</h2>

<p>Mongostat hides most performance issues due to poor connection patterns.
With the example above, mongostat would show the same number of
connections as you have Resque workers. By looking at the logs, you
will see a new line containing &#8220;connection accepted from&#8221; for each time
that Ruby forks the process.</p>

<p>Debugging decreased performance due to connectivity issues requires
access to the Mongo logs. If you see more than a few connection attempts
every few seconds, please consider a method to use longer persistent
connections. You will achieve better resource usage, and better product
delivery for your clients.</p>

<p>MongoHQ shared plans have access to real time logs from the Mongo
server.  Using these real time logs, you can see your connection
patterns.  For assistance, please E-mail support@mongohq.com.</p>

<ul>
<li><em>Resque &amp; Ruby are not the only offenders with poor connection
patterns.  The stock Node.js driver tests for Replica Set status every
second for each process &#8211; issuing reconnections.  PHP with Apache is
evil when not configured properly &#8211; the continuous building up and
tearing down of workers triggers new connections.</em></li>
</ul>

]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Return of the Clones]]></title>
    <link href="http://blog.mongohq.com/blog/2013/02/08/return-of-the-clones/"/>
    <updated>2013-02-08T16:49:00-06:00</updated>
    <id>http://blog.mongohq.com/blog/2013/02/08/return-of-the-clones</id>
    <content type="html"><![CDATA[<p>We&#8217;ve cloned the clone.  The &#8220;clone database&#8221; functionality of the
original MongoHQ has been ported to our new UI.  Back in October, we
launched the new site. Yet, we knew we were missing a few features of
goodness.  The return of the clone functionality is powered by our
backend jobs system with visible progress indicators. Since clones are
background processes, your source database remains available for the
duration. If you are using the clone as an upgrade process, you will
want to put your application in read-only or maintenance mode.</p>

<p>To use, just click the &#8220;Clone / Upgrade&#8221; button in the upper right when
viewing your database.</p>

<p><strong>Data Size &amp; Index Size Growth Charts</strong></p>

<p>We have a <code>dataSize</code> growth chart associated with all databases now.  As
you grow, you can see where you have been and where you are going. The
chart includes both <code>dataSize</code> and <code>indexSize</code> as reported by
<code>db.stats()</code>.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[explain.explain() - Understanding mongo query behavior]]></title>
    <link href="http://blog.mongohq.com/blog/2013/02/05/explaining-explain/"/>
    <updated>2013-02-05T10:49:00-06:00</updated>
    <id>http://blog.mongohq.com/blog/2013/02/05/explaining-explain</id>
    <content type="html"><![CDATA[<p>MongoDB has an extremely flexible query syntax. The flexibility allows
all sorts of useful queries, and some that are mysteriously slow. It&#8217;s
reasonably easy to figure out <em>why</em> a query is slow &#8230; you simply need
to ask MongoDB to explain itself.</p>

<h2>.explain() yourself mongo</h2>

<p>By simply adding .explain() to the end of your mongo query, mongo will
return details about how it goes about fulfilling that query. This will
help you understand what indexes are being used and how many documents
mongo actually sifts through to generate the result.</p>

<h2>a contrived example</h2>

<p>First, let&#8217;s generate a dataset&#8230; how about some people. Running the
following code right in the mongo shell will create 200k person
objects to work with, each with their own birthdate, gender, and hair
color. No indexes have been created (yet).</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
</pre></td><td class='code'><pre><code class='javascript'><span class='line'><span class="kd">var</span> <span class="nx">genderChoices</span> <span class="o">=</span> <span class="p">[</span><span class="s1">&#39;male&#39;</span><span class="p">,</span> <span class="s1">&#39;female&#39;</span><span class="p">];</span>
</span><span class='line'><span class="kd">var</span> <span class="nx">hairChoices</span> <span class="o">=</span> <span class="p">[</span><span class="s1">&#39;black&#39;</span><span class="p">,</span> <span class="s1">&#39;brown&#39;</span><span class="p">,</span> <span class="s1">&#39;blond&#39;</span><span class="p">,</span> <span class="s1">&#39;red&#39;</span><span class="p">,</span> <span class="s1">&#39;auburn&#39;</span><span class="p">,</span>
</span><span class='line'><span class="s1">&#39;chestnut&#39;</span><span class="p">,</span> <span class="s1">&#39;white&#39;</span><span class="p">];</span>
</span><span class='line'><span class="kd">var</span> <span class="nx">birthdate</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Date</span><span class="p">();</span>
</span><span class='line'><span class="nx">birthdate</span><span class="p">.</span><span class="nx">setYear</span><span class="p">(</span><span class="mi">1970</span><span class="p">);</span>
</span><span class='line'>
</span><span class='line'><span class="k">for</span> <span class="p">(</span><span class="kd">var</span> <span class="nx">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="nx">i</span> <span class="o">&lt;</span> <span class="mi">200000</span><span class="p">;</span> <span class="o">++</span><span class="nx">i</span><span class="p">)</span> <span class="p">{</span>
</span><span class='line'>  <span class="nx">birthdate</span><span class="p">.</span><span class="nx">setHours</span><span class="p">(</span><span class="nx">birthdate</span><span class="p">.</span><span class="nx">getHours</span><span class="p">()</span> <span class="o">+</span> <span class="mi">1</span><span class="p">);</span>
</span><span class='line'>  <span class="nx">db</span><span class="p">.</span><span class="nx">people</span><span class="p">.</span><span class="nx">insert</span><span class="p">({</span>
</span><span class='line'>    <span class="nx">gender</span><span class="o">:</span> <span class="nx">genderChoices</span><span class="p">[</span><span class="nb">Math</span><span class="p">.</span><span class="nx">floor</span><span class="p">(</span><span class="nb">Math</span><span class="p">.</span><span class="nx">random</span><span class="p">()</span> <span class="o">*</span>
</span><span class='line'><span class="nx">genderChoices</span><span class="p">.</span><span class="nx">length</span><span class="p">)],</span>
</span><span class='line'>    <span class="nx">hair</span><span class="o">:</span> <span class="nx">hairChoices</span><span class="p">[</span><span class="nb">Math</span><span class="p">.</span><span class="nx">floor</span><span class="p">(</span><span class="nb">Math</span><span class="p">.</span><span class="nx">random</span><span class="p">()</span> <span class="o">*</span> <span class="nx">hairChoices</span><span class="p">.</span><span class="nx">length</span><span class="p">)],</span>
</span><span class='line'>    <span class="nx">birthdate</span><span class="o">:</span> <span class="nx">birthdate</span>
</span><span class='line'>  <span class="p">})</span>
</span><span class='line'><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<h2>go unindexed query!</h2>

<p>Let&#8217;s filter down to the first 10 males with a tasteful chestnut hair
color and sort them by birthdate. I&#8217;ll use the pretty web interface.</p>

<p><img src="http://blog.mongohq.com/images/for_posts/2013-01-31-quickly-explain-explain/no-index.png" alt="No Index" /></p>

<p>Performance is already pretty bad on this query. With only 200k
documents, our query takes over 100ms (which is bad enough to
automatically get logged by MongoDB as a slow query).</p>

<p>Notice also that the cursor is a <code>BasicCursor</code>, which means that no index was
used.</p>

<p>The killer information that you should look for is the number of objects
MongoDB scanned through fulfilling this query. I inserted 200k
documents, and well, the nscanned value tells us we&#8217;re searching through
every single document.</p>

<p>For sorted queries, also pay attention to <code>scanAndOrder</code>. When it is true,
MongoDB had to sort the results itself instead of reading them in order from an
index.</p>
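
<p>If you would rather check these fields in code than by eye, something like the sketch below could work. The helper name <code>diagnoseExplain</code> is made up for illustration; the field names (<code>cursor</code>, <code>n</code>, <code>nscanned</code>, <code>scanAndOrder</code>) are the ones discussed above.</p>

<pre><code>// Hypothetical helper: summarizes the warning signs in an explain() document.
function diagnoseExplain(explain) {
  return {
    // A BasicCursor means a full collection scan.
    usedIndex: explain.cursor !== "BasicCursor",
    // scanAndOrder true means the sort could not come from an index.
    sortedInMemory: explain.scanAndOrder === true,
    // Documents scanned per document returned; 1 is the ideal.
    scannedPerResult: explain.nscanned / Math.max(explain.n, 1)
  };
}
</code></pre>

<p>A high <code>scannedPerResult</code> together with <code>usedIndex: false</code> is the pattern to look for on a query like this one.</p>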

<h2>index race</h2>

<p>Just for example, I&#8217;m going to add two separate indexes: one on hair and
one on gender. Again, I&#8217;ll use the pretty web interface.</p>

<p><strong> be careful adding indexes to production data </strong></p>

<p><img src="http://blog.mongohq.com/images/for_posts/2013-01-31-quickly-explain-explain/creating-index.png" alt="Adding Hair Index" /></p>

<p>With an index on hair, and a separate index on gender, let&#8217;s run the
query again.</p>

<p><img src="http://blog.mongohq.com/images/for_posts/2013-01-31-quickly-explain-explain/competing-indexes.png" alt="Competing Indexes" /></p>

<p>I&#8217;m cutting off part of this explain() result for later, but first look
at the cursor value.</p>

<p>The cursor is <code>BtreeCursor hair_1</code> which says we&#8217;re using an index on
hair. Mongo only needed to scan through 28.5k documents to complete the
query. But why use the hair index?</p>

<h2>a DB with a plan</h2>

<p>Here is the next part of the explain output:</p>

<p><img src="http://blog.mongohq.com/images/for_posts/2013-01-31-quickly-explain-explain/attempted-plans.png" alt="Attempted Plans" /></p>

<p>This shows that MongoDB tried 3 ways to fulfil my query. It tried using
the hair index, the gender index, and no indexing. The poor gender and
BasicCursor options only scanned through about 200 documents before
MongoDB terminated them (because it already had the answer from the hair
index).</p>

<p>So, we did better by adding the hair index. If this is the only query we
ever run against our data, then the gender index isn&#8217;t helping (and
actually slows down updates/inserts/deletes to maintain the index).</p>

<p>Looking back at the hair index, the nscanned tells us that MongoDB
scanned 28.5k documents and generated a result of 14k documents that
pass the query.</p>

<p>28.5k is about 14% of our total data, which is a nice reduction. We can
do better, though, with a compound index, which will bring the nscanned
value closer to the size of the result set.</p>

<h2>better, stronger, faster</h2>

<p>Let&#8217;s create an index on hair, gender, and then birthdate. We&#8217;re choosing
hair first because that&#8217;s the most efficient at breaking apart our
data into smaller fragments (which is why MongoDB chose it in our index
race above).</p>

<p><img src="http://blog.mongohq.com/images/for_posts/2013-01-31-quickly-explain-explain/creating-compound-index.png" alt="Creating Compound Index" /></p>

<p>How does it perform?</p>

<p><img src="http://blog.mongohq.com/images/for_posts/2013-01-31-quickly-explain-explain/full-index.png" alt="Full Index" /></p>

<p>The results are much faster, and in this optimal case, our nscanned is
equal to n and scanAndOrder is false. Good job, contrived example. You
show us what we can aspire to.</p>

<h2>Explain yourself</h2>

<p>Explain is a great tool to understand a little more about how your query
is actually working with your indexes, and with MongoHQ it&#8217;s a single
pretty button away from helping you.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[First Steps of an Analytics Platform with MongoDB]]></title>
    <link href="http://blog.mongohq.com/blog/2013/01/29/constructing-analytics-with-mongohq/"/>
    <updated>2013-01-29T08:43:00-06:00</updated>
    <id>http://blog.mongohq.com/blog/2013/01/29/constructing-analytics-with-mongohq</id>
    <content type="html"><![CDATA[<p>MongoDB excels at analytics. Every week, we have customers asking for insight on building an analytics and metrics platform.  We have seen outstanding performance from good practices and have seen issues with common bad practices.</p>

<p>Customers have different types of analytics engines on our hosted MongoDB platform ranging from usage metrics, business domain specific metrics, to financial platforms. The most generic type of metrics that most clients start tracking are events (e.g. “how many people walked into my stores” or “how many people opened an iPhone application”).</p>

<p>A proper schema is the first step to getting off the ground quickly on a platform that will scale.</p>

<h2>The Naive Approach</h2>

<p>With MongoDB, the first urge is to begin inserting documents quickly.  The first documents typically have the following schema:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
</pre></td><td class='code'><pre><code class='js'><span class='line'><span class="p">{</span>
</span><span class='line'>  <span class="nx">store_id</span><span class="o">:</span> <span class="nx">ObjectId</span><span class="p">(),</span> <span class="c1">// Object id of a store</span>
</span><span class='line'>  <span class="nx">event</span><span class="o">:</span> <span class="s2">&quot;door open&quot;</span><span class="p">,</span> <span class="c1">// will be one of &quot;door open&quot;, &quot;sale made&quot;, or &quot;phone calls&quot;</span>
</span><span class='line'>  <span class="nx">created_at</span><span class="o">:</span> <span class="k">new</span> <span class="nb">Date</span><span class="p">(</span><span class="s2">&quot;2013-01-29T08:43:00Z&quot;</span><span class="p">)</span>
</span><span class='line'><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>


<p>To query on <code>store_id</code> and a <code>created_at</code> range, you run the following:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='js'><span class='line'><span class="nx">db</span><span class="p">.</span><span class="nx">events</span><span class="p">.</span><span class="nx">find</span><span class="p">({</span><span class="nx">store_id</span><span class="o">:</span> <span class="nx">ObjectId</span><span class="p">(</span><span class="s2">&quot;aaa&quot;</span><span class="p">),</span> <span class="nx">created_at</span><span class="o">:</span> <span class="p">{</span><span class="nx">$gte</span><span class="o">:</span> <span class="k">new</span> <span class="nb">Date</span><span class="p">(</span><span class="s2">&quot;2013-01-29T00:00:00Z&quot;</span><span class="p">),</span> <span class="nx">$lte</span><span class="o">:</span> <span class="k">new</span> <span class="nb">Date</span><span class="p">(</span><span class="s2">&quot;2013-01-30T00:00:00Z&quot;</span><span class="p">)}})</span>
</span></code></pre></td></tr></table></div></figure>


<p>These types of queries are deceptive. On your local environment, with a small data set, they are fast. Once the data grows to 10 GB, they become slow.  Typically, to increase speed, compound indexes like the following are added:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='js'><span class='line'><span class="nx">db</span><span class="p">.</span><span class="nx">events</span><span class="p">.</span><span class="nx">ensureIndex</span><span class="p">({</span><span class="nx">store_id</span><span class="o">:</span> <span class="mi">1</span><span class="p">,</span> <span class="nx">created_at</span><span class="o">:</span> <span class="mi">1</span><span class="p">})</span>
</span><span class='line'><span class="nx">db</span><span class="p">.</span><span class="nx">events</span><span class="p">.</span><span class="nx">ensureIndex</span><span class="p">({</span><span class="nx">event</span><span class="o">:</span> <span class="mi">1</span><span class="p">,</span> <span class="nx">created_at</span><span class="o">:</span> <span class="mi">1</span><span class="p">})</span>
</span><span class='line'><span class="nx">db</span><span class="p">.</span><span class="nx">events</span><span class="p">.</span><span class="nx">ensureIndex</span><span class="p">({</span><span class="nx">store_id</span><span class="o">:</span> <span class="mi">1</span><span class="p">,</span> <span class="nx">event</span><span class="o">:</span> <span class="mi">1</span><span class="p">,</span> <span class="nx">created_at</span><span class="o">:</span> <span class="mi">1</span><span class="p">}</span> <span class="p">)</span>
</span></code></pre></td></tr></table></div></figure>


<p>Each of these indexes must fit entirely in RAM.  New documents do not have greater values for <code>store_id</code> and <code>event</code> than previously inserted documents, so any insert adds an entry to the middle of an index, and any query has to venture into the middle of the index as well.  Thus, all indexes must fit in RAM to maximize performance.</p>

<h2>An Optimized Document Schema</h2>

<p>To optimize your document schema, create a <code>time_bucket</code> attribute that breaks down acceptable date ranges to hour, day, month, week, quarter, and/or year.</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
</pre></td><td class='code'><pre><code class='js'><span class='line'><span class="p">{</span>
</span><span class='line'>  <span class="nx">store_id</span><span class="o">:</span> <span class="nx">ObjectId</span><span class="p">(),</span> <span class="c1">// Object id of a store</span>
</span><span class='line'>  <span class="nx">event</span><span class="o">:</span> <span class="s2">&quot;door open&quot;</span><span class="p">,</span>
</span><span class='line'>  <span class="nx">created_at</span><span class="o">:</span> <span class="k">new</span> <span class="nb">Date</span><span class="p">(</span><span class="s2">&quot;2013-01-29T08:43:00Z&quot;</span><span class="p">),</span>
</span><span class='line'>  <span class="nx">time_bucket</span><span class="o">:</span> <span class="p">[</span>
</span><span class='line'>    <span class="s2">&quot;2013-01-29 08-hour&quot;</span><span class="p">,</span>
</span><span class='line'>    <span class="s2">&quot;2013-01-29-day&quot;</span><span class="p">,</span>
</span><span class='line'>    <span class="s2">&quot;2013-04-week&quot;</span><span class="p">,</span>
</span><span class='line'>    <span class="s2">&quot;2013-01-month&quot;</span><span class="p">,</span>
</span><span class='line'>    <span class="s2">&quot;2013-01-quarter&quot;</span><span class="p">,</span>
</span><span class='line'>    <span class="s2">&quot;2013-year&quot;</span>
</span><span class='line'>  <span class="p">]</span>
</span><span class='line'><span class="p">}</span>
</span></code></pre></td></tr></table></div></figure>
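
<p>One way to build that array at insert time is sketched below. The helper names are hypothetical, everything is computed in UTC, and the week number counts from the first Sunday of the year. That week convention is an assumption; it happens to reproduce the “2013-04-week” value in the sample document.</p>

<pre><code>// Pads a number to two digits, e.g. 8 becomes "08".
function pad2(n) {
  return ("0" + n).slice(-2);
}

// Hypothetical helper: builds the time_bucket array for a Date (UTC).
function timeBuckets(date) {
  var year = date.getUTCFullYear();
  var month = pad2(date.getUTCMonth() + 1);
  var day = pad2(date.getUTCDate());
  var hour = pad2(date.getUTCHours());

  // Zero-based day of the year, used for the week number.
  var startOfYear = Date.UTC(year, 0, 1);
  var startOfDay = Date.UTC(year, date.getUTCMonth(), date.getUTCDate());
  var dayOfYear = Math.round((startOfDay - startOfYear) / 86400000);

  // Assumed convention: weeks start on Sunday, days before the first Sunday are week 00.
  var week = pad2(Math.floor((dayOfYear + 7 - date.getUTCDay()) / 7));
  var quarter = pad2(Math.floor(date.getUTCMonth() / 3) + 1);

  return [
    year + "-" + month + "-" + day + " " + hour + "-hour",
    year + "-" + month + "-" + day + "-day",
    year + "-" + week + "-week",
    year + "-" + month + "-month",
    year + "-" + quarter + "-quarter",
    year + "-year"
  ];
}
</code></pre>

<p>You would then set <code>time_bucket: timeBuckets(created_at)</code> on each document before inserting it.</p>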


<p>With the optimized schema, you would create the following indexes:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='js'><span class='line'><span class="nx">db</span><span class="p">.</span><span class="nx">events</span><span class="p">.</span><span class="nx">ensureIndex</span><span class="p">({</span><span class="nx">time_bucket</span><span class="o">:</span> <span class="mi">1</span><span class="p">,</span> <span class="nx">store_id</span><span class="o">:</span> <span class="mi">1</span><span class="p">,</span> <span class="nx">event</span><span class="o">:</span> <span class="mi">1</span><span class="p">})</span>
</span><span class='line'><span class="nx">db</span><span class="p">.</span><span class="nx">events</span><span class="p">.</span><span class="nx">ensureIndex</span><span class="p">({</span><span class="nx">time_bucket</span><span class="o">:</span> <span class="mi">1</span><span class="p">,</span> <span class="nx">event</span><span class="o">:</span> <span class="mi">1</span><span class="p">})</span>
</span></code></pre></td></tr></table></div></figure>


<p>With this document schema, we use a practice called “bucketing”.  Instead of building a query on a range, we run the query:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='js'><span class='line'><span class="nx">db</span><span class="p">.</span><span class="nx">events</span><span class="p">.</span><span class="nx">find</span><span class="p">({</span><span class="nx">store_id</span><span class="o">:</span> <span class="nx">ObjectId</span><span class="p">(</span><span class="s2">&quot;aaa&quot;</span><span class="p">),</span> <span class="s2">&quot;time_bucket&quot;</span><span class="o">:</span> <span class="s2">&quot;2013-01-29-day&quot;</span><span class="p">})</span>
</span></code></pre></td></tr></table></div></figure>


<p>The drawback of this optimized document schema is that every non-_id query must include a <code>time_bucket</code> attribute. However, most reporting queries require a date specification anyway.</p>

<h2>Better Use of RAM</h2>

<p>Using the optimized <code>time_bucket</code>, new documents are added to the right side of the index. Any inserted document will have a greater <code>time_bucket</code> value than the previous documents. By adding to the right side of the index and using <code>time_bucket</code> to query, MongoDB can swap rarely accessed older documents to disk and run with minimal RAM usage. Your “hot data” will be the most recently accessed data (typically 1 to 3 months with most analytics applications), and the older data will settle nicely to disk.</p>

<p>Neither queries nor inserts will access the middle of the index, and older index chunks can swap to disk.</p>
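
<p>The difference between the two key orders can be sketched in plain JavaScript. This is a deliberately simplified model, not MongoDB's B-tree: each document contributes one string key, whereas a real multikey <code>time_bucket</code> index holds one entry per bucket. The key format and the function name are made up for illustration.</p>

<pre><code>// Returns where newKey would land among the sorted keys:
// 0 is the far left of the index, 1 is the far right.
function relativeInsertPosition(existingKeys, newKey) {
  var sorted = existingKeys.concat([newKey]).sort();
  return sorted.lastIndexOf(newKey) / (sorted.length - 1);
}

var stores = ["storeA", "storeB", "storeC"];
var days = ["2013-01-01", "2013-01-02"];
var storeFirst = [];
var timeFirst = [];
stores.forEach(function (store) {
  days.forEach(function (day) {
    storeFirst.push(store + "|" + day); // like {store_id: 1, created_at: 1}
    timeFirst.push(day + "|" + store);  // like {time_bucket: 1, store_id: 1}
  });
});

// A new event for storeA on the next day:
var middle = relativeInsertPosition(storeFirst, "storeA|2013-01-03");
var right = relativeInsertPosition(timeFirst, "2013-01-03|storeA");
</code></pre>

<p>With the store-first keys the new entry lands about a third of the way into the index, while with the time-first keys it lands at the far right, next to the other recent entries.</p>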

<h2>Bonus Points: Using the Aggregation Framework</h2>

<p>Using aggregation to count each event per time bucket (hour, day, week, and so on) for a given month, we can run:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
</pre></td><td class='code'><pre><code class='js'><span class='line'><span class="nx">db</span><span class="p">.</span><span class="nx">events</span><span class="p">.</span><span class="nx">aggregate</span><span class="p">(</span>
</span><span class='line'>  <span class="p">{</span><span class="nx">$match</span><span class="o">:</span> <span class="p">{</span><span class="nx">time_bucket</span><span class="o">:</span> <span class="s2">&quot;2013-01-month&quot;</span><span class="p">}},</span>
</span><span class='line'>  <span class="p">{</span><span class="nx">$unwind</span><span class="o">:</span> <span class="s2">&quot;$time_bucket&quot;</span><span class="p">},</span>
</span><span class='line'>  <span class="p">{</span><span class="nx">$project</span><span class="o">:</span> <span class="p">{</span><span class="nx">time_bucket_event</span><span class="o">:</span> <span class="p">{</span><span class="nx">$concat</span><span class="o">:</span> <span class="p">[</span><span class="s2">&quot;$time_bucket&quot;</span><span class="p">,</span> <span class="s2">&quot;/&quot;</span><span class="p">,</span> <span class="s2">&quot;$event&quot;</span><span class="p">]}}},</span>
</span><span class='line'>  <span class="p">{</span><span class="nx">$group</span><span class="o">:</span> <span class="p">{</span><span class="nx">_id</span><span class="o">:</span> <span class="s2">&quot;$time_bucket_event&quot;</span><span class="p">,</span> <span class="nx">event_count</span><span class="o">:</span> <span class="p">{</span><span class="s2">&quot;$sum&quot;</span><span class="o">:</span> <span class="mi">1</span><span class="p">}}}</span>
</span><span class='line'><span class="p">)</span>
</span></code></pre></td></tr></table></div></figure>


<p><em>Disclaimer: I used the $concat operator that will be available in MongoDB 2.4. To use a hosted MongoDB 2.4 Development Branch, take a look at our experimental databases.</em></p>

<p>When using the aggregation framework, cache a final run for any particular day to a summated collection.  Use this summated collection to present data via any application or reporting server.</p>
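
<p>To make the output of the pipeline above concrete, here is a plain JavaScript sketch that mimics its <code>$unwind</code>, <code>$concat</code>, and <code>$group</code> stages on an in-memory array. It only illustrates the result shape, not how MongoDB executes the pipeline, and the function names are hypothetical. Because every entry of <code>time_bucket</code> is unwound, each granularity gets its own count, so a second helper keeps just the per-day entries.</p>

<pre><code>// Mimics $unwind + $concat + $group on an array of event documents.
function countEventsPerBucket(events) {
  var counts = {};
  events.forEach(function (doc) {
    doc.time_bucket.forEach(function (bucket) { // $unwind
      var key = bucket + "/" + doc.event;        // $concat
      counts[key] = (counts[key] || 0) + 1;      // $group with $sum: 1
    });
  });
  return counts;
}

// Keeps only the day-level counts, whose keys contain "-day/".
function onlyDayBuckets(counts) {
  var days = {};
  Object.keys(counts).forEach(function (key) {
    if (key.indexOf("-day/") !== -1) {
      days[key] = counts[key];
    }
  });
  return days;
}
</code></pre>

<p>The day-level counts are the ones you would write to the summated collection once a day is complete.</p>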

<h2>Follow Up Materials</h2>

<p>The premier presentation on MongoDB analytics is by John Nunemaker of GitHub. He regularly makes the rounds at MongoDB conferences covering how to do MongoDB analytics properly. 10gen makes these presentations available. <a href="http://www.10gen.com/presentations/mongosv-2012/mongodb-analytics-github">MongoDB for Analytics (at Github)</a></p>

<p>When building a scaling analytics system, look for logical &#8220;buckets&#8221; for data.  Avoid using &#8220;$in&#8221;, &#8220;$gte&#8221;, &#8220;$lte&#8221; operators when possible.  MongoDB is fun because it rewards creativity for good schema design.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[MongoDB and Full Text Search: My First Week with MongoDB 2.4 Development Release]]></title>
    <link href="http://blog.mongohq.com/blog/2013/01/22/first-week-with-mongodb-2-dot-4-development-release/"/>
    <updated>2013-01-22T13:38:00-06:00</updated>
    <id>http://blog.mongohq.com/blog/2013/01/22/first-week-with-mongodb-2-dot-4-development-release</id>
    <content type="html"><![CDATA[<p>At MongoHQ, we deployed and made available for use a hosted MongoDB 2.3 beta server to allow ourselves and our customers an opportunity to check out some of the new features that 10gen has been working on. Admittedly, a nice benefit of working at MongoHQ is we get to toy around with a lot of new and interesting technology.</p>

<p>From my initial reading of the 2.3 release notes, I underestimated full text searching.  Before getting into that, I want to mention one gotcha.</p>

<h3>So, the gotcha &#8230;</h3>

<p><strong>CopyDB and Auth</strong> - Using the hosted MongoDB 2.3 server, I wanted to use some of MongoHQ&#8217;s tools to transfer data from an existing database to my new MongoDB 2.3 database. To do this, I used the built-in <a href="http://docs.mongodb.org/manual/tutorial/copy-databases-between-instances/"><code>copyDatabase</code> command</a>, which MongoHQ exposes via their web-based UI. When I did that, I saw the following:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>Tue Jan 22 14:58:44.314 [conn697] copydbgetnonce is not supported when running with authentication enabled
</span><span class='line'>Tue Jan 22 14:58:44.315 [conn697] copydb is not supported when running with authentication enabled</span></code></pre></td></tr></table></div></figure>


<p>I temporarily got around this using the <code>mongodump</code> and <code>mongorestore</code> commands, but it looks like there are still some auth-related bugs to be cleaned up in the development release. No worries, as finding these is what development releases are for.</p>

<p>Now, on to some details around full-text search&#8230;</p>

<h2>Underestimating Full Text Searching</h2>

<p>Full-text searching with MongoDB 2.4 is more complex and powerful than originally illustrated in our first blog post outlining this feature. In the original example, we showed:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
</pre></td><td class='code'><pre><code class='javascript'><span class='line'><span class="nx">db</span><span class="p">.</span><span class="nx">emails</span><span class="p">.</span><span class="nx">ensureIndex</span><span class="p">({</span><span class="nx">body</span><span class="o">:</span> <span class="s2">&quot;text&quot;</span><span class="p">},</span> <span class="p">{</span><span class="nx">name</span><span class="o">:</span> <span class="s2">&quot;email_body_text_index&quot;</span><span class="p">})</span>
</span><span class='line'><span class="nx">db</span><span class="p">.</span><span class="nx">emails</span><span class="p">.</span><span class="nx">runCommand</span><span class="p">(</span> <span class="s2">&quot;text&quot;</span><span class="p">,</span> <span class="p">{</span><span class="nx">search</span><span class="o">:</span> <span class="s2">&quot;Pho&quot;</span><span class="p">}</span> <span class="p">)</span>
</span></code></pre></td></tr></table></div></figure>


<p>However, there are a number of other settings that are available for these query types, including:</p>

<ul>
<li>Field weighting</li>
<li>Negative Matches</li>
<li>Phrase Matching</li>
<li>Entire document indexes</li>
</ul>


<p>So, I&#8217;ll quickly break down each one of these parameters, as each is pretty useful:</p>

<h3>Field Weighting</h3>

<p>The documented spec only allows one text index per collection, but it allows multiple fields on that single index.  Initially, my thought goes to fields like <code>tags</code>, <code>subject</code>, and <code>body</code>, which can be weighted as follows:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
</pre></td><td class='code'><pre><code class='javascript'><span class='line'><span class="nx">db</span><span class="p">.</span><span class="nx">emails</span><span class="p">.</span><span class="nx">ensureIndex</span><span class="p">({</span><span class="nx">tags</span><span class="o">:</span> <span class="s2">&quot;text&quot;</span><span class="p">,</span> <span class="nx">subject</span><span class="o">:</span> <span class="s2">&quot;text&quot;</span><span class="p">,</span> <span class="s2">&quot;body&quot;</span><span class="o">:</span> <span class="s2">&quot;text&quot;</span><span class="p">},</span> <span class="p">{</span>
</span><span class='line'>    <span class="nx">name</span><span class="o">:</span> <span class="s2">&quot;email_text_index&quot;</span><span class="p">,</span>
</span><span class='line'>    <span class="nx">weights</span><span class="o">:</span> <span class="p">{</span>
</span><span class='line'>      <span class="nx">tags</span><span class="o">:</span> <span class="mi">5</span><span class="p">,</span>    <span class="c1">// Assume folks are better at tagging than writing</span>
</span><span class='line'>      <span class="nx">subject</span><span class="o">:</span> <span class="mi">4</span><span class="p">,</span> <span class="c1">// Assume the subject is better than the body at description</span>
</span><span class='line'>      <span class="nx">body</span><span class="o">:</span> <span class="mi">1</span>
</span><span class='line'>    <span class="p">}</span>
</span><span class='line'><span class="p">})</span>
</span></code></pre></td></tr></table></div></figure>


<p>MongoDB uses a scoring algorithm to choose the most appropriate documents based on the weights and the number of matches.</p>

<h3>Negative Terms (not yet documented) and Phrases</h3>

<p>If you look <a href="https://github.com/mongodb/mongo/blob/master/src/mongo/db/fts/fts_matcher.cpp#L38">at the code</a>, you will see negative words are also possible.  So, you run something like the following:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='javascript'><span class='line'><span class="nx">db</span><span class="p">.</span><span class="nx">emails</span><span class="p">.</span><span class="nx">runCommand</span><span class="p">(</span><span class="s1">&#39;text&#39;</span><span class="p">,</span> <span class="p">{</span><span class="nx">search</span><span class="o">:</span> <span class="s2">&quot;Pho -\&quot;gangnam chicken\&quot;&quot;</span><span class="p">})</span>
</span></code></pre></td></tr></table></div></figure>


<p>Running the above, you would find documents that mention &#8220;Pho&#8221; but not the phrase &#8220;gangnam chicken&#8221;.</p>

<h3>Entire Document Indexing (i.e. wildcard)</h3>

<p>Scanning through the documentation, I ran across this nugget: &#8220;$**&#8221;.  With it, you can supply the entire document to full text searching like so:</p>

<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
</pre></td><td class='code'><pre><code class='javascript'><span class='line'><span class="nx">db</span><span class="p">.</span><span class="nx">emails</span><span class="p">.</span><span class="nx">ensureIndex</span><span class="p">({</span><span class="s2">&quot;$**&quot;</span><span class="o">:</span> <span class="s2">&quot;text&quot;</span><span class="p">},</span> <span class="p">{</span><span class="nx">name</span><span class="o">:</span> <span class="s2">&quot;email_index_text&quot;</span><span class="p">})</span>
</span></code></pre></td></tr></table></div></figure>


<p><em>If you are following along at home, you will need to drop the index above before adding this one</em></p>

<p>I am working with very large documents on our test database, and as expected the wildcard indexes were noticeably <em>slower</em> than specifying precise fields. This certainly makes sense, but is something to keep in mind as your data set grows.</p>

<h2>Index Sizes</h2>

<p>After this first week, I began thinking &#8220;how big are these indexes?&#8221;  How much RAM will they require?  In our internal tests, we are not working with huge data sets on this project yet, but we were seeing wildcard indexes consuming about 1/3 of the total data size for the collection.</p>

<p>Given this, you should take this type of memory use into consideration as some long-running text searches could inadvertently shove a good chunk of your active data set out of memory. 10gen makes mention of this in their release notes, so make sure you keep this in mind as you plan out resources required to use this feature.</p>

<h2>The Code</h2>

<p>All of MongoDB is open-source.  All of the full text search is modularized inside the <code>fts</code> directory at:</p>

<p><a href="https://github.com/mongodb/mongo/tree/master/src/mongo/db/fts">https://github.com/mongodb/mongo/tree/master/src/mongo/db/fts</a></p>

<p>The significant files are:</p>

<ul>
<li><code>fts_matcher.cpp</code> contains the matching algorithm &#8211; there are some undocumented nuggets in there.</li>
<li><code>fts_search.cpp</code> contains the search runner and result builder.</li>
</ul>


<p>Even if you are just a Ruby or Node or PHP or Python developer and prefer to stay away from C++, reading code is always a good exercise. As you learn more about this feature, take a moment to learn more about the code that operates it as well.</p>

<p>We are excited about some of the capabilities that full-text search adds to MongoDB and looking forward to seeing this feature mature over the coming months.</p>

<h3>Get Started</h3>

<p>To get started testing the MongoDB 2.3 release, <a href="https://www.mongohq.com/signup">sign up for MongoHQ</a>, and create a database on the &#8220;Experimental&#8221; plan.</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Explore MongoDB 2.3 on MongoHQ]]></title>
    <link href="http://blog.mongohq.com/blog/2013/01/17/explore-mongodb-2-dot-3-on-mongohq/"/>
    <updated>2013-01-17T10:06:00-06:00</updated>
    <id>http://blog.mongohq.com/blog/2013/01/17/explore-mongodb-2-dot-3-on-mongohq</id>
    <content type="html"><![CDATA[<p>There are some interesting and exciting changes coming to MongoDB in the 2.4 production release, including:</p>

<ul>
<li>Full-text search</li>
<li>Change in the internal JavaScript engine</li>
<li>Hashed indexes</li>
</ul>


<p>We&#8217;ll talk a bit more about these changes in just a moment, but first, we wanted to give our customers <strong>a way to try out the new features</strong>, even though they are in the 2.3 development release. You  can do this by doing the following:</p>

<ol>
<li><a href="https://new.mongohq.com">Log into MongoHQ</a></li>
<li>Create a new sandbox database with the &#8220;Dharma 2.3 Experimental&#8221; option.</li>
<li>Insert data</li>
<li>Make profit.</li>
</ol>


<p>Now that you are all set and have your experimental database provisioned and ready to go, let&#8217;s talk about a couple of the new features.</p>

<h3>Full-text Search</h3>

<p>10gen has pointed out on numerous occasions that this has been one of the most-requested features on the MongoDB roadmap, so the engineers are pretty stoked to finally offer it.</p>

<p>However, it is still highly experimental. Good to explore, but do not use it on production environments.</p>

<p>If you have a collection of emails with a field called <code>email_body</code>, you can add the following index:</p>

<pre><code>db.emails.ensureIndex({email_body: "text"}, {name: "email_body_text_index"})
</code></pre>

<p>Allow the index to build and from there, you should be good to start testing queries against it. A query may look like:</p>

<pre><code>db.emails.runCommand( "text", {search: "Pho"} )
</code></pre>

<p>This will return a BSON document &#8230; not a cursor. So, a bit of a change there.</p>
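
<p>Since the result is a plain document, you pull the matches out of it yourself. Here is a minimal sketch, assuming the response carries a <code>results</code> array whose entries hold a <code>score</code> and the matched document under <code>obj</code>. The helper name is hypothetical, and the response shape may change while the feature is experimental.</p>

<pre><code>// Hypothetical helper: returns the matched documents from a text command response.
// Assumes a response shaped like {results: [{score: 1.5, obj: {...}}], ok: 1}.
function matchedDocuments(response) {
  return (response.results || []).map(function (result) {
    return result.obj;
  });
}
</code></pre>

<p>In the shell, you could pass the return value of the <code>runCommand</code> call above straight into this function.</p>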

<p>This should get you started &#8230; there are a number of additional features and options that you can use when querying. For more information about doing text queries with this new functionality, check  out the <a href="http://docs.mongodb.org/manual/release-notes/2.4/#text-queries">release notes for MongoDB 2.3</a>.</p>

<h3>Changes to the JavaScript Engine</h3>

<p>With the 2.3/2.4 release of MongoDB, 10gen is replacing Mozilla&#8217;s SpiderMonkey JavaScript engine with Google&#8217;s open-source V8 engine. While there are some speed improvements in various cases, the details of this change are probably of little interest to most people using MongoDB as a database.</p>

<p>Nevertheless, it is worth benchmarking some of your map-reduce functionality on this new version of MongoDB against the 2.2 versions.</p>

<h3>That&#8217;s It</h3>

<p>We hope you have fun testing this developmental release of MongoDB. <strong>Please do not use this environment for anything production-related</strong>. If you have any questions, we&#8217;re happy to help!</p>
]]></content>
  </entry>
  
  <entry>
    <title type="html"><![CDATA[Replica Set Read Preferences and MongoDB 2.2]]></title>
    <link href="http://blog.mongohq.com/blog/2013/01/15/replica-set-read-preferences-and-mongodb-2-dot-2/"/>
    <updated>2013-01-15T11:47:00-06:00</updated>
    <id>http://blog.mongohq.com/blog/2013/01/15/replica-set-read-preferences-and-mongodb-2-dot-2</id>
    <content type="html"><![CDATA[<h3>Replica Set Read Preferences and MongoDB 2.2</h3>

<p>  When making the transition from a single server setup to a Replica Set
there are many more benefits than just a hot standby.  Read preferences
are one that deserves serious consideration.  Benefits range from reducing
system load and/or network throughput on the primary to location-based
reads that are routed by latency from the client.  There are several
options for configuring read preferences and different reasons why you
might choose one over another.</p>

<p>  Recent versions of MongoDB have included support for sending read
operations to a secondary member, but with the release of 2.2 three new
modes were added: primaryPreferred, secondaryPreferred
and nearest.  As in prior versions, these can be applied per
connection, collection or operation.</p>

<h4>Read preference modes:</h4>

<ul>
<li><p><strong>primary</strong></p>

<p>The default.  Sends all read operations to the primary.  If the primary is
unavailable, reads fail.  Use this mode when stale results are
not an option.</p></li>
<li><p><strong>primaryPreferred</strong> (NEW)</p>

<p>Similar to the default, except that when the primary is unavailable, read
operations are sent to a secondary.  This keeps reads available
while a secondary is promoted to primary during a
failover.  Failovers don&#8217;t occur often, but one must be
prepared.</p></li>
<li><p><strong>secondary</strong></p>

<p>Sends all read operations to a secondary.  If no secondary is available,
the operation fails.</p></li>
<li><p><strong>secondaryPreferred</strong> (NEW)</p>

<p>Sends all read operations to a secondary unless no secondary is
available, in which case reads are sent to the
primary.</p></li>
<li><p><strong>nearest</strong> (NEW)</p>

<p>Read operations are sent to the nearest member, chosen by the
<a href="http://docs.mongodb.org/manual/applications/replication/#replica-set-read-preference-behavior-nearest">member selection</a>
process.  Useful when stale results are acceptable and low
latency is a priority.</p></li>
</ul>
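<p>To make the fallback rules above concrete, here is a rough JavaScript model of how each mode picks a member. It is a simplified sketch, not the drivers&#8217; actual algorithm: it ignores tag sets and latency windows, and the <code>selectMember</code> function, the member fields and the host names are all invented for illustration.</p>

```javascript
// Simplified model of read preference modes; real drivers also weigh
// tag sets and ping-time windows, which this sketch leaves out.
// members: [{ host, state: "PRIMARY" | "SECONDARY", up: boolean, pingMs }]
function selectMember(mode, members) {
  var up = members.filter(function (m) { return m.up; });
  var primary = up.filter(function (m) { return m.state === "PRIMARY"; })[0] || null;
  var secondary = up.filter(function (m) { return m.state === "SECONDARY"; })[0] || null;

  switch (mode) {
    case "primary":            return primary;              // null means the read fails
    case "primaryPreferred":   return primary || secondary; // fall back to a secondary
    case "secondary":          return secondary;            // null means the read fails
    case "secondaryPreferred": return secondary || primary; // fall back to the primary
    case "nearest":
      // Lowest ping wins, regardless of primary/secondary state.
      return up.slice().sort(function (a, b) { return a.pingMs - b.pingMs; })[0] || null;
    default:
      throw new Error("unknown read preference mode: " + mode);
  }
}

// Hypothetical replica set in the middle of a failover: the primary is down.
var members = [
  { host: "m0.example.com:10011", state: "PRIMARY", up: false, pingMs: 12 },
  { host: "m1.example.com:10011", state: "SECONDARY", up: true, pingMs: 30 }
];
```

<p>With this sample set, <code>selectMember("primary", members)</code> returns <code>null</code> (the read would fail), while <code>selectMember("primaryPreferred", members)</code> falls back to the surviving secondary.</p>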


<h4>Driver examples (Ruby):</h4>

<ul>
<li><p><strong>per connection</strong></p>

<p>When you specify the read preference on the connection, all objects
created from it inherit this preference unless it is explicitly overridden.</p></li>
</ul>


<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="n">db</span> <span class="o">=</span>
</span><span class='line'><span class="no">Mongo</span><span class="o">::</span><span class="no">ReplSetConnection</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="o">[</span><span class="s1">&#39;host.m0.mongohq.com:10011&#39;</span><span class="p">,</span><span class="s1">&#39;host.m1.mongohq.com:10011&#39;</span><span class="o">]</span><span class="p">,</span>
</span><span class='line'><span class="ss">:read</span> <span class="o">=&gt;</span> <span class="ss">:secondary_preferred</span><span class="p">)</span><span class="o">.</span><span class="n">db</span><span class="p">(</span><span class="s2">&quot;myapp&quot;</span><span class="p">)</span>
</span></code></pre></td></tr></table></div></figure>


<ul>
<li><strong>per operation</strong></li>
</ul>


<figure class='code'><figcaption><span></span></figcaption><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
</pre></td><td class='code'><pre><code class='ruby'><span class='line'><span class="n">db</span> <span class="o">=</span>
</span><span class='line'><span class="no">Mongo</span><span class="o">::</span><span class="no">ReplSetConnection</span><span class="o">.</span><span class="n">new</span><span class="p">(</span><span class="o">[</span><span class="s1">&#39;host.m0.mongohq.com:10011&#39;</span><span class="p">,</span><span class="s1">&#39;host.m1.mongohq.com:10011&#39;</span><span class="o">]</span><span class="p">,</span>
</span><span class='line'><span class="ss">:read</span> <span class="o">=&gt;</span> <span class="ss">:primary</span><span class="p">)</span><span class="o">.</span><span class="n">db</span><span class="p">(</span><span class="s2">&quot;myapp&quot;</span><span class="p">)</span>
</span><span class='line'><span class="n">auth</span> <span class="o">=</span> <span class="n">db</span><span class="o">.</span><span class="n">authenticate</span><span class="p">(</span><span class="s2">&quot;user&quot;</span><span class="p">,</span><span class="s2">&quot;password&quot;</span><span class="p">)</span>
</span><span class='line'><span class="n">staging</span> <span class="o">=</span> <span class="n">db</span><span class="o">.</span><span class="n">collection</span><span class="p">(</span><span class="s1">&#39;staging&#39;</span><span class="p">)</span>
</span><span class='line'><span class="n">staging</span><span class="o">.</span><span class="n">find</span><span class="p">({</span><span class="ss">:customer_name</span> <span class="o">=&gt;</span> <span class="s1">&#39;John&#39;</span><span class="p">},</span> <span class="ss">:read</span> <span class="o">=&gt;</span>
</span><span class='line'><span class="ss">:secondary_preferred</span><span class="p">)</span>
</span></code></pre></td></tr></table></div></figure>


<p>  Other driver documentation can be found here: <a href="http://api.mongodb.org">http://api.mongodb.org</a></p>

<p>  As you can see, the new modes offer more flexibility over where your
read operations take place.  Although most applications perform well
without configuring a read preference, fine-tuning it as you grow may head off
future issues.</p>
]]></content>
  </entry>
  
</feed>
