<div class="gmail_quote">On Fri, Jul 16, 2010 at 2:34 PM, Nolan Darilek <span dir="ltr">&lt;<a href="mailto:nolan@thewordnerd.info">nolan@thewordnerd.info</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
-----BEGIN PGP SIGNED MESSAGE-----<br>
Hash: SHA1<br>
<br>
Hey, folks. I suppose that I might be wrong about this, and if so then<br>
I'd love to be, but I thought that I'd share my recent findings here<br>
either to inspire work in different directions if possible, or to close<br>
a possibly useless avenue of work if not.<br>
<br>
I have a real-time navigation app that stores its data in MongoDB, using<br>
its geospatial queries. During the course of my work, I routinely found<br>
that some queries were reliably slow while others were reliably fast.<br>
We're talking differences of seconds, some taking 5, some 30, while<br>
others were completed in under a second. Naturally, this is unsuitable<br>
for an app that needs to provide near real-time feedback. I opened an<br>
issue here, including my dump of an import of Texas' OSM nodes:<br>
<br>
<a href="http://jira.mongodb.org/browse/SERVER-1392" target="_blank">http://jira.mongodb.org/browse/SERVER-1392</a><br>
<br>
It seems that I'm running up against limitations in MongoDB's<br>
geohash-based mechanism. It's probably perfectly suitable for most<br>
average geospatial-based searches, but not so much for the case of<br>
rendering OSM data in reliably short bursts of time. The issue has been<br>
marked wontfix.<br>
<br>
I'm open to the possibility that I'm missing something, but have long<br>
suspected that the geospatial support wasn't up to something of this<br>
magnitude. I suppose that it might work as a data storage and<br>
replication system, but if you need to get back data quickly then<br>
MongoDB likely isn't a good fit.<br>
<br>
Anyhow, I thought that I'd share, especially as some of us were<br>
discussing use of MongoDB here a few weeks back.<br><br></blockquote><div><br></div><div>I was seeing quite the opposite results with smallish (city-sized) bounding boxes: I was getting very fast responses (much faster than a second or two). I was definitely running into limitations of the Python serializer/deserializer before I was running into limitations of Mongo. This was after inserting most of an entire planet dump.</div>
<div><br></div><div>However, it does seem like his explanation is valid: geohashing sorts points into buckets, and when a bucket covers too large an area it fills up with points and queries against it get slow. The nice thing about geohashing is that you can have arbitrarily sized buckets, so I always assumed that Mongo was picking the bucket size based on how many points it saw. Maybe not.</div>
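<div><br></div><div>To make the bucket idea concrete, here is a rough sketch in Python. It is not how Mongo actually implements its geo index; the function names, the sample coordinates, and the bit counts are all made up for illustration. It just bisects longitude and latitude a fixed number of times and interleaves the results into a cell id, so fewer bits means bigger buckets holding more points.</div>

```python
from collections import defaultdict

# Hypothetical sample data: three points in Austin and one in Houston,
# given as (longitude, latitude). Not taken from the OSM dump in the ticket.
SAMPLE_POINTS = [
    (-97.74, 30.27),
    (-97.75, 30.28),
    (-97.70, 30.30),
    (-95.37, 29.76),
]


def geohash_cell(lon, lat, bits):
    """Return an integer cell id built from `bits` bisections per axis.

    Each step halves the current longitude range, then the latitude range,
    and appends one bit per axis (1 = upper half, 0 = lower half).
    """
    lon_lo, lon_hi = -180.0, 180.0
    lat_lo, lat_hi = -90.0, 90.0
    cell = 0
    for _ in range(bits):
        lon_mid = (lon_lo + lon_hi) / 2
        if lon >= lon_mid:
            cell = (cell << 1) | 1
            lon_lo = lon_mid
        else:
            cell = cell << 1
            lon_hi = lon_mid

        lat_mid = (lat_lo + lat_hi) / 2
        if lat >= lat_mid:
            cell = (cell << 1) | 1
            lat_lo = lat_mid
        else:
            cell = cell << 1
            lat_hi = lat_mid
    return cell


def bucket_points(points, bits):
    """Group points by cell id; a coarse `bits` value gives fuller buckets."""
    buckets = defaultdict(list)
    for lon, lat in points:
        buckets[geohash_cell(lon, lat, bits)].append((lon, lat))
    return buckets


if __name__ == "__main__":
    for bits in (4, 8, 16):
        buckets = bucket_points(SAMPLE_POINTS, bits)
        fullest = max(len(pts) for pts in buckets.values())
        print(bits, "bits per axis:", len(buckets), "buckets, fullest holds", fullest)
```

<div><br></div><div>With a fixed coarse precision, a dense area like a city ends up in one or two buckets that have to be scanned in full, which would match the slow queries described above. Choosing the precision from the observed point density is the adaptive behaviour I had assumed.</div>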
</div><br>