[OSM-dev] Fwd: Re: OSM and MongoDB

Greg Studer greg at 10gen.com
Wed Apr 13 15:35:35 BST 2011


MongoDB does use a geohash as the indexing method for geo-searches, but
I'm pretty sure that's not the cause of the huge query times.  The
geohashing itself tends to be very fast, but the way points were buffered
for return in pre-1.9 releases could, for particular point distributions,
cause these slowdowns - I'm guessing the neighboring boxes had many more
points.
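
To make the "boxes" concrete, here is a minimal Python sketch of a
geohash-style key, assuming plain bit-interleaving of longitude and
latitude.  The names geohash_bits and box_prefix are made up for
illustration, and this is not MongoDB's actual implementation; it only
shows why nearby points land in the same box, and why a densely
populated box means many candidates to scan.

```python
def geohash_bits(lon, lat, bits=26):
    """Interleave `bits` bits each of longitude and latitude into one int.

    Illustrative only: the function name and the bits=26 example value
    are assumptions for this sketch, not MongoDB's code.
    """
    lon_lo, lon_hi = -180.0, 180.0
    lat_lo, lat_hi = -90.0, 90.0
    h = 0
    for _ in range(bits):
        # Halve the current longitude and latitude ranges.
        lon_mid = (lon_lo + lon_hi) / 2
        lat_mid = (lat_lo + lat_hi) / 2
        lon_bit = 1 if lon >= lon_mid else 0
        lat_bit = 1 if lat >= lat_mid else 0
        if lon_bit:
            lon_lo = lon_mid
        else:
            lon_hi = lon_mid
        if lat_bit:
            lat_lo = lat_mid
        else:
            lat_hi = lat_mid
        # Append one longitude bit, then one latitude bit.
        h = (h << 2) | (lon_bit << 1) | lat_bit
    return h


def box_prefix(h, bits, level):
    """Return the coarser box (first `level` bit pairs) containing hash h."""
    return h >> (2 * (bits - level))


# Two points a short distance apart, and one far away (hypothetical data).
a = geohash_bits(-87.63, 41.88)
b = geohash_bits(-87.62, 41.89)
far = geohash_bits(-0.12, 51.50)

print(box_prefix(a, 26, 8) == box_prefix(b, 26, 8))    # same coarse box?
print(box_prefix(a, 26, 8) == box_prefix(far, 26, 8))  # different box?
```

A box query scans every point whose hash shares the box prefix, so the
cost depends on how full that box and its neighbors are, not on how
fast the hash itself is to compute.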

Exact point checks and distances are also being introduced in 1.9, so
when/if the hash isn't precise enough to complete your search, you
shouldn't get these types of inaccurate results (the hash is currently
tunable up to 32 bits of precision).  Of course, these are all new
developments (along with polygon searches and multi-location documents);
geo-indexing has gotten a lot of attention as of late.
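
As a rough illustration of an exact check layered on top of a coarse
hash, the sketch below filters candidate points by great-circle
distance.  It assumes a spherical Earth with a mean radius of 6,371 km
and uses hypothetical helper names (haversine_m, within_radius); it is
not how the 1.9 server code is written.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean Earth radius, in meters


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points (degrees)."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(center, candidates, radius_m):
    """Keep only the candidates whose exact distance to center is <= radius_m.

    `candidates` stands in for whatever the coarse hash boxes returned;
    points are (lon, lat) tuples.
    """
    lon0, lat0 = center
    return [p for p in candidates
            if haversine_m(lon0, lat0, p[0], p[1]) <= radius_m]


# Hypothetical candidates from a coarse box around (0, 0).
candidates = [(0.0, 0.0005), (0.0, 0.002)]
print(within_radius((0.0, 0.0), candidates, 100.0))
```

With a post-filter like this, the hash only has to be precise enough to
pick the right boxes; whether a point is really inside the radius is
decided by the exact distance.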

disclaimer: as per my email address, I work at 10gen on MongoDB

On Wed, 2011-04-13 at 08:52 -0500, Ian Dees wrote: 
> 
> 
> On Tue, Apr 12, 2011 at 3:56 PM, Steve Coast <steve at asklater.com>
> wrote:
>         Interesting.
>         
>         How efficient is the (big)int indexing and/or masking?
>         
>
> I haven't had a chance to look at the integer indexing/masking. If I
> remember it from discussions on dev a long while ago I think it's very
> close to geohashes.
>  
>         
>         Was this all on a single machine? 
> 
> 
> Yes.
>  
>         On 4/12/2011 1:52 PM, Ian Dees wrote: 
>         > Yep.
>         > 
>         > On Tue, Apr 12, 2011 at 3:51 PM, Steve Coast
>         > <steve at asklater.com> wrote: 
>         >         and using the builtin spatial index? 
>         >         
>         >         
>         >         
>         >         On 4/12/2011 1:50 PM, Ian Dees wrote: 
>         >         > Yes, one document per node/way/relation.
>         >         > 
>         >         > On Tue, Apr 12, 2011 at 3:47 PM, Steve Coast
>         >         > <steve at asklater.com> wrote: 
>         >         >         how was the data put in the db though? 1
>         >         >         document per node? 
>         >         >         
>         >         >         
>         >         >         On 4/12/2011 1:39 PM, Nolan Darilek
>         >         >         wrote: 
>         >         >         > Oops, meant for this to go to the whole
>         >         >         > list.
>         >         >         > 
>         >         >         > 
>         >         >         > 
>         >         >         > -------- Original Message -------- 
>         >         >         >            Subject: 
>         >         >         > Re: [OSM-dev] OSM
>         >         >         > and MongoDB
>         >         >         >               Date: 
>         >         >         > Tue, 12 Apr 2011
>         >         >         > 15:26:41 -0500
>         >         >         >               From: 
>         >         >         > Nolan Darilek
>         >         >         > <nolan at thewordnerd.info>
>         >         >         >                 To: 
>         >         >         > Ian Dees
>         >         >         > <ian.dees at gmail.com>
>         >         >         > 
>         >         >         > 
>         >         >         > I had/am having a somewhat bad
>         >         >         > experience storing OSM data in MongoDB.
>         >         >         > 
>         >         >         > Initially I stored all map data in
>         >         >         > MongoDB, but queries took ages. The same
>         >         >         > queries that happen in 100-200 ms now
>         >         >         > often took nearly a second.
>         >         >         > Additionally, some took upwards of 5,
>         >         >         > and I even found spots on my map
>         >         >         > sparsely populated with points, but
>         >         >         > which reliably performed the queries I
>         >         >         > need in 30+ seconds.
>         >         >         > 
>         >         >         > I filed a thorough bug in their tracker,
>         >         >         > including a dataset and queries that
>         >         >         > reliably duplicated the issue. It was
>         >         >         > marked wontfix, I abandoned MongoDB, and
>         >         >         > it was apparently re-opened and fixed
>         >         >         > several months later. So perhaps it's a
>         >         >         > non-issue now.
>         >         >         > 
>         >         >         > I'm still using MongoDB for part of my
>         >         >         > current project, user POI storage. It
>         >         >         > does indeed use geohashes, and I'm
>         >         >         > experiencing strange accuracy issues. My
>         >         >         > platform is pedestrian navigation with
>         >         >         > many small distance queries. Points in
>         >         >         > the non-MongoDB dataset are reliably
>         >         >         > detected in a radius roughly 100 meters
>         >         >         > around the traveler. Points in MongoDB
>         >         >         > queried with the same bounding boxes
>         >         >         > don't appear until they're within 30-40
>         >         >         > meters. I recently updated from an older
>         >         >         > version to a new build of 1.8. The older
>         >         >         > version widely varied the detection
>         >         >         > range. Some points were detected 100 or
>         >         >         > so meters out, while others weren't
>         >         >         > picked up until 30 or so. It was always
>         >         >         > the same points, too. The point for my
>         >         >         > apartment remains reliably visible for
>         >         >         > ~100 meters or so, while the corner
>         >         >         > store and restaurant didn't appear until
>         >         >         > I was very close. 1.8 at least appears
>         >         >         > to be consistent, always detecting at 30
>         >         >         > meters or so. I can only assume that
>         >         >         > this is a geohash oddity that only
>         >         >         > appears for very small differences,
>         >         >         > something that works out to rounding
>         >         >         > error for larger values.
>         >         >         > 
>         >         >         > I like MongoDB for many things, but not
>         >         >         > for geospatial data more complicated
>         >         >         > than a series of points. I'm working on
>         >         >         > migrating user/POI storage to a
>         >         >         > geospatial store.
>         >         >         > 
>         >         >         > 
>         >         >         > On 04/12/2011 01:20 PM, Ian Dees wrote: 
>         >         >         > > Yep, and I think Mongo uses geohashes
>         >         >         > > as their index behind the scenes. One
>         >         >         > > of the problems with that, though, is
>         >         >         > > they have some arbitrary length that
>         >         >         > > they compute the geohash to and when
>         >         >         > > you have lots of points (as OSM data
>         >         >         > > does) the buckets they're searching
>         >         >         > > are very full.
>         >         >         > > 
>         >         >         > > On Tue, Apr 12, 2011 at 1:00 PM, Steve
>         >         >         > > Coast <steve at asklater.com> wrote: 
>         >         >         > >         bbox queries using the built
>         >         >         > >         in spatial indexing
>         >         >         > >         presumably? OSM has its own
>         >         >         > >         magical bitmask for that, that
>         >         >         > >         may also be as fast in mongo,
>         >         >         > >         who knows. 
>         >         >         > >         
>         >         >         > >         
>         >         >         > >         On 4/11/2011 5:58 PM, Ian Dees
>         >         >         > >         wrote: 
>         >         >         > >         > On Mon, Apr 11, 2011 at 6:36
>         >         >         > >         > PM, Sergey Galuzo
>         >         >         > >         > <sergal at microsoft.com>
>         >         >         > >         > wrote: 
>         >         >         > >         >         Hi,
>         >         >         > >         >         
>         >         >         > >         >          
>         >         >         > >         >         
>         >         >         > >         >         I am working on
>         >         >         > >         >         evaluation of
>         >         >         > >         >         MongoDB for several
>         >         >         > >         >         storage solutions at
>         >         >         > >         >         hand. Some of them
>         >         >         > >         >         resemble the current
>         >         >         > >         >         OSM editing database. I
>         >         >         > >         >         have heard that OSM
>         >         >         > >         >         dev is/was
>         >         >         > >         >         evaluating MongoDB
>         >         >         > >         >         also. I was
>         >         >         > >         >         wondering whether it
>         >         >         > >         >         is possible to share
>         >         >         > >         >         the findings?
>         >         >         > >         >         
>         >         >         > >         >          
>         >         >         > >         >         
>         >         >         > >         >         
>         >         >         > >         > 
>         >         >         > >         > 
>         >         >         > >         > In my experimentation with
>         >         >         > >         > MongoDB (seen
>         >         >         > >         > here: https://github.com/iandees/mongosm/) I found it to be very slow. Inserts were speedy, but bounding-box queries took a long time. 
>         >         >         > >         > 
>         >         >         > >         > 
>         >         >         > >         > The most recent dev version
>         >         >         > >         > of MongoDB includes
>         >         >         > >         > "multi-location documents"
>         >         >         > >         > support: 
>         >         >         > >         > http://www.mongodb.org/display/DOCS/Geospatial+Indexing#GeospatialIndexing-MultilocationDocuments 
>         >         >         > >         > 
>         >         >         > >         > 
>         >         >         > >         > This would allow a single
>         >         >         > >         > way document to be indexed
>         >         >         > >         > at multiple locations and
>         >         >         > >         > vastly speed up the map
>         >         >         > >         > query. 
>         >         >         > >         > 
>         >         >         > >         > _______________________________________________
>         >         >         > >         > dev mailing list
>         >         >         > >         > dev at openstreetmap.org
>         >         >         > >         > http://lists.openstreetmap.org/listinfo/dev 
>         >         >         > >         
>         >         >         > >         
>         >         >         > > 
>         >         >         > > 
>         >         >         > 
>         >         >         > 
>         >         >         
>         >         >         
>         >         > 
>         > 
> 




