[Geocoding] Local Mirror of OSM Data...

Peter Childs pchilds at bcs.org
Fri Oct 23 11:07:06 BST 2009


2009/10/23 Emilie Laffray <emilie.laffray at gmail.com>

>
>
> 2009/10/23 Peter Childs <pchilds at bcs.org>
>
> I'm looking to set up a local mirror of the OSM data, so I can index it and
>> work out some wonderful new way of searching it, etc.
>> Anyway, what's the best way to set this up?
>>
>> I was looking at taking the planet.osm possibly with diffs later and
>> throwing it at a SAX parser and then into a database.
>>
>> I did consider using Osmosis, but it's too slow, and I'm thinking about
>> soundexing and metaphoning the data as it's imported.
>>
>> I'm also looking at being able to build a tree (parent/child) structure
>> for areas, but these are only ideas currently.
>>
>> Currently I'm importing planet.osm into a Postgres database using Osmosis
>> to see how big it is, but it's been going all night and looks like it's only
>> done about 5%, whereas decompressing the planet takes about 2 hours, so I
>> was expecting it to be done in, say, 6?
>>
>> Any ideas/help would be most useful.
>>
>>
> Currently, there are only two competing schemas for an OSM database: osmosis
> and osm2pgsql.
> A full import takes time, and you will need a machine capable of very high
> throughput in terms of IO. I don't think there is an easy way to import data
> directly into a mode that will just work. In addition, Osmosis has a SAX
> parser option which works very nicely. But you will still be limited by your
> hardware IO performance.
> Personally, I believe that soundexing data is not very interesting, as it
> is very limited (read only English language). Using double metaphone is a
> better idea, but initially I suspect you might want to know the scope of the
> search you want to do and then expand on it afterwards.
> Working on a full planet isn't going to be the easiest thing to do since it
> is so huge. You may want to restrict yourself to a single country like the
> UK. In addition, if you want to perform a meaningful search, you will
> probably need your own database schema. The work that Brian Quinion is doing
> is absolutely brilliant from that point of view.
>
>
I don't disagree.

My problem does not seem to be IO but processor: it takes about 2 hours to
decompress the planet, but more like 24 for Osmosis to read, parse and load
the data into Postgres, and most of that time is spent in Osmosis rather
than in the database.

I'm looking at using my own schema and a SAX parser.
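
As a rough illustration of the SAX approach, here is a minimal Python sketch
that streams OSM XML and collects only the nodes carrying a name tag. The
handler class, the parse_named_nodes function and the tiny sample document
are made up for illustration; a real import would write rows into a custom
schema rather than a list, and would also need to handle ways and relations.

```python
import io
import xml.sax


class NamedNodeHandler(xml.sax.ContentHandler):
    """Collect (id, lat, lon, name) for every node that has a name tag."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._node = None   # (id, lat, lon) of the node currently open
        self._name = None   # value of its name tag, if one was seen

    def startElement(self, tag, attrs):
        if tag == "node":
            self._node = (int(attrs["id"]),
                          float(attrs["lat"]),
                          float(attrs["lon"]))
            self._name = None
        elif tag == "tag" and self._node is not None:
            if attrs.get("k") == "name":
                self._name = attrs.get("v")

    def endElement(self, tag):
        if tag == "node":
            if self._name is not None:
                self.rows.append(self._node + (self._name,))
            self._node = None


def parse_named_nodes(stream):
    """Parse OSM XML from a binary file-like object without loading it all."""
    handler = NamedNodeHandler()
    xml.sax.parse(stream, handler)
    return handler.rows


# Hypothetical two-node sample standing in for planet.osm.
SAMPLE = b"""<?xml version='1.0' encoding='UTF-8'?>
<osm version="0.6">
  <node id="1" lat="51.5" lon="-0.1"><tag k="name" v="High Street"/></node>
  <node id="2" lat="51.6" lon="-0.2"/>
</osm>"""

rows = parse_named_nodes(io.BytesIO(SAMPLE))
print(rows)
```

Because the parser only needs a file-like object, the same function could
presumably be fed the stream returned by bz2.open() on the compressed planet
file, so that decompression and parsing run in one pass.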

Currently our UK Streets is using a hybrid Soundex, but I am thinking that
Double Metaphone may be better. Unfortunately the UK Streets we have are
old and a bit out of date (and have copyright issues), hence why I want to
move over to OSM.
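
For reference, this is roughly what classic (American) Soundex looks like in
Python. It is a sketch of the textbook algorithm only, not the hybrid variant
we currently use, and not Double Metaphone; the soundex function name and the
CODES table are just illustrative. Keys like these could be computed for each
name at import time and stored in an indexed column.

```python
# Map each consonant to its Soundex digit; vowels, H, W and Y have no code.
CODES = {}
for digit, letters in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                       "4": "L", "5": "MN", "6": "R"}.items():
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return the 4-character classic Soundex key for an ASCII name."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue  # H and W do not separate letters sharing a code
        code = CODES.get(c, "")
        if code and code != prev:
            result += code
            if len(result) == 4:
                break
        prev = code  # a vowel resets prev, so a repeated code counts again
    return result.ljust(4, "0")


for word in ("Smith", "Smyth", "Robert", "Rupert"):
    print(word, soundex(word))
```

Note that non-ASCII letters are simply dropped here, which is one concrete
form of the English-only limitation mentioned above.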

I'm just trying to fix the problem generally rather than concentrating on
exactly what I need.

Peter.