[osmosis-dev] Osmosis + Hadoop (was: Re: Changes to Osmosis Pgsql Schema)
Lars Francke
lars.francke at gmail.com
Sat Aug 7 16:22:28 BST 2010
Hi,
> 2: I am beginning a project to parallelize OSM data processing
> with Hadoop, and the postgreSQL copy-format output is perfect
> for loading into HDFS. (If this goes well, I'd want to discuss
> ideas for adapting Osmosis to talk to Hadoop, eventually.)
That is very interesting. I'm doing the same with great success (I've
recently written about it[1]) and I'm currently putting the final
touches on an HBase patch[2] to allow bulk loading of OSM data into
HBase.
Just as a heads-up: if you're using Hive, the PostgreSQL copy-format is
unfortunately not perfect. Boolean columns are written as 't' and 'f',
which Hive does not recognize, so those columns end up as NULL.
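One possible workaround is to rewrite the boolean fields before loading
the files into HDFS. The following is only a minimal sketch, not part of
Osmosis: the function name and the column indexes are made up for
illustration, and it assumes tab-delimited COPY text output and a Hive
table that accepts the literals 'true' and 'false' for BOOLEAN columns.

```python
def fix_booleans(line, bool_columns):
    """Rewrite 't'/'f' as 'true'/'false' in selected columns of one line.

    line         -- one line of PostgreSQL COPY text-format output
    bool_columns -- 0-based indexes of the boolean columns (an assumption
                    about your schema; adjust to the table being dumped)
    """
    mapping = {"t": "true", "f": "false"}
    fields = line.rstrip("\n").split("\t")
    for i in bool_columns:
        # Anything else (for example the NULL marker \N) is left as it is.
        fields[i] = mapping.get(fields[i], fields[i])
    return "\t".join(fields)
```

Only the listed columns are touched, so a text column that happens to
contain a bare 't' or 'f' is not rewritten by accident.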
Would you mind sharing a few of your ideas and use cases (with regard
to OSM(osis) and Hadoop)? What exactly do you mean by "Hadoop" and how
do you think Osmosis could help here?
Cheers,
Lars
[1] http://blog.lars-francke.de/2010/07/22/processing-openstreetmap-data-with-hive/
[2] https://issues.apache.org/jira/browse/HBASE-1861