[osmosis-dev] Osmosis + Hadoop (was: Re: Changes to Osmosis Pgsql Schema)
Lars Francke
lars.francke at gmail.com
Sat Aug 7 22:10:50 BST 2010
> Firstly, I should point out that I only learned of Map/Reduce and Hadoop
> within the past two weeks, and I don't know Java (yet), so I've only
> gotten as far as some thought experiments.
Well, if there is ever anything I can help you with, let me know.
> The "easy" use case would be as a fast replacement/preprocessor for TagStat,
> i.e. frequency counts of tags. An enhancement would be the ability to
> report those by geographic area, or better yet, a user's native language.
That's exactly what I'm doing for OSMdoc.com. The geographic reports
are hard to do, though. I've also got a language-detection tool on my
todo list.
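To make the tag-count idea concrete, here is a rough sketch of the
map/reduce shape in plain Python. It is not Hadoop code and not what
OSMdoc or TagStat actually run; the function names and the sample data
are made up for illustration, and it assumes each input chunk is a
small, well-formed OSM XML fragment.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def map_tags(xml_chunk):
    """Map step: emit ((key, value), 1) for every <tag> element in one chunk."""
    root = ET.fromstring(xml_chunk)
    for tag in root.iter("tag"):
        yield (tag.get("k"), tag.get("v")), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per (key, value) pair."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals


def tag_frequencies(chunks):
    """Run the map step over every chunk and reduce all emitted pairs."""
    emitted = (pair for chunk in chunks for pair in map_tags(chunk))
    return reduce_counts(emitted)


# Hypothetical sample input: two tiny OSM XML fragments.
SAMPLE_CHUNKS = [
    '<osm><way id="1"><tag k="highway" v="residential"/>'
    '<tag k="name" v="Main Street"/></way></osm>',
    '<osm><way id="2"><tag k="highway" v="residential"/></way>'
    '<node id="3" lat="0" lon="0"><tag k="amenity" v="pub"/></node></osm>',
]

if __name__ == "__main__":
    for (k, v), n in tag_frequencies(SAMPLE_CHUNKS).most_common():
        print(f"{k}={v}: {n}")
```

In a real Hadoop job the map and reduce steps would run on separate
nodes and the chunks would be input splits; reporting by geographic
area would additionally need node coordinates joined onto ways, which
is the hard part.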
> My original thought was simply to rapidly create the feature geometries for
> import to postGIS, for research and for quick setup of test/development
> environments. I found a Master's thesis and some code on-line where the
> author used the Java Topology Suite in a study of parallel GIS processing [1];
> unfortunately he seems not to have learned how to do joins, either natively
> or with Hive or Pig, and probably had never encountered HBase before
> submitting his work.
I have read that thesis/article before. While it is interesting, I came
to the same conclusion: it only solves a very small part of the
problem.
> If you look at the rate of growth of the OSM data [2], and look at the
> work we have to do in order to make postGIS handle what we have now [3][4],
> I think the handwriting on the wall is telling us that parallel processing
> is the only way we'll be able to scale, especially as we gain exposure
> through efforts like Bing's and MapQuest/AOL Local's.
I agree.
> So... my pie in the sky is to see Mapnik work with HBase and be
> able to scale out the rendering as much as we need, and vastly
> reduce/eliminate the need for postGIS.
Same here. I'll have OSM with HBase set up before the end of the month
and I've also asked around for help with a Mapnik backend.
What I don't quite get is where you think Osmosis can help here: XML
to HDFS, etc.?
Cheers,
Lars