[OSM-dev] OSM on Hadoop

William Temperley willtemperley at gmail.com
Tue Jun 21 13:11:46 UTC 2016


Dear all,

I would like to draw your attention to a project I've been working on to
process OSM planet files on Hadoop:

https://github.com/willtemperley/osm-hadoop

It's geared toward a quite specific task: extracting the linear features
with a given tag from a planet.pbf file and rasterizing them. We needed to
do this for all highways and railways as an input to an accessibility
model.
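
For illustration only, the extract-and-rasterize step can be sketched in a
few lines of Python. This is not the code from the repository: the names
(bresenham, to_cell, rasterize_ways), the plain lon/lat grid with square
cells, and the use of Bresenham's line algorithm are all assumptions made
for the sketch.

```python
import math
from typing import Dict, Iterable, List, Sequence, Set, Tuple

Cell = Tuple[int, int]        # (column, row) in the raster
LonLat = Tuple[float, float]  # (longitude, latitude) in degrees


def bresenham(x0: int, y0: int, x1: int, y1: int) -> List[Cell]:
    """Return the grid cells on the line from (x0, y0) to (x1, y1)."""
    cells = []
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        cells.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:  # step in x
            err += dy
            x0 += sx
        if e2 <= dx:  # step in y
            err += dx
            y0 += sy
    return cells


def to_cell(lon: float, lat: float, cell_deg: float) -> Cell:
    """Map a lon/lat pair to a cell; row 0 is at the north edge (assumed layout)."""
    return (math.floor((lon + 180.0) / cell_deg),
            math.floor((90.0 - lat) / cell_deg))


def rasterize_ways(ways: Iterable[Tuple[Dict[str, str], Sequence[LonLat]]],
                   key: str, cell_deg: float) -> Set[Cell]:
    """Burn every way carrying the tag `key` into a set of raster cells."""
    burned: Set[Cell] = set()
    for tags, coords in ways:
        if key not in tags:  # keep only ways with the requested tag
            continue
        # Rasterize each segment between consecutive nodes of the way.
        for (lon0, lat0), (lon1, lat1) in zip(coords, coords[1:]):
            c0 = to_cell(lon0, lat0, cell_deg)
            c1 = to_cell(lon1, lat1, cell_deg)
            burned.update(bresenham(c0[0], c0[1], c1[0], c1[1]))
    return burned
```

Here each way is assumed to arrive as a (tags, coordinates) pair with its
node coordinates already resolved. Calling rasterize_ways(ways, "highway",
1.0) returns the set of one-degree cells touched by any way tagged
highway.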

We use the Osmosis pbf2 library for the deserialization, which worked out
of the box, except that seeking between file blocks is impossible; see the
readme for the workaround.
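
To make the seeking problem concrete: the PBF format documentation
describes each file block as a 4-byte big-endian length, a BlobHeader
message (type = field 1, indexdata = field 2, datasize = field 3) and then a
Blob of datasize bytes. The sketch below walks a file from the start and
records where each block begins. It is only an illustration under those
assumptions, not the workaround from the readme, and the function names are
hypothetical. It parses the BlobHeader by hand so that it needs no protobuf
library.

```python
import io
import struct
from typing import BinaryIO, List, Optional, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode one protobuf varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_blob_header(buf: bytes) -> Tuple[Optional[str], Optional[int]]:
    """Extract (type, datasize) from a serialized BlobHeader message."""
    pos = 0
    blob_type = None
    datasize = None
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint field
            value, pos = read_varint(buf, pos)
            if field == 3:
                datasize = value
        elif wire_type == 2:  # length-delimited field
            length, pos = read_varint(buf, pos)
            if field == 1:
                blob_type = buf[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError("unexpected wire type %d" % wire_type)
    return blob_type, datasize


def index_blocks(stream: BinaryIO) -> List[Tuple[int, Optional[str], int]]:
    """Return (offset, type, total_length) for every file block in the stream."""
    blocks = []
    while True:
        offset = stream.tell()
        prefix = stream.read(4)
        if len(prefix) < 4:  # end of file
            break
        (header_len,) = struct.unpack(">I", prefix)
        blob_type, datasize = parse_blob_header(stream.read(header_len))
        if datasize is None:
            raise ValueError("BlobHeader at offset %d has no datasize" % offset)
        blocks.append((offset, blob_type, 4 + header_len + datasize))
        stream.seek(datasize, io.SEEK_CUR)  # skip the Blob without decoding it
    return blocks
```

Because a block's position is only known once every preceding length has
been read, a reader dropped at an arbitrary byte offset, as happens with a
Hadoop input split, has no marker to resynchronize on. An index like the one
returned by index_blocks is one possible way to give each task a valid
starting offset.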

I'd be interested to hear about anyone else's experience processing OSM
data with big data tech. I'd also like to work on a more generic
framework, perhaps with support for Apache Hive or other analytic
frameworks.

Any ideas on how to take things forward, which formats to support, etc.
would be welcome.

Best regards,

Will Temperley