[OSM-dev] using Osmium to filter osh files

Jochen Topf jochen at remote.org
Fri May 23 07:25:19 UTC 2014


Hi!

Wow. That's a lot. I suggest you start with something simpler, like the
amenity tags on nodes, and then work your way up. For everything that
involves nodes and ways together, you have to read the input file twice
and/or store data in between, and that makes it more complicated. This is
especially complex for history files.
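
To make the "read twice and store data in between" point concrete, here is a
rough sketch in plain Python (not C++, and not using Osmium at all). It
assumes every object version is a dict with made-up keys (type, id,
timestamp, tags, refs, lon, lat) and that timestamps are ISO 8601 strings,
which sort correctly as text. Nodes come before ways in the file, so the
first pass finds out which nodes the highway ways need and the second pass
stores only those node versions. Deleted node versions and nodes that move
without a new way version are ignored here, so treat it as an illustration
of the idea only.

```python
from bisect import bisect_right


def collect_needed_nodes(objects):
    """Pass 1: ids of all nodes referenced by any version of a highway way."""
    needed = set()
    for obj in objects:
        if obj["type"] == "way" and "highway" in obj["tags"]:
            needed.update(obj["refs"])
    return needed


def index_node_versions(objects, needed):
    """Pass 2: for each needed node, a time-sorted list of (timestamp, lon, lat)."""
    index = {}
    for obj in objects:
        if obj["type"] == "node" and obj["id"] in needed:
            entry = (obj["timestamp"], obj["lon"], obj["lat"])
            index.setdefault(obj["id"], []).append(entry)
    for versions in index.values():
        versions.sort(key=lambda v: v[0])
    return index


def location_at(index, node_id, timestamp):
    """Location of the node version that was current at the given time, or None."""
    versions = index.get(node_id, [])
    pos = bisect_right([v[0] for v in versions], timestamp)
    if pos == 0:
        return None  # the node did not exist yet (or was never stored)
    _, lon, lat = versions[pos - 1]
    return (lon, lat)


def way_version_location(way, index):
    """Approximate location of one way version: mean of its node locations."""
    points = [location_at(index, ref, way["timestamp"]) for ref in way["refs"]]
    points = [p for p in points if p is not None]
    if not points:
        return None
    lon = sum(p[0] for p in points) / len(points)
    lat = sum(p[1] for p in points) / len(points)
    return (lon, lat)
```

With a real file the two passes would each re-read the input; in the sketch
`objects` is simply any sequence that can be iterated twice.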

Concerning Osmium: The new Osmium and its documentation are a work in
progress; it will take a while for all of that to appear. I fear you'll have
to make do with what's there. The handler concept is quite similar, though,
to what the "old osmium" was doing, and there are some links to talks and
blog posts on http://wiki.openstreetmap.org/wiki/Osmium that you can read to
get a better idea of the general architecture. Basically, what happens is
that a file is read and, for each object read from the file, a callback is
called on each of the handlers in turn.
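
The handler idea can be shown without any Osmium code. The following is a
minimal sketch in plain Python; all names in it (OSMObject, Handler,
apply_handlers and the two example handlers) are made up for illustration
and are not the Osmium API. Each handler has one callback per object type,
and every object is passed to every handler in the order the handlers were
given.

```python
from dataclasses import dataclass, field


@dataclass
class OSMObject:
    kind: str      # "node", "way" or "relation"
    id: int
    version: int
    tags: dict = field(default_factory=dict)


class Handler:
    """Base handler: one callback per object type, all doing nothing."""

    def node(self, obj):
        pass

    def way(self, obj):
        pass

    def relation(self, obj):
        pass


class AmenityNodeCollector(Handler):
    """Keeps every node version that carries an amenity tag."""

    def __init__(self):
        self.rows = []

    def node(self, obj):
        if "amenity" in obj.tags:
            self.rows.append((obj.id, obj.version, obj.tags["amenity"]))


class ObjectCounter(Handler):
    """Counts object versions per type."""

    def __init__(self):
        self.counts = {"node": 0, "way": 0, "relation": 0}

    def node(self, obj):
        self.counts["node"] += 1

    def way(self, obj):
        self.counts["way"] += 1

    def relation(self, obj):
        self.counts["relation"] += 1


def apply_handlers(objects, *handlers):
    """Call the matching callback on each handler, in turn, for each object."""
    for obj in objects:
        for handler in handlers:
            getattr(handler, obj.kind)(obj)
```

A handler that only cares about nodes overrides only the node callback, which
is what makes the "start with amenity tags on nodes" case small.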

Jochen

On Thu, May 22, 2014 at 11:53:37 -0400, Abhishek wrote:
> Date: Thu, 22 May 2014 11:53:37 -0400
> From: Abhishek <dalek2point3 at gmail.com>
> To: Jochen Topf <jochen at remote.org>
> Cc: dev at openstreetmap.org
> Subject: Re: [OSM-dev] using Osmium to filter osh files
> 
> definitely.
> 
> I'm looking to analyze the development of OpenStreetMap in the US with
> a particular focus on contributor activity -- and trying to understand
> differences between number of contributors entering in different
> regions over time and also the *nature* of their contribution activity
> (adding new streets, amenities, natural features vs. adding tags,
> fixing existing features etc).
> 
> As a first pass, I've already used the changeset files (using the
> middle of the BBOX as an approximate measure of location) to
> understand user contribution activity. What the changeset files do not
> allow me to do is understand the *nature* of the contributions. In
> order to do this, I'm looking at the history files.
> 
> As a first pass, I would like to build a flatfile (CSV-like) that is
> "edit-level" (rather than changeset level) and understand what each
> edit meant. In particular, I'm interested in classifying edits into
> the following categories
> 
> (a) adding new amenity (record name, location of amenity)
> (b) adding new street (record name of street and approximate location,
> i.e. "midpoint" of the way)
> (c) adding tags to existing street (which tags? maxspeed and oneway
> are interesting)
> (d) deleting features
> (e) other (notably adding natural features etc)
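
A rough sketch of how categories (a) to (e) could be assigned, in plain
Python and without Osmium. It assumes each object version is a dict with
made-up keys (type, id, version, visible, tags) and that the versions
arrive sorted by type, id and version, as they are in the history files, so
that all versions of one object are adjacent. It treats the first version
seen as the creation, and the category names are invented for this example.

```python
from itertools import groupby


def classify(previous, current):
    """Classify one version given the previous version of the same object."""
    if not current["visible"]:
        return "deleted"                      # (d)
    tags = current["tags"]
    if previous is None:
        if "amenity" in tags:
            return "new_amenity"              # (a)
        if current["type"] == "way" and "highway" in tags:
            return "new_street"               # (b)
        return "other"                        # (e)
    added = set(tags) - set(previous["tags"])
    if (current["type"] == "way" and "highway" in tags
            and "highway" in previous["tags"] and added):
        return "street_tags_added"            # (c)
    return "other"                            # (e)


def classify_history(objects):
    """Yield (type, id, version, category) for every object version."""
    by_object = groupby(objects, key=lambda o: (o["type"], o["id"]))
    for _, versions in by_object:
        previous = None
        for current in versions:
            yield (current["type"], current["id"], current["version"],
                   classify(previous, current))
            previous = current
```

Which tags were added (maxspeed, oneway, ...) is available as `added` inside
classify and could be written out as an extra CSV column.
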
> 
> So, specifically, one idea might be to have a dataset that records
> every node added, its location, metadata (user, timestamp etc) and its
> tags and for every way, reduce it to a point (like osmconvert's
> "all-to-node") and do the same. I'm also open to other suggestions.
> 
> The algorithm might work as follows:
> 
> 1. go through every node in the osh file and write it to a csv only if
> it does not belong to a way (this will capture point features)

A node can be part of a way and a point feature at the same time.
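
A small sketch of the difference, in plain Python with made-up dict keys
(id, tags, refs): deciding by way membership drops tagged nodes that sit on
a way, while deciding by tags keeps them. The list of ignored keys is an
assumption for this example only; which tags count as "uninteresting" is a
judgment call.

```python
def way_member_ids(ways):
    """Ids of all nodes referenced by at least one way."""
    return {ref for way in ways for ref in way["refs"]}


def is_point_feature(node, ignored_keys=("created_by", "source")):
    """A node counts as a point feature if it has at least one tag that matters."""
    return any(key not in ignored_keys for key in node["tags"])


def select_by_membership(nodes, ways):
    """The rule from step 1: keep nodes that are not part of any way."""
    members = way_member_ids(ways)
    return [n["id"] for n in nodes if n["id"] not in members]


def select_by_tags(nodes):
    """Alternative rule: keep nodes that carry their own tags."""
    return [n["id"] for n in nodes if is_point_feature(n)]
```

For example, a node tagged highway=traffic_signals in the middle of a street
is lost by the first rule and kept by the second.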

> 2. go through every way, reduce the way to a single point, write the
> point feature and related metadata to a csv file
> 3. ignore relations.
> 
> So this would be something like osmconvert with the options
> "all-to-nodes" and "drop-relations"
> 
> Any ideas on how I should go about doing this? In terms of the
> documentation, I've been using the "new" osmium and looking at
> osmcode.org, but my sense is that this documentation is not yet
> complete (for example I cannot find the tag filter classes that you
> mention) -- are these documented on the Wiki?
> 
> It's fun to be using a low-level tool like Osmium, but any help would
> make this process a lot easier for me. Thanks!
> 
> Abhishek
> 
> On Thu, May 22, 2014 at 8:40 AM, Jochen Topf <jochen at remote.org> wrote:
> > On Wed, May 21, 2014 at 11:20:18 -0400, Abhishek wrote:
> >> I would like to use osmium to filter .osh files. Specifically I wanted
> >> to recreate the features of osmfilter, which allows me to extract certain
> >> features like "amenity=*" or "highway=*" along with their relevant
> >> histories from a .osh.pbf file.
> >>
> >> I've managed to successfully set up osmium and osmium-tool, but I
> >> couldn't figure out a way to use these tools to filter features from
> >> the history data. I'm very new to writing code in C++, so I was hoping
> >> this feature was implemented. Any ideas on where I should be looking
> >> for help?
> >
> > Working with the history files is not easy and it very much depends on what you
> > really want to do. In the general case, it is not enough to find, for instance,
> > all ways tagged with highway=*, you have to find the nodes that were used by
> > those ways at the time when those ways were current. If you are only interested
> > in the tags and their history and not the location of those ways, it becomes
> > much easier. So first, you have to understand the details of the OSM data model
> > and how it plays out in the history files.
> >
> > Osmium has many of the building blocks that you will need: it can read the
> > history files, there are tag filter classes (osmium::tags::KeyFilter and
> > osmium::tags::KeyValueFilter), and there are ways to store and index data.
> > But there is no easy recipe.
> >
> > Maybe you can tell us a bit more about what you want to accomplish in the end
> > and we can have more tips for you then.
> >
> > Jochen
> > --
> > Jochen Topf  jochen at remote.org  http://www.jochentopf.com/  +49-721-388298

-- 
Jochen Topf  jochen at remote.org  http://www.jochentopf.com/  +49-721-388298


