[OSM-dev] using Osmium to filter osh files

Abhishek dalek2point3 at gmail.com
Thu May 22 15:53:37 UTC 2014


Definitely.

I'm looking to analyze the development of OpenStreetMap in the US with
a particular focus on contributor activity. I'm trying to understand
differences in the number of contributors entering different regions
over time, and also the *nature* of their contribution activity
(adding new streets, amenities, natural features vs. adding tags,
fixing existing features, etc.).

As a first pass, I've already used the changeset files (using the
middle of the BBOX as an approximate measure of location) to
understand user contribution activity. What the changeset files do not
allow me to do is understand the *nature* of the contributions. In
order to do this, I'm looking at the history files.
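
For reference, the midpoint approximation is roughly the following (a
minimal Python sketch; the function name is my own, and the argument
names mirror the min_lon/min_lat/max_lon/max_lat attributes of a
changeset; the sample coordinates are made up):

```python
def bbox_midpoint(min_lon, min_lat, max_lon, max_lat):
    """Return (lon, lat) of the centre of a changeset bounding box.

    Assumes the box does not cross the antimeridian, which would need
    special handling (e.g. for parts of Alaska).
    """
    return ((min_lon + max_lon) / 2.0, (min_lat + max_lat) / 2.0)


# Hypothetical changeset bounding box, for illustration only.
center = bbox_midpoint(-71.10, 42.35, -71.05, 42.37)
```

A large bounding box makes this a poor proxy for where the edits
actually happened, which is part of why I want to go to edit level.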

As a next step, I would like to build a flat file (CSV-like) that is
"edit-level" (rather than changeset-level) and understand what each
edit meant. In particular, I'm interested in classifying edits into
the following categories:

(a) adding new amenity (record name, location of amenity)
(b) adding new street (record name of street and approximate location,
i.e. "midpoint" of the way)
(c) adding tags to existing street (which tags? maxspeed and oneway
are interesting)
(d) deleting features
(e) other (notably adding natural features etc)
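
A rough sketch of how that classification could look, in Python rather
than Osmium's C++. It assumes each object version has already been
parsed into a plain dict with the hypothetical keys "type", "visible"
and "tags", and that prev is the previous version of the same object
(None for version 1). The category labels are made up for illustration:

```python
def classify_edit(prev, curr):
    """Classify one object version into categories (a)-(e).

    Returns (category, added_keys), where added_keys lists the tag keys
    that are new in this version (only filled in for category c).
    """
    if not curr["visible"]:
        return ("d_delete", [])
    tags = curr["tags"]
    if prev is None:
        # Version 1 of an object: something new was created.
        if "amenity" in tags:
            return ("a_new_amenity", [])
        if curr["type"] == "way" and "highway" in tags:
            return ("b_new_street", [])
        return ("e_other", [])
    # Later version: check whether tags were added to an existing street.
    if curr["type"] == "way" and "highway" in prev["tags"] and "highway" in tags:
        added = sorted(set(tags) - set(prev["tags"]))
        if added:
            return ("c_street_tags_added", added)
    return ("e_other", [])


# Hypothetical three-version history of one street.
v1 = {"type": "way", "visible": True,
      "tags": {"highway": "residential", "name": "Elm St"}}
v2 = {"type": "way", "visible": True,
      "tags": {"highway": "residential", "name": "Elm St", "maxspeed": "25 mph"}}
v3 = {"type": "way", "visible": False, "tags": {}}

history = [classify_edit(None, v1), classify_edit(v1, v2), classify_edit(v2, v3)]
```

The added_keys list is what would let me pick out maxspeed and oneway
later on. Geometry-only changes fall into "other" in this sketch.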

So, specifically, one idea might be to have a dataset that records
every node added, its location, metadata (user, timestamp, etc.) and
its tags, and that reduces every way to a point (like osmconvert's
"--all-to-nodes") and records the same fields for it. I'm also open to
other suggestions.

The algorithm might work as follows:

1. go through every node in the osh file and write it to a csv only if
it does not belong to a way (this will capture point features)
2. go through every way, reduce the way to a single point, write the
point feature and related metadata to a csv file
3. ignore relations.
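
To make the steps concrete, here is a small in-memory sketch in Python
(not Osmium). Everything about the record layout is an assumption: each
node or way version is a dict with hypothetical keys, timestamps are
ISO 8601 UTC strings (so plain string comparison orders them), "does
not belong to a way" is read as "never referenced by any way version",
and the "midpoint" of a way is taken as the mean of its node
coordinates at the time of that way version:

```python
import csv
import io
from bisect import bisect_right

HEADER = ["type", "id", "version", "timestamp", "user", "lon", "lat", "tags"]


def tags_to_text(tags):
    return ";".join(f"{k}={v}" for k, v in sorted(tags.items()))


def history_to_rows(nodes, ways):
    # Index node versions by id, oldest first, for time-based lookups.
    by_id = {}
    for n in nodes:
        by_id.setdefault(n["id"], []).append(n)
    for versions in by_id.values():
        versions.sort(key=lambda v: v["timestamp"])

    def location_at(node_id, timestamp):
        """Location of the node version that was current at `timestamp`."""
        versions = by_id.get(node_id, [])
        i = bisect_right([v["timestamp"] for v in versions], timestamp)
        if i == 0 or not versions[i - 1]["visible"]:
            return None
        return (versions[i - 1]["lon"], versions[i - 1]["lat"])

    in_some_way = {ref for w in ways for ref in w["refs"]}
    rows = []
    # Step 1: node versions that never belong to a way (point features).
    for n in nodes:
        if n["id"] not in in_some_way:
            rows.append(("node", n["id"], n["version"], n["timestamp"], n["user"],
                         n.get("lon"), n.get("lat"), tags_to_text(n["tags"])))
    # Step 2: reduce each way version to a single point.
    for w in ways:
        pts = [p for p in (location_at(r, w["timestamp"]) for r in w["refs"])
               if p is not None]
        lon = sum(p[0] for p in pts) / len(pts) if pts else None
        lat = sum(p[1] for p in pts) / len(pts) if pts else None
        rows.append(("way", w["id"], w["version"], w["timestamp"], w["user"],
                     lon, lat, tags_to_text(w["tags"])))
    # Step 3: relations are never read, so they are ignored.
    return rows


def write_csv(rows, fh):
    writer = csv.writer(fh)
    writer.writerow(HEADER)
    writer.writerows(rows)


# Made-up sample data: one cafe node and one two-node street.
T1, T2, T3 = "2014-01-01T00:00:00Z", "2014-02-01T00:00:00Z", "2014-03-01T00:00:00Z"
nodes = [
    {"id": 1, "version": 1, "timestamp": T1, "user": "a", "visible": True,
     "lon": -71.0, "lat": 42.0, "tags": {"amenity": "cafe"}},
    {"id": 2, "version": 1, "timestamp": T1, "user": "a", "visible": True,
     "lon": -71.0, "lat": 42.0, "tags": {}},
    {"id": 3, "version": 1, "timestamp": T1, "user": "a", "visible": True,
     "lon": -71.2, "lat": 42.2, "tags": {}},
    {"id": 2, "version": 2, "timestamp": T3, "user": "a", "visible": True,
     "lon": -71.1, "lat": 42.1, "tags": {}},
]
ways = [{"id": 10, "version": 1, "timestamp": T2, "user": "b", "visible": True,
         "refs": [2, 3], "tags": {"highway": "residential"}}]

rows = history_to_rows(nodes, ways)
buf = io.StringIO()
write_csv(rows, buf)
```

This holds everything in memory, so it only illustrates the logic; a
real history extract would need the indexing that Osmium provides. It
also does not emit a new row when a way's nodes move without the way
itself getting a new version.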

So the result would be something like osmconvert with the options
"--all-to-nodes" and "--drop-relations", but applied to history data.

Any ideas on how I should go about doing this? In terms of the
documentation, I've been using the "new" Osmium and looking at
osmcode.org, but my sense is that this documentation is not yet
complete (for example, I cannot find the tag filter classes that you
mention). Are these documented on the Wiki?

It's fun to be using a low-level tool like Osmium, but any help would
make this process a lot easier for me. Thanks!

Abhishek

On Thu, May 22, 2014 at 8:40 AM, Jochen Topf <jochen at remote.org> wrote:
> On Mi, Mai 21, 2014 at 11:20:18 -0400, Abhishek wrote:
>> I would like to use osmium to filter .osh files. Specifically I wanted
>> to recreate the features of osmfilter, that allows me extract certain
>> features like "amenity=*" or "highway=*" along with their relevant
>> histories from a .osh.pbf file.
>>
>> I've managed to successfully set up osmium and osmium-tool, but I
>> couldn't figure out a way to use these tools to filter features from
>> the history data. I'm very new to writing code in C++, so I was hoping
>> this feature was implemented. Any ideas on where I should be looking
>> for help?
>
> Working with the history files is not easy and it very much depends on what you
> really want to do. In the general case, it is not enough to find, for instance,
> all ways tagged with highway=*, you have to find the nodes that were used by
> those ways at the time when those ways were current. If you are only interested
> in the tags and their history and not the location of those ways, it becomes
> much easier. So first, you have to understand the details of the OSM data model
> and how it plays out in the history files.
>
> Osmium has many building blocks that you will need: it can read the history
> files, there are tag filter classes (osmium::tags::KeyFilter and
> osmium::tags::KeyValueFilter) and ways to store and index data. But there
> is no easy recipe.
>
> Maybe you can tell us a bit more about what you want to accomplish in the end
> and we can have more tips for you then.
>
> Jochen
> --
> Jochen Topf  jochen at remote.org  http://www.jochentopf.com/  +49-721-388298
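
For the simpler tags-only case described above, an osmfilter-style
history filter might be sketched as follows. This is plain Python on
hypothetical dict records, not the Osmium tag filter classes, and the
function name is my own. It keeps every version of any object that
carried one of the wanted keys in at least one version:

```python
def filter_history_by_key(objects, keys):
    """Keep full histories of objects that ever had one of `keys`.

    `objects` is a list of version dicts with hypothetical keys
    "type", "id", "version" and "tags".
    """
    keys = set(keys)
    # First pass: find which objects matched in any version.
    wanted = {(o["type"], o["id"]) for o in objects
              if not keys.isdisjoint(o["tags"])}
    # Second pass: keep all versions of those objects, in input order.
    return [o for o in objects if (o["type"], o["id"]) in wanted]


# Made-up sample history.
objs = [
    {"type": "node", "id": 1, "version": 1, "tags": {}},
    {"type": "node", "id": 1, "version": 2, "tags": {"amenity": "cafe"}},
    {"type": "node", "id": 2, "version": 1, "tags": {"natural": "tree"}},
    {"type": "way", "id": 1, "version": 1, "tags": {"highway": "residential"}},
]
kept = filter_history_by_key(objs, {"amenity", "highway"})
```

Note that this does not pull in the untagged nodes that a matching way
refers to, which is exactly the hard part when locations are needed.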


