[osmosis-dev] Reading OSM History dumps
Peter Körner
osm-lists at mazdermind.de
Sun Aug 22 18:44:54 BST 2010
On 22.08.2010 08:26, Brett Henderson wrote:
> Hi Peter,
>
> This all sounds very interesting and will no doubt have many uses that I
> can't anticipate.
>
> I can't give you much assistance but will try to answer any specific
> questions you have. My wife is going to give birth sometime within the
> next month which means my priorities are about to change drastically ;-)
Oh, congratulations on this!
> You seem to have thought about most of the complexities of the problem
> already so you know what you're dealing with.
I think all of it is solvable with just enough logic :) I did the demo
implementation in PHP to see whether this is possible, and I think I
know the OSM data structure well enough to know what it means.
But I don't know Osmosis and Java well enough to know how to implement
the simple multi-level arrays from PHP in a way that will work with
those really big files.
What I need is a store that can
- store all versions of a Node*
- access a specific version of a node
- access all versions of a node
- access the oldest version of a node that has been created before Date X
*not only the Node's location but also the Meta-Info (Timestamp, User,
UserID) because you would want to have this as the Meta-Info on the
generated intermediate Way-Versions.
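The four requirements above can be sketched as a small in-memory class.
This is only an illustration of the access pattern, not something that
would scale to a full history dump. The names NodeHistoryStore,
NodeVersion, get, getAll and getAt are hypothetical and not part of
Osmosis. The sketch reads the "before Date X" requirement as "the
version that was in effect at Date X", and it assumes that timestamps
never decrease as the version number grows; both are assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory sketch: all versions of every node, indexed by id and version. */
class NodeHistoryStore {

    /** One node version, including the meta info wanted for generated way versions. */
    static final class NodeVersion {
        final long id;
        final int version;
        final long timestamp; // milliseconds since the epoch
        final String user;
        final int userId;
        final double lat;
        final double lon;

        NodeVersion(long id, int version, long timestamp,
                    String user, int userId, double lat, double lon) {
            this.id = id;
            this.version = version;
            this.timestamp = timestamp;
            this.user = user;
            this.userId = userId;
            this.lat = lat;
            this.lon = lon;
        }
    }

    // node id -> (version number -> node version); the inner map stays sorted by version
    private final Map<Long, TreeMap<Integer, NodeVersion>> byId = new HashMap<>();

    /** Stores one version of a node. */
    void add(NodeVersion node) {
        byId.computeIfAbsent(node.id, k -> new TreeMap<>()).put(node.version, node);
    }

    /** Returns a specific version of a node, or null if it is not stored. */
    NodeVersion get(long id, int version) {
        TreeMap<Integer, NodeVersion> versions = byId.get(id);
        return versions == null ? null : versions.get(version);
    }

    /** Returns all stored versions of a node in ascending version order. */
    List<NodeVersion> getAll(long id) {
        List<NodeVersion> result = new ArrayList<>();
        TreeMap<Integer, NodeVersion> versions = byId.get(id);
        if (versions != null) {
            result.addAll(versions.values());
        }
        return result;
    }

    /**
     * Returns the latest version whose timestamp is not after the given date,
     * or null if the node did not exist yet.
     * Assumes timestamps do not decrease with the version number.
     */
    NodeVersion getAt(long id, long timestamp) {
        TreeMap<Integer, NodeVersion> versions = byId.get(id);
        if (versions == null) {
            return null;
        }
        NodeVersion result = null;
        for (NodeVersion node : versions.values()) {
            if (node.timestamp > timestamp) {
                break;
            }
            result = node;
        }
        return result;
    }
}
```

A way-version builder would call getAt(nodeId, wayTimestamp) for each
referenced node and copy the meta info from whichever node version (or
the way itself) changed last.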
I looked into the three implementations of NodeLocationStore (especially
the InMemoryNodeLocationStore) and I was thinking how I could extend the
really simple fixed-size memory store to be able to store a complete
Node and index by Id and Version at the same time.
Because there is no fixed number of versions per Node, I can't go with a
simple offset=NodeID*NodeSize calculation. Instead I have to write the
nodes one after another, just as they come in, and save the offsets in a
List. But I'm not sure how to build a List in Java that allows random
access to the offsets of all versions of a node as well as to the offset
of a specific version.
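One possible shape for such an offset list, as a rough sketch: three
parallel arrays sorted by (id, version) and searched with a binary
search. It assumes the nodes arrive sorted by id and then by version,
which is an assumption about the input, and that the number of entries
fits into a single Java array. The class VersionOffsetIndex and its
methods are hypothetical names, not existing Osmosis classes.

```java
import java.util.Arrays;

/** Hypothetical index from (node id, version) to a byte offset in an append-only store. */
class VersionOffsetIndex {
    private long[] ids = new long[16];
    private int[] versions = new int[16];
    private long[] offsets = new long[16];
    private int size = 0;

    /** Registers one entry; entries must arrive sorted by id, then by version. */
    void append(long id, int version, long offset) {
        if (size > 0 && compare(ids[size - 1], versions[size - 1], id, version) >= 0) {
            throw new IllegalArgumentException("entries must be sorted by id, then version");
        }
        if (size == ids.length) {
            // grow all three arrays together so they stay parallel
            int newLength = size * 2;
            ids = Arrays.copyOf(ids, newLength);
            versions = Arrays.copyOf(versions, newLength);
            offsets = Arrays.copyOf(offsets, newLength);
        }
        ids[size] = id;
        versions[size] = version;
        offsets[size] = offset;
        size++;
    }

    /** Returns the offset of one specific version, or -1 if it is not indexed. */
    long offsetOf(long id, int version) {
        int i = lowerBound(id, version);
        boolean found = i < size && ids[i] == id && versions[i] == version;
        return found ? offsets[i] : -1L;
    }

    /** Returns the offsets of all versions of a node in ascending version order. */
    long[] offsetsOf(long id) {
        int from = lowerBound(id, Integer.MIN_VALUE);
        int to = from;
        while (to < size && ids[to] == id) {
            to++;
        }
        return Arrays.copyOfRange(offsets, from, to);
    }

    /** Index of the first entry that is not smaller than (id, version). */
    private int lowerBound(long id, int version) {
        int lo = 0;
        int hi = size;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (compare(ids[mid], versions[mid], id, version) < 0) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    private static int compare(long idA, int versionA, long idB, int versionB) {
        int c = Long.compare(idA, idB);
        return c != 0 ? c : Integer.compare(versionA, versionB);
    }
}
```

Primitive arrays keep the overhead at 20 bytes per node version, which
is a lot less than a List of boxed objects would need. Unsorted input
would need a different structure, for example a sort pass before the
index is built.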
I also found the IndexedObjectStore class in
org.openstreetmap.osmosis.core.store and I thought about extending it to
track three Indexes (NodeID, Version and Timestamp). Do you know if this
would be workable?
> You mentioned the problem of obtaining test data. I'd suggest using:
> http://planet.openstreetmap.org/history/
They are in .osc format, so I also need a task to convert from .osc to
history-.osm and back.
> That is a full history from day one of the project up until now. It is
> already in the OSM change format that Osmosis understands. Cutting
> bounding boxes out of full history data is a difficult (but not
> impossible)
With regard to the Node-Moved-In/-Out problem, yes. At the moment I'm
working with self-contained history files that contain all referenced
items from version 1 on. When I start to convert .osc files into
history-.osm files, I will have to deal with objects with incomplete
histories (when a node has been moved I only know its new position).
That will require feeding in a second data source, such as an already
existing database.
> problem that you may have to solve in order to move
> forward. In order to build way linestrings for all way versions and for
> all node versions impacting the way you will have to solve a similar
> problem to understanding how to cut bbox data so you may be able to kill
> a couple of birds with one stone.
I'm not really sure if this will work as all I'm focusing on now is to
get a complete dump analyzed, but we may get closer to this goal.
> One thing to note is that I'm currently changing the simple schema a bit
> to improve performance.
Yes, I followed that, and I like the step towards hstore, as I have
already used it a lot with osm2pgsql.
Peter