[osmosis-dev] [RFE] Add bounding shape to OSM dumps
Brett Henderson
brett at bretth.com
Thu Jan 21 22:15:12 GMT 2010
On Fri, Jan 22, 2010 at 5:17 AM, WanMil <wmgcnfg at web.de> wrote:
> > Hi,
> >
> > Apollinaris Schoell wrote:
> >> osmosis supports 2 options to keep ways and relations intact. but
> >> geofabrik extracts don't use it as far as I know. completeWays
> >> completeRelations
> >
> > That is correct: These options exist, and are not used, because if we'd
> > use them, the nightly build would take something like three days instead
> > of six hours. Osmosis works very well in streaming mode and these
> > options make it impossible to stream - Osmosis needs to create a
> > temporary copy of the full dataset and is rather inefficient at it.
> >
> > We typically do something like
> >
> > osmosis --rx file.osm --tee 20 --bp file=country1.poly --wx country1.osm
> > --bp file=country2.poly --wx country2.osm...
> >
> > which would cause Osmosis to make 20 temporary full copies of the data
> > and write them to disk in "completeWays" mode.
> >
> > We use "clipIncompleteEntities" which means that Osmosis removes
> > references to those nodes outside the polygon from any ways or relations.
> >
> > It would be good if Osmosis could somehow flag the clipped entities so
> > that processing software could at least know that there is something
> > wrong, or incomplete, with them.
> >
> > Adding the actual polygon used for clipping could of course be done but
> > it will not automatically enable proper filling. Assume this:
> >
> > |
> > +-|----+-----+
> > | | |
> > | | |<--- filled area
> > +-|----+-----+
> > |
> > |<-- clipping boundary
> >
> > After clipping with clipIncompleteEntities, this will lead to
> >
> > |
> > | +-----+
> > | | |
> > | | |
> > | +-----+
> > |
> > |
> >
> > Even if you know where the clipping boundary is, you cannot extend the
> > object towards that boundary properly because you are missing the nodes
> > beyond the boundary.
> >
> > Bye
> > Frederik
> >
>
> Ok, we cannot reconstruct the 100% exact original shape if the first
> node outside the boundary is missing. But a reasonable workaround is to
> connect directly to the boundary in case there are nodes missing in a
> polygon and the last point is not too far away from the boundary. I
> think in most cases this will be ok.
>
> This is not the 100% solution and it would be better if ways and
> relations are added completely or at least the first point beyond
> boundary should be clear. But as long as this is too time-consuming the
> proposed solution to add the boundary to the OSM dump is an easy to
> realize and very helpful improvement. Flagging of clipped ways is
> another good and helpful improvement.
>
Hi WanMil,
Frederik has already given you the most useful info, but I'll add some comments
to the discussion in case you or others find it useful.
ADDING GEOMETRY DETAILS TO OUTPUT FILE
It is not simple to add additional geometry information to an output OSM
file due to the way Osmosis works internally. The task that extracts
boundary data (eg. --bounding-polygon) is independent of the --write-xml
task producing the output file which means that the geometry information
cannot be written to the output file. This is an intentional design decision
in Osmosis, where each task can be implemented independently of all other
tasks.
The advantage of this approach is that tasks can be combined in various
combinations according to what the end user requires. The disadvantage is
that there is no simple way of passing additional information between tasks
other than standard node/way/relation data. Osmosis is a generic tool
supporting many use cases, which means it can't always provide the ideal
solution.
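To make the limitation concrete, here is a minimal Python sketch of the idea. It is not Osmosis code (Osmosis is written in Java), and the class names CollectingSink and BoundingBoxFilter are hypothetical. The point is only that the filter knows the geometry, while the downstream task receives nothing but entities.

```python
from dataclasses import dataclass


@dataclass
class Node:
    id: int
    lat: float
    lon: float


class CollectingSink:
    """Hypothetical stand-in for a writer task: it only ever sees entities."""

    def __init__(self):
        self.entities = []

    def process(self, entity):
        self.entities.append(entity)


class BoundingBoxFilter:
    """Hypothetical stand-in for a bounding task.

    The box is private to this task. Only entities are forwarded,
    so the sink has no way to learn which geometry was used.
    """

    def __init__(self, left, bottom, right, top, sink):
        self.box = (left, bottom, right, top)
        self.sink = sink

    def process(self, entity):
        if isinstance(entity, Node):
            left, bottom, right, top = self.box
            inside = left <= entity.lon <= right and bottom <= entity.lat <= top
            if not inside:
                return  # dropped: the sink never hears about it
        self.sink.process(entity)


sink = CollectingSink()
pipeline = BoundingBoxFilter(0, 0, 10, 10, sink)
pipeline.process(Node(1, 5.0, 5.0))    # inside the box
pipeline.process(Node(2, 50.0, 50.0))  # outside the box
```

After running this, the sink holds node 1 only, and nothing in it records the box that was applied.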
It is not impossible, but it would require a fair bit of rework to implement.
Simple bounding box information can be propagated through the pipeline, so I
believe the --bounding-box task will cause a Bound element to be added to the
output file. But more complex geometries such as those used by
--bounding-polygon are not supported. The pipeline would need to be enhanced
if we wanted to pass that kind of geometry information.
I don't have a lot of time for Osmosis so I can't tackle this one.
CLIPPED WAYS
Flagging clipped ways would be quite useful. It requires enhancing the
Entity class within Osmosis to allow additional information to be attached.
It also requires all existing tasks to be enhanced where necessary to
manipulate this additional data. Again, I'm not likely to do this myself,
but would support anybody trying to implement it themselves.
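As a rough idea of what such a flag might look like, here is a small Python sketch. It is purely illustrative: the "clipped" attribute and the clip_way helper are hypothetical and do not exist in Osmosis.

```python
from dataclasses import dataclass, field


@dataclass
class Way:
    id: int
    node_ids: list
    tags: dict = field(default_factory=dict)
    clipped: bool = False  # hypothetical flag, not part of Osmosis


def clip_way(way, kept_node_ids):
    """Drop references to nodes that were not kept and record that fact."""
    remaining = [n for n in way.node_ids if n in kept_node_ids]
    was_clipped = way.clipped or len(remaining) != len(way.node_ids)
    return Way(way.id, remaining, dict(way.tags), clipped=was_clipped)


partial = clip_way(Way(1, [1, 2, 3]), kept_node_ids={1, 2})
intact = clip_way(Way(2, [1, 2]), kept_node_ids={1, 2})
```

Here "partial" loses node 3 and carries clipped=True, while "intact" is unchanged. Every task that copies or rewrites ways would have to preserve the flag, which is where most of the work lies.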
COMPLETE GEOMETRIES
As you've discovered, you can end up with missing data for ways or relations
that extend outside the bounding box. For example, nodes that sit outside the
area may be part of a way inside the area but will be left out.
The --bounding-xxxx tasks are unlikely to ever be fixed in this regard. As
Frederik points out, there is just no way to do this efficiently in a streaming
fashion because it requires random access to the entire dataset, which isn't
something Osmosis is good at. By the time you process ways, the nodes are
not available in memory any more so you can't tell where each way is
located. You can't go back and add extra nodes to the result set (because
nodes have already been written to output), and you can't invent new nodes
to connect the way to the edge of the polygon (because you don't know where
the way is located without access to nodes).
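The constraint can be sketched in a few lines of Python. This is a simplification under stated assumptions, not the actual Osmosis implementation: entities are plain tuples, nodes arrive before ways, and the function name stream_extract is made up for the example. Only the IDs of kept nodes are remembered, never their coordinates.

```python
def stream_extract(entities, box):
    """Filter a node-then-way stream against a box in a single pass.

    entities: iterable of ("node", id, lat, lon) or ("way", id, [node ids]),
              with all nodes arriving before any way (an assumption here).
    box:      (left, bottom, right, top) in degrees.
    """
    left, bottom, right, top = box
    kept_ids = set()  # IDs only; coordinates are gone once a node has passed
    for entity in entities:
        if entity[0] == "node":
            _, node_id, lat, lon = entity
            if left <= lon <= right and bottom <= lat <= top:
                kept_ids.add(node_id)
                yield entity  # already written; cannot be revisited later
        elif entity[0] == "way":
            _, way_id, node_ids = entity
            inside = [n for n in node_ids if n in kept_ids]
            if inside:
                # References to dropped nodes are removed; their positions
                # are unknown by now, so no boundary node can be invented.
                yield ("way", way_id, inside)


data = [
    ("node", 1, 1.0, 1.0),    # inside
    ("node", 2, 1.0, 20.0),   # outside (lon beyond the right edge)
    ("way", 10, [1, 2]),
]
result = list(stream_extract(data, (0, 0, 10, 10)))
```

By the time way 10 arrives, node 2 has already been discarded, so the way can only be emitted with the single reference that survived.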
I have created an alternative mechanism for doing this but it requires
setting up a PostGIS database containing the full planet (or at least
containing a much larger area than the area of interest) and performing
extracts from there. The database is populated using the --write-pgsql or
--write-pgsql-dump tools, it is kept up to date with latest diffs using
--write-pgsql-change, and bounding boxes are extracted using the
--read-pgsql and --dataset-bounding-box tasks. There is no
--dataset-bounding-polygon task but it shouldn't be impossible to
implement. Having said all that, it requires a lot of setup, and I don't
know how well it performs in the real world.
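For contrast, here is a rough Python sketch of why random access removes the problem. It uses in-memory dictionaries rather than a database, it says nothing about how the PostGIS schema or the dataset tasks actually work, and the function name extract_complete is hypothetical. It also assumes every node referenced by a way is present in the node index.

```python
def extract_complete(nodes, ways, box):
    """Extract a box while keeping every selected way complete.

    nodes: {node id: (lat, lon)}, a random-access index
    ways:  {way id: [node ids]}
    box:   (left, bottom, right, top) in degrees
    """
    left, bottom, right, top = box

    def inside(node_id):
        lat, lon = nodes[node_id]
        return left <= lon <= right and bottom <= lat <= top

    # A way is selected if any of its nodes falls inside the box.
    kept_ways = {
        way_id: node_ids
        for way_id, node_ids in ways.items()
        if any(inside(n) for n in node_ids)
    }

    # Keep nodes inside the box, then go back for the outside nodes that
    # selected ways still need. This second lookup needs random access.
    kept_nodes = {node_id for node_id in nodes if inside(node_id)}
    for node_ids in kept_ways.values():
        kept_nodes.update(node_ids)

    return {n: nodes[n] for n in sorted(kept_nodes)}, kept_ways


nodes = {1: (1.0, 1.0), 2: (1.0, 20.0), 3: (50.0, 50.0)}
ways = {10: [1, 2], 11: [3]}
out_nodes, out_ways = extract_complete(nodes, ways, (0, 0, 10, 10))
```

Way 10 comes out with both of its nodes, including node 2 from outside the box, at the cost of holding the whole dataset in an index.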
There are lots of things that could be done to improve the --bounding-xxx
tasks in Osmosis. I'm not likely to do any of them (lack of time being
the main constraint), but anybody is welcome to step in and improve them if
they're prepared or able to do some coding.
Brett