[OSM-dev] Working with OSM data with less or no metadata

Simon Poole simon at poole.ch
Wed Feb 14 15:47:25 UTC 2018


General comments:

- we are only considering removing metadata from what is publicly
available outside of the OSM community; the current thinking is that it
can remain available to authenticated users

- while there might be a tiny bit of leakage from providing version
numbers, we haven't considered them to be a large concern, and a good
argument can be made why they need to be public (see below)

- timestamps, however, can not only potentially be used in lieu of
changeset ids to group contributions; the information itself is
problematic because it allows contributions to be profiled over time

Neither the uid/display name nor the timestamp of an existing object
version is required to create a modified version for upload to the API;
the version number, however, is.
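
To illustrate the point, here is a minimal Python sketch of a payload for
modifying a node, assuming the usual API 0.6 XML layout. The function name
build_node_update and all values are hypothetical. The element carries the
id, the current version and an open changeset id, but no uid, user or
timestamp.

```python
import xml.etree.ElementTree as ET


def build_node_update(node_id, version, changeset_id, lat, lon, tags):
    """Build an XML payload for a modified node (hypothetical helper).

    Only id, version and changeset are set as metadata; uid, user and
    timestamp of the existing object version are deliberately left out.
    """
    osm = ET.Element("osm")
    node = ET.SubElement(osm, "node", {
        "id": str(node_id),
        "version": str(version),         # version of the object being modified
        "changeset": str(changeset_id),  # changeset opened for this upload
        "lat": f"{lat:.7f}",
        "lon": f"{lon:.7f}",
    })
    for key, value in sorted(tags.items()):
        ET.SubElement(node, "tag", {"k": key, "v": value})
    return ET.tostring(osm, encoding="unicode")


if __name__ == "__main__":
    # Made-up example values, not taken from real data.
    print(build_node_update(42, 3, 1001, 47.3769, 8.5417, {"amenity": "cafe"}))
```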

Simon


On 14.02.2018 at 10:30, Michael Reichert wrote:
> Hi,
>
> people are talking about potential changes to the amount of (personal)
> data distributed by OSM, in the light of new data protection laws
> becoming effective in the EU this May. There haven't been any official
> statements by the OSMF but discussions are going on in the LWG [1].
>
> Even though it is still unclear what the concrete steps will be, I have
> done some experiments. How well do our existing tools behave if you feed
> them with OSM data that has less metadata than usual, or no metadata at
> all? I have set up a test suite which tests Osmium-Tool (which uses the
> Libosmium library; master branch), Osmosis 0.44.1 and Osmconvert 0.6.
>
> The test suite is available at
> https://github.com/geofabrik/metadata-test/
> and consists of a Bash script. You need to have osmium, osmosis and
> osmconvert in your path (or you have to modify the script a bit). The
> test suite comes with its own hand-crafted test data, which will first
> be converted to PBF by Osmium. Afterwards all three tools will prove
> themselves in the following challenges:
>
> - converting XML to PBF
> - converting PBF to XML
> - converting XML to XML
> - applying a diff
> - deriving changes between two OSM files
>
> All challenges are run four times, one iteration with full metadata, one
> with timestamp and version fields, one with version field only and one
> without any metadata. Some PBF challenges will also have two variants –
> one with DenseNodes and one without.
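
For readers who want to produce comparable inputs from their own files, the
four metadata levels can be sketched in Python as below. This is not how the
test suite generates its data (that is hand-crafted); the level names and the
function strip_metadata are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Metadata attributes commonly found on OSM XML objects.
METADATA_ATTRS = ("version", "timestamp", "changeset", "uid", "user")

# Hypothetical level names mapped to the attributes that are kept.
KEEP = {
    "full": set(METADATA_ATTRS),
    "version+timestamp": {"version", "timestamp"},
    "version": {"version"},
    "none": set(),
}


def strip_metadata(xml_text, level):
    """Return OSM XML with only the metadata attributes allowed by `level`."""
    keep = KEEP[level]
    root = ET.fromstring(xml_text)
    for obj in root:
        if obj.tag not in ("node", "way", "relation"):
            continue
        for attr in METADATA_ATTRS:
            if attr not in keep:
                obj.attrib.pop(attr, None)  # ignore attributes that are absent
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        '<osm version="0.6"><node id="1" version="2" '
        'timestamp="2018-01-01T00:00:00Z" changeset="5" uid="7" user="a" '
        'lat="1.0" lon="2.0"/></osm>'
    )
    for level in KEEP:
        print(level, strip_metadata(sample, level))
```
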
>
> The results are files located in the output/ directory. You have to
> inspect them manually; I have not written a tool to parse them and
> report how many tests failed.
>
> *Results*
> I compiled the results into a spreadsheet. You can download it at
> https://github.com/geofabrik/metadata-test/raw/master/table.ods
>
> To sum them up:
> - Osmium is the only programme which passes all format conversion tests.
>
> - Osmosis cannot read any XML (OSM and OSC) files without timestamp and
> version fields.
>
> - Osmosis and Osmconvert [2] treat all metadata fields in the DenseInfo
> message of the PBF format as mandatory, although the format
> specification doesn't declare these fields as mandatory. As a result,
> they write default values into PBF files if the input lacks these fields:
> version="-1" timestamp="1969-12-31T23:59:59Z" changeset="-1" (Osmosis [3]),
> timestamp="1970-01-01T00:00:01Z" changeset="1" version="1" (Osmconvert)
> This partially applies to the XML output of Osmosis, too.
>
> - Deriving a diff file of the changes between two OSM files only works
> if both files have the same amount of metadata. If one file contains
> less or more metadata, all objects will appear in the diff file with
> their new metadata and bloat it. The question is whether this is the
> desired behaviour (i.e. the ability to strip metadata from a file using
> large diffs) or whether it is not desired and the tools generating
> diffs should instead compare the tags, location and members of
> objects which have the same ID but different metadata.
>
> - Some tools have bugs which lead to wrong diffs (e.g. missing
> modifications) if some metadata fields are missing.
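
The second alternative in the diff question above (comparing tags, location
and members rather than metadata) could look roughly like the following
Python sketch. It is not the behaviour of any of the tested tools; the
function names are hypothetical, and coordinates are compared as strings, so
"1.0" and "1.00" would count as different.

```python
import xml.etree.ElementTree as ET

# Attributes ignored when deciding whether an object has changed.
METADATA_ATTRS = {"version", "timestamp", "changeset", "uid", "user", "visible"}


def content_key(obj):
    """Summarise an object by its location, tags, node refs and members."""
    attrs = sorted((k, v) for k, v in obj.attrib.items() if k not in METADATA_ATTRS)
    tags = sorted((t.get("k"), t.get("v")) for t in obj.findall("tag"))
    # Order matters for way nodes and relation members, so it is preserved.
    nds = [nd.get("ref") for nd in obj.findall("nd")]
    members = [(m.get("type"), m.get("ref"), m.get("role")) for m in obj.findall("member")]
    return (tuple(attrs), tuple(tags), tuple(nds), tuple(members))


def load(xml_text):
    """Map (object type, id) to the content summary of each object."""
    root = ET.fromstring(xml_text)
    return {
        (obj.tag, obj.get("id")): content_key(obj)
        for obj in root
        if obj.tag in ("node", "way", "relation")
    }


def content_diff(old_xml, new_xml):
    """List created, modified and deleted objects, ignoring metadata."""
    old, new = load(old_xml), load(new_xml)
    return {
        "create": sorted(new.keys() - old.keys()),
        "modify": sorted(k for k in new.keys() & old.keys() if old[k] != new[k]),
        "delete": sorted(old.keys() - new.keys()),
    }
```

With this approach an object whose metadata was removed but whose tags,
location and members are unchanged does not show up in the diff.
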
>
> Best regards
>
> Michael
>
>
> [1]
> https://wiki.osmfoundation.org/wiki/Working_Group_Minutes#Licensing_Working_Group
> [2] Osmium also had this bug, but it was fixed on the master branch a
> few days ago.
> [3] Osmium cannot parse negative version numbers and throws an exception.
