<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>General comments:</p>
<p>- we are only considering removing metadata from what is publicly
available outside of the OSM community; the current thinking is
that it can remain available to authenticated users<br>
</p>
<p>- while there might be a tiny bit of leakage from providing
version numbers we haven't considered them to be a large concern,
and a good argument can be made why they need to be public (see
below)<br>
</p>
<p>- timestamps, however, can not only potentially be used in lieu of
changeset ids to group contributions; the information itself is
problematic because it allows contributions to be profiled over time</p>
<p>Neither the uid/display name nor the timestamp of an existing object
version is required to create a modified version for upload to
the API; the version number, however, is.</p>
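<p>A minimal sketch of what this means for an upload, assuming the
osmChange format of API 0.6. The function name and the example values
are made up for illustration, and the changeset attribute refers to the
uploader's own open changeset, not to the existing object version:</p>
<pre>
```python
import xml.etree.ElementTree as ET


def build_modify(node_id, version, changeset_id, lat, lon, tags):
    """Build an osmChange document that modifies a single node.

    Only the id and version of the existing object are carried over;
    no uid, user name, or timestamp is needed (hypothetical helper).
    """
    root = ET.Element("osmChange", version="0.6")
    modify = ET.SubElement(root, "modify")
    node = ET.SubElement(modify, "node", {
        "id": str(node_id),
        "version": str(version),         # version being replaced
        "changeset": str(changeset_id),  # the uploader's open changeset
        "lat": str(lat),
        "lon": str(lon),
    })
    for key, value in sorted(tags.items()):
        ET.SubElement(node, "tag", {"k": key, "v": value})
    return ET.tostring(root, encoding="unicode")


doc = build_modify(1, 3, 42, 51.5, -0.1, {"name": "Example"})
```
</pre>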
<p>Simon<br>
</p>
<br>
<div class="moz-cite-prefix">On 14.02.2018 at 10:30, Michael
Reichert wrote:<br>
</div>
<blockquote type="cite"
cite="mid:af5a2ea6-5a7e-acca-74f1-0ab39f6d99ca@geofabrik.de">
<pre wrap="">Hi,
people are talking about potential changes to the amount of (personal)
data distributed by OSM, in the light of new data protection laws
becoming effective in the EU this May. There haven't been any official
statements by the OSMF but discussions are going on in the LWG [1].
Even though it is still unclear what the concrete steps will be, I have
done some experiments. How well do our existing tools behave if you feed
them with OSM data that has less metadata than usual, or no metadata at
all? I have set up a test suite which tests Osmium-Tool (which uses the
Libosmium library; master branch), Osmosis 0.44.1 and Osmconvert 0.6.
The test suite is available at
<a class="moz-txt-link-freetext" href="https://github.com/geofabrik/metadata-test/">https://github.com/geofabrik/metadata-test/</a>
and consists of a Bash script. You need to have osmium, osmosis and
osmconvert in your path (or you have to modify the script a bit). The
test suite comes with its own hand crafted test data which will be first
converted to PBF by Osmium. Afterwards all three tools will prove
themselves in the following challenges:
- converting XML to PBF
- converting PBF to XML
- converting XML to XML
- applying a diff
- deriving changes between two OSM files
All challenges are run four times, one iteration with full metadata, one
with timestamp and version fields, one with version field only and one
without any metadata. Some PBF challenges will also have two variants –
one with DenseNodes and one without.
The results are files located in the output/ directory. You have to
inspect them manually, I have not written a tool to parse them and
output how many tests failed.
*Results*
I compiled the results into a spreadsheet. You can download it at
<a class="moz-txt-link-freetext" href="https://github.com/geofabrik/metadata-test/raw/master/table.ods">https://github.com/geofabrik/metadata-test/raw/master/table.ods</a>
To sum them up:
- Osmium is the only programme which passes all format conversion tests.
- Osmosis cannot read any XML (OSM and OSC) files without timestamp and
version fields.
- Osmosis and Osmconvert [2] treat all metadata fields in the DenseInfo
message of the PBF format as mandatory. However, the format
specification doesn't declare these fields as mandatory. Therefore, they
write default values into PBF files if the input lacks these fields:
version="-1" timestamp="1969-12-31T23:59:59Z" changeset="-1" (Osmosis [3]),
timestamp="1970-01-01T00:00:01Z" changeset="1" version="1" (Osmconvert)
This partially applies to the XML output of Osmosis, too.
- Deriving a diff file of the changes between two OSM files only works
if both files have the same amount of metadata. If one file contains
less or more metadata, all objects will appear in the diff file with
their new metadata and bloat it up. The question is whether this is the
desired behaviour (i.e. the ability to clean a file from metadata using
large diffs) or if this behaviour is not desired and the tools
generating diffs should compare the tags, location and members of
objects which have the same ID but different metadata.
- Some tools have bugs which lead to wrong diffs (e.g. missing
modifications) if some metadata fields are missing.
Best regards
Michael
[1]
<a class="moz-txt-link-freetext" href="https://wiki.osmfoundation.org/wiki/Working_Group_Minutes#Licensing_Working_Group">https://wiki.osmfoundation.org/wiki/Working_Group_Minutes#Licensing_Working_Group</a>
[2] Osmium also had this bug. But it was fixed on the master branch a
few days ago.
[3] Osmium cannot parse negative version numbers and throws an exception.
</pre>
<br>
<pre wrap="">_______________________________________________
dev mailing list
<a class="moz-txt-link-abbreviated" href="mailto:dev@openstreetmap.org">dev@openstreetmap.org</a>
<a class="moz-txt-link-freetext" href="https://lists.openstreetmap.org/listinfo/dev">https://lists.openstreetmap.org/listinfo/dev</a>
</pre>
</blockquote>
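<p>On the diff question raised in the quoted message: a comparison that
ignores metadata could look roughly like the sketch below. It assumes
objects are represented as plain dicts, which is a hypothetical
representation and not how Osmium, Osmosis or Osmconvert work
internally. The node list of a way is treated like a member list.</p>
<pre>
```python
def content_key(obj):
    """Return the parts of an OSM object that are not metadata."""
    return (
        obj.get("type"),
        obj.get("id"),
        tuple(sorted(obj.get("tags", {}).items())),
        obj.get("lat"),
        obj.get("lon"),
        tuple(obj.get("nodes", ())),  # node refs of a way
        tuple((m["type"], m["ref"], m["role"])
              for m in obj.get("members", ())),  # relation members
    )


def content_changed(old, new):
    """True if tags, location, or members differ; metadata is ignored."""
    return content_key(old) != content_key(new)
```
</pre>
<p>With such a comparison, two versions of an object that differ only in
version, timestamp, changeset or user fields would not show up in the
derived diff.</p>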
<br>
</body>
</html>