[Rebuild] Large relations - rules and tools

Frederik Ramm frederik at remote.org
Wed Feb 1 12:47:16 GMT 2012


Hi,

On 02/01/12 12:20, Richard Fairhurst wrote:
> 2 I think is not yet provided anywhere. I guess it needs a full-history
> database to be done properly: if all else fails I will write a nasty Perl
> script that will download each version in turn from the API and diff them,
> but that would be very_horrible, as our tagging@ friends say, and probably
> result in my being banned from the API. Any thoughts?

One thing that falls out of my OSMI processing is a "reduced full 
history file" that contains the full history (tainted and untainted 
versions) of every object where at least one version is tainted, in a 
special ASCII form with one line per object version, like this:

r69318 1/1722 v=1 u=10983 c=820189 o=0 mb=5183633*way*SouthWestCoastPath ^created_by^Potlatch 0.10f^type^route
r69318 2/1722 v=1 u=10983 c=820189 o=0 mb=5183633*way*SouthWestCoastPath,30336554*way* ^created_by^Potlatch 0.10f^type^route
r69318 3/1722 v=1 u=10983 c=820189 o=0 mb=5183633*way*SouthWestCoastPath,30336554*way* ^created_by^Potlatch 0.10f^name^SouthWestCoastPath^type^route
r69318 4/1722 v=1 u=10983 c=820189 o=0 mb=5183633*way*SouthWestCoastPath,5183635*way*,30336554*way* ^created_by^Potlatch 0.10f^name^SouthWestCoastPath^type^route
r69318 5/1722 v=1 u=10983 c=820189 o=0 mb=5183633*way*SouthWestCoastPath,5183635*way*,5183636*way*,30336554*way* ^created_by^Potlatch 0.10f^name^SouthWestCoastPath^type^route
r69318 6/1722 v=1 u=10983 c=820189 o=0 mb=5183633*way*SouthWestCoastPath,5183635*way*,5183636*way*,23012934*way*,30336554*way* ^created_by^Potlatch 0.10f^name^SouthWestCoastPath^type^route
r69318 7/1722 v=1 u=10983 c=820189 o=0 mb=5183633*way*SouthWestCoastPath,5183635*way*,5183636*way*,23012934*way*,23012937*way*,30336554*way* ^created_by^Potlatch 0.10f^name^SouthWestCoastPath^type^route
r69318 8/1722 v=1 u=10983 c=820189 o=0 mb=5183633*way*SouthWestCoastPath,5183635*way*,5183636*way*,23012934*way*,23012937*way*,23012939*way*,30336554*way* ^created_by^Potlatch 0.10f^name^SouthWestCoastPath^type^route
r69318 9/1722 v=1 u=10983 c=820189 o=0 mb=5183633*way*SouthWestCoastPath,5183635*way*,5183636*way*,23012934*way*,23012937*way*,23012939*way*,23012940*way*,30336554*way* ^created_by^Potlatch 0.10f^name^SouthWestCoastPath^type^route
r69318 10/1722 v=1 u=10983 c=820189 o=0 mb=5183633*way*SouthWestCoastPath,5183635*way*,5183636*way*,23012934*way*,23012937*way*,23012939*way*,23012940*way*,23012942*way*,30336554*way* ^created_by^Potlatch 0.10f^name^SouthWestCoastPath^type^route
...

(Key: "r69318": object id; "1/1722": version 1 of 1722; "v=1": visible; 
"u=10983": user 10983; "c=820189": changeset 820189; "o=0": not 
odbl-clean; "mb=...": all members, comma-separated; and finally, at the 
end of the line, the tags.)
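
As a rough illustration of how one such line might be read, here is a minimal Python sketch. It is not the code that produces or consumes the file; the names parse_line, Member and ObjectVersion are made up for this example. It assumes that each record sits on a single physical line, that the tags start at the first " ^" and are "^"-separated key/value pairs, that each member has the form id*type*role, and that roles and tag values contain none of the separator characters. It also guesses that "v=0" and "o=1" are the opposites of the values shown in the key. Only relation lines appear in the sample, so other object types may carry different fields.

```python
from typing import Dict, List, NamedTuple


class Member(NamedTuple):
    ref: int    # id of the member object
    kind: str   # e.g. "way"
    role: str   # may be empty


class ObjectVersion(NamedTuple):
    object_id: str
    version: int
    total_versions: int
    visible: bool
    user: int
    changeset: int
    odbl_clean: bool
    members: List[Member]
    tags: Dict[str, str]


def parse_line(line: str) -> ObjectVersion:
    # Assumption: everything after the first " ^" is the tag section.
    head, _, tag_text = line.rstrip("\n").partition(" ^")
    fields = head.split()

    # Second field is "version/total", e.g. "1/1722".
    version, total = (int(n) for n in fields[1].split("/"))

    # Remaining fields are key=value pairs (v, u, c, o, mb).
    attrs = dict(f.split("=", 1) for f in fields[2:] if "=" in f)

    # Assumption: members are "id*type*role", separated by commas.
    members = []
    for item in attrs.get("mb", "").split(","):
        if item:
            ref, kind, role = item.split("*", 2)
            members.append(Member(int(ref), kind, role))

    # Tags alternate key, value, key, value, ...
    parts = tag_text.split("^") if tag_text else []
    tags = dict(zip(parts[0::2], parts[1::2]))

    return ObjectVersion(
        object_id=fields[0],
        version=version,
        total_versions=total,
        visible=attrs.get("v") == "1",
        user=int(attrs["u"]),
        changeset=int(attrs["c"]),
        odbl_clean=attrs.get("o") == "1",  # guess: "o=1" means clean
        members=members,
        tags=tags,
    )
```

Calling parse_line on the first sample line would give an ObjectVersion with one member (way 5183633) and the two tags created_by and type.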

This file is only 3 GB compressed (20 GB uncompressed) and is 
relatively quick and easy to parse compared to the XML monster that is 
the full history file.
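
For the version-by-version diffing Richard mentions, something along these lines might do as a starting point. This is only a sketch under assumptions: member_refs and member_changes are hypothetical names, each record is taken to be on one physical line, all versions of an object are taken to be adjacent and in ascending order as in the sample, and roles are assumed to contain no spaces or commas. It compares member ids as sets, so reordering and duplicate members go unnoticed.

```python
import re
from itertools import groupby

# Assumption: the member list is the whitespace-free token after " mb=".
MEMBERS = re.compile(r" mb=(\S*)")


def member_refs(line):
    """Return the member ids (as strings) listed on one line."""
    match = MEMBERS.search(line)
    if not match:
        return []
    return [item.split("*", 1)[0] for item in match.group(1).split(",") if item]


def member_changes(lines):
    """Yield (object_id, version_field, added, removed) for each version
    after the first of every object in the stream."""
    lines = (l for l in lines if l.strip())  # skip blank lines
    # Group adjacent lines by their first field, the object id.
    for object_id, group in groupby(lines, key=lambda l: l.split(" ", 1)[0]):
        previous = None
        for line in group:
            version_field = line.split()[1]  # e.g. "2/1722"
            current = set(member_refs(line))
            if previous is not None:
                added = sorted(current - previous)
                removed = sorted(previous - current)
                yield object_id, version_field, added, removed
            previous = current
```

Since member_changes accepts any iterable of lines, it could be fed directly from an open file object, so the 20 GB would not need to be held in memory.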

I make a new version of this about once a day, and could make it 
available for download if anyone fancies doing their own analyses. I 
could also make a file with only the relations in it.

Bye
Frederik


