[Talk-us] Blocking of user "WorstFixer" for removing ele=0 etc

Frederik Ramm frederik at remote.org
Wed May 16 08:33:15 BST 2012


Hi,

On 05/16/2012 03:13 AM, Alan Mintz wrote:
> I can understand why that is - it's being worked on by many people, may
> need partial revertability, will probably run for a long time, etc.
> Removal of one tag in bulk doesn't present these issues, and may be
> possible, which is why I'm asking: a) does it help; and b) is it possible?

It is certainly possible to remove created_by from the database without 
a trace; it just requires a couple (two?) SQL statements.
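
As a rough illustration of what such a bulk removal amounts to, here is a minimal sketch against an in-memory SQLite database. The table and column names (current_node_tags, node_tags, k, v) are assumptions for the demo only, not a statement about the production schema, and the real database would also need the same treatment for ways and relations.

```python
import sqlite3

# Hypothetical, simplified schema: one table for current tags and one for
# historic tag versions. The real OSM database layout may differ.
TAG_TABLES = ("current_node_tags", "node_tags")


def remove_tag_everywhere(conn, key):
    """Delete every row carrying `key` from each tag table; return rows removed."""
    removed = 0
    for table in TAG_TABLES:
        # Table names come from the fixed tuple above; the key is bound as a parameter.
        cur = conn.execute(f"DELETE FROM {table} WHERE k = ?", (key,))
        removed += cur.rowcount
    conn.commit()
    return removed


def demo():
    """Build a tiny example database, remove created_by, report (removed, remaining)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE current_node_tags (node_id INTEGER, k TEXT, v TEXT)")
    conn.execute(
        "CREATE TABLE node_tags (node_id INTEGER, version INTEGER, k TEXT, v TEXT)"
    )
    conn.executemany(
        "INSERT INTO current_node_tags VALUES (?, ?, ?)",
        [(1, "created_by", "JOSM"), (1, "source", "PGS"), (2, "name", "Foo")],
    )
    conn.executemany(
        "INSERT INTO node_tags VALUES (?, ?, ?, ?)",
        [
            (1, 1, "created_by", "JOSM"),
            (1, 1, "source", "PGS"),
            (2, 1, "created_by", "Potlatch 0.10"),
        ],
    )
    removed = remove_tag_everywhere(conn, "created_by")
    remaining = sum(
        conn.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0] for t in TAG_TABLES
    )
    conn.close()
    return removed, remaining
```

Note that deleting from the historic table is exactly the step that rewrites history: afterwards no old version of an object shows the tag any more.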

It would be an almost unprecedented action; the last time we kicked 
something out of the database like that was when we removed the first, 
aborted, TIGER import.

Sysadmins are somewhat reluctant to do something like that because it
falls outside the envelope of "normal operations" and could have side
effects that nobody foresaw. It would also falsify history: by removing
the tag, we would essentially claim that it was never there in the
first place. This is of course not terribly important, but still: for
objects created in pre-0.6 API times, the created_by tag that you can
look up in the object history is the only thing that tells us which
editor was used when the object was created.

So yes, it is possible, and I believe that if the benefit were big
enough it could be done.

But what's the benefit really? Most people who run a local database
instance run an osm2pgsql database and never imported created_by in
the first place, so no space is wasted there. Newly generated diffs
are unlikely to contain many created_by tags because modern editors
delete created_by on sight, so that's a non-issue too. As for planet
file size, I removed all 1.75 million created_by tags from a 1300 MB
germany.osm.pbf and ended up with a 1297 MB file, which suggests that
45 MB of the planet file could be saved by removing all created_by
tags (that's about 0.2%).
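
The extrapolation can be checked with a few lines of arithmetic. This is only a sketch using the Germany extract figures quoted above; the function names are made up for the illustration, and the planet size it prints is merely what the 45 MB estimate implies, not a measured value.

```python
def saved_fraction(before_mb, after_mb):
    """Fraction of the file size saved by stripping the tag."""
    return (before_mb - after_mb) / before_mb


def implied_total_mb(saved_mb, fraction):
    """File size at which `fraction` of the file corresponds to `saved_mb`."""
    return saved_mb / fraction


# Germany extract: 1300 MB before, 1297 MB after removing created_by.
frac = saved_fraction(1300, 1297)
print(f"{frac:.2%}")                        # 0.23%

# Planet size implied by a 45 MB saving at the same ratio.
print(round(implied_total_mb(45, frac)))    # 19500 (MB)
```

This assumes that created_by is spread through the planet at the same density as in the Germany extract, which need not hold.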

It might make a larger difference on the history planet file, and there
surely will be some places where the response to a "map" API call would
be more noticeably affected (there are stretches of coastline, I
believe, where every node carries a source and a created_by tag).

However, on the whole, I don't think that there's a large enough benefit 
for drastic action; as I said, most editors will already drop created_by 
on upload so the tag is slowly dying out anyway.

Bye
Frederik

-- 
Frederik Ramm  ##  eMail frederik at remote.org  ##  N49°00'09" E008°23'33"


