[Talk-us] Blocking of user "WorstFixer" for removing ele=0 etc

Alan Mintz Alan_Mintz+OSM at Earthlink.Net
Tue May 15 22:24:25 BST 2012


At 2012-05-13 02:49, Frederik Ramm wrote:
>Removing ele=0 from objects is, in my opinion, totally unnecessary;

And maybe incorrect, as ele=0 means we know the elevation is 0, while no 
ele tag means we do not know the elevation.


>  like created_by, over which WorstFixer made a similar fuss, such 
> information could be removed where an object is touched for some other 
> reason but I don't see why it would have to be mass-removed.

The reason to avoid a mass-removal may not be obvious to some. I assume it's 
because we store the history of all objects, so a mass edit wastes space, not 
to mention the bandwidth and processing resources needed to push the changes 
out to the mirrors, for almost no benefit. I just add "created_by=''" to my 
JOSM presets (or maybe it does this automatically now) so I clean it up when 
performing other edits.


>  Even so, a mass-removal would be ok if proposed, discussed, and accepted 
> by the community like we expect everyone to; it's not ok to just do it on 
> your own and see if someone notices.

Yes. Having said all that, OSMTI says there are 23 million nodes (33% of 
the total) with created_by tags! This seemed surprisingly high to me.

I retrieved nodes from 300 random 0.1x0.1 degree bboxes. Of those, only 37 
returned any nodes at all**. All but 6 of those 37 areas had no "created_by" 
tags on their nodes. Of the 6 that did, only 2 had a significant percentage*, 
both in Norway.

#137 had 1558 nodes, 801 of which (51%) have created_by tags.
     BLTR: 68.137  13.766  68.237  13.866
#264 had 2297 nodes, 1946 of which (85%) have created_by tags.
     BLTR: 60.787   4.900  60.887   5.000

In #137, they are mostly tagged:
     <tag k="created_by" v="JOSM"/> (TI says this makes up 63% of the values)

In #264, they are mostly tagged:
     <tag k="created_by" v="almien_coastlines"/> (TI says this makes up 10% of the values)
     <tag k="source" v="PGS(could be inacurately)"/>
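
For anyone who wants to repeat the sampling, here is a minimal sketch of 
the counting step. It is not the script I actually used: the function names 
are made up for illustration, it assumes each bbox has already been 
downloaded as OSM XML text, and it picks bboxes uniformly in lat/lon, which 
is an assumption (it over-samples the polar regions relative to area).

```python
import random
import xml.etree.ElementTree as ET


def random_bbox(size=0.1, rng=random):
    """Return a random (bottom, left, top, right) box of size x size degrees.

    Assumption: uniform in latitude/longitude, not uniform by area.
    """
    bottom = round(rng.uniform(-90.0, 90.0 - size), 3)
    left = round(rng.uniform(-180.0, 180.0 - size), 3)
    return (bottom, left, round(bottom + size, 3), round(left + size, 3))


def count_created_by(osm_xml):
    """Return (total nodes, nodes with a created_by tag) for OSM XML text."""
    root = ET.fromstring(osm_xml)
    total = tagged = 0
    for node in root.iter("node"):
        total += 1
        # A node counts once, whatever the value of its created_by tag.
        if any(tag.get("k") == "created_by" for tag in node.iter("tag")):
            tagged += 1
    return total, tagged


def percent(tagged, total):
    """Whole-number percentage, or 0 for an empty area."""
    return round(100.0 * tagged / total) if total else 0
```

Feeding the counts above through percent() gives the same figures, e.g. 
percent(801, 1558) for #137 and percent(1946, 2297) for #264.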


My questions are:

1. Would removing the created_by from 33% of the nodes in the database save 
significant storage space, dump size, backup time, etc.?

2. Is it possible to remove these in bulk from the database without having 
to keep the history, push those diffs to mirrors, etc.? Do the mirrors 
occasionally start fresh from a new dump? Or can they run the same bulk 
purge? Or do I overestimate the necessity of doing it this way (and we can 
just clean it up with the regular tools and processes)?



* While not a significant portion of the total nodes in the area (only 4%), 
there were almost 600 created-by-tagged nodes in this file from England:

#123 had 14013 nodes, 594 of which (4%) have created_by tags.
     BLTR: 51.086   0.088  51.186   0.188


** I guess this clarifies why old satellites that fall from their orbits 
and other space junk never seem to hit anything, even if they survive 
re-entry :)

--
Alan Mintz <Alan_Mintz+OSM at Earthlink.net>



