[OSM-talk] Fwd: Re: Tagwatch

OJW streetmap at blibbleblobble.co.uk
Fri Oct 19 18:28:20 BST 2007


On Friday 19 October 2007 16:34:02 David Groom wrote:
> Why for instance does natural = peak not show up in the table despite being
> in ohm-POI-features-z17.xml, or is it that the tag is not used frequently
> enough?

I'm scanning OSM files to get the tags, currently using the latest UK.osm from 
NickW's site.
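
For anyone who wants to try the scanning step themselves, here is a minimal
sketch in Python. It is not the actual Tagwatch script; it only assumes the
usual OSM XML layout, where each node or way carries <tag k="..." v="..."/>
children. The function name count_tags and the sample data are made up for
illustration.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict


def count_tags(osm_source):
    """Count every key/value pair found in <tag k="..." v="..."/> elements.

    osm_source may be a file name or a binary file-like object.
    Returns {key: Counter({value: occurrences})}.
    """
    counts = defaultdict(Counter)
    # iterparse streams the file, so a large extract need not fit in memory.
    for _event, elem in ET.iterparse(osm_source, events=("end",)):
        if elem.tag == "tag":
            counts[elem.get("k")][elem.get("v")] += 1
        elem.clear()  # drop attributes and children we no longer need
    return counts


# Tiny made-up extract, for illustration only.
SAMPLE = b"""<osm version="0.5">
  <node id="1" lat="50.7" lon="-1.3"><tag k="natural" v="peak"/></node>
  <way id="2"><nd ref="1"/><tag k="highway" v="residential"/></way>
  <way id="3"><nd ref="1"/><tag k="highway" v="residential"/></way>
</osm>"""

tag_counts = count_tags(io.BytesIO(SAMPLE))
print(tag_counts["highway"]["residential"], tag_counts["natural"]["peak"])
```

Passing a file name such as "UK.osm" instead of the BytesIO object should work
the same way.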

Then there's the "Watchlist" page, which tells the script which tags it 
should be interested in (still looking for better ideas; this model is just 
my first draft of a way to find "interesting" tags):

http://wiki.openstreetmap.org/index.php/Tagwatch/Watchlist

Then when it generates the actual pages, it does a frequency comparison within 
each tag.  So if your highway=something type occurs at least 0.1 times as 
frequently as highway=residential then it gets a line of its own with photo 
and translations and wikipedia link. If not, it gets included on the list 
of "other values".
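
The frequency comparison described above could be sketched roughly like this
in Python. This is an illustration, not the real Tagwatch code: the function
name split_values, the choice to pass the reference value explicitly, and the
counts in the example are all assumptions.

```python
def split_values(value_counts, reference_value, ratio=0.1):
    """Split the values of one key into "own line" and "other values".

    A value gets its own line when it occurs at least `ratio` times as
    often as `reference_value` (e.g. "residential" for highway).
    Note: if the reference value never occurs, the threshold is 0 and
    every value gets its own line.
    """
    threshold = ratio * value_counts.get(reference_value, 0)
    own_line, other_values = [], []
    # Most frequent values first, so both lists come out ordered by usage.
    for value, n in sorted(value_counts.items(), key=lambda kv: -kv[1]):
        (own_line if n >= threshold else other_values).append(value)
    return own_line, other_values


# Made-up counts, for illustration only.
highway = {"residential": 1000, "primary": 150, "footway": 120,
           "Primary": 40, "bridleway": 5}
own_line, other_values = split_values(highway, "residential")
print(own_line)
print(other_values)
```

With these counts the threshold is 100 occurrences, so "Primary" and
"bridleway" fall into the "other values" list.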

I guess that should be expanded to cover anything that someone's defined 
a 'description' tag for?

Talking of statistics, I'd like to know about methods of categorising the 
distribution of sets like this - i.e. given a frequency distribution of tags, 
how do you find which ones are "important"?  This is a maths problem I'd like 
to know the answer to, as it could help tagwatch a lot.

Also, it needs a method of telling the difference between rarely-used tags and 
typos.  e.g. "highway=Primary" gets used a lot but is wrong -- how can 
tagwatch know?  My plan at the moment is to get people to mark these using 
the wiki pages, but suggestions (and code) welcome.
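
As one possible starting point for the typo problem, here is a rough Python
sketch that flags values which look like a more common value after ignoring
case, or which are a near-miss spelling of one. It uses difflib from the
standard library. The function name suspect_typos, the 0.85 cutoff and the
counts are assumptions, and the output is only a list of candidates for a
human to confirm on the wiki, not a verdict.

```python
import difflib


def suspect_typos(value_counts, cutoff=0.85):
    """Map each suspicious value to the more common value it resembles."""
    suspects = {}
    known = []  # case-folded forms of the more frequent values seen so far
    # Visit values from most to least frequent, so a rare spelling is
    # always compared against commoner ones, never the other way round.
    for value in sorted(value_counts, key=lambda v: -value_counts[v]):
        folded = value.strip().casefold()
        match = difflib.get_close_matches(folded, known, n=1, cutoff=cutoff)
        if match:
            suspects[value] = match[0]
        else:
            known.append(folded)
    return suspects


# Made-up counts, for illustration only.
highway = {"residential": 1000, "primary": 500, "secondary": 200,
           "Primary": 40, "residental": 3}
suspects = suspect_typos(highway)
print(suspects)
```

This assumes the correct spelling is the more frequent one, which will not
always hold, so it complements rather than replaces marking typos by hand.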




