[OSM-dev] North America gone in geofabrik and tagwatch

Frederik Ramm frederik at remote.org
Tue Jul 13 12:23:00 BST 2010


Alan,

Alan Mintz wrote:
> Hopefully, someone can throw some resource at it. I'm a little 
> disappointed to see what could be seen as a value judgement, something I 
> try very hard to avoid in my cartographic endeavors, but I also 
> understand the constraints of limited resources.

From the excerpts' perspective it is really driven by what I said - the 
excerpts are there to make data processing easier for people with small 
machines, but anyone who can process a North America extract is by 
definition not someone with a small machine ;)

> I do believe TagWatch is an indispensable tool for creating at least 
> _some_ consistency in tagging. Given most of our (including myself) lack 
> of attention to documenting things in the wiki, it's really a great 
> resource for finding out what people are actually doing in real-world 
> scenarios in the US.
> 
> What sort of resources are required? Machine, RAM, CPU/elapsed time, 
> disk space, size of files to up/download via network, assuming we're 
> just looking for a place to process, not host the results? Can it be 
> done on Windows?

Assuming you are content to run it once a week, you would have to 
download the full planet file, create an extract with Osmosis, and then 
run the tagwatch perl script on it. The tagwatch perl script will not 
run on Windows and will probably require a 16 GB RAM machine to process 
all of North America, but I can test-run it here if you want certainty. 
The whole process from downloading the planet to a finished tagwatch 
output will probably keep the machine occupied for something like two 
days, and nothing much else can be run on it at the same time.
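
As a rough sketch of that weekly run (download, Osmosis extract, tagwatch), 
the three steps could be lined up as below. This is not the actual setup: 
the bounding box numbers are illustrative guesses, the working directory is 
a placeholder, and the "perl tagwatch.pl" invocation is hypothetical, since 
the real script may need a config file or other arguments.

```python
# Sketch only: builds the three commands of the weekly run without executing them.
PLANET_URL = "https://planet.openstreetmap.org/planet/planet-latest.osm.bz2"

# Very rough box around North America; illustrative values, not an official boundary.
NORTH_AMERICA_BBOX = {"left": -170.0, "right": -50.0, "top": 85.0, "bottom": 5.0}


def build_pipeline(workdir, bbox=NORTH_AMERICA_BBOX, tagwatch_cmd=("perl", "tagwatch.pl")):
    """Return the download, extract and analysis commands as argument lists."""
    planet = f"{workdir}/planet-latest.osm.bz2"
    extract = f"{workdir}/north-america.osm.bz2"

    # Step 1: fetch the full planet file.
    download = ["wget", "-O", planet, PLANET_URL]

    # Step 2: cut the extract with Osmosis using a simple bounding box.
    cut = [
        "osmosis",
        "--read-xml", f"file={planet}", "compressionMethod=bzip2",
        "--bounding-box",
        *[f"{side}={bbox[side]}" for side in ("left", "right", "top", "bottom")],
        "--write-xml", f"file={extract}", "compressionMethod=bzip2",
    ]

    # Step 3: hypothetical tagwatch call; the real script's arguments are an assumption.
    analyse = [*tagwatch_cmd, extract]

    return [download, cut, analyse]
```

Each returned list can be handed to subprocess.run(cmd, check=True) in 
order, so a failed download or extract stops the run before tagwatch starts.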

The result is a set of HTML files which can be copied to a server 
somewhere; that bit is the least critical.

The tagwatch architecture is really shite (everyone agrees, including 
those who have built it and run it) but it's the best we have at the 
moment. OSMDoc, being database based, is much better, but is "out of 
service" currently, and is not exactly resource-saving either. But the 
OSMDoc author has promised some updates for this month.

Suggestion: I could run tagwatch for USA, infrequently (say twice a 
month or so) on one of my machines for a while, and push the results to 
somewhere on the web. As soon as OSMDoc flies again, we (or more 
precisely you guys on the other side of the ocean) could try and find 
resources to run OSMDoc which will give you a much better user 
experience, more interactivity, and faster turnaround times.

Maybe one could even enlist the help of Aol for that, since they seem to 
be interested in helping US mappers.




