[OSM-dev] planet info

Dave Stubbs osm.list at randomjunk.co.uk
Fri Feb 29 09:23:28 GMT 2008

On Thu, Feb 28, 2008 at 10:47 PM, Jason Reid
<osm at bowvalleytechnologies.com> wrote:
> David Earl wrote:
>  > How feasible would it be to put a set of attributes, either on the
>  > top-level element or on an element created for the purpose, that tells
>  > me how many nodes, ways and relations there are in the file? If you have
>  > the counts to hand at the beginning, great. If not, you could write '...
>  > nodecount="000000000000" waycount="000000000000"
>  > relationcount="000000000000"' at the beginning, count the elements as
>  > you output them, and at the end seek back and replace the zeros with
>  > the counts.
>  >
>  > This would enable me and others to do progress reporting while making a
>  > pass through the file. (I can't do it by file size and read position
>  > because the filesize function won't go bigger than 2 GB in PHP, and I
>  > can't count the elements before I start without completely decompressing
>  > the file first, which I no longer have enough free disk space to do.)
>  >
>  > David
>  >
>  There is the planet statistics script that I wrote a while back (in
>  Python) that I need to get around to popping into SVN. It doesn't count
>  nodes or relations currently, only ways, but that wouldn't be hard to add
>  (plus it would give it something to do, since 92.5% of the objects in the
>  dump are nodes and it currently scans over them silently). It could be
>  modified to sit between the output of the planet script and gzip and
>  count as the file is being compressed (the script uses a stream-consuming
>  parser to read stdin, in my uses piping from bzcat currently, and could
>  pass the stream back out on stdout unmodified).
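
To make the pass-through idea concrete, here is a minimal sketch in Python.
It is not Jason's script (which isn't in SVN yet); the function name
count_and_pass_through and the chunk size are assumptions for illustration.
It feeds each chunk to the standard library's expat parser and writes the
same bytes straight back out, so the stream itself is not modified.

```python
import xml.parsers.expat


def count_and_pass_through(src, dst, chunk_size=64 * 1024):
    """Copy src to dst unchanged while counting OSM elements.

    src and dst are binary file-like objects (hypothetical usage:
    sys.stdin.buffer and sys.stdout.buffer).
    """
    counts = {"node": 0, "way": 0, "relation": 0}

    def start_element(name, attrs):
        # Only top-level OSM primitives are counted; <tag>, <nd> and
        # <member> children are ignored.
        if name in counts:
            counts[name] += 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element

    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        parser.Parse(chunk, False)  # incremental parse, more data to come
        dst.write(chunk)            # forward the bytes untouched
    parser.Parse(b"", True)         # signal end of document
    return counts
```

In a pipeline this would be called with sys.stdin.buffer and
sys.stdout.buffer, with the counts reported on stderr once the input ends.
Note that the counts are still only known at the end of the stream.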

I think if we wanted counting it would be simpler to just add it to
the C code rather than pipe through another application which actually
has the same limitations (no knowledge of counts at the start, and no

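For reference, the fixed-width placeholder idea from the original request
could look roughly like this. It is a Python sketch, not the planet dump's
C code; write_with_counts and its elements argument are hypothetical names.
It assumes the output is a plain, seekable, uncompressed file, since bytes
already written to a gzip or bzip2 stream cannot be patched in place.

```python
PLACEHOLDER_WIDTH = 12  # twelve zeros, as in the proposal


def write_with_counts(path, elements):
    """Write an OSM-like file, then seek back and fill in the counts.

    elements is an iterable of (kind, xml_line) pairs, where kind is
    "node", "way" or "relation" (a made-up input shape for this sketch).
    """
    counts = {"node": 0, "way": 0, "relation": 0}
    offsets = {}
    with open(path, "wb") as out:
        out.write(b'<?xml version="1.0" encoding="UTF-8"?>\n<osm')
        for kind in counts:
            out.write(f' {kind}count="'.encode("ascii"))
            offsets[kind] = out.tell()  # remember where the digits start
            out.write(b"0" * PLACEHOLDER_WIDTH + b'"')
        out.write(b">\n")

        for kind, xml_line in elements:
            counts[kind] += 1
            out.write(xml_line.encode("utf-8") + b"\n")
        out.write(b"</osm>\n")

        # Overwrite each placeholder with a zero-padded count of the
        # same width, so no other bytes in the file move.
        for kind, offset in offsets.items():
            out.seek(offset)
            out.write(str(counts[kind]).zfill(PLACEHOLDER_WIDTH).encode("ascii"))
    return counts
```

The fixed width is what makes the in-place overwrite safe: the patched
header is exactly as long as the placeholder header.
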
The other possibility would be to write to a whole sequence of files,
all compressed, and just tar the results with a stats meta file to
make a single file for download... most processors could be modified
to read tarballs quite easily, and if not you could untar them first -
it would basically be an OSM Jar but with choice of compression. Just
a random thought... I'm sure you can think of many holes.
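
A rough sketch of what I mean, in Python using the standard tarfile module.
The member names (stats.json, nodes.osm.gz and so on), the JSON stats format
and the function names are all made up for illustration. Everything is held
in memory here, which is fine for a toy example but not for a real planet,
where you would write the compressed parts to disk first.

```python
import gzip
import io
import json
import tarfile


def build_bundle(bundle_path, parts):
    """Write a tar with a stats file followed by compressed parts.

    parts maps a kind ("node", "way", "relation") to a list of XML lines.
    """
    stats = {}
    members = []
    for kind, lines in parts.items():
        stats[kind] = len(lines)
        payload = gzip.compress("\n".join(lines).encode("utf-8"))
        members.append((f"{kind}s.osm.gz", payload))

    # The outer tar is uncompressed; each data member carries its own
    # compression. The stats file goes first so a reader sees the counts
    # before any data.
    meta = json.dumps(stats).encode("utf-8")
    with tarfile.open(bundle_path, "w") as tar:
        for name, payload in [("stats.json", meta)] + members:
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return stats


def read_stats(bundle_path):
    """Read only the first member (the stats file) from the bundle."""
    with tarfile.open(bundle_path, "r") as tar:
        first = tar.next()
        return json.load(tar.extractfile(first))
```

A processor could call read_stats up front for its progress reporting and
then stream the remaining members in order.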

Don't forget there's also
http://www.openstreetmap.org/stats/data_stats.html -- if you just want
a rough guess at the number of nodes/ways and you are dealing with a
recent planet, then you could just scrape that to get the numbers.

