[OSM-dev] planet info

Fri Feb 29 09:56:28 GMT 2008

On Fri, Feb 29, 2008 at 9:23 AM, Dave Stubbs <osm.list at randomjunk.co.uk>
wrote:

> On Thu, Feb 28, 2008 at 10:47 PM, Jason Reid
> <osm at bowvalleytechnologies.com> wrote:
> >
> > David Earl wrote:
> >  > How feasible would it be to put a set of attributes, either on the
> >  > top-level element or on an element created for the purpose, that tell
> >  > me how many nodes, ways and relations there are in the file? If you have
> >  > the counts to hand at the beginning, great. If not, you could write
> >  > '... nodecount="000000000000" waycount="000000000000"
> >  > relationcount="000000000000"' at the beginning, count the elements as
> >  > you output them, and at the end seek back and replace the zeros with
> >  > the counts.
> >  >
> >  > This would enable me and others to do progress reporting while making a
> >  > pass through the file. (I can't do it by file size and read position
> >  > because PHP's filesize function won't report sizes above 2 GB, and I
> >  > can't count the elements before I start without completely
> >  > decompressing the file first, which I no longer have enough free disk
> >  > space to do.)
> >  >
> >  > David
> >  >
> >  > _______________________________________________
> >  > dev mailing list
> >  > dev at openstreetmap.org
> >  > http://lists.openstreetmap.org/cgi-bin/mailman/listinfo/dev
> >  >
> >  There is the planet statistics script that I wrote a while back (in
> >  Python) and still need to get around to putting into SVN. It currently
> >  counts only ways, not nodes or relations, but adding those wouldn't be
> >  hard (and it would give the script something to do, since 92.5% of the
> >  objects in the dump are nodes and it currently scans over them
> >  silently). It could be modified to sit between the output of the planet
> >  script and gzip and calculate the counts as the file is being
> >  compressed. The script uses a stream-consuming parser to read stdin (I
> >  currently pipe from bzcat) and could pass the stream back out on stdout
> >  unmodified.
> >
>
>
> I think if we wanted counting it would be simpler to just add it to
> the C code rather than pipe through another application which actually
> has the same limitations (no knowledge of counts at the start, and no
> seek).
>
> The other possibility would be to write to a whole sequence of files,
> all compressed, and just tar the results with a stats meta file to
> make a single file for download... most processors could be modified
> to read tarballs quite easily, and if not you could untar them first -
> it would basically be an OSM Jar but with choice of compression. Just
> a random thought... I'm sure you can think of many holes.
>
> Don't forget there's also
> http://www.openstreetmap.org/stats/data_stats.html -- if you just want
> a rough guess at the number of nodes/ways and you are dealing with a
> recent planet, then you could just scrape that to get the numbers.
>

There's also http://osmxapi.hypercube.telascience.org/total.xml. This is
XML, so it may be easier to handle than data_stats.
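
For what it's worth, here is a rough Python sketch of the placeholder idea
David describes above: write zero-padded counts up front, count the elements
while writing them, then seek back and overwrite the zeros. The attribute
names come from his example. Everything else (the function names
write_with_counts and read_counts, the simplified <osm> header, the
12-digit width) is made up for illustration and is not how the planet dump
code works. As Dave points out, this only works on a seekable, uncompressed
output, not on a pipe into gzip.

```python
import io
import re

WIDTH = 12  # digits per placeholder, matching the twelve zeros in the example


def write_with_counts(out, elements):
    """Write (kind, xml) pairs to a seekable binary stream, then patch counts.

    `out` must support tell() and seek(); `kind` is "node", "way" or "relation".
    """
    counts = {"node": 0, "way": 0, "relation": 0}
    offsets = {}

    # Hypothetical, simplified header: placeholders of fixed width.
    out.write(b'<?xml version="1.0" encoding="UTF-8"?>\n<osm')
    for kind in counts:
        out.write(b" " + kind.encode("ascii") + b'count="')
        offsets[kind] = out.tell()  # remember where the zeros start
        out.write(b"0" * WIDTH + b'"')
    out.write(b">\n")

    # Stream the elements out, counting as we go.
    for kind, xml in elements:
        counts[kind] += 1
        out.write(xml.encode("utf-8") + b"\n")
    out.write(b"</osm>\n")
    end = out.tell()

    # Seek back and replace the zeros; the width is fixed, so nothing shifts.
    for kind, n in counts.items():
        digits = str(n).zfill(WIDTH).encode("ascii")
        if len(digits) != WIDTH:
            raise ValueError("count does not fit in the placeholder")
        out.seek(offsets[kind])
        out.write(digits)
    out.seek(end)
    return counts


def read_counts(stream):
    """Read the counts back from the first few hundred bytes of the file."""
    head = stream.read(512)
    found = re.findall(rb'(node|way|relation)count="(\d+)"', head)
    return {kind.decode("ascii"): int(value) for kind, value in found}
```

A reader would call read_counts once before its main pass and use the totals
as the denominator for progress reporting. If the file has to be written
through a compressor, the counts would need to go somewhere else (a sidecar
stats file, for example), since the compressed stream can't be patched
afterwards.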

>
> Dave
>