[OSM-dev] distribution of IDs

Sun Nov 16 15:45:49 GMT 2008

On Sun, 2008-11-16 at 16:16 +0100, Marcus Wolschon wrote:
> Hello,
> 
> I am currently experimenting with different index formats for
> http://wiki.openstreetmap.org/wiki/User:MarcusWolschon%5Cosmbin_draft
> .
> My current index assumes that the list of encountered ID values is
> quite dense, i.e. there are fewer unused ID values than used ones.
> 
> Does this assumption hold true or are our ID-values more sparsely
> distributed in the world-file?
> 
> Currently I'm working with a Hamburg extract, and my current (pretty
> bad) index takes 3 times the disk space that the binary data
> does. I wonder if this ratio would be better for a world-file or if I
> need to optimize here.

The IDs are allocated sequentially. Any that are missing from the
planet dump belong to objects that have been deleted.

The ID stats from the osm2pgsql import of the latest planet dump are:

Node stats: total(278150661), max(311426557)
Way stats: total(22702734), max(28356734)
Relation stats: total(41545), max(50910)

i.e. approximately 278150661/311426557*100 = 89% of allocated node IDs
are still in use.
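
The same arithmetic can be repeated for ways and relations. The snippet
below is only a rough sketch: it assumes IDs start at 1 and that the
reported max is close to the highest ID ever allocated, and the name
density() is made up for this example.

```python
# (total, max) pairs copied from the osm2pgsql stats quoted above.
stats = {
    "node": (278150661, 311426557),
    "way": (22702734, 28356734),
    "relation": (41545, 50910),
}


def density(total, max_id):
    """Fraction of the ID range 1..max_id that is still in use.

    Assumption: IDs start at 1 and max_id approximates the highest
    ID ever allocated.
    """
    return total / max_id


for kind, (total, max_id) in stats.items():
    print(f"{kind}: {density(total, max_id):.1%} of allocated IDs in use")
```

By this estimate ways and relations come out somewhat sparser than nodes,
at roughly 80% and 82%, but all three are well above the "more used than
unused" threshold for the full planet.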

Obviously if you look at a small subset of the planet then the IDs will
appear much more sparsely populated. There may be a need for a different
index scheme for the two cases.
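
To illustrate what "a different index scheme for the two cases" could
look like, here is a minimal sketch. It is not the osmbin format; the
class names, choose_index() and the 0.5 threshold are all assumptions
made for this example, and it assumes positive, unique IDs. A dense ID
range gets a direct lookup table, while a sparse one (such as a city
extract) gets a sorted ID list with binary search.

```python
from bisect import bisect_left


class DenseIndex:
    """Direct table: slot i holds the file offset for ID i, or None."""

    def __init__(self, entries, max_id):
        self.slots = [None] * (max_id + 1)
        for obj_id, offset in entries:
            self.slots[obj_id] = offset

    def lookup(self, obj_id):
        if 0 <= obj_id < len(self.slots):
            return self.slots[obj_id]
        return None


class SparseIndex:
    """Sorted ID list with binary search; absent IDs cost no space."""

    def __init__(self, entries):
        pairs = sorted(entries)
        self.ids = [obj_id for obj_id, _ in pairs]
        self.offsets = [offset for _, offset in pairs]

    def lookup(self, obj_id):
        pos = bisect_left(self.ids, obj_id)
        if pos < len(self.ids) and self.ids[pos] == obj_id:
            return self.offsets[pos]
        return None


def choose_index(entries, threshold=0.5):
    """Pick an index type from the fraction of the ID range in use.

    The 0.5 default mirrors "fewer unused ID values than used ones";
    it is an illustrative choice, not a measured optimum.
    """
    entries = list(entries)
    if not entries:
        return SparseIndex(entries)
    max_id = max(obj_id for obj_id, _ in entries)
    if len(entries) / max(max_id, 1) >= threshold:
        return DenseIndex(entries, max_id)
    return SparseIndex(entries)
```

For example, choose_index([(1, 10), (2, 20), (3, 30)]) picks the dense
table, while choose_index([(5, 1), (1000000, 2)]) falls back to the
sorted list.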

	Jon