[OSM-dev] Binary OSM; the first pass encoder
Stefan de Konink
stefan at konink.de
Sun Nov 9 06:14:48 GMT 2008
Stefan de Konink wrote:
> As discussed before, it is possible to do a second-pass binary encoding
> with all strings in a distinct table, where the linked list can be
> recovered as an array from the storage. This would make
> a significant difference for the tag keys alone.
>
> In this case all string fields can be converted to unsigned long fields; for
> now 4G of distinct fields seems enough :)
I now have some more statistics:
The binary file is 418MB
The strings within the binary file: 224MB (\n terminated)
Number of lines: 29688795
This list deduplicated: 19MB
Number of lines: 2087179
So with some quick calculations:
418 - 224 + 90 + 19 =~ 303MB
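
For reference, the estimate can be reproduced with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the post does not say where the 90MB term comes from. One possible reading, which is an assumption and not something stated here, is roughly 3 bytes per string reference, since 2087179 distinct strings fit in 24 bits.

```python
def estimate_mb(binary_file, inline_strings, reference_overhead, dedup_table):
    """Size estimate: drop the inline strings, add references and the table."""
    return binary_file - inline_strings + reference_overhead + dedup_table


def reference_mb(string_count, bytes_per_ref):
    """Space for one fixed-width reference per string occurrence, in MB (10^6 bytes)."""
    return string_count * bytes_per_ref / 1_000_000


# Figures taken from the post, in MB.
print(estimate_mb(418, 224, 90, 19))

# ASSUMPTION: 3 bytes per reference is a guess at how the 90MB figure
# might have been obtained; the post itself does not explain it.
assert 2_087_179 < 2 ** 24
print(round(reference_mb(29_688_795, 3), 1))
```

With 4-byte references instead, the same calculation gives about 119MB for the reference term, so the total would land closer to 330MB.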
...now it would be nice to see how these values work out on the full
planet :) Nevertheless, 300MB of binary data directly usable in
any application, plus an index generated on demand, doesn't sound bad
for an entire country.
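
The second-pass idea from the quoted message (all strings in a distinct table, string fields replaced by unsigned integers) could look roughly like the sketch below. It is not the actual encoder: the function names and the on-disk layout (little-endian 32-bit references) are assumptions chosen for illustration.

```python
import struct


def build_string_table(strings):
    """Deduplicate strings; return (table, refs) where refs index into table."""
    table = []      # distinct strings, in order of first appearance
    index_of = {}   # string -> position in table
    refs = []       # one table index per input string
    for s in strings:
        idx = index_of.get(s)
        if idx is None:
            idx = len(table)
            index_of[s] = idx
            table.append(s)
        refs.append(idx)
    return table, refs


def pack_refs(refs):
    """Pack references as little-endian unsigned 32-bit integers (4 bytes each)."""
    return struct.pack(f"<{len(refs)}I", *refs)


def unpack_refs(data):
    """Inverse of pack_refs."""
    return list(struct.unpack(f"<{len(data) // 4}I", data))


keys = ["highway", "name", "highway", "residential", "name"]
table, refs = build_string_table(keys)
packed = pack_refs(refs)
# Reading back: every reference resolves to the original string.
restored = [table[i] for i in unpack_refs(packed)]
assert restored == keys
```

Because the table is a plain array, a reader can resolve a reference with a single index lookup, without walking a linked list.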
Stefan