[OSM-talk] OSM data, how can we contribute to keep it to a reasonable size?

Sun Jan 21 08:21:36 UTC 2018

On 19.01.18 01:53, Paul Norman wrote:
> Speaking as a developer and frequent consumer of OSM data, don't do 
> any of these things to save space.
>
> Instead, worry about ease of editing. If a few meter long way has 
> hundreds of nodes, that's a problem for editing, and should be fixed; 
> a mass of unnecessary import-sourced tags confuses people, don't use 
> them; and overlapping landuse with lots of multipolygons is difficult 
> to edit, so should be avoided. Following these behaviors will slightly 
> reduce data size, but the point is keeping the map maintainable.
>
> It's also difficult to say what will affect size without a detailed 
> understanding of the format and how it's processed. Size is also not 
> the only indicator of time to process - for various reasons, relations 
> are much slower to work with than ways with most data consumption 
> workflows.

I have come across multipolygons which are hard to understand. It is the 
same as with program code: it should first of all be readable, otherwise 
it becomes hard to maintain once the original developer is gone (the 
bus factor [1]).

Certainly, throwing hardware at the problem is a perfectly respectable 
solution. However, if we look at the power consumption of modern 
processors, it ceases to look perfect. For example, the Intel i7-8700K 
3.7 GHz 8th generation processor is rated at 95 W [2]; add to this the 
fans, hard disks, etc., and you end up with a 400 W power supply unit. 
This has not changed since, for instance, the 4th generation processors, 
which were also rated at 60 - 90 W [3].

For comparison, the Raspberry Pi 3 single-board computer requires only 
a 10 W power supply [4], 40 times less than that 400 W unit. And this 
single-board computer with the GUI-less Raspbian Lite OS is still capable 
of running an Apache2 & database web server for a simple site with light 
traffic.
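
To make the comparison concrete, here is a rough sketch of the 
arithmetic. It uses only the two power supply ratings quoted above; the 
assumption that both machines run around the clock at their full rating 
is mine, and real consumption will be lower for both, so treat the 
numbers as upper bounds rather than measurements.

```python
# Power supply ratings quoted in this message (nameplate values,
# not measured consumption).
DESKTOP_PSU_W = 400  # desktop power supply unit
PI_PSU_W = 10        # Raspberry Pi 3 power supply

# Assumption for illustration only: the machine runs 24/7 at full rating.
HOURS_PER_YEAR = 24 * 365


def annual_kwh(watts: float, hours: float = HOURS_PER_YEAR) -> float:
    """Energy in kWh for a constant load of `watts` over `hours`."""
    return watts * hours / 1000


ratio = DESKTOP_PSU_W / PI_PSU_W
desktop_kwh = annual_kwh(DESKTOP_PSU_W)
pi_kwh = annual_kwh(PI_PSU_W)

print(f"ratio: {ratio:.0f}x")
print(f"desktop: {desktop_kwh:.0f} kWh/year, Pi: {pi_kwh:.1f} kWh/year")
```

Under these assumptions the factor of 40 carries over directly to the 
yearly energy use, whatever the actual duty cycle turns out to be, as 
long as it is the same for both machines.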

So keeping the map maintainable, correcting polygon and tagging errors, 
and avoiding unnecessary import-sourced tags and nodes works like a 
precalculation: it is done only once, and it reduces waste every time 
the data is processed afterwards.

[1] https://en.wikipedia.org/wiki/Bus_factor

[2] https://www.brack.ch/intel-cpu-core-i7-8700k-3-616011

[3] https://www.brack.ch/intel-cpu-core-i7-4790k-4-307846

[4] https://www.raspberrypi.org/products/raspberry-pi-3-model-b/

Best regards,

Oleksiy