[OSM-dev] GPX for the future

Richard Fairhurst richard at systemeD.net
Wed Sep 5 13:24:27 BST 2007

Someone at Microsoft did a talk at Where 2.0 called "What to do with  
thousands of GPS tracks"  
(http://conferences.oreillynet.com/cs/where2007/view/e_sess/13408), to  
which my first thought is "merely thousands?".

So GPS tracks on OSM are currently stored in two ways:

- as files on the server (accessible as, say,  
- as points in the db

The former isn't causing a problem AFAICS: storage isn't an issue. The  
latter may be, and is likely to get worse.

We have probably all exacerbated this by being super-conscientious.
Received opinion within OSM is that you set your GPS to 1 point/sec
where possible. That makes for lovely-looking traces, but it also
means that the amount of redundancy in the GPS database is massive,
and not trivial to eliminate.

If we take the position that the files form the "complete" record, and  
the db forms the delivery mechanism to users, then we can look at  
processing the data to make it more efficient.

The obvious way to do this is to simplify on import. In other words,  
the full tracklog is still stored as a file, but surplus info is  
removed from the database: so if you have a straight line

     .   .   .   .

then the middle two points are redundant and can be removed.

Douglas-Peucker is the standard polyline simplification algorithm but,  
as a recursive algorithm, is pretty processor-intensive. But there are  
simpler ways of doing it, e.g. iterate over each point and keep a note  
of the 'heading', and only store a point when it diverges by n  
degrees. We would obviously want to keep n very low so that fidelity
is still retained for tracing, and at the same time include a minimum
time threshold (so tracks that already average only one point every
10s, for example, aren't simplified any further).
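
As a rough illustration of the heading idea (a sketch only, not a
proposal for the actual import code), something like the following
would do it. The function names, the (lat, lon, seconds) tuple layout
and the default values for n and the time threshold are all made up
for the example, and the heading is a flat-earth approximation that is
only reasonable over the short distances between consecutive GPS
points. Repeated identical positions (a stationary receiver) are not
treated specially.

```python
import math

def heading(p, q):
    """Approximate heading in degrees (0 = north, 90 = east) from p to q.

    Points are hypothetical (lat, lon, seconds) tuples. This uses a
    flat-earth approximation, scaling longitude by cos(latitude).
    """
    dlat = q[0] - p[0]
    dlon = (q[1] - p[1]) * math.cos(math.radians((p[0] + q[0]) / 2))
    return math.degrees(math.atan2(dlon, dlat)) % 360

def angle_diff(a, b):
    """Smallest absolute difference between two headings, in degrees."""
    d = abs(a - b) % 360
    return min(d, 360 - d)

def simplify(points, n_degrees=2.0, min_avg_interval=10.0):
    """Drop points that do not change the heading by more than n_degrees.

    Tracks whose average spacing is already min_avg_interval seconds or
    more are returned untouched. Both defaults are illustrative guesses.
    """
    if len(points) < 3:
        return list(points)

    # Minimum time threshold: leave sparse tracks alone.
    avg_interval = (points[-1][2] - points[0][2]) / (len(points) - 1)
    if avg_interval >= min_avg_interval:
        return list(points)

    kept = [points[0]]
    # Heading noted at the last stored point; compared against every
    # later segment so that slow, cumulative drift is still caught.
    ref = heading(points[0], points[1])
    for i in range(1, len(points) - 1):
        h = heading(points[i], points[i + 1])
        if angle_diff(ref, h) > n_degrees:
            kept.append(points[i])  # the track turns here, so store it
            ref = h
    kept.append(points[-1])  # always keep the end of the track
    return kept

# Four points due east at 1 point/sec, then a turn to the north.
track = [
    (51.0, -1.000, 0),
    (51.0, -0.999, 1),
    (51.0, -0.998, 2),
    (51.0, -0.997, 3),
    (51.001, -0.997, 4),
    (51.002, -0.997, 5),
]
simplified = simplify(track)
```

This is a single pass over the points with no recursion, which is the
attraction compared with Douglas-Peucker. The trade-off is that it
only looks at direction, not at how far a dropped point lies from the
simplified line, so n would need to stay small.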

Because we still have the data stored in files, it doesn't stop us  
from doing funky stuff (e.g. calculating average speeds for a given  
road) in the future if we want to. It just makes the delivery faster  
for our main purpose right now.

Am I smoking crack or would this help?

