[OSM-dev] GPX for the future
Richard Fairhurst
richard at systemeD.net
Wed Sep 5 13:24:27 BST 2007
Someone at Microsoft did a talk at Where 2.0 called "What to do with
thousands of GPS tracks"
(http://conferences.oreillynet.com/cs/where2007/view/e_sess/13408), to
which my first thought is "merely thousands?".
So GPS tracks on OSM are currently stored in two ways:
- as files on the server (accessible as, say,
http://www.openstreetmap.org/trace/37369/data)
- as points in the db
The former isn't causing a problem AFAICS: storage isn't an issue. The
latter may be, and is likely to get worse.
We have probably all exacerbated this by being super-conscientious.
Received opinion within OSM is that you set your GPS to 1 point/sec
where possible, which makes for lovely-looking traces but means that
the amount of redundancy in the GPS database is massive, and not
trivial to eliminate.
If we take the position that the files form the "complete" record, and
the db forms the delivery mechanism to users, then we can look at
processing the data to make it more efficient.
The obvious way to do this is to simplify on import. In other words,
the full tracklog is still stored as a file, but surplus info is
removed from the database: so if you have a straight line
. . . .
then the middle two points are redundant and can be removed.
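As a rough illustration of the straight-line case only (not a proposal for
the real import code), here is a minimal Python sketch. The function name
drop_collinear, the (x, y) tuple format and the tolerance parameter are all
made up for the example, and it assumes the points are already in a flat,
projected coordinate system rather than raw lat/lon.

```python
def drop_collinear(points, tolerance=1e-9):
    """Drop points lying on the straight line between their neighbours.

    points: list of (x, y) tuples in a flat coordinate system (an assumption
    for this sketch; real GPS data would need projecting first).
    """
    if len(points) < 3:
        return list(points)
    kept = [points[0]]
    for i in range(1, len(points) - 1):
        ax, ay = kept[-1]        # last point we decided to keep
        bx, by = points[i]       # candidate for removal
        cx, cy = points[i + 1]   # next point along the track
        # Twice the signed area of triangle (a, b, c); zero means collinear.
        cross = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
        if abs(cross) > tolerance:
            kept.append(points[i])
    kept.append(points[-1])      # always keep the end of the track
    return kept


line = [(0, 0), (1, 0), (2, 0), (3, 0)]
print(drop_collinear(line))  # [(0, 0), (3, 0)]
```

Note that the test is area-based, so the tolerance is not a distance, and a
track that doubles back along the same line would lose its turning point. A
real version would need to handle both.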
Douglas-Peucker is the standard polyline simplification algorithm but,
as a recursive algorithm, is pretty processor-intensive. There are
simpler ways of doing it, though: e.g. iterate over the points, keep a
note of the 'heading', and only store a point when the heading diverges
by more than n degrees. We would obviously want to keep n very low so
that fidelity is still retained for tracing, and at the same time
include a minimum time threshold (so tracks where the average interval
is already 10s, for example, aren't simplified any further).
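To make the heading idea concrete, here is a minimal Python sketch of one
way it might work. Everything named here is an assumption for illustration:
the (lat, lon, unix_time) tuple format, the function names, and the default
values of 2 degrees and 10 seconds. It compares each segment's bearing with
the bearing at the last stored point, so a gentle curve still triggers a
stored point once the turn has accumulated past the threshold.

```python
import math


def bearing(p, q):
    """Initial compass bearing in degrees from p to q; each is (lat, lon, ...)."""
    lat1, lon1 = math.radians(p[0]), math.radians(p[1])
    lat2, lon2 = math.radians(q[0]), math.radians(q[1])
    dlon = lon2 - lon1
    y = math.sin(dlon) * math.cos(lat2)
    x = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def heading_diff(a, b):
    """Smallest angle in degrees between two compass headings."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def simplify_by_heading(track, max_turn_deg=2.0, min_avg_interval_s=10.0):
    """track: list of (lat, lon, unix_time) tuples, in time order."""
    if len(track) < 3:
        return list(track)
    # Time threshold: leave tracks alone if they are already sparse.
    avg_interval = (track[-1][2] - track[0][2]) / (len(track) - 1)
    if avg_interval >= min_avg_interval_s:
        return list(track)
    kept = [track[0]]
    ref = bearing(track[0], track[1])   # heading at the last stored point
    for prev, cur in zip(track[1:], track[2:]):
        h = bearing(prev, cur)
        if heading_diff(h, ref) > max_turn_deg:
            kept.append(prev)           # store the point where the turn happens
            ref = h
    kept.append(track[-1])              # always keep the final point
    return kept


# Five 1-second points heading due north collapse to the two endpoints.
north = [(51.0 + i * 0.0001, -1.0, i) for i in range(5)]
print(len(simplify_by_heading(north)))  # 2
```

This is a single pass with no recursion, which is the attraction over
Douglas-Peucker. It does nothing clever about a stationary receiver, where
the jitter makes the heading meaningless, so that would want separate
handling.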
Because we still have the data stored in files, it doesn't stop us
from doing funky stuff (e.g. calculating average speeds for a given
road) in the future if we want to. It just makes the delivery faster
for our main purpose right now.
Am I smoking crack or would this help?
cheers
Richard