[OSM-dev] [OSM-talk] Thoughts on an enhanced GPX api

Ævar Arnfjörð Bjarmason avarab at gmail.com
Tue Jul 28 14:21:28 BST 2009


On Tue, Jul 28, 2009 at 11:15 AM, Tom Hughes<tom at compton.nu> wrote:
> On 28/07/09 11:33, Ævar Arnfjörð Bjarmason wrote:
>
>> On Tue, Jul 28, 2009 at 9:04 AM, René Affourtit<raffourtit at gmail.com>
>>  wrote:
>>
>> * All the data is losslessly inserted into the database
>>
>> This means that we can get waypoint/segment/time/ele/whatever data out
>> again. It would probably be simplest to do this by having additional
>> tables equivalent to the node/way tables where a GPX trkseg would be a
>> way, waypoints nodes and so on.
>
> Track segment information is already preserved, as is elevation data even
> though we never use it (one day I'll get around to removing it...).
>
>> * The data is versioned, and anyone can edit it
>>
>> I have a lot of GPX tracks that could be improved, e.g. by deleting
>> point clouds. I'd like to edit them using normal OSM tools, have those
>> edits versioned (so they can be rolled back), and have other users do
>> those fixes for me. Just like with the OSM data I upload.
>
> GPS data is one of our fundamental pieces of evidence that we've surveyed
> things - is that really compatible with allowing people to edit it? Does
> "editing" the GPS data really make any sense at all?

Users are already editing GPX tracks before they upload them. Or
deleting their tracks, editing them and then re-uploading them.
Facilitating this editing within OSM would add more information to
prove that we've surveyed things (since history would be kept), not
less.

> Maybe deletion of points makes sense, but I can't see that changing a point
> in any way should ever be allowed.

Not even to (as I suggested earlier in the thread) clarify the
waypoint description?

Anyway, I'm not suggesting that GPX tracks should be edited
willy-nilly. Just that allowing editing (which would be monitored!)
would be a cleaner and more general solution to the problem of
aggregated GPS crud than ad-hoc solutions like expiring tracks after a
given amount of time, or asking someone who's uploaded a lot of
point-cloud data in your area to delete his tracks (as was suggested
recently, on IRC I think).

>> * Users can download GPX traces:
>>
>> ** As a point cloud within a bbox
>>
>> Like now.
>>
>> ** As "all tracks within bbox"
>>
>> So that tracks can be distinguished (and hidden) and their metadata
>> read & edited.
>>
>> ** Using other methods
>>
>> E.g. "all tracks by user"
>
> Bearing in mind of course the privacy issues, at least with regard to legacy
> traces, including the question of privacy dilution if you make additional
> information available about the legacy public traces.

It would make sense not to serve a more detailed format than a point
cloud for any traces not marked public, yes.

(Aside from that, users need to be made more aware of what marking
things public/private or setting their location really means. I did my
small bit towards that by adding a link to
http://wiki.openstreetmap.org/wiki/Visibility_of_GPS_traces to the GPX
upload form.)

>> Then, instead of deleting traces they (or their segments/points) could
>> simply be tagged indicating their subjective quality using a free-form
>> tagging system. You could then just set your editor to ignore those
>> traces.
>>
>> Free-form tags could obviously be used for other purposes, e.g.
>> marking the trace as surveyed with a given GPS model.
>
> Traces already have free form tags which can be edited, although currently
> only by the person that uploaded them.

People can manually edit the GPX to add such metadata, but we don't
make it easy. Which of course means that nobody does it.

>> Implementing this would require new tables in the database, optional
>> changes to all editors (since they could keep using /trackpoints), and
>> new database tables to track GPX data and its history.
>
> Does it really need any new tables? I can't see why, unless you really want
> to pull track segments out into a separate table? What would be in there
> though other than the track ID and track segment ID - does a GPX file
> contain any information other than that about a segment?

GPX supports arbitrary tags. If we were to import that losslessly &
serve it to users via an API we'd need more than just the current
gps_points table (which only allows for a small subset of possible
tags).

To support arbitrary GPX tags we'd need gps_points and gps_point_tags
(modeled after node_tags), so that e.g. gps_points.altitude would be
removed and replaced by a row with gps_point_tags.k = ele.
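
A minimal sketch of that layout, using an in-memory SQLite database as a
stand-in for the real one. Only the names gps_points, gps_point_tags and
the k column come from the discussion; the other columns, the v column
and the helper functions are assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema: gps_points keeps only position, everything else
# (ele, time, hdop, ...) moves into free-form key/value rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gps_points (
        id        INTEGER PRIMARY KEY,
        trace_id  INTEGER NOT NULL,
        latitude  REAL NOT NULL,
        longitude REAL NOT NULL
    );
    CREATE TABLE gps_point_tags (
        point_id INTEGER NOT NULL REFERENCES gps_points(id),
        k        TEXT NOT NULL,
        v        TEXT NOT NULL,
        PRIMARY KEY (point_id, k)
    );
""")


def insert_point(conn, trace_id, lat, lon, tags):
    """Store a point and its GPX child elements as key/value tags."""
    cur = conn.execute(
        "INSERT INTO gps_points (trace_id, latitude, longitude) VALUES (?, ?, ?)",
        (trace_id, lat, lon),
    )
    point_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO gps_point_tags (point_id, k, v) VALUES (?, ?, ?)",
        [(point_id, k, v) for k, v in tags.items()],
    )
    return point_id


def point_tags(conn, point_id):
    """Return all tags of one point as a dict."""
    rows = conn.execute(
        "SELECT k, v FROM gps_point_tags WHERE point_id = ?", (point_id,)
    )
    return dict(rows.fetchall())


# Elevation is now just another tag rather than a dedicated column.
pid = insert_point(
    conn, 1, 64.1355, -21.8954,
    {"ele": "12.5", "time": "2009-07-28T13:21:28Z", "hdop": "1.2"},
)
```

Storing values as text mirrors how OSM tags work; a real implementation
might still want typed columns for the fields nearly every logger writes.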

Track segments could then be done by reusing the schema for
current_ways/way_tags. And if editing were to be supported,
corresponding history tables would need to be created as well.
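
A rough sketch of how a "current" table plus a history table could keep
every edit of a segment's tags, again with SQLite as a stand-in. All
table, column and function names here are hypothetical; they only
imitate the current_ways/way_tags split and are not the actual rails
port schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Latest version of each segment (compare current_ways).
    CREATE TABLE current_gps_segments (
        id      INTEGER PRIMARY KEY,
        version INTEGER NOT NULL
    );
    CREATE TABLE current_gps_segment_tags (
        segment_id INTEGER NOT NULL,
        k TEXT NOT NULL,
        v TEXT NOT NULL,
        PRIMARY KEY (segment_id, k)
    );
    -- Every version ever saved, so edits can be inspected or rolled back.
    CREATE TABLE gps_segment_tags (
        segment_id INTEGER NOT NULL,
        version    INTEGER NOT NULL,
        k TEXT NOT NULL,
        v TEXT NOT NULL,
        PRIMARY KEY (segment_id, version, k)
    );
""")


def save_segment(conn, segment_id, tags):
    """Write a new version of a segment's tags and return its version number."""
    row = conn.execute(
        "SELECT version FROM current_gps_segments WHERE id = ?", (segment_id,)
    ).fetchone()
    version = 1 if row is None else row[0] + 1
    with conn:  # commit all four statements together, or none of them
        conn.execute(
            "INSERT OR REPLACE INTO current_gps_segments (id, version) VALUES (?, ?)",
            (segment_id, version),
        )
        conn.execute(
            "DELETE FROM current_gps_segment_tags WHERE segment_id = ?",
            (segment_id,),
        )
        conn.executemany(
            "INSERT INTO current_gps_segment_tags VALUES (?, ?, ?)",
            [(segment_id, k, v) for k, v in tags.items()],
        )
        conn.executemany(
            "INSERT INTO gps_segment_tags VALUES (?, ?, ?, ?)",
            [(segment_id, version, k, v) for k, v in tags.items()],
        )
    return version


def tags_at(conn, segment_id, version):
    """Return the tags a segment had at a given version."""
    rows = conn.execute(
        "SELECT k, v FROM gps_segment_tags WHERE segment_id = ? AND version = ?",
        (segment_id, version),
    )
    return dict(rows.fetchall())


v1 = save_segment(conn, 7, {"quality": "good"})
v2 = save_segment(conn, 7, {"quality": "noisy", "device": "example logger"})
```

Here tags_at(conn, 7, v1) still returns the tags as first saved, which is
what a rollback would restore.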

> Waypoints are the other thing I guess. I have considered adding them in the
> past but never quite got around to it. There was a historic argument against
> adding them but I think that can largely be ignored to be honest.
>
>> How does this sound? I'm pretty happy with the 0.6 API except for the
>> GPS bits. I'd like to make GPX a first-class object in OSM and would
>> be willing to hack the rails port to make that happen (when I have
>> time). Is anyone else interested in being able to do what I've
>> described above?
>
> The API code is the easy bit - the performance and disk space issues will be
> the hard problems to solve.

The majority of the space imported GPX traces take up is in their track
points. Most GPS loggers seem to log only lat/lon/ele/time, so we
wouldn't be storing anything additional there.

What would be added is waypoints and other metadata, which (if the OSM
node/way schema is used) wouldn't take up more space than an
equivalently tagged collection of nodes or ways.

> If we dropped the (unused and largely useless) elevation field from the
> points table and added a deleted flag that would keep the disk usage
> basically stable.
>
> The "start point" in the trace table, which isn't very useful, could be
> replaced by a bounding box to allow bbox queries - that's something that I
> have been thinking about doing for a while.

> Performance issues will mainly come into play if you want to do anything
> that requires cross-checking the point cloud against the trace list to
> determine what user owns it and/or whether it is public or not.

Getting all traces within a given bbox shouldn't be more expensive
than getting all OSM ways within a bbox is now. But of course if you
had only the points that made up those ways and wanted to find out
what ways they belonged to, that would be more expensive.

So that probably shouldn't be supported at all.
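
The cheap direction, "all traces within bbox", can be sketched as a
rectangle-intersection test, assuming each trace stores its own bounding
box as suggested earlier in the thread. The Trace class and both function
names are hypothetical, and the sketch ignores boxes that cross the
antimeridian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trace:
    """A trace reduced to its id and stored bounding box (assumed fields)."""
    id: int
    min_lat: float
    min_lon: float
    max_lat: float
    max_lon: float


def intersects(trace, min_lat, min_lon, max_lat, max_lon):
    """True if the trace's bounding box overlaps the query box."""
    return (
        trace.min_lat <= max_lat
        and trace.max_lat >= min_lat
        and trace.min_lon <= max_lon
        and trace.max_lon >= min_lon
    )


def traces_in_bbox(traces, min_lat, min_lon, max_lat, max_lon):
    """Return the ids of all traces whose bounding box overlaps the query box."""
    return [
        t.id for t in traces
        if intersects(t, min_lat, min_lon, max_lat, max_lon)
    ]


traces = [
    Trace(1, 64.0, -22.0, 64.2, -21.8),
    Trace(2, 51.4, -0.2, 51.6, 0.0),
]
hits = traces_in_bbox(traces, 64.1, -21.9, 64.3, -21.7)
```

In a database the same four comparisons would run against indexed bbox
columns, so the lookup never has to touch the points table.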



