[talk-au] Running stats against GPX files ...

Tue Jun 23 14:43:13 BST 2009

--- On Tue, 23/6/09, Graeme Wilson <wanderer55 at live.com.au> wrote:
> I have ideas about trimming out unnecessary points in GPX
> files. I wrote an elementary program to look at positions,
> and if there were more than two positions the same, then it
> meant that the car was stationary for more than two seconds.
> The program copies the data from an existing file to a new
> file but without the stationary data. It has two
> 'bugs' in it. The first two lines taken from the
> original file got corrupted, and if the car was stationary
> and the gps decided that the position moved through jitter,
> then the program would think the car had moved and then it
> would 'undo' and the process had to start over
> again. 

The app shouldn't record a point unless the person has moved 5 m or more, although to compensate for GPS drift it might be best to set this threshold equal to, or close to, the HDOP.

Even though the app collecting the data is distance-dependent, it will record once per second if the speed is > 5 m/s, since each one-second fix is then more than 5 m from the previous one.
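
As a rough sketch of the distance threshold described above (not the actual code from my app), something like the following would drop points that haven't moved far enough from the last kept point. The names haversine_m, filter_by_distance and min_move_m are made up for illustration, and the Earth radius is the usual spherical approximation.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; a spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon pairs in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def filter_by_distance(points, min_move_m=5.0):
    """Keep a point only if it is at least min_move_m from the last kept point.

    points is a list of (lat, lon) tuples. min_move_m is a hypothetical
    parameter: 5 m here, but it could be derived from the reported HDOP.
    """
    kept = []
    for lat, lon in points:
        if not kept or haversine_m(kept[-1][0], kept[-1][1], lat, lon) >= min_move_m:
            kept.append((lat, lon))
    return kept
```

Because each point is compared against the last kept point rather than the last seen one, jitter around a stationary position stays below the threshold and never gets recorded.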

> Another idea I have, but haven't got my head around
> yet, is to look for straight roads. It's just an idea at
> this stage, but it revolves around Pythagoras. If you have a
> straight road, then all the points will have the same slope,
> ie lat divided by lon. Once a section of road has been
> decided as straight, then only two points need to be used to
> designate it.

In my app, rather than post-processing, I can work out the bearing from the previous point; if the bearing doesn't vary by more than, say, 10 degrees, I could filter the point out. I suspect, though, that the angle used to filter would need to depend on the speed or the distance between points.
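
Here is one way the bearing idea might look, as a sketch only. It works on a list of (lat, lon) tuples rather than a live stream, and the function names and the max_turn_deg parameter are hypothetical. The fixed 10 degree default is just the figure mentioned above; it doesn't try to scale with speed or distance.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees 0..360."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dl = math.radians(lon2 - lon1)
    y = math.sin(dl) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dl)
    return math.degrees(math.atan2(y, x)) % 360.0


def angle_diff_deg(a, b):
    """Smallest absolute difference between two bearings, in degrees 0..180."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def filter_by_bearing(points, max_turn_deg=10.0):
    """Drop intermediate points where the track turns by max_turn_deg or less.

    The first and last points are always kept. The incoming bearing is
    measured from the last kept point, so a slow curve still produces
    a kept point once the accumulated turn exceeds the threshold.
    """
    if len(points) < 3:
        return list(points)
    kept = [points[0]]
    for cur, nxt in zip(points[1:], points[2:]):
        incoming = bearing_deg(*kept[-1], *cur)
        outgoing = bearing_deg(*cur, *nxt)
        if angle_diff_deg(incoming, outgoing) > max_turn_deg:
            kept.append(cur)
    kept.append(points[-1])
    return kept
```

A straight run heading north followed by a turn east would come out as just the start, the corner and the end.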

> Another idea is to have our own proprietary file format
> called .osm. I reckon that we could have several lines of
> text at the start to declare things like program version,
> who made the file, the start and end time of the actual
> recording for copyright purposes, the number of lines of
> data in the file, and then the lat and lon to 7 decimal
> points accuracy.

I can record all the useful point information in 17 bytes per point, using 1- and 4-byte integers: 4 bytes for lat, 4 bytes for lon, 1 byte for HDOP, 4 bytes for elevation and 4 bytes for time. In fact, this is what I'm doing to reduce the amount of data uploaded for GPS traces.
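
To illustrate the 17-byte layout, here is a sketch using Python's struct module. The field order and sizes follow the description above, but the scaling is an assumption made for the example: lat/lon as degrees times 1e7, HDOP in tenths, elevation in centimetres and time as Unix seconds. The byte order (little-endian) and the names POINT, pack_point and unpack_point are likewise made up.

```python
import struct

# "<" = little-endian with no padding: int32, int32, uint8, int32, uint32
POINT = struct.Struct("<iiBiI")


def pack_point(lat, lon, hdop, ele_m, unix_time):
    """Pack one track point into 17 bytes (scaling factors are assumptions)."""
    return POINT.pack(
        round(lat * 1e7),            # degrees * 1e7 fits in a signed 32-bit int
        round(lon * 1e7),
        min(255, round(hdop * 10)),  # tenths, clamped to one unsigned byte
        round(ele_m * 100),          # centimetres
        unix_time,                   # seconds since the Unix epoch
    )


def unpack_point(data):
    """Reverse of pack_point: returns (lat, lon, hdop, ele_m, unix_time)."""
    lat, lon, hdop, ele, t = POINT.unpack(data)
    return lat / 1e7, lon / 1e7, hdop / 10, ele / 100, t
```

POINT.size comes out at 17, matching the 4 + 4 + 1 + 4 + 4 breakdown.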

That doesn't include any header information, such as which device or chipset was used to collect the data.

> I also think that the GPX file is too verbose with all the
> XML formatting, and reckon that it will fill up the servers
> with too much unnecessary stuff. I have been told that for
> copyright purposes, all the data etc that is uploaded is to
> be kept forever. Lot of redundant stuff there.

There is nothing wrong with some redundancy between tracks; it comes back to that lovely word "averages" that people keep abusing :)

Also, if you are zig-zagging while collecting data, you would likely cross the same point twice, if not more, so discarding data simply because it's at the same point wouldn't be a good idea.