<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">And we have changed the PBF format before</blockquote><div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
and are in the process of changing it again, so it is not such a big deal to<br>
add support for these things later if they are actually needed.<br></blockquote><div><br></div><div>One of my goals was to reduce breaking changes, or cases where a program thinks it can read a file but actually can't (e.g., history dumps).</div>
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">
<br>
</div>I use ints internally in Osmium for the lon/lat as does PBF. But there is this<br>
conversion in there and depending on the granularity factor I am not sure I can<br>
actually do that using just integers. I don't want to use doubles though. </blockquote><div><br></div><div>All units in PBF are in nano-degrees, so you can always use longs to do your calculation, as long as you do the right casts so that the arithmetic is done in longs instead of possibly overflowing ints.</div>
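<div><br></div><div>For example, a quick Java-style sketch (the variable names here are just for illustration, not from any particular API):</div><div><br></div>
<div> int granularity = 100; // nanodegrees per coordinate unit, from the PrimitiveBlock</div>
<div> long lat_offset = 0; // nanodegrees, from the PrimitiveBlock</div>
<div> int lat = 507000000; // delta-decoded coordinate from the file, i.e. 50.7 degrees at the default granularity</div>
<div> long nano_lat = lat_offset + (long) granularity * lat; // cast before multiplying; granularity*lat on its own would overflow a 32-bit int</div>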
<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">So<br>
this might break on some granularity factors, I don't know and I never tested<br>
it. I actually use a int to double conversion before the factor is applied and<br>
later convert back to int. And in the usual case for OSM I don't do this double<br>
conversion at all, I just use the int as is because it has the right<br>
granularity factor anyway. This extra check (one if that can be perfectly<br>
branch predicted because it never changes) makes the reading of the whole PBF<br>
file about 1% faster! double/int-conversions are slow. So even this seemingly<br>
small thing mean I spent too much time thinking about it and writing code I am<br>
not sure is perfectly right. :-(<br></blockquote><div><br></div><div>Reading a PBF file into code that uses 32-bit integers to represent latitudes and longitudes is probably safe on all current PBF files, but is a potentially lossy operation; a latitude in a 32-bit integer is only precise to 100 nanodegrees (+/-180 degrees is 1.8*10^11 nanodegrees, which doesn't fit in 32 bits, while 1.8*10^9 units of 100 nanodegrees does). If the PBF file happens to have measurements precise to 1 nanodegree, you necessarily lose 2 digits of precision.</div>
<div><br></div><div>Here is an alternative formula that requires only integer arithmetic, goes from a PBF-encoded value to a 32-bit integer, and is correct for any granularity.</div><div><br></div><div> long lat = .... // Latitude as encoded in the PBF. The type must be a 64-bit int to avoid overflow in the calculation.</div>
<div> latitude_int = ((lat_offset + granularity*lat)/50+1)/2 // This calculation must be done with 64-bit longs.</div><div><br></div><div>This formula will be correct for any granularity and lat_offset. The reason for the (x/50 + 1)/2 instead of x/100 is to get better round-off behavior; it rounds to nearest instead of rounding toward zero. <a href="http://en.wikipedia.org/wiki/Rounding">http://en.wikipedia.org/wiki/Rounding</a></div>
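<div><br></div><div>Written out as a small Java method (the method name is my own, and it assumes C/Java-style integer division, which truncates toward zero):</div><div><br></div>
<div> // Convert a PBF-encoded latitude to a 32-bit int in units of 100 nanodegrees.</div>
<div> // lat_offset and granularity come from the PrimitiveBlock; lat is the delta-decoded coordinate.</div>
<div> static int latitudeInt(long lat_offset, long granularity, long lat) {</div>
<div>   long nano = lat_offset + granularity*lat; // latitude in nanodegrees, needs 64 bits</div>
<div>   return (int) ((nano/50 + 1)/2); // the (x/50 + 1)/2 rounding trick from above</div>
<div> }</div>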
<div><br></div><div>If the granularity is 100, or any multiple of 100 (e.g., 200, 1000, 10000, 700), you can simplify the above formula into:</div><div> int lat = .... // This can be a 32-bit int without overflow.</div>
<div> latitude_int = (lat_offset/50+1)/2 + (granularity/100)*lat // This calculation can be done using 32-bit ints.</div><div><br></div><div>I don't want to put these formulas in the spec, because they are only (least-lossy) approximations of the lossless formulas in the specification.</div>
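<div><br></div><div>Still, as a concrete illustration, the specialized case could look like this in Java (names are mine; this also assumes lat_offset itself fits in a 32-bit int):</div><div><br></div>
<div> // Granularity is a multiple of 100, so everything stays within 32-bit ints.</div>
<div> static int latitudeInt100(int lat_offset, int granularity, int lat) {</div>
<div>   return (lat_offset/50 + 1)/2 + (granularity/100)*lat;</div>
<div> }</div>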
<div><br></div><div>Scott</div><div><br></div></div>