<div class="gmail_quote">On Wed, Jan 28, 2009 at 11:51 AM, Brett Henderson <span dir="ltr">&lt;<a href="mailto:brett@bretth.com">brett@bretth.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

<div class="Ih2E3d">Frederik Ramm wrote:<br>

> Will the planet dump and/or diffs be extended so that they contain all<br>

> changesets too, or what should be the preferred mode of operation for a<br>

> third-party application that wants to track changesets? The only way I<br>

> can currently think of would be looking at diffs to find out which<br>

> changeset IDs were active and then download them individually. Have I<br>

> overlooked something?<br>

><br>

</div>Replication of changesets is on my undocumented long term TODO list for<br>

osmosis (I should add it to trac) but I don't know when I'll be able to<br>

do it.  I had some discussions with Shaun and Matt a while back on how<br>

this might be done efficiently.  Identifying changesets for replication<br>

is a bit tricky and would probably involve two passes: the first pass would<br>

identify all changesets created in a time interval, and the second pass<br>

would identify all changesets modified (i.e. that have entities referring to<br>

them) in a time interval.  Once identified they could be read and<br>

included in a changeset file just like any other entity.<br>
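<br>
The two passes could be sketched roughly as follows in Python. This is only an illustration under assumptions: the function name identify_changesets and the tuple shapes are hypothetical stand-ins for database rows, not anything osmosis implements.<br>
<pre>
```python
def identify_changesets(changesets, entities, start, end):
    """Return ids of changesets created or modified in [start, end).

    changesets: iterable of (changeset_id, created_at) pairs
    entities:   iterable of (changeset_id, timestamp) pairs, one per entity version
    Both shapes are hypothetical; timestamps only need to be comparable.
    """
    # Pass 1: changesets created inside the interval.
    created = {cid for cid, created_at in changesets if start <= created_at < end}
    # Pass 2: changesets referred to by entity versions written inside the interval.
    modified = {cid for cid, ts in entities if start <= ts < end}
    # Note that both id sets are held in memory before anything is read back.
    return sorted(created | modified)
```
</pre>
The returned ids would then still have to be fetched, ideally in batches rather than one by one.<br>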

</blockquote><div><br>Perhaps you only need to worry about exporting the changeset headers.  The members of a changeset can be derived from the elements themselves.<br><br>An element such as &lt;node id='456' changeset='123'&gt; in the current diff files carries enough information to build a table of all changeset members.  The only thing missing is the changeset header.<br>
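<br>
A minimal Python sketch of that derivation, assuming the diff is an osmChange document whose node, way and relation elements all carry a changeset attribute as in the example above. The function name changeset_members is made up for illustration and is not part of osmosis.<br>
<pre>
```python
import xml.etree.ElementTree as ET
from collections import defaultdict

ENTITY_TAGS = {"node", "way", "relation"}

def changeset_members(osc_source):
    """Map changeset id to the set of (entity type, entity id) pairs seen in a diff.

    osc_source is a filename or a binary file-like object holding the diff.
    """
    members = defaultdict(set)
    # Stream the file so the whole diff never has to be parsed up front.
    for _event, elem in ET.iterparse(osc_source, events=("end",)):
        if elem.tag in ENTITY_TAGS and "changeset" in elem.attrib:
            changeset_id = int(elem.get("changeset"))
            members[changeset_id].add((elem.tag, int(elem.get("id"))))
            elem.clear()  # drop tags and member references already processed
    return dict(members)
```
</pre>
This gives the membership table only; the header fields (user, timestamps, bounding box, tags) would still have to come from somewhere else.<br>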

<br> </div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br>

The reason they're more difficult than other entities is twofold.<br>

1. They don't have a history table so modifications are much harder to<br>

detect.<br>

2. The bounding box information attached to each cannot be computed by<br>

osmosis in a "streamy" fashion, which means they have to be re-replicated<br>

if they change.<br>

<br>

The two stages of changeset identification cause other problems such as<br>

being able to retrieve changesets after identification in an efficient<br>

manner (i.e. not one by one), and potentially having to store identifiers<br>

in memory, which is problematic, at least theoretically, if long-interval<br>

changesets are being extracted (not likely to be an issue for the<br>

current daily changesets).<br>

<br>

Bit of a brain dump there, but those are some of the reasons I'm holding<br>

off until I have a decent amount of time to invest in it.<br>

<br>

However, if replicating into a database, osmosis will "invent"<br>

changesets.  It will create one changeset per user per changeset<br>

interval which will approximate the real thing.  It won't help if you<br>

need the real changeset id though.<br>
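<br>
For illustration, the "one changeset per user per interval" approximation might look something like this in Python. It is a sketch of the idea only, not the osmosis code; the function name, the input shape and the first_id parameter are all assumptions.<br>
<pre>
```python
def assign_synthetic_changesets(entities, first_id=1):
    """Assign one made-up changeset id per user within a single interval.

    entities: iterable of (entity_key, user) pairs taken from one interval.
    Returns (assignments, changeset_by_user).
    """
    changeset_by_user = {}
    assignments = []
    next_id = first_id
    for entity_key, user in entities:
        # The first entity seen for a user opens that user's changeset.
        if user not in changeset_by_user:
            changeset_by_user[user] = next_id
            next_id += 1
        assignments.append((entity_key, changeset_by_user[user]))
    return assignments, changeset_by_user
```
</pre>
The ids produced this way are local inventions and bear no relation to the ids on the server.<br>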

<font color="#888888"><br>

Brett<br>

</font><div><div></div><div class="Wj3C7c"><br>

<br>

_______________________________________________<br>

dev mailing list<br>

<a href="mailto:dev@openstreetmap.org">dev@openstreetmap.org</a><br>

<a href="http://lists.openstreetmap.org/listinfo/dev" target="_blank">http://lists.openstreetmap.org/listinfo/dev</a><br>

</div></div></blockquote></div><br>