<div class="gmail_quote">On Wed, Jan 28, 2009 at 11:51 AM, Brett Henderson <span dir="ltr">&lt;<a href="mailto:brett@bretth.com">brett@bretth.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div class="Ih2E3d">Frederik Ramm wrote:<br>
> Will the planet dump and/or diffs be extended so that they contain all<br>
> changesets too, or what should be the preferred mode of operation for a<br>
> third-party application that wants to track changesets? The only way I<br>
> can currently think of would be looking at diffs to find out which<br>
> changeset IDs were active and then download them individually. Have I<br>
> overlooked something?<br>
><br>
</div>Replication of changesets is on my undocumented long-term TODO list for<br>
osmosis (I should add it to trac), but I don't know when I'll be able to<br>
do it. I had some discussions with Shaun and Matt a while back on how<br>
this might be done efficiently. Identifying changesets for replication<br>
is a bit tricky and would probably involve two passes: the first pass would<br>
identify all changesets created in a time interval, and the second pass<br>
would identify all changesets modified (i.e. those that have entities referring to<br>
them) in that time interval. Once identified, they could be read and<br>
included in a changeset file just like any other entity.<br>
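<br>
The two passes described above can be sketched roughly as follows. This is only an illustration over in-memory data, not osmosis code: the function name, the dict and list layouts, and the half-open interval are all assumptions made for the example.<br>

```python
from datetime import datetime

def changesets_to_replicate(changesets, entities, start, end):
    """Return the sorted ids of changesets created or modified in [start, end).

    changesets: dict mapping changeset id -> creation time (hypothetical layout)
    entities:   list of dicts with "changeset" and "timestamp" keys (hypothetical layout)
    """
    # Pass 1: changesets created inside the interval.
    created = {cs_id for cs_id, created_at in changesets.items()
               if start <= created_at < end}
    # Pass 2: changesets referred to by entities written inside the interval.
    modified = {e["changeset"] for e in entities
                if start <= e["timestamp"] < end}
    return sorted(created | modified)

changesets = {
    1: datetime(2009, 1, 27, 23, 0),
    2: datetime(2009, 1, 28, 1, 0),
    3: datetime(2009, 1, 28, 2, 0),
}
entities = [
    {"id": 456, "changeset": 1, "timestamp": datetime(2009, 1, 28, 3, 0)},
    {"id": 457, "changeset": 2, "timestamp": datetime(2009, 1, 28, 1, 5)},
    {"id": 458, "changeset": 4, "timestamp": datetime(2009, 1, 29, 0, 30)},
]
result = changesets_to_replicate(
    changesets, entities, datetime(2009, 1, 28), datetime(2009, 1, 29))
```

Changeset 1 is picked up only by the second pass, because it was created before the interval but an entity referring to it was written inside it.<br>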
</blockquote><div><br>Perhaps you only need to worry about exporting the changeset headers. The members of a changeset can be derived from the elements themselves.<br><br>&lt;node id='456' changeset='123'&gt; provides enough information in the current diff files to be able to create a table of all changeset members. The only thing missing is the changeset header.<br>
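<br>
As a rough sketch of that idea, the membership table could be built from a diff along these lines. The sample document below is made up for illustration, and it assumes an osmChange-style layout in which elements sit under create/modify/delete blocks and carry a changeset attribute; the function name is hypothetical.<br>

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def changeset_members(osc_xml):
    """Map changeset id -> list of (action, element type, element id)."""
    members = defaultdict(list)
    root = ET.fromstring(osc_xml)
    for action in root:          # assumed: <create>, <modify>, <delete> blocks
        for elem in action:      # assumed: <node>, <way>, <relation> elements
            cs = elem.get("changeset")
            if cs is not None:   # skip elements without a changeset attribute
                members[int(cs)].append(
                    (action.tag, elem.tag, int(elem.get("id"))))
    return dict(members)

# Made-up sample diff, not real data.
SAMPLE = """<osmChange version="0.6">
  <modify>
    <node id="456" changeset="123" version="2" lat="1.0" lon="2.0"/>
    <way id="789" changeset="123" version="3"/>
  </modify>
  <create>
    <node id="457" changeset="124" version="1" lat="1.5" lon="2.5"/>
  </create>
</osmChange>"""

table = changeset_members(SAMPLE)
```

This gives the members of each changeset but, as noted, nothing about the changeset header itself.<br>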
<br> </div><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br>
The reason they're more difficult than other entities is twofold.<br>
1. They don't have a history table, so modifications are much harder to<br>
detect.<br>
2. The bounding box information attached to each cannot be computed by<br>
osmosis in a "streamy" fashion, which means they have to be re-replicated<br>
if they change.<br>
<br>
The two stages of changeset identification cause other problems, such as<br>
how to retrieve changesets efficiently after identification<br>
(i.e. not one by one), and potentially having to store identifiers<br>
in memory, which is problematic, at least theoretically, if changesets are<br>
being extracted over a long interval (not likely to be an issue for the<br>
current daily changesets).<br>
<br>
Bit of a brain dump there, but those are some of the reasons I'm holding<br>
off until I have a decent amount of time to invest in it.<br>
<br>
However, if replicating into a database, osmosis will "invent"<br>
changesets. It will create one changeset per user per changeset<br>
interval, which will approximate the real thing. It won't help if you<br>
need the real changeset id though.<br>
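<br>
To illustrate the "one changeset per user per interval" approximation, here is a small sketch. It is not how osmosis implements it: the function name, the change records, the interval origin, and the way synthetic ids are numbered are all assumptions for the example.<br>

```python
from datetime import datetime, timedelta
from itertools import count

def invent_changesets(changes, origin, interval):
    """Assign a synthetic changeset id per (user, interval bucket).

    Returns a list of (change id, synthetic changeset id) pairs.
    """
    ids = {}                 # (user, bucket) -> synthetic changeset id
    next_id = count(1)       # synthetic ids start at 1 (arbitrary choice)
    assigned = []
    for ch in changes:
        # Integer index of the interval this change falls into.
        bucket = (ch["timestamp"] - origin) // interval
        key = (ch["user"], bucket)
        if key not in ids:
            ids[key] = next(next_id)
        assigned.append((ch["id"], ids[key]))
    return assigned

origin = datetime(2009, 1, 28)
changes = [
    {"id": 1, "user": "alice", "timestamp": datetime(2009, 1, 28, 10, 0)},
    {"id": 2, "user": "alice", "timestamp": datetime(2009, 1, 28, 15, 0)},
    {"id": 3, "user": "bob", "timestamp": datetime(2009, 1, 28, 11, 0)},
    {"id": 4, "user": "alice", "timestamp": datetime(2009, 1, 29, 9, 0)},
]
assigned = invent_changesets(changes, origin, timedelta(days=1))
```

The synthetic ids group edits plausibly, but they bear no relation to the ids the server assigned.<br>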
<font color="#888888"><br>
Brett<br>
</font><div><div></div><div class="Wj3C7c"><br>
</div></div></blockquote></div><br>