<div style="display: flex; flex-wrap: wrap; white-space: pre-wrap; align-items: center; "><img height="20" width="20" style="border-radius:50%; margin-right: 4px;" decoding="async" src="https://avatars.githubusercontent.com/u/5893857?s=20&v=4" /><strong>jake-low</strong> left a comment <a href="https://github.com/openstreetmap/openstreetmap-website/pull/5973#issuecomment-2884935872">(openstreetmap/openstreetmap-website#5973)</a></div>
<blockquote>
<p dir="auto">Edit: I've just noticed that <a href="https://github.com/OSMCha/osmchange-parser">https://github.com/OSMCha/osmchange-parser</a> has already implemented the re-grouping. Maybe <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/jake-low/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/jake-low">@jake-low</a> has some insights to share on the JSON format.</p>
</blockquote>
<p dir="auto">Just dropping in to respond to this ping, sorry it took a while.</p>
<p dir="auto">I extracted osmchange-parser from the backend of OSMCha last year while doing some refactoring. I published it as its own package on the off chance it would be useful somewhere else, but to my knowledge nobody else is currently using it. The JSON structure emitted by that parser isn't something I gave a lot of thought to; it mirrors the input XML structure pretty closely simply because that was easy to accomplish with a SAX-style parser.</p>
<p dir="auto">I didn't see anyone respond above to <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/users/joto/hovercard" data-octo-click="hovercard-link-click" data-octo-dimensions="link_type:self" href="https://github.com/joto">@joto</a>'s point, but I'd like to reiterate it: I don't think osmChange files need to categorize edits into <code class="notranslate">create</code>/<code class="notranslate">modify</code>/<code class="notranslate">delete</code> groups explicitly. These groups are implied by other properties that are also included in the file (the element's <code class="notranslate">version</code> number and <code class="notranslate">visible</code> boolean). Implementations that deal with osmChange files need to handle cases where the stated type of the edit doesn't match its semantics (<a href="https://github.com/bdon/OSMExpress/blob/main/python/examples/augmented_diff.py#L126-L134">example</a>), which means they might as well just ignore the stated type all the time, and instead infer the type from the <code class="notranslate">version</code> and <code class="notranslate">visible</code> attributes.</p>
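<p dir="auto">As a rough sketch of that inference (not taken from any existing implementation), something like the following could work. The names <code class="notranslate">EditKind</code>, <code class="notranslate">Versioned</code> and <code class="notranslate">inferEditKind</code> are made up for illustration, and treating a missing <code class="notranslate">visible</code> as <code class="notranslate">true</code> is an assumption:</p>

```typescript
// Hypothetical names; a minimal sketch of inferring the edit type
// from `version` and `visible` instead of trusting the stated group.
type EditKind = 'create' | 'modify' | 'delete';

// Only the two fields needed for classification.
type Versioned = { version: number; visible?: boolean };

function inferEditKind(el: Versioned): EditKind {
  // Assumption: a missing `visible` means the element is visible.
  if (el.visible === false) return 'delete';
  // The first version of an element can only be a creation;
  // any later visible version is treated as a modification.
  return el.version === 1 ? 'create' : 'modify';
}
```

<p dir="auto">With this, a consumer could ignore the stated group entirely and call <code class="notranslate">inferEditKind</code> on each element it reads.</p>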
<p dir="auto">Given this, I think a candidate JSON schema that's worth considering is one like this:</p>
<div class="highlight highlight-source-js" dir="auto"><pre class="notranslate"><span class="pl-kos">{</span>
  <span class="pl-s">"version"</span>: <span class="pl-s">"0.6"</span><span class="pl-kos">,</span>
  <span class="pl-s">"generator"</span>: <span class="pl-s">"OpenStreetMap server"</span><span class="pl-kos">,</span>
  <span class="pl-s">"copyright"</span>: <span class="pl-s">"OpenStreetMap and contributors"</span><span class="pl-kos">,</span>
  <span class="pl-s">"attribution"</span>: <span class="pl-s">"http://www.openstreetmap.org/copyright"</span><span class="pl-kos">,</span>
  <span class="pl-s">"license"</span>: <span class="pl-s">"http://opendatacommons.org/licenses/odbl/1-0/"</span><span class="pl-kos">,</span>
  <span class="pl-s">"elements"</span>: <span class="pl-kos">[</span>
    <span class="pl-c">// this is an array of elements in normal OSM order</span>
    <span class="pl-c">// (sorted by type, then ID, then version number).</span>
    <span class="pl-kos">{</span>
      <span class="pl-s">"type"</span>: <span class="pl-s">"node"</span><span class="pl-kos">,</span>
      <span class="pl-s">"id"</span>: <span class="pl-c1">349018659</span><span class="pl-kos">,</span>
      <span class="pl-s">"lat"</span>: <span class="pl-c1">47.4624576</span><span class="pl-kos">,</span>
      <span class="pl-s">"lon"</span>: <span class="pl-c1">-</span><span class="pl-c1">121.4782337</span><span class="pl-kos">,</span>
      <span class="pl-s">"timestamp"</span>: <span class="pl-s">"2024-08-19T15:53:56Z"</span><span class="pl-kos">,</span>
      <span class="pl-s">"version"</span>: <span class="pl-c1">7</span><span class="pl-kos">,</span>
      <span class="pl-s">"changeset"</span>: <span class="pl-c1">155467928</span><span class="pl-kos">,</span>
      <span class="pl-s">"user"</span>: <span class="pl-s">"Mateusz Konieczny - bot account"</span><span class="pl-kos">,</span>
      <span class="pl-s">"uid"</span>: <span class="pl-c1">3199858</span><span class="pl-kos">,</span>
      <span class="pl-s">"tags"</span>: <span class="pl-kos">{</span>
        <span class="pl-s">"ele"</span>: <span class="pl-s">"1907.7"</span><span class="pl-kos">,</span>
        <span class="pl-s">"gnis:feature_id"</span>: <span class="pl-s">"1521553"</span><span class="pl-kos">,</span>
        <span class="pl-s">"name"</span>: <span class="pl-s">"Kaleetan Peak"</span><span class="pl-kos">,</span>
        <span class="pl-s">"natural"</span>: <span class="pl-s">"peak"</span><span class="pl-kos">,</span>
        <span class="pl-s">"source"</span>: <span class="pl-s">"USGS"</span><span class="pl-kos">,</span>
        <span class="pl-s">"wikidata"</span>: <span class="pl-s">"Q49040648"</span><span class="pl-kos">,</span>
        <span class="pl-s">"wikipedia"</span>: <span class="pl-s">"en:Kaleetan Peak"</span>
      <span class="pl-kos">}</span>
    <span class="pl-kos">}</span><span class="pl-kos">,</span>
    <span class="pl-c">// ... followed by more elements (omitted in this example) ...</span>
  <span class="pl-kos">]</span>
<span class="pl-kos">}</span></pre></div>
<p dir="auto">This has the advantage that the response uses the <em>exact</em> same schema as the response you get when fetching an element individually from the API (e.g. from <code class="notranslate">/node/:id</code> or <code class="notranslate">/node/:id/history</code>). For comparison:</p>
<pre class="notranslate"><code class="notranslate">λ curl -s 'https://www.openstreetmap.org/api/0.6/node/349018659' -H 'Accept: application/json' | jq
{
  "version": "0.6",
  "generator": "openstreetmap-cgimap 2.0.1 (764192 spike-08.openstreetmap.org)",
  "copyright": "OpenStreetMap and contributors",
  "attribution": "http://www.openstreetmap.org/copyright",
  "license": "http://opendatacommons.org/licenses/odbl/1-0/",
  "elements": [
    {
      "type": "node",
      "id": 349018659,
      "lat": 47.4624576,
      "lon": -121.4782337,
      "timestamp": "2024-08-19T15:53:56Z",
      "version": 7,
      "changeset": 155467928,
      "user": "Mateusz Konieczny - bot account",
      "uid": 3199858,
      "tags": {
        "ele": "1907.7",
        "gnis:feature_id": "1521553",
        "name": "Kaleetan Peak",
        "natural": "peak",
        "source": "USGS",
        "wikidata": "Q49040648",
        "wikipedia": "en:Kaleetan Peak"
      }
    }
  ]
}
</code></pre>
<p dir="auto">This is especially nice in strongly typed languages: if you have a type definition for an <code class="notranslate">Element</code> as returned by the OSM API, you can reuse that definition when fetching data from any of these endpoints. (For simplicity, these examples ignore the <code class="notranslate">if-unused</code> attribute used for uploading.)</p>
<div class="highlight highlight-source-ts" dir="auto"><pre class="notranslate"><span class="pl-c">// this type definition works for responses from several different endpoints</span>
<span class="pl-c">// (/node/:id, /node/:id/history, /changeset/:id/download, etc)</span>
<span class="pl-k">type</span> <span class="pl-smi">OSMResponse</span> <span class="pl-c1">=</span> <span class="pl-kos">{</span>
  <span class="pl-c1">version</span>: <span class="pl-smi">string</span><span class="pl-kos">;</span>
  <span class="pl-c1">generator</span>: <span class="pl-smi">string</span><span class="pl-kos">;</span>
  <span class="pl-c1">copyright</span>: <span class="pl-smi">string</span><span class="pl-kos">;</span>
  <span class="pl-c1">attribution</span>: <span class="pl-smi">string</span><span class="pl-kos">;</span>
  <span class="pl-c1">license</span>: <span class="pl-smi">string</span><span class="pl-kos">;</span>
  <span class="pl-c1">elements</span>: <span class="pl-smi">Element</span><span class="pl-kos">[</span><span class="pl-kos">]</span><span class="pl-kos">;</span>
<span class="pl-kos">}</span>

<span class="pl-k">type</span> <span class="pl-smi">Element</span> <span class="pl-c1">=</span> <span class="pl-smi">Node</span> <span class="pl-c1">|</span> <span class="pl-smi">Way</span> <span class="pl-c1">|</span> <span class="pl-smi">Relation</span><span class="pl-kos">;</span>

<span class="pl-k">type</span> <span class="pl-smi">Node</span> <span class="pl-c1">=</span> <span class="pl-kos">{</span>
  <span class="pl-c1">type</span>: <span class="pl-s">'node'</span><span class="pl-kos">;</span>
  <span class="pl-c1">id</span>: <span class="pl-smi">number</span><span class="pl-kos">;</span>
  <span class="pl-c1">lat</span>: <span class="pl-smi">number</span><span class="pl-kos">;</span>
  <span class="pl-c1">lon</span>: <span class="pl-smi">number</span><span class="pl-kos">;</span>
  <span class="pl-c1">timestamp</span>: <span class="pl-smi">string</span><span class="pl-kos">;</span>
  <span class="pl-c1">version</span>: <span class="pl-smi">number</span><span class="pl-kos">;</span>
  <span class="pl-c1">changeset</span>: <span class="pl-smi">number</span><span class="pl-kos">;</span>
  <span class="pl-c1">user</span>: <span class="pl-smi">string</span><span class="pl-kos">;</span>
  <span class="pl-c1">uid</span>: <span class="pl-smi">number</span><span class="pl-kos">;</span>
  <span class="pl-c1">tags</span>: <span class="pl-smi">Record</span><span class="pl-c1"><</span><span class="pl-smi">string</span><span class="pl-kos">,</span> <span class="pl-smi">string</span><span class="pl-c1">></span><span class="pl-kos">;</span>
<span class="pl-kos">}</span>

<span class="pl-k">type</span> <span class="pl-v">Way</span> <span class="pl-c1">=</span> ...</pre></div>
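<p dir="auto">To illustrate the reuse, here is a self-contained sketch of one function that could consume a response body from any of those endpoints. The types are deliberately trimmed down and renamed (<code class="notranslate">OsmNode</code>, <code class="notranslate">OsmElement</code>, etc.) so the snippet does not clash with the DOM's built-in <code class="notranslate">Node</code> and <code class="notranslate">Element</code> types. The way and relation shapes are placeholders assumed for the example, and <code class="notranslate">describe</code> and <code class="notranslate">summarize</code> are hypothetical helpers:</p>

```typescript
// Trimmed-down, assumed shapes; only the fields this sketch touches.
type OsmNode = { type: 'node'; id: number; version: number; lat: number; lon: number };
type OsmWay = { type: 'way'; id: number; version: number; nodes: number[] };
type OsmRelation = { type: 'relation'; id: number; version: number };
type OsmElement = OsmNode | OsmWay | OsmRelation;

type OsmResponse = { version: string; elements: OsmElement[] };

// The `type` field acts as a discriminant, so each branch sees
// the fields specific to that element kind.
function describe(el: OsmElement): string {
  switch (el.type) {
    case 'node':
      return `node ${el.id} at ${el.lat},${el.lon}`;
    case 'way':
      return `way ${el.id} with ${el.nodes.length} nodes`;
    case 'relation':
      return `relation ${el.id}`;
  }
}

// Works on any body that follows the shared schema, regardless of
// which endpoint produced it. The cast is unchecked; a real client
// would probably want to validate the parsed JSON first.
function summarize(body: string): string[] {
  const response = JSON.parse(body) as OsmResponse;
  return response.elements.map(describe);
}
```

<p dir="auto">The point is only that <code class="notranslate">summarize</code> needs no endpoint-specific branches when every endpoint shares the same <code class="notranslate">elements</code> array.</p>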

