[OSM-dev] [OSM-talk] donating read-only api-mirrors

Sat Feb 7 03:54:07 GMT 2009

Stefan de Konink <stefan at konink.de> writes:

> Matt Amos wrote:
>> On Fri, Feb 6, 2009 at 11:47 PM, Stefan de Konink <stefan at konink.de> wrote:
>>> Matt Amos wrote:
>>>> On Fri, Feb 6, 2009 at 8:32 PM, Stefan de Konink <stefan at konink.de> wrote:
>>>>> Even then; the ordering can be done in SQL.
>>>> not in the two-step scheme that Matthias was suggesting. it might be
>>>> possible using temporary tables, but care would be necessary to work
>>>> around mysql's horrible "can't re-open table" brokenness.
>>> I really don't see how you could fetch *all* ways without the nodes that are
>>> in the bbox or a join on the table.
>> 
>> what are you talking about? i was merely making the simple observation
>> that in the two-step node fetching scheme that Matthias was suggesting
>> (fetch and stream nodes in bbox, fetch and stream nodes used by way)
>> the fact that it is done in *two* queries with processing in-between
>> means it is impossible to order the results using only SQL.
>
> Ok do I understand you right in best readable case want to do the following?
>
> SELECT * FROM nodes WHERE BBOX(...) OR id IN (
>
>   SELECT node FROM way_nds WHERE way IN (
>
>    SELECT way FROM way_nds WHERE node IN (
>
>     SELECT * FROM nodes WHERE BBOX(...)
>
>    )
>
>   )
>
> ) ORDER BY id;
>
> But for performance reason Matthias would like to split this process in 
> two distinct parts, so reuse of the resultset is possible?
>
> nodes     := SELECT id FROM nodes WHERE BBOX(...)
> ways      := SELECT way FROM way_nds WHERE node IN (nodes);
>
> SELECT * FROM nodes WHERE IN(nodes) OR IN(
>    SELECT node FROM way_nds WHERE way IN (ways);
> )
>
> And while busy...
>
> SELECT * FROM ways WHERE id IN (ways);
>
>> the node IDs in the bbox and extras used in ways would have to be
>> merged and stored - which might be possible in a decent database - but
>> mysql has a limitation where it doesn't allow temporary tables to be
>> used twice in a statement, which makes things much more difficult.
>
> As pointed out before, I am nowhere near a MySQL expert; but is MySQL 
> able to make soup from those partial results (vulcano wise it has a 
> caching advantage here), and might be smart enough to reuse the results. 
> I have benchmarked using my database engine that there is a tipping 
> point in reusing materialized resultsets (integers) versus rejoining 
> them. If reusing resultsets in MySQL is always cheap, no matter how big 
> the query that might be a good approach.

I am no SQL expert neither.  So I don't know whether it is feasible to
do everything in MySQL.  I suggested the above scheme because it comes
close to what the API is currently doing.

Why would storing of the data be such a big overhead?  Especially
comparing to storing everything in (limited) memory?  I think compared
to all the other processing the API is doing writing the data to disk
and then reading it again would not be so significant.

>
>>>> adding an error element would require a (small) change to all clients
>>>> to support it, which is fine if all the tool authors are willing to
>>>> make the change (or accept a patch).
>>> Yes, but since we are changing protocols soon, error messages would be a
>>> minor implementation effort ;)
>> 
>> indeed. but the API 0.6 changes were agreed a long time ago. adding
>> even a "minor" change at this point would to be discussed and agreed
>> with the tool authors.
>
> I frankly don't want to make problems bigger than they are, is the only 
> error message currently/0.6 http headers? I am sure the parser will get 
> the message if we just send broken XML back ;)

For data consumers (like T at H) the API change to 0.6 actually does not
require any changes.  Checking for an error element or even for an
incomplete XML file does require the client to do the right thing.
Sending an error code and no data when something goes wrong does not
leave much room for misinterpretation even if the client does not look
at the response code.

Only software that wants to upload data to the server needs to be
adapted to API 0.6 (and of course software that wants to use the new
attributes).

Matthias