[OSM-dev] Temporary tables for the map query?

Alex Wilson alex_wilson at pobox.com
Mon May 26 15:40:10 BST 2008


Thanks for the reply and giving me some more background. In fact I wasn't
planning to use the temporary table queries I included within rails - but to
assemble the data for streaming for the C++ implementation I'm working on.
Which, using the aforementioned SQL, shouldn't need to have a memory
footprint above a few kilobytes. But I guess this is at the expense of
relying on MySQL to behave sensibly when working on queries that return
large datasets (i.e. to assemble the data lazily as it's needed by the
caller - rather than to assemble it all at once in memory).

On your point about MySQL's rubbish optimiser: is it genuinely so bad that
it's better in this case to send a huge number of ids back into the db for
subsequent queries rather than build that id set up locally in the db? I can
believe it in general for more complicated queries - but it strikes me
(perhaps naiviely) that the example I sent is sufficiently trivial that
MySQL should be able to handle it if it can do any form of optimisation on
joins whatsover. Perhaps I should set up some test-cases to resolve either
way...

Cheers,

Alex

2008/5/26 Tom Hughes <tom at compton.nu>:

> In message <44d06e000805240124h26649f42g9de9dc8541bd041c at mail.gmail.com>
>           "Alex Wilson" <alex_wilson at pobox.com> wrote:
>
> > I've been looking at the rails_port implementation of the map query. As I
> > understand it, it seems to be retrieving all nodes within the bounding
> box
> > into ruby, then sending this list back out to the db to retrieve all
> ways,
> > then sending all the way ids out for finally retrieving the extra nodes
> and
> > associated relations. So there's quite a lot of (rather inefficient) data
> > movement between the ruby code and the database.
>
> There are far bigger issues, like that fact that we create an
> enourmous rails activerecord object for each node...
>
> In theory we could just do a join instead of sending all the IDs
> back but MySQL is so rubbish that it is incapable of executing such
> a query in a sensible way.
>
> > I would imagine that it would be more efficient to do this all on the
> > database using temporary tables: rather than try to explain what I mean,
> > below is an example of something that works. The last three lines of
> which
> > return all the data - so you only bring back only the stuff that's
> needed,
> > only send through the bbox coords - and that resultant data can be
> streamed
> > easily and doesn't need to be in memory at once (apart from in the db -
> > which is designed to handle this kind of thing). Is there a particular
> > reason that this hasn't been done before -  am I missing something
> obvious?
>
> Ye gads no. Potlatch already uses temporary tables which I would
> prefer it didn't...
>
> Even with temporary tables (and as I said, they aren't really needed
> as you can just join to the table that already exists) you will need
> to do fairly complicated joins that will most likely defeat MySQL's
> optimiser.
>
> The code will also no longer be "rails code" but will essentially be
> raw SQL at which point why would be bother to carry on using the rails
> framework, which costs us far more than sending the IDs back to the
> database does.
>
> Tom
>
> --
> Tom Hughes (tom at compton.nu)
> http://www.compton.nu/
>
> _______________________________________________
> dev mailing list
> dev at openstreetmap.org
> http://lists.openstreetmap.org/cgi-bin/mailman/listinfo/dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstreetmap.org/pipermail/dev/attachments/20080526/e8255eba/attachment.html>


More information about the dev mailing list