[OSM-dev] Rantings about API 0.6
frederik at remote.org
Tue Feb 10 01:01:43 GMT 2009
Iván Sánchez Ortega wrote:
> [2009-02-10 01:00:34.809017 #1664] User Load (0.000144) SELECT * FROM
> `users` WHERE (`users`.`id` = 1)
> [2009-02-10 01:00:34.810457 #1664] SQL (0.000144) SELECT `display_name`
> FROM `users` WHERE (`users`.display_name = 'ivansanchez' AND `users`.id <> 1)
> [2009-02-10 01:00:34.811241 #1664] SQL (0.000108) SELECT `email` FROM
> `users` WHERE (`users`.email = 'ivan at sanchezortega.es' AND `users`.id <> 1)
> WTF does this happen on every object (read: node) I upload? The API already
> knows who I am the second I started to upload the diff!
Probably the same thing that Matt Amos just posted, the individual
object processing code being re-used for each object in the change file,
and so for each object it checks whether you're really the owner of the
changeset.
Maybe we should indeed re-think this one; it is the only optimisation
that would influence the protocol. (Then again it would do so in a
backwards-compatible way; if we drop the changeset=... attribute
requirement later and clients still send it, who cares.)
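
A rough sketch of the difference, in Python rather than the actual Rails
code. All names here (ChangesetStore, check_owner, the upload_* functions)
are made up for illustration and do not come from the API source. It
only counts lookups, to show what checking ownership once per diff would
save compared with re-running the check for every object:

```python
from dataclasses import dataclass


@dataclass
class Changeset:
    id: int
    user_id: int


class ChangesetStore:
    """Hypothetical stand-in for the database; counts ownership lookups."""

    def __init__(self, changesets):
        self._by_id = {cs.id: cs for cs in changesets}
        self.lookups = 0

    def check_owner(self, changeset_id, user_id):
        self.lookups += 1
        changeset = self._by_id[changeset_id]
        if changeset.user_id != user_id:
            raise PermissionError(
                f"changeset {changeset_id} is not owned by user {user_id}"
            )
        return changeset


def upload_per_object(store, user_id, objects):
    """Re-used single-object path: one ownership check per object."""
    for obj in objects:
        store.check_owner(obj["changeset"], user_id)
        # ... the change to obj would be applied here ...


def upload_checked_once(store, user_id, changeset_id, objects):
    """Check ownership once for the whole diff, then process the objects."""
    store.check_owner(changeset_id, user_id)
    for obj in objects:
        # A per-object "changeset" value, if a client still sends one,
        # is simply ignored in this sketch.
        pass  # ... the change to obj would be applied here ...
```

With N objects in the diff, the first variant does N lookups and the
second does one. Clients that keep sending the attribute are unaffected.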
> [2009-02-10 01:00:34.863175 #1664] SQL (0.000244) SELECT `id` FROM
> `current_nodes` WHERE (`current_nodes`.id IS NULL)
> current_nodes.id is defined as NOT NULL. So, WTF?
> [2009-02-10 01:00:34.864872 #1664] SQL (0.000258) SELECT `id` FROM
> `changesets` WHERE (`changesets`.id = 44 AND `changesets`.id <> 44)
I can only assume that these things are somehow deeply magic Rails
incantations that will provide good Karma for the requests to come. And
I can assure you that it would be worse with Hibernate ;-)
> [2009-02-10 01:00:34.859181 #1664] Changeset Update (0.000314) UPDATE
> `changesets` SET `num_changes` = 18443 WHERE `id` = 44
> Couldn't this wait until the diff upload is complete?
As we're in a transaction, I guess you're right. You have probably
found a weak spot where we traded efficiency for maintainability.
Re-using the individual object change building blocks as-is makes the
code easier to read, understand, maintain, but carries a performance
penalty.
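
For illustration, a minimal sketch of the deferred variant, using Python
and SQLite instead of Rails and the real schema. The table layouts and
the make_db/upload_diff names are assumptions made up for this example.
The counter is kept in memory and written back with a single UPDATE
inside the same transaction as the object changes:

```python
import sqlite3


def make_db():
    """Create a tiny in-memory database with a simplified, assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE changesets ("
        "id INTEGER PRIMARY KEY, num_changes INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE nodes ("
        "id INTEGER PRIMARY KEY, changeset_id INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO changesets (id, num_changes) VALUES (44, 0)")
    conn.commit()
    return conn


def upload_diff(conn, changeset_id, objects):
    """Apply all objects, then bump num_changes once at the end."""
    applied = 0
    # "with conn" commits on success and rolls back on an exception, so
    # the counter and the object rows can never disagree after a failure.
    with conn:
        for obj in objects:
            conn.execute(
                "INSERT INTO nodes (id, changeset_id) VALUES (?, ?)",
                (obj["id"], changeset_id),
            )
            applied += 1
        conn.execute(
            "UPDATE changesets SET num_changes = num_changes + ? WHERE id = ?",
            (applied, changeset_id),
        )
    return applied
```

Because everything runs in one transaction, nobody outside can observe
the intermediate counter values anyway, so a single write at the end
gives the same visible result as one write per object.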
> So, while I'm no expert about Rails, I do suspect that all those extra
> unnecessary queries are done automagically.
The middle ones, probably yes. The top and bottom ones, probably by design.
> I do know OSM will have some brand new DB server (now with 0.1 more API!) that
> should be able to handle the extra overhead, and that most of those queries
> will hit data that will be cached anyway.
Also, the above requests would all happen if you did individual uploads,
*plus* you would have the HTTP setup overhead, so even as-is, diff
uploads will bring a considerable gain.
> Is it worth to optimize the code for diff uploads?
It can probably wait till later.
Frederik Ramm ## eMail frederik at remote.org ## N49°00'09" E008°23'33"