[OSM-talk] Bounding box
Nick Hill
nick at nickhill.co.uk
Mon Feb 5 21:17:19 GMT 2007
Hello Raphael
Insane queries are those whose very high demand don't stack up to the close to
zero value of the result.
Raphaël Jacquot wrote:
> said insane queries may be due in part
None of the following. An insane query is defined above, and is as a result of
malevolence or ignorance.
> * because the storage driver is not the proper one for the usage we have
> (for instance, if we're using myisam)
> * because the request needs to read data that is in a separate table,
> and for some reason locks said table (because of limitations in either
> the storage driver, the database engine or both)
> * because the app doesn't use transaction facilities if available in the
> database engine (which in turn would allow it to enable things like
> "copy on modify" for the current transaction -- that is, the transaction
> that modifies data sees its version while it's being processed, other
> read only transaction see the previous version, other write transactions
> are blocked if they attempt to access this particular data)
Locking is a relatively minor issue. Locking occurs when DB load causes
otherwise fast queries to take a long time to execute. It is therefore as a
result of I/O or CPU load. In which case, without locking, the blocked query
will still be delayed by the same bottleneck.
>
>> Unfortunately, the mechanism to filter insane queries (which were
>> perhaps generated through malice or ignorance) catches a few sane
>> queries, which occasionally will require modified workflow.
>>
>> Once we have upgraded the RAM in the API, we can consider increasing
>> the node limit. When we partition the database or implement tiles, we
>> can consider increasing the bbox limit.
>
> there is no need to use partitionning or other 'hackish' schemes if the
> indexing engine features the proper index generation algorithms (such as
> R-tree based indexes)
I don't consider partitioning hackish.
> Of course, you'll then tell me that the index takes a lot of space or
> something. well, that's what computing is all about, compromises.
> that is, either your data is very compressed and needs a lot of
> computing power to sift through, or, you sacrifice storage space to have
> a fast to go through data structure, and your data is fast to access.
> you can't have both.
I agree to a small part, but mostly disagree. I agree that our indexing scheme
is sub-optimal. Your assumption that the bbox and node limit requirement is
substantially as a result of the indexing scheme is incorrect. I also strongly
disagree with your hidden assumption that an OGC R-tree index as commonly
implemented is anywhere near optimal. I also disagree with the implication that
the space inefficiency of the OGC indexing scheme somehow makes the index
perform better than any other theoretical model or that a different scheme is
somehow compressed.
More information about the talk
mailing list