[Tilesathome] Using RAM-drive for ROMA temp tables
Brett Henderson
brett at bretth.com
Sun Dec 7 21:51:37 GMT 2008
Martijn van Oosterhout wrote:
> On Sat, Dec 6, 2008 at 9:22 AM, Mathieu Arnold <mat at mat.cc> wrote:
>
>> I'd like to add one thing at that. On my instance, the query returning the
>> nodes in the bbox takes at most 1s, and most of the time is under 0.1s. The
>> index takes about 14GB, and I only have 3.5GB of RAM. I do think it's *not*
>> that bad :-)
>>
>
> Note it's more complicated still. Even though the index is 14GB, if
> you remove all the leaves of the index, it's probably less than 1GB
> because the width of the index entry is so small. That you cache
> easily. Which means that each node lookup will take at most 2 disk
> seeks. Add locality of reference by area and the fact that render
> requests are not distributed evenly over the world and the average
> performance would be pretty good.
>
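A rough back-of-envelope check of that, treating the 8kB page size and
~20 byte entry width below as guesses rather than measured values (the
14GB figure is Mathieu's), does come out well under 1GB:

# Rough estimate of how much of a 14GB btree is non-leaf pages.
# Page size and entry width are assumptions, not measurements.
PAGE_SIZE = 8 * 1024               # default pgsql block size
ENTRY_WIDTH = 20                   # rough key + item pointer per entry
INDEX_SIZE = 14 * 1024 ** 3        # total index size from Mathieu's mail

fanout = PAGE_SIZE // ENTRY_WIDTH            # entries per internal page, ~400
leaf_pages = INDEX_SIZE // PAGE_SIZE
internal_pages = leaf_pages // fanout        # one internal entry per leaf page
print("%.0f MB of non-leaf pages" % (internal_pages * PAGE_SIZE / (1024.0 ** 2)))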
From memory, when I was playing with this a few months back I came to
similar conclusions. Identifying the nodes and storing their ids in a
temp table didn't take all that long; what took longer was usually the
retrieval of the actual node data based on those ids. Locality of values
does help considerably, because I suspect pgsql (or the OS itself) will
usually read more data than it needs at a time, which means the disk
isn't hit for every individual node.
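For what it's worth, the pattern I was experimenting with was roughly
the following (psycopg2 here is just for illustration; the connection
string, table and column names are from memory and probably don't match
the actual ROMA schema):

import psycopg2

conn = psycopg2.connect("dbname=roma")
cur = conn.cursor()

minlat, minlon, maxlat, maxlon = -37.9, 144.8, -37.7, 145.1  # example bbox

# Step 1: cheap index scan - just collect the node ids in the bbox.
cur.execute("""
    CREATE TEMPORARY TABLE bbox_nodes AS
    SELECT id FROM nodes
    WHERE latitude  BETWEEN %s AND %s
      AND longitude BETWEEN %s AND %s
""", (minlat, maxlat, minlon, maxlon))

# Step 2: the slow part - pull the full node rows (and their tags) for
# those ids; this is where the scattered heap reads happen.
cur.execute("""
    SELECT n.id, n.latitude, n.longitude, t.k, t.v
    FROM bbox_nodes b
    JOIN nodes n ON n.id = b.id
    LEFT JOIN node_tags t ON t.node_id = n.id
""")
rows = cur.fetchall()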
I wonder if you'd get any performance increase by filtering the data
that is stored in the db in the first place. For example, created_by
tags would be a prime candidate for discarding. The less data in the
db, the closer together the data will be packed and, in theory, the
fewer disk seeks will occur. If ROMA is only being used for tiles at
home, you could be fairly selective about the data that is imported.
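Something along these lines, run over the planet file before import, is
the kind of filter I mean. It leans on the usual one-element-per-line
layout of planet dumps, so treat it as an illustration rather than a
robust filter (a real one would use a proper XML parser):

import re
import sys

# Drop created_by tags from a planet dump before it reaches the db.
created_by = re.compile(r'<tag k="created_by"')

for line in sys.stdin:
    if not created_by.search(line):
        sys.stdout.write(line)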
Brett