[OSM-dev] Mass imports (TIGER and AND)

Tue Aug 28 08:29:24 BST 2007

In message <1188275447.28903.15.camel at localhost>
        Dave Hansen <dave at sr71.net> wrote:

> The thing that *IS* on my laptop is the ruby code.  It is responsible
> for 90% of the CPU time, and the CPUs are maxed out.  mysql, on the
> other hand, is responsible for ~3% of total cpu time.  Even with my
> piddly notebook hard drive, the I/O wait time is under 1%.

That's quite impressive, because the CPUs on our web servers never
get anywhere near maxing out, and between them they are processing
anything up to about a dozen requests each at any one time.

> People have been saying that we should write the import code in ruby to
> run on the server and use the existing rails code.  If the ruby code
> itself is the bottleneck and not the round-trip time or the disk, is
> doing the import through the ruby code going to even help?

As somebody else has pointed out, you would only need to use the
object model, so all the overhead of parsing the requests would be
avoided.

I think the problem with my scheme is going to be keeping the amount
of history required to map the negative IDs in the change file to the
allocated positive IDs as things are added. That will use up a lot of
memory in ruby.
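
To make the bookkeeping concrete, here is a rough sketch of the kind of
lookup table I mean. The IdMap class and its method names are made up
for illustration and are not part of the rails code; it just assumes
that placeholders are negative and allocated IDs are positive. Every
object added costs one hash entry, and the entries have to stay around
until nothing later in the change file can refer to them.

```ruby
# Hypothetical sketch: remember which positive ID was allocated for each
# negative placeholder ID seen in a change file.
class IdMap
  def initialize
    @map = {} # placeholder (negative) id => allocated (positive) id
  end

  # Record the ID that was allocated when a placeholder object was created.
  def record(placeholder, allocated)
    raise ArgumentError, "placeholder must be negative" unless placeholder < 0
    @map[placeholder] = allocated
  end

  # Resolve a reference: positive IDs already exist and pass through
  # unchanged, anything else must have been recorded earlier.
  def resolve(id)
    return id if id > 0
    @map.fetch(id) { raise KeyError, "unknown placeholder #{id}" }
  end

  # Number of entries held in memory.
  def size
    @map.size
  end
end

ids = IdMap.new
ids.record(-1, 1001) # first new node
ids.record(-2, 1002) # second new node

# A new segment that refers to both placeholder nodes.
segment = [-1, -2].map { |n| ids.resolve(n) }
```

One table per object type would probably be needed if the placeholder
numbers are not unique across nodes, segments and ways.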

Tom

-- 
Tom Hughes (tom at compton.nu)
http://www.compton.nu/