[OSM-dev] Faster loading with scabies
brett at bretth.com
Fri Oct 17 13:17:36 BST 2008
Sascha Silbe wrote:
> On Thu, Oct 16, 2008 at 09:21:54AM +1100, Brett Henderson wrote:
>> The biggest problem I found wasn't the actual processing of INSERT
>> statements, it was MySQL scaling non-linearly with the number of rows.
>> MyISAM tables are very fast to import regardless of number of rows, but
>> InnoDB seems to slow down as the number of rows increase. I'm surprised
>> loading with LOAD DATA INFILE fixes that.
> Just a quick stab in the dark as I do know almost nothing about
> osmosis and MySQL:
> If you create the indices before loading data, they will be constantly
> rebuilt as you add each row => would explain the exponential
> slow-down. The PostgreSQL manual has a section about optimizing
> database imports giving some useful tips & tricks; I'd guess the same
> is true for the MySQL one.
> If osmosis already defers index creation (including referential
> constraints which usually build indices implicitly), just ignore this
> mail. :)
It does some index disabling prior to import using the DISABLE KEYS
statements. But that doesn't disable all indices (and I can't remember
if it works on InnoDB tables ...). So this would appear to explain the
exponential slow down. I seem to remember manually dropping indexes
completely and still having trouble though, but I have no numbers to
back this up. If somebody wishes to run some benchmarks with and
without indexes for several different dataset sizes I'd be extremely
grateful. Writing code to do this stuff is the easy bit, it's testing
and benchmarking that takes the time. Better yet if somebody can send
me patches for osmosis I'll be ecstatic.
More information about the dev