[OSM-dev] Minute Diffs Broken

Tue May 5 10:36:44 BST 2009

Tom Hughes wrote:
> Brett Henderson wrote:
>
>> That does look interesting.  I'd hope to use that outside the main 
>> database though.  My thoughts were to use triggers to populate short 
>> term flag tables which a single threaded process would read, use as 
>> keys to select modified data into an offline database, then clear.  
>> This offline database could then use a queueing system such as PgQ (I 
>> haven't seen it before, will have to check it out) to send events to 
>> the various consumers of the data.  I'd like to minimise access to 
>> the central database if possible because 1. it will scale better, and 
>> 2. it adds less burden to existing DBAs.
>
> It is highly unlikely that anything which requires modifications to 
> the database schema and/or adding triggers or anything like that to 
> the database will be possible, at least in the short to medium term.
Come on Tom, where's your sense of adventure ;-)
>
> We're only just getting things stable again and I have no desire to 
> start fiddling with things just yet - we need time to let what we have 
> bed in properly.
I understand your concerns.  I wouldn't have even mentioned it if I had 
valid alternatives.  At this point I'm feeling somewhat stymied though.  
I had a system that worked well under 0.5, but I can't offer the same 
service under 0.6.
>
> On top of that if we're going to start talking about replication style 
> solutions then we will need to look carefully at all the available 
> systems and consider what will best lend itself to what will doubtless 
> be our need to scale to multiple database servers in the future. That 
> isn't something we can do quickly.
If you're referring to multi-mastered clustered databases then that is a 
whole different problem that shouldn't be confused with what I'm trying 
to achieve.  I'm simply trying to provide a way for people to access 
regular updates in a read-only fashion where data integrity is the 
highest priority.  By allowing delays in delivery (I'd like to get it 
down to a couple of minutes but I'm not aiming for anything like 
real-time) it becomes a simpler problem with hopefully a simpler 
solution.  Any multiple database system is likely to be a long way off 
so I can't wait for that.

I have a couple of questions for you in particular.
1. Are there any known issues with the current API that could cause 
delays in excess of 5 minutes?  Or is it just a fact of life with large 
changesets?  I guess what I'm asking is whether a delay of longer than 5 
minutes is a regular occurrence with a large changeset, or whether 
something strange is going on in rare cases.
2. What appear to be the current system bottlenecks?  Is the database 
already approaching processing capacity or is Rails the limiting factor?
3. Is there any way I can change your mind on making db changes ;-)

If 1 is just an intermittent issue then the current problems may be 
solvable without changes at the osmosis end.  If not then I have to 
make a change of some kind.
If the existing db is already a bottleneck then I have to tread very 
carefully with what I do.  If not then I have some more flexibility.  
Having said that, I believe osmosis adds very little load on the database 
judging by the munin graphs.
As for 3, I won't be asking you to start adding a bunch of triggers and 
tables to the database just yet.  Any change would have to go through 
significant testing to measure its impact.  Just as I spent a lot of 
time testing before introducing the existing osmosis diffs, I'd be doing 
the same for a more reliable replication mechanism.  But if there's no 
chance of it happening then I won't bother.  It might be worth noting 
that there are currently 5 osmosis processes reading from the database; 
it should be possible to reduce that to 1 with a smarter solution.
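
For illustration only, the flag-table idea quoted at the top could be 
sketched roughly as below.  This is a toy using SQLite from Python, not 
the real OSM schema or anything osmosis does today.  The names `nodes`, 
`node_flags`, `setup_main` and `drain_flags` are all made up for the 
example, and a real version would need the same testing as any other db 
change.

```python
import sqlite3

# Hypothetical, heavily simplified stand-in for a versioned nodes table.
NODES_DDL = """
CREATE TABLE nodes (
    id INTEGER, version INTEGER, lat REAL, lon REAL,
    PRIMARY KEY (id, version)
)
"""

def setup_main(conn):
    """Create the data table, the short-term flag table and the trigger."""
    conn.execute(NODES_DDL)
    conn.execute("CREATE TABLE node_flags (id INTEGER, version INTEGER)")
    # The trigger only records the key of each new row; it copies no data.
    conn.execute("""
        CREATE TRIGGER nodes_flag AFTER INSERT ON nodes
        BEGIN
            INSERT INTO node_flags (id, version) VALUES (NEW.id, NEW.version);
        END
    """)
    conn.commit()

def drain_flags(main, offline):
    """Single-threaded reader: copy flagged rows offline, then clear flags."""
    flags = main.execute(
        "SELECT rowid, id, version FROM node_flags ORDER BY rowid"
    ).fetchall()
    for _, node_id, version in flags:
        row = main.execute(
            "SELECT id, version, lat, lon FROM nodes WHERE id = ? AND version = ?",
            (node_id, version),
        ).fetchone()
        if row is not None:
            offline.execute("INSERT OR REPLACE INTO nodes VALUES (?, ?, ?, ?)", row)
    offline.commit()
    # Clear only the flags that were actually read, so rows flagged while
    # this pass was running are picked up by the next pass.
    main.executemany(
        "DELETE FROM node_flags WHERE rowid = ?", [(f[0],) for f in flags]
    )
    main.commit()
    return len(flags)

main = sqlite3.connect(":memory:")
offline = sqlite3.connect(":memory:")
setup_main(main)
offline.execute(NODES_DDL)

main.execute("INSERT INTO nodes VALUES (1, 1, 51.5, -0.1)")
main.execute("INSERT INTO nodes VALUES (1, 2, 51.6, -0.1)")
main.commit()

copied = drain_flags(main, offline)
```

The point of the design is that the reader only touches the central 
database to read the flagged keys and the rows they refer to, so one 
process could feed any number of downstream consumers from the offline 
copy.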

Brett