[OSM-talk] Upload slowness - what's going on?

Tue May 17 05:14:49 UTC 2016

API download and upload have gotten fast(er) tonight, for the first
time since the server move on May 9!
Whatever they've done today, it's working!
I am so happy. This really impacts my life.
Thanks,
Ben

On Fri, May 13, 2016 at 5:59 AM, Grant Slater
<openstreetmap at firefishy.com> wrote:
> Hi All,
>
> On Monday 9th May 2016 the master OSM database server was moved to
> York (Bytemark) from London (Imperial).
> This was to avoid multiple upcoming weekends of planned power testing
> & maintenance at the Imperial data centre. For the last few years
> Imperial has housed all our main critical systems including master &
> slave DB servers and frontend & backend web/api servers. We also added
> 4 new frontend/backend web/api servers to York on Monday.
>
> We now have the master database server in York and the secondary
> database server in Imperial. We also have a warm standby slave db in
> AWS Ireland. A fourth SSD (NVMe) based DB server was delivered
> yesterday (Thursday), but it needs testing (burn-in, reliability,
> performance etc) before we can start using it. Slave DB servers can be
> promoted to master if required.
>
> The slave db servers serve Web/API read traffic and writes go to the
> master. When the frontend + backend servers were in the same data
> centre as the master db server the latency was <1ms. We now run a VPN
> to connect the servers up and the latency is ~8ms Imperial to
> Bytemark. Currently we are using the frontend & backend servers at
> Imperial (closest to slave db read server) and sending writes over the
> VPN to Bytemark. The extra 8ms roundtrip is triggered multiple times
> based on the size of the upload changeset; this is the root cause of
> the slower uploads. The link between Imperial & Bytemark can handle
> gigabit speeds. Over the last few days we've been tweaking the VPN
> settings to get optimal latency & throughput over the links.
>
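
As a rough illustration of why the added latency scales with changeset
size, the arithmetic can be sketched in Python. Only the ~8 ms figure
comes from the message above. The assumption that each uploaded object
costs a fixed number of sequential round trips over the VPN is
hypothetical, as are the function and parameter names. This is a
back-of-envelope model, not a description of how the OSM API actually
issues its writes.

```python
def added_upload_delay_s(objects: int,
                         round_trips_per_object: int = 1,
                         extra_rtt_ms: float = 8.0) -> float:
    """Estimate the extra upload time caused by cross-site latency.

    Assumes (hypothetically) that every object in the changeset triggers
    a fixed number of sequential round trips to the master database.
    """
    total_round_trips = objects * round_trips_per_object
    # Convert the accumulated milliseconds to seconds.
    return total_round_trips * extra_rtt_ms / 1000.0


# A 400-object changeset at one round trip per object:
print(added_upload_delay_s(400))  # prints 3.2
```

If each object needed several round trips, the estimate would grow
proportionally, which fits with larger uploads being hit hardest.
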
> Over the course of today (and for at least the weekend) we are
> switching to the new frontend & backend servers in York (Bytemark).
> London Imperial will be offline from approximately 5pm (GMT+1) for the
> first weekend of power maintenance.
>
> In summary: The slow uploads are a known issue and we'll fix it as soon
> as practical. Our main concern has been setting up multi-data-centre
> redundancy to avoid extended downtime.
>
> Here is the list of all core hardware and hosting locations:
> https://hardware.openstreetmap.org/
>
> Hope that answers the questions. ;-)
>
> Photos or it didn't happen:
> * Syncing & powering down before we start London -> York DB move:
> https://twitter.com/OSM_Tech/status/729582996685213696
> * Staged photo of racking up the master DB server at Bytemark:
> https://twitter.com/OSM_Tech/status/729693392737832961
> * Testing the new Frontend / Backend servers a week ago:
> https://twitter.com/OSM_Tech/status/728286193696292865
>
> Bytemark are a fantastic hosting company and their ongoing support of
> the OpenStreetMap project is highly commendable. Please support them
> ;-) https://twitter.com/bytemark/status/729698435339853824
>
> Kind regards,
>
> Grant
> Part of the OSM Ops team.
>
>
> On 13 May 2016 at 11:44, Tim Waters <chippy2005 at gmail.com> wrote:
>> I believe the Dev mailing list may have some of your technical answers
>> https://lists.openstreetmap.org/pipermail/dev/2016-May/thread.html
>>
>> It appears from that list that the database servers are now a few
>> hundreds of miles from where the web servers are, causing the increase
>> in latency. I do not know if this is a permanent change; the thread on
>> osm-dev does seem to indicate that things are still in flux.
>>
>> Tim
>>
>>
>>
>> On 13 May 2016 at 06:02, Ben Discoe <bdiscoe at gmail.com> wrote:
>>> Several of us have noticed radically slow upload speeds for
>>> changesets, roughly since the server move on May 9.  Like, as
>>> painfully slow as it used to be, it's now several times slower.
>>>
>>> It's been discussed with @OSM_Tech on twitter, in this thread:
>>> https://twitter.com/OSM_Tech/status/730857486618664960
>>>
>>> Before I get too hysterical, can somebody tell me what happened, and
>>> can it be fixed?
>>>
>>> OSM_Tech's mysterious message:
>>>   "Large uploads will take around 3 times longer. Small uploads extra
>>> delay should be minimal."
>>>
>>> Does this mean that something did change?  Is it database writes that
>>> are taking so much longer?  Changesets with as few as 400 objects are
>>> taking several times longer; what constitutes "large" vs. "small"?
>>> Can it be fixed?  Can I donate large sums of money somewhere to help
>>> it get fixed?
>>>
>>> Thanks,
>>> Ben
>>>
>>> _______________________________________________
>>> talk mailing list
>>> talk at openstreetmap.org
>>> https://lists.openstreetmap.org/listinfo/talk