[OSM-dev] broken nodes in planet dump

Jon Burgess jburgess777 at googlemail.com
Thu Sep 27 18:59:58 BST 2007


On Thu, 2007-09-27 at 18:35 +0100, Dave Stubbs wrote:
> On 27/09/2007, Tom Hughes <tom at compton.nu> wrote:
> > In message <a4c775140709270536r2bb5a2e8ua60f1fcf4947ae99 at mail.gmail.com>
> >           "Dave Stubbs" <osm.list at randomjunk.co.uk> wrote:
> >
> > > On 27/09/2007, Tom Hughes <tom at compton.nu> wrote:
> > > > In message <b215ec5f0709270218q233757f7u6a0812bfaf202dc6 at mail.gmail.com>
> > > >         Jon Burgess <jburgess777 at googlemail.com> wrote:
> > > >
> > > > > I made another planet dump myself last night and I think it looks
> > > > > complete (~20GB). It needs to be bzip2 compressed though.
> > > > >
> > > > > Tom: It is on tile in
> > > > > ~jburgess/svn.openstreetmap.org/applications/utils/planet.osm/C/planet2.osm
> > > > > if you want to take that one instead of running the dump again.
> > > >
> > > > I was actually in the middle of doing a dump with the C dumper
> > > > myself ;-) I've started compressing your one now...
> > >
> > > There seems to be some UTF8-style problems judging by the rendering:
> > > http://tile.openstreetmap.org/10/531/341.png
> >
> > The new dump isn't complete yet, so tile should still be using
> > last week's as far as I know.
> 
> The munin graphs suggest tile loaded something in last night, and
> tiles rendered since then are wrong, where before they were OK.
> Something's changed in the last couple of days at least.

I loaded a new dump last night and started rendering the tiles again
around midday today. It looks like the main database and my C planet
dump tool had a disagreement about what charset to use for the results,
even though the code tries to force the connection to run as UTF-8. I've
halted the rendering again for now.


When I wrote the C planet dump tool I forced the database connection to
use UTF-8 and my local database was configured to run everything as
UTF-8. It looks like the main server is converting results back to
latin1, so I need to set more options to get UTF-8 results:

mysql> show variables like 'character_set%';

| character_set_client            | latin1                      |
| character_set_connection        | latin1                      |
| character_set_database          | utf8                        |
| character_set_filesystem        | binary                      |
| character_set_results           | latin1                      |
| character_set_server            | latin1                      |
| character_set_system            | utf8                        |

How many different character set settings can one server have!
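
A check along the following lines could flag this kind of mismatch before
any tiles are rendered. Text returned as latin1 usually contains bytes above
0x7F that do not form well-formed UTF-8 sequences. This is only a sketch and
not code from the planet dump tool; the function name is_valid_utf8 is made
up for illustration.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if buf[0..len) is well-formed UTF-8,
 * 0 otherwise. Rejects stray continuation bytes, truncated sequences,
 * overlong encodings, surrogates and code points above U+10FFFF. */
int is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;
    size_t k;

    while (i < len) {
        unsigned char c = buf[i];
        size_t need;        /* continuation bytes expected after c */
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point valid for this length */

        if (c < 0x80) {                 /* plain ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;                   /* stray continuation or bad lead byte */
        }

        if (len - i - 1 < need)
            return 0;                   /* sequence runs past end of buffer */

        for (k = 1; k <= need; k++) {
            unsigned char cc = buf[i + k];
            if ((cc & 0xC0) != 0x80)
                return 0;               /* not a continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min || cp > 0x10FFFF)
            return 0;                   /* overlong or out of range */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                   /* UTF-16 surrogate */

        i += need + 1;
    }
    return 1;
}
```

Running each tag value through such a check before writing it out would
reject "caf\xE9" (latin1) while accepting "caf\xC3\xA9" (UTF-8). It cannot
catch every case, since some latin1 byte runs happen to be valid UTF-8.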

	Jon 




