[OSM-talk] planet.dump

Jonas Svensson jonass at lysator.liu.se
Wed Aug 2 12:35:28 BST 2006


On Wed, 2 Aug 2006, SteveC wrote:

> * @ 02/08/06 10:12:25 AM openstreetmap at gagravarr.org wrote:
> > On Wed, 2 Aug 2006, Jonas Svensson wrote:
> > >I think I have found the error. Some lines contain the text "B&B" which
> > >should be "B&amp;B". Assuming we are still using the html-encoding as in
> > >previous dumps. Wouldn't UTF-8 be better?
> >
> > I believe that the code that generates the dumps is
> > 	http://svn.openstreetmap.org/utils/planet.osm/planet.rb
> >
> > I'm sure that Steve would appreciate patches to make it better :)
>
> yes please

First, the line that prints the header should change to:
<?xml version="1.0" encoding="utf-8" ?>
The rest might be as easy as adding -Ku to the interpreter
line: "#!/path/to/ruby -Ku"

Otherwise, a quick and dirty fix might be to use entities instead of UTF-8:

  require 'cgi'
  puts CGI.escapeHTML( '<a href="/mp3">Click Here</a>' )
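
Applied to the "B&B" case, a minimal sketch could look like the following.
The helper name and the sample tag line are made up for illustration and
are not taken from planet.rb, so the real dump format may differ:

```ruby
require 'cgi'

# Hypothetical helper (not from planet.rb): escape a tag value so that
# &, <, > and quotes are safe inside a double-quoted XML attribute.
def escape_tag_value(value)
  CGI.escapeHTML(value)
end

# Hypothetical sample line; the attribute names are an assumption.
puts %(<tag k="name" v="#{escape_tag_value('B&B')}"/>)
```

This only covers the characters that are special in XML; any non-ASCII
text would still need the UTF-8 handling mentioned above.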

Also
<http://www.yotabanana.com/hiki/ruby-gettext-howto-cgi.html?ruby-gettext-howto-cgi>
mentions set_output_charset("UTF-8"), but I do not know if that is
part of standard Ruby or an extension.

I may look into this more later on.

/Jonas




