[OSM-talk] planet.dump

David Sheldon dave at earth.li
Wed Aug 2 09:40:09 BST 2006


On Wed, Aug 02, 2006 at 10:33:02AM +0200, Jonas Svensson wrote:
> I think I have found the error. Some lines contain the text "B&B" which
> should be "B&amp;B". Assuming we are still using the html-encoding as in
> previous dumps. Wouldn't UTF-8 be better?

The dump is supposed to be in XML. In XML '&' MUST ALWAYS be encoded as
'&amp;'. I believe the dump should be in UTF-8 anyway, but it is
probably safest to encode any non-ascii characters using the appropriate
entity references. For example &#163; rather than a (British) pound sign.
The HTML &pound; is not valid XML unless you elsewhere define the entity
"pound".
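
As a rough sketch of the above in Python (the helper name xml_safe is
made up for illustration, and this is an assumption about how one might
do it, not what the dump script actually does): escape the markup
characters first, then replace anything non-ASCII with a numeric
character reference.

```python
from xml.sax.saxutils import escape


def xml_safe(text: str) -> str:
    """Return text that is safe to place inside an XML element.

    Hypothetical helper, for illustration only.
    """
    # escape() turns '&' into '&amp;', '<' into '&lt;' and '>' into '&gt;'.
    escaped = escape(text)
    # 'xmlcharrefreplace' rewrites each non-ASCII character as a numeric
    # character reference, e.g. the pound sign becomes '&#163;'.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(xml_safe("B&B"))
print(xml_safe("£5 B&B"))
```

Numeric references need no DTD, so any XML parser should read the
result back as the original text.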
 

David
-- 
it is a strange, strange meme. people are either immune, or very seriously
infected. could it be a tailored meme-war plague? - TWIC, o.c.ousfg



