[OSM-dev] Question: Tags key and value maximum length

Shaun McDonald shaun at shaunmcdonald.me.uk
Tue Apr 21 12:23:18 BST 2009


The string length limit on key/value pairs has been around for a long
time, with the exception of node tags.

The 255 limit comes from MySQL. For the time being we are trying to stay
compatible with MySQL for people who are still using it. Many
developers will not have had time to switch their development/production
databases from MySQL to Postgres yet.

The limit also stops people from putting huge amounts of data into
tags, which would cause problems elsewhere in the API through the
huge documents being produced.

If you want to store more than 255 chars, you need to split the value
over multiple tags:

description =
description_1 =
description_2 =
etc
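
A rough sketch of that splitting scheme in Python, in case it helps. The
function names split_tag and join_tag are made up for illustration and are
not part of the API or any editor. It also assumes the limit is counted in
characters rather than bytes, and that the suffixes follow the
description_1, description_2 pattern above:

```python
MAX_TAG_LEN = 255  # assumed to be counted in characters, not bytes


def split_tag(key, value, limit=MAX_TAG_LEN):
    """Split value into chunks of at most `limit` characters.

    The first chunk keeps the plain key; later chunks get the
    suffixes _1, _2, ... (hypothetical helper, not an OSM API call).
    """
    # An empty value still yields one (empty) chunk under the plain key.
    chunks = [value[i:i + limit] for i in range(0, len(value), limit)] or [""]
    tags = {}
    for n, chunk in enumerate(chunks):
        tags[key if n == 0 else f"{key}_{n}"] = chunk
    return tags


def join_tag(key, tags):
    """Reassemble a value that was split by split_tag."""
    parts = [tags[key]]
    n = 1
    # Walk key_1, key_2, ... until the next suffix is missing.
    while f"{key}_{n}" in tags:
        parts.append(tags[f"{key}_{n}"])
        n += 1
    return "".join(parts)
```

Note that this cuts at a fixed character offset, so a chunk may end in the
middle of a word, and anything reading the tags has to know the suffix
convention to put the value back together.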

Shaun

On 21 Apr 2009, at 11:55, Lars Francke wrote:

> In that case I want to propose a change as 255 chars is a rather
> arbitrary value for a maximum and I definitely see use cases where
> more characters are needed. I just had a look at the first node that
> was truncated (20833623) and Hamburg is now missing a lot of its post
> codes. If anyone ever runs the opengeodb import again I can imagine
> that there might be problems. In the first 10 truncations there was
> one tag that was probably an error; the rest were perfectly valid text
> :(
>
> 255 chars is a bit small for a few descriptions, automatic imports or
> notes as well. If you want a limit, at least choose something like 1000
> or even 10000. As the truncated tags show, it isn't used very often, but
> there are valid uses.
>
> But as we are using PostgreSQL now, a limit is not really necessary.
> From the PostgreSQL documentation:
>
> "The storage requirement for a short string (up to 126 bytes) is 1
> byte plus the actual string, which includes the space padding in the
> case of character. Longer strings have 4 bytes overhead instead of 1.
> Long strings are compressed by the system automatically, so the
> physical requirement on disk might be less. Very long values are also
> stored in background tables so that they do not interfere with rapid
> access to shorter column values. In any case, the longest possible
> character string that can be stored is about 1 GB. (The maximum value
> that will be allowed for n in the data type declaration is less than
> that. It wouldn't be very useful to change this because with multibyte
> character encodings the number of characters and bytes can be quite
> different anyway. If you desire to store long strings with no specific
> upper limit, use text or character varying without a length specifier,
> rather than making up an arbitrary length limit.)
>
> "There are no performance differences between these three [character,
> character varying, text] types, apart from increased storage size when
> using the blank-padded type, and a few extra cycles to check the
> length when storing into a length-constrained column. While
> character(n) has performance advantages in some other database
> systems, it has no such advantages in PostgreSQL. In most situations
> text or character varying should be used instead."
>
> Please consider this as I think it will make OSM a bit more future  
> proof.
>
> Thanks,
> Lars




