[Talk-us] Bulk fix of comma delimiters in source tag in parts of CA, NV, AZ, UT
Alan Mintz
Alan_Mintz+OSM at Earthlink.Net
Fri Jul 27 06:34:35 BST 2012
It was brought to my attention that P2 (Potlatch 2) warns that a field
contains multiple values when it sees semi-colons in that field. As a result,
some mappers interpreted the warning as an error and "fixed" it by changing
the semi-colons to commas. Since a comma is a legitimate character within a
value, the field no longer looks like it has multiple values and the warning
goes away. This behavior is the subject of another thread (on dev, moved to
tagging).
AFAIK, semi-colons are still the correct way of delimiting multiple values.
How consumers and editors deal with this is a separate issue - my concern
was to fix these "fixes" in the area and particular tag I knew where it was
occurring - source=*.
Using OAPI, I downloaded the relevant objects in the bbox
[32,-130,39,-110], sorted them to remove the cases where the comma was a
legitimate part of a single value (long English descriptions), and then
replaced the commas with semicolons in the resulting 8592 objects. Many of
these were not "fixes", but were instead entered that way to begin with.
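The filter-and-replace step described above might look roughly like the following Python sketch. This is not the script that was actually used: the function names, the three-word threshold for telling a list of sources from a long English description, and the sample values are all assumptions made for illustration, and a manual review of the results would still be needed.

```python
import re

# Hypothetical heuristic: if every comma-separated piece is this short,
# treat the value as a list of sources rather than one prose description.
MAX_WORDS_PER_ITEM = 3


def looks_like_list(value: str) -> bool:
    """Guess whether the commas in a source=* value delimit multiple values."""
    parts = [p.strip() for p in value.split(",")]
    # No comma at all, or an empty piece (e.g. "a,,b"): leave for manual review.
    if len(parts) < 2 or any(not p for p in parts):
        return False
    return all(len(p.split()) <= MAX_WORDS_PER_ITEM for p in parts)


def fix_source(value: str) -> str:
    """Replace delimiter commas with semicolons; leave prose values untouched."""
    if not looks_like_list(value):
        return value
    # Split on both delimiters so values that already mix them are normalized.
    pieces = [p.strip() for p in re.split(r"[;,]", value)]
    return ";".join(p for p in pieces if p)


if __name__ == "__main__":
    samples = [
        "Bing, survey",
        "Bing, survey;USGS",
        "Traced from the county map, which was surveyed by hand",
    ]
    for s in samples:
        print(repr(s), "->", repr(fix_source(s)))
```

The threshold is deliberately conservative: anything that looks like a sentence is left alone, which matches the intent of skipping values where the comma is a legitimate part of a single value.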
Anyone have an issue with me uploading the fixed data?
I realize that the issue may exist outside this bbox as well. It might be
useful to look for the issue globally. Also, there are probably other tags
that may legitimately and uncontroversially contain multiple values. I
was trying to work out a process, which turns out to be somewhat manual,
even with the help of a couple of scripts.
--
Alan Mintz <Alan_Mintz+OSM at Earthlink.net>