[OSM-dev] perl and special utf-8 characters
5313501608656osm at rainbow.in-berlin.de
Sat Mar 19 08:02:25 GMT 2011
On 11-03-11 17:56:28 CET, Gary68 wrote:
> i wouldn't know how to do it with regexes.
> ok. let's put it this way. i have a POSITIVE list of allowed chars
> (inkl. utf8 2byte ones) and i have a string.
> i want to eliminate all chars in the string that are not in the POSITIVE
> list.
sounds very simple.
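use utf8;                 # the string literal below is UTF-8 source text
use Encode qw(encode);    # encode() turns characters back into UTF-8 bytes
my $s = 'abcäöüß$€✓XYZ';
print "full: ", encode ('UTF-8', $s), "\n";
# c = complement the range, d = delete: drop everything above 0xFF
$s =~ tr/\000-\377//cd;
print "latin1 only: ", encode ('UTF-8', $s), "\n";
# same idea, but drop everything above 0x7F
$s =~ tr/\000-\177//cd;
print "ASCII only: ", encode ('UTF-8', $s), "\n";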
:r !perl /tmp/g2
latin1 only: abcäöüß$XYZ
ASCII only: abc$XYZ
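
The tr ranges above only cover contiguous blocks of code points. If the
positive list is an arbitrary set of characters, a hash lookup is one way to
do it. This is only a minimal sketch: keep_allowed is a hypothetical helper
name and the contents of $allowed are an assumed example list, neither comes
from the original question.

```perl
use strict;
use warnings;
use utf8;                      # string literals below are UTF-8 source text
use Encode qw(encode);

# keep_allowed is a hypothetical helper for illustration only.
# It returns $string with every character removed that does not
# occur in $allowed. Both arguments must be character strings
# (already decoded), not raw UTF-8 bytes.
sub keep_allowed {
    my ($string, $allowed) = @_;
    # build a lookup table: one key per allowed character
    my %ok = map { $_ => 1 } split //, $allowed;
    # keep only the characters that are in the table
    return join '', grep { $ok{$_} } split //, $string;
}

my $allowed = 'abcXYZäöü$';    # assumed example of a positive list
my $s       = 'abcäöüß$€✓XYZ';
print "allowed only: ", encode('UTF-8', keep_allowed($s, $allowed)), "\n";
# prints: allowed only: abcäöü$XYZ
```

Splitting on the empty pattern works per character here because the strings
are decoded; on undecoded input it would split per byte and break the
multi-byte characters.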