[OSM-dev] perl and special utf-8 characters

Robert Joop 5313501608656osm at rainbow.in-berlin.de
Sat Mar 19 08:02:25 GMT 2011

On 11-03-11 17:56:28 CET, Gary68 wrote:
> i wouldn't know how to do it with regexes.
> ok. let's put it this way. i have a POSITIVE list of allowed chars
> (incl. utf8 2byte ones) and i have a string.
> i want to eliminate all chars in the string that are not in the POSITIVE
> list.

sounds very simple.

:r /tmp/g2
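use utf8;    # the string literal below is UTF-8 in the source file

use Encode;

my $s = 'abcäöüß$€✓XYZ';
print "full: ", encode ('UTF-8', $s), "\n";
# tr with /c (complement the list) and /d (delete) removes every
# character that is NOT in the given range
$s =~ tr/\000-\377//cd;    # keep code points 0..255 (latin1)
print "latin1 only: ", encode ('UTF-8', $s), "\n";
$s =~ tr/\000-\177//cd;    # keep code points 0..127 (ASCII)
print "ASCII only: ", encode ('UTF-8', $s), "\n";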

:r !perl /tmp/g2
full: abcäöüß$€✓XYZ
latin1 only: abcäöüß$XYZ
ASCII only: abc$XYZ
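
the ranges above are fixed in the source, because tr/// does not interpolate variables into its list. if the positive list is an arbitrary string that is only known at run time, a hash lookup avoids regexes entirely. what follows is only a sketch: the helper name keep_only, the variable $allowed and the example list are made up for illustration and are not from the original question.

```perl
use strict;
use warnings;
use utf8;    # string literals in this file are UTF-8

use Encode;

# hypothetical helper: return $s with every character removed
# that does not occur in the positive list $allowed
sub keep_only {
    my ($s, $allowed) = @_;
    # one hash key per allowed character
    my %ok = map { $_ => 1 } split //, $allowed;
    # split the string into characters, keep the allowed ones, rejoin
    return join '', grep { $ok{$_} } split //, $s;
}

my $allowed = 'abcXYZäöü€';    # example positive list (assumption)
my $s = 'abcäöüß$€✓XYZ';
print "allowed only: ", encode ('UTF-8', keep_only ($s, $allowed)), "\n";
```

with this example list the ß, $ and ✓ are dropped and the line printed is "allowed only: abcäöü€XYZ".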

