[OSM-dev] perl and special utf-8 characters
Robert Joop
5313501608656osm at rainbow.in-berlin.de
Sat Mar 19 08:02:25 GMT 2011
On 11-03-11 17:56:28 CET, Gary68 wrote:
>
> i wouldn't know how to do it with regexes.
>
> ok. let's put it this way. i have a POSITIVE list of allowed chars
> (incl. utf8 2-byte ones) and i have a string.
>
> i want to eliminate all chars in the string that are not in the POSITIVE
> list.
sounds very simple.
:r /tmp/g2
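use utf8;    # this source file is UTF-8, so the literal below becomes a character string
use Encode;
my $s = 'abcäöüß$€✓XYZ';
print "full: ", encode ('UTF-8', $s), "\n";
# /c complements the list, /d deletes: drop every char outside U+0000..U+00FF
$s =~ tr/\000-\377//cd;
print "latin1 only: ", encode ('UTF-8', $s), "\n";
# same idea, now keeping only U+0000..U+007F
$s =~ tr/\000-\177//cd;
print "ASCII only: ", encode ('UTF-8', $s), "\n";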
__END__
:r !perl /tmp/g2
full: abcäöüß$€✓XYZ
latin1 only: abcäöüß$XYZ
ASCII only: abc$XYZ
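the ranges above are just one kind of positive list. for an arbitrary list of allowed chars, a minimal sketch along the same lines could look like this. the list contents and the helper name keep_only are made up for illustration, and the second variant avoids regexes entirely by looking each char up in a hash:

```perl
use strict;
use warnings;
use utf8;
use Encode;

my $s = 'abcäöüß$€✓XYZ';

# variant 1: positive list written directly into tr (fixed at compile time).
# the list here (ASCII letters, digits, space, a few umlauts) is only an example.
(my $kept = $s) =~ tr/a-zA-Z0-9 äöüß//cd;
print "positive list: ", encode('UTF-8', $kept), "\n";

# variant 2: positive list supplied at run time as a string of allowed chars.
# keep_only is a hypothetical helper, not part of any module.
sub keep_only {
    my ($string, $allowed) = @_;
    my %ok = map { $_ => 1 } split //, $allowed;        # one hash key per allowed char
    return join '', grep { $ok{$_} } split //, $string; # keep only listed chars
}

print "run-time list: ", encode('UTF-8', keep_only($s, 'abcXYZäöü')), "\n";
```

tr does not interpolate variables, which is why the run-time variant goes through a hash instead of building the tr list from a string.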
rj