VERSION

Version 0.1

SYNOPSIS

use Convert::YText qw(encode_ytext decode_ytext);

$encoded = encode_ytext($string);
$decoded = decode_ytext($encoded);

($decoded eq $string) || die "this should never happen!";

DESCRIPTION

Convert::YText converts strings to and from "YText", a format inspired by xtext (defined in RFC 1891) and by the MIME base64 and quoted-printable encodings (RFC 2045). The main goal is to encode a UTF-8 string into something safe for use as the local part of an internet email address (RFC 2822).

By default spaces are replaced with "+", "/" with "~", the characters "A-Za-z0-9_.-" encode as themselves, and everything else is written "=USTR=" where USTR is the base64 (using "A-Za-z0-9_." as digits) encoding of the Unicode character code. The encoding is configurable (see below).
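The default rules above can be illustrated with a small standalone sketch. It does not use Convert::YText and is not the module's implementation: the function names are hypothetical, and both the digit order ("A-Z", "a-z", "0-9", "_", ".") and the most-significant-digit-first layout are assumptions, so the escape sequences it produces may differ from the module's.

```perl
use strict;
use warnings;

# Assumed digit order for the base-64 escape; the module's real order may differ.
my @DIGITS = ('A' .. 'Z', 'a' .. 'z', '0' .. '9', '_', '.');
my %VALUE;
@VALUE{@DIGITS} = (0 .. $#DIGITS);

# Write a non-negative integer in base 64, most significant digit first.
sub sketch_num_to_digits {
    my ($n) = @_;
    my $s = '';
    do {
        $s = $DIGITS[$n % 64] . $s;
        $n = int($n / 64);
    } while ($n > 0);
    return $s;
}

# Inverse of sketch_num_to_digits.
sub sketch_digits_to_num {
    my ($digits) = @_;
    my $n = 0;
    $n = $n * 64 + $VALUE{$_} for split //, $digits;
    return $n;
}

# Apply the default rules: space -> "+", "/" -> "~",
# "A-Za-z0-9_.-" unchanged, everything else -> "=USTR=".
sub sketch_encode {
    my ($string) = @_;
    my $out = '';
    for my $ch (split //, $string) {
        if    ($ch eq ' ')                   { $out .= '+' }
        elsif ($ch eq '/')                   { $out .= '~' }
        elsif ($ch =~ /\A[A-Za-z0-9_.\-]\z/) { $out .= $ch }
        else  { $out .= '=' . sketch_num_to_digits(ord $ch) . '=' }
    }
    return $out;
}

# Reverse the rules. "=", "+" and "~" never appear as literal
# characters in the output of sketch_encode, so this is unambiguous.
sub sketch_decode {
    my ($encoded) = @_;
    $encoded =~ s{=([A-Za-z0-9_.]+)=|([+~])}{
        defined $1 ? chr(sketch_digits_to_num($1)) : ($2 eq '+' ? ' ' : '/')
    }ge;
    return $encoded;
}

my $enc = sketch_encode('a b/c@d');
print "$enc\n";                    # a+b~c=BA=d under the assumed digit order
print sketch_decode($enc), "\n";   # a b/c@d
```

Because the escape, space, and slash characters are kept out of the digit set, a decoder can recognize every special token without lookahead. This is presumably why the constructor arguments described below require them to be non-digits.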

PROCEDURAL INTERFACE

The module can export "encode_ytext", which converts an arbitrary Unicode string into a "safe" form, and "decode_ytext", which recovers the original text. "validate_ytext" is a heuristic which returns 0 for bad input.

OBJECT ORIENTED INTERFACE

For more control, you will need to use the OO interface.

new

Create a new encoding object.

Arguments

Arguments are by name (i.e. a hash).

ESCAPE_CHAR ('=') Must not be in digit string.
SPACE_CHAR ('+') Non digit to replace space. Can be the empty string.
SLASH_CHAR ('~') Non digit to replace slash. Can be the empty string.
EXTRA_CHARS ('._\-') Other characters to leave unencoded.

encode

Arguments

a string to encode.

Returns

encoded string

decode

Arguments

a string to decode.

Returns

decoded string

valid

A simple validity test. Passing it is necessary but not sufficient for a string to be valid YText.

DISCUSSION

According to RFC 2822, the following non-alphanumerics are OK for the local part of an address: "!#$%&'*+-/=?^_`{|}~". On the other hand, it seems common in practice to block addresses having "%!/|`#&?" in the local part. The idea is to restrict ourselves to basic ASCII alphanumerics, plus a small set of printable ASCII, namely "=_+-~.".

The characters '+' and '-' are pretty widely used to attach suffixes (although usually only one works on a given mail host). It seems OK to use both '+' and '-', since the first one marks the beginning of a suffix and anything after it is treated as a regular character. The character '.' also seems mostly permissible.
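As a rough illustration of the restricted alphabet discussed above, the following sketch checks that a candidate local part uses only ASCII alphanumerics and the punctuation "=_+-~.". The helper name is hypothetical. It is not the module's "valid" method and checks only the character set, not whether the escapes are well formed.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $local uses only ASCII alphanumerics
# and the punctuation "=_+-~." named in the discussion, else 0.
sub uses_only_ytext_chars {
    my ($local) = @_;
    return $local =~ /\A[A-Za-z0-9=_+\-~.]*\z/ ? 1 : 0;
}

print uses_only_ytext_chars('a+b~c=BA=d') ? "ok\n" : "rejected\n";   # ok
print uses_only_ytext_chars('user%host')  ? "ok\n" : "rejected\n";   # rejected
```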

AUTHOR

David Bremner, <[email protected]>

COPYRIGHT

Copyright (C) 2011 David Bremner. All Rights Reserved.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO

MIME::Base64, MIME::Decoder::Base64, MIME::Decoder::QuotedPrint.