[PATCH] fix charset-hook bugs
Alain Bench <veronatif <at> free.fr>
2006-08-02 10:05:03 GMT
Hello,
While testing <create-alias> recoding, I noticed a bug in the IDN
code: conversion between the IDN form and the display form was
unintentionally sensitive to charset-hooks. Example: when a user has
"charset-hook . euc-jp" because he receives many mails containing
EUC-JP text with various wrong MIME labels, the domain
"xn--ren-dma.org" is displayed as "ren?«±.org" instead of the correct
"rené.org". And when he enters "rené.org", it is wrongly idnized to
"xn--ren-c03j.org". The exact wrong results depend on $charset.
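For reference, the correct round trip between the two forms of this
domain can be reproduced outside mutt. The following is only a small
illustration using Python's built-in "idna" codec (IDNA 2003); it is
not the code path mutt uses, and the variable names are made up for
the example.

```python
# Round trip between the ASCII (IDN) form and the display form of the
# example domain, using Python's standard "idna" codec.
ace = "xn--ren-dma.org"

# IDN form -> display form (a Unicode string, independent of any $charset).
display = ace.encode("ascii").decode("idna")

# Display form -> IDN form.
back = display.encode("idna").decode("ascii")

print(display)
print(back)
```

Any result other than this pair means that a wrong charset was applied
somewhere between the display form and the utf-8 side of the
conversion.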
I believe that such conversions, between the constant "utf-8" and the
$charset provided by the user, both under full control, should not be
subject to charset-hooks. It's not needed (any correction can be done
directly on $charset, or indirectly through iconv-hooks), and it can
create unwanted interference in some perfectly sensible setups.
Charset-hooking iso-8859-1 to something else is intended to correct
wrong MIME charset=iso-8859-1 labels; it should not break any operation
in an already correct iso-8859-1 $charset. *Only* uncertain fromcodes
coming from the wild Internet should be subject to charset-hooks.
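The rule argued for above can be modelled in a few lines. This is a
hypothetical Python sketch, not mutt's C code: CHARSET_HOOKS,
apply_charset_hooks, convert, and the hook_from parameter are invented
names that only stand in for the charset-hook table and for a flag
that requests hooking of the fromcode.

```python
import re

# Hypothetical analogue of the user's "charset-hook . euc-jp":
# any charset name matching the pattern is replaced by "euc-jp".
CHARSET_HOOKS = [(re.compile("."), "euc-jp")]


def apply_charset_hooks(charset):
    """Return the replacement of the first matching hook, else the name itself."""
    for pattern, replacement in CHARSET_HOOKS:
        if pattern.search(charset):
            return replacement
    return charset


def convert(data, fromcode, tocode, hook_from=False):
    """Convert bytes from fromcode to tocode.

    Hooks are applied to fromcode only when the caller asks for it,
    i.e. when the label comes from an untrusted source. The tocode is
    never hooked.
    """
    if hook_from:
        fromcode = apply_charset_hooks(fromcode)
    text = data.decode(fromcode, errors="replace")
    return text.encode(tocode, errors="replace")


utf8_bytes = "rené.org".encode("utf-8")

# Internal conversion (constant "utf-8" to the user's charset): no hooks.
trusted = convert(utf8_bytes, "utf-8", "iso-8859-1")

# The same conversion with hooking wrongly enabled: "utf-8" is
# replaced by "euc-jp" and the result is garbled.
hooked = convert(utf8_bytes, "utf-8", "iso-8859-1", hook_from=True)

print(trusted)
print(hooked)
```

In this model the caller that handles a MIME label from a received
message would pass hook_from=True, while internal conversions between
"utf-8" and $charset would leave it at the default.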
The attached patch-1.5.12.ab.M_ICONV_HOOK_sanitize.1 corrects this
by removing usage of the M_ICONV_HOOK_{FROM,TO} flags from the IDN,
PGP, and GPGME code where it is not appropriate. It also completely
removes the M_ICONV_HOOK_TO symbol and the code depending on it,
because charset-hooking a tocode is never wanted. It adds comments
about the role of the M_ICONV_HOOK_FROM flag and its misleading name
(ICONV_HOOK, when it exclusively triggers charset-hooks), in the hope
of less misuse in the future. And while at it, the patch corrects a
typo and normalizes the casing of some constant charset names.