1 Dec 2010 01:41
Re: how to optimally preserve encoding when converting TEI-L archives?
Christian Wittern <cwittern <at> GMAIL.COM>
2010-12-01 00:41:17 GMT
Hi Ron,

As far as my very limited experience with mail archives goes, you have to be aware that the encoding is specified at the message level, while the messages might all be thrown together in files for monthly archives or similar bundles of messages. When processing these archives, you will need software that understands this and can deal with it (including the many intricacies of the MIME format used for mail messages on the internet).

I have had some success with the mail module in the standard Python library, but there surely are other tools as well.

Hope this helps,

Christian

On 2010-12-01 24:30 , Ron Van den Branden wrote:
> Hi all,
>
> I'm about to make a fresh conversion of the TEI-L archives (for import in
> MarkMail this time). Ideally, I would take this opportunity to improve on
> the issues David raised about character encoding in the nabble mirror
> (http://tei-l.970651.n3.nabble.com/TEI-L-mirror-s-tp1878767p1878945.html).
>
> Since I am far from an expert on this matter, I would like to ask if
> anyone has any suggestions for optimal preservation of character encoding.
> I don't know what encoding the files are in (though I specified UTF-8 as
> encoding for incoming messages on my mail client, and received the monthly
> archives in the body of email messages), but am often warned about illegal
> / illegible characters when opening a file. It seems to me that the
> archive is messed up with all kinds of different encodings and I'm not
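
As a rough illustration of the per-message decoding described in the reply above, here is a minimal sketch using the `mailbox` and `email` modules from the Python standard library. It assumes the monthly archive has already been converted to mbox format (the raw TEI-L files may well be in a different layout), and the function names `decode_part`, `message_text`, `decode_subject` and `read_archive` are made up for this example. Each text part is decoded with the charset declared in its own headers, so messages in different encodings can sit in the same file.

```python
import mailbox
from email.header import decode_header, make_header


def decode_part(part):
    """Return one MIME part as text, using the charset that part declares."""
    payload = part.get_payload(decode=True)  # undoes base64 / quoted-printable
    if payload is None:                      # multipart containers have no payload
        return ""
    charset = part.get_content_charset() or "ascii"
    try:
        return payload.decode(charset, errors="replace")
    except LookupError:
        # Unknown charset label: fall back to latin-1, which never fails.
        return payload.decode("latin-1")


def message_text(msg):
    """Join all text/plain parts of a message into one Unicode string."""
    parts = [decode_part(p) for p in msg.walk()
             if p.get_content_type() == "text/plain"]
    return "\n".join(parts)


def decode_subject(msg):
    """Decode a Subject header that may contain encoded words (=?charset?...?=)."""
    return str(make_header(decode_header(msg.get("Subject", ""))))


def read_archive(path):
    """Yield (subject, body) pairs for every message in an mbox file (assumed format)."""
    for msg in mailbox.mbox(path):
        yield decode_subject(msg), message_text(msg)
```

The results are ordinary Unicode strings, so they could be written out as UTF-8 in one consistent step, for example with `open(out_path, "w", encoding="utf-8")`. Messages that declare no charset, or the wrong one, would still need separate handling.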
> Here there are some of the pictures I took in Zadar at the TEI
> meeting