An idea and a bug
2007-08-07 18:52:55 GMT
I'm hacking at my own version of IDLE to try to implement language support in Python. I found a bug.
1. The bug.
I'm using IDLE to work on a copy of IDLE in my home directory. I copy the idlelib directory to ~/jqidlelib, add "jq" to lines 2 and 22 of idle.py, open it in IDLE, and start hacking on the other files. If a hacked version of some other module is not open in IDLE and I press F5 to run idle.py, I get the desired behaviour (the modules are loaded from my hacked copies), but the error display is wrong: errors refer to the line numbers of my hacked versions, yet they show the path and code snippets from the actual IDLE directory.
2. My idea
I have started a discussion on the OLPC wiki and over email with some interested parties within OLPC about the way to do i18n (internat...ion) on Python. My idea is to have code on-disk in "real", English-based Python, but to allow displaying and editing of that code in the user's native language. This means that there are systems for invisible, real-time, two-way translation of keywords, identifiers, and docstrings/comments. This is primarily aimed at a young demographic; I realize that there are plenty of "real programmers" who learn the keywords of a programming language in their abstract meaning, without knowing the everyday English meaning of those terms. (After all, "string", "hash", and "for" have meanings in programming quite distinct from their everyday meanings anyway.)
Keywords: dynamic two-way translation is easy.
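To make the keyword case concrete, here is a minimal sketch of one way it could work, using the standard tokenize module so that strings and comments are left alone. The Spanish keyword table and the function name translate_names are made up for illustration; this is not the actual implementation, and it ignores clashes between a native keyword and an existing identifier.

```python
import io
import tokenize

# Hypothetical, partial keyword table (English -> native); illustration only.
TO_NATIVE = {"def": "definir", "for": "para", "in": "en", "return": "devolver"}
TO_ENGLISH = {native: english for english, native in TO_NATIVE.items()}


def translate_names(source, table):
    """Replace NAME tokens found in `table`; strings and comments are untouched."""
    lines = source.splitlines(keepends=True)
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    # Walk backwards so earlier column offsets stay valid after each splice.
    for tok in reversed(tokens):
        if tok.type == tokenize.NAME and tok.string in table:
            row, col = tok.start
            line = lines[row - 1]
            lines[row - 1] = line[:col] + table[tok.string] + line[tok.end[1]:]
    return "".join(lines)


SAMPLE = "def total(xs):\n    t = 0\n    for x in xs:\n        t += x\n    return t\n"

if __name__ == "__main__":
    shown = translate_names(SAMPLE, TO_NATIVE)    # what the editor would display
    stored = translate_names(shown, TO_ENGLISH)   # what would be written to disk
    print(shown)
    print(stored == SAMPLE)
```

The tokenizer is purely lexical, so it can also read the native-keyword text even though that text is not valid Python; that is what makes the reverse direction possible here.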
Identifiers: supports initially creating identifiers in a native (Unicode) language. When a translation is created later (using easy context menus and dictionary-based guessing support), the .py file is changed to include the English version, and the translation is stored in a parallel .p4n file. Translations of included modules are imported from their .p4n files. All clashes and character set issues are dealt with (using prefixes to escape, for instance) in order to keep the translation reversible at all times. This requires some minor modifications to Python itself for the case when an imported module has added a translation but the calling code has not been edited since then, so it still uses the non-English version of the identifier. This case would be treated with extreme care to prevent security hazards.
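As a rough sketch of the reversibility requirement, the mapping below escapes any on-disk name that would collide with a displayed translation. Everything concrete here is an assumption for illustration: the .p4n layout (a flat JSON object, English to native), the escape prefix "en__", and the function names. It also assumes the table is one-to-one and that no native name starts with the prefix.

```python
import json

ESCAPE = "en__"  # hypothetical escape prefix; not specified anywhere in the proposal


def save_table(path, table):
    # Assumed sidecar layout: a flat JSON object mapping English -> native.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(table, f, ensure_ascii=False, indent=2)


def load_table(path):
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def to_display(name, table):
    """Map an on-disk (English) identifier to what the editor shows."""
    if name in table:
        return table[name]
    # An untranslated name that looks like a translation (or like an escaped
    # name) gets the prefix, so it can never be mistaken for one.
    if name in set(table.values()) or name.startswith(ESCAPE):
        return ESCAPE + name
    return name


def to_disk(shown, table):
    """Inverse of to_display: map a displayed identifier back to disk form."""
    if shown.startswith(ESCAPE):
        return shown[len(ESCAPE):]
    reverse = {native: english for english, native in table.items()}
    return reverse.get(shown, shown)


if __name__ == "__main__":
    table = {"total": "suma", "items": "elementos"}
    for name in ["total", "items", "suma", "en__x", "other"]:
        print(name, "->", to_display(name, table))
```

With this scheme every on-disk name has exactly one displayed form and vice versa, which is the property needed to keep the translation reversible at all times.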
Docstrings and comments: These are either escaped as "code" (using simple ASCII markup) and translated using the same system as above, or translated on a line-by-line basis. Support would exist to do initial translation using some on-line babelfish-like tool.
I think the best way to explain all this is to implement it, so I'm doing so. Ideally, this implementation will be clean, simple, and unobtrusive enough that it will become a standard feature of IDLE, and even inspire inclusion in other IDEs - becoming standard within Python, and available for other languages and human-editable data formats.
P.S. Here's how Mike Fletcher put his proposed solution (my emphasis added; the visual stuff he mentions is a separate issue):
* assume a progression from graphical, to local language, to
  subset-of-English for those who want to become proficient programmers
    o provide tools that let the children put blocks together in a
      reasonably constrained environment
    o show the children what putting the blocks together does in
      their native language
    o let the children edit the source-code in their own language
      to start with
    o allow the children to flip back and forth between
      native-language keywords and English keywords
    o support the children in translating to/from English for code
      they want to publish
* for each component in the programming language (e.g. for loops,
  while loops, variable assignment, etceteras)
    o provide a visual form which implies the operation and allows
      for drag-and-drop operation
    o provide a canonical translation (which is also reserved
      words/symbols/punctuation in the native language)
    o provide a localized description of the operation (preferably
      with some simple lessons on usage, maybe some demos with
      sample code for each one, hook these up in the UI so that
      the child can explore the idea behind the operation)
    o when used, show the (standard) Python code generated by the
* for the special case of identifiers in the user's
    o store all identifiers as transcoded ascii identifiers in
      the source code on initial writing (before translation)
    o display all identifiers as their non-transcoded versions in
    o store the translation matrix in a separate file (to allow
      for multiple translations)
* allow for a localization option that also translates keywords and
    o uses the canonical translation
    o allow the user to input any Unicode identifier as an
      identifier which is not in the canonical translation
    o typing in the localized version of "for" or "class" would
      insert "for" and "class" into the stored code
* identifiers/keywords which have no translations are presented in
  ascii (i.e. all current Python code would show up in English,
  potentially with keywords in Arabic or Thai)
* have a tool that allows for querying e.g. BabelFish to get
  translations from/to a given language for given identifiers
    o allow for exporting any pure-ascii translation (e.g.
      English) to the base file (updating the translation tables
      as we do so)
    o should provide both localization and internationalization
      support (i.e. the same tool would allow the country to load
      a common Python module and produce a translation for it)
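The "transcoded ascii identifiers" item above could be done in many ways; the list does not say which. Purely as an illustration, the sketch below hex-encodes the UTF-8 bytes of a name behind a marker prefix. The prefix "u_" and the function names are assumptions, not part of Mike's proposal.

```python
PREFIX = "u_"  # hypothetical marker for a transcoded identifier


def transcode(name):
    """Return a pure-ASCII identifier that can be stored in the .py file."""
    # Plain ASCII names pass through, unless they already start with the
    # marker; those are encoded too, so decoding is never ambiguous.
    if name.isascii() and not name.startswith(PREFIX):
        return name
    return PREFIX + name.encode("utf-8").hex()


def untranscode(stored):
    """Recover the name the user originally typed."""
    if stored.startswith(PREFIX):
        return bytes.fromhex(stored[len(PREFIX):]).decode("utf-8")
    return stored


if __name__ == "__main__":
    original = "переменная"
    stored = transcode(original)
    print(stored, stored.isidentifier())
    print(untranscode(stored) == original)
```

The stored form is ugly but is only ever seen on disk; the editor would always show the non-transcoded version. untranscode assumes its input came from transcode.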
_______________________________________________ IDLE-dev mailing list IDLE-dev <at> python.org http://mail.python.org/mailman/listinfo/idle-dev