6 Feb 2010 08:13
Writing lexers in Lua
Neil Hodgson <nyamatongwe <at> gmail.com>
2010-02-06 07:13:07 GMT
There is now some experimental support for writing lexers in Lua. The API is similar to the StyleContext class used in LexCPP, although the low-level calls from Accessor are also available. It is very likely that the API will change, and it may still be marked 'experimental' when included in a release of SciTE so that the API can be fixed after more experience. The documentation is at http://www.scintilla.org/ScriptLexer.html

Iteration is by character rather than by byte, and styler.Current(), styler.Next(), and styler.Previous() return strings containing all the bytes of a multi-byte character. If the document is in UTF-8 and starts with "«" then the initial value of styler.Current() is "«", which is the same as "\xc2\xab". If the document is in Latin-1 then "«" is "\xab". This makes it easy to write a lexer for a particular encoding in that encoding, since code can be written naturally, like styler.Match("«"). Lexers for languages that depend on characters outside ASCII for syntax and that have to deal with multiple encodings will be more complex.

The API still uses byte positions for Position() and other calls, since it is costly to convert byte positions to character positions and vice versa.

Another change from previous lexers is that there is an imaginary extra NUL ('\0') character at the end of the document when iterating with styler.More() and styler.Forward(). This makes it easier to treat the end of the document as if it were the end of a line, which means the normal code for determining that an identifier is a keyword will trigger. This avoids the common lexer problem of keywords at the end of the document not being highlighted.
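
The encoding and end-of-document points above can be illustrated with a small Python sketch. This is not the SciTE Lua API: the function names (char_bytes, byte_position, lex) and the keyword set are made up for illustration, and the toy lexer only mimics the idea of an extra NUL at the end of the text.

```python
# Illustrative sketch only; none of these names come from SciTE or Scintilla.

KEYWORDS = {"if", "then", "end"}  # hypothetical keyword set


def char_bytes(ch, encoding):
    """Return the bytes that represent one character in a given encoding."""
    return ch.encode(encoding)


def byte_position(text, char_index, encoding="utf-8"):
    """Byte offset of the character at char_index once text is encoded."""
    return len(text[:char_index].encode(encoding))


def lex(text, sentinel=True):
    """Return (word, style) pairs, where style is 'keyword' or 'identifier'.

    With sentinel=True an imaginary NUL is appended, so a word that runs to
    the very end of the text is still terminated and classified.
    """
    chars = text + "\0" if sentinel else text
    tokens = []
    start = None  # index where the current word began, or None
    for i, ch in enumerate(chars):
        if ch.isalpha():
            if start is None:
                start = i
        elif start is not None:
            # A non-word character ends the word, so classify it here.
            word = chars[start:i]
            style = "keyword" if word in KEYWORDS else "identifier"
            tokens.append((word, style))
            start = None
    return tokens


if __name__ == "__main__":
    print(char_bytes("«", "utf-8"))      # two bytes in UTF-8
    print(char_bytes("«", "latin-1"))    # one byte in Latin-1
    print(byte_position("a«b", 2))       # byte offset of "b" in UTF-8
    print(lex("if x then y end"))
    print(lex("if x then y end", sentinel=False))
```

Without the sentinel, the final "end" never meets a terminating character, so the toy lexer never classifies it. That is the failure the extra NUL is meant to avoid.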
Thank you Neil.
Philippe:
> I haven't commented on this one, but sometime I think such feature can
> be useful.
> [...snip...]
> Having that controlled by option is OK.
Thank you for the feedback.
Lorenzo