Armel Asselin | 1 Apr 2007 19:22

Re: Komodo's user defined lexers

From: "Neil Hodgson" <nyamatongwe <at> gmail.com>
> Armel Asselin:
>
>> My idea is that lexing states could be stored (approximately) each N
>> bytes/characters and lexing could be only pure-dynamic.
>
>   It may be an interesting idea to explore. Cutting every N
> characters leads to more states since, for example, N may split the
> '/' and '*' in a C comment start. The current approach which only
> lexes whole lines allows lexers to handle these short sequences easily.
> The lexer could report the last simple resynchronization point before
> N thus shortening the segment.
That is partly why I say _approximately_ N: the distance between
successive lexing states could be anywhere in [N/2; N*2[, for example, with N
fairly large (say around 1000). That helps with what you describe, and also
with insertion/deletion: the states stored _after_ the last operation point
could be kept, so that we reparse from the last operation point and compare
the result with the next previously stored state. If they are equal, full
reparsing is avoided, which could help with large files. The technique could
be extended to re-synchronize at any later position with a matching lexing
state.
I am thinking of an array of (distance-to-prev; lex-state) entries, indexed by
cumulative positions per packet of K entries, or an array of
(abs-position; lex-state) entries indexed directly on the abs-position field.
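
A minimal sketch of the (abs-position; lex-state) variant, in Python for brevity. It is only an illustration of the idea above, not Scintilla code: the toy block-comment lexer, the names `step`, `build_checkpoints` and `relex_after_edit`, and the fixed spacing `n` are all assumptions made for the example. It also assumes that the lexer state at a position fully determines how the rest of the text is lexed, which is what makes "same state at a stored checkpoint" a safe point to stop.

```python
from bisect import bisect_right

# Toy lexer states for C block comments; SLASH and STAR hold a half-seen
# "/*" or "*/", so a checkpoint may fall between the two characters.
CODE, SLASH, COMMENT, STAR = range(4)

def step(state, ch):
    """Advance the toy lexer by one character."""
    if state == CODE:
        return SLASH if ch == "/" else CODE
    if state == SLASH:
        if ch == "*":
            return COMMENT
        return SLASH if ch == "/" else CODE
    if state == COMMENT:
        return STAR if ch == "*" else COMMENT
    # STAR: inside a comment, previous character was '*'
    if ch == "/":
        return CODE
    return STAR if ch == "*" else COMMENT

def lex(text, start, state, stop):
    """Lex text[start:stop] beginning in `state`; return the final state."""
    for i in range(start, stop):
        state = step(state, text[i])
    return state

def build_checkpoints(text, n):
    """Return [(abs_position, state_before_that_position)] every n characters."""
    checkpoints, state = [], CODE
    for i, ch in enumerate(text):
        if i % n == 0:
            checkpoints.append((i, state))
        state = step(state, ch)
    return checkpoints

def relex_after_edit(checkpoints, new_text, edit_pos, removed, inserted):
    """Update checkpoints after `removed` chars at edit_pos were replaced
    by `inserted` chars. Returns (new_checkpoints, chars_lexed)."""
    delta = inserted - removed
    positions = [p for p, _ in checkpoints]
    # The last checkpoint at or before the edit is still valid.
    k = bisect_right(positions, edit_pos) - 1
    pos, state = checkpoints[k]
    new_cps = checkpoints[:k + 1]
    lexed = 0
    for j in range(k + 1, len(checkpoints)):
        old_pos, old_state = checkpoints[j]
        new_pos = old_pos + delta
        if old_pos < edit_pos + removed or new_pos <= pos:
            continue  # checkpoint was inside the deleted text
        state = lex(new_text, pos, state, new_pos)
        lexed += new_pos - pos
        pos = new_pos
        if state == old_state:
            # Resynchronized: later checkpoints stay valid, only shifted.
            new_cps.extend((p + delta, s) for p, s in checkpoints[j:])
            return new_cps, lexed
        new_cps.append((pos, state))
    return new_cps, lexed
```

After an edit the spacing between checkpoints drifts away from `n`, which is where the [N/2; N*2[ tolerance would come in: a real implementation would split or merge entries when a gap leaves that interval.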

Armel
Neil Hodgson | 2 Apr 2007 00:37

Re: SciTE Quick Reference

Jeffrey Sabarese:

> hello. this is to say that i've been compiling my own sort of "cheat
> sheet" SciTE quick reference.

   There is a SciTE mailing list at
http://mailman.lyra.org/mailman/listinfo/scite-interest

   This list is for programming with the Scintilla component.

   Neil
Neil Hodgson | 2 Apr 2007 01:32

Decorations

   Projects based on Scintilla have been using indicators to mark
areas of text that contain errors, search matches, URL hotspots,
misspellings, etc. These areas are discovered through some user
initiated command or by running a background process rather than as
part of lexing. Such marks are often fairly sparse: an HTML syntax
checker may show 5 errors in a whole file. It's a little complex using
indicators in this way as the lexing state has to be preserved while
setting the indicator and there are only 1 to 3 bits available for this
state.

   I'm thinking of adding the capability to have a set of indicators
that are not stored in the styling bytes but rather in a compact
format similar to run length encoding. They may be called "Extended
Indicators" or "Decorations". The implementation would be based on
SinkWorld's RunStyles class. The existing indicator drawing code would
be reused with the same styles available for each type.
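
To make "a compact format similar to run length encoding" concrete, here is a small Python sketch. It is not the interface of SinkWorld's RunStyles class; the class name `RunMap` and the methods `fill`, `value_at` and `runs` are hypothetical, chosen only to show why sparse marks cost a handful of entries rather than one byte per character.

```python
from bisect import bisect_right

class RunMap:
    """Values over [0, length) stored as runs.

    starts[i] is where run i begins and values[i] is its value; a run ends
    where the next one starts. 0 means "no decoration".
    """

    def __init__(self, length):
        self.length = length
        self.starts = [0]
        self.values = [0]

    def value_at(self, pos):
        """Value at a position, found by binary search over run starts."""
        return self.values[bisect_right(self.starts, pos) - 1]

    def _split(self, pos):
        """Make sure a run starts exactly at pos; return that run's index."""
        i = bisect_right(self.starts, pos) - 1
        if self.starts[i] == pos:
            return i
        self.starts.insert(i + 1, pos)
        self.values.insert(i + 1, self.values[i])
        return i + 1

    def fill(self, start, end, value):
        """Set [start, end) to value, then merge equal neighbouring runs."""
        if start >= end:
            return
        i = self._split(start)
        j = self._split(end) if end < self.length else len(self.starts)
        self.starts[i:j] = [start]
        self.values[i:j] = [value]
        merged_starts, merged_values = [self.starts[0]], [self.values[0]]
        for s, v in zip(self.starts[1:], self.values[1:]):
            if v != merged_values[-1]:
                merged_starts.append(s)
                merged_values.append(v)
        self.starts, self.values = merged_starts, merged_values

    def runs(self):
        """List of (start, end, value) for every non-zero run."""
        ends = self.starts[1:] + [self.length]
        return [(s, e, v)
                for s, e, v in zip(self.starts, ends, self.values) if v]
```

Marking two short error ranges in a 1000-character document with `fill(10, 20, 1)` and `fill(500, 505, 1)` leaves five runs in total, however long the document is.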

   A decoration would be reasonably stable: inserting or deleting text
outside a decoration leaves it alone; text inserted inside a decoration
takes on the decoration, and text inserted at its edges does not.
Decorations would not be remembered in the undo history.
Since they are not lexer based there wouldn't be any 'decorations
valid' range remembered by Scintilla nor an equivalent to
SCN_STYLENEEDED. Containers would manage this themselves through
SCN_MODIFIED.
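
The stability rules above can be written down as a small Python sketch over a single decorated range [start, end). The function names are hypothetical. The insertion cases follow the description; how a deletion that overlaps a decoration should behave is not stated, so clipping the range is an assumption of this sketch.

```python
def after_insert(start, end, pos, length):
    """Decorated range [start, end) after inserting `length` chars at pos."""
    if pos <= start:
        # Before the decoration or at its leading edge: it only moves.
        return start + length, end + length
    if pos < end:
        # Strictly inside: the inserted text is decorated too.
        return start, end + length
    # At the trailing edge or later: nothing changes.
    return start, end

def after_delete(start, end, pos, length):
    """Decorated range [start, end) after deleting [pos, pos + length).

    Assumption: an overlapping deletion clips the decoration; an empty
    result (start == end) means the decoration has disappeared.
    """
    del_end = pos + length

    def moved(p):
        if p <= pos:
            return p            # before the deleted text
        if p < del_end:
            return pos          # inside the deleted text
        return p - length       # after the deleted text

    return moved(start), moved(end)
```

"Leaves it alone" is read here as "still covers the same characters": an edit earlier in the document changes the decoration's offsets but not the text it marks.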

   Decorations could either be implemented completely in the view
layer (simpler but only visible in one view) or with the locations
stored in the document and visual appearance in the view. There would
be a defined drawing order and some could be drawn underneath the text
[...]

Franz Steinhaeusler | 2 Apr 2007 08:32

Re: SciTE Quick Reference

On Sat, 31 Mar 2007 02:40:12 -0400, "Jeffrey Sabarese" <core5fedora <at> gmail.com> wrote:

>[...]
>
>the URL is:
>http://novicenotes.com/software/scite-quick-reference/
>
>[...]
Nice Info, thanks!

-- 

Franz Steinhaeusler
Armel Asselin | 2 Apr 2007 19:17

Re: Decorations

>   Projects based on Scintilla have been using indicators to mark
> areas of text that contain errors, search matches, URL hotspots,
> misspellings, etc. These areas are discovered through some user
> initiated command or by running a background process rather than as
> part of lexing. Such marks are often fairly sparse: an HTML syntax
> checker may show 5 errors in a whole file. It's a little complex using
> indicators in this way as the lexing state has to be preserved while
> setting the indicator and there are only 1 to 3 bits available for this
> state.
>
>   I'm thinking of adding the capability to have a set of indicators
> that are not stored in the styling bytes but rather in a compact
> format similar to run length encoding. They may be called "Extended
> Indicators" or "Decorations". The implementation would be based on
> SinkWorld's RunStyles class. The existing indicator drawing code would
> be reused with the same styles available for each type.
>
>   A decoration would be reasonably stable: inserting or deleting text
> outside a decoration leaves it alone; insertions inside a decoration
> show the decoration, and insertions at the edges don't show the
> decoration. Decorations would not be remembered in the undo history.
> Since they are not lexer based there wouldn't be any 'decorations
> valid' range remembered by Scintilla nor an equivalent to
> SCN_STYLENEEDED. Containers would manage this themselves through
> SCN_MODIFIED.
>
>   Decorations could either be implemented completely in the view
> layer (simpler but only visible in one view) or with the locations
> stored in the document and visual appearance in the view. There would
> be a defined drawing order and some could be drawn underneath the text
[...]

Josiah Carlson | 2 Apr 2007 20:08

Re: Decorations


"Neil Hodgson" <nyamatongwe <at> gmail.com> wrote:
>    Projects based on Scintilla have been using indicators to mark
> areas of text that contain errors, search matches, URL hotspots,
> misspellings, etc. These areas are discovered through some user
> initiated command or by running a background process rather than as
> part of lexing. Such marks are often fairly sparse: an HTML syntax
> checker may show 5 errors in a whole file. It's a little complex using
> indicators in this way as the lexing state has to be preserved while
> setting the indicator and there are only 1 to 3 bits available for this
> state.
[snip]

Sounds great to me.  Allowing so many indicators would allow for syntax
highlighting while also supporting SubEthaEdit-like pair/group
programming (something I've been looking to add to my scintilla-based
editor for quite a while), along with different colors for console
stdout/stderr/expected input in shells (also something I've been looking
to add for a while).

Thank you for considering this.

 - Josiah
Simon Steele | 2 Apr 2007 23:30

Re: Decorations

Neil,

> I'm thinking of adding the capability to have a set of indicators
> that are not stored in the styling bytes but rather in a compact format
> similar to run length encoding. They may be called "Extended Indicators"
> or "Decorations". The implementation would be based on SinkWorld's
> RunStyles class. The existing indicator drawing code would
> be reused with the same styles available for each type.

Sounds excellent. The current system can be complicated to use and very
restrictive when using complicated lexers like the hypertext one.

As far as storage goes, I can't see a view with >32 indicators as anything
other than disastrous from a usability point of view, so surely a 32-bit
field would be adequate?

I'm presuming lexers will be able to set decorations, and that decorations
will also be available via an API. In that case, would it not be better to
store the state with the document? I don't think we currently lex multiple
times for multiple views.

Simon.
Shane Caraveo | 3 Apr 2007 00:31

Re: Decorations


Josiah Carlson wrote:
> "Neil Hodgson" <nyamatongwe <at> gmail.com> wrote:
>>    Projects based on Scintilla have been using indicators to mark
>> areas of text that contain errors, search matches, URL hotspots,
>> misspellings, etc. These areas are discovered through some user
>> initiated command or by running a background process rather than as
>> part of lexing. Such marks are often fairly sparse: an HTML syntax
>> checker may show 5 errors in a whole file. It's a little complex using
>> indicators in this way as the lexing state has to be preserved while
>> setting the indicator and there are only 1 to 3 bits available for this
>> state.
> [snip]
> 
> Sounds great to me.  Allowing so many indicators would allow for syntax
> highlighting while also supporting SubEthaEdit-like pair/group
> programming (something I've been looking to add to my scintilla-based
> editor for quite a while), along with different colors for console
> stdout/stderr/expected input in shells (also something I've been looking
> to add for a while).

That would be great. We have some hacky patches to get coloring for 
stdout/err/in that do not always work well.

Shane
Eric Promislow | 3 Apr 2007 02:04

Re: Decorations

Sounds fine for our needs.  We'd prefer to have the decorations
associated with the document -- it's easier for apps that
can provide more than one concurrent view to a document, and
it then pushes the responsibility for managing multiple views
to each app, which makes sense to me.

The SCN_MODIFIED event works as long as getting the list of decorations
in a range (which might be one byte long) is efficient.
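
As a rough illustration of what "efficient" could mean here, the Python sketch below assumes each decoration keeps a sorted list of non-overlapping (start, end) ranges, so a query costs one binary search per decoration. The layout and the name `decorations_in_range` are assumptions for the example, not a proposed Scintilla API.

```python
from bisect import bisect_right

def decorations_in_range(store, start, end):
    """Return the decoration numbers that touch [start, end).

    `store` maps decoration number -> sorted list of non-overlapping
    (range_start, range_end) tuples, with range_end exclusive.
    """
    hits = []
    for number, ranges in store.items():
        # Index of the first range that starts after `start`.
        i = bisect_right(ranges, (start, float("inf")))
        if i > 0 and ranges[i - 1][1] > start:
            hits.append(number)   # a range beginning at or before start covers it
        elif i < len(ranges) and ranges[i][0] < end:
            hits.append(number)   # a range begins inside the queried span
    return sorted(hits)
```

A container handling a one-byte modification at position p would call this with `(p, p + 1)`.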

- Eric

Neil Hodgson wrote:
>   Projects based on Scintilla have been using indicators to mark
> areas of text that contain errors, search matches, URL hotspots,
> misspellings, etc. These areas are discovered through some user
> initiated command or by running a background process rather than as
> part of lexing. Such marks are often fairly sparse: an HTML syntax
> checker may show 5 errors in a whole file. It's a little complex using
> indicators in this way as the lexing state has to be preserved while
> setting the indicator and there are only 1 to 3 bits available for this
> state.
> 
>   I'm thinking of adding the capability to have a set of indicators
> that are not stored in the styling bytes but rather in a compact
> format similar to run length encoding. They may be called "Extended
> Indicators" or "Decorations". The implementation would be based on
> SinkWorld's RunStyles class. The existing indicator drawing code would
> be reused with the same styles available for each type.
> 
>   A decoration would be reasonably stable: inserting or deleting text
[...]

Neil Hodgson | 3 Apr 2007 02:22

Re: Decorations

Simon Steele:

> As far as storage, I can't see a view with >32 indicators as anything
> other than disastrous from a usability point of view so surely a 32-bit
> field would be adequate?

   I was thinking that it could be allocated for different purposes
since you don't want to have a conflict between lexers and containers.
The first 8 bits could be for lexers. This includes the existing 3
bits currently available to lexers. For a transition period,
indicators 0, 1, and 2 are still stored in the style bytes but will
be treated as equivalent by the new API. Lexers are then updated to
use bits 3 to 7 and after the transition, bits 0, 1, and 2 use
RunStyles. If anyone is using indicators for dense styles that are not
suitable for RunStyles (such as marking the start of every lexeme)
then they should comment.
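
The allocation described above can be summarised in a few lines of Python. The constant and function names are hypothetical, and treating every indicator from 8 upwards as belonging to the container is an assumption; the message only says that the first 8 could be reserved for lexers.

```python
# "The first 8 bits could be for lexers."
LEXER_INDICATORS = range(0, 8)
# The 3 indicators that lexers can already reach through the style bytes.
STYLE_BYTE_INDICATORS = range(0, 3)

def owner(indicator):
    """Who may set this indicator (assumption: 8 and above go to containers)."""
    return "lexer" if indicator in LEXER_INDICATORS else "container"

def storage(indicator, in_transition):
    """Where the indicator's data lives, during or after the transition."""
    if in_transition and indicator in STYLE_BYTE_INDICATORS:
        return "style bytes"
    return "run-length store"
```

Under this reading a lexer updated during the transition would use indicators 3 to 7, which are in the run-length store from the start, and indicators 0 to 2 would move over once the transition ends.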

   The order of drawing may be important for some decorations - such
as highlighting the spelling mistake under the cursor while retaining
the original mistake decorations. There is also the possibility of a
multi-valued decoration: maybe you have 5 levels of issue severity.
While this can be represented as 5 separate decorations, it may be
better to treat it as a simple decoration with 6 values (including 0
as none).

> I'm presuming lexers will be able to set decorations as well as making
> them available via an API. In this case would it not be better to store
> the state with the document, I don't think we currently lex multiple times
> for multiple views?

[...]

