Grant Beaty | 1 Apr 2008 12:35

[groovy-dev] Ranges

I apologize if this is a known issue, but I couldn't find an open issue for it.

On 1.5.4, ranges don't seem to work very well for a few reasons:

1) "4.5 in (4.0..6.0)" evaluates to false, probably because non-integer ranges are ObjectRanges.

2) It seems like all the ranges create a list of iterative objects inside of them, so the above check fails. I was using a 0..1000000 range, and IntelliJ kept crashing when trying to call toString() on it (although sometimes the toString() method seems to work properly, other times it has acted like toListString(); I'm not sure why). IntelliJ's debugger showed one million java.lang.Integers in the ObjectRange.

3) Aside from memory issues, the big ranges tend to cause performance problems too. For example,
def a = 10.0..1000000.0
"2.2 in a" evaluates with a noticeable delay (although "2.0 in a" is quick). This seems to be O(n); maybe we could use a binary search on Comparable objects for better performance here?

4) It would also be really cool to have an infinite value, so ranges without upper or lower bounds could be dealt with normally and not as special cases. e.g.,
-inf..10
24..inf
...or something like that. I'm not sure exactly what is possible with DSLs, though.

5) Fractions in IntRanges also evaluate to false. e.g.,
3.0 in 2..5 == false

It seems to me that we need a range implementation for floats and BigDecimals? Or maybe something is just broken? I'd be happy to try and implement the above changes, but it would be my first time committing to any sort of programming language, so I can't make any promises on timeliness or efficacy.
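
For illustration only, the interval-style membership asked for in points 1, 3, 4 and 5 can be sketched with two comparisons instead of a walk over the elements. The `within` closure below is a hypothetical helper, not something Groovy provides, and it relies on Groovy's mixed-type number comparison. The unbounded cases use the standard `Double` infinities as a stand-in, since no `inf` syntax exists.

```groovy
// Hypothetical helper (not a Groovy API): closed-interval membership
// decided by two comparisons against the bounds, with no iteration.
def within = { lo, hi, value -> lo <= value && value <= hi }

assert within(4.0, 6.0, 4.5)          // point 1: fraction inside a decimal interval
assert within(2, 5, 3.0)              // point 5: decimal value, integer bounds
assert !within(10.0, 1000000.0, 2.2)  // point 3: rejected without iterating

// Point 4: unbounded intervals, using the standard Double infinities.
assert within(Double.NEGATIVE_INFINITY, 10, -1e9)
assert within(24, Double.POSITIVE_INFINITY, 1e12)
```

Because nothing is enumerated, the cost does not depend on how far apart the bounds are.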

-Grant

Paul King | 1 Apr 2008 12:52

Re: [groovy-dev] Ranges


Hi Grant, there are quite a few things we could do to improve
Ranges. Perhaps even including some of the things you discuss
below. At the moment, the base class for ranges, Range extends
List, so the range abstraction is just a shorthand for creating
a discrete list rather than an interval of some kind. There
is a little bit of recognition that sometimes interval-like
things are what we are interested in, and the containsWithinBounds
method represents this use case.

def a = 10.0..1000000.0
assert !a.contains(22.2)
assert a.containsWithinBounds(22.2)

But this doesn't cover all the cases you mentioned.
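
A smaller variant of the same example may make the list nature of a range easier to see. This is a sketch that assumes the range steps by one from its lower bound, as the ranges in this thread do. The bounds 1.0 and 5.0 are chosen only to keep the list short.

```groovy
def r = 1.0..5.0

// As a List, the range holds only the values reached by stepping by one.
assert r.toList() == [1.0, 2.0, 3.0, 4.0, 5.0]
assert r.contains(3.0)
assert !r.contains(2.5)

// As an interval, anything between the bounds counts.
assert r.containsWithinBounds(2.5)
assert !r.containsWithinBounds(5.5)
```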

Cheers,
Paul.

---------------------------------------------------------------------
To unsubscribe from this list, please visit:

    http://xircles.codehaus.org/manage_email

Grant Beaty | 1 Apr 2008 13:06

Re: [groovy-dev] Ranges

Yes, I realized just after I'd sent the email off that I was only testing with the 'in' keyword. Range.containsWithinBounds() certainly works how I'd expect.

It would seem to me that it should be the default behavior with 'in', which I think would mean that you won't need the List anymore? But I suppose that would break compatibility?

-Grant

Paul King | 1 Apr 2008 13:35

Re: [groovy-dev] Ranges


The 'in' behavior is currently always linked with isCase()
and much of the List behavior is linked in with iterating
over ranges. So, neither is easy to get rid of.
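
To make the isCase() link concrete: `in` (and `switch`) dispatch to an `isCase` method on the right-hand operand, so interval semantics can be had from a separate type without touching Range. The `Interval` class below is hypothetical and only a sketch of that idea, not a proposal for the actual implementation.

```groovy
// Hypothetical class: a closed interval that is not a List.
class Interval {
    def lo, hi
    // Called by the 'in' operator and by switch/case.
    boolean isCase(value) { lo <= value && value <= hi }
}

def interval = new Interval(lo: 4.0, hi: 6.0)
assert 4.5 in interval
assert !(7 in interval)

def label
switch (5.5) {
    case interval: label = 'inside'; break
    default: label = 'outside'
}
assert label == 'inside'
```

Since such a type never pretends to be a List, it avoids the iteration question entirely, at the price of not being usable where a Range is expected.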

Paul.

Paul King | 1 Apr 2008 13:40

Re: [groovy-dev] Re: [groovy-user] triple slashy string user feedback


OK, I updated the proposal page to reflect the current
state of the discussion:

http://docs.codehaus.org/display/GroovyJSR/Groovy+String+Handling

Feedback welcome.

Cheers, Paul.

Paul King wrote:
> Jochen Theodorou wrote:
>> The use cases our current strings are not able to meet are embedding 
>> of groovy programs in a string without escaping and multiline regexpr. 
>> And for the first I must say that the programs may become a problem 
>> when I have to use strings in them and when I copy&paste the program 
>> from somewhere else. But as our eval is not as powerful as a Ruby eval 
>> these embedded programs have less abilities and as such are less 
>> useful. As a consequence ''' and """ do satisfy 90% of the needs here 
>> already. The multiline regexpr seems to be the only reason for me to 
>> add yet another heredoc string. And so I think the proposal should 
>> target this as its primary goal.
> 
> I believe if this is catered for, we will cover the embedded snippets
> anyway.
> 
>> [...] Looking at the whole picture does not mean making each feature 
>> equal.
> 
> Sure, but I wanted to give others the opportunity to present
> use cases for features they believe are important.
> 
>> [...]
>>> and it also leaves open the possibility of
>>>
>>> def greeting = $'hello'$
>>>
>>> which is kind of nice but perhaps not really needed.
>>>
>>> I guess the trailing $ doesn't add much, so perhaps this would do:
>>>
>>> def greeting = $"
>>> hello, $name
>>> "
>>
>> but then you have to escape ", while in the other version you
>> theoretically do not have to escape anything.
> 
> You would need to escape it when it appears on a line by itself and
> if you follow your suggestion of allowing any characters, then we could
> have "" as the ident, e.g.:
> 
> def greeting = $"""
> hello, $name
> """
> 
> :-)
> 
>> There is also another advantage and that is the usage of this kind of 
>> string in expressions:
>>
>> def greeting = $"
>>  hello, $name
>> "$.trim()
>>
>> The heredoc I suggested above (not the <<(<) version) does work in 
>> assignments, declarations and method calls, but you can not make it 
>> part of a path expression
>>
>>>> this way anything that follows $' can be used, not only identifiers. 
>>>> I would also reduce the four variants to two, because we already 
>>>> have ''' and """.
>>>
>>> That is reasonable but gives us reduced flexibility in that we couldn't
>>> do special first/last line handling and couldn't have ''' or """ in such
>>> strings.
>>
>> the question is if that is needed.
> 
> For me, this is certainly in the diminishing returns category.
> 
>>>> These two would both not support java escaping (\n is "\\n") and 
>>>> would be different only in allowing GStrings or not. And then it 
>>>> seems logical to do $" (with GString) and $' without.. maybe I would 
>>>> use /' and /" instead to let it more look like the slashy string but 
>>>> that seems not to have any value of recognition, but it is the same 
>>>> for me with $' I am more used to << for here docs.... meaning <<' 
>>>> and <<" or << and <<<. of course then a here doc can not be used in 
>>>> a method call without parentheses, but that shouldn't be much of a 
>>>> deficit.
>>>
>>> I guess all of these are possible - though some look a little 
>>> troublesome
>>> to me. I guess the proposals as they currently stand try to keep the
>>> current flavor of Groovy strings as much as possible rather than try
>>> to mimic PERL (or another language) implementation of here docs.
>>
>> true, but the usage of << is very common and has a value of 
>> recognition, that is very important. the $', $", $/, $< versions have 
>> an optical clash with GStrings. of course they are not, but they look 
>> a bit like them, especially with an identifier. Also it seems only 
>> Powershell has strings that look a bit like that. Perl, PHP, most Unix 
>> shells and Ruby do use <<. Python uses this r notation, that we can 
>> not really use, because it clashes with a method call. but again, this 
>> is syntax, we need to concentrate on functionality first
> 
> Yes, I certainly have no problem if we have only 1 or 2 here docs
> variations with using one of your suggestions. The previous choices
> were to mentally link back to single line variations to make it easy
> to remember what escaping and GString handling would be in place
> but if we are not going with all 4 variations, then another scheme
> makes sense.
> 
>> bye blackdrag
Martin Kempf | 1 Apr 2008 15:48

Re: [groovy-dev] adding parenthesis information to the CST

Jeremy Rayner schrieb:
> On Fri, Mar 28, 2008 at 4:30 PM, Jochen Theodorou <blackdrag@...> wrote:
>   
>>  GROOVY-259 and GROOVY-2697 both suggest adding parenthesis information
>>  to the CST.
>>     
>
> This additional information would be fantastic and a much needed
> addition to the CST.
>
> Martin's before/after jpg's look good, but I wonder if we might go one step
> further and create a PAREN_EXPR token type, rather than a plain EXPR token type
> for quick identification and to make it easier on the SourcePrinter.
>
>   
We've added the new EXPR to have proper source position information
(that is, including the parentheses). This was needed to be able to
find out whether parentheses were used by looking at the source file.
Now, if the PAREN_EXPR token type were converted into a new ASTNode
(e.g. ParenthesisedExpression) with a reference to the enclosed
expression, there would be no need to look in the source file, and
we would appreciate that. Or is the intention to add parenthesis
information only to the CST?
> As mentioned on page 31 of http://groovy.javanicus.com/groovydevcon3/
> we have/had this
> and some other limitations on the CST...
>
> * No Comments
>   
In our bachelor thesis we are planning to handle refactoring with
comments, meaning no loss of comments during refactoring. So the handling
of comments in the grammar / CST / AST will affect us too. Our
supervising professor has written an experience report about comment
handling during refactoring:
http://wiki.hsr.ch/PeterSommerlad/wiki.cgi?Oopsla08CommentHandling
Several working and non-working solutions are explained and it can be
used as a guideline. Any thoughts on this comment issue are welcome to
find the best solution for Groovy.
> * No Semicolons
> * No Brackets (also bear in mind casting and other paren expressions
> that shouldn't fall under this...)
> * Indistinguishable Regex and String Literals (single, double, triple)
> * Character and Unicode escapes stored as already resolved, e.g. \t
> * public foo(bar) {}      equiv to       public foo(def bar) {}
>   
Every resolved indistinguishability in the AST would help. E.g. the String
literals are always represented by a ConstantExpression in the AST. To
find out which quotation was used, a look at the source file is needed. An
attribute in the ConstantExpression or even different nodes would make
the source file lookup obsolete.
> This would not affect the AST, but there are different tools
>  that do need updates. That is the maven joint compiler I think, the
>  groovy doc tool and of course our APP. Not sure if others are depending
>  on our grammar too.
>   
>
> * The SourceCodeTraversal will need to change slightly to accommodate
> this change
> * The Visitor interface and adapter will need the new PAREN_EXPR visit
> method added
> * The SourcePrinter will need to print parens
> * The SourcePrinter tests will need to have some cases added for the
> parens, to check it all works
>
> * Groovydoc will be unchanged internally as it is just another visitor
> for the above SourceCodeTraversal (hey, we getting reuse outta this thing ;-) )
>
> * I don't think it would affect JetGroovy, as they use flex and a
> custom parser, it may have
> a knock effect for the Eclipse plugin and Netbeans plugin, not sure...
>
> * APP obviously would need a good going over too...
>
>   
>>  so what kind of policy should we have here? Jez, your parts are affected
>>  too, maybe you want to say something.
>>     
>
> Well this kind of change doesn't affect the usage of the groovy grammar by our
> beloved users, it also is hidden (mostly) from the runtime once it has
> been laundered
> and mutated by the AntlrParserPlugin into the AST, so it is fairly
> clean at both ends.
>   
Not anymore when there is a new ParenthesisedExpression. This might be a 
disadvantage.
> I'd certainly suggest we add these features to the CST one at a time,
> and ensuring we
> catch all the ripples in CST traversals, visitors and APP.
>
>   
We agree with that too. Is it also possible to have these features in the
AST? In this case, some visitors would need to be adapted.

martin

Jochen Theodorou | 1 Apr 2008 17:40

Re: [groovy-dev] adding parenthesis information to the CST

Martin Kempf schrieb:
> Jeremy Rayner schrieb:
>> On Fri, Mar 28, 2008 at 4:30 PM, Jochen Theodorou <blackdrag@...> 
>> wrote:
>>  
>>>  GROOVY-259 and GROOVY-2697 both suggest adding parenthesis information
>>>  to the CST.
>>
>> This additional information would be fantastic and a much needed
>> addition to the CST.
>>
>> Martin's before/after jpg's look good, but I wonder if we might go one 
>> step
>> further and create a PAREN_EXPR token type, rather than a plain EXPR 
>> token type
>> for quick identification and to make it easier on the SourcePrinter.
>   
> We've added the new EXPR to have proper source position information
> (that is, including the parentheses). This was needed to be able to
> find out whether parentheses were used by looking at the source file.
> Now, if the PAREN_EXPR token type were converted into a new ASTNode
> (e.g. ParenthesisedExpression) with a reference to the enclosed
> expression, there would be no need to look in the source file, and
> we would appreciate that. Or is the intention to add parenthesis
> information only to the CST?

changing the CST is one thing, changing the AST is another. While we can
probably do the CST change for 1.5.5, we surely can not do the AST
change for 1.5.5. Many tools may have to be rewritten, because they
expect a certain structure in the AST. If the code is using our
visitors, then there might not be much of a problem, but if not, then
there probably is one. Strictly speaking, the AST is no direct
representation of the source, it is an abstract representation of the
source that is used to attach the semantics. So it is quite possible that
for different CSTs you might get the same AST. So strictly speaking, any
refactoring tool has to work with the CST and AST. The AST to know what
to do and the CST to know how to do it. And that means the AST has no
comments, no parens, no whitespace, and lacks other things as well.

Instead I think it would be much more interesting to have CST and AST 
linked so that you can easily navigate in both. Due to the mutating 
nature of the AST I guess this is only doable as AST->CST, not the other
way, but I also think that this would be enough already.

bye blackdrag

--

-- 
Jochen "blackdrag" Theodorou
The Groovy Project Tech Lead (http://groovy.codehaus.org)
http://blackdragsview.blogspot.com/
http://www.g2one.com/

Jim White | 1 Apr 2008 21:50

Re: [groovy-dev] triple slashy string dev feedback

Paul King wrote:

> Jim White wrote:
> 
>... 
>> I think the thing to do about slashy string is patch it as I suggested 
>> in order to allow ending \ (and it already allows ending /).
> 
> This still seems troublesome to me. A wart on a wart? The current
> workarounds seem as simple to explain as what the revised escaping
> would require. But I am not totally against it, just cautious
> because it doesn't feel Groovy.
> 
>> Switching to / as an escape seems like the necessary thing to do 
>> AFAICT, but I don't see how to do that short of Groovy 2.
> 
> So, we can allow // as an additional escape in 1.6 and remove \/ in 2.0?

Hmmm, I guess that would work as the only breaking case I can think of 
at the moment is fairly degenerate:

def s = /foo/// // Would change from 'foo' and a comment to 'foo/'.

But it can't be a general delimiter:

def s = /foo/ // I wouldn't be a comment anymore...

So the delimiter can only be used to escape the delimiter.

And I see $ is the $ escape.  Or not.

def x = 'z'
[/$x$$x/]
==>
[z$z]

I expect:
[/$x$$x/]
==>
[z$$x]

Very fishy.

So I think the slashy escapes are:

$$ -> $
// -> /   (Groovy 1.6)
\eol  -> empty (line concatenation)
$/ -> $end
\/ -> /
\\/ -> \end (n > 1 * \ -> (n - 1) * \)

Both the first and last qualify for 1.5.5 as bug fixes imo.

And I don't have a problem with deprecating the last two in Groovy 2.

The $ business in slashy strings bugs me.  I would much rather have had 
slashy string *not* be GString.  As I see it the reason for slashy is to 
not have non-regex escape confusion in regex strings, but by supporting 
GString (some of) the complexity is brought right back in (yes I know $ 
appears infrequently in regex other than last position but I still see 
that as supporting the position against).

>> For myself, the slashy escape problem has been such that I just never 
>> use slashy if a string needs escapes (because after all, if I need 
>> escapes I might as well use ' because there is no question what I need 
>> to do).
>>
>> And I hesitate to bring this up, but another feature that I think 
>> should be considered for slashy strings is that substitution quote the 
>> value.  Unfortunately there are two different quote functions, so I'm 
>> not sure this is practical.
>>
>> http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Pattern.html#quote(java.lang.String)
>>
>> http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Matcher.html#quoteReplacement(java.lang.String)
> 
> Are you talking about replaceAll that has a GString as the second argument?
> I wasn't sure.

No, I meant that when a GString substitution was done in a slashy string 
that the value is quoted before being inserted.

That would be groovy because it would almost always be the right thing 
to do and people often leave it out, usually because they either don't 
think of it or because they think that they don't need it (until it 
unexpectedly breaks when real world data winds up in their variables).

An example:

def v = getSomethingFromTheDatabase()

def m = ~/Name:$v/

is an antipattern unless there is a contract that v will never contain a 
character that needs to be escaped in a regex.

The currently correct way to do that is:

def m = ~/Name:${Pattern.quote(v)}/

or the equivalent.
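
A small runnable version of the two patterns above shows the difference. The value 'a.b' is a made-up stand-in for whatever getSomethingFromTheDatabase() might return.

```groovy
import java.util.regex.Pattern

def v = 'a.b'   // made-up value containing a regex metacharacter

def unquoted = ~/Name:$v/
def quoted = ~/Name:${Pattern.quote(v)}/

// Unquoted, the '.' in v matches any character.
assert 'Name:a.b' ==~ unquoted
assert 'Name:aXb' ==~ unquoted

// Quoted, v is matched literally.
assert 'Name:a.b' ==~ quoted
assert !('Name:aXb' ==~ quoted)
```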

Although I have to admit one of the few cases of Groovy code using 
slashy GString substitution is in GinA and it actually uses it to build 
up a regex out of pieces, for which you do *not* want quoting.  But I 
wonder whether that is typical of "real world" Groovy...

This may be seen as contrary to my position against slashy
supporting GString (and I admit it is ambivalent), but I say that if slashy
is gonna support GString substitution it should really be doing a first
class job and thus carry the weight of its complexification of Groovy.

Fiendish (outlandish?) slashy GString extension:

$~ident, $~closure  ->  quote the value before substitution (which is to 
say make it into a literal pattern)

Like I said, I hesitated to bring up the issue...  ;-)

Jim

Paul King | 2 Apr 2008 00:13

Re: [groovy-dev] triple slashy string dev feedback

Jim White wrote:
> Paul King wrote:
> 
>> Jim White wrote:
>>
>> ...
>>> I think the thing to do about slashy string is patch it as I 
>>> suggested in order to allow ending \ (and it already allows ending /).
>>
>> This still seems troublesome to me. A wart on a wart? The current
>> workarounds seem as simple to explain as what the revised escaping
>> would require. But I am not totally against it, just cautious
>> because it doesn't feel Groovy.
>>
>>> Switching to / as an escape seems like the necessary thing to do 
>>> AFAICT, but I don't see how to do that short of Groovy 2.
>>
>> So, we can allow // as an additional escape in 1.6 and remove \/ in 2.0?
> 
> Hmmm, I guess that would work as the only breaking case I can think of 
> at the moment is fairly degenerate:
> 
> def s = /foo/// // Would change from 'foo' and a comment to 'foo/'.
> 
> But it can't be a general delimiter:
> 
> def s = /foo/ // I wouldn't be a comment anymore...
> 
> So the delimiter can only be used to escape the delimiter.

Yes, that was the intention.

> And I see $ is the $ escape.  Or not.
> 
> def x = 'z'
> [/$x$$x/]
> ==>
> [z$z]
> 
> I expect:
> [/$x$$x/]
> ==>
> [z$$x]
> 
> Very fishy.

No, at the moment the rule is 'if $ is followed by an
identifier or closure (i.e. curly opening brace) then
expand the variable/closure otherwise leave it as is'.

It would be a breaking change but changing the $ escaping
rules would be the other way to go.

$$ -> $
$/ -> /
\eol -> empty (swallow eol to allow line concatenation)
\/ -> / deprecate in 2.0?
$\ -> \ an alternate hack to allow \ at end of string (wouldn't be needed in 2.0)

> So I think the slashy escapes are:
> 
> $$ -> $
> // -> /   (Groovy 1.6)
> \eol  -> empty (line concatenation)
> $/ -> $end
> \/ -> /
> \\/ -> \end (n > 1 * \ -> (n - 1) * \)
> 
> Both the first and last qualify for 1.5.5 as bug fixes imo.
> 
> And I don't have a problem with deprecating the last two in Groovy 2.
> 
> The $ business in slashy strings bugs me.  I would much rather have had 
> slashy string *not* be GString.  As I see it the reason for slashy is to 
> not have non-regex escape confusion in regex strings, but by supporting 
> GString (some of) the complexity is brought right back in (yes I know $ 
> appears infrequently in regex other than last position but I still see 
> that as supporting the position against).
> 
>>> For myself, the slashy escape problem has been such that I just never 
>>> use slashy if a string needs escapes (because after all, if I need 
>>> escapes I might as well use ' because there is no question what I 
>>> need to do).
>>>
>>> And I hesitate to bring this up, but another feature that I think 
>>> should be considered for slashy strings is that substitution quote 
>>> the value.  Unfortunately there are two different quote functions, so 
>>> I'm not sure this is practical.
>>>
>>> http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Pattern.html#quote(java.lang.String)
>>>
>>> http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Matcher.html#quoteReplacement(java.lang.String)
>>
>>
>> Are you talking about replaceAll that has a GString as the second 
>> argument?
>> I wasn't sure.
> 
> No, I meant that when a GString substitution was done in a slashy string 
> that the value is quoted before being inserted.
> 
> That would be groovy because it would almost always be the right thing 
> to do and people often leave it out, usually because they either don't 
> think of it or because they think that they don't need it (until it 
> unexpectedly breaks when real world data winds up in their variables).
> 
> An example:
> 
> def v = getSomethingFromTheDatabase()
> 
> def m = ~/Name:$v/
> 
> is an antipattern unless there is a contract that v will never contain a 
> character that needs to be escaped in a regex.
> 
> The currently correct way to do that is:
> 
> def m = ~/Name:${Pattern.quote(v)}/
> 
> or the equivalent.
> 
> Although I have to admit one of the few cases of Groovy code using 
> slashy GString substitution is in GinA and it actually uses it to build 
> up a regex out of pieces, for which you do *not* want quoting.  But I 
> wonder whether that is typical of "real world" Groovy...
> 
> This may be seen as contrary to my position against slashy
> supporting GString (and I admit it is ambivalent), but I say that if slashy
> is gonna support GString substitution it should really be doing a first
> class job and thus carry the weight of its complexification of Groovy.
> 
> Fiendish (outlandish?) slashy GString extension:
> 
> $~ident, $~closure  ->  quote the value before substitution (which is to 
> say make it into a literal pattern)

Sounds useful to me. It is getting a little complex, but applied to the
example above it becomes:

def m = ~/Name:$~v/

which doesn't look too bad.

FYI, there has been discussion of using other symbols after the $, e.g.:

xyz = null
one = "ab${xyz}cd"
two = "ab$!{xyz}cd"
assert one == 'abnullcd'
assert two == 'abcd'

mainly in templating contexts. Elvis makes this a little less needed,
e.g. three = "ab${xyz?:''}cd", but I think it still has some merit.
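
For reference, the part of this that runs without any new syntax is the Elvis form; `$!{...}` is only an idea under discussion. A minimal check of the two existing behaviors:

```groovy
def xyz = null
def one = "ab${xyz}cd"
def three = "ab${xyz ?: ''}cd"

assert one == 'abnullcd'   // null is rendered as the text 'null'
assert three == 'abcd'     // Elvis substitutes an empty string
```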

> Like I said, I hesitated to bring up the issue...  ;-)
> 
> Jim
Reto Kleeb | 2 Apr 2008 15:18

Re: [groovy-dev] adding parenthesis information to the CST

hi

Jochen Theodorou wrote:
> changing the CST is one thing, changing the AST is another. While we can 
> probably do the CST change for 1.5.5, we surely can not do the AST 
> change for 1.5.5. Many tools may have to be rewritten, because they 
> expect a certain structure in the AST. If the code is using our 
> visitors, then there might not be much of a problem, but if not, then 
> there probably is one. Strictly speaking, the AST is no direct
> representation of the source, it is an abstract representation of the
> source that is used to attach the semantics. So it is quite possible that
> for different CSTs you might get the same AST. So strictly speaking, any
> refactoring tool has to work with the CST and AST. The AST to know what
> to do and the CST to know how to do it. And that means the AST has no
> comments, no parens, no whitespace, and lacks other things as well.

For our refactorings we do not depend on any change in the AST; a new
node for parenthesised expressions would just save us a lookup in the
source file to determine whether we need to emit parentheses or not.

Without that lookup, things like "(3+2)*2" would not be written back
correctly using our ASTWriter.
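
The "(3+2)*2" case can be sketched with a toy tree. The classes and the `render` closure below are hypothetical and unrelated to the real AST classes or to ASTWriter. The sketch derives parentheses from operator precedence, which is a different technique from the source lookup described here: it is enough to keep the meaning for these two operators, but it cannot reproduce redundant parentheses the author wrote.

```groovy
// Hypothetical mini expression tree, for illustration only.
class Num { int value }
class BinOp { String op; def left, right }

def prec = ['+': 1, '*': 2]

def render
render = { node, int parentPrec = 0 ->
    if (node instanceof Num) return node.value.toString()
    def p = prec[node.op]
    def text = "${render(node.left, p)}${node.op}${render(node.right, p)}"
    // Parenthesise when this operator binds less tightly than its parent.
    p < parentPrec ? "($text)" : text
}

def tree = new BinOp(op: '*',
        left: new BinOp(op: '+', left: new Num(value: 3), right: new Num(value: 2)),
        right: new Num(value: 2))

assert render(tree) == '(3+2)*2'

// Dropping the parentheses would change the result.
assert (3 + 2) * 2 == 10
assert 3 + 2 * 2 == 7
```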

> Instead I think it would be much more interesting to have CST and AST
> linked so that you can easily navigate in both. Due to the mutating
> nature of the AST I guess this is only doable as AST->CST, not the
> other way.. but I also think that this would be enough already.

This solution would be an alternative to our file lookup: Attach a 
reference to the corresponding CST node to each of the AST nodes.

However, our solution (with the applied patch from GROOVY-2697 that does
not change the structure of the AST) works fine so far. But we could
also live with a new node type the way Jeremy mentioned it, as long as
the line / column information stays the same.

bye

martin, reto, mike
