Matthew Wild | 1 Dec 01:31

Re: Lua bytecode doesn't compress well (with LZMA)

2009/11/30 David Given <dg <at> cowlark.com>:
> Luiz Henrique de Figueiredo wrote:
>>> Shameless promotion: http://matthewwild.co.uk/projects/squish
>>
>> Another shameless plug: http://www.tecgraf.puc-rio.br/~lhf/ftp/lua/#lstrip
>
> Just because one can never have too many types of wheel, there's also
> Kein-Hong Man's luasrcdiet, which I'm using very successfully in Prime
> Mover:
>
> http://luaforge.net/projects/luasrcdiet/
>

Squish actually uses this as its main output filter (there's more
than one, such as the one that replaces Lua keywords with single bytes
above 128), and it works really well. I went a step further because I
wanted to bundle multiple libraries as well as other (non-code) files
into a nice compact file.

Matthew

KHMan | 1 Dec 02:41

Re: Lua bytecode doesn't compress well (with LZMA)

Matthew Wild wrote:
> 2009/11/30 David Given <dg <at> cowlark.com>:
>> Luiz Henrique de Figueiredo wrote:
>>>> Shameless promotion: http://matthewwild.co.uk/projects/squish
>>> Another shameless plug: http://www.tecgraf.puc-rio.br/~lhf/ftp/lua/#lstrip
>> Just because one can never have too many types of wheel, there's also
>> Kein-Hong Man's luasrcdiet, which I'm using very successfully in Prime
>> Mover:
>>
>> http://luaforge.net/projects/luasrcdiet/
>>
> 
> Squish actually uses this as its main output filter,  (there's more
> than one, such as the one that replaces Lua keywords with single bytes
> > 128) - and it works really well. I went a step further because I
> wanted to bundle multiple libraries as well as other files (non-code)
> into a nice compact file.

That aside, I just checked a vanilla 87KB script compiled with -s
(debug info stripped) to 44KB, and 32% of the symbols in the binary
chunk were zero bytes. Switch the many 32-bit count values to a
variable number of bytes (a single byte covers 0-127, etc.), and the
binary chunk can be made a little smaller. There is little loss of
performance and no variable-bit encoding scheme, but the result is not
nearly as good as proper data compression (say, a 25% improvement is
not really much vs LZ+statistical coding schemes).
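
To make the variable-length idea concrete, here is a minimal sketch in Lua. It assumes an LEB128-style layout (7 payload bits per byte, high bit set when more bytes follow), which is only one possible reading of "the first byte can range from 0-127"; the names encode_varint and decode_varint are made up for illustration, and nothing here is what luac actually emits.

```lua
-- Encode a non-negative integer in 1..n bytes (assumed LEB128-style layout).
-- Values 0..127 take a single byte; larger values set the high bit on every
-- byte except the last.
local function encode_varint(n)
  assert(n >= 0 and n == math.floor(n), "expected a non-negative integer")
  local bytes = {}
  while n >= 128 do
    bytes[#bytes + 1] = string.char(n % 128 + 128) -- low 7 bits + continuation flag
    n = math.floor(n / 128)
  end
  bytes[#bytes + 1] = string.char(n)               -- final byte, high bit clear
  return table.concat(bytes)
end

-- Decode one integer starting at position pos (default 1).
-- Returns the value and the position just past it.
local function decode_varint(s, pos)
  pos = pos or 1
  local value, scale = 0, 1
  while true do
    local b = s:byte(pos)
    assert(b, "truncated varint")
    pos = pos + 1
    if b >= 128 then
      value = value + (b - 128) * scale
      scale = scale * 128
    else
      return value + b * scale, pos
    end
  end
end

-- A fixed 32-bit count always costs 4 bytes; small counts shrink to 1.
print(#encode_varint(5), #encode_varint(300), #encode_varint(70000))
```

Small counts dominate in typical chunks, which is where the zero bytes come from, so most fields would shrink from four bytes to one under this assumption.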

(Continue reading)

Mark Hamburg | 1 Dec 03:53

Re: "dynamic" closures

On Nov 30, 2009, at 3:12 AM, spir wrote:

> Hello,
> 
> Was trying to understand Lua closures better (the indications in historical docs are not enough for my
> limited brain), when I stepped on this:
> 
> x = 1
> function f()
>  n = 1
>  function g(a)
>    print (x + n + a) 
>  end
>  return g
> end
> 
> g= f()
> g(1)	--> 3
> 
> x = 2
> g(1)	--> 4
> 
> I'm a bit surprised the code works "as unexpected" ;-) As a consequence, g becomes totally referentially
> "opaque", or do I misinterpret? (Maybe I miss a basic point, but for me this smells strongly like naughty
> bugs. Even more so since x is a global.) Do you have use cases for such an idiom (that could hardly be refactored
> into a harmless expression)?
> 
> About implementation, it seems to indicate Lua embeds symbolic references (*) to outer _variables_ in
> reachable scopes, when creating a closure for g0. Not pointer references (**) to _values_; otherwise the
> replacement of x would not be seen by the closure. Before trying, I thought the outcome would be 3, meaning
(Continue reading)

Jim Pryor | 1 Dec 05:07

Re: Co routine local data / best practices

On Mon, Nov 30, 2009 at 03:52:37PM -0800, Chris Gagnon wrote:
>    Since ref and the corresponding rawgeti are the fast methods for table
>    access, I use the following code to create myself a new coroutine.
> 
>      lua_getfield(MainState, LUA_GLOBALSINDEX, "ThreadTable");
>      lua_State *thread = lua_newthread(MainState);
>      int refID = luaL_ref(MainState, -2);
>      lua_pop(MainState,1);
> 
>    So now, when I want to let the GC clean up the coroutine, I need to:
> 
>      lua_getfield(MainState, LUA_GLOBALSINDEX, "ThreadTable");
>      luaL_unref(MainState, -1, refID);
>      lua_pop(MainState, 1);  /* remove ThreadTable again to keep the stack balanced */
> 
>    The remaining piece is: where do I get refID when I want to unref? Or how
>    do I store that refID with the coroutine?
>    Here are some of the possibilities/issues I have run into.
> 
>      * Environment table
>        I have multiple coroutines sharing an environment, which doesn't allow
>        me to uniquely store this value
>        without another table, which is undesirable from a
>        complexity/performance standpoint.
>      * Stack
>        Since the new stack is the only thing completely unshared when
>        creating a coroutine, I simply push the value onto the stack.
>        This works like a charm until an error case: errors leave the stack in
>        the layout of the function that ran into the problem.
>        In this case I do not know how to recover the refID to properly clean
>        up.
(Continue reading)
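
For comparison, here is a rough Lua-side sketch of keeping per-coroutine data in a side table keyed by the coroutine itself, so that nothing has to live on the coroutine's stack. This is not the C API approach from the post, and it does use "another table", which the post calls undesirable; the names thread_data, spawn and current_data are hypothetical.

```lua
-- Side table keyed by coroutine. Weak keys mean an entry disappears once
-- the coroutine itself has been collected.
local thread_data = setmetatable({}, { __mode = "k" })

-- Create a coroutine and attach arbitrary data to it.
local function spawn(fn, data)
  local co = coroutine.create(fn)
  thread_data[co] = data
  return co
end

-- Look up the data of the currently running coroutine.
local function current_data()
  return thread_data[coroutine.running()]
end

local co = spawn(function()
  coroutine.yield(current_data().id)
  error("boom") -- simulate the error case from the post
end, { id = 42 })

local ok, id = coroutine.resume(co)    -- runs until the yield
local ok2, err = coroutine.resume(co)  -- runs into the error

-- The data is still reachable after the error, whatever state the
-- coroutine's own stack was left in.
print(ok, id, ok2, thread_data[co].id)
```

The lookup is keyed on the thread value rather than on anything the failing function pushed, so the error case needs no stack inspection.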

Mark Hamburg | 1 Dec 07:05

Re: From C, checking script syntax

On Nov 30, 2009, at 2:37 AM, Luiz Henrique de Figueiredo wrote:

>> 1. Take the byte code generated by load followed by dump, scan it for access to globals, and complain if
>> there are any values not on your list. See the various discussions of "linting" Lua.
>> 
>> 2. If you are prepared to modify the compiler, I think that it's actually possible to move that check into
>> the compiler itself.
>> 
>> So, this is entirely doable but it isn't trivial since what you are looking for isn't a syntax error
>> (though option 2 makes it one).
> 
> See http://lua-users.org/lists/lua-l/2006-10/msg00206.html

Thanks. And the reason I remembered its existence if not its details was, of course, that it was an answer to
my own question. I've just sort of weaned myself off of running with a modified Lua implementation as much
as possible since the modifications make upgrades harder. So, I end up looking at where one could put the
code but then not adding it. I should probably maintain two projects. One for work destined for delivery --
i.e., long life -- and one with a thoroughly hacked up Lua implementation.

Mark
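
Option 1 can be approximated without decoding the binary chunk by scanning a textual listing instead. The sketch below assumes the Lua 5.1 `luac -l` listing format, in which global accesses appear as GETGLOBAL/SETGLOBAL lines ending in "; name" (Lua 5.2 and later compile globals differently, so this does not carry over). The function name find_globals and the sample listing are made up for illustration.

```lua
-- Scan a luac -l style listing for global accesses that are not on the
-- allowed list. Returns an array of { op = ..., name = ... } records.
local function find_globals(listing, allowed)
  local unexpected = {}
  for line in listing:gmatch("[^\n]+") do
    local op, name = line:match("([GS]ETGLOBAL)[^;]*;%s*([%w_]+)")
    if op and not allowed[name] then
      unexpected[#unexpected + 1] = { op = op, name = name }
    end
  end
  return unexpected
end

-- Hand-written sample in the assumed listing format, not real luac output.
local sample = table.concat({
  "\t1\t[1]\tGETGLOBAL\t0 -1\t; print",
  "\t2\t[1]\tGETGLOBAL\t1 -2\t; prnit_typo",
  "\t3\t[1]\tCALL     \t0 2 1",
  "\t4\t[2]\tSETGLOBAL\t0 -3\t; counter",
}, "\n")

local bad = find_globals(sample, { print = true })
for _, g in ipairs(bad) do
  print(g.op, g.name)
end
```

In practice the listing would come from running luac -l -p on the script and reading its output, and the allowed table would hold the names the host application really exports.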

Juris Kalnins | 1 Dec 07:19

Re: Lua bytecode doesn't compress well (with LZMA)

> Squish actually uses this as its main output filter,  (there's more
> than one, such as the one that replaces Lua keywords with single bytes
> > 128) - and it works really well.

And it's very simple to modify llex.c to parse such alternative tokens,
so that reduced scripts can be used directly (btw, Lua also has 20+ unused
bytes from the 0..31 control character range).
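
A deliberately naive sketch of such a keyword filter, for illustration only: it maps each Lua 5.1 keyword to one byte above 128 and back. It is not Squish's implementation, the byte assignment is arbitrary, and it ignores strings and comments, which a real filter would have to lex properly. The names squeeze and expand are hypothetical.

```lua
-- Lua 5.1 reserved words (goto only arrived in 5.2).
local keywords = {
  "and", "break", "do", "else", "elseif", "end", "false", "for",
  "function", "if", "in", "local", "nil", "not", "or", "repeat",
  "return", "then", "true", "until", "while",
}

-- Assumed mapping: the i-th keyword becomes byte 128 + i.
local to_byte, to_keyword = {}, {}
for i, word in ipairs(keywords) do
  local byte = string.char(128 + i)
  to_byte[word] = byte
  to_keyword[byte] = word
end

-- Replace every identifier-shaped word that is a keyword with its byte.
-- Words missing from the table are left untouched by gsub.
local function squeeze(source)
  return (source:gsub("[%a_][%w_]*", to_byte))
end

-- Inverse transform: map the high bytes back to keywords.
local function expand(squeezed)
  return (squeezed:gsub("[\128-\255]", to_keyword))
end

local src = "local function f(x) if x then return x end end"
local small = squeeze(src)
print(#src, #small)
```

A lexer patched as described above would consume the squeezed form directly; expand is only there to show that the mapping is reversible for sources that contain no high bytes of their own.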

spir | 1 Dec 09:27

Re: "dynamic" closures

Mark Hamburg <mark <at> grubmah.com> dixit:

> > About implementation, it seems to indicate Lua embeds symbolic references (*) to outer _variables_ in
> > reachable scopes, when creating a closure for g0. Not pointer references (**) to _values_; otherwise the
> > replacement of x would not be seen by the closure. Before trying, I thought the outcome would be 3, meaning
> > values are kind of frozen at closure creation time (would be the case if references were pointers).
> > 
> > [I imagine this behaviour may match the one of closures in functional languages (?), but anyway the whole
> > paradigm is so different...]
> > 
> > Denis  
> 
> This is exactly how true closures work in all languages with closures. The creation of g closes over the
> references to n and x (which as you've written them are global but might as well be local -- you'll get the
> same results). If you use local variables, a new instance of n is created whenever f is executed, but there
> is presumably only one point where x is created, hence all references lead back to the same value.
> 
> Perhaps you have experience with a poor man's closure implementation in which the values are copied into
> the closure at the time the closure is created? That meets many of the needs of a closure, but it is just a weak
> copy. Consider for example wanting to write:
> 
> 	local sum = 0
> 	table.foreachi( array, function( i, v ) sum = sum + v end )
> 	print( "Sum = ", sum )
> 
> In order for this to work, the reference to sum in the function has to refer to the mutable variable
> declared outside the function.
> 
> Mark
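
The difference between closing over a variable and copying a value can be seen in a few lines. This is a minimal sketch with made-up names (make_counter, make_snapshot, each); the loop helper stands in for table.foreachi, which was deprecated in Lua 5.1 and is gone from later versions.

```lua
-- Two closures created in the same call share one variable.
local function make_counter()
  local count = 0
  local function increment() count = count + 1 end
  local function read() return count end
  return increment, read
end

local increment, read = make_counter()
increment()
increment()
local shared = read() -- read sees the updates made through increment

-- A "poor man's closure": passing the value as an argument copies it,
-- so later assignments to the original variable are not seen.
local function make_snapshot(value)
  return function() return value end
end

local total = 10
local snapshot = make_snapshot(total)
total = 20
local copied = snapshot() -- still the value at creation time

-- The sum example: the callback must write to the outer variable.
local function each(array, fn)
  for i, v in ipairs(array) do fn(i, v) end
end

local sum = 0
each({ 1, 2, 3 }, function(i, v) sum = sum + v end)

print(shared, copied, sum) --> 2  10  6
```

With copy semantics the last callback would only ever update its own private sum and the outer one would stay at 0.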

(Continue reading)

KHMan | 1 Dec 10:46

Re: Lua bytecode doesn't compress well (with LZMA)

Juris Kalnins wrote:
>> Squish actually uses this as its main output filter,  (there's more
>> than one, such as the one that replaces Lua keywords with single bytes
>> > 128) - and it works really well.
> 
> And it's very simple to modify llex.c to parse such alternative tokens,
> so that reduced scripts can be used directly (btw, Lua also has 20+ unused
> bytes from the 0..31 control character range).

I have one query for Matthew, the author of squish. Want to find 
out (the lazy way) a few data points (always love data points) to 
get a feel of things:

Does anyone have a set of data points for squish, where we compare 
data-compressed sizes of sources, where:
(1) keyword token replacement filter, versus
(2) no keyword token replacement

Now, in (2), LZ coding would zap most keywords into a sliding 
dictionary match code, whereas in (1), the initial size of the 
filtered sources will be smaller but there is more variation in 
the symbol frequencies of the source code (token symbols added) 
and less chance to make sliding dictionary matches.

So, would there be a big difference when we compare compressed 
sources? Say, we tabulate results as:
(1) original
(2) token filtered
(3) original, compressed
(4) token filtered, compressed
(Continue reading)

Alexander Gladysh | 1 Dec 11:09

Re: Portably iterate over a directory

On Tue, Dec 1, 2009 at 02:07, Rob Kendrick <lua-l <at> nun.org.uk> wrote:
> On Mon, 30 Nov 2009 23:54:48 +0300
> Alexander Gladysh <agladysh <at> gmail.com> wrote:

>> > This won't work reliably due to charset issues, I fear.  You really
>> > have to use proper operating system APIs for this job.

>> Even if I don't need absolute paths and all files and sub-directories
>> that I traverse are guaranteed to contain [a-zA-Z_-.] only?

> There are too many issues to count.  Like, "What if the user has
> aliased {dir,ls} to something else unexpected?"

> This is very common in the UNIX world.

I think this can be solved (in most cases) by using /bin/ls.

While I'd like to have a universal solution, what I need is a practical
one that would work out of the box for the most common cases.

If the user is smart enough to modify his environment so that it breaks
the utility, he should be smart enough to fix it. (The utility is for
programmers, after all.)

Alexander.

Michael Pruemm | 1 Dec 11:22

Re: Portably iterate over a directory

On Windows, you could use the command

  dir /b/s .

to get a recursive listing of all files, starting in the current
directory. The output looks like this:
D:\zzz\lua-5.1.4\COPYRIGHT
D:\zzz\lua-5.1.4\doc
D:\zzz\lua-5.1.4\etc
D:\zzz\lua-5.1.4\HISTORY
D:\zzz\lua-5.1.4\INSTALL
D:\zzz\lua-5.1.4\Makefile
D:\zzz\lua-5.1.4\README
D:\zzz\lua-5.1.4\src
D:\zzz\lua-5.1.4\test
D:\zzz\lua-5.1.4\doc\amazon.gif
D:\zzz\lua-5.1.4\doc\contents.html
D:\zzz\lua-5.1.4\doc\cover.png
D:\zzz\lua-5.1.4\doc\logo.gif
D:\zzz\lua-5.1.4\doc\lua.1
D:\zzz\lua-5.1.4\doc\lua.css
D:\zzz\lua-5.1.4\doc\lua.html
D:\zzz\lua-5.1.4\doc\luac.1
D:\zzz\lua-5.1.4\doc\luac.html
D:\zzz\lua-5.1.4\doc\manual.css
D:\zzz\lua-5.1.4\doc\manual.html
D:\zzz\lua-5.1.4\doc\readme.html
D:\zzz\lua-5.1.4\etc\all.c
D:\zzz\lua-5.1.4\etc\lua.hpp
D:\zzz\lua-5.1.4\etc\lua.ico
(Continue reading)
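
A rough sketch of the shell-out approach discussed in this thread, assuming io.popen is available on the platform (it is not everywhere) and that file names contain no newlines. The function names read_lines and list_with_command are hypothetical, and the charset and aliasing caveats raised earlier in the thread still apply.

```lua
-- Collect the non-empty lines from an open file handle, dropping a
-- trailing carriage return if the command emitted CRLF line endings.
local function read_lines(handle)
  local lines = {}
  for line in handle:lines() do
    line = line:gsub("\r$", "")
    if line ~= "" then
      lines[#lines + 1] = line
    end
  end
  return lines
end

-- Run a listing command and return its output as an array of lines,
-- or nil plus an error message if the command could not be started.
local function list_with_command(command)
  local handle, err = io.popen(command)
  if not handle then
    return nil, err
  end
  local lines = read_lines(handle)
  handle:close()
  return lines
end

-- Possible calls, using the commands mentioned in the thread:
--   list_with_command("dir /b/s .")  -- Windows, recursive, full paths
--   list_with_command("/bin/ls .")   -- Unix, bypasses shell aliases
```

Passing the command in as an argument keeps the platform decision with the caller, which matches the "practical rather than universal" goal stated above.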

