Luna Rodríguez, Raúl | 17 May 2013 14:16

SIGSEGV trying to convert HTML entities to their actual printable codes

Hello,

When I run the awk program I have sent you with "awk -f fail.awk failhtml", it simply dies with a segmentation fault.

What is wrong with it?

I am using gawk version 3.1.6 for Windows under Windows XP 32-bit. Version of XP: 5.1 (Build 2600.xpsp_sp3_gdr.120411-1615).

Thanks in advance.

_________________________________________________________________________

Raúl Luna

Forensic and Remote Audit IAU

Raul.luna-Yu+yOSpmLcTQT0dZR+AlfA@public.gmane.org

+34 650 08 15 64

+34 983 42 75 06

CONFIDENTIALITY CLAUSE: This email and its attachments may contain confidential or legally protected information. If you have received it in error, please notify the sender immediately and delete it without reviewing or forwarding it; any copying, disclosure, distribution or use of its contents is prohibited. Thank you for your cooperation.

Attachment (fail.zip): application/x-zip-compressed, 1271 KiB
Manuel Collado | 17 May 2013 16:30

Re: zerofile function

Putting the answer above the question ;-)

>> I don't understand what top-post means.

>>> - Please don't top-post.

-- 
Manuel Collado - http://lml.ls.fi.upm.es/~mcollado

Manuel Collado | 17 May 2013 10:37

Re: zerofile function

First of all, some advice:

- Please don't top-post.
- Please respond to the bug-gawk list instead of or in addition to my 
personal e-mail address.
- Please post just plain text and not HTML.
- Please quote the relevant parts of the original message, and not the 
whole thing.

(Disclaimer: I'm quoting the full thread so that others can see everything 
you sent just to my personal address.)

Short answer: Your original code (at the end of this long post) is correct. 
Don't spoil it.

And now, look below to see my responses to some specific parts of your 
messages.

On 16/05/2013 23:32, david ward wrote:
> Sorry to exhaust you with my mistakes, but after correcting all references to ARGV
> to ARGC I still get the same errors. My main programming language is C++,
> hence the mistake. I have managed to find some errors in the functions
> presented in the manual, so please don't dismiss me too lightly.

Don't worry. We are always willing to help interested people.

But your original code is correct, so don't try to amend it. Doing so just 
makes things worse.

>
> On Thu, May 16, 2013 at 10:09 PM, david ward <bamberward@...> wrote:
>
>> /gawk/awkscripts-> gawk -f  ./zerofile.awk  -f ./zerofile_funct.awk  -f
>> ./checkzero.awk ./tee2  ./empty  ./tee3   ./empty2
>> gawk: ./zerofile.awk:18:              zerofile(ARGC[Argind], Argind) # get
>> syntax error for ARGV[Argind]
>> gawk: ./zerofile.awk:18:                                  ^ use of
>> non-array as array
>>
>> gawk: ./zerofile_funct.awk:2: function zerofile(ARGC[Argind], Argind )
>> gawk: ./zerofile_funct.awk:2:                       ^ syntax error
>> gawk: ./zerofile_funct.awk:2: error: function `zerofile': can't use
>> special variable `ARGC' as a function parameter
>> ~/gawk/awkscripts->

ARGC is a scalar variable (the number of arguments); ARGV is the array of 
arguments. So ARGV[x] is correct, but ARGC[x] doesn't make sense.

In addition, ARGC is a special variable, so you can't use its name for a 
formal parameter of a function.
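
To make the distinction concrete, here is a minimal sketch. The function name show_args and the arguments first and second are placeholders, not anything from the thread. It uses only standard awk features and runs entirely in BEGIN, so the arguments are never opened as files:

```shell
# ARGC is a scalar count; ARGV is an array indexed from 0 to ARGC - 1.
show_args() {
    awk 'BEGIN {
        print "ARGC =", ARGC
        for (i = 0; i < ARGC; i++)
            print "ARGV[" i "] =", ARGV[i]
    }' "$@"
}

show_args first second
```

ARGV[0] holds the program name, so two arguments give an ARGC of 3, with the arguments themselves in ARGV[1] and ARGV[2].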

>>
>> On Thu, May 16, 2013 at 10:06 PM, david ward <bamberward@...> wrote:
>>
>>> Noticed that I did not make changes to the files correctly. Making these
>>> corrections does not alter the error message:

As I said, making such corrections only makes things worse.

>>>
>>> #!/usr/bin/gawk -f
>>> # All known awk implementations silently skip over zero-length files.
>>> This is a by-product of awk's implicit
>>> # read-a-record-and-match-against-the-rules loop:
>>> #  when awk tries to read a record from an empty file, it immediately
>>> receives an end of file
>>> # indication, closes the file, and proceeds on to the next command-line
>>> data file, WITHOUT executing any
>>> # user-level awk program code.
>>> # Using gawk's ARGIND variable , it is possible to detect when an empty
>>> data file
>>> # has been skipped.
>>>
>>>       # zerofile.awk --- library file to process empty input files
>>>
>>>
>>>       BEGIN { Argind = 0 }
>>>
>>>       ARGIND > Argind + 1 {
>>>
>>>           for (Argind++; Argind < ARGIND; Argind++)
>>>               zerofile(ARGC[Argind], Argind) # get syntax error for
>>> ARGV[Argind]
>>>
>>>
>>>       }
>>>
>>>       ARGIND != Argind { Argind = ARGIND }
>>>
>>>       END {
>>>           d = ARGV[Argind]
>>>           if (ARGIND > Argind)
>>>               for (Argind++; Argind <= ARGIND; Argind++)
>>>                    zerofile(ARG[Argind], Argind)
>>>
>>>
>>>       }
>>> #!/usr/bin/gawk -f
>>> function zerofile(ARGC[Argind], Argind )
>>> {
>>>     print ARGIND,Argind, ARGV[Argind]
>>>
>>> }
>>> #!/usr/bin/gawk -f
>>>     {print}
>>>
>>>
>>>
>>> On Thu, May 16, 2013 at 9:52 PM, david ward <bamberward@...> wrote:
>>>
>>>> Doesn't work  with gawk 4.0.2 on Ubuntu 12.04
>>>> removing the assignment to v in the source code I get

See the previous comment.

>>>> 1037  16/05/13 21:45:45 history
>>>> ~/gawk/awkscripts-> gawk -f  ./zerofile.awk  -f ./zerofile_funct.awk  -f
>>>> ./checkzero.awk ./tee2  ./empty  ./tee3   ./empty2
>>>> gawk: ./zerofile_funct.awk:2: function zerofile(ARGC{Argind], Argind )
>>>> gawk: ./zerofile_funct.awk:2:                       ^ syntax error
>>>> gawk: ./zerofile_funct.awk:2: error: function `zerofile': can't use
>>>> special variable `ARGC' as a function parameter
>>>> ~/gawk/awkscripts->
>>>> #!/usr/bin/gawk -f
>>>> # All known awk implementations silently skip over zero-length files.
>>>> This is a by-product of awk's implicit
>>>> # read-a-record-and-match-against-the-rules loop:
>>>> #  when awk tries to read a record from an empty file, it immediately
>>>> receives an end of file
>>>> # indication, closes the file, and proceeds on to the next command-line
>>>> data file, WITHOUT executing any
>>>> # user-level awk program code.
>>>> # Using gawk's ARGIND variable , it is possible to detect when an empty
>>>> data file
>>>> # has been skipped.
>>>>
>>>>       # zerofile.awk --- library file to process empty input files
>>>>
>>>>
>>>>       BEGIN { Argind = 0 }
>>>>
>>>>       ARGIND > Argind + 1 {
>>>>            # v = ARGV[Argind]
>>>>
>>>>           for (Argind++; Argind < ARGIND; Argind++)
>>>>               zerofile(a, Argind) # get syntax error for ARGV[Argind]
>>>> submitted bug report apparently works ok in
>>>>                                   # in 4.0.1 on windows XP
>>>>
>>>>
>>>>       }
>>>>
>>>>       ARGIND != Argind { Argind = ARGIND }
>>>>
>>>>       END {
>>>>           d = ARGV[Argind]
>>>>           if (ARGIND > Argind)
>>>>               for (Argind++; Argind <= ARGIND; Argind++)
>>>>                   zerofile(d, Argind)
>>>>
>>>>       }
>>>>   #!/usr/bin/gawk -f
>>>> function zerofile(ARGC{Argind], Argind )
>>>> {
>>>>     print ARGIND,Argind, ARGV[Argind]
>>>>
>>>> }
>>>>
>>>>
>>>>
>>>> On Thu, May 16, 2013 at 3:42 PM, Manuel Collado <mcollado@...>wrote:
>>>>
>>>>> On 16/05/2013 14:23, david ward wrote:
>>>>>
>>>>>   I don't know if this a bug or not but I get a syntax error when when
>>>>>> trying to pass ARGC[Argind] to zerofile. gawk version: 4.0.2 OS:
>>>>>> Ubuntu 12.4
>>>>>>
>>>>>> #!/usr/bin/gawk -f
>>>>>> # All known awk implementations silently skip over zero-length files.
>>>>>> This is a by-product of awk's implicit
>>>>>> # read-a-record-and-match-**against-the-rules loop:
>>>>>> #  when awk tries to read a record from an empty file, it immediately
>>>>>> receives an end of file
>>>>>> # indication, closes the file, and proceeds on to the next command-line
>>>>>> data file, WITHOUT executing any
>>>>>> # user-level awk program code.
>>>>>> # Using gawk's ARGIND variable , it is possible to detect when an empty
>>>>>> data file
>>>>>> # has been skipped.
>>>>>>
>>>>>>        # zerofile.awk --- library file to process empty input files
>>>>>>
>>>>>>
>>>>>>        BEGIN { Argind = 0 }
>>>>>>
>>>>>>        ARGIND > Argind + 1 {
>>>>>>              v = ARGV[Argind]
>>>>>>            for (Argind++; Argind < ARGIND; Argind++)
>>>>>>                zerofile(a, Argind) # get syntax error for ARGV[Argind]
>>>>>>
>>>>>>        }
>>>>>>
>>>>>>        ARGIND != Argind { Argind = ARGIND }
>>>>>>
>>>>>>        END {
>>>>>>            d = ARGV[Argind]
>>>>>>            if (ARGIND > Argind)
>>>>>>                for (Argind++; Argind <= ARGIND; Argind++)
>>>>>>                    zerofile(d, Argind)
>>>>>>
>>>>>>        }
>>>>>>    my zerofile function
>>>>>> #!/usr/bin/gawk -f
>>>>>> function zerofile(a, Argind )
>>>>>> {
>>>>>>      print ARGIND,Argind, ARGV[Argind]
>>>>>>
>>>>>> }
>>>>>> I wrote this to understand  how the post increment operator  took
>>>>>> effect
>>>>>> on the parameters
>>>>>>
>>>>>>
>>>>> Works Ok with GNU Awk 4.0.1 on Windows XP. No syntax errors, gives
>>>>> expected output.

Also tested with gawk 4.0.2. Same (good) results.
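
For reference, here is a self-contained sketch of the setup under discussion. The zerofile.awk rules follow the version in the gawk manual, which passes ARGV[Argind] directly. The body of zerofile(), the parameter names name and idx, the wrapper zerofile_demo, the scratch directory and the file contents are illustrative assumptions. It needs gawk, because ARGIND is a gawk extension:

```shell
#!/bin/sh
# ARGIND is a gawk extension, so this sketch requires gawk.
command -v gawk >/dev/null 2>&1 || { echo "gawk not found" >&2; exit 0; }

# Scratch directory with two empty files and one non-empty file (illustrative names).
dir=$(mktemp -d) || exit 1
trap 'rm -rf "$dir"' EXIT
: > "$dir/empty"
echo "some text" > "$dir/nonempty"
: > "$dir/empty2"

zerofile_demo() {
    gawk '
    # Hypothetical handler: report each skipped empty file.
    function zerofile(name, idx) {
        print "skipped empty file:", name, "(argument " idx ")"
    }

    BEGIN { Argind = 0 }

    # A gap between ARGIND and Argind means empty files were skipped.
    ARGIND > Argind + 1 {
        for (Argind++; Argind < ARGIND; Argind++)
            zerofile(ARGV[Argind], Argind)
    }

    ARGIND != Argind { Argind = ARGIND }

    { print }

    # Empty files at the end of the argument list are caught here.
    END {
        if (ARGIND > Argind)
            for (Argind++; Argind <= ARGIND; Argind++)
                zerofile(ARGV[Argind], Argind)
    }
    ' "$@"
}

(cd "$dir" && zerofile_demo empty nonempty empty2)
```

The run should report empty as argument 1, print the single line of nonempty, and then report empty2 as argument 3. Note that the parameters of zerofile() are ordinary names, not special variables such as ARGC.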

Regards.
-- 
Manuel Collado - http://lml.ls.fi.upm.es/~mcollado

david ward | 17 May 2013 09:39

zerofile

I want to re-submit this bug because I made some silly mistakes when typing the previous emails.
The following files produced this:
~/gawk/awkscripts-> gawk  --source '{ print } ' nonempty
This the contents of nonempty text file
~/gawk/awkscripts->
~/gawk/awkscripts-> gawk -f zerofile.awk  -f zerofilef.awk  --source '{ print }'empty nonempty empty2
gawk: zerofilef.awk:2: function zerofile(ARGC[Argind, Argind )
gawk: zerofilef.awk:2:                       ^ syntax error
gawk: zerofilef.awk:2: error: function `zerofile': can't use special variable `ARGC' as a function parameter
~/gawk/awkscripts->
This was produced by using the zerofile.awk from the manual for 4.0.2, together with this file:
#zerofilef.awk
#!/usr/bin/gawk -f
function zerofile(ARGC[Argind, Argind )
{
   print ARGIND,Argind, ARGC[Argind]
 
}
The following output, in contrast, was produced by the files listed after it:

/gawk/awkscripts-> gawk -f zerofile2.awk -f zerofilef2.awk  --source '{print}' empty nonempty empty2
2 1 empty
This the contents of nonempty text file
3 3 empty2
~/gawk/awkscripts->
#!/usr/bin/gawk -f

    
     # zerofile.awk --- library file to process empty input files
    
    
     BEGIN { Argind = 0 }
    
     ARGIND > Argind + 1 {
         zf=ARGV[Argind]
         for (Argind++; Argind < ARGIND; Argind++)
             zerofile(zf, Argind) # get syntax error for ARGC[Argind]
               
     }
    
     ARGIND != Argind { Argind = ARGIND }
    
     END {
       
         if (ARGIND > Argind)
            zf2=ARGV[Argind]
             for (Argind++; Argind <= ARGIND; Argind++)
                 zerofile(zf2, Argind)
                 
     }
#zerofile2.awk
#!/usr/bin/gawk -f
function zerofile(check, Argind )
{
   print ARGIND,Argind, ARGV[Argind]
 
}
Steven Daniels | 16 May 2013 09:50

gawk bug

I'm getting an assertion failure when I try the following:
$ echo '很?pos=ad 宽广?pos=va , 更?pos=ad 是?pos=vc1' | gawk '{match($0, /(([^ \?.]*\?pos=ad |([^ \?.]*\?pos=(jj|va) )[地]\?pos=dev ){0,2})/ , arr)}  { if(arr[0]) print arr[1], arr[4], $6} '
Assertion failed: (&musts[2] <= mp), function dfamust, file dfa.c, line 3951.
[1]    13263 done       echo '很?pos=ad 宽广?pos=va , 更?pos=ad 是?pos=vc1' |
       13264 abort      gawk

The point of failure seems to be "[地]": when the brackets aren't used, the command works as expected.

$ echo '很?pos=ad 宽广?pos=va , 更?pos=ad 是?pos=vc1' | gawk '{match($0, /(([^ \?.]*\?pos=ad |([^ \?.]*\?pos=(jj|va) )地\?pos=dev ){0,2})/ , arr)}  { if(arr[0]) print arr[1], arr[4], $6} '  
# => 很?pos=ad



$gawk --version  
GNU Awk 4.0.2



Thanks.

-Steven Daniels
david ward | 16 May 2013 14:23

zerofile function

I don't know if this is a bug or not, but I get a syntax error when trying to pass ARGC[Argind] to zerofile. gawk version: 4.0.2. OS: Ubuntu 12.04.

#!/usr/bin/gawk -f
# All known awk implementations silently skip over zero-length files. This is a by-product of awk's implicit
# read-a-record-and-match-against-the-rules loop:
#  when awk tries to read a record from an empty file, it immediately receives an end of file
# indication, closes the file, and proceeds on to the next command-line data file, WITHOUT executing any
# user-level awk program code.
# Using gawk's ARGIND variable , it is possible to detect when an empty data file
# has been skipped.
    
     # zerofile.awk --- library file to process empty input files
    
    
     BEGIN { Argind = 0 }
    
     ARGIND > Argind + 1 {
           v = ARGV[Argind]
         for (Argind++; Argind < ARGIND; Argind++)
             zerofile(a, Argind) # get syntax error for ARGV[Argind]
               
     }
    
     ARGIND != Argind { Argind = ARGIND }
    
     END {
         d = ARGV[Argind]
         if (ARGIND > Argind)
             for (Argind++; Argind <= ARGIND; Argind++)
                 zerofile(d, Argind)
                 
     }
My zerofile function:
#!/usr/bin/gawk -f
function zerofile(a, Argind )
{
   print ARGIND,Argind, ARGV[Argind]
 
}
I wrote this to understand how the post-increment operator affects the parameters.

Hermann Peifer | 16 May 2013 13:23

Re: Gawk 4.1.0 released

On 2013-05-10 8:08, Aharon Robbins wrote:
>
> The usual GNU build incantation should be used:
>
> 	tar -xpvzf gawk-4.1.0.tar.gz
> 	cd gawk-4.1.0
> 	./configure && make && make check
>
> Bug reports should be sent to bug-gawk@...
>
> Enjoy!
>
> Arnold Robbins (on behalf of all the gawk developers)
> arnold@...

For the sake of the exercise, I followed the above steps and ended up with:

(...)
gcc  -g -O2 -DNDEBUG   -o gawk array.o awkgram.o builtin.o cint_array.o 
command.o debug.o dfa.o eval.o ext.o field.o floatcomp.o gawkapi.o 
gawkmisc.o getopt.o getopt1.o int_array.o io.o main.o mpfr.o msg.o 
node.o profile.o random.o re.o regex.o replace.o str_array.o symbol.o 
version.o    -lreadline  -lm -lm
Undefined symbols for architecture x86_64:
   "_history_list", referenced from:
       _serialize in debug.o
       _do_save in debug.o
ld: symbol(s) not found for architecture x86_64
collect2: ld returned 1 exit status
make[2]: *** [gawk] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2

Strangely enough, everything works fine when compiling in my local git 
repository:

$ git pull && ./bootstrap.sh && ./configure --prefix=$HOME/local --with-mpfr=/opt/local && make check

(...)
ALL TESTS PASSED

I am using a MacBook:

$ uname -a
Darwin moby 11.4.2 Darwin Kernel Version 11.4.2: Thu Aug 23 16:25:48 PDT 2012; root:xnu-1699.32.7~1/RELEASE_X86_64 x86_64 i386 MacBookAir4,2 Darwin

Regards, Hermann

riddhipratim ghosh | 15 May 2013 21:55

question

How can I convert a 605695 KB SAV file to CSV so that I can load it into R?
Hermann Peifer | 14 May 2013 15:35

Documentation, section: 16.8 The gawkextlib Project

Hi,

I suggest mentioning "make install" somewhere near the end of this 
section.

Regards, Hermann

david ward | 11 May 2013 15:25

error in cut.awk

I forgot to mention my operating system, which is Ubuntu 12.04.
david ward | 11 May 2013 15:21

'error' in cut.awk

gawk 4.0.2
This concerns cut.awk as printed in the 4.0.2 manual and as shipped in the examples with the distribution.
In the last block, should the test in the first if not be index($0, FS) == 0
rather than index($0, FS) != 0?
Test case:
The\train\tinSpain\tstays\ton\tthe\tplain
The#cat#sat#on#the#mat
The\tcat\sat\ton\tthe\tmat
With !=:
igawk  -f   wc.ack   --  -s -f 2-4   cut-test
no output
With ==:
rain\tin\tspain
cat\tsat\tmat
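
A minimal sketch of what the test is meant to do, assuming the -s option behaves like cut -s (lines that contain no field separator are suppressed). The input lines and the name suppress_demo are made up for illustration, and the short program is not the cut.awk code from the manual:

```shell
# index($0, FS) == 0 is true exactly when the record contains no separator.
suppress_demo() {
    awk 'BEGIN { FS = "\t" }
         index($0, FS) == 0 { next }   # no tab in this line: suppress it
         { print $2 }'
}

printf 'a\tb\tc\nno-delimiter-here\n' | suppress_demo
```

With != in place of ==, the tab-separated line would be the one skipped, which is the opposite of what -s is supposed to do.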

