Problem getting XML output when using dumppdf.py
Dear all,
I am using pdfminer's dumppdf.py program to extract text from a PDF,
with the following command:
dumppdf.py -a [pdf file] > [output xml file]
This works well with all my other PDFs, but with one particular PDF it
gave me the error below:
Traceback (most recent call last):
  File "/usr/local/bin/dumppdf.py", line 226, in <module>
    if __name__ == '__main__': sys.exit(main(sys.argv))
  File "/usr/local/bin/dumppdf.py", line 223, in main
    dumpall=dumpall, codec=codec)
  File "/usr/local/bin/dumppdf.py", line 162, in dumppdf
    doc.set_parser(parser)
  File "/usr/local/lib64/python2.6/site-packages/pdfminer/pdfparser.py", line 327, in set_parser
    self.info.append(dict_value(trailer['Info']))
  File "/usr/local/lib64/python2.6/site-packages/pdfminer/pdftypes.py", line 132, in dict_value
    x = resolve1(x)
  File "/usr/local/lib64/python2.6/site-packages/pdfminer/pdftypes.py", line 60, in resolve1
    x = x.resolve()
  File "/usr/local/lib64/python2.6/site-packages/pdfminer/pdftypes.py", line 49, in resolve
    return self.doc.getobj(self.objid)
  File "/usr/local/lib64/python2.6/site-packages/pdfminer/pdfparser.py", line 418, in getobj
    (strmid, index) = xref.get_pos(objid)
  File "/usr/local/lib64/python2.6/site-packages/pdfminer/pdfparser.py", line 211, in get_pos
    pos = nunpack(ent[self.fl1:self.fl1+self.fl2])
  File "/usr/local/lib64/python2.6/site-packages/pdfminer/utils.py", line 116, in nunpack
    raise TypeError('invalid length: %d' % l)
TypeError: invalid length: 8
What could be the possible reasons? I am using the latest pdfminer
build, and the installed Python version is 2.6.
The PDF is not secured or password-protected, and its properties look
similar to those of the other PDFs that work.
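Looking at the last two frames of the traceback, my guess (not verified)
is that get_pos is handing nunpack an 8-byte field from this PDF's
cross-reference data, and that nunpack appears to accept only shorter
fields. As a minimal sketch of what a width-independent decoder could
look like, here is a hypothetical helper, nunpack_any. The name is my
own and it is not part of pdfminer; it assumes the field is an unsigned
big-endian integer:

```python
def nunpack_any(data, default=0):
    """Decode an unsigned big-endian integer of any byte length.

    Hypothetical helper for illustration only; not part of pdfminer.
    Returns `default` when `data` is empty.
    """
    if not data:
        return default
    value = 0
    # bytearray yields integers on both Python 2.6+ and Python 3.
    for byte in bytearray(data):
        value = (value << 8) | byte
    return value


# An 8-byte field like the one that triggered the TypeError above.
field = b'\x00\x00\x00\x00\x00\x00\x01\x00'
offset = nunpack_any(field)
```

With this sketch, the 8-byte field above decodes to 256. I cannot
confirm whether changing pdfminer along these lines is the right fix.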
Waiting for responses!
Regards
Sagar