Peterson, Wayne | 9 May 2010 08:43

Parsing XML file with Minidom has problem with cr/lf

I am parsing an XML file with Python 2.6.5 minidom on Windows. It is mostly working, but minidom seems to have problems dealing with Windows CR/LF line endings: it creates an extra text node that needs to be ignored instead of just returning the XML elements. I have tried different methods of opening the file, but it doesn't seem to make a difference. It is happiest when reading a file in Unix format.
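
A minimal sketch of one way to ignore those nodes, assuming the extra nodes are the whitespace-only text nodes that minidom keeps between elements (it keeps them for any line-ending style, so they may not be specific to CR/LF). The helper name `element_children` and the sample document are made up for illustration.

```python
from xml.dom import minidom


def element_children(node):
    """Return only the element children of node, skipping text and comment nodes."""
    return [child for child in node.childNodes
            if child.nodeType == child.ELEMENT_NODE]


# Hypothetical sample with Windows line endings between the elements.
sample = "<root>\r\n  <item>a</item>\r\n  <item>b</item>\r\n</root>"

doc = minidom.parseString(sample)
root = doc.documentElement

# childNodes includes the whitespace text nodes around each <item>.
all_nodes = len(root.childNodes)

# Filtering on nodeType leaves just the two <item> elements.
items = element_children(root)
```

Filtering on `nodeType` leaves the document untouched, so it works the same whichever line endings the file was saved with.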

 

Wayne Peterson | Consultant
Sierra Systems

(T): 403-264-0955 (C): 403-710-9248 (F): 403-233-2108

7th Floor, Canadian Centre

833 4th Avenue SW
Calgary, Alberta, T2P 3T5

Management Consulting | System Integration | Managed Services
website: www.SierraSystems.com

_______________________________________________
XML-SIG maillist  -  XML-SIG <at> python.org
http://mail.python.org/mailman/listinfo/xml-sig
kimmyaf | 26 Apr 2010 00:24

parsing XML with minidom


Hello. I've only done a little bit of parsing with minidom before, but I'm
having trouble getting my values out of this XML. I need the latitude and
longitude values in bold. I've tried several things. I think that I am
getting into the location tag, but maybe the getAttribute function is not
correct for this example?

<GeocodeResponse>
 <status>OK</status>
 <result>
  <type>street_address</type>
  <formatted_address>50 Oakland St, Wellesley, MA 02481,
USA</formatted_address>
  <address_component>
   <long_name>50</long_name>
   <short_name>50</short_name>
   <type>street_number</type>
  </address_component>
  <address_component>
   <long_name>Oakland St</long_name>
   <short_name>Oakland St</short_name>
   <type>route</type>
  </address_component>
  <address_component>
   <long_name>Wellesley</long_name>
   <short_name>Wellesley</short_name>
   <type>locality</type>
   <type>political</type>
  </address_component>
  <address_component>
   <long_name>Wellesley</long_name>
   <short_name>Wellesley</short_name>
   <type>administrative_area_level_3</type>
   <type>political</type>
  </address_component>
  <address_component>
   <long_name>Norfolk</long_name>
   <short_name>Norfolk</short_name>
   <type>administrative_area_level_2</type>
   <type>political</type>
  </address_component>
  <address_component>
   <long_name>Massachusetts</long_name>
   <short_name>MA</short_name>
   <type>administrative_area_level_1</type>
   <type>political</type>
  </address_component>
  <address_component>
   <long_name>United States</long_name>
   <short_name>US</short_name>
   <type>country</type>
   <type>political</type>
  </address_component>
  <address_component>
   <long_name>02481</long_name>
   <short_name>02481</short_name>
   <type>postal_code</type>
  </address_component>
  <geometry>
   <location>
    <lat>42.3118520</lat>
    <lng>-71.2632680</lng>
   </location>
   <location_type>ROOFTOP</location_type>
   <viewport>
    <southwest>
     <lat>42.3093524</lat>
     <lng>-71.2665476</lng>
    </southwest>
    <northeast>
     <lat>42.3156476</lat>
     <lng>-71.2602524</lng>
    </northeast>
   </viewport>
  </geometry>
 </result>
</GeocodeResponse>

Code:

body = dom.getElementsByTagName('GeocodeResponse')[0]

for item in body.getElementsByTagName('location'):
     lat = item.getAttribute('lat')
     lng = item.getAttribute('lng')
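
In the XML above, `lat` and `lng` are child elements of `<location>`, not attributes, so `getAttribute` returns an empty string for them. A minimal sketch of reading the element text instead follows; the helper `element_text` is a made-up name, and the sample is a trimmed copy of the response above.

```python
from xml.dom import minidom


def element_text(parent, tag):
    """Return the text content of the first <tag> element below parent."""
    node = parent.getElementsByTagName(tag)[0]
    return "".join(child.data for child in node.childNodes
                   if child.nodeType == child.TEXT_NODE)


# Trimmed copy of the geocoding response, kept short for illustration.
sample = """<GeocodeResponse>
 <result>
  <geometry>
   <location>
    <lat>42.3118520</lat>
    <lng>-71.2632680</lng>
   </location>
   <viewport>
    <southwest>
     <lat>42.3093524</lat>
     <lng>-71.2665476</lng>
    </southwest>
   </viewport>
  </geometry>
 </result>
</GeocodeResponse>"""

dom = minidom.parseString(sample)

coordinates = []
# Only <location> is searched, so the <lat>/<lng> pairs under <viewport> are skipped.
for item in dom.getElementsByTagName("location"):
    lat = float(element_text(item, "lat"))
    lng = float(element_text(item, "lng"))
    coordinates.append((lat, lng))
```

Searching from each `location` element rather than from the whole document matters here, because `lat` and `lng` also occur under `southwest` and `northeast`.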
--

-- 
View this message in context: http://old.nabble.com/parsing-XML-with-minidom-tp28359328p28359328.html
Sent from the Python - xml-sig mailing list archive at Nabble.com.


Rachel Brown | 14 Apr 2010 20:26

Tags

Hi,
      Does XBEL support a TAGS attribute?
I see that the <bookmark> element has no "TAG" attribute.

If it's not there, can we define our own custom attribute to support tags?

Thanks,
Rachel


L Peter Deutsch | 3 Apr 2010 15:45

Python DTD parser?

Dear XML-SIG,

I am trying to find a working, maintained DTD parser written in Python.  The
main Python XML distribution does not include one.  I was using the PyXML
parser, but (1) (at least on SourceForge) it hasn't been maintained for
years, and (2) (as of the last version I could find, 0.8.4) it has a bug
that sometimes causes it to not process the very last line of a DTD -- which
in a well-modularized DTD is often an entity reference that pulls in the
main content of the DTD!

I've written my own DTD parser, but it omits some features, I really don't
know how well it conforms to the spec (which is important because I'm also
writing some DTDs of my own), and I'd much rather use a well-tested one
written by others.

Any advice will be appreciated.

		Sincerely,

						L Peter Deutsch
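
A partial option within the standard library, offered as a sketch rather than a full answer: the expat bindings in `xml.parsers.expat` can report DTD declarations through handler callbacks. The example below collects element and attribute-list declarations from an internal DTD subset. The function name and the sample document are made up, and it does not validate or load external DTD files, which would need extra external-entity handling.

```python
import xml.parsers.expat


def collect_declarations(xml_text):
    """Parse a document and return the element and attribute declarations
    found in its internal DTD subset."""
    elements = {}
    attributes = []

    def on_element_decl(name, model):
        # model is expat's nested tuple describing the content model.
        elements[name] = model

    def on_attlist_decl(elname, attname, type_, default, required):
        attributes.append((elname, attname, type_, default, bool(required)))

    parser = xml.parsers.expat.ParserCreate()
    parser.ElementDeclHandler = on_element_decl
    parser.AttlistDeclHandler = on_attlist_decl
    parser.Parse(xml_text, True)
    return elements, attributes


# Hypothetical document with a small internal DTD subset.
sample = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
  <!ATTLIST note id CDATA #REQUIRED>
]>
<note id="n1"><to>a</to><body>b</body></note>"""

elements, attributes = collect_declarations(sample)
```

Because expat does the DTD tokenizing itself, this avoids hand-written declaration parsing, but the content models come back as raw tuples that still need interpreting.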

brown wrap | 14 Mar 2010 01:40

Can't Import libxml2

I am trying to compile a program, gnome-doc-utils, and it's producing an error saying it can't import libxml2:

ImportError: No module named libxml2

I have installed the following Python module, so I don't know why it can't find it.

PyXML-0.8.4

PyXML seems to be fairly old, and it says it is no longer maintained. Is there a replacement? Thanks.
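
For context, the `libxml2` module is not part of PyXML; it comes with the Python bindings of the libxml2 C library, which are packaged separately (the exact package name depends on the distribution). A small diagnostic sketch, assuming a modern Python with `importlib.util`, to check which modules the interpreter can actually find; `module_available` is a made-up helper name.

```python
import importlib.util


def module_available(name):
    """Return True if the running interpreter can locate a module called name."""
    return importlib.util.find_spec(name) is not None


# Report on the module the build asks for and on a standard-library XML module.
report = {name: module_available(name)
          for name in ("libxml2", "xml.dom.minidom")}
for name, found in sorted(report.items()):
    print(name, "found" if found else "missing")
```

Run it with the same interpreter the build uses, since a module installed for one Python version is not visible to another.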

      

Stefan Behnel | 11 Mar 2010 21:33

Re: Error installing PyXML on Ubuntu 9.10

Humberto Yances, 11.03.2010 21:21:
> El jue, 11-03-2010 a las 21:15 +0100, Stefan Behnel escribió:
>> Humberto Yances, 11.03.2010 20:13:
>>> I'm trying to install PyXML 0.8.4 on Ubuntu 9.10
>>
>> Why would you want to do that? PyXML has been unmaintained for years.
>>
>> What are you trying to do with it?
 >
 > It is needed for OpenERP-Server.  For Ubuntu the package was python-xml;
 > but no longer exist, so I must download and compile it.

No, you don't. According to the relevant bug reports in the Ubuntu bug
tracker, the OpenERP package has recently been fixed to work without PyXML,
so use a recent release instead.

Oh, and please keep discussions on-list and avoid top-posting.

Stefan

Humberto Yances | 11 Mar 2010 20:13

Error installing PyXML on Ubuntu 9.10

Hi everybody!

I'm trying to install PyXML 0.8.4 on Ubuntu 9.10, but the following error message appears:

http://paste.ubuntu.com/393468/

I've installed python-dev and tried with Python 2.4, 2.5 and 2.6, and the error remains.

Any suggestions, please?

Thanks,

Humberto Yances

Hichiro | 8 Mar 2010 10:42

read XML records one by one

Hi all!
I'm trying to read an XML file record by record to find each record's tags and attributes for schema matching, but I haven't managed it yet. Could you help me?
Thanks so much! :)
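
One possible approach, sketched under assumptions: `xml.etree.ElementTree.iterparse` from the standard library yields elements as they are completed, so records can be inspected one at a time without loading the whole file. The record tag name `record`, the helper `iter_records` and the sample data are all made up for illustration.

```python
import io
import xml.etree.ElementTree as ET


def iter_records(source, record_tag):
    """Yield (tag, attributes, child tags) for each completed record element."""
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == record_tag:
            yield elem.tag, dict(elem.attrib), [child.tag for child in elem]
            # Free the record's children once it has been inspected.
            elem.clear()


# Hypothetical input; a real file name can be passed in place of the BytesIO.
sample = b"""<records>
  <record id="1"><name>a</name></record>
  <record id="2"><name>b</name><note>x</note></record>
</records>"""

records = list(iter_records(io.BytesIO(sample), "record"))
for tag, attributes, children in records:
    print(tag, attributes, children)
```

Each tuple carries the tag, its attributes and the names of its child elements, which is roughly the information a schema-matching step would compare.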

--
Best regards,
Vinh NV
CNPM K50 DHBKHN
Y!    : Vinh.dhbk
Sky : Vinh.dhbk
84 976 314 988

Stefan Behnel | 22 Feb 2010 15:12

Re: HTML parse error

sharifah ummu kulthum, 22.02.2010 15:08:
> On Mon, Feb 22, 2010 at 10:06 PM, Stefan Behnel wrote:
> 
>> sharifah ummu kulthum, 22.02.2010 14:24:
>>> I am new to python. I have just installed python yesterday for my mythtv
>>> project. I found a site here
>>> <https://sayap.com/blog/2008/12/30/mythtv-s-xmltv-grabber-for-malaysia-channels>
>>> for getting channel listing grabber to get channel for Malaysia for my
>>> mythtv box. but I get these. I don't  know what it means
>>> [...]
>>> HTMLParser.HTMLParseError: malformed start tag, at line 830, column 36
>> It means that what you want to parse here is not valid HTML, i.e. the web
>> page is broken. The HTMLParser package in the standard library is not made
>> for parsing broken HTML. Use another tool like html5lib or lxml.html.
>>
>> Stefan
>>
> does it means that i have to install the tool?

Yes. That's pretty easy, though. They should be readily packaged for your
platform (Linux), so you can just install them like any other software
package. Look out for "python-html5lib" or "python-lxml".

Stefan

sharifah ummu kulthum | 22 Feb 2010 14:24

HTML parse error

Hi guys

I am new to Python. I installed Python only yesterday for my MythTV project. I found a site here with a channel listing grabber for Malaysian channels for my MythTV box, but when I run it I get the error below. I don't know what it means.

Any insight is very much appreciated, as I am very new to Python.

bitto <at> bitto:~$ python grabmy.py -f my.xml
Traceback (most recent call last):
  File "grabmy.py", line 236, in <module>
    main()
  File "grabmy.py", line 225, in main
    for elem in grabber.grab(date + timedelta(i), **params_dict):
  File "grabmy.py", line 102, in grab
    html = self.get_html(date, **kwargs)
  File "grabmy.py", line 63, in get_html
    return BeautifulSoup(content)
  File "build/bdist.linux-i686/egg/BeautifulSoup.py", line 1499, in __init__
  File "build/bdist.linux-i686/egg/BeautifulSoup.py", line 1230, in __init__
  File "build/bdist.linux-i686/egg/BeautifulSoup.py", line 1263, in _feed
  File "/usr/lib/python2.6/HTMLParser.py", line 108, in feed
    self.goahead(0)
  File "/usr/lib/python2.6/HTMLParser.py", line 148, in goahead
    k = self.parse_starttag(i)
  File "/usr/lib/python2.6/HTMLParser.py", line 226, in parse_starttag
    endpos = self.check_for_whole_start_tag(i)
  File "/usr/lib/python2.6/HTMLParser.py", line 301, in check_for_whole_start_tag
    self.error("malformed start tag")
  File "/usr/lib/python2.6/HTMLParser.py", line 115, in error
    raise HTMLParseError(message, self.getpos())
HTMLParser.HTMLParseError: malformed start tag, at line 830, column 36

Bitto
