Martijn Faassen | 9 Nov 10:20 2004

vlibxml2 and lxml

Hi there,

Back from my honeymoon I see that Victor has released vlibxml2. Cool!
Thanks Victor!

I hope that we can still converge it all into a single package --
vlibxml2 aims to do a fairly low-level straightforward mapping for parts
of the libxml2 API, as I understand it, while lxml aims to raise the
level of the API. The approaches don't need to exclude each other
however and pooling them together could help both.

Victor, I'm curious about the reasons why you didn't use the existing
lxml infrastructure (svn, mailing list, etc.) to do what you've been
doing? Were you under the impression that what you were aiming at is
outside the scope of lxml? I think people would prefer there not be two
Pyrex-based wrappers for libxml2. Is there anything we can change with
lxml so that you'd be happy to merge the projects?

I see that Victor changed his license from GPL to BSD (though the
README.txt of the 0.1.177 release still says it's GPL-ed). That's great;
it means no matter what happens with lxml/vlibxml2 convergence, lxml can
hopefully at least 'borrow' the memory management code from vlibxml2.
Thanks again Victor!

Regards,

Martijn
Victor Ng | 9 Nov 17:24 2004

Re: vlibxml2 and lxml

Actually - I started working on the vlibxml2 stuff prior to the 
announcement of the lxml project.  I was also negotiating an agreement 
at my workplace so that I can work on the XML library during office 
hours as long as the license was a BSD license.

All of that has been sorted out now - and I've really got no time to 
set up everything you folks have already done at codespeak.

I'd really like to keep vlibxml2 and lxml separate.  This is mostly for 
technical reasons as the libxml2 library has some really quirky 
behavior in its API.  I'd actually like to rewrite a lot of the 
vlibxml2 code now that I understand the idioms in libxml2 a little 
better.  I guess you really do need to do it 87 times before you get it 
right.  :)

So - in the interest of playing nice with everyone - can I get checkin 
privs to the lxml SVN repository?

vic

On 9-Nov-04, at 04:20 AM, Martijn Faassen wrote:

> Hi there,
>
> Back from my honeymoon I see that Victor has released vlibxml2. Cool!
> Thanks Victor!
>
> I hope that we can still converge it all into a single package --
> vlibxml2 aims to do a fairly low-level straightforward mapping for 
> parts

Martijn Faassen | 9 Nov 18:12 2004

Re: vlibxml2 and lxml

Hey,

[Philipp (hi, I'm back!), please read to the bottom for where you come in]

Victor Ng wrote:
> Actually - I started working on the vlibxml2 stuff prior to the 
> announcement of the lxml project. 

Oh, just to make it clear: I announced it back at EuroPython in early 
June, but it's easy to miss such announcements. The svn and stuff here 
came after that; the original code is in the Infrae cvs, here:

http://cvs.infrae.com/packages/lxml/

Then again, you might've been at it for a while too for all I know. I'm 
glad I caught you on the pyrex list and we can work together; it's 
already been beneficial to both of us, I hope.

> I was also negotiating an agreement 
> at my workplace so that I can work on the XML library during office 
> hours as long as the license was a BSD license.

Cool! Infrae has a 'BSD everything by default' policy, but that's 
because I co-own the company. :)

> All of that has been sorted out now - and I've really got no time to 
> setup everything you folks have already done at codespeak.

Of course that's mostly the work of Holger Krekel, helped by Philipp von 
Weitershausen; I can't really take much credit for it, just be glad that 

Victor Ng | 9 Nov 22:17 2004

Re: vlibxml2 and lxml

>> I'd really like to keep vlibxml2 and lxml separate.  This is mostly 
>> for technical reasons as the libxml2 library has some really quirky 
>> behavior in its API.
>
> So what about a source distribution that contains both libraries (if 
> they're done at all, vlibxml2 is obviously much further in that 
> department)? vlibxml2 contains a lot of very important foundational 
> work concerning memory management that I hope we can get the higher 
> level lxml stuff to use as well after a bit of refactoring.
>
> So, what I'm proposing is merging the vlibxml2 into lxml's 'src' 
> directory, but being its own package. What do you think? Of course 
> we'd also need to figure out what to do with the extensions package; 
> I'm not familiar enough with vlibxml2's source layout to know where 
> everything goes. Ideas?

Sounds good to me.  Philipp has set me up on SVN at codespeak now.  
I'll check in my code tonight, make a branch and play around with 
source layout and maybe we can come to an agreement on how everything 
should fit together.

> If you'd like and can come up with a good name, we could rename the 
> whole 'lxml' distribution into something else that may be more 
> neutral. We could then call it all <foo>.libxml2, <foo>.libxslt, 
> <foo>.dom and <foo>.elementtree. I.e. one top level package to make 
> the namespaces clear, with sub-modules/packages that offer particular 
> functionalities. vlibxml2 would become 'libxml2'. Though perhaps this 
> all promises *too* much API compatibility with the original 
> libxml2/elementtree/etc for us to feel comfortable about?


Fred Drake | 10 Nov 06:30 2004

Re: Re: vlibxml2 and lxml

On Tue, 9 Nov 2004 16:17:09 -0500, Victor Ng <vng1 <at> mac.com> wrote:
> I'm going to take a crack at
> learning SWIG to see if I can close that gap a little faster though.
> Maybe close the gap entirely. :)

Please, no, Pyrex is the way to go!

  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at gmail.com>
Martijn Faassen | 10 Nov 10:26 2004

Re: vlibxml2 and lxml

Victor Ng wrote:
[snip]
>> So, what I'm proposing is merging the vlibxml2 into lxml's 'src' 
>> directory, but being its own package. What do you think? Of course 
>> we'd also need to figure out what to do with the extensions package; 
>> I'm not familiar enough with vlibxml2's source layout to know where 
>> everything goes. Ideas?
> 
> Sounds good to me.  Philipp has set me up on SVN at codespeak now.  I'll 
> check in my code tonight, make a branch and play around with source 
> layout and maybe we can come to an agreement on how everything should 
> fit together.

Great!

>> If you'd like and can come up with a good name, we could rename the 
>> whole 'lxml' distribution into something else that may be more 
>> neutral. We could then call it all <foo>.libxml2, <foo>.libxslt, 
>> <foo>.dom and <foo>.elementtree. I.e. one top level package to make 
>> the namespaces clear, with sub-modules/packages that offer particular 
>> functionalities. vlibxml2 would become 'libxml2'. Though perhaps this 
>> all promises *too* much API compatibility with the original 
>> libxml2/elementtree/etc for us to feel comfortable about?
> 
> Naming is pretty low on my priority list.

True: we can always reorganize in the future. The nice thing about having a 
single outer namespace package is that you're free to name anything in 
the project whatever you like. I'm doing this with the 'lxml' namespace, 
basically.

Victor Ng | 12 Nov 01:19 2004
Picon

Re: vlibxml2 and lxml

Hey,

I managed to get my code checked into the trunk line of lxml just now.  
I just checked in my old vlibxml2 tree under lxml's trunk.

How do you want to co-ordinate the source tree changes?  Is there an 
automated build process that I should be aware of?  I'm on an OSX 
machine so I don't have access to valgrind - I was wondering if I can 
see build results that show that memory leaks are not actually 
happening.

What's the intent of having the branch vs the tag directories in the 
svn repository?  Are tags considered releases?

A couple of notes about my vlibxml2 - there's a bug in the replaceNode 
method.  I'm not sure exactly what's going on but when I actually use 
the code in my 'real' project at work I get weird XML output.  I'll try 
to investigate tonight to see exactly what's going on.

so many questions, so little time...

vic
_______________________________________________
lxml-dev mailing list
lxml-dev <at> codespeak.net
http://codespeak.net/mailman/listinfo/lxml-dev
Philipp von Weitershausen | 12 Nov 06:44 2004

Re: vlibxml2 and lxml

Victor Ng wrote:
> Hey,
> 
> I managed to get my code checked into the trunk line of lxml just now.  
> I just checked in my old vlibxml2 tree under the trunk of lxml's trunk.
> 
> How do you want to co-ordinate the source tree changes?  Is there an 
> automated build process that I should be aware of?  I'm on an OSX 
> machine so I don't have access to valgrind - I was wondering if I can 
> see build results that show that memory leaks are not actually happening.
> 
> What's the intent of having the branch vs the tag directories in the svn 
> repository?  Are tags considered releases?

That is a standard convention for svn repository layouts. svn only ever 
compares directory trees. So, effectively, branches are copies of 
directories, and so are tags, except that tag copies aren't modified 
after copying while branches are. Later, when you want to merge a 
branch, you let svn compare two directory trees and apply the difference 
to another one.
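
To make the "compare two trees, apply the difference to a third" idea 
concrete, here is a toy model in Python. It is only a sketch and not how 
svn stores or merges anything: a tree is a plain dict mapping a path to 
its content, the names diff_trees and apply_changes are made up for this 
illustration, and conflicts are ignored entirely.

```python
def diff_trees(old, new):
    """Return the changes that turn tree `old` into tree `new`.

    A value of None marks a path that was deleted.
    """
    changes = {}
    for path in old.keys() | new.keys():
        if old.get(path) != new.get(path):
            changes[path] = new.get(path)
    return changes


def apply_changes(tree, changes):
    """Return a copy of `tree` with `changes` applied (no conflict handling)."""
    result = dict(tree)
    for path, content in changes.items():
        if content is None:
            result.pop(path, None)
        else:
            result[path] = content
    return result


# The trunk as it looked when the branch and the tag were made.
trunk_at_copy = {"setup.py": "v1", "src/etree.pyx": "v1"}

tag = dict(trunk_at_copy)      # a tag is a copy that nobody modifies
branch = dict(trunk_at_copy)   # a branch is a copy that keeps changing
branch["src/etree.pyx"] = "v2"
branch["src/dom.pyx"] = "v1"

# Meanwhile the trunk has moved on independently.
trunk_now = dict(trunk_at_copy)
trunk_now["setup.py"] = "v2"

# Merging: compare two trees, apply the difference to a third one.
merged = apply_changes(trunk_now, diff_trees(trunk_at_copy, branch))
```

In this model the merge picks up both branch changes and keeps the 
trunk's own change to setup.py; the tag still equals the original copy.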

I strongly suggest reading some svn documentation. I can recommend the 
svn book (http://svnbook.red-bean.com/). If you already know CVS, 
there's a chapter in there for those who switch...

Philipp
Martijn Faassen | 12 Nov 09:50 2004

source code layout

Victor Ng wrote:
> I managed to get my code checked into the trunk line of lxml just now.  
> I just checked in my old vlibxml2 tree under the trunk of lxml's trunk.

Great, thank you very much!

I hope you don't mind if I babble a bit more about source code layout 
and naming issues; I'd like to have them worked out. Luckily subversion 
makes it easy to rename and move things!

One issue is that now we have the following layout of the source tree:

lxml (distribution directory)
   src
     lxml (package)
       dom (module)
       etree (module)
   vlibxml2 (distribution directory)
     src
       extensions (package?)
       victree (this looks new. it's empty however :)
       vlibxml2 (package)

What I was suggesting before was a layout more like this:

lxml (distribution directory, merge vlibxml2 distribution directory info into this)
   src
     lxml (package)
       dom (module)

Martijn Faassen | 12 Nov 10:07 2004

Re: vlibxml2 and lxml

Hey,

I split off a discussion about source code layout into a separate
thread, but I'll answer these ones here:

Victor Ng wrote:
[snip]
> How do you want to co-ordinate the source tree changes?  Is there an
>  automated build process that I should be aware of?

We don't have an automated build process.

I think we should be able to merge your setup.py with the one in lxml
already. Mine does its best to shut up gcc about all kinds of warnings in
Pyrex-generated code. This is not very portable beyond gcc; we need to
work out eventually how to pass along different options with other
compilers such as on Windows.
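
One possible way to handle this, sketched under assumptions: choose the 
warning-suppression flags from the compiler type that the build command 
reports. The names quiet_flags and QuietBuildMixin are invented for this 
sketch, and the flags shown (-w for gcc, /w for MSVC) are only examples of 
"suppress all warnings" switches, not necessarily what the lxml setup.py 
passes.

```python
# Example flags per distutils/setuptools compiler type (an assumption,
# not the options actually used in lxml's setup.py).
WARNING_FLAGS = {
    "unix": ["-w"],      # gcc and compatible compilers
    "mingw32": ["-w"],   # gcc on Windows
    "msvc": ["/w"],      # Microsoft Visual C++
}


def quiet_flags(compiler_type):
    """Return warning-suppression flags for a compiler type, or [] if unknown."""
    return list(WARNING_FLAGS.get(compiler_type, []))


class QuietBuildMixin:
    """Mix into a build_ext command class to add the flags at build time.

    The compiler is only known once the command runs, which is why the
    flags are added here rather than in the Extension() definitions.
    """

    def build_extensions(self):
        flags = quiet_flags(self.compiler.compiler_type)
        for ext in self.extensions:
            ext.extra_compile_args = list(ext.extra_compile_args or []) + flags
        super().build_extensions()
```

To use it, one would combine the mixin with the real build_ext command 
class (for example the one from setuptools) and pass the result to 
setup() through the cmdclass argument. Unknown compilers simply get no 
extra flags, so the build still works there, just with more noise.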

The Makefile in lxml I believe I took more or less from the Zope 3
project. Most relevant are 'make' and 'make test'.

> I'm on an OSX machine so I don't have access to valgrind - I was
> wondering if I can see build results that show that memory leaks are
> not actually happening.

No automated build process. The thing I do is:

valgrind --tool=memcheck --suppressions=valgrind-python.supp python2.3 test.py


