Olly Betts | 1 Mar 23:37
Favicon
Gravatar

Re: [Xapian-commits] 7817: trunk/xapian-core/ trunk/xapian-core/common/

On Thu, Mar 01, 2007 at 03:29:26PM +0000, richard wrote:
> * common/utils.c,common/utils.h: For windows, add an om_tostring()function for
>   64 bit integers: time() and GetProcessId() return these, and without this,
>   backends/quartz/quartz_log.cc reports an error.

This change generates a new warning in the mingw cross-build:

common/utils.cc:82: warning: int format, different type arg (arg 3)

Presumably this warning is bogus, since mingw uses the MSVC C runtime
DLL?

Cheers,
    Olly
Richard Boulton | 2 Mar 10:14

Re: Re: [Xapian-commits] 7817: trunk/xapian-core/ trunk/xapian-core/common/

Olly Betts wrote:
> On Thu, Mar 01, 2007 at 03:29:26PM +0000, richard wrote:
>> * common/utils.c,common/utils.h: For windows, add an om_tostring()function for
>>   64 bit integers: time() and GetProcessId() return these, and without this,
>>   backends/quartz/quartz_log.cc reports an error.
> 
> This change generates a new warning in the mingw cross-build:
> 
> common/utils.cc:82: warning: int format, different type arg (arg 3)
> 
> Presumably this warning is bogus, since mingw uses the MSVC C runtime
> DLL?

It sounds pretty bogus - perhaps the best plan would be to put together 
a test (presumably in internaltest.cc) which checked that it was working 
correctly.  I'll look into that shortly.

--

-- 
Richard
Olly Betts | 2 Mar 13:56
Favicon
Gravatar

Re: Re: [Xapian-commits] 7817: trunk/xapian-core/ trunk/xapian-core/common/

On Fri, Mar 02, 2007 at 09:14:56AM +0000, Richard Boulton wrote:
> Olly Betts wrote:
> >On Thu, Mar 01, 2007 at 03:29:26PM +0000, richard wrote:
> >>* common/utils.c,common/utils.h: For windows, add an 
> >>om_tostring()function for
> >>  64 bit integers: time() and GetProcessId() return these, and without 
> >>  this,
> >>  backends/quartz/quartz_log.cc reports an error.

If this is only required by the quartz log stuff, I wonder if we should
just drop that code.

It doesn't get used unless you touch "log" in a quartz database by hand,
but that's not documented anywhere so I doubt anybody except you and me
knows to even try it!  We could document it, but I never actually found
it a useful debugging aid, and haven't used it at all in years.  Plus
quartz will be deprecated for new systems as of 1.0 anyway.

But I suspect support for 64 bit integers is also required for the debug
logging code.

Hmm, and I don't understand how this worked for MSVC before this patch
if it gives an error now.  Or why I haven't seen an error building for
mingw.

> >This change generates a new warning in the mingw cross-build:
> >
> >common/utils.cc:82: warning: int format, different type arg (arg 3)
> >
> >Presumably this warning is bogus, since mingw uses the MSVC C runtime
(Continue reading)

Richard Boulton | 2 Mar 14:10

Re: Re: [Xapian-commits] 7817: trunk/xapian-core/ trunk/xapian-core/common/

Olly Betts wrote:
> If this is only required by the quartz log stuff, I wonder if we should
> just drop that code.
> 
> It doesn't get used unless you touch "log" in a quartz database by hand,
> but that's not documented anywhere so I doubt anybody except you and me
> knows to even try it!  We could document it, but I never actually found
> it a useful debugging aid, and haven't used it at all in years.  Plus
> quartz will be deprecated for new systems as of 1.0 anyway.

All true, and good points - I haven't ever really used it either.

> But I suspect support for 64 bit integers is also required for the debug
> logging code.

Quite possibly - we've not got round to building with debug logging 
turned on yet.

> Hmm, and I don't understand how this worked for MSVC before this patch
> if it gives an error now.  Or why I haven't seen an error building for
> mingw.

I think it's only needed for the newer compilers, because time_t only 
recently became a 64 bit quantity.  And GetProcessId similarly, I 
suppose.  Charlie will know more detail.

> A testcase is a good idea, but it's going to be tricky for me to
> exercise it in a cross-build...

Fair point.  I've written one anyway, but I can't test it either.  I'll 
(Continue reading)

Olly Betts | 2 Mar 15:02
Favicon
Gravatar

Re: Re: [Xapian-commits] 7817: trunk/xapian-core/ trunk/xapian-core/common/

On Fri, Mar 02, 2007 at 01:10:07PM +0000, Richard Boulton wrote:
> Olly Betts wrote:
> >Hmm, and I don't understand how this worked for MSVC before this patch
> >if it gives an error now.  Or why I haven't seen an error building for
> >mingw.
> 
> I think it's only needed for the newer compilers, because time_t only 
> recently became a 64 bit quantity.  And GetProcessId similarly, I 
> suppose.  Charlie will know more detail.

Ah, and I bet there's some C preprocessor magic or other trickery going
on so that programs can use the same runtime DLLs but just call a
different version of the functions concerned.

> >A testcase is a good idea, but it's going to be tricky for me to
> >exercise it in a cross-build...
> 
> Fair point.  I've written one anyway, but I can't test it either.  I'll 
> add it to a bug report, so we don't forget to look into this.

I suggest you commit it to HEAD anyway.  It can be tested with MSVC, and
if there's a bug in the test, the worst that can happen is it might stop
"make check" from working under native mingw when someone tries it,
until the problem is reported and fixed.

Cheers,
    Olly
Olly Betts | 2 Mar 20:47
Favicon
Gravatar

svn-ci script (was: Re: [Xapian-commits] 7830: trunk/xapian-core/)

On Fri, Mar 02, 2007 at 12:20:44PM +0000, richard wrote:
> * HACKING: Add note about how to generate ChangeLog timestamps
>   using the unix date command - and I've started generating them in
>   the same format as Olly is. (I hope.)

I actually use a perl script which does:

    print CO strftime("%a %h %d %H:%M:%S %Z %Y", localtime);

But %b and %h are just aliases and %T is %H:%M:%S, so your date command
should produce identical results.

I guess my "svn ci" wrapper script is probably useful to other people
with commit access, so I've just committed it to
xapian-maintainer-tools.  Feel free to use it as is, hack it about, or
just ignore it:

http://svn.xapian.org/trunk/xapian-maintainer-tools/svn-ci?view=markup

It writes the tedious parts of the ChangeLog entry for you, and pastes
in "svn diff" for the files/directories you are committing, which I find
helps avoid me checking in other changes that might be in my tree by
mistake, and helps make sure I document all changes in the ChangeLog
entry.  It also massages the ChangeLog entry into a suitable SVN commit
message.  If you fail to delete all the diff fragments or conflict
markers, it won't let you commit.

It has a few foibles, but generally you get to see them before it
commits at least.  The only exception is that if ChangeLog is modified
and doesn't appear to have diff fragments or conflict markers, it will
(Continue reading)

Richard Boulton | 3 Mar 02:05

Error handling in the bindings

Currently, the SWIG generated bindings catch exceptions raised by Xapian 
and rethrow them using the standard SWIG error types: for example, if 
any Xapian::DatabaseError exception is raised, SWIG will use its
standard SWIG_IOError error type to report the error.

In Python, this leads to code like the following:

try:
     db.get_document(1)
except RuntimeError, e:
     if str(e).startswith('DocNotFoundError:'):
         # Handle a DocNotFoundError

(Incidentally, the error raised in python for this actually is a 
RuntimeError, not an IOError as I would have expected from reading the 
SWIG code, but that's not a major issue.)

For Python, it would be much nicer to have the full exception heirarchy 
of Xapian mirrored as Python classes, with the appropriate inheritance 
structure.  This is probably true for other languages too.  Looking at 
the code, I don't think it would take terribly long to sort this out for 
Python, but I haven't looked at what's needed for the other bindings we 
support.  It would certainly be much nicer to be able to write the above 
Python snippet as:

try:
     db.get_document(1)
except xapian.DocNotFoundError, e:
     # Handle a DocNotFoundError

(Continue reading)

Olly Betts | 3 Mar 02:46
Favicon
Gravatar

Re: Error handling in the bindings

On Sat, Mar 03, 2007 at 01:05:05AM +0000, Richard Boulton wrote:
> try:
>     db.get_document(1)
> except RuntimeError, e:
>     if str(e).startswith('DocNotFoundError:'):
>         # Handle a DocNotFoundError

This is probably the worst thing about the bindings at present.

> For Python, it would be much nicer to have the full exception heirarchy 
> of Xapian mirrored as Python classes, with the appropriate inheritance 
> structure.  This is probably true for other languages too.  Looking at 
> the code, I don't think it would take terribly long to sort this out for 
> Python, but I haven't looked at what's needed for the other bindings we 
> support.

For all those that support classes with inheritance and throwing classes
as exceptions, I think it should be fairly easy.

Of the SWIG bindings, Python and C# do.  Ruby almost certainly does, but
I don't know it very well.  PHP5 does, but PHP4 doesn't support
exceptions (the current approach is clumsy, but probably about as good
as we're likely to get there).  Tcl8 essentially only supports throwing
a string from what I recall (but I could be wrong).

And of the non-SWIG bindings, Java does, and Perl effectively does (the
mechanism is a bit different though).

> In theory, the good thing about using the built in SWIG error handling 
> system is that we don't have to write code for each language we support 
(Continue reading)

James Aylett | 3 Mar 04:22

Re: Error handling in the bindings

On Sat, Mar 03, 2007 at 01:46:41AM +0000, Olly Betts wrote:

> For all those that support classes with inheritance and throwing classes
> as exceptions, I think it should be fairly easy.
> 
> Of the SWIG bindings, Python and C# do.  Ruby almost certainly does, but
> I don't know it very well.  PHP5 does, but PHP4 doesn't support
> exceptions (the current approach is clumsy, but probably about as good
> as we're likely to get there).  Tcl8 essentially only supports throwing
> a string from what I recall (but I could be wrong).

Given Xapian needs to be installed -- ie there are going to be very
few "standard" hosting providers that support it -- we don't have to
worry about PHP4 too much. Most of the PHP extensions and useful bits
of code have started dropping PHP4 support now. (At least, most of the
ones I find.)

I was always intending to add proper exception support to the Python
bindings, but I never found the time.

> > and it would be nice to get this done before we release 1.0 if
> > possible, so unless someone shouts up with a reason that it's not a
> > good idea to try this, I'll try making a patch to fix up the Python
> > exception handling next week, to see how feasible it is.
> 
> I think it would be very useful.  And 1.0 would be a natural point to
> make such a change.
> 
> I am starting to wonder at the number of things that have been noted
> as "ought to be done for 1.0" though.  It would be good to get it
(Continue reading)

Olly Betts | 6 Mar 02:49
Favicon
Gravatar

Merging stats from multiple databases for expand

In matcher/expandweight.cc we have:

OmExpandBits
operator+(const OmExpandBits &bits1, const OmExpandBits &bits2)
{
    OmExpandBits sum(bits1); 
    sum.multiplier += bits2.multiplier;
    sum.rtermfreq += bits2.rtermfreq;

    // FIXME - try to share this information rather than pick half of it
    if (bits2.dbsize > sum.dbsize) {
        DEBUGLINE(WTCALC, "OmExpandBits::operator+ using second operand: " <<
                  bits2.termfreq << "/" << bits2.dbsize << " instead of " <<
                  bits1.termfreq << "/" << bits1.dbsize);
        sum.termfreq = bits2.termfreq;
        sum.dbsize = bits2.dbsize;
    } else {
        DEBUGLINE(WTCALC, "OmExpandBits::operator+ using first operand: " <<
                  bits1.termfreq << "/" << bits1.dbsize << " instead of " <<
                  bits2.termfreq << "/" << bits2.dbsize);
        // sum already contains the parts of the first operand
    }
    return sum;
}

Why don't we "share this information" by just replacing the "if" by:

    sum.termfreq += bits2.termfreq;
    sum.dbsize += bits2.dbsize;

(Continue reading)


Gmane