Olly Betts | 1 Jun 01:29

Re: Building RPM on RHEL4

On Thu, May 31, 2007 at 10:56:18PM +0100, Tim Brody wrote:
> [tdb01r@shorty xapian-core-1.0.0]$ autoreconf --force
> configure.ac:112: require Automake 1.9.5, but have 1.9.2
> autoreconf: automake failed with exit status: 1
> 
> Dropping the version to 1.9.2 lets it package on RHEL4 without error 
> (although I haven't tested it on a live system).

Should be fine - there wasn't a good reason to require 1.9.5, just a
good reason not to require anything newer.

> DAG rpm only has mono-1.1 for RHEL4, so CSharp's out.

Umm (again!):

    BuildRequires: mono-devel >= 1.1

> RPM is 4.3.3 on RHEL4.
> 
> $ rpmbuild --without csharp --without python --ta xapian-bindings-1.0.0.tar.gz
> ...
> $ rpmbuild --without csharp --ta xapian-bindings-1.0.0.tar.gz
> 
> ...
> 
> RPM build errors:
>    File not found: /var/tmp/xapian-bindings-1.0.0-1-root/usr/lib/python2.3/site-packages/xapian.pyo
> 
[...]

Olly Betts | 1 Jun 01:58

Re: Building RPM on RHEL4

On Fri, Jun 01, 2007 at 12:29:52AM +0100, Olly Betts wrote:
> It looks like newer RPM generates the .pyo.  Just touching it into
> existence doesn't seem a good plan.  I'm taking a look.

I wonder if we should always generate the .pyo in the bindings build
system, like we do for the .pyc file.  Something like this patch
maybe:

http://oligarchy.co.uk/xapian/patches/xapian-pyo.patch
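
For illustration only, here is a minimal sketch of byte-compiling a module
both ways with Python's standard py_compile module.  It is not the contents
of the patch above.  The module name demo_module.py is made up, and the
sketch assumes a modern Python 3, where the optimised output that used to be
a .pyo file is written as an .opt-1.pyc file under __pycache__.  On the
Python 2.x versions discussed in this thread, the .pyo was produced by
running the compile step under "python -O".

```python
import os
import py_compile
import tempfile


def byte_compile_both(source_path):
    """Compile source_path unoptimised and at optimisation level 1.

    Returns the two output paths: (plain bytecode, optimised bytecode).
    """
    plain = py_compile.compile(source_path, doraise=True, optimize=0)
    optimised = py_compile.compile(source_path, doraise=True, optimize=1)
    return plain, optimised


# Hypothetical module, written to a scratch directory for the demo.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "demo_module.py")
with open(source, "w") as f:
    f.write("VALUE = 42\n")

plain_path, optimised_path = byte_compile_both(source)
print(plain_path)
print(optimised_path)
```

Generating both files at build time means the packaging step can list them
unconditionally, whatever the installed RPM version does by itself.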

Richard: What do you think?

Cheers,
    Olly

Olly Betts | 1 Jun 02:13

Re: Searching date range on a custom field

On Thu, May 31, 2007 at 03:12:18PM -0700, Matt Barnicle wrote:
> <meta name="dateBegin" content="20070210" />
> <meta name="dateEnd" content="20070217" />
> 
> These correspond to the start and end dates of an event.  I also have a 
> tag for when the rendered page is an event (we have many types of pages 
> on the site):
> 
> <meta name="pageType" content="event" />
> 
> So, I need to search on events given a date.  Say the date is 20070213, 
> how do I search for event pages where the supplied date is within the 
> dateBegin .. dateEnd range?

This is the reverse of how Omega's date range feature works: that expects
each document to have a single date, and the user to restrict their
search to documents within a specified date range.

> I'm using htdig to crawl the site, and htdig2omega to create the index.  
> The index creation and field mapping works just fine, and so does 
> searching on the boolean page type.  Here is my htdig2omega.script file:
> 
> url : field=url hash boolean=Q unique=Q
> title : weight=3 index truncate=80 field=title
> lastMod : field=lastmod
> size : field=size
> sample : index truncate=300 field=sample
> metaDesc : field=metadesc index
> pageType : field=pageType boolean=XPT
> eventName : field=eventName weight=3 index
[...]

Olly Betts | 1 Jun 02:18

Re: QueryParser prefixing terms when stemming?

On Wed, May 30, 2007 at 10:20:52PM +0100, Richard Boulton wrote:
> It might be worth mentioning that Xapian now includes some routines for 
> generating terms from text in a manner compatible with the query parser: 
> see the TermGenerator class.  (I'm afraid I'm not sure exactly how 
> that's wrapped in Search::Xapian, not being a Perl user.)

It's wrapped to closely follow the C++ API, except that it doesn't wrap
the C++ methods which take a Utf8Iterator object because we don't wrap
Utf8Iterator at all (since Perl has built in Unicode support).

Cheers,
    Olly

Matt Barnicle | 1 Jun 04:34

Re: Searching date range on a custom field

Olly Betts wrote:
> On Thu, May 31, 2007 at 03:12:18PM -0700, Matt Barnicle wrote:
>   
>> I'm using htdig to crawl the site, and htdig2omega to create the index.  
>> The index creation and field mapping works just fine, and so does 
>> searching on the boolean page type.  Here is my htdig2omega.script file:
>>
>> url : field=url hash boolean=Q unique=Q
>> title : weight=3 index truncate=80 field=title
>> lastMod : field=lastmod
>> size : field=size
>> sample : index truncate=300 field=sample
>> metaDesc : field=metadesc index
>> pageType : field=pageType boolean=XPT
>> eventName : field=eventName weight=3 index
>> dateBegin : field=dateBegin date=yyyymmdd
>> dateEnd : field=dateEnd date=yyyymmdd
>>     
>
> The scriptindex "date" action is designed to allow you to do date range
> filtering when each document has a single date, so this won't really
> work.
>
> You could make it work if you ran the date action on every date in the
> range, but if your ranges are long, that's going to generate a lot of
> terms.
>   
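
As a rough sketch of the "one term per day in the range" idea in the quoted
text: expand each event's dateBegin .. dateEnd into one boolean term per
day, after which a search for a single date is an ordinary term lookup.  The
XD prefix, the function name and the in-memory membership check are
assumptions for illustration; this is not scriptindex's own code.  The
number of terms grows with the length of the range, as noted above.

```python
from datetime import datetime, timedelta


def day_terms(date_begin, date_end, prefix="XD"):
    """Return one term per day from date_begin to date_end inclusive.

    Dates are yyyymmdd strings; the prefix is a hypothetical choice.
    """
    start = datetime.strptime(date_begin, "%Y%m%d").date()
    end = datetime.strptime(date_end, "%Y%m%d").date()
    if end < start:
        raise ValueError("date_end is before date_begin")
    days = (end - start).days + 1
    return [prefix + (start + timedelta(days=i)).strftime("%Y%m%d")
            for i in range(days)]


# The event from the original question: 20070210 .. 20070217.
terms = day_terms("20070210", "20070217")
print(len(terms))               # 8 terms for an 8-day event
print("XD20070213" in terms)    # True: the event is active on that date
```
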
You know...  That might actually work out.  I'll do some investigation on 
our data and see what the distribution of the number of active days per 
event looks like...  Though I'm kicking around another solution in my head at 
[...]

Richard Boulton | 1 Jun 10:38

Re: Building RPM on RHEL4

Olly Betts wrote:
> On Fri, Jun 01, 2007 at 12:29:52AM +0100, Olly Betts wrote:
>> It looks like newer RPM generates the .pyo.  Just touching it into
>> existence doesn't seem a good plan.  I'm taking a look.
> 
> I wonder if we should always generate the .pyo in the bindings build
> system, like we do for the .pyc file.  Something like this patch
> maybe:
> 
> http://oligarchy.co.uk/xapian/patches/xapian-pyo.patch
> 
> Richard: What do you think?

I can't see any problem with this idea, but I've never been convinced 
that .pyo files are actually of any use (with the current level of 
optimisation, anyway).  However, if it will help RPM builds or anything, 
go ahead.

The only risk I can see is that asserts aren't checked in .pyo files, 
which could be confusing if we're trying to track down a user-reported 
error.  However, we don't have many Python asserts anyway.
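
The point about asserts can be seen in a small sketch.  It uses the
"optimize" argument of Python 3's built-in compile() as a stand-in for the
difference between .pyc and .pyo files; the check() function is made up for
the demonstration.

```python
SOURCE = """
def check(value):
    assert value > 0, "value must be positive"
    return value
"""


def load_check(optimize):
    """Compile SOURCE at the given optimisation level and return check()."""
    namespace = {}
    code = compile(SOURCE, "<demo>", "exec", optimize=optimize)
    exec(code, namespace)
    return namespace["check"]


checked = load_check(optimize=0)    # like a .pyc: asserts are kept
unchecked = load_check(optimize=1)  # like a .pyo: asserts are removed

print(unchecked(-1))  # the bad value passes straight through
try:
    checked(-1)
except AssertionError as exc:
    print("AssertionError:", exc)
```

So a user running the optimised files could get past a check that would
have stopped the unoptimised code, which is the confusion described above.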

(Of course, .pyc and .pyo files shouldn't be included in Debian 
packages, since they'll be generated at install time.  I imagine RPM 
does something similar.)

-- 
Richard

Olly Betts | 1 Jun 11:05

Re: Building RPM on RHEL4

On Fri, Jun 01, 2007 at 09:38:29AM +0100, Richard Boulton wrote:
> (Of course, .pyc and .pyo files shouldn't be included in Debian 
> packages, since they'll be generated at install time.  I imagine RPM 
> does something similar.)

For Fedora at least, the current recommendation is to install them:

http://fedoraproject.org/wiki/Packaging/Python#head-e48d83dfeb5e671e2018d361d6e75d7e6c6e519c

So it sounds like we should remove the "%ghost" (which seems to
mean "don't install this file, but do remove it when uninstalling
the package if it exists").

If I understand correctly, RPM generates .pyc and .pyo automatically
for any .py file at package build time.  But RHEL 4 ships an older
version of RPM, from before this feature was added, which is why Tim
hit this problem.

Cheers,
    Olly

Tim Brody | 1 Jun 13:16

Re: Building RPM on RHEL4

Olly Betts wrote:
> On Thu, May 31, 2007 at 10:56:18PM +0100, Tim Brody wrote:
>> [tdb01r@shorty xapian-core-1.0.0]$ autoreconf --force
>> configure.ac:112: require Automake 1.9.5, but have 1.9.2
>> autoreconf: automake failed with exit status: 1
>>
>> Dropping the version to 1.9.2 lets it package on RHEL4 without error 
>> (although I haven't tested it on a live system).
> 
> Should be fine - there wasn't a good reason to require 1.9.5, just a
> good reason not to require anything newer.
> 
>> DAG rpm only has mono-1.1 for RHEL4, so CSharp's out.
> 
> Umm (again!):
> 
>     BuildRequires: mono-devel >= 1.1

Oops, I meant xapian requires 1.1, but DAG only has 1.0.

Here are the versions of all the direct dependencies that RHEL4 provides:

autoconf-2.59
automake-1.9.2
libtool-1.5.6
python-devel-2.3.4
php-devel-4.3.9
tcl-devel-8.4.7

Using the RPMs for mono 1.2.4 from 
[...]

James Aylett | 1 Jun 13:32

Re: Splitting terms into separate B-Trees.

On Thu, May 31, 2007 at 01:58:19PM -0700, Kevin Duraj wrote:

> This would explain why Google in their documents claim to use cheap
> servers, because partitioning B-Tree indexes by the terms can bring
> the index down to any small size that the server can handle well, and
> actually the servers would become the part of the B-Tree. There is no
> limit how much you can search and size down the database.

However, Google doesn't care when identical adjacent searches give
different results. There are lots of things you can do if you're
trying to solve Google's problem, which often won't apply in a more
constrained system. (You'd never want to do that kind of thing with a
medical search system, for instance :-)

J

-- 
/--------------------------------------------------------------------------\
  James Aylett                                                  xapian.org
  james <at> tartarus.org                               uncertaintydivision.org

Fabrice Colin | 2 Jun 13:23

Re: Building RPM on RHEL4

On 6/2/07, Tim Brody <tdb01r <at> ecs.soton.ac.uk> wrote:
> Using the RPMs for mono 1.2.4 from
> http://www.go-mono.com/download-stable/rhel-4-i386/ I can build the
> -csharp package. (As mono isn't in RHEL4 anyway, that's not
> unreasonable, just less convenient than DAG).
>
> Adding your patch for the python Makefile allows the python RPM to be built.
>
> I've added the remaining two RPMs to my home directory.
>
I could upload them to http://www.xapian.org/RPM together with those
I built for Fedora 6. James, Olly, would that be okay?

Fabrice
