Arjen van der Meijden | 10 Jan 15:37 2003

Again some questions about xapian (attributes, remote, legal)

Hi list,

I have, again, some questions about xapian. We have decided to keep
using xapian for our forum and are trying to integrate it somewhat
better, by wrapping it into the php-software.
That is to enable us to do some simple rights checks (i.e. allowing
someone to search only in the subforums he has access to, by supplying a
list of boolean terms that represent those subforums).
Three models come to mind for using omega within the webserver cluster.
The first is simply to redirect anyone who wants to search to a
"dedicated search server", where the php application can either call
omega as a command-line application or connect to a unix socket; this is
the simplest but (I think) least desirable method. The second is
to connect from php over a tcp socket to a c++ application that is a
version of omega adjusted to run as a daemon. The third is to set up a
remote database listener and adjust the omega clients to use that remote
database (and have the webservers call the local omega, either as a
command-line application or again through some form of (unix) socket).
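
The rights check described above could be sketched roughly as follows: the
php layer (or any wrapper) decides which subforums the user may see and
hands omega one boolean filter per subforum. This is only a minimal sketch.
It assumes omega's P (query text) and B (boolean filter) CGI parameters; the
"XF" term prefix and the function names are made up for illustration, and
how filters sharing a prefix are combined should be checked against the
omega documentation.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Percent-encode everything except the unreserved URL characters.
std::string url_encode(const std::string& in) {
    std::string out;
    for (unsigned char c : in) {
        if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
            (c >= '0' && c <= '9') ||
            c == '-' || c == '_' || c == '.' || c == '~') {
            out += static_cast<char>(c);
        } else {
            char buf[4];  // '%' + two hex digits + terminating NUL
            std::snprintf(buf, sizeof buf, "%%%02X", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}

// Build the CGI query string handed to omega.
// P carries the user's query; each B adds one boolean filter term.
// "XF" is a hypothetical prefix for subforum ids, not an omega convention.
std::string build_omega_query(const std::string& user_query,
                              const std::vector<int>& allowed_subforums) {
    std::string qs = "P=" + url_encode(user_query);
    for (int id : allowed_subforums) {
        qs += "&B=" + url_encode("XF" + std::to_string(id));
    }
    return qs;
}
```

The point of building the filter list on the server side is that the user
never gets to choose the B terms himself, so he cannot widen his own access.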

All three methods keep the search logic on one machine and, depending on
the choice of distribution, leave a small or relatively large amount of
work to be done locally.
We'll probably start with the first method, since that is the easiest
to set up.

Does anyone on the list have tips on which approach performs best?

To integrate it even better, a direct connection from php to the
search-databases could be nice (not necessarily better, php is slow and
clumsy with memory compared to c++). What part of xapian will be
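available for the php-bindings. Does that include the query-parser (ie
we are lazy (like all programmers are said to be) and would like to
reuse that functionality)?

[...]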

James Aylett | 10 Jan 16:03 2003

Re: Again some questions about xapian (attributes, remote, legal)

On Fri, Jan 10, 2003 at 03:37:02PM +0100, Arjen van der Meijden wrote:

> Hi list.

Hi, Arjen. I'm only going to answer a couple of your questions,
because others are better placed to talk about the others.

> To integrate it even better, a direct connection from php to the
> search-databases could be nice (not necessarily better, php is slow and
> clumsy with memory compared to c++). What part of xapian will be
> available for the php-bindings. Does that include the query-parser (ie
> we are lazy (like all programmers are said to be) and would like to
> reuse that functionality)?

The query parser will be wrapped, yes. The memory and speed
implications of using a wrapped version of the library shouldn't be
too bad, since most of the work will still be done in the C++ library,
although obviously you'll be paying a price for the interpreted code
you'll have to put on top of that. (There shouldn't be too much,
though.)

All parts of the Xapian public API will be wrapped for other
languages, which will include not only the database/document/enquire
areas, but the stemmers, libomqueryparser, and anything else that
builds as a separate library but is shipped with the main part of
Xapian. (Although some bits will probably appear first.)

> And last, but not least. Our forum-software-vendor wouldn't like it if
> they had to opensource their software because of the use of
> omega/xapian. In the second paragraph I stated three ways of integrating
[...]

Olly Betts | 10 Jan 16:44 2003

Re: Again some questions about xapian (attributes, remote, legal)

On Fri, Jan 10, 2003 at 03:03:57PM +0000, James Aylett wrote:
> My understanding is that without an exemption clause (a la the linux
> kernel wrt loadable modules), the GPL will probably apply here (as in:
> it's never been tested legally, but I think the belief is that it
> would apply).

By coincidence I was discussing this with someone only a few days ago -
in general terms, but it applies to Xapian (the only difference being
that the FSF doesn't own the Xapian copyright).

I'm not a lawyer, and this isn't a legal opinion, just an interesting
view of how the GPL works.

The GPL is a copyright licence.  This means there's a limit to what it
can restrict.  Fundamentally it relies on you not having many rights to
distribute a copyrighted work - unless you're specially granted them by
the copyright holder that is.  And the copyright holder can attach
conditions to granting those extra rights.

If I take GPL code and modify it, it's pretty clear that's a derived
work.  The question we ended up discussing the other day was whether
copyright law is actually strong enough to apply to dynamic linking, and
other forms of code coupling.  The FSF likes to think so, but that
doesn't necessarily mean that the law would support this position.

But the GPL has never been tested in court (or so everybody says).  If
you read various articles by Eben Moglen, you'll see that what tends to
happen is when a GPL violation is noticed, and brought to the attention
of the violator, they comply rather than choose the expense of being
taken to court along with a lot of bad publicity.
[...]

Arjen van der Meijden | 10 Jan 17:17 2003

RE: Again some questions about xapian (attributes, remote, legal)

> This is what people call the "ASP loophole" - if you run code 
> on your own machine (even if it's a webserver), you're not 
> distributing it to anyone and so the GPL can't apply.
I was aware of this "loophole" (since we use linux, apache, mysql and
php (and all their dependencies) you're more or less forced to be aware
of that :) )

But I left something out: since we use the forum software and co-develop
it, we cannot simply say "hey, we're using that forum software and
_we_ are using omega/xapian as our search engine". The co-development is
a rather tricky arrangement (it is based on a friendly relationship
between two companies, legally backed by some contracts) and we more or
less decide together what to do with the software.
Since our friends/colleagues want to be able to distribute the software
as a large-scale community software package (ie forum and the
lot), they want to be able to integrate omega/xapian into their software
(as a separate heavy-site search module, or something like that) as much
as we do.
That is why the testing has been done on our site, since it is a rather
nice example of a large community, but the final integration is done
together.

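That's where it might run into the GPL: they distribute the
software. Sometimes that could be with an attached package combining
omega/xapian + "some php to tie it together".

Anyway, that is the reason why we need to be a bit careful with the
GPL, although it won't keep us from using omega/xapian.
Of course, if it turns out that we need to use the php-bindings
for best efficiency, we'll have another look at the legal issues and
might release the php-binding-class as LGPL or something like that (I
recall having read that it is ok to release your source as LGPL rather
than GPL, if you release some gpl-dependent sources. Ie. that would set
the core-software free from the GPL-restrictions).

[...]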

James Aylett | 10 Jan 18:51 2003

Re: Again some questions about xapian (attributes, remote, legal)

On Fri, Jan 10, 2003 at 05:17:36PM +0100, Arjen van der Meijden wrote:

> Of course, if it turns out that we need to use the php-bindings
> for best efficiency, we'll have another look at the legal issues and
> might release the php-binding-class as LGPL or something like that (I
> recall having read that it is ok to release your source as LGPL rather
> than GPL, if you release some gpl-dependent sources. Ie. that would set
> the core-software free from the GPL-restrictions).

That's not my understanding; as far as I know, anything linked against
GPL sources and then distributed must be done so under the GPL. The
LGPL relaxations only apply if the stuff you're linking against is
under the LGPL.

> Releasing a c++-socket-wrapper as GPL is, of course, no issue at all (as
> long as we don't include business logic, but I can't imagine what logic
> that would be).

It wouldn't be difficult to knock up a little socket server that you
can toss search requests at - equivalent to stretching (some part of)
the Xapian API over a socket layer. I remember Richard talking about
it at one point, but that was a long time ago (and he didn't do it in
the end).

A well-written one would be an asset to the Xapian project,
IMHO. Unfortunately, everyone is likely to have their own choice of
implementation details :). (I'd probably go ahead and do one myself if
I had any time, but that's neither here nor there at the moment ...)

J

Douglas Richardson | 20 Jan 19:50 2003

Windows port

Hello Xapian developers,

First of all I'd like to compliment you on your work. I used your IR library
for a small client/server program, and I'm happy with the results. The API
was both flexible and easy to use (a combination not easy to achieve) and,
based on information I've found, the performance of the system is very good.

I saw on the feature page (http://xapian.org/features.php) that a Windows
port should be possible. My question is this: Do you (the developers) have
any plans of doing this? If so, what are they?

Many thanks.

Douglas Richardson, Code Monger
Gauss Interprise, Incorporated
www.gaussvip.com

James Aylett | 21 Jan 11:43 2003

Re: Windows port

On Mon, Jan 20, 2003 at 10:50:25AM -0800, Douglas Richardson wrote:

> I saw on the feature page (http://xapian.org/features.php) that a Windows
> port should be possible. My question is this: Do you (the developers) have
> any plans of doing this? If so, what are they?

Hi, Douglas. This is down against me on the todo list, and I've played
around with it a bit. In theory it should be fairly straightforward to
build under mingw[1] (although the remote backend won't). The problems
are mostly non-ANSI functions we use (like snprintf) that need
replacements written and autoconf'ed in; also, the testsuite as it
stands is Unix-specific, and that would need fixing.

Unfortunately, I don't have a huge amount of time at the moment, and
this isn't a terribly high priority (things like autobuilding and
language bindings are probably more useful). Feel free to try to
change my mind though ("look what a cool thing we could do if ..." is
probably a good way to approach it :-).

Getting Xapian building using MSVC would be a significantly harder
task, because you'd need to write your own build system (or import it
all into a project, and fiddle with things). I can't see this being
done by the Xapian team in the near future.

[1] A gcc-based system that targets Windows, which I think is usually
used as a cross-compiler. Certainly I'm using it on a Debian system.

J

Arjen van der Meijden | 27 Jan 18:45 2003

RE: Again some questions about xapian (attributes, remote, legal)

Hi List,

Although some of these questions were answered very thoroughly, I'm
afraid the ones most important to me (in the short term) weren't.
So I've cut the answered questions (thank you very much for those) and
summarized each with a single line.

I hope there is someone able to help me out a bit further on these
issues.

> From: xapian-discuss-admin <at> lists.sourceforge.net
> [mailto:xapian-discuss-admin <at> lists.sourceforge.net] On Behalf Of
> Arjen van der Meijden
> Sent: Friday 10 January 2003 15:37
> To: xapian-discuss <at> lists.sourceforge.net
> Subject: [Xapian-discuss] Again some questions about xapian
> (attributes, remote, legal)
> 
> 
> We have decided 
> to keep using xapian for our forum and are trying to 
> integrate it somewhat better, by wrapping it into the 
> php-software. That is to enable us to do some simple
> rights checks (i.e. allowing someone to search only in the
> subforums he has access to, by supplying a list of boolean
> terms that represent those subforums).
> Three models come to mind for using omega within the
> webserver cluster. The first is simply to redirect anyone who
> wants to search to a "dedicated search server", where the
> php application can either call omega as a command-line
> [...]

James Aylett | 27 Jan 19:10 2003

Re: Again some questions about xapian (attributes, remote, legal)

On Mon, Jan 27, 2003 at 06:45:03PM +0100, Arjen van der Meijden wrote:

> > About the values: is it possible to have omega sort using the
> > reverse of the value list? Then we don't have to supply both
> > the datefield and the (maxdate - datefield) as values to
> > allow omega to sort in either of those two directions (thus
> > saving some storage space, though not that much).

OmEnquire::set_sort_forward(bool sort_forward);

Set to true for ascending order (default), false for descending.
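
To show why a direction flag makes the second (maxdate - datefield) value
unnecessary, here is a small stand-alone sketch. It does not use Xapian at
all: Doc and sort_by_value are made-up names, and the real sort happens
inside the library. The idea is simply that one stored value can be ordered
either way by flipping the comparison.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Illustrative stand-in for a document with one stored sort value.
struct Doc {
    int id;
    std::string date_value;
};

// Order documents by their single stored value. The flag only flips the
// comparison, so no reversed copy of the value has to be indexed.
void sort_by_value(std::vector<Doc>& docs, bool sort_forward) {
    std::stable_sort(docs.begin(), docs.end(),
        [sort_forward](const Doc& a, const Doc& b) {
            return sort_forward ? a.date_value < b.date_value
                                : b.date_value < a.date_value;
        });
}
```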

> > How does xapian
> > store the values? If we supply date-fields as time_t's (ie
> > unix timestamps in long format) will it store those as text or
> > as longs? It could be useful for the xapian-sorter to know
> > that they are only longs, thus allowing much faster (?) sorting.

Not as I understand it, no. Everything's a string underneath (more
accurately, a lump of bytes, I think). Given this, I don't think it'd
be worthwhile trying to optimise by conversion to a number first
(since that would have to be done as part of the sort operation
anyway).
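
If values really are compared as plain byte strings, the way a timestamp is
written into the value matters: "9" sorts after "10" as text. One common
workaround, sketched below under that assumption, is to store a fixed-width,
zero-padded number so that string order and numeric order agree. The
function name is hypothetical, and the sketch assumes non-negative
timestamps.

```cpp
#include <cstdio>
#include <ctime>
#include <string>

// Encode a timestamp as a 20-digit, zero-padded decimal string.
// 20 digits is enough for any unsigned 64-bit value, so every encoded
// value has the same length and byte-wise comparison matches numeric order.
// Assumes t is non-negative.
std::string encode_timestamp(std::time_t t) {
    char buf[21];  // 20 digits + terminating NUL
    std::snprintf(buf, sizeof buf, "%020llu",
                  static_cast<unsigned long long>(t));
    return buf;
}
```

With this encoding no numeric conversion is needed at sort time, at the cost
of a few extra bytes per value.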

> > We're also thinking of using the document-title as a value, so
> > omega can sort on it alphabetically. But since values are also
> > used to eliminate equivalent documents, what will happen if
> > two documents have the same title, but not the same content,
> > in that case?

[...]

Arjen van der Meijden | 27 Jan 19:46 2003

RE: Again some questions about xapian (attributes, remote, legal)


> From: James Aylett [mailto:james-xapian <at> tartarus.org]
> Sent: Monday 27 January 2003 19:10
>
> On Mon, Jan 27, 2003 at 06:45:03PM +0100, Arjen van der Meijden wrote:
> 
> OmEnquire::set_sort_forward(bool sort_forward);
> 
> Set to true for ascending order (default), false for descending.
Which would require some changes to the omega source I suppose?

Something like changing this:
    val = cgi_params.find("SORT");
    if (val != cgi_params.end() && !val->second.empty()) {
        if (val->second[0] == '#') {
            sort_numeric = true;
            sort_key = atoi(val->second.c_str() + 1);
        } else {
            sort_key = atoi(val->second.c_str());
        }
    }

To this:
    val = cgi_params.find("SORT");
    if (val != cgi_params.end() && !val->second.empty()) {
        // Allow values like 1,#- (sort value 1, numeric and backwards)
        // Or 1,- (sort value 1 backwards)
        if (val->second[0] == '-' || val->second[1] == '-') {
            OmEnquire::set_sort_forward(false);
        } else {
            OmEnquire::set_sort_forward(true);
[...]
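
One way to keep that parsing separate from the enquire call is sketched
below. This is not omega's actual code. It assumes a hypothetical SORT
syntax where optional '#' (numeric) and '-' (descending) flags come before
the value number, e.g. "#-1", "-1" or "1"; SortSpec and parse_sort are
made-up names.

```cpp
#include <cstdlib>
#include <string>

// Parsed form of a SORT parameter (hypothetical structure).
struct SortSpec {
    int key = 0;           // value number to sort on
    bool numeric = false;  // '#' flag seen
    bool forward = true;   // false once a '-' flag is seen
};

// Parse leading flag characters in any order, then the value number.
// Checking the index against size() avoids reading past a short string.
SortSpec parse_sort(const std::string& s) {
    SortSpec spec;
    std::string::size_type i = 0;
    while (i < s.size() && (s[i] == '#' || s[i] == '-')) {
        if (s[i] == '#') {
            spec.numeric = true;
        } else {
            spec.forward = false;
        }
        ++i;
    }
    spec.key = std::atoi(s.c_str() + i);
    return spec;
}
```

The resulting forward flag would then be passed to set_sort_forward,
presumably on the OmEnquire object that runs the query rather than through
a static call.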

