1 Jun 2007 02:48

### Is it possible to query by 2 index ranges at the same time

```   Hi, i have an interesting usecase and i would like to ask whether i
can solve it by using Elephant. I would like to index objects placed
on a discreet 2d grid of an arbitarry size, which would require
queries like (10 < x < 20) AND (-5 < y < 15).

Is it possible to perform such a query with elephant? I am not sure i
have looked in the right place, but i could not find such example in
the manual.

Ignas Mikalajūnas
```
```   Hi, i have an interesting usecase and i would like to ask whether i
can solve it by using Elephant. I would like to index objects placed
on a discreet 2d grid of an arbitarry size, which would require
queries like (10 < x < 20) AND (-5 < y < 15).

Is it possible to perform such a query with elephant? I am not sure i
have looked in the right place, but i could not find such example in
the manual.

Ignas Mikalajūnas
```
1 Jun 2007 04:17

### Re: Is it possible to query by 2 index ranges at the same time

I'm assuming x and y are properites of a data object, which has some other component z which
you with to retrieve, and you query is that you want to find all the records (x,y,z) such that
(10 < x < 20) and ( -5 < y < 15).

There is a spectrum of solutions to this problem.  However, in the general case Elephant will
not compute this with the best possible asymptotic complexity, although it may be better than
a relational database at doing so.

Here is the most naive solution:

Examine every object in the database, and report those that meet that above condition.

Although this may seem silly, don't knock it till you try it....it may be perfectly reasonable.

A second solution is to make sure that you have specified :index on the x component
(or, without loss of generality, the y component), and then using the "get-instance-by-range"
feature to get all of the instances within the x range, and use a simple lisp function to discard
those for which the y component does not match.

This will be relatively fast if the x range excludes a lot of objects and the y range doesn't.

It's not obvious to me that one can do better than that without some significant coding
(For example, a "grid file" is a datastructure designed to answer such queries efficiently.
An R-Tree is another geometric structure.)

A relational database will not do better, in the worst case.  A relational database will do
a better job at statistical sampling---that is, a query optimizer should, in a huge database,
be able to decide whether it should use the "x" index or the "y" index first based on the
selectivity of those clauses in the boolean expression.  However, fundamentally it can't
do any better than to pick the best index, use that only examine some of the objects, and then
look at the values of the other ones.

On Fri, 2007-06-01 at 03:48 +0300, Ignas Mikalajunas wrote:
Hi, i have an interesting usecase and i would like to ask whether i can solve it by using Elephant. I would like to index objects placed on a discreet 2d grid of an arbitarry size, which would require queries like (10 < x < 20) AND (-5 < y < 15). Is it possible to perform such a query with elephant? I am not sure i have looked in the right place, but i could not find such example in the manual. Ignas Mikalajūnas _______________________________________________ elephant-devel site list elephant-devel-F1HGIaG5STRyXAeb93iumQ@public.gmane.org http://common-lisp.net/mailman/listinfo/elephant-devel
```<div>
I'm assuming x and y are properites of a data object, which has some other component z which <br>
you with to retrieve, and you query is that you want to find all the records (x,y,z) such that <br>
(10 &lt; x &lt; 20) and ( -5 &lt; y &lt; 15).<br><br>
There is a spectrum of solutions to this problem.&nbsp; However, in the general case Elephant will <br>
not compute this with the best possible asymptotic complexity, although it may be better than <br>
a relational database at doing so.<br><br>
Here is the most naive solution:<br><br>
Examine every object in the database, and report those that meet that above condition.<br><br>
Although this may seem silly, don't knock it till you try it....it may be perfectly reasonable.<br><br>
A second solution is to make sure that you have specified :index on the x component <br>
(or, without loss of generality, the y component), and then using the "get-instance-by-range"<br>
feature to get all of the instances within the x range, and use a simple lisp function to discard<br>
those for which the y component does not match.<br><br>
This will be relatively fast if the x range excludes a lot of objects and the y range doesn't.<br><br>
It's not obvious to me that one can do better than that without some significant coding <br>
(For example, a "grid file" is a datastructure designed to answer such queries efficiently.<br>
An R-Tree is another geometric structure.)<br><br>
A relational database will not do better, in the worst case.&nbsp; A relational database will do <br>
a better job at statistical sampling---that is, a query optimizer should, in a huge database,<br>
be able to decide whether it should use the "x" index or the "y" index first based on the <br>
selectivity of those clauses in the boolean expression.&nbsp; However, fundamentally it can't <br>
do any better than to pick the best index, use that only examine some of the objects, and then<br>
look at the values of the other ones.<br><br>
On Fri, 2007-06-01 at 03:48 +0300, Ignas Mikalajunas wrote:
<blockquote type="CITE">

Hi, i have an interesting usecase and i would like to ask whether i
can solve it by using Elephant. I would like to index objects placed
on a discreet 2d grid of an arbitarry size, which would require
queries like (10 &lt; x &lt; 20) AND (-5 &lt; y &lt; 15).

Is it possible to perform such a query with elephant? I am not sure i
have looked in the right place, but i could not find such example in
the manual.

Ignas Mikalaj&#363;nas
_______________________________________________
elephant-devel site list
<a href="mailto:elephant-devel@...">elephant-devel@...</a>
<a href="http://common-lisp.net/mailman/listinfo/elephant-devel">http://common-lisp.net/mailman/listinfo/elephant-devel</a>

</blockquote>
</div>
```
1 Jun 2007 05:07

### Re: Is it possible to query by 2 index ranges at the same time

```Ignas,

The easiest way to do this is to follow Robert's suggestion, and
declare slots x and y to be indexed (assuming your parameters are
slot values of a persistent objects) and then say:

(remove-duplicates
(nconc (get-instances-by-range 'my-class 'x 10 20)
(get-instances-by-range 'my-class 'y -5 15)))

Of course you might do a little better, performance-wise, if you
filter the y value as you traverse the x range (or vice-versa).
Since get-instances-by-range is just a wrapper around map-inverted-
index that collects visited objects, this could be up to twice as
fast, although in practice I'd expect the benefit to be marginal:

(let ((matches nil))
(map-inverted-index (lambda (obj)
(when (and (> y -5) (< y 15))
(push obj matches)))
'my-class
'x
10
20)
matches)

This version avoids a second map-inverted-index operation (called by
get-instances-by-range) by only collecting objects if their y
coordinate is in the range.  You can wrap all this in a function like
(get-objects-in-region x1 y1 x2 y2).  Map over the x or y based on
which query is smaller, or which parameter is sparser.  If you don't
know, then it doesn't really matter which you choose!

If you get some performance comparisons of these two scenarios,
please post them to the list!

Regards,
Ian

PS - It would be interesting to see an R-tree (or one of the well-
known variants such as R* or Priority R-Trees) in Elephant.  Instead
of using indexed classes, you would directly implement the R-tree
nodes as persistent instances and write the construction and
retrieval algorithms as if they were operating on in-memory objects.
Graphs are a very natural structure to implement in Elephant and the
only special provision I can think of might be maintaining a
persistent set of free nodes for the dynamic case.

On May 31, 2007, at 10:17 PM, Robert L. Read wrote:

> I'm assuming x and y are properites of a data object, which has
> some other component z which
> you with to retrieve, and you query is that you want to find all
> the records (x,y,z) such that
> (10 < x < 20) and ( -5 < y < 15).
>
> There is a spectrum of solutions to this problem.  However, in the
> general case Elephant will
> not compute this with the best possible asymptotic complexity,
> although it may be better than
> a relational database at doing so.
>
> Here is the most naive solution:
>
> Examine every object in the database, and report those that meet
> that above condition.
>
> Although this may seem silly, don't knock it till you try it....it
> may be perfectly reasonable.
>
> A second solution is to make sure that you have specified :index on
> the x component
> (or, without loss of generality, the y component), and then using
> the "get-instance-by-range"
> feature to get all of the instances within the x range, and use a
> simple lisp function to discard
> those for which the y component does not match.
>
> This will be relatively fast if the x range excludes a lot of
> objects and the y range doesn't.
>
> It's not obvious to me that one can do better than that without
> some significant coding
> (For example, a "grid file" is a datastructure designed to answer
> such queries efficiently.
> An R-Tree is another geometric structure.)
>
> A relational database will not do better, in the worst case.  A
> relational database will do
> a better job at statistical sampling---that is, a query optimizer
> should, in a huge database,
> be able to decide whether it should use the "x" index or the "y"
> index first based on the
> selectivity of those clauses in the boolean expression.  However,
> fundamentally it can't
> do any better than to pick the best index, use that only examine
> some of the objects, and then
> look at the values of the other ones.
>
> On Fri, 2007-06-01 at 03:48 +0300, Ignas Mikalajunas wrote:
>> Hi, i have an interesting usecase and i would like to ask whether
>> i can solve it by using Elephant. I would like to index objects
>> placed on a discreet 2d grid of an arbitarry size, which would
>> require queries like (10 < x < 20) AND (-5 < y < 15). Is it
>> possible to perform such a query with elephant? I am not sure i
>> have looked in the right place, but i could not find such example
>> in the manual. Ignas Mikalajūnas
>> _______________________________________________ elephant-devel
>> site list elephant-devel@... http://common-lisp.net/
>> mailman/listinfo/elephant-devel
> _______________________________________________
> elephant-devel site list
> elephant-devel@...
> http://common-lisp.net/mailman/listinfo/elephant-devel

```
1 Jun 2007 05:11

### Re: ECOOP 2007

```Scribit Pierre THIERRY dies 25/05/2007 hora 20:39:
> I will co-author the paper

I just submitted the paper. Here is the final abstract:

The data model of an application, the nature and format of data stored
across executions, is typically a very rigid part of its early
specification, even when prototyping, and changing it after code that
relies on it was written can prove quite expensive and error-prone.

Code and data in a running Lisp image can be dynamically modified. A
MOP-based persistence library can bring this dynamicity to the data
model. This enables to extend the easy prototyping way of development
to the storage of data and helps avoiding interruptions of service.
transparently.

Quickly,
Pierre
--
nowhere.man@...
OpenPGP 0xD9D50D8A
```
```Scribit Pierre THIERRY dies 25/05/2007 hora 20:39:
> I will co-author the paper

I just submitted the paper. Here is the final abstract:

The data model of an application, the nature and format of data stored
across executions, is typically a very rigid part of its early
specification, even when prototyping, and changing it after code that
relies on it was written can prove quite expensive and error-prone.

Code and data in a running Lisp image can be dynamically modified. A
MOP-based persistence library can bring this dynamicity to the data
model. This enables to extend the easy prototyping way of development
to the storage of data and helps avoiding interruptions of service.
transparently.

Quickly,
Pierre
--

--
nowhere.man@...
OpenPGP 0xD9D50D8A
```
1 Jun 2007 05:32

### Re: ECOOP 2007

This sounds very nice --- good luck!

On Fri, 2007-06-01 at 05:11 +0200, Pierre THIERRY wrote:
Scribit Pierre THIERRY dies 25/05/2007 hora 20:39: > I will co-author the paper I just submitted the paper. Here is the final abstract: The data model of an application, the nature and format of data stored across executions, is typically a very rigid part of its early specification, even when prototyping, and changing it after code that relies on it was written can prove quite expensive and error-prone. Code and data in a running Lisp image can be dynamically modified. A MOP-based persistence library can bring this dynamicity to the data model. This enables to extend the easy prototyping way of development to the storage of data and helps avoiding interruptions of service. This article presents the conditions to do this portably and transparently. Quickly, Pierre _______________________________________________ elephant-devel site list elephant-devel-F1HGIaG5STRyXAeb93iumQ@public.gmane.org http://common-lisp.net/mailman/listinfo/elephant-devel
```<div>
This sounds very nice --- good luck!<br><br><br>
On Fri, 2007-06-01 at 05:11 +0200, Pierre THIERRY wrote:
<blockquote type="CITE">

Scribit Pierre THIERRY dies 25/05/2007 hora 20:39:
&gt; I will co-author the paper

I just submitted the paper. Here is the final abstract:

The data model of an application, the nature and format of data stored
across executions, is typically a very rigid part of its early
specification, even when prototyping, and changing it after code that
relies on it was written can prove quite expensive and error-prone.

Code and data in a running Lisp image can be dynamically modified. A
MOP-based persistence library can bring this dynamicity to the data
model. This enables to extend the easy prototyping way of development
to the storage of data and helps avoiding interruptions of service.
transparently.

Quickly,
Pierre
_______________________________________________
elephant-devel site list
<a href="mailto:elephant-devel@...">elephant-devel@...</a>
<a href="http://common-lisp.net/mailman/listinfo/elephant-devel">http://common-lisp.net/mailman/listinfo/elephant-devel</a>

</blockquote>
</div>
```
1 Jun 2007 05:43

### Re: Is it possible to query by 2 index ranges at the same time

```Please excuse if I don't make a direct reference to Elephant solving this in my comment below. However, I
remembered reading something just like this in AllegroCache's Reference Manual, in which it said, and I quote:

"In a database every object has a unique object identifier (oid). This value can be retrieved using
db-­object-­oid. An oid is an integer. There is no way to determine the class of an object given its oid.

Usually a program need not concern itself with oids. However in certain circumstances it may be convenient
to work with oids. One such case is when combining the results of multiple indexes over the same class. You
may want to ask for the set of objects whose X slot is greater than 10 and whose Y slot is greater than 20. It
consumes fewer resources to ask for the oids of objects whose X slot is greater than 10 than to ask for the
objects themselves. In the later case the objects retrieved have to be instantiated in the object cache
and there's no point in doing that if you don't need all those objects. In this case you don't need all those
objects since you only need those objects  whose Y slot is also greater then 20. Thus the optimal way to do
this query is to find the intersection of the oids corresponding to "X > 10" and those oids with "Y > 20" and
then from that intersection find the objects corresponding to the oids."

They later document the following function:

"retrieve-­from-­index-­range (class slot­name
initial­value end­value
&key (db *allegrocache*) oid)

returns all objects of the given class (a persistent­class object or a symbol) whose slot slot­name has a
value in the range beginning with initial­value up to but not including end­value. If oid is true then
the object id values are returned instead of the objects."

It's interesting to note the "oid" key argument. I'm not sure if Elephant's get-instances-by-range
supports something like that. But, a similar solution to the problem presented is:

"We could find the oids of all the employees whose first name begins with “Jo” and whose last name begins
with “F” using

(intersection (retrieve-from-index-range
‘Employee ‘first-name “Jo” “Jp” :oid t)
(retrieve-from-index-range
‘Employee ‘last-name “F” “G”   :oid t))"

given a persistent class Employee with indices on slots first-name and last-name.

If Elephant supports something similar to the "oid" key argument, it would be very useful for these kind of
queries, where you can get an initial result set and only create the instances of those objects if/when needed.

Just my \$0.02 :)

- Daniel

On Thu, May 31, 2007 11:07 pm, Ian Eslick <eslick@...> said:

> Ignas,
>
> The easiest way to do this is to follow Robert's suggestion, and
> declare slots x and y to be indexed (assuming your parameters are
> slot values of a persistent objects) and then say:
>
> (remove-duplicates
>    (nconc (get-instances-by-range 'my-class 'x 10 20)
>           (get-instances-by-range 'my-class 'y -5 15)))
>
> Of course you might do a little better, performance-wise, if you
> filter the y value as you traverse the x range (or vice-versa).
> Since get-instances-by-range is just a wrapper around map-inverted-
> index that collects visited objects, this could be up to twice as
> fast, although in practice I'd expect the benefit to be marginal:
>
> (let ((matches nil))
>    (map-inverted-index (lambda (obj)
>                          (when (and (> y -5) (< y 15))
>                            (push obj matches)))
> 	              'my-class
>          	      'x
> 	              10
>                        20)
>    matches)
>
> This version avoids a second map-inverted-index operation (called by
> get-instances-by-range) by only collecting objects if their y
> coordinate is in the range.  You can wrap all this in a function like
> (get-objects-in-region x1 y1 x2 y2).  Map over the x or y based on
> which query is smaller, or which parameter is sparser.  If you don't
> know, then it doesn't really matter which you choose!
>
> If you get some performance comparisons of these two scenarios,
> please post them to the list!
>
> Regards,
> Ian
>
> PS - It would be interesting to see an R-tree (or one of the well-
> known variants such as R* or Priority R-Trees) in Elephant.  Instead
> of using indexed classes, you would directly implement the R-tree
> nodes as persistent instances and write the construction and
> retrieval algorithms as if they were operating on in-memory objects.
> Graphs are a very natural structure to implement in Elephant and the
> only special provision I can think of might be maintaining a
> persistent set of free nodes for the dynamic case.
>
> On May 31, 2007, at 10:17 PM, Robert L. Read wrote:
>
>> I'm assuming x and y are properites of a data object, which has
>> some other component z which
>> you with to retrieve, and you query is that you want to find all
>> the records (x,y,z) such that
>> (10 < x < 20) and ( -5 < y < 15).
>>
>> There is a spectrum of solutions to this problem.  However, in the
>> general case Elephant will
>> not compute this with the best possible asymptotic complexity,
>> although it may be better than
>> a relational database at doing so.
>>
>> Here is the most naive solution:
>>
>> Examine every object in the database, and report those that meet
>> that above condition.
>>
>> Although this may seem silly, don't knock it till you try it....it
>> may be perfectly reasonable.
>>
>> A second solution is to make sure that you have specified :index on
>> the x component
>> (or, without loss of generality, the y component), and then using
>> the "get-instance-by-range"
>> feature to get all of the instances within the x range, and use a
>> simple lisp function to discard
>> those for which the y component does not match.
>>
>> This will be relatively fast if the x range excludes a lot of
>> objects and the y range doesn't.
>>
>> It's not obvious to me that one can do better than that without
>> some significant coding
>> (For example, a "grid file" is a datastructure designed to answer
>> such queries efficiently.
>> An R-Tree is another geometric structure.)
>>
>> A relational database will not do better, in the worst case.  A
>> relational database will do
>> a better job at statistical sampling---that is, a query optimizer
>> should, in a huge database,
>> be able to decide whether it should use the "x" index or the "y"
>> index first based on the
>> selectivity of those clauses in the boolean expression.  However,
>> fundamentally it can't
>> do any better than to pick the best index, use that only examine
>> some of the objects, and then
>> look at the values of the other ones.
>>
>> On Fri, 2007-06-01 at 03:48 +0300, Ignas Mikalajunas wrote:
>>> Hi, i have an interesting usecase and i would like to ask whether
>>> i can solve it by using Elephant. I would like to index objects
>>> placed on a discreet 2d grid of an arbitarry size, which would
>>> require queries like (10 < x < 20) AND (-5 < y < 15). Is it
>>> possible to perform such a query with elephant? I am not sure i
>>> have looked in the right place, but i could not find such example
>>> in the manual. Ignas Mikalajūnas
>>> _______________________________________________ elephant-devel
>>> site list elephant-devel@... http://common-lisp.net/
>>> mailman/listinfo/elephant-devel
>> _______________________________________________
>> elephant-devel site list
>> elephant-devel@...
>> http://common-lisp.net/mailman/listinfo/elephant-devel
>
> _______________________________________________
> elephant-devel site list
> elephant-devel@...
> http://common-lisp.net/mailman/listinfo/elephant-devel
>

```
1 Jun 2007 08:08

### Re: Is it possible to query by 2 index ranges at the same time

```That's very insightful and quite correct!  I was going to use this
mechanism as part of the query system I want to get working for 1.0.
This is pretty much the mechanism that relational databases use to do
join queries and Elephant will be no different.

As a matter of discipline, I didn't want to encourage users to mess
around with oids because of the issues outlined in an e-mail thread a
few days ago.  There will be an internal elephant api to access this
functionality for advanced users.  That function will essentially be
a copy of map-inverted-index, but using cursor-get to get only the
oid instead of cursor-pget to get the oid and primary value.  I'm
happy to be argued out of this position, but my experience is that
people will start to depend on this interface and then complain when

The query system will perform these kinds of optimizations for you.
The interface will be something like:

(map-query
(lambda (obj)
(render obj screen))
(select-objects (geometry)
where
(between geometry.x 10 20)
(between geometry.y -5 15)))

For a class named geometry with indexed slots x and y.  If x and y
are indexed, then the query will be very efficient.  If not, then it
will be a linear walk.  I aim to add informative statements and
statistics to select-objects to help people optimize queries.  This
is similar to the select clause in SQL, but will have features
oriented towards path queries (object references) and graph queries.

The select-objects statement effectively returns a generator, which
will probably be a list of oids, either in memory or maintained on
disk for large query sets.

I'm open to suggestions on syntax.  I'm not happy with my current
sketches here.

Ian

PS - If you really want to get OIDs as the user, you can using
cursors over the inverted index to get the ranges yourself, then you
can do exactly the intersection operation you mentioned before.  Of
course you need the classes of the oids so selected to recover the
objects.

On May 31, 2007, at 11:43 PM, lists@... wrote:

> Please excuse if I don't make a direct reference to Elephant
> solving this in my comment below. However, I remembered reading
> something just like this in AllegroCache's Reference Manual, in
> which it said, and I quote:
>
> "In a database every object has a unique object identifier (oid).
> This value can be retrieved using db- object- oid. An oid is an
> integer. There is no way to determine the class of an object given
> its oid.
>
> Usually a program need not concern itself with oids. However in
> certain circumstances it may be convenient to work with oids. One
> such case is when combining the results of multiple indexes over
> the same class. You may want to ask for the set of objects whose X
> slot is greater than 10 and whose Y slot is greater than 20. It
> consumes fewer resources to ask for the oids of objects whose X
> slot is greater than 10 than to ask for the objects themselves. In
> the later case the objects retrieved have to be instantiated in the
> object cache and there's no point in doing that if you don't need
> all those objects. In this case you don't need all those objects
> since you only need those objects  whose Y slot is also greater
> then 20. Thus the optimal way to do this query is to find the
> intersection of the oids corresponding to "X > 10" and those oids
> with "Y > 20" and then from that intersection find the objects
> corresponding to the oids."
>
> They later document the following function:
>
> "retrieve- from- index- range (class slot name
>                             initial value end value
>                            &key (db *allegrocache*) oid)
>
> returns all objects of the given class (a persistent class object
> or a symbol) whose slot slot name has a value in the range
> beginning with initial value up to but not including end value. If
> oid is true then the object id values are returned instead of the
> objects."
>
> It's interesting to note the "oid" key argument. I'm not sure if
> Elephant's get-instances-by-range supports something like that.
> But, a similar solution to the problem presented is:
>
> "We could find the oids of all the employees whose first name
> begins with “Jo” and whose last name begins with “F” using
>
> (intersection (retrieve-from-index-range
>                     ‘Employee ‘first-name “Jo” “Jp” :oid
> t)
>               (retrieve-from-index-range
>                     ‘Employee ‘last-name “F” “G”   :oid
> t))"
>
> given a persistent class Employee with indices on slots first-name
> and last-name.
>
> If Elephant supports something similar to the "oid" key argument,
> it would be very useful for these kind of queries, where you can
> get an initial result set and only create the instances of those
> objects if/when needed.
>
> Just my \$0.02 :)
>
> - Daniel
>
> On Thu, May 31, 2007 11:07 pm, Ian Eslick <eslick@...> said:
>
>> Ignas,
>>
>> The easiest way to do this is to follow Robert's suggestion, and
>> declare slots x and y to be indexed (assuming your parameters are
>> slot values of a persistent objects) and then say:
>>
>> (remove-duplicates
>>    (nconc (get-instances-by-range 'my-class 'x 10 20)
>>           (get-instances-by-range 'my-class 'y -5 15)))
>>
>> Of course you might do a little better, performance-wise, if you
>> filter the y value as you traverse the x range (or vice-versa).
>> Since get-instances-by-range is just a wrapper around map-inverted-
>> index that collects visited objects, this could be up to twice as
>> fast, although in practice I'd expect the benefit to be marginal:
>>
>> (let ((matches nil))
>>    (map-inverted-index (lambda (obj)
>>                          (when (and (> y -5) (< y 15))
>>                            (push obj matches)))
>> 	              'my-class
>>          	      'x
>> 	              10
>>                        20)
>>    matches)
>>
>> This version avoids a second map-inverted-index operation (called by
>> get-instances-by-range) by only collecting objects if their y
>> coordinate is in the range.  You can wrap all this in a function like
>> (get-objects-in-region x1 y1 x2 y2).  Map over the x or y based on
>> which query is smaller, or which parameter is sparser.  If you don't
>> know, then it doesn't really matter which you choose!
>>
>> If you get some performance comparisons of these two scenarios,
>> please post them to the list!
>>
>> Regards,
>> Ian
>>
>> PS - It would be interesting to see an R-tree (or one of the well-
>> known variants such as R* or Priority R-Trees) in Elephant.  Instead
>> of using indexed classes, you would directly implement the R-tree
>> nodes as persistent instances and write the construction and
>> retrieval algorithms as if they were operating on in-memory objects.
>> Graphs are a very natural structure to implement in Elephant and the
>> only special provision I can think of might be maintaining a
>> persistent set of free nodes for the dynamic case.
>>
>> On May 31, 2007, at 10:17 PM, Robert L. Read wrote:
>>
>>> I'm assuming x and y are properites of a data object, which has
>>> some other component z which
>>> you with to retrieve, and you query is that you want to find all
>>> the records (x,y,z) such that
>>> (10 < x < 20) and ( -5 < y < 15).
>>>
>>> There is a spectrum of solutions to this problem.  However, in the
>>> general case Elephant will
>>> not compute this with the best possible asymptotic complexity,
>>> although it may be better than
>>> a relational database at doing so.
>>>
>>> Here is the most naive solution:
>>>
>>> Examine every object in the database, and report those that meet
>>> that above condition.
>>>
>>> Although this may seem silly, don't knock it till you try it....it
>>> may be perfectly reasonable.
>>>
>>> A second solution is to make sure that you have specified :index on
>>> the x component
>>> (or, without loss of generality, the y component), and then using
>>> the "get-instance-by-range"
>>> feature to get all of the instances within the x range, and use a
>>> simple lisp function to discard
>>> those for which the y component does not match.
>>>
>>> This will be relatively fast if the x range excludes a lot of
>>> objects and the y range doesn't.
>>>
>>> It's not obvious to me that one can do better than that without
>>> some significant coding
>>> (For example, a "grid file" is a datastructure designed to answer
>>> such queries efficiently.
>>> An R-Tree is another geometric structure.)
>>>
>>> A relational database will not do better, in the worst case.  A
>>> relational database will do
>>> a better job at statistical sampling---that is, a query optimizer
>>> should, in a huge database,
>>> be able to decide whether it should use the "x" index or the "y"
>>> index first based on the
>>> selectivity of those clauses in the boolean expression.  However,
>>> fundamentally it can't
>>> do any better than to pick the best index, use that only examine
>>> some of the objects, and then
>>> look at the values of the other ones.
>>>
>>> On Fri, 2007-06-01 at 03:48 +0300, Ignas Mikalajunas wrote:
>>>> Hi, i have an interesting usecase and i would like to ask whether
>>>> i can solve it by using Elephant. I would like to index objects
>>>> placed on a discreet 2d grid of an arbitarry size, which would
>>>> require queries like (10 < x < 20) AND (-5 < y < 15). Is it
>>>> possible to perform such a query with elephant? I am not sure i
>>>> have looked in the right place, but i could not find such example
>>>> in the manual. Ignas Mikalajūnas
>>>> _______________________________________________ elephant-devel
>>>> site list elephant-devel@... http://common-lisp.net/
>>>> mailman/listinfo/elephant-devel
>>> _______________________________________________
>>> elephant-devel site list
>>> elephant-devel@...
>>> http://common-lisp.net/mailman/listinfo/elephant-devel
>>
>> _______________________________________________
>> elephant-devel site list
>> elephant-devel@...
>> http://common-lisp.net/mailman/listinfo/elephant-devel
>>
>
> _______________________________________________
> elephant-devel site list
> elephant-devel@...
> http://common-lisp.net/mailman/listinfo/elephant-devel

```
1 Jun 2007 15:35

### Re: Is it possible to query by 2 index ranges at the same time

```Hi Ian,

I read the thread from a few days ago regarding OIDs. While I agree with your position, I think it's
beneficial to be able to get a list of OIDs to be able to perform these kinds of operations. I think the
premise from the other thread was that of "relying" on OIDs for relational purposes.

I also like the idea of where the query system may be going. But, independently of that, being able to specify
something like :oid t in the get-instances-by-range function and then be able to "walk" through a set of
oids, specifying the base class and the "walk" mechanism will instantiate those objects. Think of it as a
primitive to the query system.

Thanks,
Daniel

On Fri, June 1, 2007 2:08 am, Ian Eslick <eslick@...> said:

> That's very insightful and quite correct!  I was going to use this
> mechanism as part of the query system I want to get working for 1.0.
> This is pretty much the mechanism that relational databases use to do
> join queries and Elephant will be no different.
>
> As a matter of discipline, I didn't want to encourage users to mess
> around with oids because of the issues outlined in an e-mail thread a
> few days ago.  There will be an internal elephant api to access this
> functionality for advanced users.  That function will essentially be
> a copy of map-inverted-index, but using cursor-get to get only the
> oid instead of cursor-pget to get the oid and primary value.  I'm
> happy to be argued out of this position, but my experience is that
> people will start to depend on this interface and then complain when
> features are added later that leads to bugs in user code.
>
> The query system will perform these kinds of optimizations for you.
> The interface will be something like:
>
> (map-query
>    (lambda (obj)
>       (render obj screen))
>    (select-objects (geometry)
>      where
>      (between geometry.x 10 20)
>      (between geometry.y -5 15)))
>
> For a class named geometry with indexed slots x and y.  If x and y
> are indexed, then the query will be very efficient.  If not, then it
> will be a linear walk.  I aim to add informative statements and
> statistics to select-objects to help people optimize queries.  This
> is similar to the select clause in SQL, but will have features
> oriented towards path queries (object references) and graph queries.
>
> The select-objects statement effectively returns a generator, which
> will probably be a list of oids, either in memory or maintained on
> disk for large query sets.
>
> I'm open to suggestions on syntax.  I'm not happy with my current
> sketches here.
>
> Ian
>
> PS - If you really want to get OIDs as the user, you can using
> cursors over the inverted index to get the ranges yourself, then you
> can do exactly the intersection operation you mentioned before.  Of
> course you need the classes of the oids so selected to recover the
> objects.
>
> On May 31, 2007, at 11:43 PM, lists@... wrote:
>
>> Please excuse if I don't make a direct reference to Elephant
>> solving this in my comment below. However, I remembered reading
>> something just like this in AllegroCache's Reference Manual, in
>> which it said, and I quote:
>>
>> "In a database every object has a unique object identifier (oid).
>> This value can be retrieved using db- object- oid. An oid is an
>> integer. There is no way to determine the class of an object given
>> its oid.
>>
>> Usually a program need not concern itself with oids. However in
>> certain circumstances it may be convenient to work with oids. One
>> such case is when combining the results of multiple indexes over
>> the same class. You may want to ask for the set of objects whose X
>> slot is greater than 10 and whose Y slot is greater than 20. It
>> consumes fewer resources to ask for the oids of objects whose X
>> slot is greater than 10 than to ask for the objects themselves. In
>> the later case the objects retrieved have to be instantiated in the
>> object cache and there's no point in doing that if you don't need
>> all those objects. In this case you don't need all those objects
>> since you only need those objects  whose Y slot is also greater
>> then 20. Thus the optimal way to do this query is to find the
>> intersection of the oids corresponding to "X > 10" and those oids
>> with "Y > 20" and then from that intersection find the objects
>> corresponding to the oids."
>>
>> They later document the following function:
>>
>> "retrieve- from- index- range (class slot name
>>                             initial value end value
>>                            &key (db *allegrocache*) oid)
>>
>> returns all objects of the given class (a persistent class object
>> or a symbol) whose slot slot name has a value in the range
>> beginning with initial value up to but not including end value. If
>> oid is true then the object id values are returned instead of the
>> objects."
>>
>> It's interesting to note the "oid" key argument. I'm not sure if
>> Elephant's get-instances-by-range supports something like that.
>> But, a similar solution to the problem presented is:
>>
>> "We could find the oids of all the employees whose first name
>> begins with “Jo” and whose last name begins with “F”
>> using
>>
>> (intersection (retrieve-from-index-range
>>                     ‘Employee ‘first-name “Jo”
>> “Jp” :oid
>> t)
>>               (retrieve-from-index-range
>>                     ‘Employee ‘last-name “F”
>> “G”   :oid
>> t))"
>>
>> given a persistent class Employee with indices on slots first-name
>> and last-name.
>>
>> If Elephant supports something similar to the "oid" key argument,
>> it would be very useful for these kind of queries, where you can
>> get an initial result set and only create the instances of those
>> objects if/when needed.
>>
>> Just my \$0.02 :)
>>
>> - Daniel
>>
>> On Thu, May 31, 2007 11:07 pm, Ian Eslick <eslick@...> said:
>>
>>> Ignas,
>>>
>>> The easiest way to do this is to follow Robert's suggestion, and
>>> declare slots x and y to be indexed (assuming your parameters are
>>> slot values of a persistent objects) and then say:
>>>
>>> (remove-duplicates
>>>    (nconc (get-instances-by-range 'my-class 'x 10 20)
>>>           (get-instances-by-range 'my-class 'y -5 15)))
>>>
>>> Of course you might do a little better, performance-wise, if you
>>> filter the y value as you traverse the x range (or vice-versa).
>>> Since get-instances-by-range is just a wrapper around map-inverted-
>>> index that collects visited objects, this could be up to twice as
>>> fast, although in practice I'd expect the benefit to be marginal:
>>>
>>> (let ((matches nil))
>>>    (map-inverted-index (lambda (obj)
>>>                          (when (and (> y -5) (< y 15))
>>>                            (push obj matches)))
>>> 	              'my-class
>>>          	      'x
>>> 	              10
>>>                        20)
>>>    matches)
>>>
>>> This version avoids a second map-inverted-index operation (called by
>>> get-instances-by-range) by only collecting objects if their y
>>> coordinate is in the range.  You can wrap all this in a function like
>>> (get-objects-in-region x1 y1 x2 y2).  Map over the x or y based on
>>> which query is smaller, or which parameter is sparser.  If you don't
>>> know, then it doesn't really matter which you choose!
>>>
>>> If you get some performance comparisons of these two scenarios,
>>> please post them to the list!
>>>
>>> Regards,
>>> Ian
>>>
>>> PS - It would be interesting to see an R-tree (or one of the well-
>>> known variants such as R* or Priority R-Trees) in Elephant.  Instead
>>> of using indexed classes, you would directly implement the R-tree
>>> nodes as persistent instances and write the construction and
>>> retrieval algorithms as if they were operating on in-memory objects.
>>> Graphs are a very natural structure to implement in Elephant and the
>>> only special provision I can think of might be maintaining a
>>> persistent set of free nodes for the dynamic case.
>>>
>>> On May 31, 2007, at 10:17 PM, Robert L. Read wrote:
>>>
>>>> I'm assuming x and y are properites of a data object, which has
>>>> some other component z which
>>>> you with to retrieve, and you query is that you want to find all
>>>> the records (x,y,z) such that
>>>> (10 < x < 20) and ( -5 < y < 15).
>>>>
>>>> There is a spectrum of solutions to this problem.  However, in the
>>>> general case Elephant will
>>>> not compute this with the best possible asymptotic complexity,
>>>> although it may be better than
>>>> a relational database at doing so.
>>>>
>>>> Here is the most naive solution:
>>>>
>>>> Examine every object in the database, and report those that meet
>>>> that above condition.
>>>>
>>>> Although this may seem silly, don't knock it till you try it....it
>>>> may be perfectly reasonable.
>>>>
>>>> A second solution is to make sure that you have specified :index on
>>>> the x component
>>>> (or, without loss of generality, the y component), and then using
>>>> the "get-instance-by-range"
>>>> feature to get all of the instances within the x range, and use a
>>>> simple lisp function to discard
>>>> those for which the y component does not match.
>>>>
>>>> This will be relatively fast if the x range excludes a lot of
>>>> objects and the y range doesn't.
>>>>
>>>> It's not obvious to me that one can do better than that without
>>>> some significant coding
>>>> (For example, a "grid file" is a datastructure designed to answer
>>>> such queries efficiently.
>>>> An R-Tree is another geometric structure.)
>>>>
>>>> A relational database will not do better, in the worst case.  A
>>>> relational database will do
>>>> a better job at statistical sampling---that is, a query optimizer
>>>> should, in a huge database,
>>>> be able to decide whether it should use the "x" index or the "y"
>>>> index first based on the
>>>> selectivity of those clauses in the boolean expression.  However,
>>>> fundamentally it can't
>>>> do any better than to pick the best index, use that only examine
>>>> some of the objects, and then
>>>> look at the values of the other ones.
>>>>
>>>> On Fri, 2007-06-01 at 03:48 +0300, Ignas Mikalajunas wrote:
>>>>> Hi, i have an interesting usecase and i would like to ask whether
>>>>> i can solve it by using Elephant. I would like to index objects
>>>>> placed on a discreet 2d grid of an arbitarry size, which would
>>>>> require queries like (10 < x < 20) AND (-5 < y < 15). Is it
>>>>> possible to perform such a query with elephant? I am not sure i
>>>>> have looked in the right place, but i could not find such example
>>>>> in the manual. Ignas Mikalajūnas
>>>>> _______________________________________________ elephant-devel
>>>>> site list elephant-devel@... http://common-lisp.net/
>>>>> mailman/listinfo/elephant-devel
>>>> _______________________________________________
>>>> elephant-devel site list
>>>> elephant-devel@...
>>>> http://common-lisp.net/mailman/listinfo/elephant-devel
>>>
>>> _______________________________________________
>>> elephant-devel site list
>>> elephant-devel@...
>>> http://common-lisp.net/mailman/listinfo/elephant-devel
>>>
>>
>> _______________________________________________
>> elephant-devel site list
>> elephant-devel@...
>> http://common-lisp.net/mailman/listinfo/elephant-devel
>
>

```
7 Jun 2007 18:39

### Release 0.9 Available

```To all Elephants,

Robert and I have officially signed off and tagged release 0.9
(http://common-lisp.net/project/elephant/news.html).  The website
including tarballs and documentation, have been updated to reflect
this milestone.

A hearty thank you to everyone who contributed to the release over
the past months!

New development at this point will be occuring via our darcs
repository at http://www.common-lisp.net/project/elephant/darcs/
elephant.  Checkins will be less frequent here.  However, it is easy
under darcs to send small patches to the developers directly or via
the website.

This is a good time to sign on as an active developer to work on your
favorite features.  Changes will be more localized in the upcoming
development releases than in the past.  This will make it easier for
multiple developers to work concurrently.  If you wish to engage in
regular or active development, please e-mail one of the developers
directly.

Enjoy!

```
8 Jun 2007 14:16

### Test of 0.9 on LispWorks 5.0 (64-bit) for Linux

```Hi, Elephant Developers

I saw the release 0.9 yesterday, and three "should work" of LispWorks (64-bit)
on the testing status page. Fortunately I have a 5.0.2 Enterprise Edition of
LispWorks (64-bit) for Linux, runing on a Debian box, with Berkeley DB 4.5
and PostgreSQL 8.2 client installed. Glad to be a tester.

To run elephent on Debian GNU/Linux, the my-config.sexp may like this:

;; Linux defaults
#+(and (or sbcl allegro openmcl lispworks) (not (or mswindows windows)) (not (or macosx darwin)))
((:compiler . :gcc)
(:berkeley-db-include-dir . "/usr/include/")
(:berkeley-db-lib-dir . "/usr/lib/")
(:berkeley-db-lib . "/usr/lib/libdb-4.5.so")
(:berkeley-db-cachesize . 20971520)
(:berkeley-db-map-degree2 . t)
(:clsql-lib-paths . nil)
(:prebuilt-libraries . nil))

The test for Berkeley DB looks like this:

ELE-TESTS 15 > (setq *default-spec* '(:BDB "/tmp/testdb/"))
(:BDB "/tmp/testdb/")

ELE-TESTS 16 > (do-backend-tests)
;;; Compiling file /afs/163.org/user/b/binghe/lisp/elephant/tests/testbdb.lisp ...
;;; Safety = 3, Speed = 1, Space = 1, Float = 1, Interruptible = 0
;;; Compilation speed = 1, Debug = 2, Fixnum safety = 3
;;; Source level debugging is on
;;; Source file recording is  on
;;; Cross referencing is on
; (LISPWORKS:TOP-LEVEL-FORM 1)
; (LISPWORKS:TOP-LEVEL-FORM 2)
; (DEFVAR ENV)
; (DEFVAR DB)
; PREPARE-BDB
; (LISPWORKS:TOP-LEVEL-FORM 3)
; TEST-SEQUENCE1
; (LISPWORKS:TOP-LEVEL-FORM 4)
; TEST-SEQUENCE2
; (LISPWORKS:TOP-LEVEL-FORM 5)
; CLEANUP-BDB
; (LISPWORKS:TOP-LEVEL-FORM 6)
; (LISPWORKS:TOP-LEVEL-FORM 7)
Doing 132 pending tests of 132 tests total.
BIGNUMS FLOATS RATIONALS COMPLEXES BASE-STRINGS STRINGS HARD-STRINGS SYMBOLS CHARS PATHNAMES CONSES
HASH-TABLES-1 HASH-TABLES-2 ARRAYS-1 ARRAYS-2 TEST-DEEP-EQUALP TEST-SERIALIZATION-UNICODE-SLOT OBJECTS
STRUCTS STRUCT-NON-STD-CONSTRUCT CIRCULAR PERSISTENT
Second store spec missing: ignoring CROSS-STORE-REFERENCE-CONDITION UNINDEXED-CLASS-CONDITION
NON-TRANSIENT-CLASS-SLOT-1 NON-TRANSIENT-CLASS-SLOT-2 TRANSIENT-CLASS-SLOT CLASS-DEFINERS
BAD-INHERITENCE MIXES MIXES-RIGHT-SLOTS INHERIT INHERIT-RIGHT-SLOTS INITFORM-CLASSES INITFORM-TEST
INITARG-TEST NO-EVAL-INITFORM REDEFCLASS MAKUNBOUND UPDATE-CLASS CHANGE-CLASS CHANGE-CLASS3
BASICPERSISTENCE TESTOID BTREE-MAKE BTREE-PUT BTREE-GET REMOVE-KV REMOVED MAP-BTREE INDEXED-BTREE-MAKE
ADD-INDICES TEST-INDICES INDEXED-PUT INDEXED-GET SIMPLE-SLOT-GET INDEXED-GET-FROM-SLOT1
INDEXED-GET-FROM-SLOT2 REMOVE-KV-INDEXED NO-KEY-NOR-INDICES REMOVE-KV-FROM-SLOT1
NO-KEY-NOR-INDICES-SLOT1 REMOVE-KV-FROM-SLOT2 NO-KEY-NOR-INDICES-SLOT2 MAP-INDEXED GET-FIRST GET-FIRST2
GET-LAST GET-LAST2 SET SET2
Test SET-RANGE failed
Form: (WITH-TRANSACTION (:STORE-CONTROLLER *STORE-CONTROLLER*) (WITH-BTREE-CURSOR (C INDEX1)
(MULTIPLE-VALUE-BIND (HAS K V) (CURSOR-SET-RANGE C 199.5) (DECLARE (IGNORE HAS K)) (= (SLOT1 V) 200))))
Expected value: T
Actual value: #<SIMPLE-ERROR 40502326D3>.
SET-RANGE2 MAP-INDEXED-INDEX MAP-INDEX-FROM-END REM-KV REM-IDEXKV MAKE-INDEXED2 ADD-INDICES2
PUT-INDEXED2 GET-INDEXED2 GET-FROM-INDEX3 DUP-TEST NODUP-TEST PREV-NODUP-TEST PNODUP-TEST
PPREV-NODUP-TEST CUR-DEL1 INDEXED-DELETE TEST-DELETED INDEXED-DELETE2 TEST-DELETED2 CUR-DEL2 GET-BOTH
PGET-BOTH
Test PGET-BOTH-RANGE failed
Form: (WITH-BTREE-CURSOR (C INDEX3) (MULTIPLE-VALUE-BIND (M K V P) (CURSOR-PGET-BOTH-RANGE C 10 106.5)
(VALUES K V P)))
Expected values: 10
-107
107
Actual values: NIL
NIL
NIL.
Test PCURSOR failed
Form: (WITH-BTREE-CURSOR (C INDEX3) (VALUES (PCURSOR-PKEY (CURSOR-PFIRST C)) (PCURSOR-PKEY
(CURSOR-PNEXT C)) (PCURSOR-PKEY (CURSOR-PNEXT-NODUP C)) (PCURSOR-PKEY (CURSOR-PNEXT-DUP C))
(PCURSOR-PKEY (CURSOR-PPREV C)) (PCURSOR-PKEY (CURSOR-PPREV-NODUP C)) (PCURSOR-PKEY
(CURSOR-PLAST C)) (PCURSOR-PKEY (CURSOR-PSET C 300)) (PCURSOR-PKEY (CURSOR-PSET-RANGE C 199.5))
(PCURSOR-PKEY (CURSOR-PGET-BOTH C 10 101)) (PCURSOR-PKEY (CURSOR-PGET-BOTH-RANGE C 11 111.4))))
Expected values: 0
2
10
11
10
9
9999
3000
2000
101
112
Actual values: 0
2
10
11
10
9
9999
3000
NIL
101
NIL.
NEWINDEX
Test PCURSOR2 failed
Form: (WITH-BTREE-CURSOR (C INDEX4) (VALUES (PCURSOR-PKEY (CURSOR-PFIRST C)) (PCURSOR-PKEY
(CURSOR-PNEXT C)) (PCURSOR-PKEY (CURSOR-PNEXT-NODUP C)) (PCURSOR-PKEY (CURSOR-PNEXT-DUP C))
(PCURSOR-PKEY (CURSOR-PPREV C)) (PCURSOR-PKEY (CURSOR-PPREV-NODUP C)) (PCURSOR-PKEY
(CURSOR-PLAST C)) (PCURSOR-PKEY (CURSOR-PSET C 300)) (PCURSOR-PKEY (CURSOR-PSET-RANGE C 199.5))
(PCURSOR-PKEY (CURSOR-PGET-BOTH C 10 101)) (PCURSOR-PKEY (CURSOR-PGET-BOTH-RANGE C 11 111.4))))
Expected values: 0
2
10
11
10
9
9999
3000
2000
101
112
Actual values: 0
2
10
11
10
9
9999
3000
NIL
101
NIL.
INDEXING-BASIC INDEXING-CLASS-OPT INDEXING-INHERIT INDEXING-RANGE INDEXING-SLOT-MAKUNBOUND
INDEXING-WIPE-INDEX INDEXING-RECONNECT-DB INDEXING-CHANGE-CLASS INDEXING-REDEF-CLASS
Ranged get of 10/700 objects = Linear: 0.58 sec Indexed: 0.02 sec
INDEXING-TIMING
Single store mode: ignoring REMOVE-ELEMENT
Single store mode: ignoring MIGRATE-BASIC
Single store mode: ignoring MIGRATE-BTREE
Single store mode: ignoring MIGRATE-IDX-BTREE
Single store mode: ignoring MIGRATE-PCLASS
Single store mode: ignoring MIGRATE-MULT-PCLASS
Single store mode: ignoring  MIGRATE-IPCLASS PREPARES-BDB TEST-SEQ1 TEST-SEQ2 CLEANSUP-BDB
4 out of 132 total tests failed: SET-RANGE, PGET-BOTH-RANGE, PCURSOR,
PCURSOR2.
NIL

Just ask me if anything other test operations needed to do.

Chun Tian (binghe)

--

--
'()

```

Gmane