Luigi Bai | 1 Oct 2004 04:42

Collection Functions [WAS Re: Character encodings]


On Thu, 30 Sep 2004, Wolfgang Meier wrote:

>
>> I have one suggestion that makes working with the "collection" based
>> functions easier - it would be different from xmldb:collection(), which
>> currently expects a URI as its first argument to pass to
>> DatabaseManager.getCollection(uri, user, pass). As a remote user I may not
>> know the database-id of the engine running the query; it is not
>> necessarily correct to assume that xmldb:exist:///db points to the local
>> database. Also, the user running the query is already authenticated, so
>> why should they pass authentication information again in the query?
>
> Yes, I already asked myself how to simplify the various xmldb: functions. Do
> we really need to have two collection functions (three if we add
> local-collection)? We already have fn:collection() and xmldb:collection().
> Why not extend fn:collection to accept an XMLDB URI? Instead of directly
> creating a document set, fn:collection would return an object representing a
> collection. This can then be either used to call administrative functions or
> to create a document set for a query. Also, user and password could be
> specified in the URI, so no additional arguments would be required.
>
> Wolfgang

I think the problem with overloading the functionality of fn:collection is 
that like fn:xcollection(), it is specified to return a Node set, not an 
object. Better would be to stick with two (three?) functions, and overload 
xmldb:collection() to be

xmldb:collection($nm as xs:string, $u as xs:string?, $p as xs:string?) 

Luigi Bai | 1 Oct 2004 04:58

Re: Authentication from java application

Eric,

If you use DatabaseManager.getCollection(uri) you receive a collection 
authenticated as "guest". You can use DatabaseManager.getCollection(uri, 
user, pass) to receive a collection authenticated as a particular user. 
Further operations on that collection, such as XQuery, XUpdate, 
getResource, etc. are performed using that authentication.

If you use the client GUI, you can change the permissions of a collection 
to deny read access to various types of users (it works like the typical 
UNIX permissions). You can use the Java xml:db api to do that as well; you 
can ask for a UserManagementService/1.0 (which then needs to be cast to an 
eXist-specific object), which can be used to chmod/chown resources and 
collections. You can create/remove users and groups as well. In CVS, there 
is even support for changing owner/permissions from XQuery (I'm not sure 
it's in a snapshot yet).

For an example of how to secure a collection, you can try
DatabaseManager.getCollection("/db/system")
which should not allow you to getResources or see its contents. You need 
to authenticate as "admin" or another user in the "dba" group:
DatabaseManager.getCollection("/db/system", "admin", password)
to have access to "/db/system/users.xml".

The authentication is useful for having the typical database application 
owner/user split: one userid can change the data, and the other is 
read-only. You can further change your ACLs in the database to even 
prevent "guest" from seeing *anything* past the list of collections in 
"/db" (I don't think you can change the permissions on the "/db" 
collection).
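
To picture the UNIX-style permissions mentioned above, here is a minimal sketch of an owner/group/other mode check. It is only an illustration of the general model, not eXist's implementation: the function name, the mode value 0o770 for "/db/system", and the "editor" user are assumptions made up for the example.

```python
READ, WRITE = 4, 2  # permission bits inside one octal digit of the mode


def allowed(mode, owner, group, user, user_groups, wanted):
    """Check a Unix-style mode (e.g. 0o770) for one user and one permission bit."""
    if user == owner:
        bits = (mode >> 6) & 7      # owner digit
    elif group in user_groups:
        bits = (mode >> 3) & 7      # group digit
    else:
        bits = mode & 7             # everybody else
    return (bits & wanted) == wanted


# Hypothetical settings: owned by "admin", group "dba", no access for others.
SYSTEM_MODE, SYSTEM_OWNER, SYSTEM_GROUP = 0o770, "admin", "dba"

guest_can_read = allowed(SYSTEM_MODE, SYSTEM_OWNER, SYSTEM_GROUP,
                         "guest", {"guest"}, READ)
dba_can_read = allowed(SYSTEM_MODE, SYSTEM_OWNER, SYSTEM_GROUP,
                       "editor", {"dba"}, READ)
```

Under these assumed settings, guest_can_read is False and dba_can_read is True, which mirrors the "/db/system" behaviour described above. A read-only application user would be a user that only ever matches a digit without the write bit.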

Max Ischenko | 1 Oct 2004 09:37

achieving maximal exist performance


Hello,

We're using eXist as our primary storage and expect a significant load 
when it goes into production. Therefore, I need to make sure I get 
optimal performance from it.

Currently we're accessing the eXist server through its HTTP interface 
from Python. Our profiling shows that eXist prepares a response in 0.1 
seconds, but Python's httplib takes about *2 seconds* to retrieve it.

I've tried the XML-RPC method, but the results are somewhat strange. 
Now it takes *eXist* two seconds to prepare the response, while the 
Python application receives it rather quickly: 2.9 seconds in total.

I'm thinking about writing a small Java application that would embed 
eXist and expose an API via a COM service.

But the spike I wrote shows about the same performance as XML-RPC and 
far worse than my HTTP timings. I can't believe my own figures: how can 
serving through HTTP be an order of magnitude (0.1 sec vs ~1 sec) 
faster than in-process eXist?

I must be doing something wrong, or my measurements are wrong.

What can you say?

Thanks.
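
One way to check figures like these is to split a single HTTP request into "time until the response headers arrive" and "time to read the body". The sketch below does that with the standard library only (http.client is the Python 3 name of httplib). The function name is made up for illustration, and the host, port and path you pass in depend on your own setup; nothing here is specific to eXist.

```python
import http.client
import time


def time_http_get(host, port, path):
    """Return (seconds until headers arrive, seconds to read the body, body)."""
    conn = http.client.HTTPConnection(host, port, timeout=30)
    try:
        start = time.perf_counter()
        conn.request("GET", path)
        # getresponse() returns once the status line and headers are parsed.
        response = conn.getresponse()
        headers_at = time.perf_counter()
        # read() blocks until the whole body has been received.
        body = response.read()
        done_at = time.perf_counter()
    finally:
        conn.close()
    return headers_at - start, done_at - headers_at, body
```

If most of the time falls into the first number, the client is waiting for the server to start answering; if it falls into the second, the time is spent transferring or reading the body. That does not say what causes the delay, but it narrows down where to look.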


tom dyson | 1 Oct 2004 13:33

Improved REST interface (was Character encodings)

Can I also propose that the REST interface returns XML-formatted error 
messages for broken XQueries, rather than the current HTML response 
(which I believe is the servlet container's default error format).

I could look at this if someone gives me advice on where to start.

Tom

On 30 Sep 2004, at 02:44, Luigi Bai wrote:

> The XML-RPC spec is kind of nuts; it was only designed for very very 
> simple use cases.
>
> Here is one way to make the REST interface "fully featured":
>
> 1. declare a namespace for each Service: e.g. 
> user="http://exist-db.org/service/UserManagementService"
> 2. expose all the methods on the Service interface as XQuery functions.
>
> Then, the existing REST/Query interface allows all possible 
> operations. With the added bonus that they are also available to 
> XQuery writers. ;-)

-----------------+
tom dyson
t: +44 (0)1608 811870
m: +44 (0)7958 752657
http://torchbox.com


Luigi Bai | 1 Oct 2004 15:44

Re: Improved REST interface (was Character encodings)

Tom,

Look at org.exist.http.RESTServer and o.e.h.servlets.EXistServlet. For the 
most part, it seems the error reporting is done through sending an 
appropriate HTTP error code to the servlet container and letting it send 
an appropriate page out.

For your usage in the short term, you should probably configure your 
servlet container to send an XML page for the various error codes (see the 
Servlet 2.3 spec, Deployment Descriptor, "error-page"). You can make the 
response look like anything you want.

Over the long term, I think Wolfgang wants to replace the XMLRPC interface 
underlying the XML:DB api with the REST interface - in which case, the 
REST responses will have to be carefully crafted to work for both server 
and client side. Then the server code will have to stay in careful 
synchrony with the client code. That could be done also with external 
"error-pages", or internally by "hand crafting" the responses.

Luigi

On Fri, 1 Oct 2004, tom dyson wrote:

> Can I also propose that the REST interface returns XML-formatted error 
> messages for broken XQueries, rather than the current HTML response (which I 
> believe is the servlet container's default error format).
>
> I could look at this if someone gives me advice on where to start.
>
> Tom

Wolfgang Meier | 2 Oct 2004 02:21

Re: Improved REST interface (was Character encodings)

> Look at org.exist.http.RESTServer and o.e.h.servlets.EXistServlet. For the
> most part, it seems the error reporting is done through sending an
> appropriate HTTP error code to the servlet container and letting it send
> an appropriate page out.

RESTServer catches an XPathException thrown by a query and throws a 
BadRequestException instead. The BadRequestException is then handled by the 
calling class (either EXistServlet or HttpServerConnection, depending on how 
eXist is running: in a servlet context or standalone).

I think it would be possible to propagate the XPathException to the caller 
(instead of catching it in RESTServer). The calling class can then create a 
proper response. Instead of calling sendError, the class could generate an 
XML response and set the HTTP response code accordingly.
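
As a rough illustration of "generate an XML response and set the HTTP response code", the sketch below builds such an error document. It is written in Python only to show the shape of the idea; it is not eXist code, and the function name and the element names (error, status, message, query) are invented for the example.

```python
import xml.etree.ElementTree as ET


def build_xml_error(status, message, query=None):
    """Return (status, headers, body) for an XML-formatted error response."""
    root = ET.Element("error")                      # hypothetical vocabulary
    ET.SubElement(root, "status").text = str(status)
    ET.SubElement(root, "message").text = message   # escaped on serialization
    if query is not None:
        ET.SubElement(root, "query").text = query
    body = (b'<?xml version="1.0" encoding="UTF-8"?>\n'
            + ET.tostring(root, encoding="utf-8"))
    headers = {
        "Content-Type": "application/xml; charset=UTF-8",
        "Content-Length": str(len(body)),
    }
    return status, headers, body
```

The caller would send the returned status code together with the body, so a client sees both the HTTP error code and a message it can parse, instead of the container's HTML error page.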

If you want to do some work here, you are very welcome.

BTW: I planned to replace the hand-made HTTP server code used in 
HttpServerConnection and use Jetty's minimal HTTP server.

Wolfgang

-------------------------------------------------------
This SF.net email is sponsored by: IT Product Guide on ITManagersJournal
Use IT products in your business? Tell us what you think of them. Give us
Your Opinions, Get Free ThinkGeek Gift Certificates! Click to find out more
http://productguide.itmanagersjournal.com/guidepromo.tmpl
Wolfgang Meier | 2 Oct 2004 02:30

Re: Collection Functions [WAS Re: Character encodings]

> I think the problem with overloading the functionality of fn:collection is
> that like fn:xcollection(), it is specified to return a Node set, not an
> object. 

Yes, right. That doesn't work then.

> Better would be to stick with two (three?) functions, and overload 
> xmldb:collection() to be xmldb:collection($nm as xs:string, $u as 
> xs:string?, $p as xs:string?) object?

Ok.

Wolfgang

Carlos deMoraes | 1 Oct 2004 22:00

query speed doesn't seem faster if collection is indexed

i don't see a query speed difference between an
indexed collection and one that is not.  perhaps i'm
doing something wrong. i have framed the situation
below. note: element tags have been renamed to protect
the innocent.

the collection:

	collection of about 4000 docs ranging in size from 16k to 80k.
	
	i have not bothered to define a dtd for them yet.  i put the following
	at the top of each doc in order to have a DOCTYPE for the index:

		<!DOCTYPE somedoctype [<!ELEMENT animal (#PCDATA)>]>
		
	querying takes about 7 seconds regardless of whether the docs are
	indexed or not.
	
the index:

	<index doctype="somedoctype" attributes="true"
	       alphanum="true" default="all" index-depth="6">
	</index>

example doc (... for brevity):

	<?xml version="1.0" encoding="UTF-8"?>
	<!DOCTYPE somedoctype [<!ELEMENT animal (#PCDATA)>]>
(Continue reading)

Fabio.Ciotti | 1 Oct 2004 21:35

Problem Xquerying text nodes... again

Hi,

I already sent a message reporting this strange (at least to me) behaviour of recent eXist snapshots, but (apart from some mail exchange with Jean Marc) nobody could help me solve the problem. I've got some 1k records of metadata in METS/MODS format in my eXist DB. Using the 3rd July snapshot (I think; any snapshot preceding the introduction of Cocoon 2.1.5.1 is OK), I modified the MODS XQuery application to query my DB. Everything worked fine (you can have a look at the production site http://www.bibliotecaitaliana.it:8080/exist/catalogo where the application is actually running). When I try my code with a newer snapshot (I've just checked with the latest one), all the queries fail without reporting any error: only 0 records found. By the way, I'm using the latest snapshot under Tomcat 5.0.28 and JSE 1.4.2_04, on Win XP SP1 platforms.

I investigated a little more using the basic XQuery interface with my METS/MODS metadata (I attach a sample METS record; I can send more records if Wolfgang or someone else on the developer team requests them). What I found is that any query that searches text nodes inside my metadata fails, or rather finds nothing, while the same query without operators that search text nodes works fine.

For instance

xcollection('/db/metadata')//mets:mets[.//mods:name&='Alberti']

returns 0 results (the Java client returns 37 items)

while

xcollection('/db/metadata')//mets:mets[.//mods:name]

returns 1021 items (just like the java client)


Even stranger: the query

xcollection('/db/metadata')//mets:mets[.&='Alberti']

returns only one item. And this is stranger since

1) the string in the record is found only under the METS namespace scope, and not under the other namespace scopes in the record
2) anyway it should find 37 items (as does the Java client), not only 1

Another error I found is in the following query to a TEI text (http://www.bibliotecaitaliana.it/archivio/alamanni/antigone/alamanni_antigone.xml)

collection('/db/archivio/Antigone')/TEI.2//l[.&="casa"]

The text contains three <l> elements containing 'casa', and the Java interface finds them correctly.

Instead, the simple XQuery interface returns 3 items, but:

The first is, correctly, the first <l> element in the document containing 'casa'
The second is the <l> element immediately following the one above
The third is the <l> element immediately following the second <l> element containing 'casa'

Therefore it seems that the problem arises in the Cocoon XQuery interface when namespaces and text nodes are involved.

Any idea about how to solve these issues?

Fabio Ciotti

Attachment (1d11f71b.xml): application/octet-stream, 10 KiB
Curtis Hatter | 3 Oct 2004 06:10

namespace missing for collection result in cocoon generator

Hi,

I'm new to eXist and apologize if this has been asked and 
answered. I tried searching the docs and lists for an answer to my 
question but was unable to find one.

Here's the Cocoon snippet I'm using:

<map:match pattern="catalog/*/*">
    <map:match pattern="**/index.htm">
        <map:generate type="file" src="xmldb:exist:///db/pitapparel/{1}/"/>
        <map:serialize type="xml"/>
    </map:match>
</map:match>

This is the result if I do 
http://localhost/cocoon/pitapparel/catalog/sports_shirts/index.htm

<db:collections resources="0" collections="5" 
base="xmldb:exist:///db/pitapparel/catalog/sports_shirts/">
    <db:collection name="pocket"/>
    <db:collection name="basic_cotton_poly"/>
    <db:collection name="basic_cotton"/>
    <db:collection name="fashion"/>
    <db:collection name="long_sleeve"/>
</db:collections>

I want to pass this onto an xslt stylesheet to create a page that can be 
parsed by the XMLDBTransformer. However, the xslt processor keeps giving 
me errors because it cannot resolve the "db" namespace in the 
stylesheet. Example of what I want to do:

<map:generate type="file" src="xmldb:exist:///db/pitapparel/{1}/" />
<map:transform type="xslt" src="styles/xsl/catalog-browser.xsl" />
<map:transform type="exist-xmldb" />

Is there a way around this? Or should I just switch to using XSP pages?

I'm using Cocoon 2.1.5 w/ embedded eXist 1.0b1

Thanks for your help,
Curtis
