Ian Small | 1 May 2007 01:44

RE: [MarkLogic Dev General] unordered

Lee -
 
I'm pretty sure that unordered isn't going to help you much.  It's implementation-dependent, and our ordering tends to be optimized as you outlined. 
 
ian

From: general-bounces-ld4jwAGwUXTgXEvjvSGRgMKenhbt+owO@public.gmane.org [mailto:general-bounces-ld4jwAGwUXTgXEvjvSGRgMKenhbt+owO@public.gmane.org] On Behalf Of Lee Pollington
Sent: Monday, April 30, 2007 8:14 AM
To: General Mark Logic Developer Discussion
Subject: [MarkLogic Dev General] unordered

Hi,

A general question, I'm not trying to solve a problem.

What does ML do with unordered( <expr> )? I've run a couple of simple tests - a straight pull of 1000 nodes, and 1200 nodes based on IDs in another document (with warm caches) - and I see little or no difference.

My understanding is that the fragment IDs from the fact lists will already be in database order. Any range indexes will require a sort to simplify the intersection with any additional fragment ID lists, so there is little scope for unordered to have an impact on performance.

Are there any scenarios where unordered is *likely* to provide a performance benefit?

Thanks,
Lee
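
For anyone who wants to repeat the comparison Lee describes, here is a minimal sketch of such a test. The collection name "test" and the element name item are placeholders, not taken from the thread, and the sketch assumes a server version that accepts the 1.0-ml dialect and provides xdmp:query-meters(). Run it once as written and once with the fn:unordered() wrapper removed, then compare the elapsed times.

```xquery
xquery version "1.0-ml";

(: Hypothetical data: "item" elements in documents of a "test" collection. :)
let $nodes := fn:unordered(fn:collection("test")//item)
return (
  (: Consume the whole sequence so the work is actually done. :)
  fn:count($nodes),
  (: Timing and cache statistics for this request. :)
  xdmp:query-meters()
)
```

As Ian notes, what unordered does is implementation-dependent, so identical timings are a plausible outcome.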
Danny Sokolsky | 1 May 2007 04:30

RE: [MarkLogic Dev General] building a dictionary from a word lexicon

Hi Alan,

I think your approach would work.  

If you really want a dictionary of all of the words in the database,
however, this might be easier:

xdmp:save("c:/tmp/tmp.xml",
<dictionary>{"
",
for $x in cts:words()
return (
<word>{$x}</word>, "
")
}</dictionary>)

The spaces are in there so line breaks will appear between the terms.
This includes everything in the db, not just things starting with a-z
(not sure if that is what you want or not).  I didn't try this on a
large data set, but I think it will work because it will just stream
everything out to the disk (assuming you don't run out of disk
space...).

Of course using the lexicon to create a dictionary means that all of the
words (including the misspelled ones) are put in the dictionary.  So
maybe I am not reading the intent of your question correctly.

-Danny

-----Original Message-----
From: general-bounces@...
[mailto:general-bounces@...] On Behalf Of Alan
Darnell
Sent: Monday, April 30, 2007 3:33 PM
To: General@...
Subject: [MarkLogic Dev General] building a dictionary from a word
lexicon

I'd like to build a dictionary file for use with the spelling module and
base that dictionary on words that appear in my word lexicon.  So I want
to dump the contents of the lexicon to a file formatted according to the
spelling dictionary schema.

To do this, I'm thinking of running through the lexicon letter by letter
and constructing the spelling dictionary from the output.

for $i in cts:word-match("a*") [1 to 2000]
return
<word>{$i}</word>

Is this the best way to do this?  I'm thinking that creating a
dictionary out of lexicons is probably a pretty common task and that my
approach seems cumbersome.  I'm thinking also it would be great if you
could have the dictionary automatically update itself based on the
content of one or more word lexicons as new documents were added,
updated, and deleted in a database or databases.

Alan

Alan Darnell
University of Toronto
_______________________________________________
General mailing list
General@...
http://xqzone.com/mailman/listinfo/general
Paul Preuveneers | 1 May 2007 14:27

Re: [MarkLogic Dev General] Re: URI Privileges

Yes, that fixed it, thank you!

It was the any-uri setting on the role that was the problem.

Thanks!

Paul

On 29/04/07, Michael Blakeley < michael.blakeley-efBvD/aTHCF8UrSeD/g0lQ@public.gmane.org> wrote:
Paul,

Have you read our documentation on "Understanding and Using Security"
(http://developer.marklogic.com/pubs/)? I'm asking because I suspect
that you may be confused about URI Privileges vs document-level
permissions. For example, there is no such thing as an "update
privilege" in the MarkLogic Server security model.

It's also misleading to talk about a "protected URI". All URIs are
protected, unless the user has the "any-uri" Execute Privilege. The
purpose of a URI Privilege is to unprotect a URI prefix for a particular
role or user.

I apologize for being so pedantic, but the terminology is important.

General debugging tips: it is always useful to say which version of
MarkLogic Server you are using. Also, each security item has a
"Describe" tab in the admin server, which provides a nice summary of the
item's configuration.

Here's how I set up a similar model to what I believe you're after: we
use this as an example in our training course. I've copied the text from
the description tab for each item.

* URI Privilege: priv-uri-public
** privilege name: priv-uri-public
** privilege action: /public/

* Role: writer
** Execute Privileges: none
** URI Privileges: priv-uri-public
** Permissions: writer (insert, update, read), reader (read)
** Collections: none

* User: writer
** Roles: writer

* Role: reader
** Execute Privileges: none
** URI Privileges: none
** Permissions: none
** Collections: none

* User: reader
** Roles: reader

With this configuration, "writer" may insert new documents under
/public/, but nowhere else in the database. The "writer" may
subsequently query, update, and overwrite those documents. The "reader"
may only query those documents.
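
A minimal sketch of how the configuration above could be exercised, assuming the roles, users, and the priv-uri-public privilege already exist as listed and the query runs as the "writer" user. The document URI and content are made up for illustration, and the 1.0-ml dialect is an assumption about the server version.

```xquery
xquery version "1.0-ml";

(: Run as user "writer". The URI starts with /public/, the prefix
   covered by priv-uri-public. No permissions are passed, so the
   default permissions configured on the role should apply to the
   new document. :)
xdmp:document-insert(
  "/public/example.xml",
  <note>created by writer</note>)
```

Changing the URI to something outside /public/ (for example /private/example.xml) should make the same insert fail for "writer". The "reader" user should be able to fetch the document with fn:doc("/public/example.xml") but not change it.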

-- Mike

> Paul Preuveneers paul.preuveneers at gmail.com
> Mon Apr 23 03:37:10 PDT 2007
>
> I am trying to lock down a particular URI to a particular role/user and I
> don't seem to be able to
> get the URI Privileges functionality to work.
>
> I have the following idiom for users and roles:
>
> Role                        User
> web-user                  my-web-user
> content-manager       my-content-manager
>
> The web-user role does not have document update privileges, whereas the
> content-manager role does.
> I connect to ML using my-web-user and only use content-manager when loading
> data or for cq.
>
> I want to be able to let the web-user role only update a specific URI and
> nowhere else, however even after creating
> a URI privilege and assigning it to that role, I still cannot create
> documents in that uri (or anywhere else!). The user
> still seems to need document update privileges? But if I grant these I can
> create docs in any URI.
> I can also still create documents in the protected URI with the
> content-manager user also, and I was hoping
> this would not be allowed until I gave the privilege to this role also.
>
> So far, I can't see the URI Privileges having any kind of effect at all...
>
> What am I doing wrong?
>
> Thanks
>
> Paul


_______________________________________________
General mailing list
General-ld4jwAGwUXTgXEvjvSGRgMKenhbt+owO@public.gmane.org
http://xqzone.com/mailman/listinfo/general



Alan Darnell | 1 May 2007 19:32

Re: [MarkLogic Dev General] building a dictionary from a word lexicon

Thanks Danny,

This worked great but when I tried to load the resulting file (about
400K words -- lots of specialized medical terms) I got this error:

ERROR: eval-in sp-nih at file:/opt/MarkLogic/Modules/
XDMP-FRAGTOOLARGE: Fragment of /sp-dictionary.xml too large for
in-memory storage: XDMP-INMMLISTFULL: In-memory list storage full;
list: table=89%, wordsused=67%, wordsfree=0%, overhead=33%; tree:
table=0%, wordsused=12%, wordsfree=88%, overhead=0%

Are there some admin settings I can adjust to get past this or should
I break the dictionary file up into smaller chunks or load the thing
via XCC one word at a time?

Alan

On 4/30/07, Danny Sokolsky <dsokolsky@...> wrote:
> Hi Alan,
>
> I think your approach would work.
>
> If you really want a dictionary of all of the words in the database,
> however, this might be easier:
>
> xdmp:save("c:/tmp/tmp.xml",
> <dictionary>{"
> ",
> for $x in cts:words()
> return (
> <word>{$x}</word>, "
> ")
> }</dictionary>)
>
> The spaces are in there so line breaks will appear between the terms.
> This includes everything in the db, not just things starting with a-z
> (not sure if that is what you want or not).  I didn't try this on a
> large data set, but I think it will work because it will just stream
> everything out to the disk (assuming you don't run out of disk
> space...).
>
> Of course using the lexicon to create a dictionary means that all of the
> words (including the misspelled ones) are put in the dictionary.  So
> maybe I am not reading the intent of your question correctly.
>
> -Danny
>
> -----Original Message-----
> From: general-bounces@...
> [mailto:general-bounces@...] On Behalf Of Alan
> Darnell
> Sent: Monday, April 30, 2007 3:33 PM
> To: General@...
> Subject: [MarkLogic Dev General] building a dictionary from a word
> lexicon
>
>
> I'd like to build a dictionary file for use with the spelling module and
> base that dictionary on words that appear in my word lexicon.  So I want
> to dump the contents of the lexicon to a file formatted according to the
> spelling dictionary schema.
>
> To do this, I'm thinking of running through the lexicon letter by letter
> and constructing the spelling dictionary from the output.
>
> for $i in cts:word-match("a*") [1 to 2000]
> return
> <word>{$i}</word>
>
> Is this the best way to do this?  I'm thinking that creating a
> dictionary out of lexicons is probably a pretty common task and that my
> approach seems cumbersome.  I'm thinking also it would be great if you
> could have the dictionary automatically update itself based on the
> content of one or more word lexicons as new documents were added,
> updated, and deleted in a database or databases.
>
> Alan
>
> Alan Darnell
> University of Toronto
> _______________________________________________
> General mailing list
> General@...
> http://xqzone.com/mailman/listinfo/general
>
Danny Sokolsky | 1 May 2007 20:17

RE: [MarkLogic Dev General] building a dictionary from a word lexicon

Hi Alan,

The dictionary you are loading is larger than your in-memory list size,
which limits the largest fragment you can load into that database.  You
need to either increase the in-memory-list-size parameter on the database
into which the dictionary is being loaded or break the dictionary up into
two or more smaller dictionaries.  The in-memory-list-size setting is in
the Admin Interface, on the database configuration page for the database
you are using.
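
If raising in-memory-list-size is not an option, the second route (several smaller dictionaries) could look roughly like this. It is a sketch only: the chunk size of 100000 words and the output file names are arbitrary assumptions, the 1.0-ml dialect is assumed, and the namespace on the dictionary element is an assumption about what the spelling dictionary schema expects, so check it against your version.

```xquery
xquery version "1.0-ml";

(: Chunk size is an arbitrary assumption; lower it if a chunk is still
   too large for the in-memory list. :)
let $size := 100000
let $words := cts:words()
let $chunks := xs:integer(fn:ceiling(fn:count($words) div $size))
for $i in 1 to $chunks
return
  (: Hypothetical file names: dictionary-1.xml, dictionary-2.xml, ... :)
  xdmp:save(
    fn:concat("c:/tmp/dictionary-", $i, ".xml"),
    <dictionary xmlns="http://marklogic.com/xdmp/spell">{
      for $w in fn:subsequence($words, ($i - 1) * $size + 1, $size)
      return <word>{$w}</word>
    }</dictionary>)
```

Each file written this way becomes its own fragment when loaded, so each one only has to fit the in-memory list by itself.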

Also, if you are going to use this with the spell APIs, you should use
the spell:load function to load the dictionary, which puts the document
in the proper collections for the spell API.
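
A minimal sketch of that call, assuming the 1.0-ml dialect and the standard location of the spelling module. The file path is the one from the xdmp:save example earlier in this thread, and the URI is the one that appears in Alan's error message.

```xquery
xquery version "1.0-ml";

import module namespace spell = "http://marklogic.com/xdmp/spelling"
  at "/MarkLogic/spell.xqy";

(: First argument: the file on the server's filesystem.
   Second argument: the URI the dictionary gets in the database. :)
spell:load("c:/tmp/tmp.xml", "/sp-dictionary.xml")
```

The file is expected to follow the spelling dictionary schema, so a dictionary written without the schema's namespace may need adjusting before it loads.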

-Danny

-----Original Message-----
From: Alan Darnell [mailto:alan.darnell@...] 
Sent: Tuesday, May 01, 2007 10:32 AM
To: Danny Sokolsky
Cc: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] building a dictionary from a word
lexicon

Thanks Danny,

This worked great but when I tried to load the resulting file (about
400K words -- lots of specialized medical terms) I got this error:

ERROR: eval-in sp-nih at file:/opt/MarkLogic/Modules/
XDMP-FRAGTOOLARGE: Fragment of /sp-dictionary.xml too large for
in-memory storage: XDMP-INMMLISTFULL: In-memory list storage full;
list: table=89%, wordsused=67%, wordsfree=0%, overhead=33%; tree:
table=0%, wordsused=12%, wordsfree=88%, overhead=0%

Are there some admin settings I can adjust to get past this or should I
break the dictionary file up into smaller chunks or load the thing via
XCC one word at a time?

Alan

VIKAS JAIN | 2 May 2007 09:38

[MarkLogic Dev General] Examples for loading and searching PDF

Hi,

I am trying to build a small POC in which I need to load and search 
PDF documents.
All the examples and tutorials talk about loading and searching XML 
documents.

I have been able to load the PDF document the same way as XML, but when 
I search, nothing is returned.
I am using the code below to search for text in the PDF document.
xdmp:set-response-content-type("text/html"),
    <ul>
    {
      for $i in cts:search(input(), cts:word-query("Altova"))
      return
        <li> {$i} </li>
    }
    </ul>

Could you provide some examples that work with PDFs or other formats 
in MarkLogic Server?

Thanks for help in advance.

Regards
Vikas


Michael Blakeley | 2 May 2007 17:11

Re: [MarkLogic Dev General] Examples for loading and searching PDF

Vikas,

I would recommend that you read chapter 10 of the developers' guide, 
available at http://developer.marklogic.com/pubs/

-- Mike

VIKAS JAIN wrote:
> Hi,
> 
> I am trying to build a small POC, where in I require to load and search 
> the PDF documents.
> All the examples and tutorials talk about loading and searching XML 
> documents.
> 
> I have been able to load the PDF document on lines of XML but when it 
> comes to search nothing is being returned.
> I am using the code below to search a text in PDF document.
> xdmp:set-response-content-type("text/html"),
>    <ul>
>    {
>      for $i in cts:search(input(), cts:word-query("Altova"))
>      return
>        <li> {$i} </li>
>    }
>    </ul>
> 
> Request you to provide some examples that work with PDF’s or other 
> formats in Mark Logic server.
> 
> Thanks for help in advance.
> 
> Regards
> Vikas
> 
> _______________________________________________
> General mailing list
> General@...
> http://xqzone.com/mailman/listinfo/general


Vikas Jain | 3 May 2007 09:35

[MarkLogic Dev General] Error in saving a converted PDF file

Hi,

I am trying to build a small POC in which I load a PDF document.

I tried using xdmp:document-load to load the PDF document after converting it with pdf-convert, but that does not work: xdmp:document-load expects a string, whereas the conversion returns a sequence of nodes.

So after digging a little, I found another function that could be used to load the PDF document into MarkLogic: “cvt:save-converted-documents”.

Before loading the document, I am trying to convert it and then save it. This is the code I am using:

import module namespace pdf = "http://marklogic.com/cpf/pdf"
                          at "/MarkLogic/conversion/pdf.xqy"

import module namespace cvt = "http://marklogic.com/cpf/convert"
                          at "/MarkLogic/conversion/convert.xqy"

  let $results := xdmp:pdf-convert( xdmp:document-get("C:\Vikas\MarkLogic_3.1_pubs\MarkLogic_3.1_pubs\pubs\books\admin.pdf"), "admin.pdf" )
  return
     cvt:save-converted-documents("C:\Vikas\MarkLogic_3.1_pubs\MarkLogic_3.1_pubs\pubs\books\admin.pdf", "admin.pdf",
        $results[1], $results[2 to last()] );

Now on executing the command I am getting the following error:

500 Internal Server Error

XDMP-DOCNOTFOUND: xdmp:document-add-properties("admin.pdf", /lnk:link) -- Document not found

in /MarkLogic/cpf/links.xqy, on line 476,
in lnk:insert-property(/lnk:link/@to, /lnk:link)

$lnk:uri = xs:anyURI("admin.pdf")
$lnk:link = /lnk:link
$lnk:existing-link = ()
$lnk:trace = ()

in /MarkLogic/cpf/links.xqy, on line 48,
in lnk:insert(/lnk:link)

$lnk:link = /lnk:link

in /MarkLogic/cpf/links.xqy, on line 33,
in lnk:create("admin_pdf_parts/", "admin.pdf", "source", "conversion", "strong")

$lnk:from = "admin_pdf_parts/"
$lnk:to = "admin.pdf"
$lnk:role = "source"
$lnk:rev-role = "conversion"
$lnk:strength = "strong"

in /MarkLogic/conversion/convert.xqy, on line 203,
in cvt:save-converted-documents("admin.pdf", "admin.pdf", /parts, (doc(""), doc(""), doc(""), ...))

$source-uri = "admin.pdf"
$destination-uri = "admin.pdf"
$manifest = /parts
$docs = (fn:doc(""), fn:doc(""), fn:doc(""), ...)
$destination-uri = "admin.pdf"
$subdirs = "admin_pdf_parts/"
$has-main = fn:false()
$insert-main = ()
$dir = "admin_pdf_parts/"

in /use-cases/load1.xqy, on line 10

$results = (/parts, doc(""), doc(""), ...)

 

Please let me know what I am doing wrong.

Thanks in advance for your help.

Regards

Vikas Jain | GlobalLogic India
Disclaimer: http://www.globallogic.com/email_disclaimer.txt

 

Vikas Jain | 3 May 2007 10:54

[MarkLogic Dev General] Function to get the documents from file system

Hi,

Is there a function, similar to “xdmp:http-get”, that can get all the documents from the file system?

There is a function “xdmp:directory”, but it seems to work only with documents loaded into the MarkLogic server, not with files on the file system.

Thanks in advance for your help.

Regards

Vikas Jain
The Leader in Global Product Development
B-34/1, Sector 59,  Noida 201301 U.P

Phone: +91. 120. 406.2000 x3154 | Fax: +91.120.258.5721 
www.globallogic.com
Disclaimer: http://www.globallogic.com/email_disclaimer.txt

 

 

Mattio Valentino | 3 May 2007 15:42

Re: [MarkLogic Dev General] cts:search question

Darin,

This worked perfectly.  Thanks for the tip.

Mattio

On 4/18/07, McBeath, Darin W (ELS-STL) <D.McBeath@...> wrote:
> I don't believe that will work.
>
> Instead, you might want to try the following approach:
>
> cts:search(/(a | b), cts:and-query(()))
>
> Darin.
>
> -----Original Message-----
> From: general-bounces@...
> [mailto:general-bounces@...] On Behalf Of Mike
> Sokolov
> Sent: Wednesday, April 18, 2007 9:24 AM
> To: General Mark Logic Developer Discussion
> Subject: Re: [MarkLogic Dev General] cts:search question
>
> Mattio - The signature seems to imply you should be able to pass a
> sequence of  nodes using multiple paths by wrapping them in parens:
>
> cts:search ((path1, path2, path3), ...)
>
> you could also do multiple searches
>
>
> Mattio Valentino wrote:
> > Hi,
> >
> > I'm still coming up to speed on XQuery and MarkLogic and I can't quite
> > figure out something that I'm sure will be obvious once I hear the
> > answer.
> >
> > The signature on cts:search() is
> >
> > cts:search(
> > $expression as node()*,
> > $query as cts:query?,
> > [$options as xs:string*],
> > [$quality-weight as xs:double]
> > )  as  node()*
> >
> > Does "$expression as node()*" mean that I should be able to pass in
> > more than one path so I can search more than one node?  I initially
> > assumed so, but I can't figure out how to do it after trying several
> > different ways.  After looking at the Shakespeare sample app I was
> > able to build the query below, which works but seems a little awkward.
> > Is there an easier way?
> >
> > let $ORIGINAL_QUERY := "test phrase"
> > for $i in cts:search(
> >                     //element()[self::chapter[not(descendant::entry)] |
> >                                 self::div[not(descendant::entry)] |
> >                                 self::entry |
> >                                 self::section],
> >                     cts:word-query(
> >                                     $ORIGINAL_QUERY,
> >                                     ("case-insensitive", "diacritic-insensitive")
> >                                   )
> >                   )
> > return
> > (: handle results sequence :)
> >
> > Thanks,
> >
> > Mattio
> > _______________________________________________
> > General mailing list
> > General@...
> > http://xqzone.com/mailman/listinfo/general
