Retter, Adam (RBI-UK | 29 Jul 17:50 2014

[MarkLogic Dev General] Transaction control and errors

Hi there,

We have a function that reads a collection of documents in the database, and then inserts N new documents
into the collection depending on the results of the first read. We need this behaviour to be atomic, and so
we explicitly take locks using xdmp:lock-for-update. This effectively forces the entire read and the
conditional insert that follows into a single-threaded operation (which is fine).

Here is the problem. Our function looks like this:

declare private function load-entities(
    $entities as element()*,
    $etag-map as map:map,
    $entity-id as xs:string?,
    $is-orphaned-series-item-allowed as xs:boolean,
    $missing-etag-allowed as xs:boolean,
    $ignore-duplicate-updates as xs:boolean
) as element(store:etag-map) {

    let $_ := xdmp:set-transaction-mode("update")
    return
        let $entity-ids := fn:distinct-values($entities/c:id)
        let $_ := $entity-ids ! xdmp:lock-for-update(.)
        return
            let $etag-map := _load-entities($entities, $etag-map, $entity-id,
                $is-orphaned-series-item-allowed, $missing-etag-allowed, $ignore-duplicate-updates)
            return
                let $_ := xdmp:commit()
                return
                    $etag-map
};
(Continue reading)

mbsjr | 29 Jul 15:49 2014

[MarkLogic Dev General] Please Remove me from the Developers Discussion List

Please remove me (mbsjr@...) from the Developers Discussion List.

Thanks - Matt Stevens
Harry Bakken | 29 Jul 14:42 2014

[MarkLogic Dev General] Forest not available, open replica

I have two forests on two separate databases that appear normal in the admin interface, but when we try to connect to the WebDAV app servers associated with the databases, there is a 500 error thrown. The forests are on our replica failover cluster, so they are "open replica" status. Here is the error:

2014-07-28 09:05:48.575 Notice: 8091-Modules-webdav: XDMP-FORESTNOT: Forest Modules not available: open replica
2014-07-28 09:06:30.790 Notice: 8025-moc-modules-webdav: XDMP-FORESTNOT: Forest moc-modules-1 not available: open replica

I am looking for ideas on what to do/troubleshoot. As I said, these forests appear to be fine. I have tried deleting them and establishing them again, but that didn't do anything. I was also directed to a potential fix by rolling back the forests to the last non-blocking timestamp, but that doesn't do anything to help. I have completely removed the database and forest on the replica side, then re-established it all and the error persists. Any ideas are appreciated.

Harry
Erik Zander | 28 Jul 16:58 2014

[MarkLogic Dev General] XSLT check if result document has been created

Hi All,

I'm working on an XSLT transform that extracts data about images from a document and puts that data into result documents. It all works fine except when the same image occurs more than once, as I then get conflicting URIs.

My code looks like this:

<xsl:template match="//db:informalfigure[descendant::db:imagedata and @role='figure']">
    <xsl:variable name="curImage" select="substring-after(.//@fileref,'/')"/>
    <xsl:variable name="id" select="$imageMetaData//image[name=string-join(($curISBN,$curImage),'/')]/id"/>
    <xsl:if test="not(doc-available(string-join(('out/',$id,'.xml'),'')))">
        <!-- This check fails. Am I doing it incorrectly, or is it the way XSLT processes documents that makes it hard to check whether the id has been encountered before? -->
        <xsl:result-document method="xml" href="out/{$id}.xml" indent="yes">

Ideally I would like to be able to check whether the result document has already been created, and then decide whether to update it with more information or just leave it be.

I would appreciate any help on the subject.

Best regards

Erik
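
A hedged note on why the check may fail: XSLT 2.0 does not expect a transformation to read back a document it is writing with xsl:result-document, so doc-available() cannot be relied on to see results created earlier in the same run. One common alternative is to group the figures by id with xsl:for-each-group and write one result document per group. The sketch below is only an illustration under assumptions: it assumes XSLT 2.0, the DocBook 5 namespace for the db prefix, and that $imageMetaData and $curISBN arrive as stylesheet parameters. The image and occurrences output elements are hypothetical placeholders, and only the first fileref of each figure is used.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:db="http://docbook.org/ns/docbook"
    exclude-result-prefixes="xs db">

  <!-- Assumed parameters standing in for $imageMetaData and $curISBN;
       the post does not show how they are actually bound. -->
  <xsl:param name="imageMetaData" as="document-node()"/>
  <xsl:param name="curISBN" as="xs:string"/>

  <xsl:template match="/">
    <!-- Group the figures by the id looked up in the metadata, so each id
         is handled once however many times the image occurs.
         current() is the figure being grouped; inside the predicate the
         context item is the metadata image element. -->
    <xsl:for-each-group
        select="//db:informalfigure[descendant::db:imagedata and @role='figure']"
        group-by="$imageMetaData//image[name = string-join(($curISBN,
                    substring-after((current()//@fileref)[1], '/')), '/')]/id">
      <!-- One result document per distinct id, so the URIs cannot conflict. -->
      <xsl:result-document method="xml" href="out/{current-grouping-key()}.xml" indent="yes">
        <!-- Hypothetical output; replace with the real content. -->
        <image id="{current-grouping-key()}">
          <!-- current-group() holds every figure that refers to this image. -->
          <occurrences><xsl:value-of select="count(current-group())"/></occurrences>
        </image>
      </xsl:result-document>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

With this shape, the "update it with more information" case is handled inside the group: everything known about an image is available through current-group() at the moment its single result document is written. Figures with no matching id in the metadata fall into no group and produce no output.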

Timothy W. Cook | 28 Jul 16:18 2014

[MarkLogic Dev General] Security Design

I am in the early design stages of a (hopefully) large application and would like to see if I understand the operations of collections correctly.

You can think of this in a similar context to a social media app. 
I have attached a simple diagram to aid the text.

Imagine that Joe, Sue and Tom are users and each has a collection (marked 'P') where only they have read/write access to the documents they load.
Joe and Tom have collections that they would like to use to share documents (read only) with various other users, one being Sue. This seems rather straightforward.
However, the use case also calls for Sue being able to share (read only) Tom's documents with Joe and Joe's documents with Tom, as she sees fit, without the intervention of Tom or Joe.

Could someone expand on this to describe how this might be set up? Do I need separate roles that are tied to each collection, for each of these exchanges?

Thanks,
Tim


--

============================================
Timothy Cook
LinkedIn Profile:http://www.linkedin.com/in/timothywaynecook
MLHIM http://www.mlhim.org
Harry Bakken | 28 Jul 12:07 2014

[MarkLogic Dev General] XDMP-INMTRPLIDXFULL

I noticed log entries similar to this one on our 7.0-3 cluster, and I am not sure whether they are anything to be concerned about:

2014-07-28 00:23:00.812 Debug: Forest::insert: Meters XDMP-INMTRPLIDXFULL: In-memory triple-index storage full; list: table=2%, wordsused=2%, wordsfree=97%, overhead=1%; tree: table=1%, wordsused=11%, wordsfree=89%, overhead=0%

We aren't using any triple indexes in any of the databases on the system. Any info/advice is appreciated.

Harry
Dinesh | 26 Jul 04:22 2014

[MarkLogic Dev General] Search Snippet

Hi All,

I would like to know whether search:snippet in the Search API can always return at least 2 lines of snippet, with a defined number of words surrounding the search hits.

Are there any options that allow the search to parse through the XML and return the required number of words for the snippet?

Thanks.

Regards,
Dinesh

Steve Spigarelli | 25 Jul 18:04 2014

[MarkLogic Dev General] spawn static-check

I have a large number of files that I would like to check to verify that they are valid XQuery. The way I've gone about doing this is using the xray testing framework and then looping through all modules in my project with xdmp:spawn and the static-check option.

The problem is that after a few files (fewer than 100), MarkLogic restarts after running out of memory due to these spawned static checks.

Any ideas on why this spawn check seems to be causing a memory leak?

The memory does return to the system after a few minutes if I stop the check before the crash.

Thanks for any ideas,
Steve Spigarelli
Prasanth N V R | 25 Jul 16:23 2014

[MarkLogic Dev General] AWS S3 Object Count

Hi,

I am trying to get the number of objects in an AWS S3 bucket using XQuery via MarkLogic.

One option I tried was listing the objects in the bucket, but that returns at most 1000 keys in a single call.

Is there a way to get the total number of objects in an AWS S3 bucket using XQuery? Or are there any MarkLogic APIs available for this?


Thanks,
Prasanth
Jakob Fix | 25 Jul 00:34 2014

[MarkLogic Dev General] database replication (platform mismatch)

Hi,

the idea behind my recent travails with mlcp was to replicate an existing database on my laptop onto a virtual machine in the cloud.  transferring 7 GB with rsync is feasible but still takes a lot of time and is not particularly snappy (especially if the local database gets new data in the meantime).

so I may get it done eventually, but then I thought: why not use database replication and make use of all the clever stuff like transactions and the journalling etc.?

two clusters are quickly identified and configured, and almost coupled .... except that the platforms (macosx and linux) are, apparently, not compatible. what a bummer! I was almost there!

is there really no way (not even undocumented, between you and me :-)) to get this done? I mean how big can the differences be between a 64-bit unix-like MacOSX and a 64-bit Debian 7 (Wheezy)?

this would be a really great way to migrate the data .... or do I have to set up a local linux VM into which I load the mlcp-archived data and then replicate this one?

I'm still accepting better ideas than this one :-)

cheers,
Jakob.
Jakob Fix | 23 Jul 01:01 2014

[MarkLogic Dev General] mlcp export problem/question

Hi,

I'm trying to export a database using mlcp. I'm not having much success.

Here is a gist of the error output after 1009 seconds running (that's what mlcp says):

https://gist.github.com/jfix/2ef60350f8af9a4c2f33

mlcp wonders whether "Server connection lost?" but the server is running as far as I can make out. Are there any special precautions to take when exporting (e.g. disabling potentially running tasks on the Task Server, ...)?

the database is about 3 GB big (according to the Size indication in the status page: 3,045 MB).

I'm running this on a MacBook Pro with 8 GB of RAM, latest MarkLogic (7.0-3) and latest mlcp (Hadoop2-1.2-2).

cheers,
Jakob.