Mr. Yawar | 17 May 2013 18:02

Migrating from FS Persistence Manager to MySQL PM

Hi,

I am using Jackrabbit 2.4.3 with some custom node types in my repository.
I am using the FileSystem PersistenceManager for workspaces and versioning,
and the FileSystem data store. Approx. 20 TB of data has been inserted into it.

At this point I want to migrate to a DB Persistence Manager (PM). Because
the data size is huge, I can't export all the data from the FS PM and
import it into a DB PM repository. Can anyone please help me migrate just the
persistence manager from FS to DB (without moving the huge data store
content)?

Thanks for your consideration.

--
View this message in context: http://jackrabbit.510166.n4.nabble.com/Migrating-from-FS-Persistence-Manager-to-MySQL-PM-tp4658723.html
Sent from the Jackrabbit - Dev mailing list archive at Nabble.com.

Davide Maestroni (JIRA) | 17 May 2013 15:43

[Updated] (JCR-3597) Dead lock when using LockManager session scoped locks on nodes


     [
https://issues.apache.org/jira/browse/JCR-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Davide Maestroni updated JCR-3597:
----------------------------------

    Attachment: thread_dump.txt

Thread dump log file

> Dead lock when using LockManager session scoped locks on nodes
> --------------------------------------------------------------
>
>                 Key: JCR-3597
>                 URL: https://issues.apache.org/jira/browse/JCR-3597
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>          Components: jackrabbit-core
>    Affects Versions: 2.6.1
>         Environment: Sling + Jackrabbit
>            Reporter: Davide Maestroni
>            Priority: Blocker
>         Attachments: thread_dump.txt
>
>
> I have a repository created using Jackrabbit and accessed through HTTP APIs via Sling. I use the
LockManager class to create locks on nodes before adding new children and updating properties. When
testing the speed of the system I ran across an issue when using different threads to add child nodes to
different parent nodes: a deadlock happens quite consistently when one thread is logging out from a JCR

Davide Maestroni (JIRA) | 17 May 2013 15:43

[Created] (JCR-3597) Dead lock when using LockManager session scoped locks on nodes

Davide Maestroni created JCR-3597:
-------------------------------------

             Summary: Dead lock when using LockManager session scoped locks on nodes
                 Key: JCR-3597
                 URL: https://issues.apache.org/jira/browse/JCR-3597
             Project: Jackrabbit Content Repository
          Issue Type: Bug
          Components: jackrabbit-core
    Affects Versions: 2.6.1
         Environment: Sling + Jackrabbit
            Reporter: Davide Maestroni
            Priority: Blocker
         Attachments: thread_dump.txt

I have a repository created using Jackrabbit and accessed through HTTP APIs via Sling. I use the
LockManager class to create locks on nodes before adding new children and updating properties. When
testing the speed of the system I ran across an issue when using different threads to add child nodes to
different parent nodes: a deadlock happens quite consistently when one thread is logging out from a JCR
session and another one is saving its own session (I'll attach the thread dump to the JIRA issue).
So I started inspecting the source code to understand what the problem was, and I think I have located
it. Basically it all happens when calling the SharedItemStateManager.beginUpdate() method: one
thread, inside Update.begin(), acquires the synchronized lock on the LocalItemStateManager
instance and waits for the ISMLocking lock to be released (Thread t@295 in the logs), while the other, which
owns the ISMLocking lock, triggers an event inside Update.end() which in turn tries to synchronize on
the LocalItemStateManager instance (Thread t@293 in the logs).
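
The pattern described above is a lock-order inversion: each thread holds one lock and waits for the one the other thread owns. The following is only a minimal sketch of that pattern, not Jackrabbit code. The class and method names are hypothetical, and two ReentrantLock objects stand in for the LocalItemStateManager monitor and the ISMLocking lock (an assumption made so that tryLock with a timeout can be used and the demo terminates instead of hanging).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; none of these names come from Jackrabbit.
final class LockOrderInversionSketch {

    // Takes `first`, waits until the other thread also holds its first lock,
    // then tries `second` with a timeout instead of blocking forever.
    static void acquireInOrder(ReentrantLock first, ReentrantLock second,
                               CountDownLatch bothHoldFirst, AtomicInteger timedOut) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();
            if (second.tryLock(200, TimeUnit.MILLISECONDS)) {
                second.unlock();
            } else {
                // Without the timeout this thread would wait here indefinitely.
                timedOut.incrementAndGet();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            first.unlock();
        }
    }

    // Runs the two threads once and returns how many of them gave up waiting.
    static int runOnce() {
        ReentrantLock localMonitor = new ReentrantLock(); // stand-in for the LocalItemStateManager monitor
        ReentrantLock ismLock = new ReentrantLock();      // stand-in for the ISMLocking lock
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        AtomicInteger timedOut = new AtomicInteger();

        // "begin" takes the monitor first, "end" takes the ISM lock first.
        Thread begin = new Thread(() -> acquireInOrder(localMonitor, ismLock, bothHoldFirst, timedOut));
        Thread end = new Thread(() -> acquireInOrder(ismLock, localMonitor, bothHoldFirst, timedOut));
        begin.start();
        end.start();
        try {
            begin.join();
            end.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return timedOut.get();
    }

    public static void main(String[] args) {
        System.out.println("threads that timed out: " + runOnce());
    }
}
```

At least one of the two threads always times out in this sketch, because neither can obtain its second lock until the other has given up. Acquiring both locks in the same order on every code path would remove the cycle.
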
I noticed that the LocalItemStateManager implementation employs synchronized blocks in only two
places: one in the edit() method to synchronize the whole method, and one in the getItemState()
method to synchronize access to the cache object. Both are quite odd, since edit() is the only
synchronized method and there are several other places in which editMode and changeLog are modified, so

Tommaso Teofili (JIRA) | 17 May 2013 11:43

[Updated] (JCR-3534) Efficient copying of binaries across repositories with the same data store


     [
https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tommaso Teofili updated JCR-3534:
---------------------------------

    Attachment: JCR-3534.7.patch

Here's a new patch which stores the reference key in the data store.
The AbstractDataStore class no longer reads a configured value; it retrieves the key if it is available in
the repository, and otherwise stores it first via the abstract
addReferenceKeyRecord(DataIdentifier,InputStream) method.
By default the reference key is randomly generated using a SecureRandom as a 256-bit byte[], and the key
identifier is just a String.
Both of these last two points can be enhanced by any subclass of AbstractDataStore.
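
For readers who want to see what a randomly generated 256-bit key looks like in practice, here is a minimal sketch using only the JDK. It is not taken from the patch: the class name, the method names and the hex rendering are assumptions made for illustration.

```java
import java.security.SecureRandom;

// Hypothetical helper, not part of the JCR-3534 patch.
final class ReferenceKeySketch {

    static final int KEY_LENGTH_BYTES = 32; // 32 bytes = 256 bits

    // Fills a fresh byte[] with cryptographically strong random bytes.
    static byte[] newReferenceKey() {
        byte[] key = new byte[KEY_LENGTH_BYTES];
        new SecureRandom().nextBytes(key);
        return key;
    }

    // Renders the key as lowercase hex, two characters per byte.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] key = newReferenceKey();
        System.out.println((key.length * 8) + "-bit key: " + toHex(key));
    }
}
```
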

> Efficient copying of binaries across repositories with the same data store
> --------------------------------------------------------------------------
>
>                 Key: JCR-3534
>                 URL: https://issues.apache.org/jira/browse/JCR-3534
>             Project: Jackrabbit Content Repository
>          Issue Type: New Feature
>          Components: jackrabbit-api, jackrabbit-core
>    Affects Versions: 2.6
>            Reporter: Felix Meschberger
>            Assignee: Tommaso Teofili
>         Attachments: JCR-3534.2.patch, JCR-3534.3.patch, JCR-3534.4.patch, JCR-3534.6.patch,
JCR-3534.7.patch, JCR-3534.patch, JCR-3534.patch

Tommaso Teofili (JIRA) | 17 May 2013 11:37

[Commented] (JCR-3534) Efficient copying of binaries across repositories with the same data store


    [
https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13660520#comment-13660520
] 

Tommaso Teofili commented on JCR-3534:
--------------------------------------

> what i am saying is that we should not have a plain txt referenceKey/secret stored anywhere

What should we have in your opinion instead? Would a randomly generated key stored in the data store address
your concern?
As far as I can see it has to be stored somewhere (not in memory) if we want it to be shared (no configuration) or
has to be (manually) configured on both repositories if we don't want to persist it.

> just changing the name doesn't make it better.

Sure. In fact I didn't change it to make it better in terms of security, just because it was a better name
for it :-)

> if storing the key is always data-store specific (such as a special file in the FileDataStore, which is by
far the most important data store, not sure if anyone is using the slow database data store), you only need a
getSecret() method on the AbstractDataStore class

OK, but it would probably be better to expose a default way of getting such a secret / referenceKey, with just an
abstract method which knows how to store the reference key value given its identifier. A subclass of
AbstractDataStore may then also choose to override the reference key identifier or value.
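
The shape proposed here could look roughly like the sketch below. It is only an illustration under stated assumptions: SketchDataStore and MapBackedSketchDataStore are hypothetical names (not the real AbstractDataStore), the default identifier "reference.key" is invented, and a separate load method is added next to the store method so that the example is self-contained.

```java
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, not the real AbstractDataStore.
abstract class SketchDataStore {

    // Identifier under which the key is persisted; subclasses may override it.
    protected String referenceKeyIdentifier() {
        return "reference.key";
    }

    // Default key value: 256 random bits; subclasses may override this too.
    protected byte[] generateReferenceKey() {
        byte[] key = new byte[32];
        new SecureRandom().nextBytes(key);
        return key;
    }

    // Store-specific persistence, the only parts a subclass must supply.
    protected abstract byte[] loadReferenceKey(String identifier);

    protected abstract void storeReferenceKey(String identifier, byte[] key);

    // Default way of getting the key: reuse the stored one, else create and store it.
    public final synchronized byte[] getReferenceKey() {
        String id = referenceKeyIdentifier();
        byte[] key = loadReferenceKey(id);
        if (key == null) {
            key = generateReferenceKey();
            storeReferenceKey(id, key);
        }
        return key.clone();
    }

    public static void main(String[] args) {
        // One map plays the role of a data store shared by two repositories.
        Map<String, byte[]> shared = new HashMap<>();
        byte[] a = new MapBackedSketchDataStore(shared).getReferenceKey();
        byte[] b = new MapBackedSketchDataStore(shared).getReferenceKey();
        System.out.println("same key on both instances: " + Arrays.equals(a, b));
    }
}

// Toy subclass that keeps the key in a map handed in by the caller.
final class MapBackedSketchDataStore extends SketchDataStore {
    private final Map<String, byte[]> shared;

    MapBackedSketchDataStore(Map<String, byte[]> shared) {
        this.shared = shared;
    }

    @Override
    protected byte[] loadReferenceKey(String identifier) {
        return shared.get(identifier);
    }

    @Override
    protected void storeReferenceKey(String identifier, byte[] key) {
        shared.put(identifier, key.clone());
    }
}
```

With this split, two instances pointed at the same backing store end up with the same key without any configuration, which is the sharing case discussed above.
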

                
> Efficient copying of binaries across repositories with the same data store

Francesco Mari | 17 May 2013 10:10

[PATCH] Performance tests for JCR-3382

Hi all,

I attached a patch to JCR-3382 containing two performance tests. The tests are:
- TreeRandomReadTest: performs 1,000 reads on unique paths using a new
Session, with a clear cache for ItemManager.
- RepeatedTreeRandomReadTest: performs 1,000 reads on unique paths
using an existing Session, thus accessing items from the cache in
ItemManager.

I compared results for 2.4 and 2.6 and it looks like there is no
performance loss due to the patch.

Francesco Mari (JIRA) | 16 May 2013 19:33

[Updated] (JCR-3382) ItemManager.getNode does not do a permission check when the item data is in the item manager cache


     [
https://issues.apache.org/jira/browse/JCR-3382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Francesco Mari updated JCR-3382:
--------------------------------

    Attachment: performance.patch

Added performance tests.

> ItemManager.getNode does not do a permission check when the item data is in the item manager cache
> --------------------------------------------------------------------------------------------------
>
>                 Key: JCR-3382
>                 URL: https://issues.apache.org/jira/browse/JCR-3382
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>    Affects Versions: 2.6
>            Reporter: Unico Hommes
>            Assignee: Unico Hommes
>             Fix For: 2.6.1, 2.7
>
>         Attachments: JCR-3382.patch, performance.patch
>
>
> Read access should be checked irrespective of whether the item data is in the cache or not. Something might
have changed between first reading the node and reading the node again that impacts read access.
> We are running into the situation where node.hasNode() returns false for a node that is in the item manager
cache and for which access was revoked but node.getNode() returns the node anyway. So node.hasNode is
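
The behaviour asked for in the quoted issue description (check read access on every lookup, whether or not the item is cached) can be illustrated with a small sketch. This is not the ItemManager implementation: the class, its string-keyed maps and the grant/revoke methods are hypothetical stand-ins chosen to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical illustration; not Jackrabbit's ItemManager.
final class CheckedItemCacheSketch {
    private final Map<String, String> backingStore = new HashMap<>();
    private final Map<String, String> cache = new HashMap<>();
    private final Set<String> readablePaths = new HashSet<>();

    void put(String path, String data) {
        backingStore.put(path, data);
    }

    void grantRead(String path) {
        readablePaths.add(path);
    }

    void revokeRead(String path) {
        readablePaths.remove(path);
    }

    // The permission check runs before the cache is consulted, so a cached
    // item is never returned after read access has been revoked.
    Optional<String> getNode(String path) {
        if (!readablePaths.contains(path)) {
            return Optional.empty();
        }
        String data = cache.get(path);
        if (data == null) {
            data = backingStore.get(path);
            if (data != null) {
                cache.put(path, data);
            }
        }
        return Optional.ofNullable(data);
    }

    // hasNode is defined in terms of getNode, so the two cannot disagree.
    boolean hasNode(String path) {
        return getNode(path).isPresent();
    }

    public static void main(String[] args) {
        CheckedItemCacheSketch items = new CheckedItemCacheSketch();
        items.put("/a", "data");
        items.grantRead("/a");
        items.getNode("/a");          // first read populates the cache
        items.revokeRead("/a");
        System.out.println("hasNode after revoke: " + items.hasNode("/a"));
        System.out.println("getNode after revoke present: " + items.getNode("/a").isPresent());
    }
}
```
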

Alexander Klimetschek (JIRA) | 16 May 2013 19:09

[Comment Edited] (JCR-3534) Efficient copying of binaries across repositories with the same data store


    [
https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659715#comment-13659715
] 

Alexander Klimetschek edited comment on JCR-3534 at 5/16/13 5:08 PM:
---------------------------------------------------------------------

@Tommaso: if storing the key is always data-store specific (such as a special file in the FileDataStore,
which is by far the most important data store, not sure if anyone is using the slow database data store), you
only need a getSecret() method on the AbstractDataStore class. See my post from above: https://issues.apache.org/jira/browse/JCR-3534?focusedCommentId=13653621&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13653621

      was (Author: alexander.klimetschek):
     @Tommaso: if the key is always stored data-store specific (such as a special file in the FileDataStore,
which is by far the most important data store, not sure if anyone is using the slow database data store), you
only need a getSecret() method on the AbstractDataStore class. See my post from above: https://issues.apache.org/jira/browse/JCR-3534?focusedCommentId=13653621&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13653621

> Efficient copying of binaries across repositories with the same data store
> --------------------------------------------------------------------------
>
>                 Key: JCR-3534
>                 URL: https://issues.apache.org/jira/browse/JCR-3534
>             Project: Jackrabbit Content Repository
>          Issue Type: New Feature
>          Components: jackrabbit-api, jackrabbit-core
>    Affects Versions: 2.6
>            Reporter: Felix Meschberger
>            Assignee: Tommaso Teofili
>         Attachments: JCR-3534.2.patch, JCR-3534.3.patch, JCR-3534.4.patch, JCR-3534.6.patch,
JCR-3534.patch, JCR-3534.patch

Alexander Klimetschek (JIRA) | 16 May 2013 19:09

[Commented] (JCR-3534) Efficient copying of binaries across repositories with the same data store


    [
https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659715#comment-13659715
] 

Alexander Klimetschek commented on JCR-3534:
--------------------------------------------

@Tommaso: if the key is always stored data-store specific (such as a special file in the FileDataStore,
which is by far the most important data store, not sure if anyone is using the slow database data store), you
only need a getSecret() method on the AbstractDataStore class. See my post from above: https://issues.apache.org/jira/browse/JCR-3534?focusedCommentId=13653621&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13653621

> Efficient copying of binaries across repositories with the same data store
> --------------------------------------------------------------------------
>
>                 Key: JCR-3534
>                 URL: https://issues.apache.org/jira/browse/JCR-3534
>             Project: Jackrabbit Content Repository
>          Issue Type: New Feature
>          Components: jackrabbit-api, jackrabbit-core
>    Affects Versions: 2.6
>            Reporter: Felix Meschberger
>            Assignee: Tommaso Teofili
>         Attachments: JCR-3534.2.patch, JCR-3534.3.patch, JCR-3534.4.patch, JCR-3534.6.patch,
JCR-3534.patch, JCR-3534.patch
>
>
> we have a couple of use cases, where we would like to leverage the global data store to prevent sending
around and copying around large binary data unnecessarily: We have two separate Jackrabbit instances
configured to use the same DataStore (for the sake of this discussion assume we have the problems of

angela (JIRA) | 16 May 2013 17:25

[Commented] (JCR-3534) Efficient copying of binaries across repositories with the same data store


    [
https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13659633#comment-13659633
] 

angela commented on JCR-3534:
-----------------------------

yes, anyone that has access to the filesystem is admin... that's according to our threat model.
what i am saying is that we should not have a plain txt referenceKey/secret stored anywhere. just changing
the name doesn't make it better.

> Efficient copying of binaries across repositories with the same data store
> --------------------------------------------------------------------------
>
>                 Key: JCR-3534
>                 URL: https://issues.apache.org/jira/browse/JCR-3534
>             Project: Jackrabbit Content Repository
>          Issue Type: New Feature
>          Components: jackrabbit-api, jackrabbit-core
>    Affects Versions: 2.6
>            Reporter: Felix Meschberger
>            Assignee: Tommaso Teofili
>         Attachments: JCR-3534.2.patch, JCR-3534.3.patch, JCR-3534.4.patch, JCR-3534.6.patch,
JCR-3534.patch, JCR-3534.patch
>
>
> we have a couple of use cases, where we would like to leverage the global data store to prevent sending
around and copying around large binary data unnecessarily: We have two separate Jackrabbit instances

Tommaso Teofili (JIRA) | 16 May 2013 16:07

[Updated] (JCR-3534) Efficient copying of binaries across repositories with the same data store


     [
https://issues.apache.org/jira/browse/JCR-3534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tommaso Teofili updated JCR-3534:
---------------------------------

    Attachment: JCR-3534.6.patch

here's the trivial patch which reflects the approach from my comment

> Efficient copying of binaries across repositories with the same data store
> --------------------------------------------------------------------------
>
>                 Key: JCR-3534
>                 URL: https://issues.apache.org/jira/browse/JCR-3534
>             Project: Jackrabbit Content Repository
>          Issue Type: New Feature
>          Components: jackrabbit-api, jackrabbit-core
>    Affects Versions: 2.6
>            Reporter: Felix Meschberger
>            Assignee: Tommaso Teofili
>         Attachments: JCR-3534.2.patch, JCR-3534.3.patch, JCR-3534.4.patch, JCR-3534.6.patch,
JCR-3534.patch, JCR-3534.patch
>
>
> we have a couple of use cases, where we would like to leverage the global data store to prevent sending
around and copying around large binary data unnecessarily: We have two separate Jackrabbit instances
configured to use the same DataStore (for the sake of this discussion assume we have the problems of
concurrent access and garbage collection under control). When sending content from one instance to the

