Alexandru Popescu | 1 Sep 01:50 2006

Re: OutOfMemory - adding lots of nodes in one session

#: Nicolas changed the world a bit at a time by saying (astral date: 9/1/2006 1:10 AM) :#
> 2 more ideas:
> 
> 1/ Did you try using a memory profiler so we can know what is wrong?
> 
> 2/ What happens if you logout after say 100 updates?
> 
> 
> a+
> Nico
> my blog! http://www.deviant-abstraction.net !!
> 

I may be wrong, but even if you save after each node creation, the new nodes will be kept in memory
as a cache. However, I would expect it to last for more than 100 nodes. Even if you use an RDBMS as
the persistence solution, be aware that you are serializing an object, so you can still hit
size problems. Some time ago I (with Stefan's help) computed a rough maximum number of flat
nodes for the default persistence configuration.

Can you increase the JVM heap size and see if this number changes?

./alex
--
.w( the_mindstorm )p.

Michael Neale | 1 Sep 10:42 2006

Re: OutOfMemory - adding lots of nodes in one session

1:
Yeah, I use JProfiler - top of the charts with a bullet was:
org.apache.jackrabbit.util.WeakIdentityCollection$WeakRef (aha! that would
explain the performance sluggishness when GC has to kick in late in the piece),
followed by:
org.apache.derby.impl.store.raw.data.StoredRecordHeader
and of course a whole lot of byte[].

I am using default everything (which means Derby) and no blobs whatsoever
(so all in the database).

2:
If I log out and start with everything fresh, it seems to continue fine (i.e.
at a fast enough pace), but I haven't really pushed it to where I wanted to
get it (10,000 child nodes).

Responding to Alexandru's email (hi Alex, nice work on InfoQ if I remember
correctly! I am a fan): it would seem that the Session keeps most things in
memory, which I can understand.

I guess my problem is that I am trying to load up the system to test, quite
simply, that it scales to the numbers I know I need it to scale to, but I am
having trouble getting the data in, bulk-load wise. If I bump up the memory,
it certainly seems to hum along better, but if the Session is keeping a lot
around, then this will have limits - is there no way to "clear" the session?

Perhaps I will explain what I am using JCR for (feel free to smack me down
if this is not what JCR and Jackrabbit were ever intended for):
I am storing "atomic business rules" (which means each node is a small
(Continue reading)

Marcel Reutegger | 1 Sep 11:11 2006

Re: OutOfMemory - adding lots of nodes in one session

Michael Neale wrote:
> It screams through for about the first 100 nodes, but then slows down in
> that familiar exponential way, generally failing around 500 nodes. If I
> start it up again, and continue adding more, it powers on just fine.

Can you please clarify what exactly your code does?

I understand there is a node somewhere in the workspace under which you start
to add child nodes, and each child node has the couple of properties you
mentioned.

Do you save the session whenever you add one child node, or only after a
certain number of child nodes?

What exception is thrown when you reach the ~500 child nodes?

regards
  marcel

Nicolas | 1 Sep 11:28 2006

Re: OutOfMemory - adding lots of nodes in one session

Hi,

It seems we understand what the issue is now. It is linked to the session
keeping nodes in memory. It might also be linked to the large number of
children under the root node. That seems odd, though, since I think JR has
been stress-tested on this kind of issue. Can someone confirm this, please?

If my guess is correct, changing the PM wouldn't change anything. So... can
you please try your code with an FS-based PM?

For now, you have a temporary solution - resetting the session - but it is
definitely not a good one, so we should keep digging into your issue.

BR
Nico
my blog! http://www.deviant-abstraction.net !!
Michael Neale | 1 Sep 12:30 2006

Re: OutOfMemory - adding lots of nodes in one session

Hi Marcel.

Yeah, each iteration adds a node, saves the session, checks in the node, and
then moves on to the next (random data, several properties).
If I bump up the memory, it certainly works a lot better (-Xmx300M). It's
really quite basic.

The error is simply an OutOfMemoryError that happens somewhere between 400
and 700 nodes.

I will try the other persistence manager test; probably a good idea to see
if it is related to the PM implementation for Derby (probably not). I will
post back the results.

From what people are saying, it doesn't sound like I am trying to do
anything too weird, and there have been some tests done in the past (if
anyone can point me to them, it would be appreciated).

On 9/1/06, Marcel Reutegger <marcel.reutegger <at> gmx.net> wrote:
>
> Michael Neale wrote:
> > It screams through for about the first 100 nodes, but then slows down in
> > that familiar exponential way, generally failing around 500 nodes. If I
> > start it up again, and continue adding more, it powers on just fine.
>
> can you please clarify what exactly your code does?
>
> I understand there is node somewhere in the workspace where you start
> to add child nodes. each child nodes has a couple of properties you
> mentioned.
(Continue reading)

Alexandru Popescu | 1 Sep 12:38 2006

Re: OutOfMemory - adding lots of nodes in one session

On 9/1/06, Michael Neale <michael.neale <at> gmail.com> wrote:
> Hi Marcel.
>
> Yeah, each iteration it adds a node, saves the session, checks-in the node,
> and then moves on to the next (random data, several properties).
> If I bump up the memory, it certainly works a lot better (-Xmx300M). Its
> really quite basic.
>

From what I have discussed in the past, the creation of a Session should be
a pretty lightweight process, so you still have the workaround of logging
out after processing each batch.
(I am not saying that this is good.)

> The error is simple an OutMemoryException that happens between  400 and 700
> nodes.
>
> I will try the other persistence manager test, probably a good idea to see
> if it is related to the implementation PM  for Derby (probably not). I will
> post back the results.
>

No, I don't think it is related to the PMs, but rather to the Session
implementation, which caches everything it has accessed (I may be a little
wrong about this, but that's what I remember).

> From what people are saying, it doesn't sound like I am trying to do
> anything too wierd, and that there have been some tests done in the past
> (any one that can point me to them it would be appreciated).
>
(Continue reading)

Jukka Zitting | 1 Sep 13:03 2006

Re: OutOfMemory - adding lots of nodes in one session

Hi,

On 9/1/06, Michael Neale <michael.neale <at> gmail.com> wrote:
> From what people are saying, it doesn't sound like I am trying to do
> anything too wierd, and that there have been some tests done in the past
> (any one that can point me to them it would be appreciated).

I'm not sure whether very good public test data is available, but I've used
a custom XML importer that does a Session.save() every thousand nodes to
import hundreds of thousands of nodes in a single session (even over
JCR-RMI), so I really don't think your problem is an inherent limitation in
Jackrabbit.

BR,

Jukka Zitting

-- 
Yukatan - http://yukatan.fi/ - info <at> yukatan.fi
Software craftsmanship, JCR consulting, and Java development

Michael Neale | 1 Sep 13:08 2006

Re: OutOfMemory - adding lots of nodes in one session

That's good to hear (from Jukka and Alex).

Jukka, were you loading nodes into a flat(ish) structure, or was there a
deep hierarchy?

(PS: I will work on a self-contained example to try to reproduce it over the
weekend. If it is the session caching, it may be possible to patch it for my
specific case - the benefits of open source!)

On 9/1/06, Jukka Zitting <jukka.zitting <at> gmail.com> wrote:
>
> Hi,
>
> On 9/1/06, Michael Neale <michael.neale <at> gmail.com> wrote:
> > From what people are saying, it doesn't sound like I am trying to do
> > anything too wierd, and that there have been some tests done in the past
> > (any one that can point me to them it would be appreciated).
>
> I'm not sure if there are very good public test data available, but
> I've used a custom XML importer that does a Session.save() every
> thousand nodes to import hundredths of thousands of nodes in a single
> session (even over JCR-RMI), so I really don't think your problem is
> an inherent limitation in Jackrabbit.
>
> BR,
>
> Jukka Zitting
>
> --
> Yukatan - http://yukatan.fi/ - info <at> yukatan.fi
(Continue reading)

Jukka Zitting | 1 Sep 13:18 2006

Re: OutOfMemory - adding lots of nodes in one session

Hi,

On 9/1/06, Michael Neale <michael.neale <at> gmail.com> wrote:
> Jukka, where you loading nodes into a flat(ish) structure? or was there a
> deep hierarchy?

I was using the importer with a DocBook-like XML schema, so it's not too
flat but not too deep either - something like 10-100 children per node.

BR,

Jukka Zitting

-- 
Yukatan - http://yukatan.fi/ - info <at> yukatan.fi
Software craftsmanship, JCR consulting, and Java development

Ducrocq Christophe | 1 Sep 14:30 2006

question about save Node

Hi,

I have a parent node which contains several children. I want to save one
child, which may potentially be a new child, without saving all the other
children associated with the parent.

I tested with the save() method on an Item:

Sample:
	Node parent = root.getNode("hello");
	Node child_1 = parent.getNode("world");
	child_1.setProperty("message", "hello world");

	Node child_2 = parent.addNode("newNode");
	child_2.setProperty("message", "test");

If I do child_1.save(), I get a javax.jcr.RepositoryException:
javax.jcr.RepositoryException: /hello/newNode: cannot save a new item.
	at
org.apache.jackrabbit.core.ItemImpl.getTransientStates(ItemImpl.java:335
)
	at org.apache.jackrabbit.core.ItemImpl.save(ItemImpl.java:1054)
	at
net.atos.wl.fwk.rules.repository.TestJsr.main(TestJsr.java:52)

But if I do parent.save(), I save all the children of the parent (which is
not what I want).

On which node should I call the save() method?
(Continue reading)

