Mark | 1 Aug 2010 02:13

Re: Columns limit

So have the TimeUUID as the key?

SearchLogs : {
     TimeUUID_1 : { metadata goes here},
     TimeUUID_2 : { metadata goes here},
     TimeUUID_3 : { metadata goes here},
     ...
}

On 7/31/10 3:42 PM, Benjamin Black wrote:
> The proper way to handle this is to have a row per time interval such
> that the number of columns per row is constrained.
>
> On Thu, Jul 29, 2010 at 2:39 PM, Mark<static.void.dev <at> gmail.com>  wrote:
>    
>> Are there any limitations on the number of columns a row can have? Does all
>> the data for a single key need to reside on a single host? If so, wouldn't
>> that mean there is an implicit limit on the number of columns one can
>> have... i.e. the disk size of that machine.
>>
>> What is the proper way to handle timelines in this matter? For example, let's
>> say I wanted to store all user searches in a super column.
>>
>> <ColumnFamily Name="SearchLogs"
>>                     ColumnType="Super"
>>                     CompareWith="TimeUUIDType"
>>                     CompareSubcolumnsWith="BytesType"/>
>>
>> Which results in a structure as follows
>> {

Dan Washusen | 1 Aug 2010 04:16

Re: Upgrading to Cassandra 0.7 Thrift Erlang

Slightly off topic but still related (java instead of erlang).  I just tried using the latest trunk build available on Hudson (2010-07-31_12-31-29) and I'm getting lock ups.


The same code (without the framed transport) was working with a build from 2010-07-07_13-32-16.

I'm connecting using the following:
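// Wrap the raw socket in a framed transport and speak the binary protocol over it
TSocket socket = new TSocket(node, port);
transport = new TFramedTransport(socket);
protocol = new TBinaryProtocol(transport);
client = new Cassandra.Client(protocol);

// Open the connection before issuing any calls
transport.open();

// set the keyspace on the client and do get slice stuff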
 

The locked up thread looks like:
"main" prio=5 tid=101801000 nid=0x100501000 runnable [1004fe000]
   java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
- locked <1093daa10> (a java.io.BufferedInputStream)
at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
at org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:369)
at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:295)
at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:202)
at org.apache.cassandra.thrift.Cassandra$Client.recv_get_slice(Cassandra.java:542)
at org.apache.cassandra.thrift.Cassandra$Client.get_slice(Cassandra.java:524)
 

On 28 July 2010 17:43, J T <jt4websites <at> googlemail.com> wrote:
Hi,

That fixed the problem!

I added the Framed option and like magic things have started working again.

Example:

thrift_client:start_link("localhost", 9160, cassandra_thrift, [ { framed, true } ] )

JT.



On Tue, Jul 27, 2010 at 10:04 PM, Jonathan Ellis <jbellis <at> gmail.com> wrote:
trunk is using framed thrift connections by default now (was unframed)
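
For readers unfamiliar with the difference: a framed Thrift connection prefixes every message with a 4-byte big-endian length, while an unframed connection sends the message bytes bare, so a client and server that disagree about framing misread each other's bytes. The following is only a minimal sketch of length-prefixed framing using java.nio; the class and method names are made up for illustration and this is not the Thrift library's own code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class FramingSketch {
    // Build one frame: a 4-byte big-endian length followed by the payload bytes.
    public static byte[] frame(byte[] payload) {
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length)   // ByteBuffer is big-endian by default
                .put(payload)
                .array();
    }

    // Read one frame back: first the length header, then exactly that many bytes.
    public static byte[] unframe(byte[] wire) {
        ByteBuffer buf = ByteBuffer.wrap(wire);
        int length = buf.getInt();
        byte[] payload = new byte[length];
        buf.get(payload);
        return payload;
    }

    public static void main(String[] args) {
        byte[] wire = frame("set_keyspace".getBytes(StandardCharsets.UTF_8));
        System.out.println(wire.length); // 16: 4 header bytes + 12 payload bytes
        System.out.println(new String(unframe(wire), StandardCharsets.UTF_8)); // set_keyspace
    }
}
```

A reader that expects the header will wait for a length that an unframed peer never sends, which is one way a mismatch can show up as a closed or stalled connection.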

On Tue, Jul 27, 2010 at 11:33 AM, J T <jt4websites <at> googlemail.com> wrote:
> Hi,
> I just tried upgrading a perfectly working Cassandra 0.6.3 to Cassandra 0.7
> and am finding that even after re-generating the erlang thrift bindings that
> I am unable to perform any operation.
> I can get a connection but if I try to login or set the keyspace I get a
> report from the erlang bindings to say that the connection is closed.
> I then tried upgrading to a later version of thrift but still get the same
> error.
> e.g.
> (zotonic3989@127.0.0.1)1> thrift_client:start_link("localhost", 9160, cassandra_thrift).
> {ok,<0.327.0>}
> (zotonic3989@127.0.0.1)2> {ok,C}=thrift_client:start_link("localhost", 9160, cassandra_thrift).
> {ok,<0.358.0>}
> (zotonic3989@127.0.0.1)3> thrift_client:call( C, set_keyspace, [ <<"Test">> ]).
> =ERROR REPORT==== 27-Jul-2010::03:48:08 ===
> ** Generic server <0.358.0> terminating
> ** Last message in was {call,set_keyspace,[<<"Test">>]}
> ** When Server state == {state,cassandra_thrift,
>                          {protocol,thrift_binary_protocol,
>                           {binary_protocol,
>                            {transport,thrift_buffered_transport,<0.359.0>},
>                            true,true}},
>                          0}
> ** Reason for termination ==
> ** {{case_clause,{error,closed}},
>     [{thrift_client,read_result,3},
>      {thrift_client,catch_function_exceptions,2},
>      {thrift_client,handle_call,3},
>      {gen_server,handle_msg,5},
>      {proc_lib,init_p_do_apply,3}]}
> ** exception exit: {case_clause,{error,closed}}
>      in function  thrift_client:read_result/3
>      in call from thrift_client:catch_function_exceptions/2
>      in call from thrift_client:handle_call/3
>      in call from gen_server:handle_msg/5
>      in call from proc_lib:init_p_do_apply/3
> The cassandra log seems to indicate that a connection has been made
> (although that's only apparent from a TRACE log message saying that a logout
> has been done).
> The cassandra-cli program is able to connect and function normally so I can
> only assume that there is a problem with the erlang bindings.
> Has anyone else had any success using 0.7 from Erlang ?
> JT.



--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Dan Washusen | 1 Aug 2010 04:32

Re: Upgrading to Cassandra 0.7 Thrift Erlang

p.s. If I set thrift_framed_transport_size_in_mb to 0 and just use TSocket instead of TFramedTransport everything works as expected...




Claire Chang | 1 Aug 2010 04:46

Re: NullPointerException and Java client hang

I was able to reproduce this exception consistently. Then I realized that /etc/default/cassandra contained:

JVM_START_MEM="256"

It was set to 256 (bytes); before, it was 128M. I changed it to 256M and made sure that the -Xmx and -Xms settings in
cassandra.in.sh and /etc/init.d/cassandra are all in sync.

After applying these changes, the exception no longer occurs in my 3 node cluster. BTW, my Xmx is set to 4G.

Hope this helps.

Claire
On Jul 30, 2010, at 1:05 PM, Peter Schuller wrote:

>> I am getting following exception:
>> 
>> java.lang.NullPointerException
>>         at org.apache.cassandra.db.Table.apply(Table.java:407)
> 
> Are you triggering this repeatedly without difficulty?
> 
> Can you run with the attached patch (indentation is messed up in the
> patch though - sorry, no time to fix it now)? Hopefully it should emit
> (at level error) in the log a message indicating it failed to obtain a
> column family store, followed by a list of all known column family
> stores, prior to bailing with the same exception.
> 
> Is this happening on every node or just one? When did this start - did
> it start right after a schema change (keyspace addition)?
> 
> (I'm just grasping at straws based on a cursory examination; I may be
> barking up the wrong tree completely.)
> 
> -- 
> / Peter Schuller
> <cassandra-0.6.3-missingcfslog.patch>

Benjamin Black | 1 Aug 2010 06:32

Re: Columns limit

Have the TimeUUID as the key, and then index rows named for the time
intervals, each containing columns with TimeUUID names giving the data
in those intervals.
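
A minimal sketch of the bucketing idea, in Java. The hourly interval, the "searchlogs" key prefix and the class name are assumptions chosen for illustration, and nothing here talks to Cassandra. It only shows how an index row key can be derived from an event timestamp so that each index row collects the TimeUUID columns for one interval.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

public class TimeBuckets {
    // Hypothetical key layout: one index row per UTC hour, e.g. "searchlogs:2010080104".
    private static final DateTimeFormatter HOUR =
            DateTimeFormatter.ofPattern("yyyyMMddHH").withZone(ZoneOffset.UTC);

    // Map an event time to the key of the index row covering its hour.
    public static String rowKey(String prefix, Instant eventTime) {
        return prefix + ":" + HOUR.format(eventTime.truncatedTo(ChronoUnit.HOURS));
    }

    public static void main(String[] args) {
        // Two events in the same hour share an index row; the next hour starts a new row.
        System.out.println(rowKey("searchlogs", Instant.parse("2010-08-01T04:16:00Z"))); // searchlogs:2010080104
        System.out.println(rowKey("searchlogs", Instant.parse("2010-08-01T04:59:59Z"))); // searchlogs:2010080104
        System.out.println(rowKey("searchlogs", Instant.parse("2010-08-01T05:00:00Z"))); // searchlogs:2010080105
    }
}
```

The interval should be picked so that the busiest interval still yields a comfortably sized row; a shorter interval means more, smaller index rows.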

On Sat, Jul 31, 2010 at 5:13 PM, Mark <static.void.dev <at> gmail.com> wrote:
> So have the TimeUUID as the key?
>
> SearchLogs : {
>    TimeUUID_1 : { metadata goes here},
>    TimeUUID_2 : { metadata goes here},
>    TimeUUID_3 : { metadata goes here},
>    ...
> }
>
> On 7/31/10 3:42 PM, Benjamin Black wrote:
>>
>> The proper way to handle this is to have a row per time interval such
>> that the number of columns per row is constrained.
>>
>> On Thu, Jul 29, 2010 at 2:39 PM, Mark<static.void.dev <at> gmail.com>  wrote:
>>
>>>
>>> Are there any limitations on the number of columns a row can have? Does all
>>> the data for a single key need to reside on a single host? If so, wouldn't
>>> that mean there is an implicit limit on the number of columns one can
>>> have... i.e. the disk size of that machine.
>>>
>>> What is the proper way to handle timelines in this matter? For example, let's
>>> say I wanted to store all user searches in a super column.
>>>
>>> <ColumnFamily Name="SearchLogs"
>>>                    ColumnType="Super"
>>>                    CompareWith="TimeUUIDType"
>>>                    CompareSubcolumnsWith="BytesType"/>
>>>
>>> Which results in a structure as follows
>>> {
>>>   SearchLogs : {
>>>       "foo" : {
>>>            timeuuid_1 : { metadata goes here}
>>>            timeuuid_2: { metadata goes here}
>>>       },
>>>       "bar" : {
>>>            timeuuid_1 : { metadata goes here}
>>>            timeuuid_2: { metadata goes here}
>>>       }
>>>  }
>>> }
>>>
>>> Couldn't this theoretically run out of columns for the same search term
>>> because for each unique term there can (and will) be many timeuuid
>>> columns?
>>>
>>> Thanks for clearing this up for me.
>>>
>>>
>>>
>
>

Animesh Kumar | 1 Aug 2010 07:22

RE: kundera: Open source JPA 1.0 compliant ORM for Cassandra

Hi Michael,

 

We haven’t tried Kundera with 0.7 beta yet. However, Kundera runs fine with 0.6.3.

 

-Animesh

 

From: Michael Widmann [mailto:michael.widmann <at> gmail.com]
Sent: Saturday, July 31, 2010 9:02 PM
To: user <at> cassandra.apache.org
Subject: Re: kundera: Open source JPA 1.0 compliant ORM for Cassandra

 

Hi

could we run kundera on 0.7beta Version?

Thanks for answer

Michael

2010/7/31 Sanjay Sharma <sanjay.sharma <at> impetus.co.in>

Hi All,

We are happy to announce and share a new ORM over Cassandra – kundera

The project is Apache licensed and hosted at http://kundera.googlecode.com

 

The project uses custom Cassandra Annotations and is fully JPA 1.0 compliant. @ColumnFamily and @SuperColumnFamily are the main Cassandra specific annotations.

 

Search/Indexing is automatically included by using “Lucandra” and drives the JPA-QL query support. Use of Lucandra also enables users to write Lucene queries along with JPA-QL queries.

 

As per the main author of kundera – Animesh -“ The idea behind Kundera is to make working with Cassandra drop-dead simple and fun. Kundera does not reinvent the wheel by making another client library; rather it leverages the existing libraries and builds - on top of them - a wrap-around API to help developers do away with unnecessary boiler plate codes, and program a neater-and-cleaner code that reduces code-complexity and improves quality. And above all, improves productivity.”

 

The current implementation uses the versatile “Pelops” library as the underlying client API and plans are to add support for Hector and Thrift clients as well.

 

Here is a sample kundera Entity bean -

@Entity
@ColumnFamily(keyspace = "Keyspace1", family = "SimpleComment")
public class SimpleComment {

    @Id
    private String id;

    @Column(name = "userId")
    private String userId;

    @Column(name = "comment")
    private String commentText;

    ......
}

JPA queries are as simple as-

        Query query = entityManager.createQuery("SELECT c from SimpleComment c where userId='me'");
        List<SimpleComment> list = query.getResultList();

 

There is already support for Spring based persistence integration like the good old Spring+Hibernate integration and is as simple as this-

    <bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
        <property name="persistenceUnitName" value="myPersistenceUnit"/>
    </bean>

More examples are available in kundera’s wiki and Animesh’s blogs. The spring integration example is here

 

Regards,

Sanjay Sharma

iLabs, Impetus

 

Impetus is sponsoring 'Hadoop India User Group Meet Up'- a technology un-conference on July 31, 2010 at Impetus Office, Noida. The event will shed light on Hadoop technology and channelized efforts to develop an active Hadoop community.

Click http://www.impetus.com/ to know more. Follow our updates on www.twitter.com/impetuscalling .


NOTE: This message may contain information that is confidential, proprietary, privileged or otherwise protected by law. The message is intended solely for the named addressee. If received in error, please destroy and notify the sender. Any use of this email is prohibited when received in error. Impetus does not represent, warrant and/or guarantee, that the integrity of this communication has been maintained nor that the communication is free of errors, virus, interception or interference.




--
bayoda.com - Professional Online Backup Solutions for Small and Medium Sized Companies


james anderson | 1 Aug 2010 09:00

? how to do protocol version negotiation

good morning;

is there somewhere a description of - or code which does, protocol  
version negotiation? particularly client code, across the 0.6/0.7  
transition.

all pointers would be welcome.

il-woon Ahn | 1 Aug 2010 10:04

Two questions : Server crash during compaction and UnavailableException

Hi all

I'm currently running Cassandra under high read/write loads.

I have two questions:

First, Cassandra suddenly dies during compaction. The Java core dump says that the last thread running was "COMPACTION-POOL:1".
I suspect that my business logic could cause the columns of a single row in a column family to grow beyond two gigabytes (but I haven't been able to confirm this yet).
Could this be the cause of the server going down, and is there any solution? (Should I wait for 0.7?)

Second, my client program often seems to get UnavailableException from Cassandra while Cassandra is running normally.
Is it possible to get such an exception under high read/write loads even if Cassandra is up?
I handle this by retrying the client's request after 1 second, the same way I handle TimedOutException. Is there any other recommended reaction?

Thanks in advance.


Regards
Ilun
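
The retry-after-a-delay handling described in the second question can be sketched roughly as below. This is a generic helper written against java.util.function.Supplier, not against the Thrift client; the class name, attempt count and delay are illustrative assumptions. Thrift's UnavailableException and TimedOutException are checked exceptions, so a real caller would wrap them, and would retry only on those rather than on every runtime exception as this sketch does.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class RetrySketch {
    // Run the call; on failure wait delayMillis and try again, up to maxAttempts in total.
    public static <T> T withRetry(Supplier<T> call, int maxAttempts, long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e; // remember the failure and fall through to the delay
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to retry", ie);
                }
            }
        }
        throw last; // every attempt failed; surface the last error
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Simulated request that fails twice before succeeding.
        String result = withRetry(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new IllegalStateException("unavailable");
            }
            return "ok";
        }, 5, 10L);
        System.out.println(result + " after " + calls.get() + " attempts"); // ok after 3 attempts
    }
}
```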

Ran Tavory | 1 Aug 2010 11:54

What's using my memory?

I know this subject has been discussed in the past on the list and I've read through all discussions but I haven't been able to find a solution to the memory problems listed below... so here again...

It seems that the cassandra cluster I'm using is either leaking memory or just using more mem than I expected it to use.
Each host in the ring uses about 12G of ram while in some cases its entire dataset is only 1.5G (take for example .252.124 below with 1.54G)
I use extensive row caching so I expect memory consumption to be >= 1.5G but I don't understand why it gets up to 12G. Most of the time I don't care so much, since I have plenty of memory; however, at times this gets me into GC storms and very slow responses. Also, I'd like to be able to load more data to the cluster and I'm hitting the memory wall, which I didn't expect.

In the cassandra.in.sh you'd notice that I do provide Xmx=12G but given that there's so little data I wouldn't expect the process to be using all of that. As a matter of fact I wanted to insert more data to the cluster but I stopped since it wasn't handling the load very well. 

I suppose that at the end of the day I only need to know which knobs to configure, but after having played with the configuration for a long time I'm a little lost.

I'm running a 0.6.2 cluster consisting of 6 physical hosts (some with 16G and some 32G ram) distributed b/w two DCs. 
RF is 2 (one replica in each DC).
HH is turned off.
File access is standard (no m-mapped files, I tried that and the system just kept swapping itself to death so I switched back to normal).

I've pasted below the output of nodetool ring and cfstats as well as some vmstat and iostat (not that I think it matters...)
Also jmap -heap and attached is the jmap -histo so I hope this output can help shed some light on memory usage.
Currently the logs don't say anything out of the ordinary so I didn't include them. 

Thanks :)

$ nodetool -h cass99 -p 9004 ring
Address         Status     Load          Range                                      Ring
                                         170141183460469231731687303715884105727
192.168.252.99  Up         6.16 GB       28356863910078205288614550619314017621     |<--|
192.168.252.124 Up         1.54 GB       56713727820156410577229101238628035242     |   ^
192.168.252.125 Up         1.54 GB       85070591730234615865843651857942052863     v   |
192.168.254.57  Up         6.15 GB       113427455640312821154458202477256070485    |   ^
192.168.254.58  Up         1.54 GB       141784319550391026443072753096570088106    v   |
192.168.254.59  Up         1.54 GB       170141183460469231731687303715884105727    |-->|

 <Keyspace Name="outbrain_kvdb">
      <ColumnFamily CompareWith="BytesType" Name="KvImpressions"
                    KeysCached="0"
                    RowsCached="10000000"/>
      <ColumnFamily CompareWith="BytesType" Name="KvAds"
                    KeysCached="0"
                    RowsCached="10000000"/>
      <ColumnFamily CompareWith="BytesType" Name="KvRatings"
                    KeysCached="0"
                    RowsCached="10000000"/>
      <ReplicaPlacementStrategy>org.apache.cassandra.locator.RackAwareStrategy</ReplicaPlacementStrategy>
      <ReplicationFactor>2</ReplicationFactor>
      <EndPointSnitch>org.apache.cassandra.locator.EndPointSnitch</EndPointSnitch>
    </Keyspace>


$ cat bin/cassandra.in.sh 
# Licensed to the Apache Software Foundation (ASF) under one
...
# Arguments to pass to the JVM
JVM_OPTS=" \
        -ea \
        -Xms4G \
        -Xmx12G \
        -XX:+UseParNewGC \
        -XX:+UseConcMarkSweepGC \
        -XX:+CMSParallelRemarkEnabled \
        -XX:SurvivorRatio=8 \
        -XX:MaxTenuringThreshold=1 \
        -XX:+HeapDumpOnOutOfMemoryError \
        -Dcom.sun.management.jmxremote.port=9004 \
        -Dcom.sun.management.jmxremote.ssl=false \
        -Dcom.sun.management.jmxremote.authenticate=false"




Keyspace: outbrain_kvdb
        Read Count: 5608010
        Read Latency: 8.52211627029909 ms.
        Write Count: 42794
        Write Latency: 0.10353956162078796 ms.
        Pending Tasks: 0
  
                Column Family: KvAds
                SSTable count: 11
                Space used (live): 9331647391
                Space used (total): 9331647391
                Memtable Columns Count: 84928
                Memtable Data Size: 21400502
                Memtable Switch Count: 1
                Read Count: 5602705
                Read Latency: 2.023 ms.
                Write Count: 42794
                Write Latency: 0.060 ms.
                Pending Tasks: 0
                Key cache: disabled
                Row cache capacity: 10000000
                Row cache size: 698671
                Row cache hit rate: 0.5535463700149053
                Compacted row minimum size: 391
                Compacted row maximum size: 76890
                Compacted row mean size: 635



top - 10:23:26 up 96 days, 23:04,  1 user,  load average: 5.03, 6.21, 6.08
Tasks:  93 total,   1 running,  92 sleeping,   0 stopped,   0 zombie
Cpu(s): 92.1%us,  4.1%sy,  0.0%ni,  1.8%id,  0.0%wa,  0.5%hi,  1.5%si,  0.0%st
Mem:  16443880k total, 16357676k used,    86204k free,    43448k buffers
Swap:  4194296k total,    13912k used,  4180384k free,  2625024k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                                                                                       
 5757 cassandr  25   0 13.6g  12g 9860 S 197.2 82.3   9445:17 java                     


$ jmap -heap 5757
Attaching to process ID 5757, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 16.3-b01

using parallel threads in the new generation.
using thread-local object allocation.
Concurrent Mark-Sweep GC

Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize      = 12884901888 (12288.0MB)
   NewSize          = 21757952 (20.75MB)
   MaxNewSize       = 43581440 (41.5625MB)
   OldSize          = 65404928 (62.375MB)
   NewRatio         = 7
   SurvivorRatio    = 8
   PermSize         = 21757952 (20.75MB)
   MaxPermSize      = 88080384 (84.0MB)

Heap Usage:
New Generation (Eden + 1 Survivor Space):
   capacity = 39256064 (37.4375MB)
   used     = 6779480 (6.465415954589844MB)
   free     = 32476584 (30.972084045410156MB)
   17.26989236618322% used
Eden Space:
   capacity = 34930688 (33.3125MB)
   used     = 2490360 (2.3749923706054688MB)
   free     = 32440328 (30.93750762939453MB)
   7.12943300744606% used
From Space:
   capacity = 4325376 (4.125MB)
   used     = 4289120 (4.090423583984375MB)
   free     = 36256 (0.034576416015625MB)
   99.16178385416667% used
To Space:
   capacity = 4325376 (4.125MB)
   used     = 0 (0.0MB)
   free     = 4325376 (4.125MB)
   0.0% used
concurrent mark-sweep generation:
   capacity = 12841320448 (12246.4375MB)
   used     = 10867324872 (10363.888618469238MB)
   free     = 1973995576 (1882.5488815307617MB)
   84.62778353679785% used
Perm Generation:
   capacity = 30380032 (28.97265625MB)
   used     = 18100520 (17.262001037597656MB)
   free     = 12279512 (11.710655212402344MB)
   59.580319072738305% used






Attachment (jmap_histo): application/octet-stream, 87 KiB
Peter Schuller | 1 Aug 2010 12:15

Re: What's using my memory?

> storms and very slow responses. Also, I'd like to be able to load more data
> to the cluster and I'm hitting the memory wall, which I didn't expect.
> In the cassandra.in.sh you'd notice that I do provide Xmx=12G but given that
> there's so little data I wouldn't expect the process to be using all of
> that. As a matter of fact I wanted to insert more data to the cluster but I

So it seems to be heap space rather than mmap(), given:

> concurrent mark-sweep generation:
>    capacity = 12841320448 (12246.4375MB)
>    used     = 10867324872 (10363.888618469238MB)
>    free     = 1973995576 (1882.5488815307617MB)
>    84.62778353679785% used

The JVM will tend to gobble up the memory you allow it to gobble up
with -Xmx, depending on circumstances. Yes, there are heuristics in
the VM that are intended to prevent it from immediately going to -Xmx
if not needed, but in my experience there are many situations where
these heuristics mostly fail. This is particularly an issue with
CMS/G1, which have to try to stay incremental and avoid pauses while
at the same time trying to do something decent about memory use. The
default throughput collector should be better here (I presume, I
haven't bothered experimenting with it much since it's uninteresting
;)).

The situation is complicated because in general garbage collection
becomes more efficient the more memory you give it, so there is a
direct trade-off between memory use and performance. For this reason
one of the heuristics (which I believe is in place with CMS too;
definitely with G1) is how much time is spent on GC. Certain patterns
can cause this heuristic to cause the heap size to be bumped
(instantly to -Xmx in the case of G1 anyway...).

You can try tweaking CMS settings (there are several, such as forcing
concurrent mark to start at a constant occupancy rate, tweaking
minfree/minused etc), but it is difficult to get right. But by far the
easiest thing to do in your situation is probably to determine roughly
how large the live set is (looking at how much memory is used after a
concurrent mark/sweep has just finished is a good way of doing this)
and then set -Xmx accordingly instead of at 12 GB.
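
One way to read the "memory used after a collection" figure programmatically is the standard java.lang.management API, sketched below. The class name is made up, and the code measures the JVM it runs in; for a running Cassandra node the same MemoryPoolMXBean values would have to be read over the node's JMX port instead. Each pool reports usage as of its own most recent collection, so the sum is only an approximation of the live set.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

public class LiveSetSketch {
    // Sum, over all heap pools, the bytes still in use after each pool's most recent collection.
    public static long bytesUsedAfterLastGc() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            MemoryUsage afterGc = pool.getCollectionUsage(); // null if the pool does not support it
            if (afterGc != null) {
                total += afterGc.getUsed();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.gc(); // only a hint to the JVM; used here so that some collection has likely happened
        System.out.printf("approx. live set: %d MB%n", bytesUsedAfterLastGc() / (1024 * 1024));
    }
}
```

A value sampled a few times under normal load, plus headroom, gives a starting point for -Xmx.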


-- 
/ Peter Schuller

