Pavel Kogan | 29 Aug 00:31 2014

Commitlog files are not being deleted

Hi all,

Shouldn't all commitlog files be deleted automatically once they have been replayed, for example after a node restart?
I'm using Cassandra 2.0.8.

Jan Algermissen | 28 Aug 16:18 2014

Reducing tombstones impact in queue access patterns through rolling shards?


I just came across this recipe by Netflix that addresses the impact of tombstones in queue access patterns with a time-based rolling shard, which allows compaction to happen in one shard while the other is ‘busy’. (At least this is what I understand from the intro.)

Has anyone adopted such a pattern and can share experience?
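
To make the idea concrete, here is a minimal sketch of how a time-based rolling shard could be derived from a timestamp. It is not the Netflix recipe itself, only an illustration under assumptions: the function names, the one-hour shard window, and the shard count of 2 are all hypothetical.

```python
# Hypothetical sketch of a time-based rolling shard for a queue.
# Writes go to the shard that is current for the time window, so the
# other shard(s) are idle and can be compacted in the meantime.

def rolling_shard(epoch_seconds: int, shard_seconds: int = 3600, shard_count: int = 2) -> int:
    """Return the index of the shard that is 'busy' at the given time."""
    return (epoch_seconds // shard_seconds) % shard_count


def queue_partition_key(queue_name: str, epoch_seconds: int,
                        shard_seconds: int = 3600, shard_count: int = 2) -> str:
    """Build a partition key that includes the rolling shard (assumed key layout)."""
    shard = rolling_shard(epoch_seconds, shard_seconds, shard_count)
    return f"{queue_name}:{shard}"


if __name__ == "__main__":
    for t in (0, 3599, 3600, 7200):
        print(t, queue_partition_key("jobs", t))
```

With two shards and a one-hour window, every message written within the same hour lands in the same partition, and the shard index flips at each hour boundary.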

Vineet Mishra | 28 Aug 10:04 2014

PoolTimeoutException while connecting to Cassandra

Hi All,

I have downloaded titan-server-0.4.4 and am trying to integrate it with Cassandra as the backend datasource. Cassandra is running externally on a 4-node cluster. When I try to start Rexster with Cassandra as the backend, it fails with an error during initialization.
I have also tried HBase as the backend, which works well, so the problem seems to be with Cassandra or the Rexster configuration.
Below are the snippet from rexster-cassandra.xml and the error I am getting.


            <!-- <graph-location>/tmp/titan</graph-location> -->


[com.thinkaurelius.titan.tinkerpop.rexster.TitanGraphConfiguration]. Ensure that it is in Rexster's path.
com.tinkerpop.rexster.config.GraphConfigurationException: GraphConfiguration could not be found or otherwise instantiated: [com.thinkaurelius.titan.tinkerpop.rexster.TitanGraphConfiguration]. Ensure that it is in Rexster's path.
at com.tinkerpop.rexster.config.GraphConfigurationContainer.getGraphFromConfiguration(
at com.tinkerpop.rexster.config.GraphConfigurationContainer.<init>(
at com.tinkerpop.rexster.server.XmlRexsterApplication.reconfigure(
at com.tinkerpop.rexster.server.XmlRexsterApplication.<init>(
at com.tinkerpop.rexster.Application.<init>(
at com.tinkerpop.rexster.Application.main(
Caused by: java.lang.IllegalArgumentException: Could not instantiate implementation: com.thinkaurelius.titan.diskstorage.cassandra.astyanax.AstyanaxStoreManager
at com.thinkaurelius.titan.diskstorage.Backend.instantiate(
at com.thinkaurelius.titan.diskstorage.Backend.getImplementationClass(
at com.thinkaurelius.titan.diskstorage.Backend.getStorageManager(
at com.thinkaurelius.titan.diskstorage.Backend.<init>(
at com.thinkaurelius.titan.graphdb.configuration.GraphDatabaseConfiguration.getBackend(
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.<init>(
at com.thinkaurelius.titan.tinkerpop.rexster.TitanGraphConfiguration.configureGraphInstance(
at com.tinkerpop.rexster.config.GraphConfigurationContainer.getGraphFromConfiguration(
... 5 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
at com.thinkaurelius.titan.diskstorage.Backend.instantiate(
... 13 more
Caused by: com.thinkaurelius.titan.diskstorage.TemporaryStorageException: Temporary failure in storage backend
at com.thinkaurelius.titan.diskstorage.cassandra.astyanax.AstyanaxStoreManager.ensureKeyspaceExists(
at com.thinkaurelius.titan.diskstorage.cassandra.astyanax.AstyanaxStoreManager.<init>(
... 18 more
Caused by: PoolTimeoutException: [host=, latency=40001(40001), attempts=4]Timed out waiting for connection
at com.thinkaurelius.titan.diskstorage.cassandra.astyanax.AstyanaxStoreManager.ensureKeyspaceExists(
... 19 more

Moreover, Cassandra itself is up: I was able to create a keyspace and a column family from the cqlsh terminal.

I am looking for a solution that brings Rexster up with this externally configured Cassandra backend.

Looking forward to expert advice at the earliest!

Donald Smith | 27 Aug 21:38 2014

How often are JMX Cassandra metrics reset?

I’m using JMX to retrieve Cassandra metrics. I notice that Max and Count are cumulative and aren’t reset. How often are the stats for Mean, 99thPercentile, etc. reset back to zero?

For example, 99thPercentile shows as about 1.5 ms. Over how many minutes?



    LatencyUnit = MICROSECONDS
    FiveMinuteRate = 1.12
    FifteenMinuteRate = 1.11
    RateUnit = SECONDS
    MeanRate = 1.65
    OneMinuteRate = 1.13
    EventType = calls
    Max = 237,373.37
    Count = 961,312
    50thPercentile = 383.2
    Mean = 908.46
    Min = 95.64
    StdDev = 3,034.62
    75thPercentile = 626.34
    95thPercentile = 954.31
    98thPercentile = 1,443.11
    99thPercentile = 1,472.4
    999thPercentile = 1,858.1
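
Since LatencyUnit is MICROSECONDS, the latency values above need dividing by 1,000 to read them in milliseconds. A small sketch of that conversion follows; the helper names are hypothetical, and it only parses the "name = value" lines as they appear in this post. It says nothing about when the statistics are reset.

```python
from typing import Tuple


def parse_metric(line: str) -> Tuple[str, str]:
    """Split a 'name = value' line from the JMX dump into (name, raw value)."""
    name, _, value = line.partition("=")
    return name.strip(), value.strip()


def micros_to_millis(raw_value: str) -> float:
    """Convert a microsecond value such as '1,472.4' to milliseconds."""
    return float(raw_value.replace(",", "")) / 1000.0


if __name__ == "__main__":
    dump = [
        "    50thPercentile = 383.2",
        "    99thPercentile = 1,472.4",
        "    Max = 237,373.37",
    ]
    for line in dump:
        name, raw = parse_metric(line)
        print(f"{name}: {micros_to_millis(raw):.4f} ms")
```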


Donald A. Smith | Senior Software Engineer


Razi Khaja | 27 Aug 17:29 2014

[RELEASE] Apache Cassandra 2.0.10 released

I looked for the newest release, but only see release candidates, not a stable release.

Malay Nilabh | 27 Aug 14:13 2014

Bulk load in cassandra


I installed Cassandra successfully on one node. Using the CLI, I am able to add a table to the keyspace as well as retrieve data from the table. My question: if I have a text file on my local file system, how can I load it into the Cassandra cluster, i.e. bulk load it? Please help me out.
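
For a small file, one naive option is to turn each line into a CQL INSERT statement and feed the result to cqlsh. The sketch below assumes a tab-delimited file in which every column is text; the table name, column names, and helper names are hypothetical. For larger volumes, the cqlsh COPY ... FROM command or the sstableloader tool are probably a better fit.

```python
from typing import Iterable, Iterator, List


def cql_text_literal(value: str) -> str:
    """Quote a value as a CQL text literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def insert_statements(lines: Iterable[str], table: str, columns: List[str],
                      delimiter: str = "\t") -> Iterator[str]:
    """Yield one INSERT per non-empty line (assumes all columns are text)."""
    column_list = ", ".join(columns)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split(delimiter)
        if len(fields) != len(columns):
            raise ValueError(f"expected {len(columns)} fields, got {len(fields)}: {line!r}")
        values = ", ".join(cql_text_literal(f) for f in fields)
        yield f"INSERT INTO {table} ({column_list}) VALUES ({values});"


if __name__ == "__main__":
    sample = ["u1\tAlice\n", "u2\tO'Brien\n"]
    for stmt in insert_statements(sample, "demo.users", ["id", "name"]):
        print(stmt)
```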



Malay Nilabh

Stephen Portanova | 27 Aug 10:50 2014

Can't Add AWS Node due to /mnt/cassandra/data directory

I already have a 3-node m3.large DSE cluster, but I can't seem to add another m3.large node. I'm using the ubuntu-trusty-14.04-amd64-server-20140607.1 (ami-a7fdfee2) AMI (instance-store backed, PV) on AWS. I install Java 7 and JNA, then go into OpsCenter to add a node. Things look good for 3 or 4 green circles, until I get either this error: "Start Errored: Timed out waiting for Cassandra to start." or this error: "Agent Connection Errored: Timed out waiting for agent to connect."

I check the system.log and output.log, and they both say:
INFO [main] 2014-08-27 08:17:24,642 (line 121) JNA mlockall successful
ERROR [main] 2014-08-27 08:17:24,644 (line 235) Directory /mnt/cassandra/data doesn't exist
ERROR [main] 2014-08-27 08:17:24,645 (line 239) Has no permission to create /mnt/cassandra/data directory
 INFO [Thread-1] 2014-08-27 08:17:24,646 (line 477) DSE shutting down...
ERROR [Thread-1] 2014-08-27 08:17:24,725 (line 199) Exception in thread Thread[Thread-1,5,main]
        at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(
        at com.datastax.bdp.gms.DseState.setActiveStatus(
        at com.datastax.bdp.server.DseDaemon.stop(
        at com.datastax.bdp.server.DseDaemon$

My agent.log file says:

Node is still provisioning, not attempting to determine ip.
 INFO [Initialization] 2014-08-27 08:40:57,848 Sleeping for 20s before trying to determine IP over JMX again
 INFO [Initialization] 2014-08-27 08:41:17,849 Node is still provisioning, not attempting to determine ip.
 INFO [Initialization] 2014-08-27 08:41:17,849 Sleeping for 20s before trying to determine IP over JMX again
 INFO [Initialization] 2014-08-27 08:41:37,849 Node is still provisioning, not attempting to determine ip.
 INFO [Initialization] 2014-08-27 08:41:37,850 Sleeping for 20s before trying to determine IP over JMX again
 INFO [Initialization] 2014-08-27 08:41:57,850 Node is still provisioning, not attempting to determine ip.

I feel like I'm missing something easy with the mount, so if you could point me in the right direction, I would really appreciate it!

Stephen Portanova

Ian Rose | 26 Aug 21:12 2014

are dynamic columns supported at all in CQL 3?

Is it possible in CQL to create a table that supports dynamic column names?  I am using C* v2.0.9, which I assume implies CQL version 3.

This page appears to show that this was supported in CQL 2 with the 'with comparator' and 'with default_validation' options but that CQL 3 does not support this:

Am I understanding that right?  If so, what is my best course of action?  Create the table using the cassandra-cli tool?

- Ian

Paulo Ricardo Motta Gomes | 26 Aug 20:38 2014

Too many SSTables after rebalancing cluster (LCS)

Hey folks,

After adding more nodes and moving tokens of "old" nodes to rebalance the ring, I noticed that the "old" nodes had significantly more data than the newly bootstrapped nodes, even after cleanup.

I noticed that the old nodes had a much larger number of SSTables on LCS CFs, most of them located on the last level:

Node N-1 (old node): [1, 10, 102/100, 173, 2403, 0, 0, 0, 0] (total:2695)
Node N (new node): [1, 10, 108/100, 214, 0, 0, 0, 0, 0] (total: 339)
Node N+1 (old node): [1, 10, 87, 113, 1076, 0, 0, 0, 0] (total: 1287)

Since these sstables have a lot of tombstones, and they're not updated frequently, they remain in the last level forever, and are never cleaned.

What is the solution here? The good old "change to STCS and then back to LCS", or is there something less brute force?

Environment: Cassandra 1.2.16, non-vnodes

Any help would be very much appreciated.


Paulo Motta

Leleu Eric | 26 Aug 19:20 2014

Question about MemoryMeter liveRatio




I’m trying to understand what the liveRatio is and whether I need to care about it.

I found some references on the web, and if I understand them correctly, the liveRatio represents the Memtable size divided by the amount of data serialized on disk. Is that correct?

When I see the following log, what can I deduce from it?


    INFO [MemoryMeter:1] 2014-08-26 19:02:41,047 (line 481) CFS(Keyspace='ufapi', ColumnFamily='users') liveRatio is 8.52308554793235 (just-counted was 8.514143642185562).  calculation took 3613ms for 272646 cells

    INFO [MemoryMeter:1] 2014-08-26 18:36:09,965 (line 481) CFS(Keyspace='system', ColumnFamily='compactions_in_progress') liveRatio is 40.18934911242604 (just-counted was 16.37869822485207).  calculation took 0ms for 7 cells



From what I have read, the liveRatio is kept between 1 and 64. If my liveRatio is around 64, is there anything I should worry about?

Does Cassandra use the liveRatio for some internal task, or is it just a metric?
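
To illustrate my current understanding (which may be wrong), here is a small sketch of the arithmetic: the in-memory size is estimated as the serialized size multiplied by the liveRatio, with the ratio kept in the range 1 to 64. The function and constant names are hypothetical and are not taken from the Cassandra source.

```python
# Assumed bounds, taken from what I have read about the liveRatio.
MIN_LIVE_RATIO = 1.0
MAX_LIVE_RATIO = 64.0


def clamp_live_ratio(measured_ratio: float) -> float:
    """Keep a measured ratio inside the assumed [1, 64] range."""
    return max(MIN_LIVE_RATIO, min(MAX_LIVE_RATIO, measured_ratio))


def estimated_heap_bytes(serialized_bytes: int, live_ratio: float) -> float:
    """Estimate in-memory size as serialized size times the clamped liveRatio."""
    return serialized_bytes * clamp_live_ratio(live_ratio)


if __name__ == "__main__":
    # With a liveRatio of about 8.5, 1 MB of serialized data would be
    # estimated at roughly 8.5 MB on the heap under this reading.
    print(estimated_heap_bytes(1_000_000, 8.52308554793235))
```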





Jaydeep Chovatia | 26 Aug 18:50 2014

CQL performance inserting multiple cluster keys under same partition key


I have a question about inserting multiple cluster keys under the same partition key.


CREATE TABLE Employee (
  deptId int,
  empId int,
  name varchar,
  address varchar,
  salary int,
  PRIMARY KEY (deptId, empId)
);

BEGIN BATCH
  INSERT INTO Employee (deptId, empId, name, address, salary) VALUES (1, 10, 'testNameA', 'testAddressA', 20000);
  INSERT INTO Employee (deptId, empId, name, address, salary) VALUES (1, 20, 'testNameB', 'testAddressB', 30000);
APPLY BATCH;

Here we are inserting two cluster keys (10 and 20) under the same partition key (1).
Q1) Is this batch transaction atomic and isolated? If yes, is there any performance overhead with this syntax?
Q2) Can this CQL syntax be considered equivalent to Thrift's "batch_mutate"?