Alain RODRIGUEZ | 31 Aug 15:48 2015

Network / GC / Latency spike


Running C* 2.0.16 on AWS (private VPC, 2 DCs).

I am facing an issue in our EU DC where I see network bursts, along with increased GC activity and latency.

My first thought was a sudden application burst, but I see no corresponding change in reads / writes or even CPU.

So I thought that this might come from the nodes themselves, as network IN almost equals network OUT. I tried lowering stream throughput on the whole DC to 1 Mbps per node; with ~30 nodes --> 30 Mbps --> ~4 MB/s max. My network went a lot higher, about 30 M in both directions (see screenshots attached).
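The throttle arithmetic above can be checked with a short sketch. It assumes the stream throughput setting is expressed in megabits per second and applies per node; the function name is only illustrative.

```python
# Aggregate streaming ceiling implied by a per-node throttle.
# Assumption: the throttle is given in megabits per second, per node.

def aggregate_stream_cap(nodes, per_node_mbps):
    """Return (total megabits/s, total megabytes/s) for the whole DC."""
    total_mbps = nodes * per_node_mbps
    total_mbytes = total_mbps / 8  # 8 bits per byte
    return total_mbps, total_mbytes

total_mbps, total_mbytes = aggregate_stream_cap(nodes=30, per_node_mbps=1)
print(f"{total_mbps:.0f} Mbps ~= {total_mbytes:.2f} MB/s")
```

With 30 nodes at 1 Mbps each this gives 30 Mbps, or 3.75 MB/s, which is the "~4 MB/s" ceiling quoted above.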

I have tried to use iftop to see where this traffic is headed, but I was not able to because the bursts are very short.

So, questions are:

- Has anyone experienced something similar already? If so, any clue would be appreciated :).
- How can I find out (monitor, capture) where this large amount of network traffic is headed or what is causing it?
- Am I right to try to figure out what this traffic is, or should I follow another lead?

Note: I also noticed that CPU does not spike, nor do reads and writes (R&W), but disk reads also spike!


Pål Andreassen | 31 Aug 15:48 2015

Cassandra 2.2 for time series



I’m currently evaluating Cassandra as a potential database for storing time series data from lots of devices (IoT type of scenario).

Currently we have a few thousand devices with X channels (measurements) that they report at different intervals (from 5 minutes and up).


I’ve created a simple test table to store the data:



CREATE TABLE DataRAW (
  channelId int,
  sampleTime timestamp,
  value double,
  PRIMARY KEY (channelId, sampleTime)
);



This schema seems to work ok, but I have queries that I need to support that I cannot easily figure out how to perform (except getting all the data out and iterate it myself).


Query 1: For max and min queries, I not only want the maximum/minimum value, but also the corresponding timestamp.


sampleTime          value
2015-08-28 00:00    10
2015-08-28 01:00    15
2015-08-28 02:00    13

I'd like the max query to return both 2015-08-28 01:00 and 15. SELECT sampleTime, max(value) FROM DataRAW returns the max value, but with the first sampleTime.
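For comparison, the client-side fallback mentioned earlier (fetching the rows and iterating over them) could look like the sketch below. It assumes the rows for one channel have already been read into (sampleTime, value) tuples; `max_with_time` is a hypothetical helper name, not part of any driver.

```python
from datetime import datetime

def max_with_time(samples):
    """Return the (sampleTime, value) pair that holds the largest value."""
    return max(samples, key=lambda sample: sample[1])

# The three example rows from the post.
samples = [
    (datetime(2015, 8, 28, 0, 0), 10.0),
    (datetime(2015, 8, 28, 1, 0), 15.0),
    (datetime(2015, 8, 28, 2, 0), 13.0),
]
when, value = max_with_time(samples)
print(when, value)
```

Swapping `max` for `min` gives the minimum together with its timestamp.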

Also I wonder if Cassandra has built-in support for interpolation/extrapolation. Some sort of group by hour/day/week/month and even year function.


Query 2: Give me hourly averages for channel X for yesterday. I’d expect to get 24 values, each of which is the hourly average. Or give me daily averages for last year for a given channel, which should return 365 daily averages.
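Done on the client, the hourly grouping amounts to truncating each timestamp to the hour and averaging per bucket. The sketch below assumes the rows for one channel and one day are already available as (sampleTime, value) tuples; `hourly_averages` is a hypothetical helper.

```python
from collections import defaultdict
from datetime import datetime

def hourly_averages(samples):
    """Group (sampleTime, value) pairs by hour and average each group."""
    buckets = defaultdict(list)
    for when, value in samples:
        # Truncate the timestamp to the start of its hour.
        hour = when.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}

readings = [
    (datetime(2015, 8, 27, 0, 5), 10.0),
    (datetime(2015, 8, 27, 0, 35), 20.0),
    (datetime(2015, 8, 27, 1, 10), 30.0),
]
for hour, average in hourly_averages(readings).items():
    print(hour, average)
```

Hours without any samples are simply absent from the result, so a full day yields 24 entries only if every hour has at least one reading. Truncating to the date instead of the hour would give the daily variant.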


Best regards


Pål Andreassen



Noriaki Tatsumi | 31 Aug 03:42 2015

Cassandra Migration Tool


I built a simple and lightweight migration tool for the Apache Cassandra database, based on Axel Fontaine's Flyway project. Cassandra Migration works just like Flyway. Plain CQL and Java-based migrations are supported. The Java migration interface provides DataStax's Java Driver session.

I'm aware of the existence of mutagen-cassandra, but I wanted a lighter tool that uses DataStax's Java driver without Netflix's Astyanax wrapper client. I also observed that mutagen-cassandra has not been updated in 2 years. I'm writing to let the Cassandra user community know about the new tool, in the hope that it will be useful to others and that I will get feedback on improvements and bugs.

Please check it out!

ibrahim El-sanosi | 30 Aug 23:09 2015

Is ZooKeeper still used in Cassandra?

Hi folks,

I read the Cassandra white paper and came across a passage that says: "Cassandra system elects a leader amongst its nodes using a system called Zookeeper[13]. All nodes on joining the cluster contact the leader who tells them for what ranges they are replicas for and leader makes a concerted effort to maintain the invariant that no node is responsible for more than N-1 ranges in the ring. The metadata about the ranges a node is responsible is cached locally at each node and in a fault-tolerant manner inside Zookeeper - this way a node that crashes and comes back up knows what ranges it was responsible for."

Does Cassandra still use ZooKeeper? If so, can you point me to any related article?


Tony Anecito | 28 Aug 22:58 2015

Upgrade from 2.1.0 to 2.1.9

Hi All,
It has been a while since I upgraded, and I wanted to know what the steps are to upgrade from 2.1.0 to 2.1.9. I also want to know whether I need to upgrade my Java database driver.

sai krishnam raju potturi | 28 Aug 20:32 2015

Re : Decommissioned node appears in logs, and is sometimes marked as "UNREACHABLE" in `nodetool describecluster`

We decommissioned nodes in a datacenter a while back. Those nodes keep showing up in the logs, and are also sometimes marked as UNREACHABLE when `nodetool describecluster` is run.

However, these nodes do not show up in `nodetool status` or `nodetool ring`.

Below are a couple of lines from the logs.

2015-08-27 04:38:16,180 [GossipStage:473] INFO Gossiper InetAddress / is now DOWN
2015-08-27 04:38:16,183 [GossipStage:473] INFO StorageService Removing tokens [85070591730234615865843651857942052865] for /


sai krishnam raju potturi | 28 Aug 20:12 2015

Fwd: Re : Restoring nodes in a new datacenter, from snapshots in an existing datacenter

We have a Cassandra cluster with vnodes spanning 3 datacenters. We back up the snapshots from one datacenter.
In a doomsday scenario, we want to restore a downed datacenter with snapshots from another datacenter. We have the same number of nodes in each datacenter.

1 : We know it requires copying the snapshots and their corresponding token ranges to the nodes in the new datacenter, and running nodetool refresh.

2 : The question is: we will now have 2 datacenters with the exact same token ranges. Will that cause any problems?

DC1 :
  Node-1 : token1 ..... token10
  Node-2 : token11 ..... token20
  Node-3 : token21 ..... token30
  Node-4 : token31 ..... token40

DC2 :
  Node-1 : token1 ..... token10
  Node-2 : token11 ..... token20
  Node-3 : token21 ..... token30
  Node-4 : token31 ..... token40
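To make the overlap in the layout above concrete, here is a minimal sketch that lists the tokens two datacenters would share. The integer tokens are placeholders standing in for token1 to token40, and `shared_tokens` is a hypothetical helper; it only exposes the duplication and does not say how Cassandra reacts to it.

```python
def shared_tokens(dc_a, dc_b):
    """Return the sorted tokens present in both node-to-tokens mappings."""
    tokens_a = {token for tokens in dc_a.values() for token in tokens}
    tokens_b = {token for tokens in dc_b.values() for token in tokens}
    return sorted(tokens_a & tokens_b)

def layout():
    # Placeholder layout: 4 nodes with 10 consecutive tokens each.
    return {f"Node-{n + 1}": list(range(n * 10 + 1, n * 10 + 11)) for n in range(4)}

dc1 = layout()
dc2 = layout()  # restored from DC1 snapshots with the same tokens
print(len(shared_tokens(dc1, dc2)))
```

In this layout every one of the 40 tokens appears in both datacenters, which is exactly the situation the question is about.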


Jake Luciani | 28 Aug 19:06 2015

[RELEASE] Apache Cassandra 2.1.9 released

The Cassandra team is pleased to announce the release of Apache Cassandra
version 2.1.9.

Apache Cassandra is a fully distributed database. It is the right choice
when you need scalability and high availability without compromising
performance.

Downloads of source and binary distributions are listed in our download
section.

This version is a bug fix release[1] on the 2.1 series. As always, please pay
attention to the release notes[2] and let us know[3] if you were to encounter
any problem.


[1]: (CHANGES.txt)
[2]: (NEWS.txt)

Cyril Scetbon | 28 Aug 18:49 2015

ccm issue

Hi guys,

I have some issues with ccm and the unit tests in java-driver. Here is what I see:

tail -f /tmp/1440780247703-0/test/node5/logs/system.log
 INFO [STREAM-IN-/] 2015-08-28 16:45:06,009 (line 220) [Stream
#22d9e9f0-4da4-11e5-9409-5d8a0f12fefd] All sessions completed
 INFO [main] 2015-08-28 16:45:06,009 (line 1014) Bootstrap completed! for the
tokens [8907077543698545973]
 INFO [main] 2015-08-28 16:45:06,010 (line 785) Enqueuing flush of
Memtable-local@1738175520(84/840 serialized/live bytes, 8 ops)
 INFO [FlushWriter:1] 2015-08-28 16:45:06,013 (line 331) Writing
Memtable-local@1738175520(84/840 serialized/live bytes, 8 ops)
 INFO [FlushWriter:1] 2015-08-28 16:45:06,072 (line 371) Completed flushing
/tmp/1440780247703-0/test/node5/data/system/local/system-local-jb-6-Data.db (117 bytes) for
commitlog position ReplayPosition(segmentId=1440780271059, position=143914)
 INFO [main] 2015-08-28 16:45:06,074 (line 785) Enqueuing flush of
Memtable-local@1171696270(50/500 serialized/live bytes, 4 ops)
 INFO [FlushWriter:1] 2015-08-28 16:45:06,074 (line 331) Writing
Memtable-local@1171696270(50/500 serialized/live bytes, 4 ops)
 INFO [FlushWriter:1] 2015-08-28 16:45:06,118 (line 371) Completed flushing
/tmp/1440780247703-0/test/node5/data/system/local/system-local-jb-7-Data.db (97 bytes) for
commitlog position ReplayPosition(segmentId=1440780271059, position=144080)
 INFO [main] 2015-08-28 16:45:06,122 (line 1499) Node / state jump to normal
 INFO [main] 2015-08-28 16:45:06,124 (line 518) Waiting for gossip to settle
before accepting client requests...
 INFO [main] 2015-08-28 16:45:14,125 (line 550) No gossip backlog; proceeding
 INFO [main] 2015-08-28 16:45:14,187 (line 155) Starting listening for CQL clients on /
 INFO [main] 2015-08-28 16:45:14,224 (line 99) Using TFramedTransport with a max
frame size of 15728640 bytes.
 INFO [main] 2015-08-28 16:45:14,225 (line 118) Binding thrift service to /
 INFO [main] 2015-08-28 16:45:14,233 (line 47) Using
synchronous/threadpool thrift server on : 9160
 INFO [Thread-10] 2015-08-28 16:45:14,233 (line 135) Listening for thrift clients...

However, ccm doesn't see that node5 is running and listening:

    0      [Scheduled Tasks-0] INFO  com.datastax.driver.core.Cluster  - New Cassandra host / added
    53833  [main] INFO  com.datastax.driver.core.TestUtils  - is not UP after 60s
    69528  [main] INFO  com.datastax.driver.core.CCMBridge  - Error during tests, kept C* logs in /tmp/1440780247703-0

But at the same time I can see that node5 is running and I can also connect to it :

# netstat -lnt|grep 9042
tcp        0      0*               LISTEN
tcp        0      0*               LISTEN
tcp        0      0*               LISTEN
tcp        0      0*               LISTEN
tcp        0      0*               LISTEN
root@ip-10-0-1-97:~# nc 9042
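The same reachability check can be scripted, which helps when it has to run at the exact moment the test gives up. This is a minimal sketch using only the Python standard library; the host below is a placeholder, since the node addresses are not shown above, and `port_is_open` is a hypothetical helper.

```python
import socket

def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False

# Placeholder address: substitute the listen address of node5.
print(port_is_open("127.0.0.1", 9042))
```

A successful connect only shows that the port accepts TCP connections; it does not show that the node is seen as UP by the driver, which is what the test waits for.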

After the error, ccm shuts all nodes down to end the unit tests:

INFO [GossipStage:1] 2015-08-28 16:45:31,864 (line 863) InetAddress / is now DOWN
INFO [GossipStage:1] 2015-08-28 16:45:34,989 (line 863) InetAddress / is now DOWN
INFO [GossipStage:1] 2015-08-28 16:45:38,087 (line 863) InetAddress / is now DOWN
INFO [StorageServiceShutdownHook] 2015-08-28 16:45:39,181 (line 141) Stop
listening to thrift clients
INFO [StorageServiceShutdownHook] 2015-08-28 16:45:39,200 (line 181) Stop listening for
CQL clients

Any idea? Any known issue?

Tommy Stendahl | 28 Aug 13:14 2015

TTL question


I did a small test using TTL but I didn't get the result I expected.

I did this in cqlsh:

cqlsh> create TABLE ( key int, cluster int, col int, PRIMARY KEY 
(key, cluster)) ;
cqlsh> INSERT INTO (key, cluster ) VALUES ( 1,1 );
cqlsh> SELECT * FROM ;

  key | cluster | col
    1 |       1 | null

(1 rows)
cqlsh> INSERT INTO (key, cluster, col ) VALUES ( 1,1,1 ) USING 
TTL 10;
cqlsh> SELECT * FROM ;

  key | cluster | col
    1 |       1 |   1

(1 rows)

<wait for TTL to expire>

cqlsh> SELECT * FROM ;

  key | cluster | col

(0 rows)

Is this really correct?
I expected the result from the last select to be:

  key | cluster | col
    1 |       1 | null

(1 rows)
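One possible explanation for the difference can be modelled with a toy sketch. It rests on two assumptions, flagged again in the comments: that an INSERT also writes a row marker carrying the statement's TTL (so the second INSERT replaces the marker written by the first), and that an UPDATE writes only the named columns. The classes and functions are hypothetical and only mimic this behaviour; they are not Cassandra code.

```python
NEVER = float("inf")  # expiry used for data written without a TTL

class Row:
    """Toy row: a row marker plus per-column (value, expiry) cells."""
    def __init__(self):
        self.marker_expiry = None  # None means no marker was ever written
        self.columns = {}

def _expiry(now, ttl):
    return NEVER if ttl is None else now + ttl

def insert(row, now, columns, ttl=None):
    # Assumption: INSERT writes the row marker with the statement's TTL,
    # replacing the older marker that had no TTL.
    row.marker_expiry = _expiry(now, ttl)
    for name, value in columns.items():
        row.columns[name] = (value, _expiry(now, ttl))

def update(row, now, columns, ttl=None):
    # Assumption: UPDATE writes only the named columns, not the marker.
    for name, value in columns.items():
        row.columns[name] = (value, _expiry(now, ttl))

def is_visible(row, now):
    marker_live = row.marker_expiry is not None and now < row.marker_expiry
    column_live = any(now < expiry for _, expiry in row.columns.values())
    return marker_live or column_live

# Sequence from the post: plain INSERT, then INSERT ... USING TTL 10.
row = Row()
insert(row, now=0, columns={})
insert(row, now=1, columns={"col": 1}, ttl=10)
print(is_visible(row, now=5))    # while the TTL is still running
print(is_visible(row, now=20))   # after the TTL has expired

# Variant: set col through UPDATE ... USING TTL 10 instead.
other = Row()
insert(other, now=0, columns={})
update(other, now=1, columns={"col": 1}, ttl=10)
print(is_visible(other, now=20))
```

Under these assumptions the row from the post disappears once the TTL expires, matching the observed (0 rows), while the UPDATE variant keeps the row with col reading as null, which is the result that was expected.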


Asit KAUSHIK | 28 Aug 12:07 2015

Deployment of .net application on production is erroring out

Hi All,

Please excuse my limited knowledge. We have a .NET application whose backend database is Cassandra. We have deployed our application into production, which is behind the firewall. We have opened port 9042 from our web server to the Cassandra cluster, but we are still getting the error below:

INFO  [SharedPool-Worker-1] 2015-08-27 11:07:20,679 - Unexpected exception during request; channel = [id: 0x12af6143, / => /] Error while read(...): Connection timed out
at Method) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at$EpollSocketUnsafe.doReadBytes( ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at$EpollSocketUnsafe.epollInReady( ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$ ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at io.netty.util.concurrent.DefaultThreadFactory$ ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at [na:1.7.0_65]

We have a 5-node cluster and have added static routing to all the nodes for port 9042.

Do we need to open some more ports, given that the connection is established but then times out?

Early help would be highly appreciated.