Alain RODRIGUEZ | 31 Aug 15:48 2015

Network / GC / Latency spike

Hi,

Running a 2.0.16 C* on AWS (private VPC, 2 DC).

I am facing an issue in our EU DC where I see network bursts, along with increases in GC activity and latency.

My first thought was a sudden application burst, but I see no corresponding change in reads / writes or even CPU.

So I thought this might come from the nodes themselves, as inbound network traffic is almost equal to outbound. I tried lowering stream throughput on the whole DC to 1 Mbps; with ~30 nodes that gives 30 Mbps, i.e. ~4 MB/s max. My network went a lot higher, to about 30 M in both directions (see attached screenshots).
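As a quick sanity check of that arithmetic, here is a minimal Python sketch of the conversion. It assumes the 1 Mbps cap applies per node and that all ~30 nodes stream at the cap at the same time; the function name is hypothetical and only used for illustration.

```python
from typing import Tuple


def aggregate_stream_cap(per_node_mbps: float, nodes: int) -> Tuple[float, float]:
    """Return the DC-wide streaming ceiling as (megabits/s, megabytes/s).

    Assumption: every node streams at its cap simultaneously.
    """
    total_mbps = per_node_mbps * nodes   # megabits per second across the DC
    total_mbytes = total_mbps / 8        # 8 bits per byte
    return total_mbps, total_mbytes


total_mbps, total_mbytes = aggregate_stream_cap(1, 30)
print(f"{total_mbps:.0f} Mbps ~= {total_mbytes:.2f} MB/s")
```

Under these assumptions the ceiling is 3.75 MB/s, which matches the "~4 MB/s max" estimate.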

I have tried to use iftop to see where this traffic is headed, but I was not able to because the bursts are very short.

So, questions are:

- Has anyone experienced something similar already? If so, any clue would be appreciated :).
- How can I find out (monitor, capture) where this large amount of network traffic is headed, or what is causing it?
- Am I right to try to figure out what this network traffic is, or should I follow another lead?

Notes: I also noticed that CPU does not spike, nor do reads and writes, but disk reads also spike!

C*heers,

Alain
Pål Andreassen | 31 Aug 15:48 2015

Cassandra 2.2 for time series

Hi

 

I’m currently evaluating Cassandra as a potential database for storing time series data from lots of devices (IoT type of scenario).

Currently we have a few thousand devices with X channels (measurements) that they report at different intervals (from 5 minutes and up).

 

I’ve created a simple test table to store the data:

 

CREATE TABLE DataRaw(
  channelId int,
  sampleTime timestamp,
  value double,
  PRIMARY KEY (channelId, sampleTime)
) WITH CLUSTERING ORDER BY (sampleTime ASC);

 

This schema seems to work OK, but there are queries I need to support that I cannot easily figure out how to perform (except by getting all the data out and iterating over it myself).

 

Query 1: For max and min queries, I not only want the maximum/minimum value, but also the corresponding timestamp.

 

sampleTime          value
2015-08-28 00:00    10
2015-08-28 01:00    15
2015-08-28 02:00    13


I'd like the max query to return both 2015-08-28 01:00 and 15. SELECT sampleTime, max(value) FROM DataRaw returns the max value, but with the first sampleTime.
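One possible client-side workaround, sketched in Python: fetch the (sampleTime, value) rows for the channel and pick the row with the largest (or smallest) value, so the timestamp travels with it. This is only a sketch under assumptions: the rows are the three sample rows from the table above, hardcoded here, and the helper name extreme_with_time is hypothetical. In practice the rows would come from a query on DataRaw through a driver, which is not shown.

```python
from datetime import datetime

# The three sample rows from the table above, as (sampleTime, value) pairs.
rows = [
    (datetime(2015, 8, 28, 0, 0), 10.0),
    (datetime(2015, 8, 28, 1, 0), 15.0),
    (datetime(2015, 8, 28, 2, 0), 13.0),
]


def extreme_with_time(rows, pick=max):
    """Return the whole (sampleTime, value) row with the extreme value.

    pick is the built-in max or min; comparing on the value column keeps
    the matching timestamp attached to the result.
    """
    return pick(rows, key=lambda row: row[1])


max_time, max_value = extreme_with_time(rows)
min_time, min_value = extreme_with_time(rows, pick=min)
print(max_time, max_value)
print(min_time, min_value)
```

This still reads every row of the partition into the client, so it is the "iterate it myself" approach made explicit rather than a server-side answer.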

Also I wonder if Cassandra has built-in support for interpolation/extrapolation. Some sort of group by hour/day/week/month and even year function.

 

Query 2: Give me hourly averages for channel X for yesterday. I’d expect to get 24 values, each of which is the hourly average. Or give me daily averages for last year for a given channel, which should return 365 daily averages.
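To make the intent of Query 2 concrete, here is a minimal Python sketch of the bucketing done on the client. It is an illustration under assumptions, not a Cassandra feature: the sample data is made up, and bucket_averages, by_hour, and by_day are hypothetical helper names.

```python
from collections import defaultdict
from datetime import datetime


def bucket_averages(rows, bucket):
    """Average values per bucket; rows are (sampleTime, value) pairs."""
    sums = defaultdict(lambda: [0.0, 0])  # bucket key -> [running total, count]
    for sample_time, value in rows:
        key = bucket(sample_time)
        sums[key][0] += value
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sorted(sums.items())}


def by_hour(ts):
    # Truncate a timestamp to the start of its hour.
    return ts.replace(minute=0, second=0, microsecond=0)


def by_day(ts):
    # Truncate a timestamp to its calendar day.
    return ts.date()


# Made-up samples for one channel.
samples = [
    (datetime(2015, 8, 28, 0, 5), 10.0),
    (datetime(2015, 8, 28, 0, 35), 20.0),
    (datetime(2015, 8, 28, 1, 10), 13.0),
]

hourly = bucket_averages(samples, by_hour)
daily = bucket_averages(samples, by_day)
print(hourly)
print(daily)
```

Note that a bucket with no samples does not appear in the result, so a day with gaps would yield fewer than 24 hourly values unless the missing hours are filled in separately.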

 

Best regards

 

Pål Andreassen

54°23'58"S 3°18'53"E

Konsulent

Mobil +47 982 85 504

pal.andreassen <at> bouvet.no

 

Bouvet Norge AS
Avdeling Grenland

Uniongata 18, Klosterøya

N-3732 Skien

Tlf +47 23 40 60 00 

bouvet.no

 

Noriaki Tatsumi | 31 Aug 03:42 2015

Cassandra Migration Tool

Hello,

I built a simple and lightweight migration tool for the Apache Cassandra database, based on Axel Fontaine's Flyway project. Cassandra Migration works just like Flyway. Plain CQL and Java-based migrations are supported. The Java migration interface provides a DataStax Java Driver session.

I'm aware of the existence of mutagen-cassandra, but I wanted a lighter tool that uses DataStax's Java driver without Netflix's Astyanax wrapper client. I also observed that mutagen-cassandra has not been updated in 2 years. I'm writing to let the Cassandra user community know about the new tool, in the hope that it will be useful to others and that I will get feedback on improvements and bugs.

Please check it out!

Thanks,
Noriaki
ibrahim El-sanosi | 30 Aug 23:09 2015

Is ZooKeeper still used in Cassandra?

Hi folks,

I read the Cassandra white paper and came across a passage that says "Cassandra system elects a leader amongst its nodes using a system called Zookeeper[13]. All nodes on joining the cluster contact the leader who tells them for what ranges they are replicas for and leader makes a concerted effort to maintain the invariant that no node is responsible for more than N-1 ranges in the ring. The metadata about the ranges a node is responsible is cached locally at each node and in a fault-tolerant manner inside Zookeeper - this way a node that crashes and comes back up knows what ranges it was responsible for."

Does Cassandra still use ZooKeeper? If yes, can you refer me to any related article?

Regards,

Ibrahim
Tony Anecito | 28 Aug 22:58 2015

Upgrade from 2.1.0 to 2.1.9

Hi All,
It has been a while since I upgraded, and I wanted to know what the steps are to upgrade from 2.1.0 to 2.1.9. I also want to know if I need to upgrade my Java database driver.

Thanks,
-Tony
sai krishnam raju potturi | 28 Aug 20:32 2015

Re : Decommissioned node appears in logs, and is sometimes marked as "UNREACHABLE" in `nodetool describecluster`

hi;
We decommissioned nodes in a datacenter a while back. Those nodes keep showing up in the logs, and are also sometimes marked as UNREACHABLE when `nodetool describecluster` is run.

However, these nodes do not show up in `nodetool status` or `nodetool ring`.

Below are a couple lines from the logs.

2015-08-27 04:38:16,180 [GossipStage:473] INFO Gossiper InetAddress /10.0.0.1 is now DOWN
2015-08-27 04:38:16,183 [GossipStage:473] INFO StorageService Removing tokens [85070591730234615865843651857942052865] for /10.0.0.1

thanks
Sai 

sai krishnam raju potturi | 28 Aug 20:12 2015

Fwd: Re : Restoring nodes in a new datacenter, from snapshots in an existing datacenter


hi;
We have a Cassandra cluster with vnodes spanning 3 datacenters. We back up the snapshots from one datacenter.
In a doomsday scenario, we want to restore a downed datacenter with snapshots from another datacenter. We have the same number of nodes in each datacenter.

1 : We know it requires copying the snapshots and their corresponding token ranges to the nodes in the new datacenter, and running nodetool refresh.

2 : The question is: we will now have 2 datacenters with the exact same token ranges. Will that cause any problems?

DC1 :
Node-1 : token1 ..... token10
Node-2 : token11 ..... token20
Node-3 : token21 ..... token30
Node-4 : token31 ..... token40

DC2 :
Node-1 : token1 ..... token10
Node-2 : token11 ..... token20
Node-3 : token21 ..... token30
Node-4 : token31 ..... token40

thanks
Sai 



Jake Luciani | 28 Aug 19:06 2015

[RELEASE] Apache Cassandra 2.1.9 released

The Cassandra team is pleased to announce the release of Apache Cassandra
version 2.1.9.

Apache Cassandra is a fully distributed database. It is the right choice
when you need scalability and high availability without compromising
performance.


Downloads of source and binary distributions are listed in our download
section:


This version is a bug fix release[1] on the 2.1 series. As always, please pay
attention to the release notes[2] and let us know[3] if you encounter
any problems.

Enjoy!

[1]: http://goo.gl/xnYwFa (CHANGES.txt)
[2]: http://goo.gl/QDqPhN (NEWS.txt)

Cyril Scetbon | 28 Aug 18:49 2015

ccm issue

Hi guys,

I have run into some issues with ccm and the unit tests in java-driver. Here is what I see:

tail -f /tmp/1440780247703-0/test/node5/logs/system.log
 INFO [STREAM-IN-/127.0.1.3] 2015-08-28 16:45:06,009 StreamResultFuture.java (line 220) [Stream #22d9e9f0-4da4-11e5-9409-5d8a0f12fefd] All sessions completed
 INFO [main] 2015-08-28 16:45:06,009 StorageService.java (line 1014) Bootstrap completed! for the tokens [8907077543698545973]
 INFO [main] 2015-08-28 16:45:06,010 ColumnFamilyStore.java (line 785) Enqueuing flush of Memtable-local@1738175520(84/840 serialized/live bytes, 8 ops)
 INFO [FlushWriter:1] 2015-08-28 16:45:06,013 Memtable.java (line 331) Writing Memtable-local@1738175520(84/840 serialized/live bytes, 8 ops)
 INFO [FlushWriter:1] 2015-08-28 16:45:06,072 Memtable.java (line 371) Completed flushing /tmp/1440780247703-0/test/node5/data/system/local/system-local-jb-6-Data.db (117 bytes) for commitlog position ReplayPosition(segmentId=1440780271059, position=143914)
 INFO [main] 2015-08-28 16:45:06,074 ColumnFamilyStore.java (line 785) Enqueuing flush of Memtable-local@1171696270(50/500 serialized/live bytes, 4 ops)
 INFO [FlushWriter:1] 2015-08-28 16:45:06,074 Memtable.java (line 331) Writing Memtable-local@1171696270(50/500 serialized/live bytes, 4 ops)
 INFO [FlushWriter:1] 2015-08-28 16:45:06,118 Memtable.java (line 371) Completed flushing /tmp/1440780247703-0/test/node5/data/system/local/system-local-jb-7-Data.db (97 bytes) for commitlog position ReplayPosition(segmentId=1440780271059, position=144080)
 INFO [main] 2015-08-28 16:45:06,122 StorageService.java (line 1499) Node /127.0.1.5 state jump to normal
 INFO [main] 2015-08-28 16:45:06,124 CassandraDaemon.java (line 518) Waiting for gossip to settle before accepting client requests...
 INFO [main] 2015-08-28 16:45:14,125 CassandraDaemon.java (line 550) No gossip backlog; proceeding
 INFO [main] 2015-08-28 16:45:14,187 Server.java (line 155) Starting listening for CQL clients on /127.0.1.5:9042...
 INFO [main] 2015-08-28 16:45:14,224 ThriftServer.java (line 99) Using TFramedTransport with a max frame size of 15728640 bytes.
 INFO [main] 2015-08-28 16:45:14,225 ThriftServer.java (line 118) Binding thrift service to /127.0.1.5:9160
 INFO [main] 2015-08-28 16:45:14,233 TServerCustomFactory.java (line 47) Using synchronous/threadpool thrift server on 127.0.1.5 : 9160
 INFO [Thread-10] 2015-08-28 16:45:14,233 ThriftServer.java (line 135) Listening for thrift clients...

However ccm doesn't see that node5 is running and listening :

    0      [Scheduled Tasks-0] INFO  com.datastax.driver.core.Cluster  - New Cassandra host /127.0.1.5:9042 added
    53833  [main] INFO  com.datastax.driver.core.TestUtils  - 127.0.1.5 is not UP after 60s
    69528  [main] INFO  com.datastax.driver.core.CCMBridge  - Error during tests, kept C* logs in /tmp/1440780247703-0

But at the same time I can see that node5 is running and I can also connect to it :

# netstat -lnt|grep 9042
tcp        0      0 127.0.1.5:9042          0.0.0.0:*               LISTEN
tcp        0      0 127.0.1.3:9042          0.0.0.0:*               LISTEN
tcp        0      0 127.0.1.4:9042          0.0.0.0:*               LISTEN
tcp        0      0 127.0.1.2:9042          0.0.0.0:*               LISTEN
tcp        0      0 127.0.1.1:9042          0.0.0.0:*               LISTEN
root@ip-10-0-1-97:~# nc 127.0.1.5 9042

After the error, ccm shuts all nodes down to end the unit tests:

INFO [GossipStage:1] 2015-08-28 16:45:31,864 Gossiper.java (line 863) InetAddress /127.0.1.1 is now DOWN
INFO [GossipStage:1] 2015-08-28 16:45:34,989 Gossiper.java (line 863) InetAddress /127.0.1.3 is now DOWN
INFO [GossipStage:1] 2015-08-28 16:45:38,087 Gossiper.java (line 863) InetAddress /127.0.1.2 is now DOWN
INFO [StorageServiceShutdownHook] 2015-08-28 16:45:39,181 ThriftServer.java (line 141) Stop listening to thrift clients
INFO [StorageServiceShutdownHook] 2015-08-28 16:45:39,200 Server.java (line 181) Stop listening for CQL clients

Any ideas? Any known issues?
Tommy Stendahl | 28 Aug 13:14 2015

TTL question

Hi,

I did a small test using TTL but I didn't get the result I expected.

I did this in cqlsh:

cqlsh> create TABLE foo.bar ( key int, cluster int, col int, PRIMARY KEY (key, cluster)) ;
cqlsh> INSERT INTO foo.bar (key, cluster ) VALUES ( 1,1 );
cqlsh> SELECT * FROM foo.bar ;

  key | cluster | col
-----+---------+------
    1 |       1 | null

(1 rows)
cqlsh> INSERT INTO foo.bar (key, cluster, col ) VALUES ( 1,1,1 ) USING TTL 10;
cqlsh> SELECT * FROM foo.bar ;

  key | cluster | col
-----+---------+-----
    1 |       1 |   1

(1 rows)

<wait for TTL to expire>

cqlsh> SELECT * FROM foo.bar ;

  key | cluster | col
-----+---------+-----

(0 rows)

Is this really correct?
I expected the result from the last select to be:

  key | cluster | col
-----+---------+------
    1 |       1 | null

(1 rows)

Regards,
Tommy

Asit KAUSHIK | 28 Aug 12:07 2015

Deployment of .net application on production is erroring out

Hi All,

Please excuse my limited knowledge. We have an application in .NET and the backend database is Cassandra. We have deployed our application into production, which is behind the firewall. We have opened port 9042 from our web server to the Cassandra cluster, but we are still getting the error below:

INFO  [SharedPool-Worker-1] 2015-08-27 11:07:20,679 Message.java:532 - Unexpected exception during request; channel = [id: 0x12af6143, /192.168.16.198:2159 => /10.253.2.53:9042]
java.io.IOException: Error while read(...): Connection timed out
at io.netty.channel.epoll.Native.readAddress(Native Method) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.doReadBytes(EpollSocketChannel.java:675) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at io.netty.channel.epoll.EpollSocketChannel$EpollSocketUnsafe.epollInReady(EpollSocketChannel.java:714) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:326) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:264) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) ~[netty-all-4.0.23.Final.jar:4.0.23.Final]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65]


We have a 5-node cluster and have added static routing to all the nodes for port 9042.

Do we need to open some more ports, given that the connection is established but then times out?

Early help would be highly appreciated.

Regards
Asit 



