Big Bear | 21 Apr 05:07 2015

Bootstrap performance.


Dikang Gu | 21 Apr 03:08 2015

Bootstrap performance.

Hi guys,

We have a 100+ node cluster; each node has about 400G of data and runs on a flash disk. We are running 2.1.2.

When I bring a new node into the cluster, it introduces significant load to the cluster. On the new node, CPU usage is 100%, but disk write I/O is only around 50MB/s, even though we have a 10G network.

Does it sound normal to you?

Here are some iostat and vmstat metrics:
==== iostat ====
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          88.52    3.99    4.11    0.00    0.00    3.38

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sda               1.00         0.00         0.04          0          0
sdb             156.50         0.00        55.62          0        1

==== vmstat =====
138  0      0 86781912 438780 101523368    0    0     0 31893 264496 247316 95  4  1  0  0      2015-04-21 01:04:01 UTC
147  0      0 86562400 438780 101607248    0    0     0 90510 456635 245849 91  5  4  0  0      2015-04-21 01:04:03 UTC
143  0      0 86341168 438780 101692224    0    0     0 32392 284495 273656 92  4  4  0  0      2015-04-21 01:04:05 UTC

Thanks.
--
Dikang
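Bootstrap behaviour like this can usually be inspected and tuned live with nodetool; a sketch (the throughput numbers are illustrative, and these commands need a running cluster):

```shell
# Watch what the joining node is streaming and from whom
nodetool netstats

# Streaming is throttled by stream_throughput_outbound_megabits_per_sec
# (default 200 Mb/s in 2.1); it can be raised at runtime on the source nodes:
nodetool setstreamthroughput 400

# If compaction is consuming the CPU on the new node, unthrottle it
# (0 = unlimited) and compare:
nodetool setcompactionthroughput 0
```

If CPU stays pegged with streaming well below the network's capacity, the bottleneck is likely per-stream work (decompression, checksumming, rebuilding secondary indexes) rather than disk or network.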

Matthew Johnson | 20 Apr 18:46 2015

Connecting to Cassandra cluster in AWS from local network

Hi all,


I have set up a Cassandra cluster with 2.1.4 on some existing AWS boxes, just as a POC. Cassandra servers connect to each other over their internal AWS IP addresses (172.x.x.x) aliased in /etc/hosts as sales1, sales2 and sales3.


I connect to it from my local dev environment using the seed's external NAT address (54.x.x.x), aliased in my Windows hosts file as sales3 (my seed).


When I try to connect, it connects fine and can retrieve some data (I have very limited amounts of data in there, but it seems to retrieve OK). However, I also get lots of stacktraces in my log where my dev environment is trying to connect to Cassandra on the internal IP (presumably the Cassandra seed node tells my dev env where to look):


INFO  2015-04-20 16:34:14,808 [CASSANDRA-CLIENT] {main} Cluster - New Cassandra host sales3/54.x.x.142:9042 added

INFO  2015-04-20 16:34:14,808 [CASSANDRA-CLIENT] {main} Cluster - New Cassandra host /172.x.x.237:9042 added

INFO  2015-04-20 16:34:14,808 [CASSANDRA-CLIENT] {main} Cluster - New Cassandra host /172.x.x.170:9042 added

Connected to cluster: Test Cluster

Datatacenter: datacenter1; Host: /172.x.x.170; Rack: rack1

Datatacenter: datacenter1; Host: sales3/54.x.x.142; Rack: rack1

Datatacenter: datacenter1; Host: /172.x.x.237; Rack: rack1

DEBUG 2015-04-20 16:34:14,901 [CASSANDRA-CLIENT] {Cassandra Java Driver worker-0} Connection - Connection[sales3/54.x.x.142:9042-2, inFlight=0, closed=false] Transport initialized and ready

DEBUG 2015-04-20 16:34:14,901 [CASSANDRA-CLIENT] {Cassandra Java Driver worker-0} Session - Added connection pool for sales3/54.x.x.142:9042

DEBUG 2015-04-20 16:34:19,850 [CASSANDRA-CLIENT] {Cassandra Java Driver worker-1} Connection - Connection[/172.x.x.237:9042-1, inFlight=0, closed=false] Error connecting to /172.x.x.237:9042 (connection timed out: /172.x.x.237:9042)

DEBUG 2015-04-20 16:34:19,850 [CASSANDRA-CLIENT] {Cassandra Java Driver worker-1} Connection - Defuncting connection to /172.x.x.237:9042

com.datastax.driver.core.TransportException: [/172.x.x.237:9042] Cannot connect


Does anyone have any experience with connecting to AWS clusters from dev machines? How have you set up your aliases to get around this issue?


Current setup in sales3 (seed node) cassandra.yaml:


- seeds: "sales3"

listen_address: sales3

rpc_address: sales3


Current setup in other nodes (eg sales2) cassandra.yaml:


- seeds: "sales3"

listen_address: sales2

rpc_address: sales2
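One approach that might avoid the driver trying the private IPs (a sketch, assuming Cassandra 2.1's `broadcast_rpc_address` option and that each node has a stable public address) is to bind RPC on the private interface but advertise the public one:

```yaml
# cassandra.yaml on each node (sketch; substitute each node's real addresses)
listen_address: <private 172.x.x.x>        # inter-node traffic stays internal
rpc_address: <private 172.x.x.x>           # bind the native port internally
broadcast_rpc_address: <public 54.x.x.x>   # address handed to client drivers
```

With this, the system.peers table hands clients the 54.x addresses instead of the 172.x ones. The alternative is to leave the yaml as-is and install an address translator on the client side of the DataStax driver.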


Thanks!

Matt


shahab | 20 Apr 18:28 2015

Getting " ParNew GC in ... CMS Old Gen ... " in logs

Hi,

I keep getting the following line in the Cassandra logs, apparently something related to garbage collection. I guess this is one of the reasons why I do not get any response (I get a time-out) when I query a large volume of data?

 ParNew GC in 248ms.  CMS Old Gen: 453244264 -> 570471312; Par Eden Space: 167712624 -> 0; Par Survivor Space: 0 -> 20970080

Is the above line an indication of something that needs to be fixed in the system? How can I resolve this?


best,
/Shahab
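For context, that log line is informational rather than an error: it says one young-generation (ParNew) collection took 248ms and promoted roughly 112MB into the CMS old gen. The arithmetic can be checked directly (a throwaway sketch):

```shell
# Decode the quoted GC line: how much was promoted into the old gen
# during that single 248ms ParNew pause?
line='ParNew GC in 248ms.  CMS Old Gen: 453244264 -> 570471312; Par Eden Space: 167712624 -> 0; Par Survivor Space: 0 -> 20970080'
promoted=$(echo "$line" | awk -F'CMS Old Gen: ' '
    {split($2, a, " -> "); split(a[2], b, ";"); printf "promoted %.0f MB", (b[1] - a[1]) / 1048576}')
echo "$promoted"   # promoted 112 MB
```

Frequent pauses of that size under load point to promotion pressure rather than a bug by themselves; the time-outs are more plausibly caused by the queries themselves reading large volumes of data.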

Marko Asplund | 20 Apr 17:21 2015

Cassandra based web app benchmark

Hi,

TechEmpower Web Framework Benchmarks (https://www.techempower.com/benchmarks/) is a collaborative effort for measuring the performance of a large number of contemporary web development platforms. The benchmarking and test implementation code is published as open source.

I've contributed a test implementation that uses Apache Cassandra for data storage and is based on the following technology stack:
* Java
* Resin app server + Servlet 3 with asynchronous processing
* Apache Cassandra database (v2.0.12)

TFB Round 10 results are expected to be released in the near future, with results from the Cassandra based test implementation included.

Now that the initial test implementation has been merged as part of the project codebase, I'd like to solicit feedback from the Cassandra user and developer community on best practices, especially with respect to performance, in the hope that the test implementation can get the best performance out of Cassandra in future benchmark rounds.

Any review comments and pull requests would be welcome. The code can be found on GitHub:


More info on the benchmark project, as well as the Cassandra based test implementation can be found here:

thanks,

marko
Anuj Wadehra | 20 Apr 16:23 2015

Handle Write Heavy Loads in Cassandra 2.0.3

Hi,
 
Recently, we discovered that millions of mutations were getting dropped on our cluster. Eventually, we solved this problem by increasing the value of memtable_flush_writers from 1 to 3. We usually write 3 CFs simultaneously, and one of them has 4 secondary indexes.
 
New changes also include:
concurrent_compactors: 12 (earlier it was default)
compaction_throughput_mb_per_sec: 32(earlier it was default)
in_memory_compaction_limit_in_mb: 400 (earlier it was the default, 64)
memtable_flush_writers: 3 (earlier 1)
 
After making the above changes, our write-heavy workload scenarios started giving "promotion failed" exceptions in GC logs.
 
We have done JVM tuning and Cassandra config changes to solve this:
 
MAX_HEAP_SIZE="12G" (increased heap from 8G to reduce fragmentation)
HEAP_NEWSIZE="3G"
 
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=2" (We observed that even at SurvivorRatio=4, our survivor space was getting 100% utilized under heavy write load and we thought that minor collections were directly promoting objects to Tenured generation)
 
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=20" (Lots of objects were moving from Eden to Tenured on each minor collection... maybe related to medium-lived objects from Memtables and compactions, as suggested by a heap dump)
 
JVM_OPTS="$JVM_OPTS -XX:ConcGCThreads=20"
JVM_OPTS="$JVM_OPTS -XX:+UnlockDiagnosticVMOptions"
JVM_OPTS="$JVM_OPTS -XX:+UseGCTaskAffinity"
JVM_OPTS="$JVM_OPTS -XX:+BindGCTaskThreadsToCPUs"
JVM_OPTS="$JVM_OPTS -XX:ParGCCardsPerStrideChunk=32768"
JVM_OPTS="$JVM_OPTS -XX:+CMSScavengeBeforeRemark"
JVM_OPTS="$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=30000"
JVM_OPTS="$JVM_OPTS -XX:CMSWaitDuration=2000" # though this is the default value
JVM_OPTS="$JVM_OPTS -XX:+CMSEdenChunksRecordAlways"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelInitialMarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:-UseBiasedLocking"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=70" (we reduced the value to avoid concurrent mode failures)
 
Cassandra config:
compaction_throughput_mb_per_sec: 24
memtable_total_space_in_mb: 1000 (to make memtable flushes frequent; the default is 1/4 of the heap, which creates more long-lived objects)
 
Questions:
1. Why did increasing memtable_flush_writers cause promotion failures in the JVM? Does more memtable_flush_writers mean more memtables in memory?
2. Objects are still getting promoted to the Tenured space at a high rate. CMS runs on the old gen every 4-5 minutes under heavy write load. Around 750+ minor collections of up to 300ms happened in 45 minutes. Do you see any problems with the new JVM tuning and Cassandra config? Does the justification given for those changes sound logical? Any suggestions?
3. What is the best practice for reducing heap fragmentation/promotion failure when allocation and promotion rates are high?
 
Thanks
Anuj
 
 


Or Sher | 20 Apr 11:02 2015

Adding nodes to existing cluster

Hi all,
In the near future I'll need to add more than 10 nodes to a 2.0.9
cluster (using vnodes).
I read this documentation on datastax website:
http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_add_node_to_cluster_t.html

In one point it says:
"If you are using racks, you can safely bootstrap two nodes at a time
when both nodes are on the same rack."

And in another it says:
"Start Cassandra on each new node. Allow two minutes between node
initializations. You can monitor the startup and data streaming
process using nodetool netstats."

We're not using a rack configuration, and from reading this documentation I'm not really sure whether it is safe for us to bootstrap all the nodes together (with two minutes between each). I really hate the thought of doing it one by one; I assume it will take more than 6 hours per node.

What do you say?
-- 
Or Sher
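A sketch of the conservative one-at-a-time approach (hostnames and the service command are placeholders; `nodetool netstats` reports the node's mode, which returns to NORMAL once streaming finishes):

```shell
# Hypothetical rolling join: start each new node only after the previous
# one has finished streaming and settled into NORMAL mode.
for host in newnode01 newnode02 newnode03; do
    ssh "$host" 'sudo service cassandra start'
    sleep 120                      # let gossip settle before polling
    until nodetool -h "$host" netstats 2>/dev/null | grep -q 'Mode: NORMAL'; do
        sleep 60                   # still bootstrapping/streaming
    done
done
```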

Neha Trivedi | 20 Apr 07:38 2015

COPY command to export a table to CSV file

Hello all,

We are getting an OutOfMemoryError on one of the nodes, and that node goes down, when we run the export command to get all the data from a table.


Regards
Neha




ERROR [ReadStage:532074] 2015-04-09 01:04:00,603 CassandraDaemon.java (line 199) Exception in thread Thread[ReadStage:532074,5,main]
java.lang.OutOfMemoryError: Java heap space
        at org.apache.cassandra.io.util.RandomAccessReader.readBytes(RandomAccessReader.java:347)
        at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:392)
        at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:355)
        at org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:124)
        at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:85)
        at org.apache.cassandra.db.Column$1.computeNext(Column.java:75)
        at org.apache.cassandra.db.Column$1.computeNext(Column.java:64)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
        at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:88)
        at org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:37)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
        at org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:82)
        at org.apache.cassandra.db.columniterator.LazyColumnIterator.computeNext(LazyColumnIterator.java:82)
        at org.apache.cassandra.db.columniterator.LazyColumnIterator.computeNext(LazyColumnIterator.java:59)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
        at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:157)
        at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:140)
        at org.apache.cassandra.utils.MergeIterator$OneToOne.computeNext(MergeIterator.java:200)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
        at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:185)
        at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:122)
        at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
        at org.apache.cassandra.db.RowIteratorFactory$2.getReduced(RowIteratorFactory.java:101)
        at org.apache.cassandra.db.RowIteratorFactory$2.getReduced(RowIteratorFactory.java:75)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.consume(MergeIterator.java:115)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:98)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
Jimmy Lin | 20 Apr 01:13 2015

Timeout creating table

hi,
we have some unit tests that run in parallel; they create a tmp keyspace and tables, and then drop them after the tests are done.

From time to time, our create table statement runs into an "All hosts(s) for query failed... Timeout during read" error (from the DataStax driver).

We later turned on tracing and recorded the following. Between the "===" markers below, there is a gap of about 16 seconds between the Native-Transport-Requests thread and the MigrationStage thread.

Any idea what Cassandra was doing during those 16 seconds? We can work around it by increasing our DataStax driver timeout value, but is there a better way to solve this?

thanks



---------------- tracing ----------


5872bf70-e6e2-11e4-823d-93572f3db015 | 58730d97-e6e2-11e4-823d-93572f3db015 |                                                                               Key cache hit for sstable 95588 | 127.0.0.1 |           1592 | Native-Transport-Requests:102
5872bf70-e6e2-11e4-823d-93572f3db015 | 58730d98-e6e2-11e4-823d-93572f3db015 |                                                                   Seeking to partition beginning in data file | 127.0.0.1 |           1593 | Native-Transport-Requests:102
5872bf70-e6e2-11e4-823d-93572f3db015 | 58730d99-e6e2-11e4-823d-93572f3db015 |                                                                    Merging data from memtables and 3 sstables | 127.0.0.1 |           1595 | Native-Transport-Requests:102

=====================
5872bf70-e6e2-11e4-823d-93572f3db015 | 58730d9a-e6e2-11e4-823d-93572f3db015 |                                                                            Read 3 live and 0 tombstoned cells | 127.0.0.1 |           1610 | Native-Transport-Requests:102
5872bf70-e6e2-11e4-823d-93572f3db015 | 62364a40-e6e2-11e4-823d-93572f3db015 |               Executing seq scan across 1 sstables for (min(-9223372036854775808), min(-9223372036854775808)] | 127.0.0.1 |       16381594 |              MigrationStage:1
=====================

5872bf70-e6e2-11e4-823d-93572f3db015 | 62364a41-e6e2-11e4-823d-93572f3db015 |                                                                   Seeking to partition beginning in data file | 127.0.0.1 |       16381782 |              MigrationStage:1
5872bf70-e6e2-11e4-823d-93572f3db015 | 62364a42-e6e2-11e4-823d-93572f3db015 |                                                                            Read 0 live and 0 tombstoned cells | 127.0.0.1 |       16381787 |              MigrationStage:1
5872bf70-e6e2-11e4-823d-93572f3db015 | 62364a43-e6e2-11e4-823d-93572f3db015 |                                                                   Seeking to partition beginning in data file | 127.0.0.1 |       16381789 |              MigrationStage:1
5872bf70-e6e2-11e4-823d-93572f3db015 | 62364a44-e6e2-11e4-823d-93572f3db015 |                                                                            Read 0 live and 0 tombstoned cells | 127.0.0.1 |       16381791 |              MigrationStage:1
5872bf70-e6e2-11e4-823d-93572f3db015 | 62364a45-e6e2-11e4-823d-93572f3db015 |                                                                   Seeking to partition beginning in data file | 127.0.0.1 |       16381792 |              MigrationStage:1
5872bf70-e6e2-11e4-823d-93572f3db015 | 62364a46-e6e2-11e4-823d-93572f3db015 |                                                                            Read 0 live and 0 tombstoned cells | 127.0.0.1 |       16381794 |              MigrationStage:1
.
.
.

Eric Stevens | 19 Apr 15:16 2015

Re: Bootstrapping new node isn't pulling schema from cluster

Is it one of your seed nodes, or does it otherwise have itself as a seed?  A node will not bootstrap if it is in its own seeds list.
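A quick way to check this on the box itself (the config path is a guess; package installs vary):

```shell
# Does this node list itself in its own seeds? (cassandra.yaml location varies)
grep -E '^\s*(- seeds|listen_address|broadcast_address)' /etc/cassandra/cassandra.yaml
```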

On Apr 18, 2015 2:53 PM, "Bill Miller" <bmiller <at> inthinc.com> wrote:
I upgraded a 5 node cluster from 1.2.5 to 1.2.9, ran upgradesstables and installed Oracle Java without issues. Then I tried upgrading one node to 2.0.14, which my Hector client (I need to move off it) didn't like, so I rolled it back to 1.2.9.  Unfortunately I didn't snapshot, so I cleared all of that node's data and attempted to bootstrap it back into the cluster.  When I do that, it sets up the system keyspace and is talking to other nodes, and output.log says "Startup completed! Now serving reads" without any errors.  This is immediately followed by:

java.lang.AssertionError: Unknown keyspace note_qa
at org.apache.cassandra.db.Table.<init>(Table.java:262)

and then lots of errors when it can't find column families:

org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=5213a16b-a648-3cb5-9006-8f6bf9315009
at org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:184)

The other keyspaces/column families are never created.

The other four nodes are running fine and nodetool shows the new node as UP when it's in this state.

I attached the log.  I had server debugging on.


Bill Miller | 18 Apr 23:26 2015

Cassandra 1.2.9 will not start

I tried restarting two nodes that were working, and now I get this:



INFO 15:13:50,296 Initializing system.range_xfers
 INFO 15:13:50,300 Initializing system.schema_keyspaces
 INFO 15:13:50,301 Opening /cassandra/data/system/schema_keyspaces/system-schema_keyspaces-ic-749926 (597 bytes)
 INFO 15:13:50,302 Opening /cassandra/data/system/schema_keyspaces/system-schema_keyspaces-ic-749927 (516 bytes)
 INFO 15:13:50,302 Opening /cassandra/data/system/schema_keyspaces/system-schema_keyspaces-ic-749926 (597 bytes)
 INFO 15:13:50,302 Opening /cassandra/data/system/schema_keyspaces/system-schema_keyspaces-ic-749927 (516 bytes)
java.lang.AssertionError
at org.apache.cassandra.cql3.CFDefinition.<init>(CFDefinition.java:162)
at org.apache.cassandra.config.CFMetaData.updateCfDef(CFMetaData.java:1526)
at org.apache.cassandra.config.CFMetaData.fromSchema(CFMetaData.java:1441)
at org.apache.cassandra.config.KSMetaData.deserializeColumnFamilies(KSMetaData.java:306)
at org.apache.cassandra.config.KSMetaData.fromSchema(KSMetaData.java:287)
at org.apache.cassandra.db.DefsTable.loadFromTable(DefsTable.java:154)
at org.apache.cassandra.config.DatabaseDescriptor.loadSchemas(DatabaseDescriptor.java:563)
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:254)
at org.apache.cassandra.service.CassandraDaemon.init(CassandraDaemon.java:381)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.commons.daemon.support.DaemonLoader.load(DaemonLoader.java:212)
Cannot load daemon
Service exit with a return value of 3

