Nate McCall | 1 Oct 2010 06:07

Re: remove node, causes havoc

There is an underlying issue here with CassandraClientImpl.getKeyspace
calling describe_keyspace outside the context of FailoverOperator. If
the Cassandra node was under load and took too long to respond
(triggering a failure), we might exhaust all the clients in the pool
immediately (as they all make this call on every getKeyspace
invocation).

I'll drop in a short-term fix in 0.6, 0.7 and master. We need to
figure out a longer-term fix. In general I don't think we need this
validation on every call (or it should be configurable, like
testOnBorrow for commons-dbcp and similar).
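
A rough sketch of what a configurable check on borrow could look like. This
is not Hector's pool: ValidatingPool, its validator and the testOnBorrow
flag are hypothetical names, and the validator stands in for whatever round
trip (such as describe_keyspace) is used to check a client.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;

// Hypothetical pool, for illustration only; not part of Hector.
class ValidatingPool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Predicate<T> validator; // e.g. a cheap round trip to the node
    private final boolean testOnBorrow;   // when false, validation is skipped
    int validations = 0;                  // how many times the validator ran

    ValidatingPool(Predicate<T> validator, boolean testOnBorrow) {
        this.validator = validator;
        this.testOnBorrow = testOnBorrow;
    }

    // Hands a client back to the pool.
    void release(T client) {
        idle.push(client);
    }

    // Returns a usable idle client, or null when none is left.
    T borrow() {
        while (!idle.isEmpty()) {
            T client = idle.pop();
            if (!testOnBorrow) {
                return client; // trust the client, no extra call
            }
            validations++;
            if (validator.test(client)) {
                return client;
            }
            // The client failed validation: drop it and try the next one.
        }
        return null;
    }
}
```

With the flag off, borrow() never calls the validator, so a slow node cannot
stall every borrow; with it on, only clients that pass the check are handed
out.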

2010/9/30 Patricio Echagüe <patricioe@...>:
> We call the command method:
>
>   public final OUTPUT execute(CassandraClientPool pool, String[] hosts,
> String keyspace,
>       ConsistencyLevel consistency) throws HectorException {
>     CassandraClient c = pool.borrowClient(hosts);
>     Keyspace ks = c.getKeyspace(keyspace, consistency);
>     try {
>       return execute(ks);
>     } finally {
>       pool.releaseClient(ks.getClient());
>     }
>   }
>
> which calls c.getKeyspace() and it throws an Exception at:
>
> Map<String, Map<String, String>> keyspaceDesc =

gabriele renzi | 1 Oct 2010 11:00

hang client during write

hi everyone,

lately we experienced some issues with some nodes (0.6.5) in our
cluster that caused them to stall for a long time (the usual
compaction mess-up), leaving them unable to respond to calls.

The problem is that apparently connections from hector (0.6.17, but it
happened before) remained hung, without any chance of recovery.
JMX calls were also unable to terminate.

Since the connection did not fail within a reasonable timeout, no
failover happened and when all of the threads in one of our clients
got into this state the whole process hung and we had to restart it.

We are using the basic DAO api (no spring) and writing with
consistency level ONE, so we depend on the global unconfigured pool.
Is there something we could do to avoid situations like this, such as
configuring different timeouts?

Did anyone see anything similar in the past?
Is there a way for hector to detect such situations and consider the
node dead when it is unable to answer within a timeout for a while?

thanks in advance

-- 
blog en: http://www.riffraff.info
blog it: http://riffraff.blogsome.com
work: http://cascaad.com


Markus Kramer | 1 Oct 2010 13:48

Understanding the UnavailableException

Hi,
I know questions about the UnavailableException have been asked in
here before, but I still have problems.

I'm using a simple and fresh three node cluster for testing. Writing
some entries works nicely. However, if I disable one node I get an
UnavailableException. Shouldn't Hector/Cassandra failover to the two
remaining nodes?
I'm using RF=1 and have (hopefully correctly) set CL to the weakest
possible consistency (reads=one, writes=zero).

I'm using Cassandra 0.6.5 and Hector 0.6.0-17.
This is my test code:

--------------------------------------------
@Test
public void testFailover() {
    Cluster cluster = HFactory.getOrCreateCluster("MyCluster",
            "moabit:9160,bramstedt:9160,moers:9160");
    this.keyspace = HFactory.createKeyspace(this.keyspaceId, cluster,
            zeroConsistencyLevel);

    String username;
    String city;

    System.out.println("\n\nStarting a fresh cluster with 3 nodes");

    for (int i = 0; i < 5; i++) {
        System.out.println("\n\n");
        username = "User1_" + i;

AndyC | 1 Oct 2010 14:58

Hector 0.7.0.17 dealing with mixed types of columns

I'm using a Cassandra 0.7 nightly build and a build of Hector off
GitHub.

Can someone point me to the best way of dealing with this problem?

I have a column set that mixes strings and byte arrays, something
like:

Name:String
Address:String
Location: Byte[]
Phone: String
Date: Byte[]

I'm iterating through the returned columns using:

for (HColumn<String, String> column : slice.getColumns()) {
    PostStore pStore = new PostStore();
    String name = column.getName();
    String value = column.getValue();
    System.out.println(name + "\t ==\t" + value);
    pStore.settitle(value);
}

The problem I've got is that I need HColumn to be <String, String> for
Name, Address and Phone but <String, Byte[]> for Location and Date.
What's the best way to deal with this in Hector?
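
One possible approach, sketched without the Hector API: read every value as
raw bytes and decide per column name how to decode it. This assumes the
slice can be fetched with a byte-array value serializer (which class does
that depends on the Hector build and is not shown here). The
Map<String, byte[]> below is a stand-in for the slice's columns, and
MixedColumns is a hypothetical helper.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; the map stands in for the columns of one row.
class MixedColumns {
    // Columns whose values should stay as raw bytes (names from the post).
    static final Set<String> BINARY = new HashSet<>(Arrays.asList("Location", "Date"));

    // Decodes string columns as UTF-8 and passes binary columns through.
    static Map<String, Object> decode(Map<String, byte[]> raw) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> e : raw.entrySet()) {
            if (BINARY.contains(e.getKey())) {
                out.put(e.getKey(), e.getValue());
            } else {
                out.put(e.getKey(), new String(e.getValue(), StandardCharsets.UTF_8));
            }
        }
        return out;
    }
}
```

The point of the design is that the column type parameter stays the same for
the whole slice, and the per-column decision moves into ordinary code.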

Andy


Janne Jalkanen | 1 Oct 2010 15:20

Re: Understanding the UnavailableException


RF=1 means that only one node will hold your data.  If that's the node which happens to be down, then you won't
get the data...

Cassandra divides the keyspace into ranges and distributes them according
to the RF.  So, for example, if you have three servers, each of them gets
allotted (roughly) one third of the keyspace. If you have RF=1, each of the
servers then holds that one third, but no more. If you specify RF=2, then
each server grabs *another* third, so each of them holds its own third, and
a third from some other server. If you specify RF=3, then each of the
servers holds its own third + the thirds from the two other servers, which
essentially means that every single server has a full copy of the entire
keyspace.
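
The thirds described above can be modelled in a few lines. This is a
simplified sketch, not Cassandra code: it assumes node i is the primary
owner of range i and that the extra RF-1 copies go to the next nodes around
the ring. Actual placement depends on the configured replica placement
strategy, and RingSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified ring model: node i owns range i, extra replicas go to the
// following nodes. Real placement depends on the configured strategy.
class RingSketch {
    // Nodes that hold a copy of the given range.
    static List<Integer> replicasFor(int range, int nodes, int rf) {
        List<Integer> replicas = new ArrayList<>();
        for (int k = 0; k < Math.min(rf, nodes); k++) {
            replicas.add((range + k) % nodes);
        }
        return replicas;
    }

    // Ranges a node holds: its own plus the ones replicated to it.
    static List<Integer> rangesHeldBy(int node, int nodes, int rf) {
        List<Integer> held = new ArrayList<>();
        for (int range = 0; range < nodes; range++) {
            if (replicasFor(range, nodes, rf).contains(node)) {
                held.add(range);
            }
        }
        return held;
    }

    public static void main(String[] args) {
        for (int rf = 1; rf <= 3; rf++) {
            System.out.println("RF=" + rf + ": node 0 holds ranges "
                    + rangesHeldBy(0, 3, rf));
        }
    }
}
```

In this model, with three nodes, node 0 holds one range at RF=1, two at RF=2
and all three at RF=3.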

So you need to increase your RF, so that when one node is down, other nodes which still hold the data, can step in.

CL then means a different thing: whenever you are reading, CL=ONE means
that Cassandra will think a value from any of the servers is ok. It might
be stale, or it might not be. Cassandra does not care.  CL=QUORUM means
that Cassandra will ask for a vote from all the machines that hold the
particular key, and then send you the data which has over one half of the
votes.  For example, if RF=3, at least two nodes need to agree what a value
is before it sends it back to the requester.

Note that if you do stuff like have RF=2, and then read with CL=QUORUM, you
will get into trouble, because when one node goes down, then the other node
cannot get a quorum (2/2 = 1, and for quorum you need *over* half of the
votes). So RF=3 is the first RF which protects you against one node going
down if you are doing CL=QUORUM reads. If you read *always* with CL=ONE,
then RF=2 is enough.
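
The arithmetic behind that, as a small sketch (QuorumMath is a hypothetical
helper, not a Cassandra or Hector class): a quorum is the smallest count
that is strictly more than half of RF, and whatever is left over is the
number of replicas that may be down.

```java
// Hypothetical helper illustrating the quorum arithmetic.
class QuorumMath {
    // Smallest number of replicas that is strictly more than half of rf.
    static int quorum(int rf) {
        return rf / 2 + 1;
    }

    // How many replicas may be down while a quorum can still be reached.
    static int toleratedFailures(int rf) {
        return rf - quorum(rf);
    }

    public static void main(String[] args) {
        for (int rf = 1; rf <= 5; rf++) {
            System.out.println("RF=" + rf + " quorum=" + quorum(rf)
                    + " tolerated failures=" + toleratedFailures(rf));
        }
    }
}
```

For RF=2 the quorum is 2, so no replica may be down; for RF=3 the quorum is
still 2, which leaves room for one failure.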

Hope this helps,

/Janne

On 1 Oct 2010, at 14:48, Markus Kramer wrote:

Nate McCall | 1 Oct 2010 16:54

Re: hang client during write

I think this is inherently the same issue as was posted last night -
the describe_keyspace call from CassandraClientImpl#getKeyspace does
not go through failover and will hang the connections if there is a
timeout.

I'm going to roll a fix for this today as I think it's a serious issue
lurking for potentially loaded cassandra nodes.

On Fri, Oct 1, 2010 at 4:00 AM, gabriele renzi <rff.rff@...> wrote:
> hi everyone,
>
> lately we experienced some issues with some nodes (0.6.5) in our
> cluster that caused them to stall for a long time (the usual
> compaction mess-up), leaving them unable to respond to calls.
>
> The problem is that apparently connections from hector (0.6.17, but it
> happened before) remained hung, without any chance of recovery.
> JMX calls were also unable to terminate.
>
> Since the connection did not fail within a reasonable timeout, no
> failover happened and when all of the threads in one of our clients
> got into this state the whole process hung and we had to restart it.
>
> We are using the basic DAO api (no spring) and writing with
> consistency level ONE, so we depend on the global unconfigured pool.
> Is there something we could do to avoid situations like this, such as
> configuring different timeouts?
>
> Did anyone see anything similar in the past?
> Is there a way for hector to detect such situations and consider the

Markus Kramer | 1 Oct 2010 17:31

Re: Understanding the UnavailableException

Thanks a lot for the clarification!

It makes sense to me that reads might fail if I'm using RF=1.
So if I don't care at all about the consistency of my reads I'll have
to catch and ignore the UnavailableException?

However, some writes also fail. Since I'm writing with a
ConsistencyLevel of ZERO, I thought Cassandra's hinted handoff
feature would allow me to write to any node regardless of where the
data actually belongs (even if that node doesn't keep a replica)?

Cheers, Markus

On Oct 1, 3:20 pm, Janne Jalkanen <jalka...@...> wrote:
> RF=1 means that only one node will hold your data.  If that's the node
> which happens to be down, then you won't get the data...
>
> Cassandra divides the keyspace and distributes them according to the RF.
> So, for example, if you have three servers, each of them gets allotted
> (roughly) one third of the keyspace. If you have RF=1, each of the
> servers then holds that one third, but no more. If you specify RF=2, then
> each server grabs *another* third, so each of them holds their own third,
> and a third from some other server. If you specify RF=3, then each of the
> servers holds its own third + the thirds from the two other servers,
> which essentially means that every single server has a full copy of the
> entire keyspace.
>
> So you need to increase your RF, so that when one node is down, other
> nodes which still hold the data, can step in.
>
> CL then means a different thing: whenever you are reading, CL=ONE means
> that Cassandra will think a value from any of the servers is ok. It might
> be stale, or it might not be. Cassandra does not care.  CL=QUORUM means
> that Cassandra will ask a vote from all the machines that hold the
> particular key, and then sends you

Victor Kabdebon | 1 Oct 2010 17:46

Performance questions

Hello everybody,

I don't know if this question was already raised, but I need to ask it anyway.
I have looked at the API v2 and I must say that it looks far handier than before: everything is simpler, the wrappers are great, and the exceptions are greatly improved.

But (there is always a but) I run my applications that use hector on very weak servers (Celeron D210 and D220), and I was wondering if you had any idea of the performance drop or increase between API v1 and API v2? Is the difference negligible or significant?

Thank you, and sorry if the question was already asked.
Best Regards

Janne Jalkanen | 1 Oct 2010 17:58

Re: Understanding the UnavailableException


Consistency and availability are two different things: Consistency means getting the same stuff that you
wrote. Availability means getting *anything*.

Say you first write row = "Markus", column = "emailaddress", value =
"foo@..." with CL=ONE. Cassandra will return once at least one node which
would handle this row responds that it has stored the data. If your RF is 1,
then there's exactly one node that would receive this data. If you have
RF=3, a response from any of these three nodes would work.  (CL=ZERO means
that Cassandra won't wait for anyone, so the write might fail or not.)

Then you write immediately row = "Markus", column = "emailaddress", value = "newaddress@...".

Then you read row = "Markus", column = "emailaddress".

If your data is available, you'll get the value "foo@..." or
"newaddress@...", depending on your CL settings and whether the data has
replicated across replicas and so on.  Consistency means that you would get
newaddress@..., because that's what you wrote last. By increasing your CL
you increase the chances of actually getting the latest data.

If all the nodes that handle "Markus" are down, then you will receive an error and the data is not available.

If you get an UnavailableException, your data is not available. It might be consistent, but you can't get to
it, because there aren't enough nodes for the CL you chose to respond.

Writes and reads can be sent to *any* node, but Cassandra will internally
reroute the request to the node which actually should contain the data. So
if you write with CL=ZERO, a node will buffer up the write, but if the
actual recipient node is down, the write may never reach the disk. Hence
it's a good idea to write at least with CL=ONE to be sure that at least one
node was able to store the data. Writing with CL=QUORUM then means
that at least (RF/2+1) nodes acknowledged storing the data.
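
A rough model of how the chosen level relates to availability, assuming the
request is rejected with an UnavailableException when fewer replicas of the
key are alive than the level needs. AvailabilitySketch and its Level enum
are hypothetical names and cover only ONE, QUORUM and ALL; this is not
Cassandra's implementation.

```java
// Hypothetical model of consistency levels versus live replicas.
class AvailabilitySketch {
    enum Level { ONE, QUORUM, ALL }

    // Replica acknowledgements a level waits for, given the replication factor.
    static int required(Level cl, int rf) {
        switch (cl) {
            case ONE:
                return 1;
            case QUORUM:
                return rf / 2 + 1;
            default:
                return rf; // ALL
        }
    }

    // True when enough replicas of the key are alive for the chosen level.
    static boolean available(Level cl, int rf, int liveReplicas) {
        return liveReplicas >= required(cl, rf);
    }
}
```

In this model, RF=1 with the only replica down is unavailable even at ONE,
while RF=3 with one replica down still satisfies both ONE and QUORUM.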

(Also, writing a lot of data with CL=ZERO is a great way to eat up all memory  :-P)

/Janne

On Oct 1, 2010, at 18:31 , Markus Kramer wrote:

> Thanks a lot for the clarification!
> 
> It makes sense to me that reads might fail if I'm using RF=1.
> So if I don't care at all about the consistency of my reads I'll have
> to catch and ignore the UnavailableException?
> 
> However, some writes also fail. Since I'm writing with a
> ConsistencyLevel of ZERO, I thought Cassandra's hinted handoff
> feature would allow me to write to any node regardless of where the
> data actually belongs (even if that node doesn't keep a replica)?
> 
> Cheers, Markus
> 
> 
> On Oct 1, 3:20 pm, Janne Jalkanen <jalka...@...> wrote:
>> RF=1 means that only one node will hold your data.  If that's the node
>> which happens to be down, then you won't get the data...
>>
>> Cassandra divides the keyspace and distributes them according to the RF.
>> So, for example, if you have three servers, each of them gets allotted
>> (roughly) one third of the keyspace. If you have RF=1, each of the
>> servers then holds that one third, but no more. If you specify RF=2,
>> then each server grabs *another* third, so each of them holds their own
>> third, and a third from some other server. If you specify RF=3, then
>> each of the servers holds its own third + the thirds from the two other
>> servers, which essentially means that every single server has a full
>> copy of the entire keyspace.
>>
>> So you need to increase your RF, so that when one node is down, other
>> nodes which still hold the data, can step in.
>>
>> CL then means a different thing: whenever you are reading, CL=ONE means
>> that Cassandra will think a value from any of the servers is ok. It
>> might be stale, or it might not be. Cassandra does not care.  CL=QUORUM
>> means that Cassandra will ask a vote from all the machines that hold the
>> particular key, and then sends you the data which has over one half of
>> the votes.  For example, if RF=3, at least two nodes need to agree what
>> a value is before it sends it back to the requester.
>>
>> Note that if you do stuff like have RF=2, and then read with CL=QUORUM,
>> you will get into trouble, because when one node goes down, then the
>> other node cannot get a quorum (2/2 = 1, and for quorum you need *over*
>> half of the votes). So RF=3 is the first RF which protects you against
>> one node going down if you are doing CL=QUORUM reads. If you read
>> *always* with CL=1, then RF=2 is enough.
>> 
>> Hope this helps,
>> 
>> /Janne
>> 
>> On 1 Oct 2010, at 14:48, Markus Kramer wrote:
>> 
>>> Hi,
>>> I know questions about the UnavailableException have been asked in
>>> here before, but I still have problems.
>> 
>>> I'm using a simple and fresh three node cluster for testing. Writing
>>> some entries works nicely. However, if I disable one node I get an
>>> UnavailableException. Shouldn't Hector/Cassandra failover to the two
>>> remaining nodes?
>>> I'm using RF=1 and have (hopefully correctly) set CL to the weakest
>>> possible consistency (reads=one, writes=zero).
>> 
>>> I'm using Cassandra 0.6.5 and Hector 0.6.0-17.
>>> This is my test code:
>> 
>>> --------------------------------------------
>>> @Test
>>> public void testFailover() {
>>>    Cluster cluster = HFactory.getOrCreateCluster("MyCluster",
>>>            "moabit:9160,bramstedt:9160,moers:9160");
>>>    this.keyspace = HFactory.createKeyspace(this.keyspaceId, cluster,
>>>            zeroConsistencyLevel);
>> 
>>>    String username;
>>>    String city;
>> 
>>>    System.out.println("\n\nStarting a fresh cluster with 3 nodes");
>> 
>>>    for (int i = 0; i < 5; i++) {
>>>        System.out.println("\n\n");
>>>        username = "User1_" + i;
>>>        city = "Berlin";
>>>        setCity(username, city);
>>>        Assert.assertEquals(city, getCity(username));
>>>    }
>> 
>>>    System.out.println("\n\nManually disable node 'moers'");
>> 
>>>    for (int i = 0; i < 5; i++) {
>>>        System.out.println("\n\n");
>>>        username = "User2_" + i;
>>>        city = "New York";
>>>        setCity(username, city);
>>>        Assert.assertEquals(city, getCity(username));
>>>    }
>>> }
>> 
>>> private void setCity(String key, String value) {
>>>    Mutator m = HFactory.createMutator(this.keyspace);
>>>    m.insert(key, this.columnFamily,
>>> HFactory.createColumn(this.columnId, value, this.ss, this.ss));
>>> }
>> 
>>> private String getCity(String key) {
>>>    ColumnQuery<String, String> q =
>>> HFactory.createColumnQuery(keyspace, ss, ss);
>>>    QueryResult<HColumn<String, String>> r =
>>> q.setKey(key).setName(columnId).setColumnFamily(
>>>            columnFamily).execute();
>> 
>>>    HColumn<String, String> c = r.get();
>>>    return c.getValue();
>>> }
>>> --------------------------------------------
>> 
>>> After I've disabled one node I immediately get the
>>> UnavailableException.
>>> Regards, Markus

Courtney | 1 Oct 2010 19:19

Get range slice without tombstone keys in the result

I'm trying to get hector 6.0.17 to return a range slice without
including keys that were previously marked for deletion.
When I couldn't get it right I had a look at
http://github.com/zznate/hector-examples/blob/master/src/main/java/com/riptano/cassandra/hector/example/PaginateGetRangeSlices.java

but the examples seem to be for a different version of hector.

Anyway, I was trying to replace everything that caused an error and
managed to get inserts and reads done, but I'm wondering: is there a
way to exclude keys that have been marked deleted with
mutator.addDeletion or mutator.delete?

Essentially what I'm doing is going through all the rows, using the
data and deleting what's used. Paging through the results by using the
last key as the first is an idea, but data can be accessed from
different machines and in different runs, so the last key used is not
always available.

