Empty cqlsh cells vs. null
2014-10-23 06:37:58 GMT
I am hoping to get the word out that we are looking for a Cassandra developer for a full-time position at our office in Scottsdale, AZ. Please let me know what I can do to help spread the word.
Jeremiah Anderson | Sr. Recruiter
Choice Hotels International, Inc. (NYSE: CHH) | www.choicehotels.com
6811 E Mayo Blvd, Ste 100, Phoenix, AZ 85054
Phone: 602.494.6648 | Email: jeremiah_anderson <at> choicehotels.com
But do you mean that inserting large columns (say, text of 20-30 KB) is potentially problematic in Cassandra?
What should I do if I want large columns?
A question about the read path in Cassandra: if a partition/row is in the memtable and is being actively written to by other clients, will a read of that partition also have to hit SSTables on disk (or in the page cache), or can it be serviced entirely from the memtable?
If you select all columns (e.g., “select * from ….”) then I can imagine that Cassandra would need to merge whatever columns are in the memtable with what’s in SSTables on disk.
But if you select a single column (e.g., “select Name from …. where id = ….”) and that column is in the memtable, I’d hope Cassandra could skip checking the disk. Can it do this optimization?
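To make the merge in the question above concrete, here is a toy sketch (not Cassandra's actual code) of the per-partition reconciliation a read has to do when a row's columns are spread across the memtable and one or more SSTables; newer versions shadow older ones, with the memtable holding the newest writes:

```python
def merged_read(memtable_row, sstable_rows):
    """Toy model of read-path reconciliation: overlay row versions,
    oldest SSTable first, memtable last, so newer writes win."""
    result = {}
    for row in sstable_rows:      # oldest -> newest on-disk versions
        result.update(row)
    result.update(memtable_row)   # memtable holds the most recent writes
    return result

# 'name' was rewritten in the memtable; 'email' exists only on disk,
# so a full-row read must consult both sources.
merged_read({"name": "Ada"}, [{"name": "old", "email": "a@x"}])
# -> {'name': 'Ada', 'email': 'a@x'}
```

This also illustrates why even a single-column read can't always stop at the memtable: without extra bookkeeping, the reader can't know whether the memtable's copy of a column is the complete, newest version or whether an SSTable holds relevant data (or a deletion) for it.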
Donald A. Smith | Senior Software Engineer
P: 425.201.3900 x 3866
C: (206) 819-5965
F: (646) 443-2333
donalds <at> AudienceScience.com
I’m working on an application using a Cassandra (2.1.0) cluster where
- our entire dataset is around 22GB
- each node has 48GB of memory but only a single (mechanical) hard disk
- in normal operation we have a low level of writes and no reads
- very occasionally we need to read rows very fast (>1.5K rows/second), and only read each row once.
When we try to read the rows it takes up to five minutes before Cassandra is able to keep up. The problem seems to be that it takes a while to get the data into the page cache, and until then Cassandra can’t retrieve the data from disk fast enough (e.g., if I drop the page cache mid-test then Cassandra slows down for the next five minutes).
Given that the total amount of data should fit comfortably in memory, I’ve been trying to find a way to keep the rows cached in memory, but there doesn’t seem to be a particularly good way to achieve this.
I’ve tried enabling the row cache and pre-populating it before starting the load by querying every row, which gives good performance, but the row cache isn’t really intended to be used this way and we’d be fighting it to keep the rows in (e.g., by cyclically reading through all the rows during normal operation).
Keeping the page cache warm by running a background task that keeps accessing the sstable files would be simpler, and currently this is the solution we’re leaning towards. But we have less control over the page cache: it would be vulnerable to other processes knocking Cassandra’s files out, and it generally feels like a bit of a hack.
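The background-warming task described above could be sketched roughly as follows; this is a sketch, not a production tool, and the data-directory path is a placeholder for wherever the node's sstables actually live:

```python
import os

def warm_page_cache(data_dir, chunk_size=1 << 20):
    """Sequentially read every file under data_dir so its pages are
    (re)loaded into the OS page cache. Returns total bytes read."""
    bytes_read = 0
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, "rb") as f:
                    while True:
                        chunk = f.read(chunk_size)
                        if not chunk:
                            break
                        bytes_read += len(chunk)
            except OSError:
                continue  # file may disappear mid-walk, e.g. after a compaction
    return bytes_read

# Hypothetical usage: run periodically (e.g. from cron) against the
# keyspace's data directory so the pages stay resident between reads.
# warm_page_cache("/var/lib/cassandra/data/my_keyspace")
```

One caveat with any such approach: it only asks the kernel nicely. Under memory pressure, or when other processes do heavy I/O, the kernel is still free to evict these pages, which is exactly the lack of control mentioned above.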
Has anyone had any success with trying to do something similar to this or have any suggestions for possible solutions?