donhoff_h | 1 Mar 09:57 2015

BucketCache Configuration Problem

Hi, experts

I am using HBase 0.98.10 and have three questions about BucketCache configuration.

1. I read the Apache HBase reference guide to learn how to configure BucketCache. For an off-heap
BucketCache, the guide says I should set HBASE_OFFHEAPSIZE, but it also says I should set
-XX:MaxDirectMemorySize. Since both parameters relate to direct memory, I am confused about
which one I should configure.

2. I want to know how to configure BucketCache as a pure secondary cache, by which I mean bypassing the
CombinedBlockCache policy. I tried the following configuration, but the regionserver's web UI
says "No L2 deployed":


3. I used the following configuration to set the bucket sizes, but the regionserver's web UI shows that
the (4+1)K and (8+1)K sizes are used while the (64+1)K size is not. What is wrong with my configuration?
hbase.bucketcache.sizes=65536
hfile.block.cache.sizes=65536
I set both properties because I don't know which one is currently in use.

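One way to reason about which bucket sizes end up in use is sketched below. It assumes that a cached block is placed in the smallest configured size class that can hold it; that rule is an assumption made for illustration and is not confirmed anywhere in this thread. `pick_bucket_size` and `KB` are hypothetical names, not HBase APIs, and the three sizes are simply the ones mentioned in the question.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

const std::size_t KB = 1024;

// Hypothetical helper, not an HBase API.
// Returns the smallest size class that can hold a block of `block_bytes`,
// or 0 if the block is larger than every configured size class.
// ASSUMPTION: "smallest size class that fits" is a simplified model of how
// a bucket allocator places blocks; this thread does not confirm it.
std::size_t pick_bucket_size(std::vector<std::size_t> sizes,
                             std::size_t block_bytes) {
    assert(!sizes.empty());
    // Sort a local copy so the caller may pass sizes in any order.
    std::sort(sizes.begin(), sizes.end());
    // First size class that is >= the block size.
    auto it = std::lower_bound(sizes.begin(), sizes.end(), block_bytes);
    return it == sizes.end() ? 0 : *it;
}
```

Under this model, with size classes of (4+1)K, (8+1)K and (64+1)K, the (64+1)K class would receive only blocks larger than 9 KB, so a workload made up of small blocks would leave it unused.
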
Many Thanks!

Kristoffer Sjögren | 28 Feb 19:41 2015

Timerange scan


How effective are time-range scans when no start and stop keys are set?
Will HBase do a full table scan, or will the scan be optimized in any way?

郝东 | 28 Feb 03:01 2015

Why can't I subscribe to the mailing list?

Hi, I am Donhoff.

I tried a few times to send mail to user-subscribe <at> to subscribe to the HBase mailing list, but
I got no reply and have not received any mail from the list. Could anybody help me? Thanks!

不再散步 | 28 Feb 04:06 2015

Re: use thrift2 to access hbase 0.96.10, if I try increment, and disable the table, the code can not throw useful Exception

AHHT is an alias for the namespace apache::hadoop::hbase::thrift2:
 namespace AHHT = apache::hadoop::hbase::thrift2;

------------------ Original Message ------------------
From: "609378334";<609378334 <at>>;
Sent: Saturday, 28 Feb 2015, 10:59 AM
To: "user"<user <at>>;

Subject: Re: use thrift2 to access hbase 0.96.10, if I try increment, and disable the table, the code
can not throw useful Exception

Sorry, my HBase version is:
hbase(main):017:0> version, rUnknown, Tue Dec 17 12:22:12 PST 2013

thrift version is 0.9.0

I also set a 5 s timeout on the TSocket:
     try{
         boost::shared_ptr<TSocket> socket(new TSocket(_thrift_ip, _thrift_port));
         boost::shared_ptr<TTransport> transport(new TBufferedTransport(socket));
         boost::shared_ptr<TProtocol> protocol(new TBinaryProtocol(transport));
         _socket = socket;
         _socket->setRecvTimeout(5000);
         _socket->setSendTimeout(5000);

不再散步 | 28 Feb 03:27 2015

use thrift2 to access hbase 0.96.10, if I try increment, and disable the table, the code can not throw useful Exception

for example:
     1. I use thrift2 put to access table,
          _client->putMultiple(table, tput_vec); //table is 'member'
     } catch (TTransportException &e) {
         ret = e.getType(); //2 or 3
         g_logger.warn("[%s:%d] [%s] [PUT] TTransportException fail: %s:%d",
                 _thrift_ip, _thrift_port, table.c_str(), e.what(), ret);
     } catch (TException &e) {
         g_logger.error("[%s:%d] [%s] [PUT] fatal error: %s, stop",
                 _thrift_ip, _thrift_port, table.c_str(), e.what());
         ret = -1;

     If I disable the table 'member', the code catches a TTransportException:
[27560] 28 Feb 10:25:22 [WARN]   [1130428752] [] [95]
[] [member] [PUT] TTransportException fail: EAGAIN (timed out):2

    2. I use thrift2 increment to access the table,
                    _client->increment(tresult, table, tincrement);
                 } catch (ATP::TProtocolException &e) {
                     ret = e.getType();
                     g_logger.warn("[%s:%d] [%s] [INCREMENT] TTransportException: %s:%d",
                             _thrift_ip, _thrift_port, table.c_str(), e.what(), ret);
                     goto error_out;
                 } catch (AT::TApplicationException &e) {
                     ret = e.getType();

郝东 | 27 Feb 10:25 2015

Questions about BucketCache


I am learning about BucketCache with HBase 0.98 and have a few questions. Could anyone help me?

1. Since this kind of cache divides the memory into many buckets, what is the default size of a bucket? And how
do I configure the size of a bucket?
2. How do I configure the total size of the BucketCache?
3. Since each bucket serves blocks of a specific size, and different buckets can serve different block
sizes, how do I set the sizes they serve? And what are the default sizes?
4. I see some properties in the reference guide on the Apache HBase website:
hbase.bucketcache.size, hbase.bucketcache.sizes,
hfile.block.cache.sizes, hfile.block.cache.size. I am totally confused by them. Could you tell me
what they mean?
5. Since BucketCache is usually off-heap, how is this memory freed when a RegionServer crashes?
6. When BucketCache is nearly full and needs to evict something, how does it choose what to
evict? Does it evict a bucket or a block at a time?

Many Thanks!

Ian Friedman | 26 Feb 21:03 2015

HBase and YARN/MR

Hi all,

We're currently moving to Hadoop 2 (years behind, I know) and debating how to handle job resource
management using YARN where nearly 100% of our jobs are maps over HBase Tables and a large portion also
Reduce to HBase. While YARN adequately handles the resources of the machine its tasks are running on, for
the purposes of TableMapper jobs, the resources consumed are actually on the remote regionserver, which
YARN doesn't seem to be able to recognize. We've implemented things such as per-job concurrent task
limits to help deal with this on Hadoop 1, but that seems hard to do in Hadoop 2. I'm wondering if anyone has
best practices or any ideas on how to deal with an all HBase, heavily I/O and RegionServer memory/RPC bound
workload? Thanks in advance!

James Taylor | 25 Feb 20:04 2015

[ANNOUNCE] Apache Phoenix 4.3 released

The Apache Phoenix team is pleased to announce the immediate
availability of the 4.3 release. Highlights include:

- functional indexes [1]
- map-reduce over Phoenix tables [2]
- cross join support [3]
- query hint to force index usage [4]
- set HBase properties through ALTER TABLE
- ISO-8601 date format support on input
- RAND built-in for random number generation
- ANSI SQL date/time literals
- query timeout support in JDBC Statement
- over 90 bug fixes

The release is available through maven or may be downloaded here [5].

The Apache Phoenix Team



HBase connection pool

In the HBase API, does one HTable object mean one connection to each region server (just for one table)?

The docs say:
"This class is not thread safe for reads nor write."

I am confused: I saw there is an HTablePool class, but it is also only for a single table. Can't connections be
reused for more than one table?

In my Java application, I use ThreadLocal variables (ThreadLocal<HTable>) to create one HTable instance
per thread. If I do several operations on each thread, they should all use the same connection, right?

Madeleine Piffaretti | 25 Feb 18:19 2015

oldWALs: what it is and how can I clean it?

Hi all,

We are running out of space in our small Hadoop cluster, so I checked
disk usage on HDFS and saw that most of the space is occupied by the
/hbase/oldWALs folder.

I have checked the "HBase Definitive Book" and other books and websites,
and I have also searched for my issue on Google, but I didn't find a proper answer.

So I would like to know what this folder is, what it is used for, and how
I can free space in this folder without breaking everything...

If it's related to a specific version... our cluster is under
5.3.0-1.cdh5.3.0.p0.30 from cloudera (hbase 0.98.6).

Thx for your help!

table splitting - how to check


I created an HBase table like this:

t = create 'HBaseSerialWritesPOC', 'user_id_ts', {NAME => 'alnfo'},  {SPLITS =>
['100000000000000000000000', '200000000000000000000000', '300000000000000000000000',
'400000000000000000000000', '500000000000000000000000', '600000000000000000000000',
'700000000000000000000000', '800000000000000000000000', '900000000000000000000000',
'a00000000000000000000000', 'b00000000000000000000000', 'c00000000000000000000000',
'd00000000000000000000000', 'e00000000000000000000000', 'f00000000000000000000000']}

After some tests, I truncated the table.

Then I inserted 1 million rows, just to test. I was expecting to have 16 regions for this table, but when I
checked the admin UI, I saw only two regions:

Table Regions
Name       Region Server   Start Key       End Key Requests
HBaseSerialWritesPOC,,1424873821297.cf92656f68a16e9696d0fbfe2494219b.        host1          
 host2   800000190000125396f3f2bb                500621

I am new to HBase, so this really means that just 2 regions have been created, right? It seems the keys have been split in
half, 0000... to ffff...

I disabled, dropped and re-created the table using the same command as before, and then I saw 16 regions, as expected.

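The expectation of 16 regions follows from the 15 split keys in the create command: N sorted split keys divide the key space into N + 1 contiguous ranges. A minimal sketch of that arithmetic is given here, assuming row keys are compared lexicographically byte by byte and that each region's start key is inclusive. The helper names (`hex_prefix_splits`, `region_count`, `region_index`) are hypothetical and are not part of any HBase API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Builds the 15 split keys from the create command above:
// '1' .. 'f', each followed by 23 zeros (24 characters in total).
std::vector<std::string> hex_prefix_splits() {
    std::vector<std::string> splits;
    const std::string digits = "123456789abcdef";
    for (char d : digits) {
        splits.push_back(std::string(1, d) + std::string(23, '0'));
    }
    return splits;
}

// N split keys partition the key space into N + 1 regions.
std::size_t region_count(const std::vector<std::string>& splits) {
    return splits.size() + 1;
}

// Returns the 0-based index of the region that would hold `row_key`.
// ASSUMPTION: keys compare lexicographically and a region covers
// [start key, end key), so a key equal to a split key belongs to the
// region that starts at that split key.
std::size_t region_index(const std::vector<std::string>& splits,
                         const std::string& row_key) {
    assert(std::is_sorted(splits.begin(), splits.end()));
    // The number of split keys <= row_key is the region index.
    return static_cast<std::size_t>(
        std::upper_bound(splits.begin(), splits.end(), row_key) -
        splits.begin());
}
```

With these split keys, `region_count` gives 16, and a row key such as 800000190000125396f3f2bb (the one shown in the region listing) would fall into the region that starts at 800000000000000000000000.
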
Question 1: Is it possible to check the same thing using hbase shell?