Garg, Rinku | 31 May 06:21 2016

Unsubscribe me too

Unsubscribe me too

Thanks & Regards
Rinku Garg
C: +91 941.740.5922
rinku.garg <at>


-----Original Message-----
From: Manisha Sethi [mailto:Manisha.Sethi <at>] 
Sent: Tuesday, May 31, 2016 9:49 AM
To: user <at>
Subject: Unsubscribe


Manisha Sethi | 31 May 06:19 2016



聪聪 | 31 May 04:03 2016

Memstore blocking

Hi all,

Recently I ran into a strange problem: the MemStore of the first region of one table (the only one) often
blocks when flushing. (This happens on hbase-0.98.6-cdh5.2.0, and also after I updated 0.98 in the hope of solving the problem, which failed.)
On the web UI, the status shows: ABORTED(since XXsec ago), Not flushing since already flushing.
The flush never succeeds, and disk usage climbs very high. The other region servers
use just 30% of their disk capacity, while the problematic region server grows to 100% unless I kill the process.
What's more, the region server process cannot be shut down normally; every time I have to use kill -9.
I checked the log, and the reason the flush cannot complete seems to be that one of the MemStoreFlusher threads cannot exit.
The log looks like this:
2016-05-29 19:54:11,982 INFO  [MemStoreFlusher.13] regionserver.MemStoreFlusher: MemStoreFlusher.13 exiting
2016-05-29 19:54:13,016 INFO  [MemStoreFlusher.6] regionserver.MemStoreFlusher: MemStoreFlusher.6 exiting
2016-05-29 19:54:13,260 INFO  [MemStoreFlusher.16] regionserver.MemStoreFlusher: MemStoreFlusher.16 exiting
2016-05-29 19:54:16,032 INFO  [MemStoreFlusher.33] regionserver.MemStoreFlusher: MemStoreFlusher.33 exiting
2016-05-29 19:54:16,341 INFO  [MemStoreFlusher.25] regionserver.MemStoreFlusher: MemStoreFlusher.25 exiting
2016-05-29 19:54:16,620 INFO  [MemStoreFlusher.31] regionserver.MemStoreFlusher: MemStoreFlusher.31 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.29] regionserver.MemStoreFlusher: MemStoreFlusher.29 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.23] regionserver.MemStoreFlusher: MemStoreFlusher.23 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.32] regionserver.MemStoreFlusher: MemStoreFlusher.32 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.1] regionserver.MemStoreFlusher: MemStoreFlusher.1 exiting
2016-05-29 19:54:16,621 INFO  [MemStoreFlusher.38] regionserver.MemStoreFlusher:
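
One way to find which flusher thread never logged "exiting" in output like the above is to compare the thread ids seen in the log against the expected set. This is only a rough sketch: the function name and the flusher_count parameter are made up for illustration, and it assumes the flusher threads are numbered from 0 up to the configured flusher count minus one.

```python
import re

# Matches the tail of a line such as "... MemStoreFlusher.13 exiting"
EXIT_PATTERN = re.compile(r"MemStoreFlusher\.(\d+) exiting")


def flushers_not_exited(log_text, flusher_count):
    """Return the ids of flusher threads that never logged 'exiting'.

    flusher_count is a hypothetical parameter: the number of flusher
    threads the region server was started with (assumed ids 0..count-1).
    """
    # Collapse all whitespace so log lines wrapped by a mail client still match.
    normalized = " ".join(log_text.split())
    exited = {int(m.group(1)) for m in EXIT_PATTERN.finditer(normalized)}
    return sorted(set(range(flusher_count)) - exited)
```

Feeding it the shutdown section of the region server log together with the configured thread count should leave only the ids of the threads that are still stuck.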

Lex Toumbourou | 29 May 06:23 2016

Network connectivity for CopyTable between clusters

Hi all,

I'm trying to run a large CopyTable job between clusters in totally
different datacenters and I'm trying to determine what network connectivity
is required here.

As per the Cloudera blog post about CopyTable, I understand that the
network should be such that "MR TaskTrackers can access all the HBase and
ZK nodes in the destination cluster." So in practice that means that the
source task trackers should be able to access:

* ZooKeeper on port 2181
* the Master on its RPC port (16000)
* the RegionServers on their RPC ports (16020)

Anything else I need to configure here? Does Hadoop on the source need to
talk directly to the destination Hadoop, etc.?

Also, what's unclear to me is what I should be doing with DNS. I'm guessing
that the source cluster needs to be able to resolve the hostnames of remote
RegionServers and Master nodes as stored in Zookeeper. Anything else I need
to configure here?
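
A quick way to test both the DNS and the port assumptions from a source worker node is a plain TCP connect to each destination endpoint. The sketch below is only illustrative: the function name is made up, the hostnames in the comment are placeholders, and the ports are the ones listed above. A successful connect shows only that the name resolves and the port is open, not that CopyTable itself will work.

```python
import socket


def check_endpoints(endpoints, timeout=3.0):
    """Map each (host, port) pair to None on success or an error string."""
    results = {}
    for host, port in endpoints:
        try:
            # create_connection resolves the hostname first, so a DNS
            # failure is reported here just like a blocked or closed port.
            with socket.create_connection((host, port), timeout=timeout):
                results[(host, port)] = None
        except OSError as exc:
            results[(host, port)] = f"{type(exc).__name__}: {exc}"
    return results


# Example call with hypothetical destination hostnames:
# check_endpoints([
#     ("zk1.dest.example", 2181),
#     ("master.dest.example", 16000),
#     ("rs1.dest.example", 16020),
# ])
```

Running it on every source node that will host map tasks, using the hostnames exactly as the destination cluster registers them, covers name resolution and reachability in one pass.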

Thanks for your time!


Lex Toumbourou
Lead engineer at <>
Ted Yu | 28 May 17:57 2016

Re: exception not descriptive relationed to zookeeper.znode.parent

Please use user <at> hbase for future correspondence.

Here is related code from ZooKeeperWatcher (NPE seems to have come from the
for loop):

  public List<String> getMetaReplicaNodes() throws KeeperException {
    List<String> childrenOfBaseNode = ZKUtil.listChildrenNoWatch(this,
    List<String> metaReplicaNodes = new ArrayList<String>(2);
    String pattern =
    for (String child : childrenOfBaseNode) {

ZKUtil.listChildrenNoWatch() would return null if the base znode doesn't exist.

The error message you mentioned still exists:

           + "There could be a mismatch with the one configured in the
           + "There could be a mismatch with the one configured in the

With zookeeper.znode.parent  properly set, do you still experience NPE with
your code ?



Tianying Chang | 27 May 23:32 2016

Major compaction cannot remove deleted rows until the region is split. Strange!


We saw a very strange case in one of our production clusters. A couple of
regions cannot get their deleted rows or delete markers removed even after
major compaction. However, when the region triggered a split (we set 100G for
auto split), the deletion worked. The 100G region became two 10G daughter
regions, and all the delete markers are gone.

Also, the same region in the slave cluster (through replication) has a
normal size of about 20+G.

BTW, the delete markers in the regions are mostly DeleteFamily, if it matters.

This is really weird. Does anyone have any clue about this strange behavior?


A snippet of the HFile generated by the major compaction:

K: \xA0\x00\x00L\x1A <at> \x1CBe\x00\x00\x08m\x03\x1A <at> \x10\x00?PF/d:/1459808114380/DeleteFamily/vlen=0/ts=2292870047
K: \xA0\x00\x00L\x1A <at> \x1CBe\x00\x00\x08m\x03\x1A <at> \x10\x00?PF/d:/1459808114011/DeleteFamily/vlen=0/ts=2292869794
K: \xA0\x00\x00L\x1A <at> \x1CBe\x00\x00\x08m\x03\x1A <at> \x10\x00?PF/d:/1459805381104/DeleteFamily/vlen=0/ts=2291072240
K: \xA0\x00\x00L\x1A <at> \x1CBe\x00\x00\x08m\x03\x1A <at> \x10\x00?PF/d:/1459805380673/DeleteFamily/vlen=0/ts=2291071997
K: \xA0\x00\x00L\x1A <at> \x1CBe\x00\x00\x08m\x03\x1A <at> \x10\x00?PF/d:/1459802643449/DeleteFamily/vlen=0/ts=2290248886
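
To get a feel for how much of such a file is delete markers, the key lines can be tallied by cell type. This is a rough sketch that assumes every key line has the shape shown in the snippet above (row/family:qualifier/timestamp/Type/vlen=...). The function name is made up, and qualifiers containing "/" would confuse the pattern.

```python
import re
from collections import Counter

# K: <row>/<family>:<qualifier>/<timestamp>/<Type>/vlen=<n>/...
# The row part is matched greedily because binary row keys may contain '/'.
KEY_PATTERN = re.compile(
    r"^K: (?P<row>.*)/(?P<family>[^/:]+):(?P<qualifier>[^/]*)"
    r"/(?P<timestamp>\d+)/(?P<type>\w+)/vlen=(?P<vlen>\d+)"
)


def count_cell_types(lines):
    """Count key lines per cell type (Put, DeleteFamily, ...)."""
    counts = Counter()
    for line in lines:
        match = KEY_PATTERN.match(line.strip())
        if match:
            counts[match.group("type")] += 1
    return counts
```

Comparing the tallies for the file written by the major compaction with those for the daughter regions' files would show how many markers survived each step.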

Harry Waye | 27 May 18:37 2016

HBase consistency issues (holes) and long startup

We had a regionserver fall out of our cluster, I assume due to the process
hitting a limit, as the region server's .out log file just contained "Killed",
which I've experienced when hitting open file descriptor limits.  After
this, hbck reported inconsistencies in tables:

ERROR: There is a hole in the region chain between
dce998f6f8c63d3515a3207330697ce4-ravi teja and e4.  You need to create a
new .regioninfo and region dir in hdfs to plug the hole.

`hdfs fsck` reports a healthy dfs.

I attempted to run `hbase hbck -repairHoles` which didn't resolve the issue.

I then restarted the HBase cluster and it now appears from looking at the
master log files that there are many tasks waiting to complete, and the web
interface results in a timeout:

master.SplitLogManager: total tasks = 299 unassigned = 285 tasks={ ... }

From looking at the logs on the regionservers I see messages such as:
"regionserver.SplitLogWorker: Current region server ... has 2 tasks in
progress and can't take more".

How can I speed up working through these tasks?  I suspect our nodes can
handle many more than 2 tasks at a time. I'll likely have follow-up
questions once these have been worked through, but I think that's it for now.
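
To see whether the backlog is actually draining, the counters can be pulled out of the SplitLogManager lines in the master log. This is a small sketch that assumes the status lines look like the one quoted above. The function name is made up.

```python
import re

# Matches "total tasks = 299 unassigned = 285" inside a master log line
TASKS_PATTERN = re.compile(r"total tasks = (\d+) unassigned = (\d+)")


def split_log_progress(log_lines):
    """Yield (total, unassigned) for each SplitLogManager status line."""
    for line in log_lines:
        match = TASKS_PATTERN.search(line)
        if match:
            yield int(match.group(1)), int(match.group(2))
```

If the unassigned count falls over successive status lines, the workers are making progress, however slowly.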

Any other information you need?

Andrew Purtell | 26 May 20:44 2016

New Apache HBase blog post: Tuning G1GC For Your HBase Cluster

The folks at HubSpot kindly agreed to let us republish their excellent
treatise on tuning the G1GC collector for HBase deployments on our project
blog. Check it out at


Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
Andrew Purtell | 26 May 20:30 2016

[ANNOUNCE] Mikhail Antonov joins the Apache HBase PMC

On behalf of the Apache HBase PMC I am pleased to announce that Mikhail
Antonov has accepted our invitation to become a PMC member on the Apache
HBase project. Mikhail has been an active contributor in many areas,
including recently taking on the Release Manager role for the upcoming
1.3.x code line. Please join me in thanking Mikhail for his contributions
to date and in anticipation of many more contributions.

Welcome to the PMC, Mikhail!


Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
Heng Chen | 26 May 06:48 2016

region stuck in failed close state

On the master web UI, I can see that region c371fb20c372b8edbf54735409ab5c4a
is always in the failed close state, so the balancer cannot run.

I checked the region on the RS and found these log entries about it:

2016-05-26 12:42:10,490 INFO  [MemStoreFlusher.1]
regionserver.MemStoreFlusher: Waited 90447ms on a compaction to clean up
'too many store files'; waited long enough... proceeding with flush of
2016-05-26 12:42:20,043 INFO
requesting flush for region
after a delay of 20753
2016-05-26 12:42:30,043 INFO
requesting flush for region
after a delay of 7057

The related jstack output is below:

Thread 12403 (RS_CLOSE_REGION-dx-pipe-regionserver4-online:16020-2):
  State: WAITING
  Blocked count: 1
  Waited count: 2

Diwanshu Shekhar | 25 May 23:53 2016

Hbase REST API Question

I am running into some issues related to the HBase REST API.

Let me give you a little background first:
I built an HBase table in my company's CDH cluster. The row key is "<some
string>|<some string>", the column family is "d", and the column qualifier is
the date, which is also a string (e.g., 2012-01-27). In order to provide
access to the table data to other interested people in the company, I built
a Django API and it works great. Somebody on my team suggested that HBase
comes with a built-in API and that I could use it directly to get access to the
data. I read the online HBase documentation, and indeed it looks like there is
something off-the-shelf that comes with HBase. But I haven't been
successful using it, and therefore I am here seeking help.

 Here is a list of the issues that I am running into:

 1. Chrome Browser
      In the Chrome browser, I typed in the following url:
      http://< ip address>:20550/namespace:tablename/ #00003|313001098/d
     I was expecting that it will render the data specific to the provided
row key in the browser, but instead it downloads an html file that contains
data for only one column qualifier and it doesn't have any information on
which column qualifier the data pertains to.

 2. Curl
     I did the same with the curl command in a Unix shell.
     curl -i http://< ip
    Please note that %23 is an encoding for # and %7C is an encoding for |
    The above command only gives me the data for one column qualifier.
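
The percent-encoding step can also be done programmatically instead of by hand. The sketch below builds the row URL and flattens a JSON response into readable tuples. It rests on assumptions that should be checked against the REST gateway's actual output: that a request with an "Accept: application/json" header returns a document with a "Row" list whose "key", "column" and "$" fields are base64-encoded, and that the stored values are UTF-8 text. The function names and the host in the usage note are made up.

```python
import base64
import json
from urllib.parse import quote


def row_url(base_url, table, row_key, column=None):
    """Build a REST URL for one row, percent-encoding the row key."""
    # safe="" forces '#', '|' and '/' inside the row key to be escaped.
    url = f"{base_url}/{quote(table, safe=':')}/{quote(row_key, safe='')}"
    if column:
        url += "/" + quote(column, safe=":")
    return url


def _b64_text(value):
    """Decode a base64 field, assuming it holds UTF-8 text."""
    return base64.b64decode(value).decode("utf-8")


def decode_cells(payload):
    """Flatten an assumed JSON row response into (row, column, value) tuples."""
    doc = json.loads(payload)
    return [
        (_b64_text(row["key"]), _b64_text(cell["column"]), _b64_text(cell["$"]))
        for row in doc.get("Row", [])
        for cell in row.get("Cell", [])
    ]
```

For the row key above, row_url("http://host:20550", "namespace:tablename", "#00003|313001098", "d") gives a URL ending in /%2300003%7C313001098/d. If the JSON shape holds, each decoded tuple names its column qualifier next to the value.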