Heng Chen | 14 Feb 05:47 2016

Can't see any log in log file

I am not sure why this happens. This is my command line:

maintain 11444 66.9  1.1 10386988 1485888 pts/0 Sl  12:33   6:30
/usr/java/jdk/bin/java -Dproc_regionserver -XX:OnOutOfMemoryError=kill -9
%p -Xmx8000m -XX:+UseConcMarkSweepGC -verbose:gc -XX:+PrintGCDetails
-XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=1
-Dhbase.id.str=maintain -Dhbase.root.logger=INFO,RFA
org.apache.hadoop.hbase.regionserver.HRegionServer start

In hbase-maintain-regionserver-dx-pipe-regionserver7-online.log there is
only the information below:

Sun Feb 14 12:33:19 CST 2016 Starting regionserver on
core file size          (blocks, -c) 1024
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 514904
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 65536
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0

Artem Ervits | 13 Feb 03:56 2016

MRUnit unit test example

Has anyone been able to successfully run the MRUnit example in HBase 1.1?

First, I updated the pom for hadoop2.


I set up the test exactly as in the code on the reference page.
I'm getting "no suitable method found for newReduceDriver".

This is the only line that is giving me trouble:

reduceDriver = ReduceDriver.newReduceDriver(reducer);

I checked their website and I see that the last check-in was 10 months ago. I
emailed the user list and got crickets. Does it even make sense to keep the
example around?

Harinder Singh | 12 Feb 09:07 2016

Permissions required for creating tables in HBase.


What level of permission is required to create a table in HBase if I
am making a client request using RPC? Should the user have ADMIN privileges
for that?


Melissa Warnkin | 11 Feb 19:23 2016

ApacheCon NA 2016 - Important Dates!!!

 Hello everyone!
I hope this email finds you well.  I hope everyone is as excited about ApacheCon as I am!
I'd like to remind you all of a couple of important dates, as well as ask for your assistance in spreading the
word! Please use your social media platform(s) to get the word out! The more visibility, the better
ApacheCon will be for all!! :)
CFP Close: February 12, 2016
CFP Notifications: February 29, 2016
Schedule Announced: March 3, 2016
To submit a talk, please visit:  http://events.linuxfoundation.org/events/apache-big-data-north-america/program/cfp

Link to the main site can be found here:  http://events.linuxfoundation.org/events/apache-big-data-north-america

Apache: Big Data North America 2016 Registration Fees:
Attendee Registration Fee: US$599 through March 6, US$799 through April 10, US$999 thereafter
Committer Registration Fee: US$275 through April 10, US$375 thereafter
Student Registration Fee: US$275 through April 10, $375 thereafter
Planning to attend ApacheCon North America 2016 May 11 - 13, 2016? There is an add-on option on the
registration form to join the conference for a discounted fee of US$399, available only to Apache: Big
Data North America attendees.
So, please tweet away!!
I look forward to seeing you in Vancouver! Have a groovy day!!
~Melissa
on behalf of the ApacheCon Team

Anna Claiborne | 11 Feb 06:48 2016

Regionserver IP in Meta


I am setting up multiple test environments composed of Hbase/OpenTSDB. Each test environment consists of
1 stand alone Hbase server and 1 to n OpenTSDB servers. The issue I am having is getting the Hbase region
server to resolve to the IPv4 address on eth0, as opposed to The OpenTSDB clients all think the
region server they are looking for is at, when it is actually on another IP. Zookeeper on the
stand alone Hbase server reports the region server in the meta table as (at least this is how I
interpreted the message). See log from OpenTSDB below:

2016-02-10 15:14:40,513 INFO  [Hashed wheel timer #2-SendThread(ubuntu8.packetfabric.com:2181)]
ClientCnxn: Session establishment complete on server hostname.domain.com/1.111.111.11:2181, sessionid = 0x152cd76fea40029,
negotiated timeout = 5000
2016-02-10 15:14:40,515 INFO  [Hashed wheel timer #2-EventThread] HBaseClient: Connecting to .META.
region  <at>
2016-02-10 15:14:40,516 WARN  [New I/O boss #34] HBaseClient: Couldn't connect to the RegionServer  <at>
2016-02-10 15:14:40,516 INFO  [New I/O boss #34] HBaseClient: Lost connection with the .META. region
2016-02-10 15:14:40,516 INFO  [New I/O worker #22] HBaseClient: Invalidated cache for .META. as null
still seems to be splitting or closing it.
2016-02-10 15:14:40,516 ERROR [New I/O boss #34] RegionClient: Unexpected exception from downstream on
[id: 0xad59dc8e]
java.net.ConnectException: Connection refused: /

I have tried setting the host name in /etc/hosts on the standalone HBase server to the appropriate IPv4
address, and setting the IPv4 address in conf/regionserver. However, neither seems to get Zookeeper to
report the region server accurately. I have restarted HBase after each change. The docs
aren’t completely clear (at least to me) on how HBase determines what IP to put in Meta as the address for the
region server. I would greatly appreciate some help understanding how to get this set


Mukesh Jha | 11 Feb 13:41 2016

hbase filter based on dynamic column qualifier's value


I'm storing an array of attributes, using attribute+index as the qualifier
name; my column family name is 'cf' here.

Each row in my HBase table looks like the one below.

cf:attribute1 -> 'value apple'
cf:attribute2 -> 'value banana'
cf:attribute{N} -> 'value iphone'
cf:someId1 -> 1111
cf:someOtherQualifier -> 'value e'

While reading data out of HBase, I want to scan my table and use a *ValueFilter*
on the cf:attribute* columns for a value (say "apple").

On a match, I want the entire row to be returned.

Below are the possible solutions I see:

   - Add multiple SingleColumnValue filters, one for each attribute*. But I
   do not know the number of items that will be present in attribute*, and the
   list might go up to 100, so will it affect the scan performance?
   - Store the attributes ArrayList as an ArrayWritable [1]. I'm not sure how
   the scan filters will work here. If any of you have any experience please
   - Implement my own filter and ship it to all my RS.
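
For the third option, the per-row decision such a filter has to make can be sketched in plain Java. This is only a model of the matching logic, assuming a row is represented as a map from qualifier to string value; it does not use or reproduce the HBase Filter API, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Plain-Java model of "match any cf:attribute* cell, then keep the whole row". */
final class AttributeRowMatcher {
    // Qualifier prefix taken from the example row in the post.
    static final String PREFIX = "attribute";

    /** True if any qualifier starting with PREFIX holds a value containing the needle. */
    static boolean rowMatches(Map<String, String> row, String needle) {
        for (Map.Entry<String, String> cell : row.entrySet()) {
            if (cell.getKey().startsWith(PREFIX) && cell.getValue().contains(needle)) {
                return true; // one hit is enough; the caller keeps the entire row
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("attribute1", "value apple");
        row.put("attribute2", "value banana");
        row.put("someId1", "1111");
        System.out.println(rowMatches(row, "apple")); // matched on attribute1
        System.out.println(rowMatches(row, "1111"));  // someId1 is not an attribute* column
    }
}
```

The point of the sketch is that the number of attribute* columns does not need to be known in advance: the check walks whatever cells the row has and stops at the first match.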


houman | 10 Feb 01:01 2016

Full table scan cost after deleting Millions of Records from HBase Table


I'm thinking of creating a table that will have millions of rows; and each
day, I would insert and delete millions of rows to/from it.

Two questions:
1. I'm guessing HBase won't have any problems with this approach, but just
wanted to check that in terms of region-splits or compaction I won't run
into issues.  Can you think of any problems?
2. Let's say there are 6 million records in the table, and I do a full
table scan querying a column family that has a single column, where the value in
the cell is either 1 or 0. Let's say it takes N seconds. Now I bulk delete
5 million records (but do not run compaction) and run the same query again.
Would I get a much faster response, or will HBase need to perform the same
amount of I/O (as if there were still 6 million records there)? Once
compaction is done, the query would presumably run faster.

Also most queries on the table would scan the entire table.
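
As background for question 2: HBase deletes write delete markers (tombstones) rather than removing data in place, and the deleted cells are physically dropped at major compaction, so a scan before compaction may still have to read past them. A toy model of that behaviour, assuming an in-memory list stands in for the store files; this is not HBase code and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of delete markers; it only illustrates the idea. */
final class TombstoneStore {
    private final List<String> storedRows = new ArrayList<>(); // what is "on disk"
    private final Set<String> tombstones = new HashSet<>();    // delete markers

    void put(String row) { storedRows.add(row); }

    /** A delete only records a marker; the stored row stays where it is. */
    void delete(String row) { tombstones.add(row); }

    /** Returns live rows; rowsRead[0] is incremented once per stored row examined. */
    List<String> scan(int[] rowsRead) {
        List<String> live = new ArrayList<>();
        for (String row : storedRows) {
            rowsRead[0]++;
            if (!tombstones.contains(row)) {
                live.add(row);
            }
        }
        return live;
    }

    /** Stand-in for a major compaction: drop deleted rows and their markers. */
    void compact() {
        storedRows.removeIf(tombstones::contains);
        tombstones.clear();
    }

    public static void main(String[] args) {
        TombstoneStore store = new TombstoneStore();
        for (int i = 0; i < 6; i++) store.put("row" + i);
        for (int i = 0; i < 5; i++) store.delete("row" + i);

        int[] before = new int[1];
        System.out.println(store.scan(before).size() + " live, " + before[0] + " read");

        store.compact();
        int[] after = new int[1];
        System.out.println(store.scan(after).size() + " live, " + after[0] + " read");
    }
}
```

In the model, the scan before compaction returns one live row but still examines all six stored rows; after compaction it examines only one.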


Raja.Aravapalli | 9 Feb 11:47 2016

Row length is 0 at org.apache.hadoop.hbase.client.Mutation.checkRow(Mutation.java:503)


HBase table lookup is failing with below exception. Someone please help me in fixing this:

java.lang.RuntimeException: java.lang.IllegalArgumentException: Row length is 0
    at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:128)
    at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99)
    at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80)
    at backtype.storm.daemon.executor$fn__5265$fn__5278$fn__5329.invoke(executor.clj:794)
    at backtype.storm.util$async_loop$fn__551.invoke(util.clj:465)
    at clojure.lang.AFn.run(AFn.java:24)
    at java.lang.Thread.run(Thread.java:744)
Caused by: java.lang.IllegalArgumentException: Row length is 0
    at org.apache.hadoop.hbase.client.Mutation.checkRow(Mutation.java:503)
    at org.apache.hadoop.hbase.client.Mutation.checkRow(Mutation.java:487)
    at org.apache.hadoop.hbase.client.Get.<init>(Get.java:89)
    at org.apache.storm.hbase.common.HBaseClient.constructGetRequests(HBaseClient.java:112)
    at org.apache.storm.hbase.bolt.HBaseLookupBolt.execute(HBaseLookupBolt.java:65)
    at backtype.storm.daemon.executor$fn__5265$tuple_action_fn__5267.invoke(executor.clj:659)
    at backtype.storm.daemon.executor$mk_task_receiver$fn__5188.invoke(executor.clj:415)
    at backtype.storm.disruptor$clojure_handler$reify__1064.onEvent(disruptor.clj:58)
    at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125)
    ... 6 more

I am using a Storm application to do a lookup in an HBase table. The Get request throws an exception
for a row key specified for the lookup. Please help me find and fix the issue.

Raja Mahesh Aravapalli.
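
The trace shows the exception being raised from the Get constructor via Mutation.checkRow, with the message "Row length is 0", which points at an empty row key reaching the lookup. A minimal sketch of a guard that could run before the Get is built, assuming the row key comes from a string field of the tuple; the class and method names are hypothetical and nothing here is part of storm-hbase or HBase.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical pre-check for row keys; plain Java, no HBase dependency. */
final class RowKeyGuard {
    /** Mirrors the condition behind "Row length is 0": null or empty keys are unusable. */
    static boolean isUsable(byte[] rowKey) {
        return rowKey != null && rowKey.length > 0;
    }

    /** Encodes a tuple field as a row key, or returns null so the caller can skip it. */
    static byte[] toRowKeyOrNull(String field) {
        if (field == null) {
            return null;
        }
        byte[] key = field.getBytes(StandardCharsets.UTF_8);
        return isUsable(key) ? key : null;
    }

    public static void main(String[] args) {
        System.out.println(toRowKeyOrNull("user-42") != null); // usable key
        System.out.println(toRowKeyOrNull("") != null);        // empty field, would be skipped
    }
}
```

Logging the tuples for which the helper returns null would show which upstream records arrive without a key.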

Rajeshkumar J | 9 Feb 09:55 2016

Issue while trying to store hbase data using pig


I am trying to store the output of a Pig relation into HBase using the
following code:

store hbasedata into 'hbase://evallist' USING

But it throws

Caused by: java.lang.IllegalArgumentException: Must specify table name

Can anyone help me with this?

Serega Sheypak | 9 Feb 00:01 2016

Confusion with new 1.0 client API

Hi, I'm confused by the new HBase 1.0 API. The API says that applications should
manage connections (previously HConnections) on their own. Nothing is
managed internally now.

Here is an example:

It gives no clue about lifecycle :(
Right now I create a single connection instance for the servlet and a
BufferedMutator per request.

// getConnection returns a single shared instance; it doesn't return a new
// connection each time
def mutator = getConnection().getBufferedMutator(getBufferedMutatorParams())
users.each{ mutator.mutate(toPut(it))}
mutator.close() //exception is thrown here

And I get tons of exceptions thrown on "mutator.close()". What should I do?

WARNING: #905, the task was rejected by the pool. This is unexpected.
Server is node04.server.com, 60020,1447338864601
java.util.concurrent.RejectedExecutionException: Task
java.util.concurrent.FutureTask@5cff3b40 rejected from
java.util.concurrent.ThreadPoolExecutor@686c2853[Terminated, pool size = 0,
active threads = 0, queued tasks = 0, completed tasks = 1]
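
The "[Terminated, pool size = 0, ...]" text in the warning is how a java.util.concurrent.ThreadPoolExecutor describes itself once it has been shut down and has finished its work. A JDK-only sketch of that failure mode, assuming nothing about HBase internals: a task submitted to an executor after shutdown() is rejected with the same exception type. The class name is hypothetical, and the sketch does not show which component shut the pool down in the servlet.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

/** Reproduces a RejectedExecutionException using only the JDK. */
final class ShutdownPoolDemo {
    /** Returns true if submitting to an already shut-down pool was rejected. */
    static boolean submitAfterShutdownIsRejected() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        pool.shutdown(); // no new tasks are accepted from here on
        Runnable noop = () -> { };
        try {
            pool.submit(noop);
            return false;
        } catch (RejectedExecutionException e) {
            return true; // default rejection policy throws on submit
        }
    }

    public static void main(String[] args) {
        System.out.println(submitAfterShutdownIsRejected());
    }
}
```

If the pool in the warning is shared between mutators or owned by the connection, one thing worth checking is whether something closes it while later requests are still flushing.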

Cameron, David A | 8 Feb 16:53 2016

HBase Tables and Column Families and bulk loading


I'm working on a project where we have a strange use case.

First off, we use bulk loading exclusively.  We never use the put or bulk put interface to load data into tables.

We have drivers that make me want to segregate data by tables and column families.  Our data is clearly
delineated by the job it came from.  We would like to be able to quickly delete or export the data from a given data
set.  To enable this, I have been considering using column families to make it quick for us, and easy on
HBase, to delete data that is no longer needed.

It is my understanding that multiple column families bite you in the backside via the put interface and
the memstore, and that having multiple column families with different distributions among the partitions can
cause lumpiness in your partitions.  I have convinced myself that because our key space is so incredibly
consistent, we don't have the lumpiness issue.

And so I ask: given that we don't use the memstore, are there any other drawbacks to using tables and
column families to segregate data for easy/quick backup and deletion?  If you are wondering about our
backup strategy, it involves using snapshots and clones.  Once a table is cloned, we can delete the column
families we don't want to export to tape from that table.  Deletion becomes quick because the bulk of the
work is removing the column family's files from HDFS.

All feedback is greatly appreciated!