Yang | 29 Jan 01:10 2015

java API bug ? --- client.HTable init unnecessarily tries hadoop

We have standalone Java code that simply tries to insert one record
into an existing HBase table.

It logs the following error but is still able to proceed. Does this mean
that the operation which triggered the error is unnecessary? If so,
shouldn't that code be removed?


2015-01-28 16:06:05,769 [pool-2-thread-1] WARN
 org.apache.hadoop.hbase.util.DynamicClassLoader - Failed to identify the
fs of dir hdfs://my-hbase-master.com:8020/apps/hbase/data/lib, ignored
java.io.IOException: No FileSystem for scheme: hdfs
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2385)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2392)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.hadoop.hbase.ClusterId.parseFrom(ClusterId.java:64)

Guillermo Ortiz | 28 Jan 21:37 2015

Tool to execute a benchmark for HBase.


I'd like to run some benchmarks for HBase, but I don't know what tool I
could use. I started to write some code, but I guess that there are some

I've taken a look at JMeter, but I guess I'd rather drive the test
directly from Java. JMeter looks great, but I don't know if it fits well
in this scenario. What tool could I use to take measurements such as the
response time of read and write requests? I'd also like to be able to
run the same benchmarks against Cassandra.
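
The kind of measurement described above (response time of individual read
and write requests, driven directly from Java) could be sketched roughly
as follows. This is only an illustration under stated assumptions: the
class name LatencyProbe and its methods are hypothetical, the timed
operation is passed in as a plain Callable so the same harness could wrap
an HBase or a Cassandra client call, and the Thread.sleep in main is a
stand-in for a real request. It is single-threaded and does no warm-up,
so the numbers it produces are indicative at best.

```java
import java.util.Arrays;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of HBase or Cassandra.
final class LatencyProbe {

    /** Runs op the given number of times and returns the latencies in nanoseconds, sorted ascending. */
    static long[] measure(Callable<?> op, int iterations) throws Exception {
        long[] samples = new long[iterations];
        for (int i = 0; i < iterations; i++) {
            long start = System.nanoTime();
            op.call();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples;
    }

    /** Nearest-rank percentile of an ascending-sorted array; p is in (0, 100]. */
    static long percentile(long[] sorted, double p) {
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) throws Exception {
        // Stand-in operation (assumption): replace with a real read or write call.
        long[] samples = measure(() -> { Thread.sleep(1); return null; }, 50);
        System.out.printf("p50=%d us, p99=%d us%n",
                percentile(samples, 50) / 1000, percentile(samples, 99) / 1000);
    }
}
```

Reporting percentiles rather than an average keeps occasional slow
requests visible, which is usually what matters when comparing two stores.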

kennyut | 27 Jan 23:54 2015

After compression, the table folders under hdfs://hbase are empty

I tried to test HBase's data compression using the two separate commands below:

non-compression code:
create 'SMT_KO1', {NAME=>'info', COMPRESSION=> 'NONE',  VERSIONS => 5}, 
                   {NAME=>'usg', COMPRESSION=> 'NONE',  VERSIONS => 10}, 
                   SPLITS=> ['0','1', '2', '3', '4', '5', '6', '7', '8',

compression code:
create 'SMT_KO2', {NAME=>'info', COMPRESSION=>'lz4',  VERSIONS => 5}, 
                   {NAME=>'usg', COMPRESSION=>'lz4',  VERSIONS => 10}, 
                   SPLITS=> ['0','1', '2', '3', '4', '5', '6', '7', '8',

I can find two data folders under my hdfs://hbase path: SMT_KO1 and
SMT_KO2.
SMT_KO1 looks OK, with a total folder size of around 3.1 GB, but
SMT_KO2, which was created with the compression code, has only 1008 bytes.

Does anyone know where those files went? Both tables look OK if I open
them in Hue's HBase browser.
Any suggestions?

Thank you!


Sriram V | 27 Jan 22:19 2015

Re: Overwrite a row

Hi Ted,

I am referring to
http://grokbase.com/p/hbase/user/134kvzt6ah/overwrite-a-row
Does this mean that, if I am using HBase 0.94.6, the mutate row option
will not remove the previously present column qualifiers if they are
empty in my latest update?

Nishanth S | 26 Jan 19:13 2015

Deleting Files in oldwals


We were running an HBase cluster with replication enabled. However, we
have moved away from replication and turned it off. I also went ahead and
removed the peers from the HBase shell. However, the oldwals directory is
not cleaned up. I am using HBase version 0.96.1. Is it safe to delete
these logs?

Pun Intended | 25 Jan 02:15 2015

Very high read rate in a single RegionServer


I have noticed lately that my apps have started running longer. The
longest-running tasks all seem to be requesting data from a single region
server. That region server's read rate is very high in comparison to the
read rate of all the other region servers (1000 reqs/sec vs 4-5 reqs/sec
elsewhere). That region server has about the same number of regions as all
the rest: 26-27 regions. Number of store files, total region size, and
everything else on the region server seem OK and in the same ranges as on
the rest of the region servers. The keys should be evenly distributed:
they are randomly generated 38-digit numbers. I am doing a simple HBase
scan from all my MR jobs.

I'd appreciate any suggestions on what to look into, or any ideas on how
I can solve this issue.

Shuai Lin | 24 Jan 18:25 2015

Delete a region from hbase

Hi all,

We're using hbase 0.94-15 from CDH4 repo, and we're planning to delete
several regions which contain data that are no longer needed.

Basically we plan to use HRegion.deleteRegion
as described in this article.

We can guarantee that there will not be any requests going to these
regions during the deletion. Here are my questions:

-- Are there any caveats to deleting regions this way, especially
any that may cause downtime? Because we'll delete the regions in our
production cluster, we really need to be careful of any possible consequences.

-- After deleting the region, do we really need to re-create it? If we do
not recreate these regions, there would be "holes" in the rowkey space. Can
we use some tool like hbck to fix this? Another way is to just recreate the
regions, and later merge these empty regions with their neighbors. Which
one is better?


Tom Hood | 23 Jan 19:15 2015

HFileOutputFormat2.configureIncrementalLoad and HREGION_MAX_FILESIZE


I'm bulkloading into an empty hbase table and have called
HTableDescriptor.setMaxFileSize to override the global setting of
HConstants.HREGION_MAX_FILESIZE (i.e. hbase.hregion.max.filesize).

This newly configured table is then passed to
HFileOutputFormat2.configureIncrementalLoad to set up the MR job to generate
the hfiles.  This already configures other properties based on the settings
of the table it's given (e.g. compression, bloom, data encoding, splits,
etc).  Is there a reason it does not also configure the
HREGION_MAX_FILESIZE based on its setting from

-- Tom

joseph | 23 Jan 08:20 2015

Disabling the HBase table

Hi,

We are using HBase 0.94.15-cdh4.7 and Hadoop version 2.0.0-cdh 4.7

We are using a small cluster of 4 servers.

We also create our tables pre-split, using a bucket ID to pre-split
them. We are dealing with time series data. Our row key has the format
bucketID_reverstimestamp_ID. We insert data using the Java API, at a rate
of 9 lakh (900,000) records per minute into this table.

The problem is that our table gets disabled while data is being inserted.
Note that this does not occur every day; the issue pops up only sometimes.
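
For readers unfamiliar with the bucketID_reverstimestamp_ID layout
mentioned above, a row key of that shape might be built roughly like
this. The post does not say how the bucket or the reverse timestamp is
computed, so everything here is an assumption made for illustration: the
class name RowKeys, the use of Long.MAX_VALUE minus the epoch-millisecond
timestamp as the reverse timestamp, the hash-based bucket choice, and the
zero-padded field widths.

```java
import java.util.Locale;

// Hypothetical helper illustrating one possible key layout; not taken from the post.
final class RowKeys {

    /** Picks a bucket in [0, buckets) from the ID's hash (assumed bucketing scheme). */
    static int bucketFor(String id, int buckets) {
        return Math.floorMod(id.hashCode(), buckets);
    }

    /**
     * Builds bucket_reverseTimestamp_id. The reverse timestamp is zero-padded
     * to 19 digits so that lexicographic order matches numeric order, which
     * makes newer rows sort before older ones within a bucket.
     * Assumes a non-negative timestamp and a bucket ID below 100.
     */
    static String rowKey(int bucketId, long timestampMillis, String id) {
        long reversed = Long.MAX_VALUE - timestampMillis;
        return String.format(Locale.ROOT, "%02d_%019d_%s", bucketId, reversed, id);
    }

    public static void main(String[] args) {
        String id = "sensor-17"; // made-up ID for the demo
        System.out.println(rowKey(bucketFor(id, 16), System.currentTimeMillis(), id));
    }
}
```

With a fixed-width bucket prefix, the pre-split points can simply be the
bucket prefixes themselves, so each bucket's writes land in its own region.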



Nishanth S | 22 Jan 20:52 2015

Hbase Error When running Map reduce

Hi All,
I am running a MapReduce job which scans an HBase table for a particular
time period and then creates some files from the result. The job runs fine
for 10 minutes or so, and around 10% of the maps complete successfully.
Here is the error that I am getting. Can someone help?

15/01/22 19:34:33 INFO mapreduce.TableRecordReaderImpl: recovered from
org.apache.hadoop.hbase.client.ScannerTimeoutException: 559843ms
passed since the last invocation, timeout is currently set to 60000
	at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:352)
	at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:194)
	at org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:138)
	at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:533)
	at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
	at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: org.apache.hadoop.hbase.UnknownScannerException:
org.apache.hadoop.hbase.UnknownScannerException: Name:
3432603283499371482, already closed?
	at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:2973)
	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:26929)
	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2175)
	at org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1879)

James Taylor | 22 Jan 19:25 2015

[ANNOUNCE] Apache Phoenix meetup in SF on Tue, Feb 24th

I'm excited to announce the first ever Apache Phoenix meetup, hosted
by salesforce.com in San Francisco on Tuesday, February 24th @ 6pm.
More details here:

Please ping me if you're interested in presenting your company's use
case. We'll have live streaming available for remote participants as