Shushant Arora | 17 Apr 18:46 2014

hive hbase integration

I want to know why Hive-HBase integration is required. Is it because HBase
cannot provide all the functionality of SQL, and if so, why?
What is a storage handler, and what are the best practices for Hive-HBase
integration?
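
For context, my current understanding is that a storage handler is the Hive
mechanism that lets a Hive table be backed by an external store such as
HBase, declared roughly like this (the table and column mapping below are
invented for illustration):

    CREATE TABLE hbase_users(key string, name string)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,info:name")
    TBLPROPERTIES ("hbase.table.name" = "users");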
Hansi Klose | 17 Apr 15:51 2014

taking snapshots creates too many TCP CLOSE_WAIT handles on the hbase master server

Hi,

we use a script to take snapshots on a regular basis and delete old ones.

We noticed that the web interface of the HBase master was no longer
working because of too many open files.

The master reached its open file limit of 32768.

When I ran lsof I saw that there were a lot of TCP CLOSE_WAIT handles open
with the region server as the target.

On the region server there is just one connection to the HBase master.

I can see that the count of CLOSE_WAIT handles grows each time
I take a snapshot; when I delete one, nothing changes.
Each time I take a snapshot there are 20 - 30 new CLOSE_WAIT handles.

Why does the master not close these handles? Is there a timeout
parameter we can use?

We use hbase 0.94.2-cdh4.2.0.

Regards Hansi

Tarang Dawer | 17 Apr 14:19 2014

Re: RetriesExhaustedWithDetailsException if table is deleted

+adding subject

On Thu, Apr 17, 2014 at 5:47 PM, Tarang Dawer <tarang.dawer@...> wrote:

> Hi All
>
> I am using HBase version 0.96.1-hadoop1 with Hadoop version 1.1.1.
>
> I am writing to HBase in batches of 100, and the flow is such that if the
> table does not exist, it gets created.
> The following are snippets of the code:
>
> conf = HBaseConfiguration.create();
> conf.set("hbase.zookeeper.quorum", "192.168.145.144");
> conf.setInt("hbase.zookeeper.property.clientPort", 2181);
> conf.set("hbase.defaults.for.version.skip", "true");
>
> // Create the connection for HBase
> hConnection = HConnectionManager.createConnection(conf);
>
> HTableInterface hTableInterface = hConnection.getTable(tableName);
>
> try {
>     hTableInterface.put(toPersist);
> } catch (IOException ioe) {
>     // THROW EXCEPTION
> }
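
The create-if-missing step is not shown above; roughly it looks like this
(a sketch only, with the column family name assumed rather than copied from
the actual code):

    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    // Sketch of the create-if-missing step; family name "cf" is an assumption.
    HBaseAdmin admin = new HBaseAdmin(conf);
    try {
        if (!admin.tableExists(tableName)) {
            HTableDescriptor desc = new HTableDescriptor(TableName.valueOf(tableName));
            desc.addFamily(new HColumnDescriptor("cf"));
            admin.createTable(desc);
        }
    } finally {
        admin.close();
    }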

Tianying Chang | 17 Apr 01:41 2014

Will BloomFilter still be cached if setCacheBlocks(false) per Get()?

Hi,

We have a use case where some data is mostly read at random, so it pollutes
the block cache and causes big GC pauses. It is better to turn off the
block cache for that data, so we are going to call setCacheBlocks(false)
for those get()s. We know that the index will still be cached, based on the
code path below, so we are safe there. But it is not clear whether the
BloomFilter belongs to a level < searchTreeLevel and also gets cached.

          // Call HFile's caching block reader API. We always cache index
          // blocks, otherwise we might get terrible performance.
          boolean shouldCache = cacheBlocks || (lookupLevel < searchTreeLevel);
          BlockType expectedBlockType;
          if (lookupLevel < searchTreeLevel - 1) {
            expectedBlockType = BlockType.INTERMEDIATE_INDEX;
          } else if (lookupLevel == searchTreeLevel - 1) {
            expectedBlockType = BlockType.LEAF_INDEX;
          } else {
            // this also accounts for ENCODED_DATA
            expectedBlockType = BlockType.DATA;
          }

Or, since the BloomFilter is part of the meta data, I think it is always
cached on read even when per-family/per-query cacheBlocks is turned off.
Am I right?
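
For reference, the per-Get switch I am talking about is just this (a
minimal sketch; the table name and row key are placeholders):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    // Sketch: disable data block caching for a single Get.
    // Table name and row key are placeholders.
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable");
    Get get = new Get(Bytes.toBytes("some-row"));
    get.setCacheBlocks(false); // data blocks read for this Get are not cached
    Result result = table.get(get);
    table.close();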

Thanks
Tian-Ying

shubhendu.singh | 16 Apr 12:33 2014

Releasing HSearch 1.0 - Search and Analytics Engine on hadoop/hbase

After our first release of *HSearch* (announced on this forum) back in
December 2010, we have been working on it, incorporating customer feedback
from production deployments.
Recently, we added the capability to store and analyze structured data in
addition to unstructured data.
With these changes, we are now releasing a new version, naming it 1.0.

The software is available on github at
https://github.com/bizosys?tab=repositories

Some key features of HSearch:
- Fast queries on large datasets – queries typically return in milliseconds
  on terabytes of data.
- Multiple data structures are used for storing the data, depending on the
  nature of the data.
- LRU cache layer for frequently accessed data.
- 5MB index cells to co-locate data by business entity, and secondary
  rollup indexes for fast filtering on large datasets.

Check out HSearch at hadoopsearch.net, download it and try it out.
Let us know your feedback and questions at hsearch@...

Regards 
Shubhendu Shekhar Singh 
HSearch Committer 


Tao Xiao | 15 Apr 13:40 2014

All regions stay on two nodes out of 18 nodes

I am using HDP 2.0.6, which has 18 nodes (region servers). One of my HBase
tables has 50 regions, but I found that all 50 regions sit on just two
nodes rather than being spread evenly across the 18 nodes. I did not
pre-create splits, so this table was gradually split into 50 regions on
its own.

I'd like to know why all the regions stay on just two nodes instead of
across the 18 nodes of the cluster, and how to spread the regions evenly
across all the region servers. Thanks.
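
For reference, my understanding is that the balancer can also be triggered
manually from the Java client, something like this (a sketch against the
0.96-era HBaseAdmin API; not verified on HDP 2.0.6):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    // Sketch: ask the master to run one balancing round.
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    try {
        admin.setBalancerRunning(true, true); // ensure the balancer is enabled
        boolean ran = admin.balancer();       // true if the balancer ran
        System.out.println("balancer invoked: " + ran);
    } finally {
        admin.close();
    }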
Guillermo Ortiz | 15 Apr 11:45 2014

Weird behavior splitting regions

I have a table in HBase that is around 96 GB in size.

I generated 4 regions of 30 GB each. After some time, the table started to
split because the max region size is 1 GB (I just realized that; I'm going
to change it or create more pre-splits, roughly as sketched below).
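
For the record, the change I have in mind looks roughly like this (a sketch
against the 0.94-era client API; the 10 GB limit and the split keys are
made-up values, not what is deployed):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.util.Bytes;

    // Sketch: raise the per-region max size and pre-split at creation time.
    // The size limit and the split keys below are illustrative values only.
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    HTableDescriptor desc = new HTableDescriptor("filters");
    desc.addFamily(new HColumnDescriptor("cf"));
    desc.setMaxFileSize(10L * 1024 * 1024 * 1024); // split only past 10 GB
    byte[][] splitKeys = { Bytes.toBytes("4"), Bytes.toBytes("8"), Bytes.toBytes("c") };
    admin.createTable(desc, splitKeys); // table starts with 4 regions
    admin.close();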

There are two things that I don't understand. How is HBase creating the
splits? Right now I have 130 regions and growing. The problem is the size
of the new regions:

1.7 M    /hbase/filters/4ddbc34a2242e44c03121ae4608788a2
1.6 G    /hbase/filters/548bdcec79cfe9a99fa57cb18f801be2
3.1 G    /hbase/filters/58b50df089bd9d4d1f079f53238e060d
2.5 M    /hbase/filters/5a0d6d5b3b8faf67889ac5f5c2947c4f
1.9 G    /hbase/filters/5b0a35b5735a473b7e804c4b045ce374
883.4 M  /hbase/filters/5b49c68e305b90d87b3c64a0eee60b8c
1.7 M    /hbase/filters/5d43fd7ea9808ab7d2f2134e80fbfae7
632.4 M  /hbase/filters/5f04c7cd450d144f88fb4c7cff0796a2

There are some new regions that are just a few kilobytes! Why are they so
small? And when does HBase decide to split? It only started splitting two
hours after the table was created.

One more thing: I created the table and inserted the data once; I don't
insert new data or modify it afterwards.

Another interesting point is why there are major compactions:
2014-04-15 11:33:47,400 INFO org.apache.hadoop.hbase.regionserver.Store:
Renaming compacted file at
hdfs://m01.cluster:8020/hbase/filters/ef994715505054299ede8c48c600cea4/.tmp/df90c260cb4e4256a153dd178244f04c

Shashidhar Rao | 15 Apr 08:20 2014

Using HBase data

Hi,

I am starting to think about a new project using Hadoop and HBase as my
persistent store, but I am quite confused as to how to use the HBase data.

1. Can this HBase data be used in web applications, i.e. retrieving the
data and showing it on a web page? (A sketch of what I mean is below.)
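
Something like the following is what I have in mind for the web tier (a
minimal sketch; the table, row key and column names are invented):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    // Sketch: fetch one row from HBase and turn it into page content.
    // Table, row key, family and qualifier are invented for illustration.
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "users");
    Get get = new Get(Bytes.toBytes("user-123"));
    Result result = table.get(get);
    byte[] name = result.getValue(Bytes.toBytes("info"), Bytes.toBytes("name"));
    String html = "<p>" + Bytes.toString(name) + "</p>"; // hand to the view layer
    table.close();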

Can somebody please suggest how HBase data is used by other companies?

Some use case links would certainly be helpful.

Regards
Shashi
Shashidhar Rao | 15 Apr 07:16 2014

Zookeeper and Hbase

Hi,

Can somebody please explain how to configure HBase to use ZooKeeper? What
configuration files are involved on the HBase side and the ZooKeeper side,
and do the ZooKeeper servers have to be different from the region servers,
or can they be installed on the same machines as the region servers?
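
From what I have gathered so far, the HBase side at least goes into
hbase-site.xml, roughly like this (a sketch; the host names are
placeholders and I am not sure it is complete):

    <!-- hbase-site.xml sketch; host names are placeholders -->
    <configuration>
      <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
      </property>
      <property>
        <!-- comma-separated list of ZooKeeper hosts -->
        <name>hbase.zookeeper.quorum</name>
        <value>zk1.example.com,zk2.example.com,zk3.example.com</value>
      </property>
      <property>
        <name>hbase.zookeeper.property.clientPort</name>
        <value>2181</value>
      </property>
    </configuration>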

Regards
Shashi
Li Li | 15 Apr 02:32 2014

hbase exception: Could not reseek StoreFileScanner

Mon Apr 14 23:54:40 CST 2014,
org.apache.hadoop.hbase.client.HTable$9@14923f6b, java.io.IOException:
java.io.IOException: Could not reseek StoreFileScanner[HFileScanner
for reader reader=hdfs://192.168.11.150:8020/hbase/vc2.in_link/6
b879cb43205cdae084a280c38fab34a/cf/4dc235709de44f53b2484d2903f1bb75,
compression=none, cacheConf=CacheConfig:enabled [cacheDataOnRead=true]
[cacheDataOnWrite=false] [cacheIndexesOnWrite=false]
[cacheBloomsOnWrite=false] [cacheEvictOnClose=false]
[cacheCompressed=false],
firstKey=\xE82\x14\xFF/\xF04\xA4\xBC\xB0X\xEB\xB4\xE9\xD1\x11\x93h\xD3\xAA\xC4\xAB\x99\xC3\x09\x874\x16VZ\x05\x10/cf:an/1397117856840/Put,
lastKey=\xF0\x1F\xA7\xF7u\x9E.\xB2\x8EZ\xD5\xEB\xD6h\x03
W\x0F\x8A\xA0\x9B>\x0A\xE8\xEC\x9ELu5o\xFE\x03\xCE/cf:an/1397131734218/Put,
avgKeyLen=48, avgValueLen=14, entries=3712302, length=260849569,
cur=\xEC5cA\xF1\x03Y\x01!\xD6\x86\x15\x13"\xD6\xC9\xBDb:#A\x08\x86\x14j\xA0)\xA8\x85\x11\xDCF/cf:an/1397454753471/Maximum/vlen=0/ts=0] to key
\xEC5cA\xF1\x03Y\x01!\xD6\x86\x15\x13"\xD6\xC9\xBDb:#A\x08\x86\x14j\xA0)\xA8\x85\x11\xDCF/cf:an/LATEST_TIMESTAMP/Maximum/vlen=0/ts=0
3968 Mon Apr 14 23:55:50 CST 2014,
org.apache.hadoop.hbase.client.HTable$9@14923f6b, java.io.IOException:
java.io.IOException: Could not reseek StoreFileScanner[HFileScanner
for reader reader=hdfs://192.168.11.150:8020/hbase/vc2.in_link/6
b879cb43205cdae084a280c38fab34a/cf/4dc235709de44f53b2484d2903f1bb75,
compression=none, cacheConf=CacheConfig:enabled [cacheDataOnRead=true]
[cacheDataOnWrite=false] [cacheIndexesOnWrite=false]
[cacheBloomsOnWrite=false] [cacheEvictOnClose=false]
[cacheCompressed=false],
firstKey=\xE82\x14\xFF/\xF04\xA4\xBC\xB0X\xEB\xB4\xE9\xD1\x11\x93h\xD3\xAA\xC4\xAB\x99\xC3\x09\x874\x16VZ\x05\x10/cf:an/1397117856840/Put,
lastKey=\xF0\x1F\xA7\xF7u\x9E.\xB2\x8EZ\xD5\xEB\xD6h\x03
W\x0F\x8A\xA0\x9B>\x0A\xE8\xEC\x9ELu5o\xFE\x03\xCE/cf:an/1397131734218/Put,
avgKeyLen=48, avgValueLen=14, entries=3712302, length=260849569,
cur=\xEC5cA\xF1\x03Y\x01!\xD6\x86\x15\x13"\xD6\xC9\xBDb:#A\x08\x86\x14j\xA0)\xA8\x85\x11\xDC

Srikanth Srungarapu | 14 Apr 22:43 2014

regarding failure of testExportFileSystemState

Hi Folks,
When I tried running "mvn test" on my local machine after checking out the
0.98 branch, I got the following test failures:

Failed tests:
  testExportFileSystemState(org.apache.hadoop.hbase.snapshot.TestExportSnapshot): expected:<0> but was:<1>
  testSnapshotWithRefsExportFileSystemState(org.apache.hadoop.hbase.snapshot.TestExportSnapshot): expected:<0> but was:<1>
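
For anyone trying to reproduce, just this test class can be re-run with
Surefire's standard test filter (assuming a stock Maven setup):

    mvn test -Dtest=TestExportSnapshot
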
Any insights into this would be appreciated :).
Thanks,
Srikanth.
