Andrew Purtell (JIRA) | 1 Dec 2008 09:18

[jira] Updated: (HBASE-1037) Some test cases failing on Windows/Cygwin but not UNIX/Linux


     [
https://issues.apache.org/jira/browse/HBASE-1037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-1037:
----------------------------------

    Fix Version/s: 0.19.0
           Status: Patch Available  (was: Open)

> Some test cases failing on Windows/Cygwin but not UNIX/Linux
> ------------------------------------------------------------
>
>                 Key: HBASE-1037
>                 URL: https://issues.apache.org/jira/browse/HBASE-1037
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: test
>         Environment: Windows NT SP3, Cygwin, 1GB RAM
>            Reporter: Andrew Purtell
>            Priority: Minor
>             Fix For: 0.19.0
>
>         Attachments: 1037-TestThriftServer.patch,
TEST-org.apache.hadoop.hbase.mapred.TestTableMapReduce.txt,
TEST-org.apache.hadoop.hbase.TestInfoServers.txt, TEST-org.apache.hadoop.hbase.thrift.TestThriftServer.txt
>
>
> I've been running tests under Windows/Cygwin while on the road. Meanwhile they are not failing on Hudson. 
> * TestInfoServers and TestTableMapReduce fail with timeout.

Andrew Purtell (JIRA) | 1 Dec 2008 09:18

[jira] Updated: (HBASE-1037) Some test cases failing on Windows/Cygwin but not UNIX/Linux


     [
https://issues.apache.org/jira/browse/HBASE-1037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-1037:
----------------------------------

    Attachment: 1037-TestThriftServer.patch

Patch fixes the timestamp-related failure.

> Some test cases failing on Windows/Cygwin but not UNIX/Linux
> ------------------------------------------------------------
>
>                 Key: HBASE-1037
>                 URL: https://issues.apache.org/jira/browse/HBASE-1037
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: test
>         Environment: Windows NT SP3, Cygwin, 1GB RAM
>            Reporter: Andrew Purtell
>            Priority: Minor
>             Fix For: 0.19.0
>
>         Attachments: 1037-TestThriftServer.patch,
TEST-org.apache.hadoop.hbase.mapred.TestTableMapReduce.txt,
TEST-org.apache.hadoop.hbase.TestInfoServers.txt, TEST-org.apache.hadoop.hbase.thrift.TestThriftServer.txt
>
>
> I've been running tests under Windows/Cygwin while on the road. Meanwhile they are not failing on Hudson. 

Andrew Purtell (JIRA) | 1 Dec 2008 09:32

[jira] Updated: (HBASE-1019) Master should reassign regions away from regionservers under heap stress


     [
https://issues.apache.org/jira/browse/HBASE-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-1019:
----------------------------------

    Attachment: 1019.patch

The attached patch has two parts that don't necessarily have to go in together.

1) Regionservers will relinquish meta regions by closing them proactively if under heap stress. Heap
stress is defined as available heap falling below a minimum threshold (20MB) even after a forced GC.
Hysteresis is applied via a two-minute wait between the assignment of a region and its close, so even
if all regionservers are under heap stress a meta region will find a home.

2) Some changes to HServerLoad make load comparisons heap aware. 
- If the number of regions is less, but heap use is greater, consider the load higher. Otherwise, lower. 
- If the number of regions is greater, the load is always greater.
- If the number of regions is equal, the load difference is determined by heap usage.
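
A minimal sketch of the three comparison rules in part 2, written as a standalone Java method. The class and parameter names (HeapAwareLoad, usedHeapMb) are hypothetical; this is not the code in the attached patch and does not reproduce HServerLoad itself.

```java
// Hypothetical illustration only; not the code in 1019.patch.
final class HeapAwareLoad {
    private HeapAwareLoad() {}

    /**
     * Returns a positive value if server A is considered more loaded than
     * server B, a negative value if less loaded, and 0 if the loads are equal.
     */
    static int compare(int regionsA, int usedHeapMbA, int regionsB, int usedHeapMbB) {
        if (regionsA > regionsB) {
            // More regions: the load is always greater.
            return 1;
        }
        if (regionsA < regionsB) {
            // Fewer regions: higher only if heap use is greater, otherwise lower.
            return usedHeapMbA > usedHeapMbB ? 1 : -1;
        }
        // Equal region counts: heap usage decides.
        return Integer.compare(usedHeapMbA, usedHeapMbB);
    }
}
```

Read literally, the rules are not symmetric: a server with fewer regions but more heap in use compares as higher than its peer, while the peer, having more regions, also compares as higher. For example, compare(1, 100, 2, 50) and compare(2, 50, 1, 100) are both positive, so a method like this would not be safe as a sort comparator without adjusting one of the rules.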

I'm not advocating this patch go in right away. There are some questions:

Item #1 above will close ROOT. Will the master be able to recover from this? (It will also close META regions,
but I think this is ok.)

Item #2 above will change the behavior of the balancer and should be tested on a cluster first.
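
For item #1, the heap stress test and the two-minute hysteresis could be sketched roughly as follows. This is an illustration under assumptions, not the patch: the class and method names are hypothetical, available heap is estimated from java.lang.Runtime, and System.gc() is only a request that the JVM may ignore.

```java
// Hypothetical illustration only; not the code in 1019.patch.
final class HeapStressCheck {
    /** Minimum available heap (20MB) below which a server counts as stressed. */
    static final long MIN_AVAILABLE_BYTES = 20L * 1024 * 1024;
    /** Minimum time (two minutes) a region stays open after assignment. */
    static final long MIN_HOLD_MILLIS = 2L * 60 * 1000;

    private HeapStressCheck() {}

    /** Estimates the heap still available: max heap minus heap currently in use. */
    static long availableHeap() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return rt.maxMemory() - used;
    }

    /** True if available heap stays below the threshold even after requesting a GC. */
    static boolean underHeapStress() {
        if (availableHeap() >= MIN_AVAILABLE_BYTES) {
            return false;
        }
        System.gc(); // a hint to the JVM, not a guaranteed collection
        return availableHeap() < MIN_AVAILABLE_BYTES;
    }

    /** A region may be closed only if the server is stressed and the hold time has passed. */
    static boolean mayClose(long assignedAtMillis, long nowMillis, boolean stressed) {
        return stressed && (nowMillis - assignedAtMillis) >= MIN_HOLD_MILLIS;
    }
}
```

The hold time is what lets a meta region find a home when every regionserver is stressed: whichever server receives it keeps it for at least two minutes before it is allowed to give it up again.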

> Master should reassign regions away from regionservers under heap stress
> ------------------------------------------------------------------------

Andrew Purtell (JIRA) | 1 Dec 2008 09:32

[jira] Updated: (HBASE-1019) Master should reassign regions away from regionservers under heap stress


     [
https://issues.apache.org/jira/browse/HBASE-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-1019:
----------------------------------

    Status: Patch Available  (was: Open)

> Master should reassign regions away from regionservers under heap stress
> ------------------------------------------------------------------------
>
>                 Key: HBASE-1019
>                 URL: https://issues.apache.org/jira/browse/HBASE-1019
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: master, regionserver
>            Reporter: Andrew Purtell
>             Fix For: 0.19.0
>
>         Attachments: 1019.patch
>
>
> Once the changes for HBASE-1018 go in, the master should reassign regions away from regionservers that
indicate heap stress to those that do not. Reassignment activity must be smart enough not to overload
remaining regionservers.

--

-- 
This message is automatically generated by JIRA.
-

Andrew Purtell (JIRA) | 1 Dec 2008 15:44

[jira] Commented: (HBASE-1039) Compaction fails if bloomfilters are enabled


    [
https://issues.apache.org/jira/browse/HBASE-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652014#action_12652014
] 

Andrew Purtell commented on HBASE-1039:
---------------------------------------

One crucial detail, it seems, is that the bloomfilter-related exception happens even when no bloomfilters
are enabled in the schema. There are also DFS-related exceptions.

From Thibaut:

I created all the tables from scratch and didn't change them at run time. The schema for all the tables right
now is as follows. (data is a bytearray of a serialized google buffer object)

    {NAME => 'entries', IS_ROOT => 'false', IS_META => 'false', FAMILIES => [{NAME => 'data', BLOOMFILTER =>
'false', COMPRESSION => 'NONE', VERSIONS => '3', LENGTH => '2147483647', TTL => '-1', IN_MEMORY =>
'false', BLOCKCACHE => 'false'}]}

I reran everything from scratch with the new table schema and got the same exception again, just on a
different table this time. (Disabling the bloomfilter, compression and the blockcache doesn't seem to
have any effect.)

2008-11-30 23:22:20,774 ERROR org.apache.hadoop.hbase.regionserver.CompactSplitThread:
Compaction failed for region entries,,1228075277421
java.lang.IllegalArgumentException: maxValue must be > 0
    at org.onelab.filter.HashFunction.<init>(HashFunction.java:84)
    at org.onelab.filter.Filter.<init>(Filter.java:97)
    at org.onelab.filter.BloomFilter.<init>(BloomFilter.java:102)

Jim Kellerman (JIRA) | 1 Dec 2008 18:31

[jira] Commented: (HBASE-1039) Compaction fails if bloomfilters are enabled


    [
https://issues.apache.org/jira/browse/HBASE-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652069#action_12652069
] 

Jim Kellerman commented on HBASE-1039:
--------------------------------------

You can neither enable nor disable bloomfilters once a column has data in it.

If you enable it on a table with existing data, compact assumes all stores have a bloom filter and will NPE
because it cannot read it.

If you disable bloom filters on a table that has data in it, compact will fail because the store file knows it
has a bloom filter.

It is easier to fix the latter than the former by going through the store files and deleting the bloom filter file.

Enabling bloom filters after the table has data in it is much harder as all store files must be read and a bloom
filter created for each.

It would be better to disallow the enabling of bloom filters once the table has been created. This would at
least prevent shooting yourself in the foot.

> Compaction fails if bloomfilters are enabled
> --------------------------------------------
>
>                 Key: HBASE-1039
>                 URL: https://issues.apache.org/jira/browse/HBASE-1039
>             Project: Hadoop HBase

Andrew Purtell (JIRA) | 1 Dec 2008 19:55

[jira] Commented: (HBASE-1039) Compaction fails if bloomfilters are enabled


    [
https://issues.apache.org/jira/browse/HBASE-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652094#action_12652094
] 

Andrew Purtell commented on HBASE-1039:
---------------------------------------

According to the reporter (Thibaut, on hbase-user), the table schema never uses bloomfilters, yet the
bloomfilter-related exceptions occur.

> Compaction fails if bloomfilters are enabled
> --------------------------------------------
>
>                 Key: HBASE-1039
>                 URL: https://issues.apache.org/jira/browse/HBASE-1039
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.18.1
>            Reporter: Andrew Purtell
>
> From Thibaut up on the list.
> As soon as hbase tries to compact the table, the following exception appears in the logfile: (Other
> compactions also work fine without any errors)
> 2008-11-30 00:55:57,769 ERROR
> org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction failed for region mytable,,1228002541526
> java.lang.IllegalArgumentException: maxValue must be > 0
>     at org.onelab.filter.HashFunction.<init>(HashFunction.java:84)
>     at org.onelab.filter.Filter.<init>(Filter.java:97)

stack (JIRA) | 1 Dec 2008 20:03

[jira] Commented: (HBASE-1039) Compaction fails if bloomfilters are enabled


    [
https://issues.apache.org/jira/browse/HBASE-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652099#action_12652099
] 

stack commented on HBASE-1039:
------------------------------

Why not have the bloom Writer simply not build a bloom filter unless ALL inputs already have blooms, rather
than NPE (in getReaders, if an input doesn't have nrows, set it to -1)? It could output a warning and just carry
on. New flushes will include bloom filters, so subsequent compactions will have bloom filters to hand.
Eventually all inputs will have bloom filters, and only then will compaction write out the compacted file with blooms.
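
A rough sketch of that decision, assuming each compaction input reports its bloom row count with -1 standing for "no bloom filter". The class, method, and constant names are hypothetical and are not HBase APIs.

```java
// Hypothetical sketch of the proposal; names are illustrative, not HBase APIs.
final class BloomCompactionPolicy {
    /** Marker for an input store file that carries no bloom filter row count. */
    static final int NO_BLOOM = -1;

    private BloomCompactionPolicy() {}

    /**
     * Decides whether the compacted output should get a bloom filter.
     * A bloom is written only when blooms are enabled for the family and
     * every input already has one; otherwise a warning is printed and
     * the compaction carries on without a bloom.
     */
    static boolean shouldWriteBloom(boolean bloomEnabled, int[] inputRowCounts) {
        if (!bloomEnabled) {
            return false;
        }
        for (int nrows : inputRowCounts) {
            if (nrows == NO_BLOOM) {
                System.err.println("WARN: input without bloom filter; compacting without one");
                return false;
            }
        }
        return true;
    }
}
```

With this rule a compaction never needs to read a bloom filter that does not exist, so the mixed case degrades to a warning instead of an exception.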

Adding disallow set/unset would be awkward in implementation; i.e. providing the appropriate context
that determines when a flag is settable or not in HTD.

> Compaction fails if bloomfilters are enabled
> --------------------------------------------
>
>                 Key: HBASE-1039
>                 URL: https://issues.apache.org/jira/browse/HBASE-1039
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.18.1
>            Reporter: Andrew Purtell
>
> From Thibaut up on the list.
> As soon as hbase tries to compact the table, the following exception appears in the logfile: (Other
> compactions also work fine without any errors)

stack (JIRA) | 1 Dec 2008 20:13

[jira] Commented: (HBASE-1039) Compaction fails if bloomfilters are enabled


    [
https://issues.apache.org/jira/browse/HBASE-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652103#action_12652103
] 

stack commented on HBASE-1039:
------------------------------

bq. There seems to be some talk that HFS will incorporate bloom filter code. Anyone know the status on this or
how it will impact the need for hbase to implement this?

Do you have an issue id where this is discussed Bruce?

That you'd get the bloom filter exception on a table that doesn't have it enabled -- or that never had it
enabled in the past -- is odd... difficult to explain.

> Compaction fails if bloomfilters are enabled
> --------------------------------------------
>
>                 Key: HBASE-1039
>                 URL: https://issues.apache.org/jira/browse/HBASE-1039
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.18.1
>            Reporter: Andrew Purtell
>
> From Thibaut up on the list.
> As soon as hbase tries to compact the table, the following exception appears in the logfile: (Other
> compactions also work fine without any errors)

Jonathan Gray (JIRA) | 1 Dec 2008 20:15

[jira] Commented: (HBASE-1039) Compaction fails if bloomfilters are enabled


    [
https://issues.apache.org/jira/browse/HBASE-1039?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652104#action_12652104
] 

Jonathan Gray commented on HBASE-1039:
--------------------------------------

Correction from Thibaut on list:

{quote}
You are right. I just saw that that table (the only table) does indeed have bloomfilters and compression
enabled (blockcache is disabled).
{NAME => 'entries', IS_ROOT => 'false', IS_META => 'false', FAMILIES => [{NAME => 'data', BLOOMFILTER =>
'true', COMPRESSION => 'BLOCK', VERSIONS => '3', LENGTH => '2147483647', TTL => '-1', IN_MEMORY =>
'false', BLOCKCACHE => 'false'}]}

As for the load/dfs errors, thanks for explaining this. I was only starting up one region server to have all
the log entries in one place, and it was indeed under heavy load. (Multiple threads setting/getting keys)
{quote}

He originally said an 8-node cluster, but now mentions only a single regionserver. So one problem seems
related to blooms, and he is probably also seeing load-related issues.

> Compaction fails if bloomfilters are enabled
> --------------------------------------------
>
>                 Key: HBASE-1039
>                 URL: https://issues.apache.org/jira/browse/HBASE-1039
>             Project: Hadoop HBase

