frank | 7 Jan 2011 12:05

openvz with OCFS2

Hi OpenVZ users,

We recently started using OCFS2 between our two OpenVZ servers (both
running 2.6.18-194.26.1.el5.028stab079.2PAE), and we are experiencing
periodic system crashes. We have an open bug with Oracle about this, but
we would like to know whether other people use this kind of filesystem
with OpenVZ and what their experience has been.

Thanks and regards.

Frank

-- 
This message has been scanned by MailScanner for viruses and other
dangerous content, and is considered clean.
Aleksandar Ivanisevic | 10 Jan 2011 10:27

Re: openvz with OCFS2

frank <frank@...> writes:

> Hi OpenVZ users,
>
> recently we have started to use OCFS2 between our couple of OpenVZ
> servers (which are 2.6.18-194.26.1.el5.028stab079.2PAE) and we are
> experimenting periodic system crashes. We have an open bug with Oracle
> about this, but we would like to know if there are more people who use
> this kind of filesystem with OpenVZ and what is their experience.

I have tested it, and it ran stably, but OpenVZ lacked the locking
needed to use it reliably.

How are you preventing one VE from being started on multiple nodes over
the same data?
Tim Small | 10 Jan 2011 16:35

Re: Re: openvz with OCFS2

On 10/01/11 09:27, Aleksandar Ivanisevic wrote:
> How are you preventing one VE to get started on multiple nodes over
> the same data?
>   

We do this using pacemaker.  We don't use OCFS2, we just use ext3 on top
of drbd, but the same principles could be applied.

Cheers,

Tim.

-- 
South East Open Source Solutions Limited
Registered in England and Wales with company number 06134732.  
Registered Office: 2 Powell Gardens, Redhill, Surrey, RH1 1TQ
VAT number: 900 6633 53  http://seoss.co.uk/ +44-(0)1273-808309
Frank | 10 Jan 2011 19:30

Re: openvz with OCFS2

> From: Aleksandar Ivanisevic <aleksandar@...>
> Subject: [Users] Re: openvz with OCFS2
> To: users@...
> Message-ID: <m3lj2tnm5n.fsf@...>
> Content-Type: text/plain; charset=us-ascii
>
> frank <frank@...> writes:
>
>> Hi OpenVZ users,
>>
>> recently we have started to use OCFS2 between our couple of OpenVZ
>> servers (which are 2.6.18-194.26.1.el5.028stab079.2PAE) and we are
>> experimenting periodic system crashes. We have an open bug with Oracle
>> about this, but we would like to know if there are more people who use
>> this kind of filesystem with OpenVZ and what is their experience.
>
> I have tested it and it was working stable but openvz lacked proper
> locking to allow it to be used reliably.
>
> How are you preventing one VE to get started on multiple nodes over
> the same data?
>

We use pacemaker to start/stop VEs and to make sure they only run on one
node. The crashes happen periodically anyway. We have about 40 VEs
distributed between both nodes, some with high disk I/O.
We had GFS before and we will have to return to it, but we wanted to know
about other OCFS2 installations.

Thanks and regards.

Aleksandar Ivanisevic | 11 Jan 2011 09:13

Re: openvz with OCFS2

"Frank" <frank@...> writes:

> We use pacemaker to start/stop VEs and to be sure they only run on one
> node. Anyway crashes happen periodicly. We have about 40 VEs
> distributed

Pacemaker is nice, but if it fails (think split brain), there is
really nothing to prevent data loss.

How do you handle split brain on a shared volume when you have half of
the VEs running on each side?

[...]
Tim Small | 11 Jan 2011 10:22

Re: Re: openvz with OCFS2

On 11/01/11 08:13, Aleksandar Ivanisevic wrote:
> "Frank" <frank@...> writes:
>
>   
>> We use pacemaker to start/stop VEs and to be sure they only run on one
>> node. Anyway crashes happen periodicly. We have about 40 VEs
>> distributed
>>     
> Pacemaker is nice, but if it fails (think split brain), there is
> really nothing to prevent data loss.
>
> How do you handle split brain on a shared volume when you have half of
> VEs running on each side?
>   

That's unlikely with our topology (one half of the split is likely to
have no connectivity, and therefore the changes there can be discarded),
and we would do manual split brain recovery if necessary (haven't had to
yet).

We don't currently use drbd in multi-master mode, instead we have a
number of drbds per node pair, and all the VEs associated with a given
drbd must run on the same node at the same time.
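
As an illustration of the placement rule described above, here is a
minimal sketch. The VE IDs, drbd resource names and node names are
hypothetical, and in practice a cluster manager such as pacemaker would
enforce the rule rather than a script like this:

```python
from collections import defaultdict


def placement_violations(ve_to_drbd, ve_to_node):
    """Return {drbd: sorted node names} for every drbd whose running VEs
    are spread over more than one node (which the rule forbids)."""
    nodes_per_drbd = defaultdict(set)
    for ve, drbd in ve_to_drbd.items():
        node = ve_to_node.get(ve)  # None means the VE is not running
        if node is not None:
            nodes_per_drbd[drbd].add(node)
    return {d: sorted(n) for d, n in nodes_per_drbd.items() if len(n) > 1}


# Hypothetical layout: two VEs on drbd0, one VE on drbd1.
ve_to_drbd = {101: "drbd0", 102: "drbd0", 201: "drbd1"}

ok = {101: "nodeA", 102: "nodeA", 201: "nodeB"}
bad = {101: "nodeA", 102: "nodeB", 201: "nodeB"}

print(placement_violations(ve_to_drbd, ok))   # no drbd is split across nodes
print(placement_violations(ve_to_drbd, bad))  # drbd0 is used from two nodes
```

Because each drbd is only ever active on one node, no cluster
filesystem is needed on top of it.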

Another way to solve the problem would be to use methods similar to
those of OCFS2 etc. (AFAIK), i.e. corosync instead of heartbeat (with
more than two nodes in your corosync totem ring), and to set your quorum
so that split-brain cannot occur at all.
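
To show why more than two nodes matter for quorum, here is a small
sketch. It assumes one vote per node and a simple-majority rule; the
function names are made up for illustration and are not a corosync API:

```python
def quorum_threshold(total_votes):
    """Smallest number of votes that is a strict majority."""
    return total_votes // 2 + 1


def has_quorum(visible_votes, total_votes):
    """True if a partition seeing `visible_votes` may keep running resources."""
    return visible_votes >= quorum_threshold(total_votes)


# Compare a two-node and a three-node cluster after a network split.
for total in (2, 3):
    for visible in range(1, total + 1):
        state = "quorate" if has_quorum(visible, total) else "not quorate"
        print(f"{total} nodes, partition sees {visible}: {state}")
```

With two nodes, a split leaves each side with one vote out of two, so
neither side has a majority. With three nodes, a 2/1 split leaves
exactly one quorate side, and only that side would run the VEs.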

Cheers,

Aleksandar Ivanisevic | 11 Jan 2011 11:23

Re: openvz with OCFS2

Tim Small <tim@...> writes:

[...]

> We don't currently use drbd in multi-master mode, instead we have a
> number of drbds per node pair, and all the VEs associated with a given
> drbd must run on the same node at the same time.

That's exactly what I'm doing too. Why do you need OCFS2 then?

> Another way to solve the problem would be to use similar methods to
> OCFS2 etc. (AFAIK) i.e. corosync instead of heartbeat (and use more than
> two nodes in your corosync totem ring) and set your quorum to prevent
> split-brain occurring at all.

I thought about that, but since drbd doesn't support more than 2
nodes, I don't really know how that would work.
frank | 12 Jan 2011 08:45

Re: openvz with OCFS2

On 11/01/11 18:00, users-request@... wrote:
> From: Aleksandar Ivanisevic <aleksandar@...>
>
> "Frank" <frank@...> writes:
>> We use pacemaker to start/stop VEs and to be sure they only run on one
>> node. Anyway crashes happen periodicly. We have about 40 VEs
>> distributed
>
> Pacemaker is nice, but if it fails (think split brain), there is
> really nothing to prevent data loss.

Of course there is: stonith.

> How do you handle split brain on a shared volume when you have half of
> VEs running on each side?
>
> [...]

We use pacemaker with heartbeat and stonith, and that's the way to
prevent split brain.

Frank

Aleksandar Ivanisevic | 12 Jan 2011 09:58

Re: openvz with OCFS2

frank <frank@...> writes:

> On 11/01/11 18:00, users-request@... wrote:
>> From: Aleksandar Ivanisevic
>> <aleksandar@...> "Frank"
>> <frank@...> writes:
>>> >  We use pacemaker to start/stop VEs and to be sure they only run on one
>>> >  node. Anyway crashes happen periodicly. We have about 40 VEs
>>> >  distributed
>> Pacemaker is nice, but if it fails (think split brain), there is
>> really nothing to prevent data loss.
> Of course there is, is stonith
>> How do you handle split brain on a shared volume when you have half of
>> VEs running on each side?
>>
>> [...]
> We use pacemaker with heartbeat ans stonith, and thats the way to
> prevent split brain.

Sorry, looks like I wasn't clear enough. stonith assumes an HA cluster,
but I'm talking about a load-balancing cluster. Most of the time I
can't afford to kill a node.

-- 
You are arrogant, overbearing, and venting your own frustration. -- Ivan
Tišljar, hr.comp.os.linux
Tim Small | 12 Jan 2011 10:21

Re: Re: openvz with OCFS2

On 12/01/11 08:58, Aleksandar Ivanisevic wrote:
>>
>>> How do you handle split brain on a shared volume when you have half of
>>> VEs running on each side?
>>>
>>> [...]
>>>       
>> We use pacemaker with heartbeat ans stonith, and thats the way to
>> prevent split brain.
>>     
> Sorry, looks like I wasn't clear enough. stonith asumes HA cluster,
> but i'm talking about a load balancing cluster. Most of the time I
> can't afford killing a node.
>   

I'm not sure what scenarios you are considering where neither an
appropriately set quorum nor stonith would be applicable?  If one node
has gone crazy and is trying to compete with working nodes, surely
stonith is a potentially applicable technique?

Either split-brain is a BAD THING for your app, in which case stonith is
worth considering, or:

split-brain doesn't really matter for your app, and you're happy to
consider manual recovery if it happens?

In both cases an appropriately set quorum level should prevent it
happening at all?

Am I missing something?