file system resource becomes inaccessible when any of the nodes goes down
Muhammad Sharfuddin <M.Sharfuddin <at> nds.com.pk>
2015-07-05 16:13:56 GMT
SLES 11 SP 3 + online updates(pacemaker-1.1.11-0.8.11.70
It's a dual-primary DRBD cluster, which mounts a file system resource on
both cluster nodes simultaneously (file system type is ocfs2).
Whenever either node goes down, the file system (/sharedata) becomes
inaccessible for exactly 35 seconds on the other (surviving/online) node,
and then becomes available again on that node.
Please help me understand why the node that survives (remains online) is
unable to access the file system resource (/sharedata) for 35 seconds,
and how I can fix the cluster so that the file system remains accessible on
the surviving node without any interruption/delay (in my case, about
35 seconds).
By inaccessible, I mean that running "ls -l /sharedata" and "df
/sharedata" on the online node returns no output and does not return the
prompt for exactly 35 seconds once the other node becomes offline.
E.g. "node1" went offline at around 01:37:15, and the /sharedata
file system was then inaccessible between 01:37:35 and 01:38:18 on the
online node, i.e. "node2".
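To pin down the start and length of the hang more precisely than by watching "ls" and "df" by hand, something like the following could be left running on the surviving node. This is only a rough sketch: the function names (time_access, monitor, find_stalls) and the 1-second threshold are made up for illustration, and it assumes that a blocked os.statvfs() call on the mount point behaves the same way as the hanging "df".

```python
import os
import time


def time_access(path):
    """Time a single statvfs() call on path; return (latency_seconds, ok)."""
    start = time.monotonic()
    try:
        os.statvfs(path)  # blocks for as long as the file system is frozen
        ok = True
    except OSError:
        ok = False
    return time.monotonic() - start, ok


def monitor(path, count, interval=1.0):
    """Probe path count times; return a list of (HH:MM:SS, latency) samples."""
    samples = []
    for _ in range(count):
        label = time.strftime("%H:%M:%S")  # wall-clock time the probe started
        latency, _ok = time_access(path)
        samples.append((label, latency))
        time.sleep(interval)
    return samples


def find_stalls(samples, threshold=1.0):
    """Keep only the samples whose latency reached threshold seconds."""
    return [(label, lat) for label, lat in samples if lat >= threshold]
```

For example, calling find_stalls(monitor("/sharedata", 300)) while the other node is taken down should list the time at which the blocked probe started and how many seconds it took to return, which can then be compared against the timestamps in /var/log/messages.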
/var/log/messages on node2, when node1 went offline:
Jul 5 01:37:26 node2 kernel: [ 675.255865] drbd r0: PingAck did not arrive in time.
Jul 5 01:37:26 node2 kernel: [ 675.255886] drbd r0: peer( Primary -> Unknown ) conn( Connected -> NetworkFailure ) pdsk( UpToDate -> DUnknown )