Michael Grant | 9 Nov 2002 17:04

clustering freebsd

I'm new to the list (but not new to unix!)  I've been running freebsd
for years now on a box I colo.  I've got some clients and sell some
services on my box.  I'm becoming very interested in creating a
smallish cluster of machines to make my little operation more
reliable.

One of the big things that causes me downtime is upgrading the OS.
I'm also worried about hardware failure (which luckily hasn't happened 
to me yet...)  I too would like to achieve at least 5 nines.
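
For scale, the downtime budget behind "5 nines" can be worked out directly.
The sketch below is only illustrative arithmetic; the function name is made
up for this example and an average year of 365.25 days is assumed.

```python
# Yearly downtime budget for a given number of "nines" of availability.
# Assumption: an average year of 365.25 days.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(nines: int) -> float:
    """Return the allowed downtime in minutes per year, e.g. nines=5 means 99.999%."""
    unavailability = 10 ** (-nines)  # 5 nines -> 0.00001
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(f"{n} nines: {downtime_minutes_per_year(n):.2f} minutes/year")
```

Five nines leaves a little over five minutes per year, which is less than a
single reboot for an OS upgrade usually takes on one box.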

I read all the archives of this list back to January 2002.  Andy's
phase-2 project definitely sounds cool.

Let's say I have a cluster of n machines.  Some of those n machines
may be running a web server, some a shell server, some mail server,
some pop/imap mail servers...etc.  How is an incoming connection sent
to the right machine?  It seems like there needs to be a single
machine in front of the cluster to send connections the right way.
Isn't this a single point of failure?

If you do have multiple machines answering requests, how's this done?
With multiple IP addresses?   I know one can specify multiple A
records in DNS and that it'll do a sort of round-robin.  But does this 
work well?  What if one of the machines is down and a caching dns
server returns an ip address of one of the down machines?  Seems like
you need then to start modifying the dns zone to take out the down
machines and use a low ttl.  This starts to get ugly quickly.
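
The "take the down machines out of the zone" idea can be sketched as a small
filter that decides which A records to publish.  This is a rough sketch under
assumptions, not a tested tool: the function names are invented here, the
addresses are documentation placeholders, and actually rewriting and
reloading the zone (with a low TTL) is left out.

```python
import socket
from typing import Callable, Iterable, List


def tcp_alive(host: str, port: int = 80, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def records_to_publish(addresses: Iterable[str],
                       is_alive: Callable[[str], bool] = tcp_alive) -> List[str]:
    """Keep only the addresses that pass the health check.

    If every check fails, return the full list anyway: the checker itself
    may be the broken part, and an empty record set helps nobody.
    """
    all_addrs = list(addresses)
    healthy = [a for a in all_addrs if is_alive(a)]
    return healthy or all_addrs


if __name__ == "__main__":
    # Fake health check for demonstration: pretend 192.0.2.11 is down.
    pool = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]
    print(records_to_publish(pool, is_alive=lambda a: a != "192.0.2.11"))
```

Even with a filter like this, resolvers that cached the old answer keep
handing out the dead address until the TTL expires, which is the ugliness
described above.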

Second problem I have been thinking about is shared disk.  I read a
post by someone who also had this concern.  One obvious way to solve
(Continue reading)

Alex | 9 Nov 2002 19:45

Re: clustering freebsd


Hello Michael,

Saturday, November 09, 2002, 5:04:58 PM, you wrote:

> I'm new to the list (but not new to unix!)  I've been running freebsd
> for years now on a box I colo.  I've got some clients and sell some
> services on my box.  I'm becoming very interested in creating a
> smallish cluster of machines to make my little operation more
> reliable.

> One of the big things that causes me downtime is upgrading the OS.
> I'm also worried about hardware failure (which luckily hasn't happened 
> to me yet...)  I too would like to achieve at least 5 nines.

> I read all the archives of this list back to january 2002.  Andy's
> phase-2 project definitely sounds cool.

> Let's say I have a cluster of n machines.  Some of those n machines
> may be running a web server, some a shell server, some mail server,
> some pop/imap mail servers...etc.  How is an incoming connection sent
> to the right machine?  It seems like there needs to be a single
> machine in front of the cluster to send connections the right way.
> Isn't this a single point of failure?

If this is a real problem, then look into a high-availability server pair.
Basically you have two servers: if the first one goes down, the second
takes over.

> If you do have multiple machines answering requests, how's this done?

Greg Lewis | 9 Nov 2002 20:06

Re: clustering freebsd

On Sat, Nov 09, 2002 at 07:45:13PM +0100, Alex wrote:
> > Does anyone know of some list of clustering software?  Is there
> > anything I can use today to do #2 that runs on freebsd (or other bsd
> > systems)?
> 
> For most applications it means rewriting the software for use in a
> cluster.
> 
> Check the ports system for the strings cluster and MPI.  There is a
> third option but I forgot its name (you'll find it on sites about
> MPI; it could be MVP but I'm not sure).

PVM.  There are variants of MPI too, MPICH, LAM, etc.  There are even
variants for specific hardware like Myrinet.

You may also want to look into the PBS port, although it's sort of old
and should be updated to a more recent OpenPBS version (I have 2.3.14,
which is free from the nastier license terms).  There is also a SLURM
port that has appeared recently and someone was working on a SGE port
too.

These are more HPC clustering tools though and you seem to be looking
in the HA space instead.

"Clustering software" is rather a broad category.  There is software
in that category that runs on FreeBSD, but you'll need to be more
specific to get better answers :).

I don't believe FreeBSD currently has anything for transparent process
migration like BProc or MOSIX (although IIRC MOSIX was originally 
(Continue reading)

Rouzer, Charles A (Chuck | 9 Nov 2002 20:46

iSCSI driver (Re: clustering freebsd)

You can have two machines share a single physical SCSI bus, which
would allow two machines to share a RAID or JBOD.  IMO, it would be risky
to load balance between the two machines running the same applications
because of data consistency.  It would be good for fail-over though.
Unfortunately there aren't many open-source apps (databases being the most
critical) that provide high availability.  I do believe PostgreSQL is
working to provide replication, but this isn't needed if FreeBSD eventually
gets its own iSCSI driver.

I checked with the author of Vinum (FreeBSD software RAID).  You
could create remote (WAN) synchronous mirroring by using iSCSI and software
RAID (Vinum) for mirroring the data between the local drive and a remote
iSCSI drive.

Chuck

> > Second problem I have been thinking about is shared disk.  I read a
> > post by someone who also had this concern.  One obvious way to solve
> > the shared disk problem is to have another box which has a bunch of
> > disks in a RAID configuration, and mount the disks via nfs.  This disk
> > box would probably need to be highly available with redundant power
> > supplies and the like.
>
> > However, I'm not so convinced that a third disk box is the right
> > answer.  I'd like to see something which could mirror (in real time) a
> > file system over the lan, thus keeping 2+ disks in sync just like a
> > RAID array spread over multiple systems.  Does such a thing exist?
> > After hours of searching, I could find nothing that did this.
>
> Keeping disks in sync is asking for trouble, but it can be done.

Michael Grant | 9 Nov 2002 22:13

Re: clustering freebsd


Alex wrote:
> For most applications it means rewriting the software for use in a
> cluster.

That's a shame; that will make it much, much harder.  I don't really
want to start messing with apache, imapd, or stuff like that where I
would end up having to support a parallel version.  Surely there must
be an easier way?

> Check the ports system for the strings cluster and MPI.  There is a
> third option but I forgot its name (you'll find it on sites about
> MPI; it could be MVP but I'm not sure).

then Greg Lewis wrote:
> PVM.  There are variants of MPI too, MPICH, LAM, etc.  There are even
> variants for specific hardware like Myrinet.
>
> You may also want to look into the PBS port, although it's sort of old
> and should be updated to a more recent OpenPBS version (I have 2.3.14,
> which is free from the nastier license terms).  There is also a SLURM
> port that has appeared recently and someone was working on a SGE port
> too.
>
> These are more HPC clustering tools though and you seem to be looking
> in the HA space instead.

Wow, alphabet soup!  I admit that I've never heard of MPI, PVM, MPICH, 
LAM, or HPC.  I have heard of beowulf.  I'll do some more reading on
these.  Thanks.

Rouzer, Charles A (Chuck | 9 Nov 2002 23:24

Re: clustering freebsd


        Michael,

I would suggest looking at the current load balancing, fail-over, and
high availability (in order of complexity) articles that are available on
the internet, which explain how others built their clusters and the basics of
clusters.  Find solutions that could work for you and implement them with
your FreeBSD platform.  There really isn't a "here's a FreeBSD fail-over
cluster solution", because it is usually a custom environment like the setup
below, though you might find some tools that have been ported to FreeBSD to
help you get to where you want to be.

One of the biggest issues to overcome is data integrity and consistency.
You can't have two machines trying to update the same data at once.

You can set this up now with shared SCSI, but I am planning on setting up
something like this when a decent synchronous network file system is
available:

Two machines, each running separately with its own active users and
applications, each having local access to a mirror of the other machine's
data.  When the online machine notices the offline machine has been down for n
minute(s), it will mount the local mirror, bring up virtual interfaces, and
start all processes with their configurations.  The offline machine would
ideally also have "shutdown" procedures in the event that the outage was
network related.  A recover procedure would involve updating then mounting
the local mirror on the offline machine, killing the processes on the online
machine, removing/creating virtual interfaces, and starting the processes on
the recovered machine.
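
The takeover half of that procedure can be sketched as a small heartbeat
monitor.  This is only an illustration under assumptions: the class name, the
step names, and the 60-second timeout are invented for the example, the steps
are plain strings rather than real commands, and recovery and the
network-outage ("shutdown") case are not handled.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical step names mirroring the description above; a real setup
# would map each one to site-specific scripts.
TAKEOVER_STEPS = [
    "mount local mirror",
    "bring up virtual interfaces",
    "start peer's processes",
]


@dataclass
class PeerMonitor:
    """Tracks the peer's heartbeats and decides when to take over."""
    timeout_s: float          # the "n minute(s)" threshold, in seconds
    last_seen: float = 0.0    # timestamp of the last heartbeat received
    taken_over: bool = False  # True once the takeover steps were issued

    def heartbeat(self, now: float) -> None:
        """Record that the peer was seen alive at time `now`."""
        self.last_seen = now

    def poll(self, now: float) -> List[str]:
        """Return the takeover steps exactly once, when the peer times out."""
        if not self.taken_over and now - self.last_seen >= self.timeout_s:
            self.taken_over = True
            return list(TAKEOVER_STEPS)
        return []


if __name__ == "__main__":
    monitor = PeerMonitor(timeout_s=60)
    monitor.heartbeat(now=0)
    print(monitor.poll(now=30))  # peer still considered alive
    print(monitor.poll(now=60))  # timeout reached, takeover steps returned
```

A missed heartbeat cannot distinguish a dead peer from a dead network link,
which is why the description above also wants "shutdown" procedures on the
machine that went offline.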


Joshua Goodall | 9 Nov 2002 23:30

Re: clustering freebsd

On Sat, Nov 09, 2002 at 10:13:15PM +0100, Michael Grant wrote:
> It's possible that what I want doesn't exist (yet). I would like to
> make something highly reliable, but not necessarily something that
> involves failover to a hot spare.  In my mind, I'd rather have 2 or
> more boxes there available to answer requests rather than one sitting
> there uselessly until the other fails.

Typically one would specify a load balancer sitting in front of the
application servers.  The LB is responsible for handing off inbound
connections to a destination server, using a variety of algorithms
and detecting server failures.
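
As a rough illustration of that hand-off logic, here is a minimal round-robin
sketch that skips servers marked as failed.  It is not how any particular
product works; the class and method names are invented for this example, and
real balancers do their own health probing and offer other algorithms.

```python
from typing import List, Set


class RoundRobinBalancer:
    """Hands out servers in rotation, skipping those marked down."""

    def __init__(self, servers: List[str]) -> None:
        if not servers:
            raise ValueError("need at least one server")
        self.servers = list(servers)
        self.down: Set[str] = set()
        self._next = 0  # index of the next server to try

    def mark_down(self, server: str) -> None:
        """Record a detected failure so the server is skipped."""
        self.down.add(server)

    def mark_up(self, server: str) -> None:
        """Put a recovered server back into rotation."""
        self.down.discard(server)

    def pick(self) -> str:
        """Return the next healthy server, trying each one at most once."""
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server not in self.down:
                return server
        raise RuntimeError("no healthy servers")


if __name__ == "__main__":
    lb = RoundRobinBalancer(["web1", "web2", "web3"])
    lb.mark_down("web2")
    print([lb.pick() for _ in range(4)])
```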

Where I have used them, I've preferred the appliance-style dedicated
LB hardware (e.g. the Foundry ServerIron).  There are a couple of
software tools in the ports collection (net/balance and net/loadd)
but I've not tried them.

In a true HA situation, I'd deploy two LBs that share a virtual
IP address (e.g. via VRRP).
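
The core of how a shared virtual IP picks its owner under VRRP can be
sketched as an election: among the routers currently alive, the highest
priority wins, and a tie goes to the higher IP address.  The sketch below
covers only that rule; the function name and the addresses are assumptions
for this example, and advertisements, timers, and preemption are left out.

```python
import ipaddress
from typing import List, Tuple


def elect_master(alive: List[Tuple[str, int]]) -> str:
    """Pick the master among (ip, priority) pairs of routers that are alive.

    Highest priority wins; equal priorities are broken by the numerically
    higher IP address.  Raises ValueError if the list is empty.
    """
    winner = max(alive, key=lambda r: (r[1], int(ipaddress.ip_address(r[0]))))
    return winner[0]


if __name__ == "__main__":
    lbs = [("192.0.2.1", 200), ("192.0.2.2", 100)]
    print(elect_master(lbs))      # the higher-priority LB holds the virtual IP
    print(elect_master(lbs[1:]))  # after it fails, the backup takes over
```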

Behind the LBs is where server clustering comes in.

Right now, none of the following are possible with FreeBSD out of
the box:

a) Shared-mount filesystems
b) Distributed resource locking
c) Cluster membership service
d) Atomically reliable group communications
e) Single-system-image management.

Omer Faruk Sen | 10 Nov 2002 09:08

Re: clustering freebsd


Hi. 

I want to share my experience with HA on FreeBSD. We don't have much choice
in FreeBSD for clustering. I read about Sporner's
(http://sporner.dyndns.org/freebsdclusters/) project a few days ago but
didn't set it up. It seems promising, but I think it lacks file replication.
I'd really like to hear from him about the maturity level of his product.

I have installed PolyServe's (www.polyserve.com) clustering software for
FreeBSD. It does well on HA and file replication. I can recommend it, but it is
a commercial product.

Michael, I think you need to learn the clustering terms. You may want to look at
www.linuxvirtualserver.org and www.ultramonkey.org. Ultramonkey especially
has nice pictures that depict everything. Ultramonkey is part of the Vanessa
project, and Vanessa also includes Super Sparrow (a global HA
solution that uses BGP to find the nearest route). Anyway, go and look at the
pictures at ultramonkey.

There is also the linux-ha.org project, which is being or was ported (I don't
know its maturity level) to FreeBSD. I think that project is very good
for HA @ FreeBSD.

Someone mentioned VRRP. There is also a VRRP implementation for FreeBSD
named freevrrpd (/usr/ports/net/freevrrpd).

To conclude, Linux is better in clustering solutions, we have to admit it :(
even though FreeBSD is better in networking (my personal thoughts).


Alexander Leidinger | 10 Nov 2002 12:13

Re: clustering freebsd

On Sat, 9 Nov 2002 17:04:58 +0100 (MET)
Michael Grant <mg-fbsd3 <at> grant.org> wrote:

Judging from the other replies, I won't talk about Beowulf-style
clustering and will concentrate on failover and HA solutions.

> One of the big things that causes me downtime is upgrading the OS.
> I'm also worried about hardware failure (which luckily hasn't happened 
> to me yet...)  I too would like to achieve at least 5 nines.

> Let's say I have a cluster of n machines.  Some of those n machines
> may be running a web server, some a shell server, some mail server,
> some pop/imap mail servers...etc.  How is an incoming connection sent
> to the right machine?  It seems like there needs to be a single
> machine in front of the cluster to send connections the right way.
> Isn't this a single point of failure?

Yes, in this case it is a single point of failure, but there are other
ways to solve this.

> If you do have multiple machines answering requests, how's this done?
> With multiple IP addresses?   I know one can specify multiple A
> records in DNS and that it'll do a sort of round-robin.  But does this 
> work well?  What if one of the machines is down and a caching dns
> server returns an ip address of one of the down machines?  Seems like
> you need then to start modifying the dns zone to take out the down
> machines and use a low ttl.  This starts to get ugly quickly.

No, you can configure the remaining systems to answer on the "bad"
IP (e.g. via VRRP).

Alexander Leidinger | 10 Nov 2002 12:19

Re: clustering freebsd

On Sun, 10 Nov 2002 03:08:57 -0500
"Omer Faruk Sen" <freebsd <at> faruk.net> wrote:

> PS: I think we can set up a FreeBSD Clustering page just like www.lcic.org 
> does which provides information articles .... I can arrange a web page, ftp, 
> cvs for it. 

I'm sure we can find a solution to integrate such a page into
FreeBSD.org, we just have to find a soul which we can suc^wconvince to
do the work.

Bye,
Alexander.

-- 
                Where do you think you're going today?

http://www.Leidinger.net                       Alexander  <at>  Leidinger.net
  GPG fingerprint = C518 BC70 E67F 143F BE91  3365 79E2 9C60 B006 3FE7

To Unsubscribe: send mail to majordomo <at> FreeBSD.org
with "unsubscribe freebsd-cluster" in the body of the message

