Steve Traugott | 8 Dec 2005 21:36

ISconf 4.2.8: HMAC

Hi All,

I've posted ISconf 4.2.8 as the latest stable release:

    http://trac.t7a.org/isconf/pub/isconf-4.2.8.201.tar.gz

This version includes HMAC authentication of all messages on the wire.
It's still UDP-based, but has fine-grained controls over network
topology, multiple subnets, shutting off broadcast, and so on.  See
the man page for how to use all of this.  For detailed information on
what's changed, this query will give you the timeline of bugs and
fixes back to the previous stable release:

    http://trac.t7a.org/isconf/timeline?from=12%2F08%2F05&daysback=90&ticket=on&changeset=on&milestone=on
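
For readers unfamiliar with HMAC on datagrams, here is a minimal sketch of
the general technique. It is not ISconf's wire format: the hash (SHA-256),
the "tag, space, payload" layout, and the names sign() and verify() are all
assumptions made for illustration.

```python
import hashlib
import hmac


def sign(key: bytes, payload: bytes) -> bytes:
    """Prefix the payload with a hex HMAC tag (layout is hypothetical)."""
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest().encode("ascii")
    return tag + b" " + payload


def verify(key: bytes, datagram: bytes):
    """Return the payload if the tag matches the shared key, else None."""
    tag, sep, payload = datagram.partition(b" ")
    if not sep:
        return None  # malformed: no tag/payload separator
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest().encode("ascii")
    # compare_digest avoids leaking the match position through timing
    return payload if hmac.compare_digest(tag, expected) else None
```

A receiver would call verify() on every incoming datagram and silently drop
anything that returns None.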

The 4.3.1 work, starting now, will focus on the TCP mesh code.  This
will give us the ability to retire UDP, simplify firewall rules, and
provide the foundation we'll need later for reporting, asset
management, and monitoring.  Look for 4.3.2 as the stable release of
this next version.  For the long-term roadmap, see:

    http://trac.t7a.org/isconf/roadmap

I'm completely sure that there are bugs which will need to be fixed in
the 4.2.8 series -- while this version passes regression tests across
several test nodes, I don't yet have large-scale test cases built
(working on this, using Xen; should be in place before 4.3.2).  Please
beat up on it and let me know what you find.

Thanks,

Jordan Curzon | 10 Dec 2005 04:54

Isconf: Fetching blocks

I have been getting the following error frequently. The error occurs
after several other hosts have updated with no problems. The problem
is that if I run isconf up again it starts up again but not from where
it left off. Any ideas about debugging this?

isconf: error: missing block:
/var/is/fs/cache/internal.curzons.net/block/814/814338f5b4c910e35a55d101d972998f7b6bd949-eeb84b2ef12f9232f90d15457136d992-1:
Operation not permitted

Jordan Curzon | 10 Dec 2005 19:39

Reload the config file when commands are run.

I know the main.cf file will be managed by the cache later, but in the
meantime, I want to snap it and have it work. So I wrote a patch to
reread the config file every time a command is executed. This is my
first time in Python so it might not be proper form, but it works.
Please give some feedback on this feature and patch.
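
The attached patch is not reproduced here. As a rough illustration of the
idea only, the sketch below rereads a config file each time a command runs.
The "key = value" file format and the names load_config() and run_command()
are assumptions, not ISconf's actual code.

```python
def load_config(path):
    """Parse a simple 'key = value' file (format is an assumption)."""
    values = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blanks and comments
            key, sep, value = line.partition("=")
            if sep:
                values[key.strip()] = value.strip()
    return values


def run_command(config_path, command):
    """Reread the config on every call, then hand it to the command."""
    config = load_config(config_path)
    return command(config)
```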

Jordan Curzon
Attachment (reload_config_file.patch): application/octet-stream, 3323 bytes

Steve Traugott | 11 Dec 2005 08:58

Re: Isconf: Fetching blocks

Hi Jordan,

On Fri, Dec 09, 2005 at 08:54:43PM -0700, Jordan Curzon wrote:
> I have been getting the following error frequently. The error occurs
> after several other hosts have updated with no problems. The problem
> is that if I run isconf up again it starts up again but not from where
> it left off. Any ideas about debugging this?
> 
> 
> isconf: error: missing block:
> /var/is/fs/cache/internal.curzons.net/block/814/814338f5b4c910e35a55d101d972998f7b6bd949-eeb84b2ef12f9232f90d15457136d992-1:
> Operation not permitted

What this means is that the machine showing this error is not getting
that file from any other machine.  It means that the machine *is*,
however, getting the journal file from some other machine, so we know
they have seen each other at least once on the net.  Just for a double
check, you should be able to see the journal at this path:

    /var/is/fs/cache/internal.curzons.net/volume/{branchname}/journal

Inside the journal, you should be able to find the 'snap' transaction in
question by looking for the entry with the 814338f5b4c91... block ID.  

When this transaction fails, the next 'isconf up' on the same machine
should retry the same (previously failed) transaction first.
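
One way to look for that entry, sketched under the assumption that the
journal is a line-oriented text file (its real format is not described
here); find_block_entries() is a hypothetical helper, not part of isconf:

```python
def find_block_entries(journal_path, block_id_prefix):
    """Return (line number, line) pairs that mention the block ID prefix."""
    matches = []
    # errors="replace" keeps the scan going if the file has odd bytes
    with open(journal_path, errors="replace") as fh:
        for lineno, line in enumerate(fh, 1):
            if block_id_prefix in line:
                matches.append((lineno, line.rstrip("\n")))
    return matches
```

Calling it with the journal path above and the prefix "814338f5b4c91"
should list the lines that reference the missing block, if any.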

Here's what I need to know:

- What isconf version are you running?
[...]

Juri Rischel Jensen | 13 Dec 2005 23:24

Questions about isconf4...

Hi all,

I'm in the process of finding the right configuration automation tool  
for my shop. I've looked at isconf several times over the last five  
years, but have been reluctant to try it as I couldn't see it fit  
into our systems. I've also found the documentation to be lacking  
when it came to instructions/guidelines to actual deployment and I'm  
really glad to see that this has changed a lot in version 4. The fact
that it is implemented in Python also counts in its favor.

Anyways, my problem is that, although I've read all the documentation  
and skimmed the messages from the last 3-4 months in the
mail archives, I still don't exactly understand how I'm supposed to
use isconf4. Let me explain in more detail:

1. The documentation says that I should keep the branch count down. I
    can make sense of that, but what if I have 3 webservers in my
    domain, have them share the same branch and then on hostA do a

      isconf lock "Enabling new_apache_vhost"
      isconf snap new_apache_vhostfile.conf
      isconf exec a2ensite new_apache_vhostfile.conf
      isconf exec /etc/init.d/apache2 force-reload
      isconf ci

    Then I have a history of what I've done on hostA, I have my newly
    added vhosts config file in the isfs and can replay that journal
    entry again
[...]

Steve Traugott | 14 Dec 2005 12:04

Re: Questions about isconf4...

Hi Juri,

On Tue, Dec 13, 2005 at 11:24:05PM +0100, Juri Rischel Jensen wrote:
> 1. The documentation says that I should keep the branch count down.
> I can make sense of that, but what if I have 3 webservers in my
> domain, have them share the same branch and then on hostA do a
> 
>      isconf lock "Enabling new_apache_vhost"
>      isconf snap new_apache_vhostfile.conf
>      isconf exec a2ensite new_apache_vhostfile.conf
>      isconf exec /etc/init.d/apache2 force-reload
>      isconf ci

In all likelihood, you don't actually want to snap
new_apache_vhostfile.conf -- you instead want to generate it from
whatever your current customer/vhost database says.  You use isconf to
manage the executables which do that generation; those executables
talk to the database (or flat files, or LDAP, etc.) to get the latest
and greatest data, rather than cause it to evolve as machines are
built.  If you have no database and just usually edit
new_apache_vhostfile.conf directly instead, then read on...  Also see
the section on environmental data in the man page.
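
A minimal sketch of the "generate it, don't snap it" approach, assuming the
vhost data is already available as a list of records. The template, the
field names, and render_vhosts() are all hypothetical; a real generator
would pull the records from your database, flat files, or LDAP.

```python
from string import Template

# Hypothetical template; adjust directives to your own Apache setup.
VHOST = Template(
    "<VirtualHost *:80>\n"
    "    ServerName $name\n"
    "    DocumentRoot $docroot\n"
    "</VirtualHost>\n"
)


def render_vhosts(records):
    """Render one vhost stanza per record (each a dict with name, docroot)."""
    return "\n".join(VHOST.substitute(record) for record in records)
```

isconf would then manage the generator script itself, and each host would
run it to produce its vhost file from the current data.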

What's missing in isconf4 right now is the native configuration file
management bits which were there in isconf versions 1-3.  In version 1
we used SUP, in 2 and 3 we used rsync; the former was good because it
gave us post-replication triggers (to handle e.g. the force-reload);
the latter was bad because rsync doesn't do triggers.  As Jordan
hinted a couple of days ago, isconf4 is going to sync config files
from the distributed cache rather than a central server, but otherwise
[...]

Jordan Curzon | 19 Dec 2005 15:49

ISconf: Cache.py - bcast method

Keeps crashing on me on the last line of the bcast method in Cache.py.
It throws a socket.error exception with the EAGAIN error. I did some
looking and other places in the code ignore that error. I trapped the
exception and everything runs fine.

Is that a bug or am I misunderstanding things?

Jordan Curzon

Daniel Hagerty | 19 Dec 2005 21:34

ISconf: Cache.py - bcast method

 > Keeps crashing on me on the last line of the bcast method in Cache.py.
 > It throws a socket.error exception with the EAGAIN error. I did some
 > looking and other places in the code ignore that error. I trapped the
 > exception and everything runs fine.
 >
 > Is that a bug or am I misunderstanding things.

    Not speaking for what isconf is *supposed* to do, but it's almost
universally the case that the correct response to EAGAIN is to try the
system call that failed again.  Something in the kernel interrupted
the system call, preventing it from completing.  You usually want to
complete whatever it is, rather than pretending the kernel performed
the task, when in fact it didn't.

    (There are some system call toolkits that actually go so far as to
prevent you seeing EAGAIN in high level interfaces -- if you really
want to see EAGAIN, the low level interface is still there).

Steve Traugott | 19 Dec 2005 22:28

Re: ISconf: Cache.py - bcast method

On Mon, Dec 19, 2005 at 07:49:16AM -0700, Jordan Curzon wrote:
> Keeps crashing on me on the last line of the bcast method
> in Cache.py.  It throws a socket.error exception with the
> EAGAIN error. I did some looking and other places in the
> code ignore that error. I trapped the exception and
> everything runs fine.
> 
> Is that a bug or am I misunderstanding things.

You're getting an EAGAIN from a UDP sendto(), right?
Bizarre.  That means that the operation would block (and I
have the socket set for non-blocking).  Not sure what would
cause that in UDP.  Without knowing what's causing this, I'm
not sure whether the correct action is to trap and discard
the exception, or to yield and retry later.  Some ideas:

- If you're using a nets file, check to make sure all of the
  IP addresses in there are valid and routable.  Check your
  routing table as well...  Add a debug() call to your patch
  to show the address that's failing.

- Check to see if there's something seriously wrong with the
  IP stack on the machine -- running out of mbufs maybe?
  What else is the machine doing?

Anyone else have any ideas for what might cause a UDP
sendto() to block?

I've created bug #63 to track this: http://trac.t7a.org/isconf/ticket/63
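
For reference, the two options above (discard, or yield and retry later)
can be sketched as below. This is not the code in Cache.py; it is a generic
pattern for a non-blocking UDP socket, and send_nonblocking() is a
hypothetical name. In current Python, socket.error is an alias of OSError.

```python
import errno


def send_nonblocking(sock, data, addr):
    """Return True if the datagram was sent, False if it would block."""
    try:
        sock.sendto(data, addr)
        return True
    except OSError as exc:
        if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            # Not sent: the caller decides whether to drop it or to
            # yield and retry once the socket is writable again.
            return False
        raise  # any other socket error is a real failure
```

Returning False rather than swallowing the error keeps the decision with
the caller, which matters while the root cause is still unknown.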


Daniel Hagerty | 19 Dec 2005 23:04

ISconf: Cache.py - bcast method

 > From: Daniel Hagerty <hag <at> linnaean.org>
 > Date: Mon, 19 Dec 2005 15:34:31 -0500
 >
 >     Not speaking for what isconf is *supposed* to do, but it's almost
 > universally the case that the correct response to EAGAIN is to try the
[...]

    Never mind me, I was thinking of EINTR.  I've always called EAGAIN
by its other name, EWOULDBLOCK (or at least, I've never seen anyplace
where they weren't synonymous).
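
The distinction can be written down as a small sketch; classify() is a
hypothetical helper, and the returned strings are only labels. Both
EAGAIN and EWOULDBLOCK are checked because they are not guaranteed to be
the same value on every platform.

```python
import errno


def classify(err):
    """Map an errno value to the usual response for a failed system call."""
    if err == errno.EINTR:
        return "interrupted: retry the call right away"
    if err in (errno.EAGAIN, errno.EWOULDBLOCK):
        return "would block: wait until the socket is ready, then retry"
    return "other: treat as a real error"
```

Python 3.5 and later retry EINTR automatically for most system calls
(PEP 475), so in practice only the would-block case reaches application
code there.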
