Issue #4681 has been updated by Markus Roberts.
- Status changed from Investigating to Accepted
Looking at the code in question:
def timeout(sec, klass = nil)
return yield if sec == nil or sec.zero?
raise ThreadError, "timeout within critical session" if Thread.critical
exception = klass || Class.new(ExitException)
begin
x = Thread.current
y = Thread.start {
begin
sleep sec
rescue => e
x.raise e
else
x.raise exception, "execution expired" if x.alive?
end
}
yield sec
# return true
rescue exception => e
[snip]
ensure
if y and y.alive?
y.kill
y.join # make sure y is dead.
end
end
it appears ruby isn’t being sufficiently paranoid with regard to thread critical and we (or another library) aren’t being sufficiently careful. It’s pretty easy to construct a case in which the time-out doesn’t occur:
require "timeout"
begin
Timeout::timeout(1) {
p "foo"
sleep 200
}
rescue Timeout::Error
puts "timed-out"
end
begin
Timeout::timeout(1) {
Thread.critical = true
p "bar"
sleep 200
}
rescue Timeout::Error
puts "timed-out"
end
but that’s still short of explaining the hang as described, in that it would still require the underlying code to hang and add the additional complication (clue?) that this happen in a critical region.
The next question is probably how can run_and_capture return while still leaving things in an odd state (e.g. Thread.critical = true, dangling joins, or…)?
- Author: Fredrik Eriksson
- Status: Accepted
- Priority: Normal
- Assignee: Markus Roberts
- Category: FreeBSD
- Target version:
- Affected version: 0.25.5
- Keywords:
- Branch:
relevant info from pkg_info:
- ruby-1.8.7.248_3,1 (compiled with debug)
- puppet-2.6.1.r2
- facter-1.5.7_1
uname -a
FreeBSD localhost 8.1-RELEASE FreeBSD 8.1-RELEASE #0: Wed Jul 28 11:44:21 UTC 2010 root <at> localhost:/usr/obj/usr/src/sys/GENERIC amd64
I did also see this problem in FreeBSD 8.0 and puppet 0.25. Puppet does not hang every time it’s run, only on random Exec statements. When debugging this I simply created an endless loop which ran the puppet configuration until it hang. Sometimes it would only run a few times before it hang and sometimes it run for hours. The manifest is quite big, but contains some sensitive stuff so I can’t upload it here, but this is probably reproducable with any manifest that uses a great deal of Exec statements.
A few examples of output when it hang with command ‘ruby18 —debug /usr/local/bin/puppet agent —test —debug —trace’:
/usr/local/lib/ruby/site_ruby/1.8/puppet/type.rb:1881: warning: instance variable <at> virtual not initialized
/usr/local/lib/ruby/site_ruby/1.8/puppet/type.rb:1881: warning: instance variable <at> virtual not initialized
/usr/local/lib/ruby/site_ruby/1.8/puppet/type.rb:1881: warning: instance variable <at> virtual not initialized
debug: /Stage[main]/Webbase::Backup/Line[sudo_backup]/Exec[echo 'backup ALL= ( root ) NOPASSWD: /usr/sbin/backup' >> '/usr/local/etc/sudoers']: Executing check '/usr/bin/grep -qFx 'backup ALL= ( root ) NOPASSWD: /usr/sbin/backup' '/usr/local/etc/sudoers''
debug: Executing '/usr/bin/grep -qFx 'backup ALL= ( root ) NOPASSWD: /usr/sbin/backup' '/usr/local/etc/sudoers''
removing /tmp/puppet20100901-36732-16mgs1p-0...done
[here puppet hangs]
debug: /Stage[main]/Webbase::Apache/Exec[apache_purge]: Executing check '/bin/test -e /usr/local/etc/apache/ssl.crt'
debug: Executing '/bin/test -e /usr/local/etc/apache/ssl.crt'
removing /tmp/puppet20100901-50535-1p6j13j-0...done
[here puppet hangs]
/usr/local/lib/ruby/site_ruby/1.8/puppet/type.rb:1881: warning: instance variable <at> virtual not initialized
debug: /Stage[main]/Webbase::Nas/Exec[/sbin/mount /proc]: Executing check '/sbin/mount | /usr/bin/egrep ^procfs'
debug: Executing '/sbin/mount | /usr/bin/egrep ^procfs'
removing /tmp/puppet20100901-95974-1n5jg13-0...done
[here puppet hangs]
Since I have very little experience debugging stuff, and no previous experience with ruby I just hacked in a few debug statements in type/exec.rb, and the relevant code appears to be this code from the run function:
withenv environment do
Timeout::timeout(self[:timeout]) do
output, status = Puppet::Util::SUIDManager.run_and_capture(
[command], self[:user], self[:group]
)
Puppet.debug "returned from run_and_capture #{self[:timeout]}"
end
Puppet.debug "timeout ended"
# The shell returns 127 if the command is missing.
if status.exitstatus == 127
Puppet.debug "argument error"
raise ArgumentError, output
end
Puppet.debug "no argument error"
end
When puppet hangs it is after
Puppet.debug "returned from run_and_capture #{self[:timeout]}"
but before
Puppet.debug "timeout ended"
Since I don’t really have time to figure out how the timeout –> do –> end thing works I’ll just file this bug here so hopefully someone can say if this is a bug in puppet or an upstream bug in ruby. Let me know if you need any more info.
--
You received this message because you are subscribed to the Google Groups "Puppet Bugs" group.
To post to this group, send email to puppet-bugs <at> googlegroups.com.
To unsubscribe from this group, send email to puppet-bugs+unsubscribe <at> googlegroups.com.
For more options, visit this group at http://groups.google.com/group/puppet-bugs?hl=en.