postmaster | 9 Apr 19:46 2009

(unknown)


--
To unsubscribe from this list: send the line "unsubscribe linux-newbie" in
the body of a message to majordomo <at> vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.linux-learn.org/faqs

Sol Kavy | 14 Apr 00:00 2009

IRQ Tracing Problem in Linux 2.6.28 Kernel

The following backtrace shows a deadlock in Ubicom's SMP port of the 2.6.28 kernel. I am sure that we
are doing something unexpected. I would appreciate the community's help in understanding what is
going wrong.

Thanks in advance for any pointers,

Sol Kavy

Problem:
Ubicom's initial port does not use GENERIC_CLOCKEVENTS.  Instead it uses a periodic timer based on HZ.  The
periodic timer calls do_timer() on each tick.

From the arch directory's perspective, we are required to hold the xtime_lock before calling do_timer().
The lock is indeed held by cpu 3, as evidenced in the output below.

The call to get_jiffies_64() at the top of the backtrace is attempting to read jiffies in a reliable
fashion. The caller is required to wait until the xtime_lock is no longer held. Since we are in a
path that is holding the xtime_lock, this will never make forward progress.

What is unclear to me is why other ports are not seeing the same problem?  

Perhaps it is because most ports now set GENERIC_CLOCKEVENTS which uses an entirely different mechanism
for doing things.  I am in the middle of switching the port to use GENERIC_CLOCKEVENTS but would like to
understand this failure in more detail.

Any feedback is greatly appreciated,

Sol

Config Flags:

Gedare Bloom | 14 Apr 14:51 2009

Re: IRQ Tracing Problem in Linux 2.6.28 Kernel

On Mon, Apr 13, 2009 at 6:00 PM, Sol Kavy <skavy <at> ubicom.com> wrote:
> The following back trace represents a deadlock in Ubicom's SMP port of 2.6.28 kernel.   I am sure that we
> are doing something unexpected.  I would appreciate the community's help in understanding what is
> going wrong.
>
> Thanks in advance for any pointers,
>
> Sol Kavy
>
> Problem:
> Ubicom's initial port does not use GENERIC_CLOCKEVENTS.  Instead it uses a periodic timer based on HZ.
> The periodic timer calls do_timer() on each tick.
>
> From the arch directory perspective, we are required to hold the xtime_lock before calling
> do_timer().  The lock is indeed held by cpu 3 as evidenced in the output below.
>
> The call to get_jiffies_64() at the top of the backtrace is attempting to read the jiffies in a reliable
> fashion.  The caller is required to wait for the xtime_lock not to be held.  Clearly, since we are in a
> path that is holding the xtime_lock, this will never make forward progress.
>
> What is unclear to me is why other ports are not seeing the same problem?
>
> Perhaps it is because most ports now set GENERIC_CLOCKEVENTS which uses an entirely different mechanism
> for doing things.  I am in the middle of switching the port to use GENERIC_CLOCKEVENTS but would like to
> understand this failure in more detail.
>
> Any feedback is greatly appreciated,
>
> Sol
>

Sol Kavy | 14 Apr 18:53 2009

RE: IRQ Tracing Problem in Linux 2.6.28 Kernel

Every arch's timer interrupt does the following:

	write_seqlock(&xtime_lock);
	do_timer(1);
	write_sequnlock(&xtime_lock);

This is required because the first thing do_timer() does is increment jiffies_64.

	void do_timer(unsigned long ticks)
	{
		jiffies_64 += ticks;
		update_times(ticks);
	}

Frames 8-0 in the backtrace below are in core Linux code (which, as a port, we don't want to change). The
arch calls do_timer() to advance time (and must do so with the lock held). However, if the core code then
goes on to call get_jiffies_64(), the result is a deadlock. On this path, get_jiffies_64() is only called
when CONFIG_IRQSOFF_TRACER and CONFIG_TRACE_IRQFLAGS are selected.
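
The deadlock described above can be modelled in user space. The following is a minimal, single-threaded sketch and not kernel code: every model_* name is hypothetical, the seqlock is reduced to a bare sequence counter that is odd while a writer holds it, and the reader's spin is bounded so that the example terminates instead of hanging.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified seqlock model: the sequence is odd while a writer holds it. */
struct model_seqlock {
	unsigned sequence;
};

static struct model_seqlock model_xtime_lock;
static uint64_t model_jiffies_64;

static void model_write_lock(struct model_seqlock *sl)   { sl->sequence++; }
static void model_write_unlock(struct model_seqlock *sl) { sl->sequence++; }

/*
 * Bounded stand-in for a reader that spins until no writer is active.
 * A real reader would keep spinning; the bound lets this sketch return.
 * Returns true if an even (stable) sequence was seen within max_spins.
 */
static bool model_read_begin(const struct model_seqlock *sl,
			     unsigned max_spins, unsigned *seq_out)
{
	for (unsigned i = 0; i < max_spins; i++) {
		unsigned seq = sl->sequence;

		if ((seq & 1u) == 0) {
			*seq_out = seq;
			return true;
		}
	}
	return false;
}

/* Reader: wait for the writer to leave, copy the value, check for a retry. */
static bool model_get_jiffies_64(unsigned max_spins, uint64_t *out)
{
	unsigned seq;

	if (!model_read_begin(&model_xtime_lock, max_spins, &seq))
		return false;	/* the writer never released the lock */
	*out = model_jiffies_64;
	return model_xtime_lock.sequence == seq;
}

/*
 * Tick handler: takes the write lock and advances the counter. When
 * tracer_reads_jiffies is set, it calls the reader while still holding
 * the write lock, which is the situation in the reported backtrace.
 */
static bool model_timer_tick(bool tracer_reads_jiffies)
{
	bool ok = true;
	uint64_t now;

	model_write_lock(&model_xtime_lock);
	model_jiffies_64 += 1;
	if (tracer_reads_jiffies)
		ok = model_get_jiffies_64(1000, &now);
	model_write_unlock(&model_xtime_lock);
	return ok;
}
```

In this model, model_timer_tick(false) completes normally, while model_timer_tick(true) reports that the reader gave up: the sequence stays odd for as long as the tick handler, which is the reader's own caller, holds the write lock.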

My next step is to build this on an x86 and try to understand why that port does not run into the same problem.

Sol

-----Original Message-----
From: Gedare Bloom [mailto:gedare <at> gwmail.gwu.edu] 
Sent: Tuesday, April 14, 2009 5:52 AM
To: Sol Kavy
Cc: linux-newbie <at> vger.kernel.org
Subject: Re: IRQ Tracing Problem in Linux 2.6.28 Kernel

Sol Kavy | 15 Apr 00:58 2009

RE: IRQ Tracing Problem in Linux 2.6.28 Kernel


I have discovered why other architectures do not have the same problem. The backtrace indicates a real
defect (i.e. a deadlock) in the generic code.

Most architectures override sched_clock() with their own version. kernel/sched_clock.c:38 is a weak
alias that will be overridden if an arch directory supplies its own.

Most of the arch directories provide an implementation that directly accesses the jiffies_64 variable
"without" acquiring the xtime_lock.

Some of the implementations are "poor" in that they do not take into account the value of jiffies_64 during a
32-bit rollover. If sched_clock() is to be called while holding xtime_lock, the core
implementation should be modified not to call get_jiffies_64() (which requires the xtime_lock) but to use
something like the following:

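unsigned long long sched_clock(void)
{
	unsigned long long my_jiffies;
	unsigned long jiffies_top;
	unsigned long jiffies_bottom;

	/*
	 * Read the two 32-bit halves separately, then re-read the top half.
	 * If it changed, the low half rolled over mid-read, so try again.
	 */
	do {
		jiffies_top = jiffies_64 >> 32;
		jiffies_bottom = jiffies_64 & 0xffffffff;
	} while (unlikely(jiffies_top != (unsigned long)(jiffies_64 >> 32)));

	/* Recombine the halves and convert ticks since boot to nanoseconds. */
	my_jiffies = ((unsigned long long)jiffies_top << 32) | (jiffies_bottom);
	return (my_jiffies - INITIAL_JIFFIES) * (NSEC_PER_SEC / HZ);
}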
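
The high/low/high retry pattern used above can be tried out in user space. This is a minimal sketch under stated assumptions, not kernel code: struct ticking_counter and all four functions are hypothetical, and the counter is made to tick just before every low-half read so that a carry into the high half lands between the two reads, much as a timer tick on another CPU might.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-bit counter that can only be read 32 bits at a time. */
struct ticking_counter {
	uint64_t value;
};

static uint32_t read_hi(struct ticking_counter *c)
{
	return (uint32_t)(c->value >> 32);
}

/* The counter advances by one just before the low half is sampled. */
static uint32_t read_lo(struct ticking_counter *c)
{
	c->value++;
	return (uint32_t)c->value;
}

/* Naive read: can pair an old high half with a new low half. */
static uint64_t read_naive(struct ticking_counter *c)
{
	uint32_t hi = read_hi(c);
	uint32_t lo = read_lo(c);

	return ((uint64_t)hi << 32) | lo;
}

/* Retry read: re-check the high half and start over if it moved. */
static uint64_t read_with_retry(struct ticking_counter *c)
{
	uint32_t hi, lo;

	do {
		hi = read_hi(c);
		lo = read_lo(c);
	} while (hi != read_hi(c));

	return ((uint64_t)hi << 32) | lo;
}
```

Starting from 0xFFFFFFFF, read_naive() returns 0 (old high half, new low half), whereas read_with_retry() notices that the high half changed and returns 0x100000001 on its second pass.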


Peter Teoh | 15 Apr 05:06 2009

Re: IRQ Tracing Problem in Linux 2.6.28 Kernel

Just my guess......

On Tue, Apr 14, 2009 at 10:58 PM, Sol Kavy <skavy <at> ubicom.com> wrote:
>
> I have discovered why other architectures do not have the same problem.   The back trace indicates a real
> defect (i.e. deadlock) in the generic code.
>
> Most architectures override sched_clock() with their own version.  Kernel/sched_clock.c:38 is a weak
> alias that will be overridden if an arch directory supplies its own.
>
> Most of the arch directories provide an implementation that directly access the jiffies_64 variable
> "without" acquiring the xtime_lock.
>
> Some of the implementations provide a "poor" implementation in that the value of the jiffies_64 during a
> 32 rollover is not taken into account.  If sched_clock() is to be called while holding xtime_lock, the core
> implementation should be modified not to call get_jiffies_64() (which requires the xlock) but to use
> something like the following:
>
> unsigned long long sched_clock(void)
> {
>        unsigned long long my_jiffies;
>        unsigned long jiffies_top;
>        unsigned long jiffies_bottom;
>
>        do {
>                jiffies_top = jiffies_64 >> 32;
>                jiffies_bottom = jiffies_64 & 0xffffffff;

in general this type of operation is only done when u are in 32bit
mode.   In 64bit mode, u can do it in ONE atomic assembly

Jeffrey Cao | 24 Apr 17:50 2009

Re: IRQ Tracing Problem in Linux 2.6.28 Kernel

On 2009-04-13, Sol Kavy <skavy <at> ubicom.com> wrote:
> The following back trace represents a deadlock in Ubicom's SMP port of 2.6.28 kernel.   I am sure that we
> are doing something unexpected.  I would appreciate the community's help in understanding what is
> going wrong.
>
> Thanks in advance for any pointers,
>
> Sol Kavy
>
> Problem:
> Ubicom's initial port does not use GENERIC_CLOCKEVENTS.  Instead it uses a periodic timer based on HZ.  The
> periodic timer calls do_timer() on each tick.
>
> From the arch directory perspective, we are required to hold the xtime_lock before calling
> do_timer().  The lock is indeed held by cpu 3 as evidenced in the output below.
>
> The call to get_jiffies_64() at the top of the backtrace is attempting to read the jiffies in a reliable
> fashion.  The caller is required to wait for the xtime_lock not to be held.  Clearly, since we are in a
> path that is holding the xtime_lock, this will never make forward progress.

For the x86 arch, get_jiffies_64() does not seem to wait on the xtime_lock,
but to do something related to CPU ordering:
get_jiffies_64()
|->read_seqbegin()
   |->smp_rmb()
      |->alternative("lock; addl $0,0(%%esp)", "lfence", X86_FEATURE_XMM2)
I'm not sure whether this is the same as acquiring the xtime_lock spinlock. Maybe this
is a point you need to check.

Jeffrey


Jeffrey Cao | 25 Apr 05:06 2009

Re: Does cr3 register change when a new process is scheduled ?

On 2009-01-15, Peter Teoh <htmldeveloper <at> gmail.com> wrote:
> yes, every process running will have a different CR3 value....easily
> seen when u printk() the value from different kernel module....and the
> kernel module is running in the same process context as the "insmod"
> that started the kernel module....so u can see the process name is
> "insmod" if u printk() the value of the process name.   ie, user +

Not exactly; a module is not a process. It is just a set of functions registered
in kernel space. You see the process name "insmod" because
you put your printk statement in the module initialization function. You
execute insmod to call the initialization function, so the module's init
function runs in the "insmod" process context. If you run another command that
calls ioctl on the module, then the module code executes in that other process's
context.

Jeffrey

> kernel mode all shared the same CR3 value, but different process will
> have different value.
>
> On Thu, Jan 15, 2009 at 10:33 AM, George Kumar <grgkumar4 <at> gmail.com> wrote:
>> Let us suppose we are talking about a uni processor system. Do we
>> change cr3 page directory control register when a new process is
>> scheduled to run on the CPU.
>>
>> thanks.
>> George
>> --
>


Korkakakis Nikos | 28 Apr 08:24 2009

questions; gcc builtins - IO scheduler - profiling

Hi all,

Since I have played a bit with the kernel sources, here are some newbie
questions that I couldn't answer with trivial search engine usage:

a) It is possible to use gcc builtin functions in conjunction with some
gcc switches and -march to produce somewhat optimal code. For
instance, the function __builtin_popcount, together with -march=amdfam10
and -mabm, produces a POPCNT assembly instruction, which counts the set bits
in a machine word in 1 instruction. For archs that do not support
the popcnt instruction, as far as I can tell gcc produces
/normal-expected/ code that does the same thing in a simple way (using a
loop, shifting and counting Zero-Flags) or a more advanced way. Is
such programming practice condemned?

b) One of the coolest things in the kernel is the different types of I/O
schedulers. I haven't exhaustively checked the source, but is it possible
to have a different I/O scheduler per device? If, for instance, I have an
SSD and a normal hdd, wouldn't it be cooler to use the noop I/O scheduler for the
SSD and the anticipatory I/O scheduler for the normal hdd?

c) Is there a way to profile *just* one specific module/function/group
of functions that run as a kernel thread for runtime performance? So far
I've tried to use oprofile (http://oprofile.sf.net) for some
profiling and it is quite disturbing ( can't see the tree for the forest
:P ).
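
To illustrate question (a), here is a minimal sketch, assuming gcc or a compatible compiler. popcount_word and popcount_loop are hypothetical names; only __builtin_popcount comes from gcc. Whether a single POPCNT instruction is actually emitted depends on the target flags in use, so this only shows the source-level pattern next to a portable fallback.

```c
#include <assert.h>
#include <limits.h>

/* Portable fallback: clear the lowest set bit until none remain. */
static unsigned popcount_loop(unsigned x)
{
	unsigned n = 0;

	while (x) {
		x &= x - 1;
		n++;
	}
	return n;
}

/* Use the compiler builtin when available, the loop otherwise. */
static unsigned popcount_word(unsigned x)
{
#if defined(__GNUC__)
	return (unsigned)__builtin_popcount(x);
#else
	return popcount_loop(x);
#endif
}
```

Both functions return the same count for every input; the builtin merely leaves the choice of instruction sequence to the compiler.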

Cheers :)

Tim Cullen | 1 May 01:22 2009

process hangs when calling close


I have a simple test program that opens a block device, writes a bunch of data to it, and then calls close. The
process appears to hang in the call to close. Output from the ps command shows the process has gone into the
uninterruptible IO state, and the system's CPU usage goes to zero. The process stays this way until the
system is rebooted.
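
The test program itself is not included in this message. A minimal sketch of what such a program might look like is given below; write_then_close, the 4096-byte buffer and the zero-filled payload are all assumptions, and the device path is left to the caller.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Open `path` for writing, write `total` bytes of zeros, then close it.
 * Returns 0 on success and -1 if open, write or close fails.
 */
static int write_then_close(const char *path, size_t total)
{
	char buf[4096];
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -1;

	memset(buf, 0, sizeof buf);
	while (total > 0) {
		size_t chunk = total < sizeof buf ? total : sizeof buf;
		ssize_t n = write(fd, buf, chunk);

		if (n <= 0) {		/* treat errors and zero-length writes as failure */
			close(fd);
			return -1;
		}
		total -= (size_t)n;
	}

	/* This is the call that is reported to hang on the block device. */
	return close(fd) == 0 ? 0 : -1;
}
```

Called as write_then_close("/dev/null", 1 << 20) this returns 0; pointing it at a real block device overwrites that device's contents.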

Any ideas what's going on here?

Please CC me personally with any replies.
thanks
tim


