Matthias Klein | 29 Oct 11:10 2014

Latest development tree

Hello,

where can I find the latest development of the PREEMPT_RT patch?

At https://git.kernel.org/cgit/linux/kernel/git/rt/linux-stable-rt.git/ 
are only the stable releases, right?

Best regards,
Matthias

--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo <at> vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Chris Friesen | 23 Oct 21:27 2014

semantics of reader/writer semaphores in rt patch

I recently noticed that when CONFIG_PREEMPT_RT_FULL is enabled the 
semantics change.  From "include/linux/rwsem_rt.h":

  * Note that the semantics are different from the usual
  * Linux rw-sems, in PREEMPT_RT mode we do not allow
  * multiple readers to hold the lock at once, we only allow
  * a read-lock owner to read-lock recursively. This is
  * better for latency, makes the implementation inherently
  * fair and makes it simpler as well.

How is this valid?  It seems to me that there are any number of code 
paths that could depend on having multiple threads of execution be able 
to hold the reader lock simultaneously.  Something as simple as:

thread A:
take rw_semaphore X for reading
take lock Y, modify data, release lock Y
wake up thread B
wait on conditional protected by lock Y
free rw_semaphore X

thread B:
take rw_semaphore X for reading
wait on conditional protected by lock Y
send message to wake up thread A
free rw_semaphore X

In the regular kernel this would work, in the RT kernel it would deadlock.

Does the RT kernel just disallow this sort of algorithm?

Chris Friesen | 23 Oct 19:54 2014

RT/ext4/jbd2 circular dependency (was: Re: Hang writing to nfs-mounted filesystem from client)

On 10/17/2014 12:55 PM, Austin Schuh wrote:
> Use the 121 patch.  This sounds very similar to the issue that I helped
> debug with XFS.  There ended up being a deadlock due to a bug in the
> kernel work queues.  You can search the RT archives for more info.

I can confirm that the problem still shows up with the rt121 patch. (And
also with Paul Gortmaker's proposed 3.4.103-rt127 patch.)

We added some instrumentation and it looks like we've tracked down the problem.
Figuring out how to fix it is proving to be tricky.

Basically it looks like we have a circular dependency involving the
inode->i_data_sem rt_mutex, the PG_writeback bit, and the BJ_Shadow list.  It
goes something like this:

jbd2_journal_commit_transaction:
1) set page for writeback (set PG_writeback bit)
2) put jbd2 journal head on BJ_Shadow list
3) sleep on PG_writeback bit waiting for page writeback complete

ext4_da_writepages:
1) ext4_map_blocks() acquires inode->i_data_sem for writing
2) do_get_write_access() sleeps waiting for jbd2 journal head to come off
the BJ_Shadow list

At this point the flush code can't run because it can't acquire
inode->i_data_sem for reading, so the page will never get written out.
Deadlock.
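
The dependency loop can be written down as a small wait-for table. This is only a sketch of the description above, not code from jbd2 or ext4: the enum names, the waits_on table and in_wait_cycle() are hypothetical, and each party is simplified to waiting on exactly one other party.

```c
/* Parties named in the description above. */
enum party { JBD2_COMMIT, EXT4_WRITEPAGES, FLUSH, NPARTIES };

/* waits_on[p] is the party that must make progress before p can
 * continue, as read from the description (a simplification). */
static const enum party waits_on[NPARTIES] = {
	[JBD2_COMMIT]     = FLUSH,           /* sleeps on PG_writeback */
	[EXT4_WRITEPAGES] = JBD2_COMMIT,     /* sleeps on the BJ_Shadow list */
	[FLUSH]           = EXT4_WRITEPAGES, /* needs inode->i_data_sem */
};

/* Follow the single outgoing edge from 'start'; return 1 if we
 * come back to 'start' within NPARTIES steps (a cycle), else 0. */
int in_wait_cycle(enum party start)
{
	enum party p = start;

	for (int i = 0; i < NPARTIES; i++) {
		p = waits_on[p];
		if (p == start)
			return 1;
	}
	return 0;
}
```

Every party in the table ends up back at itself, so none of the three can be the one that makes progress first.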

The following is a more detailed timeline with information from added trace

Daniel Wagner | 16 Oct 10:15 2014

LPC: Realtime Microconference Etherpad link

https://etherpad.fr/p/LPC2014_RealTime

Paul Gortmaker | 16 Oct 02:09 2014

[PATCH-rt] rtmutex/rt: don't BUG for -EDEADLK when detect_deadlock is off

The stable cherry pick of commit 3d5c9340d1949733eb37616abd15db36aef9a57c
("rtmutex: Handle deadlock detection smarter")  essentially makes the
deadlock_detect flag a no-op, as it says:

    Even in the case when deadlock detection is not requested by the
    caller, we can detect deadlocks. Right now the code stops the lock
    chain walk and keeps the waiter enqueued, even on itself. Silly not to
    yell when such a scenario is detected and to keep the waiter enqueued.

    Return -EDEADLK unconditionally and handle it at the call sites.

So, as part of that change, we see this:

  @@ -453,7 +453,7 @@ static int task_blocks_on_rt_mutex(struct rt_mutex *lock,
          * which is wrong, as the other waiter is not in a deadlock
          * situation.
          */
 -       if (detect_deadlock && owner == task)
 +       if (owner == task)
                 return -EDEADLK;

However, as part of the -rt baseline patches, there exists this change
within rt-mutex-add-sleeping-spinlocks-support.patch:

	ret = task_blocks_on_rt_mutex(lock, &waiter, self, 0);
	BUG_ON(ret);

Note that the zero in the call to task_blocks_on_rt_mutex is the value
of detect_deadlock; off, but now ignored, and so we get ret = -EDEADLK
which triggers the BUG_ON().
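
The interaction can be modelled in a few lines of user-space C. This is a sketch under stated assumptions, not the rtmutex code: struct task, struct lock, blocks_on_old(), blocks_on_new() and would_bug() are hypothetical stand-ins that keep only the owner == task check quoted in the hunk above.

```c
#include <errno.h>

struct task { int id; };                 /* hypothetical stand-in */
struct lock { struct task *owner; };     /* hypothetical stand-in */

/* Before the cherry pick: a self-deadlock is reported only when
 * the caller asked for deadlock detection. */
int blocks_on_old(struct lock *l, struct task *t, int detect_deadlock)
{
	if (detect_deadlock && l->owner == t)
		return -EDEADLK;
	return 0;
}

/* After the cherry pick: detect_deadlock no longer gates the check. */
int blocks_on_new(struct lock *l, struct task *t, int detect_deadlock)
{
	(void)detect_deadlock;
	if (l->owner == t)
		return -EDEADLK;
	return 0;
}

/* Returns 1 if a caller doing BUG_ON(ret) would trip on 'ret'. */
int would_bug(int ret)
{
	return ret != 0;
}
```

With a lock whose owner is the calling task and detect_deadlock passed as 0, blocks_on_old() returns 0 while blocks_on_new() returns -EDEADLK, so a caller that treats any non-zero return as fatal now trips.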

Caroline Jones | 14 Oct 18:09 2014

read this

--

-- 
How are you doing today? Please get back to me its very important. Caro

Daniel Wagner | 9 Oct 11:47 2014

Re: [PATCH] sched: Do not try to replenish from a non deadline tasks

Hi Juri,

On 10/07/2014 03:20 PM, Daniel Wagner wrote:
> On 10/07/2014 02:10 PM, Daniel Wagner wrote:
>> [   36.689416] pthread_-1555    0d..5 18486408us : sched_stat_sleep: comm=pthread_test pid=1554 delay=143975 [ns]
>> [   36.689416] pthread_-1555    0d..5 18486408us : sched_wakeup: comm=pthread_test pid=1554 prio=120 success=1 target_cpu=000
>> [   36.689416] pthread_-1555    0d..4 18486420us : sched_pi_setprio: comm=pthread_test pid=1555 oldprio=-1 newprio=-1
>> [   36.689416] pthread_-1555    0d..4 18486421us : sched_dequeue_dl_entity: comm=pthread_test pid=1555 flags=0
>> [   36.689416] pthread_-1555    0d..4 18486421us : sched_enqueue_dl_entity: comm=pthread_test pid=1555 pi_comm=pthread_test pi_pid=1555 flags=8
>> [   36.689416] pthread_-1555    0d..4 18486421us : sched_dequeue_dl_entity: comm=pthread_test pid=1555 flags=0
>> [   36.689416] pthread_-1555    0d..4 18486422us : sched_enqueue_dl_entity: comm=pthread_test pid=1555 pi_comm=pthread_test pi_pid=1555 flags=0
>> [   36.689416] pthread_-1555    0d.H4 18486539us : sched_enqueue_dl_entity: comm=pthread_test pid=1555 pi_comm=pthread_test pi_pid=1555 flags=8
> 
> I noticed that the last two lines are different. Maybe that is yet
> another path into enqueue_task_dl().

So more testing revealed that the patch also starves both tasks
eventually. Both processes make no progress at all.

runnable tasks:
            task   PID         tree-key  switches  prio     exec-runtime         sum-exec        sum-sleep
----------------------------------------------------------------------------------------------------------

Armin Steinhoff | 9 Oct 10:35 2014

Re: Operation not permitted / pthread_setschedparam / EOD

Thomas Gleixner schrieb:
[ clip]

>that it is calling setpriority() with the arguments

>     which = 2 	   (PRIO_USER)
>     who   = 0     (root)
>     prio  = 0x14  (20)

>And that's exactly what is causing your problem.

OK, thanks a lot!! 

I jumped back to PREEMPT_RT  after several years and wonder:

- why isn't there a direct link to this project on kernel.org? There is only a reference to an outdated page of the RTwiki

- why do the "Latest News" of the RTwiki (from 2012 / Reporting Bugs) say:
"As always, please search the existing bug list and consider posting the issue to the linux-rt-users
<https://rt.wiki.kernel.org/index.php/Mailinglists> mailing list before opening a new defect."?

Do we have to add: "If you do so, you must be prepared to get personal attacks from kernel developers"?

- why is it necessary to scroll down 4 screens of a 27" monitor to find the released "vanilla" kernels in 3.x?
  I gave up after the second screen and then used a "testing" kernel  -> and that is, in the end, the root of the
chaos!

Are you sure that the current representation of the PREEMPT_RT project will convince customers or
developers to use it?


Thomas Gleixner | 5 Oct 22:54 2014

Re: Operation not permitted / pthread_setschedparam

On Sun, 5 Oct 2014, Armin Steinhoff wrote:
> But this doesn't explain why pthread_schedparam had problems with an
> earlier PREEMPT_RT kernel!
> And pointing out such a problem and problems of "ps" is for you:

Let's look at the problems.

I built the random kernel version which you used for testing and
against which you reported a bug.

# uname -a
# Linux fuzz 3.4.0-rc7-rt7+ #42 SMP PREEMPT RT Sun Oct 5 15:52:24 CEST 2014 x86_64 GNU/Linux

Using the following test program:

#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include <pthread.h>

static void *fn(void *p)
{
	while (1) sleep(1);
	return NULL;
}

int main(void)
{
	struct sched_param sp;

Armin Steinhoff | 30 Sep 12:40 2014

Operation not permitted / pthread_setschedparam


Hi,

a "pthread_setschedparam" call of the code below terminates with an
errno == 1  ... means "Operation not permitted"

    instance_l.fStopThread = FALSE;
    if (pthread_create(&instance_l.threadId, NULL, eventThread, (void*)&instance_l) != 0)
        goto Exit;

    schedParam.__sched_priority = KERNEL_EVENT_THREAD_PRIORITY;
    if (pthread_setschedparam(instance_l.threadId, SCHED_FIFO, &schedParam) != 0)
    {
        DEBUG_LVL_ERROR_TRACE("%s(): couldn't set thread scheduling parameters! %d\n", __func__, errno);
    }

I'm using a PREEMPT_RT kernel 3.4.0-rc7-rt7-2.16  for SuSE 12.2
Milestone 2 (3.4.11-2.16).

KERNEL_EVENT_THREAD_PRIORITY is 55!

How to solve that problem ?

Best Regards

--Armin
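
A side note on the error reporting in the snippet above: the pthread functions return the error number directly and do not set errno, so the returned value is the one to print. The sketch below shows that pattern with the portable sched_priority member. The names worker, try_fifo, lock, cond and stop are invented for illustration, and the sketch only reports the error; it makes no claim about what causes the EPERM in this report.

```c
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int stop;

/* Worker that stays alive until told to stop, so the thread ID
 * is still valid when the scheduling call is made. */
static void *worker(void *unused)
{
	(void)unused;
	pthread_mutex_lock(&lock);
	while (!stop)
		pthread_cond_wait(&cond, &lock);
	pthread_mutex_unlock(&lock);
	return NULL;
}

/* Try to switch a fresh thread to SCHED_FIFO at 'prio'.
 * Returns 0 or the error number from the failing pthread call. */
int try_fifo(int prio)
{
	pthread_t tid;
	struct sched_param sp;
	int rc;

	stop = 0;
	rc = pthread_create(&tid, NULL, worker, NULL);
	if (rc != 0)
		return rc;

	memset(&sp, 0, sizeof(sp));
	sp.sched_priority = prio;
	rc = pthread_setschedparam(tid, SCHED_FIFO, &sp);
	if (rc != 0)                      /* rc, not errno, holds the error */
		fprintf(stderr, "pthread_setschedparam: %s\n", strerror(rc));

	pthread_mutex_lock(&lock);
	stop = 1;                         /* let the worker exit */
	pthread_cond_signal(&cond);
	pthread_mutex_unlock(&lock);
	pthread_join(tid, NULL);
	return rc;
}
```

Calling try_fifo(55) from a main() built with -pthread returns 0 when the caller is allowed to use SCHED_FIFO at that priority, and an error number such as EPERM otherwise.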


Rafael Vega | 25 Sep 18:53 2014

Fwd: CpuFreq Laptop Scaling broken?

Hi,

I'm jumping into this thread as suggested by the folks at
linux-audio-user. I am the poster of the thread that Harry mentioned.
I would like to help resolve this issue but I'm not sure how. Please
advise and I'll try to help as much as I can. Here are some links with
relevant information (I think) regarding this issue:

The same post Harry mentioned:
http://forums.debian.net/viewtopic.php?f=5&t=117613&p=554333#p554333

Another Debian user reporting similar problems, he "fixed" it by using
an older kernel:
http://forums.debian.net/viewtopic.php?f=5&t=116307

The thread I started on linux-audio-user. Another person reports he
has the same issue in Arch's package of the RT kernel.
http://lists.linuxaudio.org/pipermail/linux-audio-user/2014-September/099285.html

An issue I reported on the thermald git repository. I originally
thought the problem had to do with thermald. There are some hopefully
useful syslog messages there.
https://github.com/01org/thermal_daemon/issues/39#issuecomment-56846550

Rafael Vega.

--

-- 

Rafael Vega
email.rafa <at> gmail.com