Andreas Speier | 2 Feb 17:53

Re: Interrupt problems causing crash of network driver and Kernel panic?

Hi L4-Hackers,

in addition to my previous mail, here are the configurations of the Fiasco microkernel and the L4Linux kernel...

Fiasco config file:
###################
#
# Automatically generated make config: don't edit
# Fiasco kernel version: SVN
# Mon Oct  5 12:05:33 2009
#

#
# Target configuration
#
CONFIG_IA32=y
# CONFIG_AMD64 is not set
# CONFIG_ARM is not set
CONFIG_PF_PC=y
# CONFIG_PF_UX is not set
# CONFIG_PF_REALVIEW is not set
# CONFIG_PF_INTEGRATOR is not set
# CONFIG_PF_XSCALE is not set
# CONFIG_PF_SA1100 is not set
CONFIG_ABI_V2=y
# CONFIG_ARM_PXA is not set
# CONFIG_ARM_SA is not set
# CONFIG_ARM_920T is not set
# CONFIG_ARM_926 is not set
# CONFIG_ARM_1176 is not set
[...]

Adam Lackorzynski | 7 Feb 23:50

Re: Interrupt problems causing crash of network driver and Kernel panic?


On Thu Jan 28, 2010 at 14:33:43 +0100, Andreas Speier wrote:
> The main hardware components are an embedded board with 1GHz VIA C7
> processor, 1GB of RAM and 2 network interface cards (VIA-Rhine, 3Com).
> The drivers are compiled into the L4Linux kernel.
> 
> Running this application works fine until some more interrupt intensive
> process will be started. Then the VoIP communication (SIP) breaks down
> and I got the following output in /var/log/syslog:
> 
> #################
> Jan 28 08:24:58 TESTPC kernel: ------------[ cut here ]------------
> Jan 28 08:24:58 TESTPC kernel: kernel BUG at
> /home/[...]/l4linux-2.6.29/net/core/dev.c:2625!
> Jan 28 08:24:58 TESTPC kernel: Trap: 6: 0000 [#1]
> Jan 28 08:24:58 TESTPC kernel: last sysfs file:
> Jan 28 08:24:58 TESTPC kernel: Modules linked in:
> Jan 28 08:24:58 TESTPC kernel:
> Jan 28 08:24:58 TESTPC kernel: Pid: 872, comm: find Not tainted
> (2.6.29-l4 #3)
> Jan 28 08:24:58 TESTPC kernel: EIP: ff04:[<0063c5bf>] EFLAGS: 00010246
> CPU: 0
> Jan 28 08:24:58 TESTPC kernel: EIP is at __napi_complete+0x2f/0x40
> Jan 28 08:24:58 TESTPC kernel: EAX: 0092d45c EBX: 0092d45c ECX: 0092d45c
> EDX: 008aa40c
> Jan 28 08:24:58 TESTPC kernel: ESI: 00000001 EDI: 0092d45c EBP: b0afff14
> ESP: b0afff0c
> Jan 28 08:24:58 TESTPC kernel: DS: 4000 ES: 7032 FS: 0023 GS: 0043 SS: 0023
> Jan 28 08:24:58 TESTPC kernel: Process find (pid: 872, ti=b0afe000
> task=0a5391b0 task.ti=0a564000)
[...]

Andreas Speier | 8 Feb 09:51

Re: Interrupt problems causing crash of network driver and Kernel panic?

Hi Adam,

I did some more tests on the system using different settings in the
L4Linux and menu.lst configurations.

The following listings show the results in /var/log/syslog, as far as they
could be recorded, for:

* Enabled roottask.config and L4Linux priority 0xA0 (same as all other
tasks) [1]
* Enabled roottask.config and L4Linux priority 0xE0 [1]
* Disabled roottask.config and L4Linux priority 0xA0 [2]
* Enabled roottask.config and L4Linux priority 0xA0 [2]

(tests done via SSH)
[1]: find /  asdf.asdf
[2]: while true; do echo $(date +%s); done

Both test scenarios were run with an established SIP connection over
both NICs, so there is already some network traffic and interrupt load.

Adam Lackorzynski wrote:
> On Thu Jan 28, 2010 at 14:33:43 +0100, Andreas Speier wrote:
>> The main hardware components are an embedded board with 1GHz VIA C7
>> processor, 1GB of RAM and 2 network interface cards (VIA-Rhine, 3Com).
>> The drivers are compiled into the L4Linux kernel.
>>
>> Running this application works fine until some more interrupt intensive
>> process will be started. Then the VoIP communication (SIP) breaks down
>> and I got the following output in /var/log/syslog:
[...]

Christian Prochaska | 12 Feb 08:20

Fiasco reserved memory area overlapping with bootloader area

Hello,

I recently had the problem that a physical memory area reserved for 
Fiasco overlapped with a Bootloader area, resulting in one or possibly 
more corrupted boot modules. Apparently there was not enough physical 
memory available, but I would suggest that in this case Fiasco should 
print an error message and stop as soon as it knows about the overlap 
instead of going on to execute the corrupted modules.
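
A minimal sketch of the kind of check suggested here, not Fiasco's actual
code. The struct layout, the type names and the function names are
hypothetical, and regions are assumed to be half-open ranges [start, end).
It reports the first Bootloader region that collides with a Reserved one,
so the caller can print an error and stop:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor layout; the real KIP encoding differs. */
enum mem_type {
    MEM_CONVENTIONAL,
    MEM_RESERVED,
    MEM_BOOTLOADER,
    MEM_ARCH,
    MEM_DEDICATED
};

struct mem_desc {
    uint64_t start;      /* first byte of the region */
    uint64_t end;        /* one past the last byte (assumed half-open) */
    enum mem_type type;
};

/* Two half-open ranges overlap if each one starts before the other ends. */
static int ranges_overlap(const struct mem_desc *a, const struct mem_desc *b)
{
    return a->start < b->end && b->start < a->end;
}

/*
 * Return the index of the first Bootloader descriptor that overlaps a
 * Reserved descriptor, or -1 if no such overlap exists.
 */
static int find_reserved_bootloader_overlap(const struct mem_desc *d, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (d[i].type != MEM_BOOTLOADER)
            continue;
        for (size_t j = 0; j < n; j++) {
            if (d[j].type == MEM_RESERVED && ranges_overlap(&d[i], &d[j]))
                return (int)i;
        }
    }
    return -1;
}
```

A caller would run this once after all descriptors are known and refuse to
start any boot module if the result is not -1.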

Christian

[src] (17.00) jdb: kf
KIP @ 0xf0001000
magic: L4µK  version: 0x87004444
clock: 0000000000623d7c (6438268)
freq_cpu: 2993136kHz
freq_bus: 0kHz
sigma0_ip: 00103de8 sigma0_sp: 002da720
sigma1_ip: 00000000 sigma1_sp: 00000000
root_ip:   010301dc root_sp:   00000000
Memory (max 30 descriptors):
  1:phys [0000000000000000-000000000009fc00] Conventional
  2:phys [0000000000100000-0000000003ff0000] Conventional
  3:phys [0000000000001000-0000000000065000] Reserved
  4:phys [000000000006a000-000000000006a400] Bootloader
  5:phys [000000000009fc00-00000000000a0000] Arch
  6:phys [00000000000e8000-0000000000100000] Arch
  7:phys [0000000000100000-0000000000108400] Dedicated
  8:phys [0000000001000000-0000000001065400] Bootloader
  9:phys [00000000020de000-0000000003cef800] Bootloader
[...]

Adam Lackorzynski | 14 Feb 16:10

Re: Fiasco reserved memory area overlapping with bootloader area

Hi,

On Fri Feb 12, 2010 at 08:20:38 +0100, Christian Prochaska wrote:
> Hello,
> 
> I recently had the problem that a physical memory area reserved for
> Fiasco overlapped with a Bootloader area, resulting in one or
> possibly more corrupted boot modules. Apparently there was not
> enough physical memory available, but i would suggest that in this
> case Fiasco would print an error message and stop as soon as it
> knows about the overlap instead of further executing the corrupted
> modules.

Yep, confirmed, I queued it up on the todo list.

Thx,
Adam
-- 
Adam                 adam <at> os.inf.tu-dresden.de
  Lackorzynski         http://os.inf.tu-dresden.de/~adam/
Norman Feske | 24 Feb 11:19

Announcement: Genode 10.02 adds support for the NOVA hypervisor

The Genode project has released the new version 10.02 of the Genode
OS Framework. The framework is a modular user-level infrastructure
comprising device drivers, a GUI, networking support, a C library,
and Qt4. It can be executed directly on different microkernels, in
particular L4/Fiasco. Starting with the current release, we include
support for the recently released NOVA hypervisor in the official
Genode distribution. So there are now two kernels of the Dresden OS
group ready to use with the framework. For the technical details of
our porting effort, please refer to the release notes (see the link
below).

Moreover, we added support for the Codezero kernel and introduced a
new concept for managing real-time priorities on the L4ka::Pistachio
and OKL4 kernels. The most significant new functionality is the
initial port of the Python 2.6.4 script interpreter.

Release-notes summary for version 10.02
---------------------------------------
* Platform support
  * NOVA hypervisor
  * Codezero kernel
* New thread-context management
* Real-time priorities
* Python scripting

You can find the complete release notes for the version 10.02 here:

  http://genode.org/documentation/release-notes/10.02/

The new release is available via the project's subversion repository
[...]

Da Zheng | 26 Feb 10:51

A question about the softirq implementation in DDE Linux26

Hello,

After porting DDE Linux26 to the Hurd, I tested it with a NIC driver, pcnet32,
and saw a problem: the device interrupt of the NIC is sometimes masked and
cannot be unmasked.

Whenever the pcnet32 driver receives an interrupt, it masks device interrupts,
calls __netif_rx_schedule(), and lets the softirq handle the interrupt.
__netif_rx_schedule() should set NET_RX_SOFTIRQ, but it can only do that while
the local "irq" is disabled (by calling the local_irq_save macro). Linux disables
irqs with the cli instruction. Obviously DDE cannot do that, but the implementation
of local_irq_save in DDE is quite strange. It seems that it eventually calls
raw_local_irq_disable(), which is implemented in
linux26/lib/src/arch/l4/cli_sti.c. How can increasing _refcnt have anything to do
with disabling irqs?

Without disabling irqs, there is a race condition between the interrupt handler and
the softirq handler. When I run pcnet32 with my ported DDE Linux26 for a long time,
I sometimes see the softirq fail to be scheduled after the driver receives a hard IRQ.

I don't know how the Linux drivers can work with DDE Linux in L4.
raw_local_irq_disable() apparently has problems, if I am reading the right code.

Best regards,
Zheng Da
Dirk Vogt | 26 Feb 14:17

Re: A question about the softirq implementation in DDE Linux26

Hi,

  the DDE models a SMP-like setup, whereas each ddekit_thread is
supposed to run on a dedicated CPU.  For each IRQ, there is a dedicated
ddekit_thread.
   As far as I understand it, disabling hard IRQs in any other
ddekit_thread than the IRQ-handler threads has no effect, because they
won't receive IRQs anyway.  For an IRQ handler thread, it also has no
effect, because it only runs when it is handling an interrupt, and won't
receive any further IRQs while handling one.

Best regards,

Dirk.
Da Zheng | 26 Feb 15:08

Re: A question about the softirq implementation in DDE Linux26

Hi,

On 10-2-26 9:17 PM, Dirk Vogt wrote:
> Hi,
>  
>   the DDE models a SMP-like setup, whereas each ddekit_thread is
> supposed to run on a dedicated CPU.  For each IRQ, there is a dedicated
> ddekit_thread.
>    As far as I understand it, disabling hard IRQs in any other
> ddekit_thread than the IRQ-handler threads has no effect, because they
> won't receive IRQs anyway.  For an IRQ handler thread, it also has no
> effect, because it only runs when it is handling an interrupt, and won't
> receive any further IRQs while handling one.
I was talking about synchronization between the hard IRQ handler and the softirq handler.

When the driver receives a hard IRQ, it tries to raise a softirq:
    local_irq_save(flags);
    list_add_tail(&n->poll_list, &__get_cpu_var(softnet_data).poll_list);
    __raise_softirq_irqoff(NET_RX_SOFTIRQ);
    local_irq_restore(flags);
The softirq thread, on the other hand, does
    local_irq_save(flags);
    if (local_softirq_pending())
        __do_softirq();
    local_irq_restore(flags);
In __do_softirq, it does
        unsigned long pending = local_softirq_pending();

        /* reset softirq count */
        set_softirq_pending(0);
[...]
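
One way a pending bit could get lost in a threaded environment is a raise
landing between the read of local_softirq_pending() and the following
set_softirq_pending(0). The sketch below is not DDE code; the names are
hypothetical and the softirq number is arbitrary. It only shows the general
idea of making "raise" and "fetch and clear" indivisible with C11 atomics
instead of relying on disabled interrupts:

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the per-CPU softirq pending mask. */
static atomic_uint softirq_pending;

/* Hard-IRQ side: mark softirq number nr as pending. */
static void raise_softirq_atomic(unsigned nr)
{
    atomic_fetch_or(&softirq_pending, 1u << nr);
}

/*
 * Softirq side: fetch all pending bits and reset the mask in a single
 * indivisible step, so a concurrent raise is either returned by this
 * call or still set for the next one.
 */
static unsigned take_pending_softirqs(void)
{
    return atomic_exchange(&softirq_pending, 0u);
}
```

With this scheme the softirq thread would loop on take_pending_softirqs()
until it returns 0, and no lock is needed around the mask itself.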

Dirk Vogt | 26 Feb 16:50

Re: A question about the softirq implementation in DDE Linux26

On Fri, 2010-02-26 at 22:08 +0800, Da Zheng wrote:
> In Linux local_irq_save() disables irqs in the local processor, so if
> the hard
> IRQ handler tries to raise softirq, it is guaranteed that the softirq
> thread
> will not be scheduled to run, and vice versa. How would that work on a
> SMP machine? 

Correct me if I am wrong, but I think even on native Linux the hard IRQ
handler and the soft IRQ handler could run at the same time (on two
different processors), as only *local* interrupts are disabled. 

> [...] 2.3.43 introduced softirqs, and re-implemented the (now
> deprecated) BHs underneath them. Softirqs are fully-SMP versions of
> BHs: they can  run on as many CPUs at once as required. This means
> they need to deal with any races in shared data using their own locks.
> [...]

[0]
http://people.netfilter.org/rusty/unreliable-guides/kernel-hacking/basics-softirqs.html
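
To illustrate "their own locks": a minimal sketch, assuming a small
fixed-size queue in place of the real poll list. All names here are
hypothetical and this is neither Linux nor DDE code. Both the hard-IRQ side
and the softirq side take the same spinlock before touching the shared data,
so it does not matter which CPU or thread each one runs on:

```c
#include <stdatomic.h>
#include <stddef.h>

#define QUEUE_CAP 16

/* Hypothetical shared queue standing in for the poll list. */
struct poll_queue {
    atomic_flag lock;        /* spinlock taken by both handlers */
    int items[QUEUE_CAP];
    size_t count;
};

static void queue_init(struct poll_queue *q)
{
    atomic_flag_clear(&q->lock);
    q->count = 0;
}

static void queue_lock(struct poll_queue *q)
{
    /* Spin until the flag was previously clear, i.e. we own the lock. */
    while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
        ;
}

static void queue_unlock(struct poll_queue *q)
{
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Hard-IRQ side: append one item; returns 1 on success, 0 if full. */
static int queue_push(struct poll_queue *q, int item)
{
    int ok = 0;

    queue_lock(q);
    if (q->count < QUEUE_CAP) {
        q->items[q->count++] = item;
        ok = 1;
    }
    queue_unlock(q);
    return ok;
}

/* Softirq side: copy out all items under the lock; returns their number. */
static size_t queue_drain(struct poll_queue *q, int out[QUEUE_CAP])
{
    size_t n;

    queue_lock(q);
    n = q->count;
    for (size_t i = 0; i < n; i++)
        out[i] = q->items[i];
    q->count = 0;
    queue_unlock(q);
    return n;
}
```

The drain copies the items out and releases the lock before any processing
happens, which keeps the time spent holding the lock short.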
