Matthias-Christian Ott | 31 Jul 12:25 2015

Initializer and initialization macros in queue.h

I'm not sure whether this is the right place to ask, but queue.h lives
under sys.

queue.h defines initializer macros (e.g. LIST_HEAD_INITIALIZER) and
initialization macros (e.g. LIST_INIT). The initializer macros were
introduced in NetBSD. I'm not sure what the intended purpose of having
both macro types is.

If you look at the implementation, both macro types generate the same
code for SLIST, LIST, SIMPLEQ, TAILQ and CIRCLEQ. Is that intended?

For statically allocated and stack-allocated variables both forms seem to
be correct:

#include <stddef.h>
#include <sys/queue.h>

struct entry {
    LIST_ENTRY(entry) link;
};

LIST_HEAD(head, entry);

struct head head1 = LIST_HEAD_INITIALIZER(&head1);

int main(void)
{
    struct head head2;


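
The two forms can be compared side by side in a small userland sketch. This is only an illustration, assuming a sys/queue.h that provides LIST_EMPTY and LIST_FOREACH; the helper names both_start_empty and count_after_insert are made up for the example. Note that it passes the head variable itself to LIST_HEAD_INITIALIZER: the LIST variant ignores the argument, but other variants such as TAILQ_HEAD_INITIALIZER use it.

```c
#include <stddef.h>
#include <sys/queue.h>

struct entry {
	LIST_ENTRY(entry) link;
};

LIST_HEAD(head, entry);

/* Static storage: set up at compile time with the initializer macro. */
static struct head head1 = LIST_HEAD_INITIALIZER(head1);

/* A stack head set up at run time with LIST_INIT starts out empty too. */
static int
both_start_empty(void)
{
	struct head head2;

	LIST_INIT(&head2);
	return LIST_EMPTY(&head1) && LIST_EMPTY(&head2);
}

/* The initializer macro also works for an automatic variable. */
static int
count_after_insert(void)
{
	struct head head2 = LIST_HEAD_INITIALIZER(head2);
	struct entry a, b;
	struct entry *e;
	int n = 0;

	LIST_INSERT_HEAD(&head2, &a, link);
	LIST_INSERT_HEAD(&head2, &b, link);
	LIST_FOREACH(e, &head2, link)
		n++;
	return n;
}
```

One practical difference that holds regardless of the generated code: an initializer can only appear in a declaration, while LIST_INIT is a statement and can therefore also reset a head that lives inside a dynamically allocated structure.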

Maxime Villard | 30 Jul 10:53 2015

2*(void *) atomic swap?

Do we have a magic function that can perform two atomic_swap_ptr()
calls as a single atomic operation?

In kern/subr_kmem.c::kmem_guard_free(), the code used to be

	rotor = atomic_inc_uint_nv(&kg->kg_rotor) % kg->kg_depth;
	va = (vaddr_t)atomic_swap_ptr(&kg->kg_fifo[rotor], (void *)va);
	if (va != 0) {
		vmem_free(kg->kg_vmem, va, size);
	}

The va was inserted into the FIFO atomically, and that was fine. Now
we also need to register the size of the va into that FIFO, which means
we have to perform another atomic_swap_ptr(), but there's obviously a
window in the meantime.

I don't want to use a global lock; it may slow down the system.
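
One way to avoid both the window and a global lock is to keep a single pointer per FIFO slot and let it point at a record that carries both values, so that one swap moves the pair. The following is a userland sketch of that idea using C11 atomics, not the kernel's atomic_swap_ptr(); struct guard_ent, FIFO_DEPTH and fifo_swap are hypothetical names, and where the record's memory would come from in the kernel is left open.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_DEPTH 4	/* illustrative stand-in for kg_depth */

/* Hypothetical record pairing an address with its size. */
struct guard_ent {
	uintptr_t va;
	size_t size;
};

static _Atomic(struct guard_ent *) fifo[FIFO_DEPTH];
static atomic_uint rotor;

/*
 * Publish ent in the next slot with one atomic exchange and return the
 * record that occupied the slot before, or NULL if the slot was empty.
 * Both fields of the returned record belong together because they
 * were written before the record was published.
 */
static struct guard_ent *
fifo_swap(struct guard_ent *ent)
{
	unsigned int slot = atomic_fetch_add(&rotor, 1) % FIFO_DEPTH;

	return atomic_exchange(&fifo[slot], ent);
}
```

The cost of this design is one extra record per queued address; in exchange, the evicted address and its size are always read from the same record.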

Maxime Villard | 29 Jul 22:39 2015

Enormous scheduler bug

(not related to my recent spl question)
I've found a bug in kern/sys_pset.c::sys_pset_assign():

	LIST_FOREACH(t, &alllwp, l_list) {
		if (t->l_affinity == NULL) {
		if (kcpuset_isset(t->l_affinity, cpu_index(ci))) {
			return EPERM;

You can easily see that this loop tries to make sure no LWP is
already assigned to the cpu (ci). But you can also easily see
that if an LWP is already scheduled on a cpu, but not the one
we are requesting, the loop does not unlock that LWP.
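
The shape of the problem, and of a loop that releases the lock on every path, can be sketched in userland. This is only a model under stated assumptions: struct fake_lwp, fake_lock, fake_unlock and check_affinity are invented for the illustration, a plain flag stands in for the per-LWP lock, and a bool array stands in for the kcpuset.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an LWP with a lock and an affinity set. */
struct fake_lwp {
	bool locked;		/* stand-in for the per-LWP lock */
	const bool *affinity;	/* NULL means no affinity is set */
};

static void
fake_lock(struct fake_lwp *t)
{
	assert(!t->locked);	/* would be "locking against myself" */
	t->locked = true;
}

static void
fake_unlock(struct fake_lwp *t)
{
	assert(t->locked);
	t->locked = false;
}

/* Return EPERM if any LWP is bound to cpu; hold no lock on return. */
static int
check_affinity(struct fake_lwp *lwps, size_t n, size_t cpu)
{
	for (size_t i = 0; i < n; i++) {
		struct fake_lwp *t = &lwps[i];

		fake_lock(t);
		if (t->affinity == NULL) {
			fake_unlock(t);
			continue;
		}
		if (t->affinity[cpu]) {
			fake_unlock(t);
			return EPERM;
		}
		/* Affinity set, but for another CPU: unlock here too. */
		fake_unlock(t);
	}
	return 0;
}
```

Dropping the last fake_unlock() reproduces the situation described above: an LWP whose affinity names a different CPU stays locked after the loop moves on.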

How to trigger:

- - - - - - - - - - - - - - loop.c - - - - - - - - - - - - - -


Maxime Villard | 29 Jul 07:59 2015

spl question

What happens if the kernel calls splx(s) twice? And what happens if it
forgets to call splx(s)? Like:

	s = splnet();


	return;   /* NO SPLX(S) */

I guess it creates some inconsistency, right?

Timo Buhrmester | 25 Jul 22:42 2015

panic: locking xyz against myself (linux DRM?!)

For a while now, I've been getting an occasional panic that I can't reproduce directly, but it seems to
correlate with long, memory-intensive and/or video-intensive firefox sessions.

This is a recent -current (7.99.20 on amd64, although the problem exists as of at least 7.99.10, likely
earlier too), my (onboard) video chip is
> radeon0 at pci1 dev 5 function 0: vendor 1002 product 9614 (rev. 0x00)
all with
> radeondrmkmsfb0 at radeon0
> radeondrmkmsfb0: framebuffer at 0xffff800046148000, size 1280x1024, depth 32, stride 5120

Besides that, I'm using Xorg from pkgsrc-2015Q2 (i.e. X11_TYPE=modular)

Now, whenever the system has been up for a few days and I haven't restarted firefox for a while, it
eventually crashes with:
> panic: kernel diagnostic assertion "((mutex->wwm_state != WW_OWNED) || (mutex->wwm_u.owner != curlwp))" failed: file "/usr/src/sys/external/bsd/drm2/linux/linux_ww_mutex.c", line 760 locking 0xfffffe804fc70220 against myself: 0xfffffe811c5b2840
> fatal breakpoint trap in supervisor mode
> trap type 1 code 0 rip ffffffff80193685 cs 8 rflags 246 cr2 7f7ff7e43000 ilevel 8 rsp fffffe8041791ac0
> curlwp 0xfffffe811c5b2840 pid 1237.1 lowest kstack 0xfffffe804178e2c0
> Stopped in pid 1237.1 (Xorg) at netbsd:breakpoint+0x5:  leave
> breakpoint() at netbsd:breakpoint+0x5
> vpanic() at netbsd:vpanic+0x13c
> kern_assert() at netbsd:kern_assert+0x4f
> linux_ww_mutex_trylock() at netbsd:linux_ww_mutex_trylock+0xc1
> ttm_bo_uvm_fault() at netbsd:ttm_bo_uvm_fault+0x69
> radeon_ttm_fault() at netbsd:radeon_ttm_fault+0x6a
> uvm_fault_internal() at netbsd:uvm_fault_internal+0x828
> trap() at netbsd:trap+0x32a
> --- trap (number 6) ---

Maxime Villard | 25 Jul 10:28 2015

Brainy: Memory Leak in if_ieee1394

There is a memory leak here:

----------------------- sys/net/if_ieee1394subr.c ----------------------

		if (m == NULL)
			goto bad;
		m->m_flags |= m0->m_flags & (M_BCAST|M_MCAST);	/* copy bcast */
		MH_ALIGN(m, sizeof(struct ieee1394_fraghdr));
		m->m_len = sizeof(struct ieee1394_fraghdr);
		ifh = mtod(m, struct ieee1394_fraghdr *);
		ifh->ifh_ft_size =
		    htons(IEEE1394_FT_SUBSEQ | IEEE1394_FT_MORE | (totlen - 1));
		ifh->ifh_etype_off = htons(off);
		ifh->ifh_dgl = htons(ic->ic_dgl);
		ifh->ifh_reserved = 0;
		m->m_next = m_copy(m0, sizeof(*ifh) + off, fraglen);
		if (m->m_next == NULL)
XX			goto bad;
		m->m_pkthdr.len = sizeof(*ifh) + fraglen;
		off += fraglen;
		*mp = m;
		mp = &m->m_nextpkt;
	ifh->ifh_ft_size &= ~htons(IEEE1394_FT_MORE);	/* last fragment */
	m_adj(m0, -(m0->m_pkthdr.len - maxsize));

	return m0;
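
The general pattern behind this kind of leak can be shown in a userland sketch. It assumes that the line marked XX is the leaking path, that is, that the freshly allocated header m has not yet been linked into the chain when the copy fails, so cleanup code that walks the chain never sees it. struct buf, buf_alloc, buf_free, make_fragment and the live_bufs counter are all hypothetical; they only stand in for mbuf handling.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an mbuf: a header with a chained payload. */
struct buf {
	struct buf *next;
	size_t len;
};

/* Counts buffers currently allocated, so the example can be checked. */
static size_t live_bufs;

/* Allocate one buffer; fail != 0 simulates an allocation failure. */
static struct buf *
buf_alloc(size_t len, int fail)
{
	struct buf *b;

	if (fail || (b = malloc(sizeof(*b))) == NULL)
		return NULL;
	b->next = NULL;
	b->len = len;
	live_bufs++;
	return b;
}

static void
buf_free(struct buf *b)
{
	if (b != NULL) {
		live_bufs--;
		free(b);
	}
}

/* Build header plus payload; release the header if the payload fails. */
static struct buf *
make_fragment(size_t hdrlen, size_t paylen, int fail_payload)
{
	struct buf *m = buf_alloc(hdrlen, 0);

	if (m == NULL)
		return NULL;
	m->next = buf_alloc(paylen, fail_payload);
	if (m->next == NULL) {
		buf_free(m);	/* without this, m is unreachable and leaks */
		return NULL;
	}
	return m;
}
```

Whether the right fix in the driver is to free m before the goto or to link it into the chain earlier is a separate question that this sketch does not answer.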

Emmanuel Dreyfus | 16 Jul 18:57 2015

Another force unmount failure


I discovered another scenario where force unmount does not work: an
unresponsive PUFFS filesystem. The filesystem stopped responding during an
operation in which the filesystem root vnode is locked. As a result, trying
to unmount goes this way:

sys_unmount -> namei -> namei_tryemulroot -> lookup_once -> VFS_ROOT ->
puffs_vfsop_root -> vn_lock -> VOP_LOCK, and we lose.

I can kill the userland process, but the mount remains, and any attempt
to touch it hangs in tstile.

Any idea how that can be fixed?


Emmanuel Dreyfus
manu <at>

Ryo ONODERA | 12 Jul 11:27 2015

Threaded file system benchmark


I have added sched(3) support for fio, Flexible IO Tester.
and included patch.)
However, 'top -1' shows that only one CPU is in use.

Is my sched(3) support code wrong, or is there another reason?
If there is another reason, could you explain it to me?
If my code is wrong, could you give me some advice?

Thank you.

My machine:
NetBSD/amd64 current on a 4-CPU laptop.
My kernel is GENERIC with dtrace support.
$ uname -a
NetBSD 7.99.19 NetBSD 7.99.19 (DTRACE7) #7: Thu Jul  9 21:58:33 JST 2015 
ryo_on <at> amd64

On one terminal:
$ cat random-write-test.fio
; random read of 128mb of data


$ fio --numjobs 4 random-write-test.fio

Matt Thomas | 11 Jul 09:15 2015

Genericizing sys/compat/netbsd32

sys/compat/netbsd32 is great at running 32-bit NetBSD on a 64-bit kernel.
But with a little tweaking, it could do so much more.

For example, aarch64 will need multiple instances of compat_netbsd32
(one for arm32 eabi, one for arm32 oabi, and possibly one for aarch64
ilp32 unless it can use the arm32 eabi)

This requires being able to change the netbsd32_ that starts every function
to something unique.
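
A common way to make such a prefix configurable is preprocessor token pasting. The sketch below only illustrates that technique; COMPAT_PREFIX, COMPAT_CAT, COMPAT_FN and low_byte are hypothetical names and are not taken from the diffs mentioned in this message.

```c
/* Two-level paste so COMPAT_PREFIX is expanded before ## is applied. */
#define COMPAT_CAT_(a, b)	a##b
#define COMPAT_CAT(a, b)	COMPAT_CAT_(a, b)

/* Each compat instance would build with its own prefix. */
#ifndef COMPAT_PREFIX
#define COMPAT_PREFIX		netbsd32_
#endif

#define COMPAT_FN(name)		COMPAT_CAT(COMPAT_PREFIX, name)

/* With the default prefix this defines netbsd32_low_byte(). */
static int
COMPAT_FN(low_byte)(int v)
{
	return v & 0xff;
}
```

Building the same source with a different COMPAT_PREFIX definition on the compiler command line would yield a second, independently named set of functions from one file.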

Now if we are going that far, with a little more work we can separate out
the netbsd32 specific pieces and have a generic "netbsd" on netbsd compat
layer.  This could be used on ARM or some MIPS, or even PowerPC to run a
reverse endian userland (big endian user program on little endian kernel
for example).  Or improve the efficiency of running ARM OABI programs on
an EABI kernel (since much of the netbsd32 compatibility isn’t needed and
could be skipped).

I have started some effort towards this and have a set of diffs at showing how syscalls could
be handled.  netbsd32_wait.c is the furthest along in being genericized.

I particularly like how the NETBSDX_SYSCALL(foo) and
NETBSDX_COMPAT_SYSCALL(n, foo) macros simplify things.

Robert Swindells | 10 Jul 22:21 2015

Linux ioctl emulation

I'm getting a panic in compat_ifioctl() in if_43.c when a Linux binary
makes an ioctl(fd, SIOCGIFBRDADDR, &data) call.

In this code, at the end of the function, oifr is NULL.

        if (cmd != ocmd)
                ifreqn2o(oifr, ifr);

Robert Swindells

David Holland | 6 Jul 11:58 2015


It has been brought to my attention that the logic in
mount_checkdirs() both (a) races with fork and (b) is probably
compromised by the *at() syscalls.

The purpose of the code is to update all processes' current dirs and
root dirs that have just been mounted over, so nobody ends up sitting
on an intermediate vnode in the middle of a mount stack.

However, in fork we first copy the parent's cwd structure and then
assign the copy, during which time the copy is invisible to
mount_checkdirs; so in theory some process's current dir (or root dir,
too) could be skipped.

Also, since you can't chdir (even fchdir) to the middle of a mount
stack, this logic was sufficient to avoid using the middle of a mount
stack as the starting point for a path lookup, even if someone had an
fd open. But the *at() system calls break this invariant wide open.

So the question is: does it matter? Do we actually care? It seems to
me that no great harm arises (other than perhaps some confusion) if
one is sitting in the middle of a mount stack, and in fact it might
even be desirable if the mount stack is a union mount.

Also it's occasionally useful to mount over things and leave a process
underneath, which this logic seems to complicate.

The logic was added to 4.4 by Kirk McKusick but without much in the
way of rationale:
