Masanobu SAITOH | 29 Jan 08:31 2015

ixg(4) mutex problem (PR#49328)


 Uwe Toenjes found a bug in ixg(4). See:

He then submitted PR#49328.

This problem can be reproduced when

	LOCKDEBUG is defined
	ifconfig ixg0 up

ixg(4) takes a spin mutex before accessing the MAC core:

------ ixgbe.h -------
#define IXGBE_CORE_LOCK_INIT(_sc, _name) \
         mutex_init(&(_sc)->core_mtx, MUTEX_DEFAULT, IPL_NET)

------- ixgbe.c::ixgbe_ioctl() ------
         case SIOCADDMULTI:
         case SIOCDELMULTI:
         case SIOCSIFFLAGS:
         case SIOCSIFMTU:
                 if ((error = ether_ioctl(ifp, command, data)) != ENETRESET)
                         return error;
                 if ((ifp->if_flags & IFF_RUNNING) == 0)

Manuel Bouyer | 24 Jan 14:34 2015

vnode lock and v_numoutput

I have what looks like a deadlock in a Xen dom0 when using
file-backed virtual disk and HVM domUs (the dom0 is running netbsd-6).
In this setup, a file backing a virtual disk may be accessed by both
a vnd(4) and a qemu-dm userland process.
I end up in this situation:
qemu-dm is blocked on:
trace: pid 5171 lid 5 at 0xffffa0005e744740
sleepq_block() at netbsd:sleepq_block+0xc5
cv_wait() at netbsd:cv_wait+0xf2
genfs_do_putpages() at netbsd:genfs_do_putpages+0xa0e
vflushbuf() at netbsd:vflushbuf+0x4b
ffs_full_fsync() at netbsd:ffs_full_fsync+0x143
ffs_fsync() at netbsd:ffs_fsync+0x4b
VOP_FSYNC() at netbsd:VOP_FSYNC+0x5f
sys_fsync() at netbsd:sys_fsync+0x51
syscall() at netbsd:syscall+0xc4

(gdb) l *(genfs_do_putpages+0xa0e)
0xffffffff801bc7d4 is in genfs_do_putpages (../../../../miscfs/genfs/genfs_io.c:1246).
1241    skip_scan:
1242    #endif /* !defined(DEBUG) */
1244            /* Wait for output to complete. */
1245            if (!wasclean && !async && vp->v_numoutput != 0) {
1246                    while (vp->v_numoutput != 0)
1247                            cv_wait(&vp->v_cv, slock);
1248            }
1249            onworklst = (vp->v_iflag & VI_ONWORKLST) != 0;

Edgar Fuß | 16 Jan 20:06 2015

6.1/amd64 panic in cpu_switchto()

On an amd64 system running a 6.1 from November 2013 (yes, I need to update)
I got the following panic (manual OCR from screen photographs):

fatal protection fault in supervisor mode
trap type 4 code 0 rip ffffffff80100504 cs 8 rflags 10246 cr2 7f7ffffffe50 cpl 8 rsp fffffe8055879670
kernel: protection fault trap, code=0
Stopped in pid 26911.26911 (mount) at	netbsd:cpu_switchto+0xf4:	wrmsr
db{4}> bt
cpu_switchto() at netbsd:cpu_switchto+0xf4
kpreempt() at netbsd:kpreempt+0xe2
Xpreemptresume() at netbsd:Xpreemptresume+0x28
cpu_fsgs_zero() at netbsd:cpu_fsgs_zero+0x6a
setregs() at netbsd:setregs+0x7f
execve_runprog() at netbsd:execve_runprog+0x708
execve1() at netbsd:execve1+0x33
linux32_syscall() at netbsd:linux32_syscall+0xb9
db{4}> show registers
ds	1b0
es	c878
fs	1b0
gs	1
rdi	fffffe841978aa50
rsi	fffffe806cb3d780
rbp	fffffe80558796f0
rbx	fffffe811d84c2c0
rdx	fbcff3dd
rcx	c0000102
rax	fb50ffff
r8	0
r9	0

Kengo NAKAHARA | 13 Jan 10:19 2015

a preliminary report of NET_MPSAFE bridge and if_wm multiqueue PoC performance


I wrote an if_wm multiqueue PoC using my MSI/MSI-X prototype.
# Here is the MSI/MSI-X prototype memo

Here is the PoC, which includes ozaki-r@n.o's NET_MPSAFE if_bridge work.
# Because I haven't written Xen support code yet, this code cannot complete a release build...

Furthermore, I measured the performance of a kernel built from the
above code with "options NET_MPSAFE" enabled ("NET_MPSAFE PoC kernel").
The measurement environment is below.
    + Device under test (DUT)
      + supermicro A1SRi-2758F
        - 8core Atom C2758
        - I354 ethernet controller x4
    - DUT used as bridge between 2 NICs
    - The traffic is bi-directional UDP over DUT

Here is the preliminary report.
The meaning of the graphs is below.
    + horizontal axis is packet size
      - from 74 bytes to 1508 bytes
    + vertical axis is measurement results (Mbps)

    + the meaning of each line graph
      + red graph("MPSAFE-8core-Mbps.dat")
        - the kernel is "NET_MPSAFE PoC kernel"

Taylor R Campbell | 7 Jan 20:02 2015


Any objections to introducing membar_datadep_consumer?

Semantics:  Any load afterward with a data dependency on a load
beforehand is guaranteed to be ordered so.  Control dependencies are
not ordered.

Implementation:  mb on Alpha, noop everywhere else for now.  Define
__HAVE_MEMBAR_DATADEP_CONSUMER in XYZ/types.h if XYZ has it; otherwise
<sys/atomic.h> automatically defines it as a noop.
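
A minimal userland sketch of where such a barrier would sit in a
publish/consume pattern.  This is not the attached patch: struct node,
publish(), consume() and model_membar_datadep_consumer() are
hypothetical names.  C11 has no portable barrier that orders only
data-dependent loads, so the stand-in below uses the stronger
atomic_thread_fence(memory_order_acquire) where the proposal would be
a noop on everything but Alpha.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the proposed primitive (userland model only).
 * An acquire fence is stronger than a data-dependency barrier, but it is
 * correct in portable C11.
 */
#define model_membar_datadep_consumer() \
	atomic_thread_fence(memory_order_acquire)

struct node {
	int value;
};

/* Pointer published by the producer; NULL until publish() runs. */
static _Atomic(struct node *) published;

/* Producer: initialize the node, then make the pointer visible. */
void
publish(struct node *n, int value)
{
	assert(n != NULL);
	n->value = value;
	atomic_store_explicit(&published, n, memory_order_release);
}

/* Consumer: returns 1 and stores the value in *out, or 0 if nothing is published. */
int
consume(int *out)
{
	struct node *n = atomic_load_explicit(&published, memory_order_relaxed);

	if (n == NULL)
		return 0;
	/* The load of n->value below has a data dependency on the load of n. */
	model_membar_datadep_consumer();
	*out = n->value;
	return 1;
}
```

The barrier goes between the load of the pointer and the first load
through it; that is the ordering described under "Semantics" above.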

I have been sitting on the attached patch for ages and plan to
introduce its use where appropriate when I have time, with testing on
a multiprocessor Alpha known to reorder data-dependent loads.
Index: common/lib/libc/arch/alpha/atomic/membar_ops.S
RCS file: /cvsroot/src/common/lib/libc/arch/alpha/atomic/membar_ops.S,v
retrieving revision 1.6
diff -p -u -r1.6 membar_ops.S
--- common/lib/libc/arch/alpha/atomic/membar_ops.S	25 May 2008 15:56:11 -0000	1.6
+++ common/lib/libc/arch/alpha/atomic/membar_ops.S	7 Jan 2015 18:56:50 -0000
@@ -87,3 +87,5 @@ ATOMIC_OP_ALIAS(membar_exit,_membar_sync
Index: lib/libc/atomic/membar_ops.3

Alexander Nasonov | 1 Jan 16:32 2015

jit code and securelevel

I don't remember seeing a policy on disabling jit code at securelevel
1 or higher. Is it something we should add?


Maxime Villard | 31 Dec 09:00 2014

NTFS: node leak

there seems to be a node leak in NTFS in two places: ntfs_ntget(ip) is
called, but the two functions return without calling ntfs_ntput(ip).
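
A minimal sketch of this kind of leak and of the usual single-exit
fix, as a userland model.  It is not the NTFS code: struct model_node,
node_get(), node_put() and the two lookup functions are hypothetical
stand-ins for the ntfs_ntget()/ntfs_ntput() pairing.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical reference-counted node. */
struct model_node {
	int refs;
};

static void
node_get(struct model_node *n)
{
	n->refs++;
}

static void
node_put(struct model_node *n)
{
	assert(n->refs > 0);
	n->refs--;
}

/* Leaky variant: the early return skips node_put(). */
int
leaky_lookup(struct model_node *n, int fail)
{
	node_get(n);
	if (fail)
		return EINVAL;	/* reference is never dropped */
	node_put(n);
	return 0;
}

/* Fixed variant: every path leaves through the same release point. */
int
fixed_lookup(struct model_node *n, int fail)
{
	int error = 0;

	node_get(n);
	if (fail) {
		error = EINVAL;
		goto out;
	}
out:
	node_put(n);
	return error;
}
```

After a failing call, the leaky variant leaves refs at 1 while the
fixed variant returns it to 0.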

I would like some okayz before committing; I can't test it.


Index: ntfs_subr.c
RCS file: /cvsroot/src/sys/fs/ntfs/ntfs_subr.c,v
retrieving revision 1.55
diff -u -r1.55 ntfs_subr.c
--- ntfs_subr.c	28 Dec 2014 14:42:56 -0000	1.55
+++ ntfs_subr.c	29 Dec 2014 16:17:04 -0000
@@ -759,9 +759,9 @@
 	struct fnode   *fp = VTOF(vp);
 	struct ntnode  *ip = FTONT(fp);
-	struct ntvattr *vap;	/* Root attribute */
+	struct ntvattr *vap = NULL;	/* Root attribute */
 	cn_t            cn = 0;	/* VCN in current attribute */
-	void *        rdbuf;	/* Buffer to read directory's blocks  */
+	void *        rdbuf = NULL;	/* Buffer to read directory's blocks  */
 	u_int32_t       blsize;
 	u_int32_t       rdsize;	/* Length of data to read from current block */
 	struct attr_indexentry *iep;
@@ -779,8 +779,10 @@
 		return (error);

matthew green | 30 Dec 09:57 2014

race condition between (specfs) device open and close

hi folks.

while trying to fix any midi/sequencer issues i've seen in the
last year, i noticed that any attempt to call 'midiplay' on more
than one midi device at the same time, while not allowed, would
leave the sequencer device always busy.

i tracked this down to an ugly race condition inside the
specfs code.  the problem is that:

	t1		t2
	opens dev
			tries to open dev
	closes dev
	since sd_opencnt is != 1, does not call close
			gets EBUSY

in this case, the device close routine is never called, and in
at least the midi case, all future opens return EBUSY due to
there already being an active open.
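
the interleaving above can be replayed deterministically with a small
userland model.  this is only a sketch under assumptions, not the
specfs code: it assumes the open path counts the open in sd_opencnt
before calling the driver and undoes the count on failure, and all
names other than sd_opencnt are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of an exclusive-open device behind specfs. */
struct model_dev {
	int sd_opencnt;		/* opens counted by the specfs layer */
	bool driver_busy;	/* driver's own "already open" state */
	int close_calls;	/* how often the driver close routine ran */
};

/* Open, step 1: the open is counted before the driver is called. */
void
open_begin(struct model_dev *d)
{
	d->sd_opencnt++;
}

/* Open, step 2: call the driver; undo the count if it refuses. */
int
open_finish(struct model_dev *d)
{
	if (d->driver_busy) {
		d->sd_opencnt--;
		return EBUSY;
	}
	d->driver_busy = true;
	return 0;
}

/* Close: the driver close routine only runs for the last counted open. */
void
model_close(struct model_dev *d)
{
	assert(d->sd_opencnt > 0);
	if (d->sd_opencnt == 1) {
		d->driver_busy = false;
		d->close_calls++;
	}
	d->sd_opencnt--;
}

/* Replays the t1/t2 interleaving; returns t2's open result. */
int
replay_race(struct model_dev *d)
{
	int error;

	open_begin(d);			/* t1 opens dev */
	error = open_finish(d);
	assert(error == 0);
	open_begin(d);			/* t2 tries to open dev */
	model_close(d);			/* t1 closes: count is 2, driver close skipped */
	return open_finish(d);		/* t2 gets EBUSY */
}
```

after replay_race() the count is back to 0 but the driver still thinks
it is open, so every later open in the model fails with EBUSY.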

i've spent some time reviewing the specfs code, and i think that
the below patch fixes these specific problems.  it certainly
fixes the problem i have with trying to open /dev/music
concurrently.  it may not be entirely correct or have problems

Maxime Villard | 29 Dec 18:30 2014

fs/ headers

most of the FSs under sys/fs have their headers in /usr/include/fs/FS-NAME/.
Some are just in /usr/include/FS-NAME/: msdosfs - adosfs - filecorefs - ntfs.

Is this intentional? Is it OK if I move them into fs/?

Michael van Elst | 29 Dec 04:00 2014

disk driver interface

Currently NetBSD has three programming interfaces to determine
disk geometry from userland.

- ioctl DIOCGDINFO. The traditional interface, limited to 32bit
  numbers or disks < 2TB because its data structure corresponds
  to the binary on-disk structure.

- the "get-properties" command to the drvctl(4) driver. drvctl(4)
  is missing on some ports and some disk drivers don't make
  geometry properties available.

- ioctl DIOCGWEDGEINFO. Works only for wedges but not for the
  disk drivers themselves. This is fine for operations on
  data blocks of a wedge but doesn't help e.g. partitioning
  tools. It also does not provide the sector size.
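
The "< 2TB" figure for DIOCGDINFO follows from simple arithmetic on a
32-bit sector count.  A minimal sketch, assuming 512-byte sectors for
the 2TB case; label_limit_bytes() and fits_32bit_label() are
hypothetical helpers, not part of any NetBSD interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Smallest disk size, in bytes, that a 32-bit sector count can no
 * longer describe: 2^32 sectors times the sector size.
 */
uint64_t
label_limit_bytes(uint32_t sector_size)
{
	assert(sector_size > 0);
	return ((uint64_t)UINT32_MAX + 1) * sector_size;
}

/* True if a disk with total_sectors sectors fits in 32-bit fields. */
bool
fits_32bit_label(uint64_t total_sectors)
{
	return total_sectors <= UINT32_MAX;
}
```

With 512-byte sectors the limit is 2^41 bytes (2 TiB); with 4096-byte
sectors the same 32-bit fields would reach 2^44 bytes (16 TiB).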

To solve this, we could

- create a new DIOCGDINFO version that uses larger numbers. AFAIK
  that is about what OpenBSD does. The on-disk structure could be
  translated but writing a label might be incompatible if partitions
  are defined beyond the 2TB limit.

- make drvctl(4) mandatory and make all disk drivers provide
  geometry properties.

- make DIOCGWEDGEINFO available for the disk drivers and
  ignore wedge-related information.


Greg Troxel | 28 Dec 20:52 2014


The ZFS bits in NetBSD seem old, and it also seems that they don't quite
100% work.

Now, it seems OpenZFS is the locus of ZFS activity, and that's how
FreeBSD's ZFS code is maintained:

Thus, it seems that it would be good to extend OpenZFS to support NetBSD
(or extend NetBSD's glue code to support OpenZFS), and to have recent
OpenZFS code in NetBSD's src/external.

I have put this notion in the "Finish ZFS" project page:

I am curious if anyone who understands ZFS better has opinions on whether
my notion of heading to OpenZFS makes sense, and how hard it is likely
to be.


    Greg Troxel <gdt <at>>