Ryota Ozaki | 29 Jun 09:54 2016
Picon

Make sure to free all interface addresses in if_detach

Hi,

Addresses of an interface (struct ifaddr) have
a (reverse) pointer of an interface object
(ifa->ifa_ifp). If the addresses are surely
freed when their interface is destroyed, the
pointer is always valid and we don't need a
tweak of replacing the pointer to if_index like
mbuf.

Is the assumption is correct? No, for now.

The following patches ensure the assumption.

  http://www.netbsd.org/~ozaki-r/debug_ifaref.diff
  http://www.netbsd.org/~ozaki-r/free_all_ifa.diff

The 1st patch adds debugging functions to check
if all addresses are freed at the end of if_detach.

The 2nd patch tweaks some destruction functions
to satisfy the assumption.

- Deactivate the interface at the firstish of
  if_detach. This prevents in6_unlink_ifa from
  saving multicast addresses
- Invalidate rtcache(s) and clear a rtentry
  referencing an address on RTM_DELETE. rtcache(s)
  may delay freeing an address
- Replace callout_stop with callout_halt of DAD
(Continue reading)

Paul Goyette | 28 Jun 04:38 2016

Removing extra copy of rb_tree stuff

I'm looking at PR lib/44090, which requests that libprop be modified to 
use the "main" copy of rb_tree rather than its local copy, and then 
remove the local copy.

Well, part 1 is already done.  The prop_rb_impl.h header already has a 
section that redefined the "local" function names to reference the 
"real" functions (conditionalized on defined(__NetBSD__)), and the 
Makefile.inc already includes the prop_rb.c file only if PROPLIB_WANT_RB 
is defined (which it never seems to be).

The question comes down to part 2 of the PR - removal of the extra copy.

The changes in the header file would imply that this code might be used 
in projects other than NetBSD.  If this were the case, it might make 
sense to keep the extra code around; it's not really hurting anything. 
But because it's not being built, it might well suffer from future 
bit-rot.

So, would it be better to keep the extra code copy in the tree (just in 
case some other project uses it) and let it rot?  Or should I delete the 
extra copy, and adjust the (relatively small number of) callers to use 
the "real" function names rather than the local names?

Recommendations requested!

+------------------+--------------------------+------------------------+
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:      |
| (Retired)        | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org |
+------------------+--------------------------+------------------------+
(Continue reading)

Ryota Ozaki | 28 Jun 03:03 2016
Picon

Make sure to insert ifa/ia after initialization

Hi,

Basically we should insert an item to a collection
(say a list) after completed item's initialization to
avoid accessing an item that is initialized halfway.

ifaddr and in{,6}_ifaddr aren't processed like so.
The patch below fixes the issue.

  http://www.netbsd.org/~ozaki-r/insert_ifa_after_init.diff

To this end, I need to tweak {arp,nd6}_rtrequest that
depend on that an ifaddr (in{,6}_ifaddr) is inserted
during its initialization; they explore interface's
address list to determine that rt_getkey(rt) of a given
rtentry is in the list (i.e., whether the route's
interface should be a loopback), which doesn't work
after the change. To make it work I use RTF_LOCAL flag
instead.

Any comments or suggestions?

Thanks,
  ozaki-r

Masanobu SAITOH | 23 Jun 12:40 2016
Gravatar

Changing the return value of xxx_attach() from void to int.

  Hi, all.

  As you know, the return value of device driver's attach function is void.
I've thought that we should change it to int for many years. I believe I'm
not the only person.

  xxx_attach() may fail the following cases:

	got unexpected behavior.

	resource allocation error.

	driver is really broken.

	some others.

  xxx_attach() is void, so the caller can't know the fail. It makes some
problems:

	a) The OS may touch the broken device while the OS is running.
	   It may causes a panic.

	b) The shutdown sequence calls xxx_shutdown() even if
	   it's not really attached. It may causes panic. We have met
	   this bugs many times.

	c) To prevent b), we have added extra code into xxx_shutdown().
	   It wastes our time and the code become big and complex.

	d) Resource leak. For example, A device_t structure is allocated
(Continue reading)

Edgar Fuß | 21 Jun 19:21 2016
Picon

quota2 qv_grace

I'm confused how the file-system default grace period is supposed to get 
copied to the per-user grace period quota2 supports. It appears that this 
happens as soon as something gets allocated for the user (i.e., just by 
system calls, not by any userland program like edquota). However, I find 
nothing in sys/ufs/ufs/ufs_quota2.c that looks like it does that. What am 
I overlooking?

I remember I wrote the patch that causes qv_timelimit to get set to 
<current time> + qv_grace, but not how the latter gets set.

David Holland | 21 Jun 09:32 2016
Picon

struct fid/filehandle size and lfs

A while back the filehandle type for ffs was fixed to have a 64-bit
inode number instead of silently truncating to 32 bits. Yesterday I
naively merged that change into lfs and it exploded the world.

It seems that lfs has an extra 32-bit field in its filehandle (that
ffs doesn't) -- it is a copy of the 32-bit "uuid" in the lfs
superblock that helps avoid tripping on old superblock copies after a
newfs. Which is well and good. However, with this field present,
making the inode number 64 bits wide causes four further bytes of
padding (because of alignment restrictions) ... and worse, on lp64
platforms only and not their 32-bit counterparts.

This would be only a nuisance, except that this padding pushes the
total size of the filehandle to 32, and in sys/fstypes.h we have

   #define FHANDLE_SIZE_COMPAT     28
   #define FHANDLE_SIZE_MIN        FHANDLE_SIZE_COMPAT

and related to this lfs has as part of its public kernel interface the
following lie:

   struct lfs_fhandle {
           /* FHANDLE_SIZE_COMPAT (but used from userland too) */
           char space[28];
   };

This is the type formally used by the fcntl the cleaner uses to get to
the ifile, so if the total filehandle size is 32 it fails and the
cleaner won't attach and nothing works. Worse, because this is passed
to fhopen() it has to be the same as other filehandles, at least in
(Continue reading)

coypu | 19 Jun 14:14 2016

SOSEND_LOAN problems in MIPS

Hi,

in emulating pmax with gxemul I had trouble using:
  cat somefile | command

when somefile is bigger than 4096 bytes.
it shows no output when with a slightly smaller file, it would.

using options SOSEND_NO_LOAN 'fixes' it.
seems many MIPS configs have this option.

what is a good way to figure out what is wrong?

thanks

coypu | 19 Jun 15:16 2016

SOSEND_LOAN trouble in MIPS

Hi,

in emulating pmax with gxemul I ran into an issue that:
  cat somefile | command

does not work if somefile is bigger than 4096 bytes (but does for
smaller, or if I use 'command somefile' instead).

using 'options SOSEND_NO_LOAN' in the kernel config 'fixes' it.
what will be a good way to figure out what is wrong?

also, many MIPS configs have it, would it make sense to just enable it
for all the ones that don't?

thanks

Anindya Mukherjee | 17 Jun 03:49 2016
Picon

Spinning down sata disk before poweroff

Hi,

I'm running NetBSD 7.0.1_PATCH (GENERIC) amd64 on a Dell laptop. Almost everything is working perfectly,
except the fact that every time I shutdown using the -p switch, the hard drive makes a loud click sound as the
system powers off. I checked the SMART status (atactl and smartctl) and after every poweroff the
Power_Off-Retract-Count parameter increases by 1.

I did some searching on the web and came across PR #21531 where this issue was discussed from 2003-2008 and
finally a patch was committed which resolved the issue by sending the STANDBY_IMMEDIATE command to the
disk before powering off. Since then the code has been refactored, but it is present in
src/sys/dev/ata/wd.c line 1970 (wd_shutdown) which calls line 1848 (wd_standby). This seemed strange
since the disk was definitely not being spun down.

I attached a remote gdb instance and stepped through the code during shutdown, breaking on
wd_flushcache() which is always called. The code path taken is wdclose()->wdlastclose() (lines 1029,
1014). I can see that the cache is flushed but then the device is deleted in line 1023. Subsequently, power
is cut off during shutdown, causing an emergency retract. So, it seems at least for newer sata disks the
spindown code is not being called. I'm fairly new to NetBSD code so there is a chance I read this wrong, so
feel free to correct me.

Ideally I'd like the disk to spin down during poweroff (-p) and halt (-h), perhaps settable using a sysctl,
but not during a reboot (-r). I am planning to patch wdlastclose() as an experiment to run the spindown code
to see if it stops the click. Is this a known issue, worthy of a PR? I can file one. I can also volunteer a patch
once I have tested it on my laptop. Comments welcome!

Anindya

Alexander Nasonov | 16 Jun 00:49 2016
Picon

dump to cgdNb device

Hi,

I setup an encrypted disk cgd1 (aes-cbc 256 on top of wd0g, disklabel
verification) with a dump device cgd1b but I can't dump to it (I enter
ddb and type sync to dump). It prints "device bad".

If I enable CGDDEBUG and set cgddebug to 1, it prints some additional
information (typing from a photo):

dumping to dev 20,17 (offset=215879, size=3119109):getcgd_softc(0x1411): unit =
1
cgdsize(0x1411)
cgdopen(0x1411, 0)
getcgd_softc(0x1411): unit = 1
cgdclose(0x1411, 0)
getcgd_softc(0x1411): unit = 1

dump cgddump(0x1411, 215879, 0xffff8000d6c3e000, 4096)
getcgd_softc(0x1411, unit = 1
device bad

With more verbose debugging it can hang instead of printing "device
bad". For example, if I set cgddebug to 4 and mount cgd1e

dumping to dev 20,17 (offset=215879, size=3119109):getcgd_softc(0x1411): unit =
1
cgdsize(0x1411)
cgdopen(0x1411, 0)
getcgd_softc(0x1411): unit = 1
getcgd_softc(0x1413): unit = 1
(Continue reading)

Kengo NAKAHARA | 14 Jun 06:33 2016
Picon

separate L3 output KERNEL_LOCK

Hi,

Currently, L3 output processing call L2 (or L3 tunneling) output routine
holding KERNEL_LOCK. This KERNEL_LOCK is controled by NET_MPSAFE kernel
option. e.g.
    https://nxr.netbsd.org/xref/src/sys/netinet/ip_output.c#215

That is, if NET_MPSAFE is enabled at current implementation, L2 and device
driver output processing run without KERNEL_LOCK whether the L2 component
and the device driver is MP-safe or not.

To enable NET_MPSAFE safely in the future, it is required to separete the
KERNEL_LOCK to each L2 component and device drivers, and then enable
KERNEL_LOCK for each component which is not MP-safe yet.

The design to implement this concept is consisted of below two parts.
    1. at L3 output processing call L2 output routine
       - remove KERNEL_LOCK from caller(L3)
       - wrap callee(L2) ifp->if_output with KERNEL_LOCK if necessary
    2. at L2 output processing call device driver output routine
       - check device driver MP-safe flags (*)
       - if device driver has MP-safe flag, call without KERNEL_LOCK and
         vice versa

      (*) Unfortunately, struct ifnet->if_flags is already used all bit.
          So, I change if__pad1 to if_extflags and use it as flags.

Here is the patch series,
    http://www.netbsd.org/~knakahara/separate-l3-lock/separate-l3-lock.tgz
and unified patch.
(Continue reading)


Gmane