Maxime Villard | 22 May 12:31 2016

amd64: move proc0's stack

The location where proc0's stack is put on amd64 is highly bizarre: it is placed
between two page levels, which means that a stack overflow can overwrite a
page level. This patch moves the stack out of the BOOTSTRAP TABLES chunk, puts
it before L4, and maps it independently.

We can then easily apply a redzone on it, the same way the rest of the system
does under DIAGNOSTIC.

I'll commit this patch in a week, unless someone understands why the stack was
placed that way, or unless there's a mistake in the asm.

Index: locore.S
===================================================================
RCS file: /cvsroot/src/sys/arch/amd64/amd64/locore.S,v
retrieving revision 1.93
diff -u -r1.93 locore.S
--- locore.S	22 May 2016 10:11:55 -0000	1.93
+++ locore.S	22 May 2016 10:14:07 -0000
@@ -1,4 +1,4 @@
-/*	$NetBSD: locore.S,v 1.93 2016/05/22 10:11:55 maxv Exp $	*/
+/*	$NetBSD: locore.S,v 1.92 2016/05/15 07:17:53 maxv Exp $	*/

  /*
   * Copyright-o-rama!
@@ -203,12 +203,11 @@
  #endif

  #define PROC0_PML4_OFF	0
-#define PROC0_STK_OFF	(PROC0_PML4_OFF + 1 * PAGE_SIZE)
-#define PROC0_PTP3_OFF	(PROC0_STK_OFF + UPAGES * PAGE_SIZE)
[...]
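
To make the overflow concern concrete, here is a minimal sketch built only from the three pre-patch offsets visible in the removed lines of the diff. The value of UPAGES, the enum, and the helper proc0_region() are assumptions made for illustration; they are not part of locore.S. Since the stack grows downward on x86, the first byte below PROC0_STK_OFF is the last byte of the PML4 page.

```c
#include <stddef.h>

#define PAGE_SIZE	((size_t)4096)	/* amd64 base page size */
#define UPAGES		((size_t)3)	/* assumption: number of stack pages */

/* Pre-patch offsets, copied from the lines the diff removes. */
#define PROC0_PML4_OFF	((size_t)0)
#define PROC0_STK_OFF	(PROC0_PML4_OFF + 1 * PAGE_SIZE)
#define PROC0_PTP3_OFF	(PROC0_STK_OFF + UPAGES * PAGE_SIZE)

/* Hypothetical labels for the three regions of the old layout. */
enum proc0_region { REGION_PML4, REGION_STACK, REGION_PTP3 };

/* Hypothetical helper: report which region a byte offset falls into. */
static enum proc0_region
proc0_region(size_t off)
{
	if (off < PROC0_STK_OFF)
		return REGION_PML4;	/* below the stack: the L4 page */
	if (off < PROC0_PTP3_OFF)
		return REGION_STACK;	/* inside the stack pages */
	return REGION_PTP3;		/* above the stack: the L3 page(s) */
}
```

With these offsets, proc0_region(PROC0_STK_OFF - 1) is REGION_PML4, which is the situation the message describes: running off the bottom of the stack lands in a page-table page rather than in an unmapped redzone.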

Guilherme Salazar | 21 May 21:21 2016

[patch] remove luactl require

Hi,

[1] is a patch that removes the luactl command for requiring a module; as
Mark suggested, it should be up to the script being loaded to issue
'require' on modules.

[1] https://netbsd.org/~salazar/patches/bye_require.patch

Comments?

Guilherme Salazar | 21 May 00:27 2016

luactl & Lua standard libraries

Hi folks,

Kernel Lua states created with luactl(8) start out empty. There has
been some discussion about it.

It would be nice to add an option to luactl to allow loading of Lua
standard libraries -- by calling luaL_openlibs on the Lua state. I
have added an option `-s` to `luactl require` that does this. See [1].

To make it clear, the following creates an empty kernel Lua state named s0:

    luactl create s0

What I propose is:

    luactl create s0
    luactl -s require s0

or, equivalently,

    luactl -cs require s0

(-c option creates s0 if it doesn't exist and then -s calls luaL_openlibs on s0)

Comments?

[1] https://netbsd.org/~salazar/patches/luactl.patch

J. Hannken-Illjes | 20 May 16:38 2016

Vnode life cycle: add vnode state

1) Objects vnode and vcache_node always exist together so it makes sense
to merge them into one object.  This can be done in two ways:

- Add the current vcache_node elements (hash list, key and flag) to the
  vnode.
- Overlay the vnode into vcache_node.  This way we also get two views
  to vnodes, a public one (struct vnode) visible from file system
  implementations and a private one (struct vcache_node) visible from
  the VFS subsystem (sys/kern/vfs_*) only.

I prefer the second way; a diff is here:
  http://www.netbsd.org/~hannken/vstate/001_merge_vnode_and_vcache_node

2) The state of a vnode is currently determined by its flags
VI_MARKER, VI_CHANGING, VI_XLOCK and VI_CLEAN.  These flags are not
sufficient to describe the state a vnode is in.  It is for example
impossible to determine if a vnode is currently attaching to a file system.

I propose to replace these flags with a vnode state, one of:

- VN_MARKER: Used as a marker to traverse lists of vnodes, will never change
  or appear in a regular vnode operation.
- VN_LOADING: Vnode is currently attaching to its file system and loading
  or creating the inode.
- VN_ACTIVE: Vnode is fully initialised and usable.
- VN_BLOCKED: Vnode is active but cannot get new references through vget().
- VN_RECLAIMING: Vnode is in the process of detaching from its file system.
- VN_RECLAIMED: Vnode is (no longer) attached to its file system, it's dead.

and these operations to work on the vnode state:
[...]
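
The proposed states can be sketched as a C enum. The enumerator names come from the list above; the two helper functions and the transition table are assumptions made for illustration only, since the operations proposed in the message are cut off in this archive. In particular, the message only says that VN_BLOCKED refuses new references through vget(); treating VN_ACTIVE as the sole state that grants them is a guess.

```c
#include <stdbool.h>

/* Vnode states as named in the proposal. */
enum vnode_state {
	VN_MARKER,
	VN_LOADING,
	VN_ACTIVE,
	VN_BLOCKED,
	VN_RECLAIMING,
	VN_RECLAIMED
};

/* Hypothetical: may vget() hand out a new reference in this state? */
static bool
vstate_allows_vget(enum vnode_state s)
{
	return s == VN_ACTIVE;
}

/* Hypothetical transition table; not taken from the patch. */
static bool
vstate_change_ok(enum vnode_state from, enum vnode_state to)
{
	switch (from) {
	case VN_LOADING:	/* load succeeds or fails */
		return to == VN_ACTIVE || to == VN_RECLAIMED;
	case VN_ACTIVE:
		return to == VN_BLOCKED;
	case VN_BLOCKED:	/* unblock or start reclaiming */
		return to == VN_ACTIVE || to == VN_RECLAIMING;
	case VN_RECLAIMING:
		return to == VN_RECLAIMED;
	case VN_MARKER:		/* markers never change state */
	case VN_RECLAIMED:	/* dead vnodes stay dead */
		return false;
	}
	return false;
}
```

A single state variable makes situations such as "currently attaching to a file system" directly testable, which the flag combination VI_MARKER, VI_CHANGING, VI_XLOCK and VI_CLEAN cannot express.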

bch | 15 May 19:17 2016

Fwd: Re: Audio - In kernel audio mixing

Forgot to reply-all.

---------- Forwarded message ----------
From: "bch" <brad.harder <at> gmail.com>
Date: May 15, 2016 10:01 AM
Subject: Re: Audio - In kernel audio mixing
To: "Nathanial Sloss" <nat <at> netbsd.org>
Cc:


On May 15, 2016 8:29 AM, "Nathanial Sloss" <nat <at> netbsd.org> wrote:
>
> Hi,
>
> I've been working away at in kernel audio mixing for the past 6 weeks or so.
>
> I've made archives of two different approaches to in kernel audio mixing and
> made them available on ftp.NetBSD.org/pub/NetBSD/misc/nat
>
> The first is vaudio-kern.tgz  -  This will do in-kernel audio mixing if one
> creates a vaudio device for their sound card.
> My hope is that vaudio devices will replace the standard audio devices in
> future.
>
> The vaudio devices, say vaudio0, vsound0, vmixer0 and vaudioctl0, work just like
> their traditional counterparts, except that you can open a vaudio0/vsound0 device
> for reading and writing as many times as you like and the audio will be mixed.
>
> In practice I've found that on the RPI2 I was able to play 100 songs at once
> utilising about 20% of the CPU, and on my laptop 340 or so (15% CPU consumption);
> any more than that and blocks were delayed, giving a stuttering sound.
>
> I am aware of this problem and I can fix it for SMP systems but not UP as it
> would require creating additional kernel threads to aid in mixing.
>
> The second is audio-alt.tgz - This is a set of changes to audio.c that allow for
> in-kernel mixing.
>
> However, I was only able to play about 60 songs on my laptop before my computer
> froze, because the mixing is done in audio_pint, called from the sound card's
> interrupt handler, and the mixing of channels could not be done fast enough for
> more than 60 channels, resulting in a massive slowdown.
>
> The first approach, vaudio, introduces a little additional latency, 10-20ms or so,
> and all streams are converted to 16 bit SLINEAR 44100 Hz.

Is that an automatic, out-of-the-box 10-20ms for any and all streams? 2ms on a single channel in a duplex stream is obviously noticeable (as reference for what delay a human can perceive), so 20ms could be considered relatively large. I don't know what a person would accept as far as audio/video discrepancy, but have you tried watching a video clip? What was your feeling of the quality?

> Vaudio utilizes the traditional audio device so for those who want precision
> greater than 16 bits, the traditional audio device still exists.
>
> The second approach, audio-alt, introduces no additional latency, while still
> converting streams to 16 bit SLINEAR.
>
> I believe that the vaudio approach is better and wanted to start a discussion
> about in-kernel mixing and, hopefully, which approach (if any) should be
> included in NetBSD in future.
>
> Best regards,
>
> Nat.

Nathanial Sloss | 15 May 17:32 2016

Audio - In kernel audio mixing

Hi,

I've been working away at in kernel audio mixing for the past 6 weeks or so.

I've made archives of two different approaches to in kernel audio mixing and 
made them available on ftp.NetBSD.org/pub/NetBSD/misc/nat

The first is vaudio-kern.tgz  -  This will do in-kernel audio mixing if one 
creates a vaudio device for their sound card.  
My hope is that vaudio devices will replace the standard audio devices in 
future.

The vaudio devices, say vaudio0, vsound0, vmixer0 and vaudioctl0, work just like
their traditional counterparts, except that you can open a vaudio0/vsound0 device
for reading and writing as many times as you like and the audio will be mixed.

In practice I've found that on the RPI2 I was able to play 100 songs at once
utilising about 20% of the CPU, and on my laptop 340 or so (15% CPU consumption);
any more than that and blocks were delayed, giving a stuttering sound.

I am aware of this problem and I can fix it for SMP systems but not UP as it 
would require creating additional kernel threads to aid in mixing.

The second is audio-alt.tgz - This is a set of changes to audio.c that allow for
in-kernel mixing.

However, I was only able to play about 60 songs on my laptop before my computer
froze, because the mixing is done in audio_pint, called from the sound card's
interrupt handler, and the mixing of channels could not be done fast enough for
more than 60 channels, resulting in a massive slowdown.

The first approach, vaudio, introduces a little additional latency, 10-20ms or so,
and all streams are converted to 16 bit SLINEAR 44100 Hz.

Vaudio utilizes the traditional audio device so for those who want precision 
greater than 16 bits, the traditional audio device still exists.

The second approach, audio-alt, introduces no additional latency, while still
converting streams to 16 bit SLINEAR.

I believe that the vaudio approach is better and wanted to start a discussion
about in-kernel mixing and, hopefully, which approach (if any) should be
included in NetBSD in future.

Best regards,

Nat.
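
For readers unfamiliar with what mixing 16 bit SLINEAR streams involves, the following is a minimal sketch of one common technique: sum the samples in a wider accumulator, then clip to the 16-bit range. It is an illustration only and is not taken from vaudio-kern.tgz or audio-alt.tgz, whose actual mixing code may scale or clip differently. The function names clip16() and mix_slinear16() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp a 32-bit sum to the signed 16-bit sample range. */
static int16_t
clip16(int32_t v)
{
	if (v > INT16_MAX)
		return INT16_MAX;
	if (v < INT16_MIN)
		return INT16_MIN;
	return (int16_t)v;
}

/*
 * Mix nstreams buffers of nsamples signed 16-bit samples each into out.
 * All streams are assumed to share one format and rate already, as the
 * message says happens for vaudio (16 bit SLINEAR, 44100 Hz).
 */
static void
mix_slinear16(const int16_t *const *streams, size_t nstreams,
    size_t nsamples, int16_t *out)
{
	assert(streams != NULL && out != NULL);

	for (size_t i = 0; i < nsamples; i++) {
		int32_t acc = 0;	/* wide enough for tens of thousands of streams */

		for (size_t s = 0; s < nstreams; s++)
			acc += streams[s][i];
		out[i] = clip16(acc);
	}
}
```

The inner loop runs once per open stream for every sample, so the cost grows linearly with the number of streams. That is consistent with the report above that doing this work from the interrupt handler stops scaling at some channel count.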

Maxime Villard | 14 May 11:56 2016

Improvements in i386

Now i386 benefits from the same improvements as amd64. There are two
differences though:
  - on non-PAE i386, NOX does not exist. Therefore the mappings all have an
    additional X permission. To benefit from X-less mappings, your CPU must
    support PAE, and your kernel must be GENERIC_PAE.
  - the segments are not large-page-aligned, which means that probably some
    parts of the segments are still mapped with normal pages. It is still more
    optimized than it used to be, but not as much as amd64 is.
The asm code is similar now, and each change made in amd64 can easily be made
in i386 as well.

The commits are here, if you spot any mistakes.

[1] http://mail-index.netbsd.org/source-changes/2016/05/14/msg074672.html
[2] http://mail-index.netbsd.org/source-changes/2016/05/14/msg074673.html
[3] http://mail-index.netbsd.org/source-changes/2016/05/14/msg074674.html
[4] http://mail-index.netbsd.org/source-changes/2016/05/14/msg074675.html
[5] http://mail-index.netbsd.org/source-changes/2016/05/14/msg074677.html
[6] http://mail-index.netbsd.org/source-changes/2016/05/14/msg074678.html

Maxime Villard | 13 May 12:53 2016

Improvements in amd64

I've committed several improvements in amd64 over the last few days.

In chronological order, for the record:
  - I cleaned up the asm code and fixed several comments, which makes the
    boot process much easier to understand.
  - I fixed the alignment for the text segment, so that it can be covered by
    more large pages [1] - thereby reducing TLB contention.
  - I fixed a bug in the way the secondary CPUs are launched [2], which
    caused them to crash if they tried to access an X-less page.
  - I took rodata out of the text+rodata chunk, and put it in the data+bss+
    PRELOADED_MODULES+BOOTSTRAP_TABLES chunk [3]. rodata was no longer large
    page optimized, and had RWX permissions.
  - I took rodata back out of the rodata+data+bss+PRELOADED_MODULES+
    BOOTSTRAP_TABLES chunk, and made the kernel map it independently without
    the W permission [4].
  - I made the kernel map rodata without the X permission, by using the NOX
    bit on its pages [5] (now that the secondary CPUs could handle that
    properly).
  - I took the data+bss chunk out of the data+bss+PRELOADED_MODULES+
    BOOTSTRAP_TABLES chunk, and made the kernel map it independently without
    X permission [6].
  - I made the kernel remap rodata and data+bss with large pages and proper
    permissions [7] - which once again reduces TLB contention.
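
As a rough illustration of the two ideas in this list (per-segment permissions and large-page coverage), here is a minimal sketch. The bit positions are the architectural x86-64 page-table entry bits and the large-page size is the 2 MB used at the L2 level; the macro and function names are hypothetical and are not the ones used in the NetBSD sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* x86-64 page-table entry bits (architectural positions). */
#define PTE_P	(UINT64_C(1) << 0)	/* present */
#define PTE_W	(UINT64_C(1) << 1)	/* writable */
#define PTE_PS	(UINT64_C(1) << 7)	/* large page, when set in an L2 entry */
#define PTE_NX	(UINT64_C(1) << 63)	/* no-execute */

#define LARGE_PAGE	(UINT64_C(2) << 20)	/* 2 MB */

/* Hypothetical helper: protection bits for a kernel segment. */
static uint64_t
pte_prot(bool writable, bool executable)
{
	uint64_t bits = PTE_P;

	if (writable)
		bits |= PTE_W;
	if (!executable)
		bits |= PTE_NX;
	return bits;
}

/* Hypothetical helper: whole 2 MB pages that fit inside [start, end). */
static uint64_t
large_pages_in(uint64_t start, uint64_t end)
{
	assert(start <= end);

	uint64_t first = (start + LARGE_PAGE - 1) & ~(LARGE_PAGE - 1);	/* round up */
	uint64_t last = end & ~(LARGE_PAGE - 1);			/* round down */

	return last > first ? (last - first) / LARGE_PAGE : 0;
}
```

Under this model text would use pte_prot(false, true), rodata pte_prot(false, false) and data+bss pte_prot(true, false). The second helper shows why alignment matters: a 4 MB segment starting on a 2 MB boundary is covered by two large pages, while the same segment shifted by 4 KB contains only one.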

Now, the way the kernel image is mapped is more flexible, more secure and
more performant. There are still several things to fix, and the same
procedure needs to be applied in i386.

If you have questions, comments, if you think something is wrong, or if you
spot any mistakes, feel free to tell me about it. Also, if someone could
review the last change, that would be nice.

[1] http://mail-index.netbsd.org/source-changes/2016/05/07/msg074501.html
[2] http://mail-index.netbsd.org/source-changes/2016/05/11/msg074613.html
[3] http://mail-index.netbsd.org/source-changes/2016/05/12/msg074625.html
[4] http://mail-index.netbsd.org/source-changes/2016/05/12/msg074627.html
[5] http://mail-index.netbsd.org/source-changes/2016/05/12/msg074628.html
[6] http://mail-index.netbsd.org/source-changes/2016/05/12/msg074632.html
[7] http://mail-index.netbsd.org/source-changes/2016/05/13/msg074650.html

coypu | 11 May 13:30 2016

Re: WAPBL not locking enough?

How about this change instead?
Still going over the rest of the function calls of wapbl_flush.

From d51a64a5e9a15169949b8f1442c79060c157536d Mon Sep 17 00:00:00 2001
From: coypu <coypu <at> sdf.org>
Date: Wed, 11 May 2016 13:08:06 +0300
Subject: [PATCH 1/2] assert wl_mtx held in wapbl_transaction_len

---
 sys/kern/vfs_wapbl.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/sys/kern/vfs_wapbl.c b/sys/kern/vfs_wapbl.c
index e63f228..e317daf 100644
--- a/sys/kern/vfs_wapbl.c
+++ b/sys/kern/vfs_wapbl.c
@@ -2098,6 +2098,9 @@ wapbl_transaction_len(struct wapbl *wl)
            sizeof(((struct wapbl_wc_blocklist *)0)->wc_blocks[0]);

        KASSERT(bph > 0);
+#ifdef WAPBL_DEBUG /* XXX: get rid of this eventually */
+       KASSERT(mutex_owned(&wl->wl_mtx));
+#endif

        len = wl->wl_bcount;
        len += howmany(wl->wl_bufcount, bph) * blocklen;
-- 
2.8.1

From fa8f55b0373747697f81c4fedc36deb93ad2e51d Mon Sep 17 00:00:00 2001
From: coypu <coypu <at> sdf.org>
Date: Wed, 11 May 2016 14:27:46 +0300
Subject: [PATCH 2/2] hold wl_mtx for wapbl_truncate and wl_bufcount

---
 sys/kern/vfs_wapbl.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/sys/kern/vfs_wapbl.c b/sys/kern/vfs_wapbl.c
index e317daf..3d2fd4c 100644
--- a/sys/kern/vfs_wapbl.c
+++ b/sys/kern/vfs_wapbl.c
@@ -1592,6 +1592,9 @@ wapbl_flush(struct wapbl *wl, int waitfor)
         * file system didn't produce any blocks as a consequence of
         * it, but the same does not seem to be so of wl_inohashcnt.
         */
+
+       mutex_enter(&wl->wl_mtx); /* protect bufcount, truncate call */
+
        if (wl->wl_bufcount == 0) {
                goto wait_out;
        }
@@ -1624,6 +1627,9 @@ wapbl_flush(struct wapbl *wl, int waitfor)
        }

        error = wapbl_truncate(wl, flushsize);
+
+       mutex_exit(&wl->wl_mtx);
+
        if (error)
                goto out;

@@ -1753,6 +1759,7 @@ wapbl_flush(struct wapbl *wl, int waitfor)
                error = wapbl_truncate(wl, wl->wl_circ_size - 
                        wl->wl_reserved_bytes);
        }
+       mutex_exit(&wl->wl_mtx);

  out:
        if (error) {
-- 
2.8.1
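
The pattern in the first patch, asserting that the caller holds the lock before reading fields it protects, can be sketched in user space as follows. This is a simplified model, not kernel code: struct wl_sketch, sk_enter(), sk_exit() and sk_transaction_len() are hypothetical stand-ins, the boolean replaces mutex_owned(&wl->wl_mtx), and only the two length-computation lines visible in the diff context are reproduced, so the real wapbl_transaction_len() may add further terms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#ifndef howmany
#define howmany(x, y)	(((x) + ((y) - 1)) / (y))	/* round-up division */
#endif

/* Hypothetical stand-in for the journal fields guarded by wl_mtx. */
struct wl_sketch {
	bool	mtx_held;	/* models mutex_owned(&wl->wl_mtx) */
	size_t	wl_bcount;	/* bytes of buffered data */
	size_t	wl_bufcount;	/* number of buffers */
};

static void
sk_enter(struct wl_sketch *wl)
{
	assert(!wl->mtx_held);	/* no recursive acquisition */
	wl->mtx_held = true;
}

static void
sk_exit(struct wl_sketch *wl)
{
	assert(wl->mtx_held);
	wl->mtx_held = false;
}

/* bph: blocks per header block; blocklen: size of one header block. */
static size_t
sk_transaction_len(const struct wl_sketch *wl, size_t bph, size_t blocklen)
{
	size_t len;

	assert(bph > 0);
	assert(wl->mtx_held);	/* the check the first patch adds */

	len = wl->wl_bcount;
	len += howmany(wl->wl_bufcount, bph) * blocklen;
	return len;
}
```

With wl_bcount = 4096, wl_bufcount = 10, bph = 4 and blocklen = 512, the result is 4096 + 3 * 512 = 5632. When a lock is taken early in a function, as the second patch does in wapbl_flush(), every exit path reached via goto has to account for the lock's state, which is worth checking when reviewing the hunks above.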

Kimihiro Nonaka | 11 May 05:21 2016

device-major question

Hi,

I found that hdmicec and MI com use the same device-major, "char 260".
Is that acceptable?

----- sys/conf/majors
device-major hdmicec   char 260            hdmicec
-----

----- sys/conf/majors.tty
device-major    com             char 260                com
-----

Regards,
-- 
NONAKA Kimihiro

