Simon J. Gerraty | 4 Jun 17:24 2010

Re: random mouse?

Just closing the loop on this.

Turns out the problem was specific to a Microsoft Wheel mouse.
Replacing the mouse with a Logitech wheel mouse resolved the problem.

--sjg

Matt Fleming | 6 Jun 12:54 2010

Re: [PATCH] Warn on unhandled interrupts

On Sat, 29 May 2010 10:03:53 +0100, Matt Fleming <matt <at> console-pimps.org> wrote:
> Hi,
> 
> while chasing a spurious interrupt bug I realised that the kernel
> doesn't warn about unhandled interrupts on amd64. In fact, it ignores
> the return value of the interrupt handlers completely. I wrote the patch
> below so that the kernel will complain when an interrupt goes
> unhandled. The warning is rate limited to once every 5 seconds and it
> will only complain at most 10 times for any given slot.
> 
> Be aware that I haven't done much x86-64 assembly programming and so
> there could be some issues with the vector.S patch. For example, I've no
> idea if the kernel expects a particular alignment for the stack.
> 
> The patch is also available from,
> http://www.netbsd.org/~mjf/patches/amd64-spurious-interrupt.patch
> 
> Comments?

I've updated the patch at the above url because I noticed that it didn't
pop the 'handled' variable off the stack when branching to label
7. Hubert Feyrer also had a good idea of #define'ing the number of times
that the patch warns.

Does anyone have a problem with me committing this to HEAD? A couple of
people have said that it would be useful.
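
As a rough illustration of the rate limiting described in the quoted
message (a warning at most once every 5 seconds, and at most 10 times
per slot), here is a minimal userland sketch in C. It is not taken from
the patch: the names should_warn, warn_state, NSLOTS, WARN_MAX and
WARN_INTERVAL_SEC are hypothetical, and it assumes the 5-second limit
is tracked per slot rather than globally.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSLOTS            256  /* hypothetical number of interrupt slots */
#define WARN_MAX          10   /* at most 10 warnings per slot */
#define WARN_INTERVAL_SEC 5    /* at most one warning every 5 seconds */

struct warn_state {
	uint64_t last_warn;  /* time of the last warning, in seconds */
	unsigned count;      /* warnings issued so far for this slot */
};

static struct warn_state warn_state[NSLOTS];

/*
 * Return true if an "unhandled interrupt" warning for `slot` should be
 * printed at time `now` (seconds), and record that it was printed.
 */
static bool
should_warn(unsigned slot, uint64_t now)
{
	struct warn_state *ws;

	assert(slot < NSLOTS);
	ws = &warn_state[slot];

	if (ws->count >= WARN_MAX)
		return false;  /* this slot has used up its warnings */
	if (ws->count > 0 && now - ws->last_warn < WARN_INTERVAL_SEC)
		return false;  /* too soon after the previous warning */

	ws->last_warn = now;
	ws->count++;
	return true;
}
```

Keeping the limit in a #define, as suggested above, means the cap can
be changed in one place.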

Jean-Yves Migeon | 10 Jun 01:43 2010

[PAE support] Initial patch review

Dear all,

Here is a patch [1] that "ports" PAE support from Xen to GENERIC.

Currently, the patch triggers a double fault some time after boot with 
dom0 PAE under Xen. As such, it is not yet ready for commit, but I 
consider it mature enough to ask for an initial review.

It diverges quite a bit from the original patch from Jeremy. Reason is 
to avoid too many #ifdef's between Xen and non-Xen pmap (all the 
delightful details are in the comments inside the patch - in short, Xen 
tracks reference counts to L3 kernel page + we cannot use recursive 
mappings easily). It consumes 4kB for each CPU attached, instead of each 
process created.

Here is a summary of what the patch does; any advice on it will be 
appreciated. I tested it under QEMU; unfortunately, I can't stress test 
MP code, as I am not confident in the -smp flag's capability on a 
single-core host.

- principle: the PTP_LEVELS remains at 2. The L3 page is a page 
allocated per-CPU (below the 4GB boundary, due to %cr3 size limitation), 
and the kernel keeps track of the allocation through 2 additional struct 
cpu_info elements:
   - ci_l3_pdir for the virtual address of the PD,
   - ci_l3_pdirpa for its PA counterpart

- context switch with PAE is a matter of editing the 4 L3 entries, not 
changing %cr3 value (this is the non-PAE situation).
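
The context switch described in the list above can be sketched in
userland C as follows. This is only an illustration under stated
assumptions, not code from the patch: struct cpu_info_sketch,
struct pmap_sketch, pm_l3 and pae_switch_sketch are hypothetical names
(only ci_l3_pdir comes from the summary), and the TLB flush a real
kernel would need afterwards is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAE_L3_ENTRIES 4  /* a PAE L3 table holds four 64-bit entries */

typedef uint64_t pd_entry_t;

/* Hypothetical stand-in for the per-CPU state. */
struct cpu_info_sketch {
	pd_entry_t *ci_l3_pdir;  /* VA of this CPU's L3 page */
};

/* Hypothetical stand-in for the per-address-space state. */
struct pmap_sketch {
	pd_entry_t pm_l3[PAE_L3_ENTRIES];  /* L3 entries of this pmap */
};

/*
 * Switch address space by rewriting the four entries of the per-CPU
 * L3 page; the page itself (and thus the value in %cr3) stays put.
 */
static void
pae_switch_sketch(struct cpu_info_sketch *ci, const struct pmap_sketch *pm)
{
	assert(ci->ci_l3_pdir != NULL);

	for (int i = 0; i < PAE_L3_ENTRIES; i++)
		ci->ci_l3_pdir[i] = pm->pm_l3[i];
	/* A real kernel would flush the TLB here; omitted in this sketch. */
}
```

The point of the design is that the L3 page is owned by the CPU rather
than by the process, which is why the cost is 4kB per CPU attached.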


Mindaugas Rasiukevicius | 10 Jun 02:45 2010

Re: [PAE support] Initial patch review

Hello,

Jean-Yves Migeon <jeanyves.migeon <at> free.fr> wrote:
> 
> It diverges quite a bit from the original patch from Jeremy. Reason is 
> to avoid too many #ifdef's between Xen and non-Xen pmap <...>

Getting rid of #ifdefs, especially in pmap, is a right direction we
should continue moving in time.

> XXX misses support for ephemeral mapping. I am not sure on how to 
> implement it correctly, so pmap_load() uses tlbflush(). Same goes for 
> port-xen. I can revisit it later, after merging this patch + xen-suspend.

Which part of support?  It is same entering/removal of unmanaged mappings,
except without TLB flush (and IPIs).  These are something to track with
generation numbers, normally in such way:

	u_int gen = uvm_emap_gen_return();
	tlbflush();
	uvm_emap_update(gen);

> Any comments on the modifications there? I would prefer to have all 
> important changes "approved" before starting regression testing.

Do you have any data whether/how these changes affect performance?

I see some clean-ups and style changes in the patch - help (as well as
reviews and testing) on syncing of uvmplock branch would be useful, since
some parts of pmap are changed/split there.

Jean-Yves Migeon | 10 Jun 10:54 2010

Re: [PAE support] Initial patch review


On Thu, 10 Jun 2010 01:45:38 +0100, Mindaugas Rasiukevicius
<rmind <at> netbsd.org> wrote:
>> It diverges quite a bit from the original patch from Jeremy. Reason is 
>> to avoid too many #ifdef's between Xen and non-Xen pmap <...>
> 
> Getting rid of #ifdefs, especially in pmap, is a right direction we
> should continue moving in time.

I took some time to avoid too many #ifdef's here; but I am still not
100% satisfied, my eyes burn each time I read through pmap_load().

>> XXX misses support for ephemeral mapping. I am not sure on how to 
>> implement it correctly, so pmap_load() uses tlbflush(). Same goes for 
>> port-xen. I can revisit it later, after merging this patch + xen-suspend.
> 
> Which part of support?  It is same entering/removal of unmanaged mappings,
> except without TLB flush (and IPIs).  These are something to track with
> generation numbers, normally in such way:
> 
> 	u_int gen = uvm_emap_gen_return();
> 	tlbflush();
> 	uvm_emap_update(gen);

Cool then, I thought it would be trickier to use due to %cr3 handling of
PAE. emap support is pretty much straightforward in that case.


Jonathan A. Kollasch | 15 Jun 16:02 2010

x86 cpu_rootconf, raidframe, wedges

Hi,

I'm wanting to have alternate raidframe root partitions on my drive.
For example; raid0 on wd[01]a, raid1 on wd[01]e.  I want to be able
to tell boot(8) `dev hd[01][ae]:` and have the correct raid[01]a
device automatically chosen for root.  I also want this to work when
wd[01][ae] are accessed via dk(4).

Currently, rf_autoconfig() happens before the first cpu_rootconf()
call. This means that the 'booted_wedge' hasn't been determined yet.

When rf_buildroothack() calls cpu_rootconf() the raid(4) driver
has already exclusively-opened the underlying block device.
cpu_rootconf(), specifically match_bootwedge(), will try to correlate
the booted wedge with a device.  To do this, it needs to read the
underlying block device, it tries and fails.  Thus booted_wedge isn't
known and rf_buildroothack() can't determine what raid should be root.

I should note that I've locally improved the detection of booted raid
devices in the multiple root-eligible case.  I can get my scenario to
work in the raid-on-disklabel case, and in the raid-on-wedges case if
I call cpu_rootconf() at the beginning of rf_autoconfig().  But I have
a feeling calling cpu_rootconf() so many times is probably not right.

Comments?  Suggestions?

TBH, I can get away without the raid-on-wedges case working for now.
But, given the oncoming onslaught of >2TiB drives, this won't work
forever.


David Brownlee | 22 Jun 13:23 2010

Is anyone running NetBSD on a macbook pro?

I'm getting tired of fighting a windowing system which Knows What Is Right
in all circumstances, and actually crashes more often than my old Thinkpad
under NetBSD.
I'm already dual booting it into Windows for the occasional time I really
need a native Windows box (Parallels suffices for most other windows usage),
but wondered if anyone has setup a mac to be able to boot into OS X, Windows
or NetBSD?

Paul Goyette | 23 Jun 19:42 2010

ACPI misbehavior? or bad ioapic?

I'm still trying to track down what's happening on my new machine, and I 
think it has something to do with the fact that ioapic #2 is not where 
ACPI says it is.

(The "acpidump -dt | gzip" is attached...)  ACPI says that IOAPIC 2 
lives at 0xfec30000.  Yet, when the ioapic_attach() code tries to check 
it, the read_vers_and_size gets 0xffffffff - basically saying that it 
failed to read the apic's register, even though the address block was 
just mapped a few lines earlier.  And since we determine that ID #15 is 
not the desired ID #2, we attempt to re-write the ID register, which 
fails.

I did a quick check, and FreeBSD actually has code that "punts" if the 
value read from the register comes back as 0xffffffff, while OpenBSD 
seems to ignore it the same as NetBSD.  I haven't verified myself, but 
the retailer of this system claims that it worked "just fine" under 
CentOS (RedHat Linux?).

I don't know much about the x86 hardware architecture, so I'm at a loss 
on how to proceed with this problem.  The only devices known to live on 
this ioapic are the two GigE LAN ports, but it's difficult to imagine
having the great 12-Core Opteron server being disconnected from 
everything else.  :)

Any hints, clues, suggestions, or even wild-* guesses would be welcomed.

-------------------------------------------------------------------------
| Paul Goyette     | PGP Key fingerprint:     | E-mail addresses:       |
| Customer Service | FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com    |
| Network Engineer | 0786 F758 55DE 53BA 7731 | pgoyette at juniper.net |

Christoph Egger | 24 Jun 00:00 2010

Re: ACPI misbehavior? or bad ioapic?

On 23.06.10 19:42, Paul Goyette wrote:
> I'm still trying to track down what's happening on my new machine, and I
> think it has something to do with the fact that ioapic #2 is not where
> ACPI says it is.
> 
> (The "acpidump -dt | gzip" is attached...)  ACPI says that IOAPIC 2
> lives at 0xfec30000.  Yet, when the ioapic_attach() code tries to check
> it, the read_vers_and_size gets 0xffffffff

This value means 'invalid' in the ACPI world.

Christoph

Paul Goyette | 24 Jun 00:29 2010

Re: ACPI misbehavior? or bad ioapic?

On Thu, 24 Jun 2010, Christoph Egger wrote:

> On 23.06.10 19:42, Paul Goyette wrote:
>> I'm still trying to track down what's happening on my new machine, and I
>> think it has something to do with the fact that ioapic #2 is not where
>> ACPI says it is.
>>
>> (The "acpidump -dt | gzip" is attached...)  ACPI says that IOAPIC 2
>> lives at 0xfec30000.  Yet, when the ioapic_attach() code tries to check
>> it, the read_vers_and_size gets 0xffffffff
>
> This value means 'invalid' in the ACPI world.

Yeah, I got that part figured out.  But what to do about it?  All the 
PCI-E slots and the two on-board LAN interfaces appear to want to "use" 
one of the pins on ioapic2.

 	{116} dmesg | grep ioapic
 	ioapic0 at mainbus0 apid 0: pa 0xfec00000, version 21, 24 pins
 	ioapic1 at mainbus0 apid 1: pa 0xfec20000, version 21, 32 pins
 	ioapic2 at mainbus0 apid 2: pa 0xfec30000, version ff, 256 pins
 	ioapic2: misconfigured as apic 15
 	ioapic2: can't remap to apid 2
 	wm0: interrupting at ioapic2 pin 16
 	wm1: interrupting at ioapic2 pin 17
 	ahcisata0: interrupting at ioapic0 pin 22
 	ohci0: interrupting at ioapic0 pin 16
 	ohci1: interrupting at ioapic0 pin 16
 	ehci0: interrupting at ioapic0 pin 17
 	ohci2: interrupting at ioapic0 pin 18
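
The ioapic2 lines in the dmesg output above are what one would expect
if every register read returns 0xffffffff. The sketch below decodes
the standard I/O APIC ID and version register fields (APIC ID in bits
27:24 of the ID register; version in bits 7:0 and maximum redirection
entry in bits 23:16 of the version register) and rejects an all-ones
read, in the spirit of the FreeBSD check mentioned earlier in the
thread. It is a userland illustration, not NetBSD or FreeBSD code;
struct ioapic_info and ioapic_decode are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct ioapic_info {
	unsigned id;       /* APIC ID: bits 27:24 of the ID register */
	unsigned version;  /* bits 7:0 of the version register */
	unsigned npins;    /* max redirection entry (bits 23:16) + 1 */
};

/*
 * Decode raw ID and version register values into `out`.
 * Returns false if the version register read back as all ones, which
 * usually means no device answered at the mapped address.
 */
static bool
ioapic_decode(uint32_t id_reg, uint32_t ver_reg, struct ioapic_info *out)
{
	assert(out != NULL);

	out->id = (id_reg >> 24) & 0x0f;
	out->version = ver_reg & 0xff;
	out->npins = ((ver_reg >> 16) & 0xff) + 1;

	return ver_reg != 0xffffffffU;
}
```

Fed 0xffffffff for both registers, this yields ID 15, version ff and
256 pins, matching the "misconfigured as apic 15" and "version ff, 256
pins" lines; a caller could skip attaching the device when the
function returns false.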

