Marcus Bointon | 1 Oct 2007 01:22

Dell PowerEdge 1850 CPU socket - mystery solved

After a bit of server disembowelment I finally have a definitive
answer: the PowerEdge 1850 and 1950 have completely different CPU
sockets. The 1850 uses a PGA socket with some unknown number of pins,
and the 1950 uses an LGA (no pins). Nowhere in the Dell server docs
have I found this documented! Hopefully this will help someone else
who is wondering about this.

Marcus
-- 
Marcus Bointon
Synchromedia Limited: Creators of http://www.smartmessages.net/
UK resellers of info <at> hand CRM solutions
marcus <at> synchromedia.co.uk | http://www.synchromedia.co.uk/

_______________________________________________
Linux-PowerEdge mailing list
Linux-PowerEdge <at> dell.com
http://lists.us.dell.com/mailman/listinfo/linux-poweredge
Please read the FAQ at http://lists.us.dell.com/faq

Marcus Bointon | 1 Oct 2007 01:25

No OMSA after BIOS upgrade?

I upgraded my 1850 from A01 to A06. It all seems to still work, but I  
noticed two things: no Dell logo on boot (it still shows the progress  
bar and key options in top right), and OMSA finds nothing at all to  
report on. It worked fine before the BIOS update. Any idea how to  
solve this? I'm running the sara.nl 5.2 package on Ubuntu 6.06.

Marcus
-- 
Marcus Bointon
Synchromedia Limited: Creators of http://www.smartmessages.net/
UK resellers of info <at> hand CRM solutions
marcus <at> synchromedia.co.uk | http://www.synchromedia.co.uk/


Kurt_Olsson | 1 Oct 2007 15:29

RE: Dell PowerEdge 1850 CPU socket - mystery solved

The 1850 uses the NetBurst architecture, not Core. Sockets 771/775 are only for Core (x) CPUs.

Key Features of the Nocona / Irwindale Processor:

Nocona processor utilizes 1MB L2 cache 
Irwindale processor utilizes 2MB L2 cache 
604-pin PGA package in a ZIF socket 
FSB address will clock at 400MHz, and data at 800MHz. 
No termination required for non-populated CPUs (must populate CPU socket
1 first) 
Peak Bandwidth at 6.4 GB/s 
64-byte Cache line size 
Data Inversion 
IOQ depth = 12 
Source-Synchronous Transfer (SST) 4x per bus clock
Max trans per processor 29 
RISC/CISC hybrid architecture 
AGTL+ external bus 
Compatible with existing x86 code base 
Optimized for 32-bit code 
MMX support 
Streaming SIMD Extensions 2 
64-bit Extensions
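
The 6.4 GB/s peak bandwidth figure in the list above follows from the data rate and the bus width. A minimal sketch of the arithmetic, assuming a 64-bit (8-byte) data bus and a 200 MHz base bus clock; neither number is stated in the post, and the clock is simply the 800 MHz data rate divided by the 4 transfers per clock:

```python
# Peak front-side-bus bandwidth from the figures in the list above.
# Assumptions (not stated in the post): 64-bit data bus, 200 MHz base clock.

BUS_CLOCK_HZ = 200_000_000   # assumed base clock: 800 MHz data rate / 4
TRANSFERS_PER_CLOCK = 4      # data is transferred 4x per bus clock
BUS_WIDTH_BYTES = 8          # assumed 64-bit data bus


def peak_bandwidth_bytes_per_s(bus_clock_hz: int,
                               transfers_per_clock: int,
                               width_bytes: int) -> int:
    """Return the theoretical peak transfer rate in bytes per second."""
    return bus_clock_hz * transfers_per_clock * width_bytes


peak = peak_bandwidth_bytes_per_s(BUS_CLOCK_HZ, TRANSFERS_PER_CLOCK, BUS_WIDTH_BYTES)
print(f"{peak / 1e9:.1f} GB/s")
```

This is a theoretical ceiling only; sustained throughput on a real system will be lower.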

-----Original Message-----
From: linux-poweredge-bounces <at> dell.com
[mailto:linux-poweredge-bounces <at> dell.com] On Behalf Of Marcus Bointon
Sent: Sunday, September 30, 2007 6:22 PM
To: linux-poweredge-Lists
Subject: Dell PowerEdge 1850 CPU socket - mystery solved

Patrick_Boyd | 1 Oct 2007 15:30

RE: patrol reads and megaraid_sas (WAS: [Fwd: What is the "Patrol readstart/stop" snmp alert ?])

Ok another thing I would suggest checking is the Patrol Read Rate. I'm
not sure where this is exposed on megacli but it should be set at <=
30%.

As for the SAS Backplane firmware, you can use a live CD of linux to
perform the flash. 

-----Original Message-----
From: Brian A. Seklecki [mailto:seklecki <at> collaborativefusion.com] 
Sent: Friday, September 28, 2007 4:42 PM
To: Boyd, Patrick
Cc: bseklecki <at> collaborativefusion.com; linux-poweredge-Lists; Terrey,
Nico
Subject: RE: patrol reads and megaraid_sas (WAS: [Fwd: What is the
"Patrol readstart/stop" snmp alert ?])

Here is his report:

http://lists.freebsd.org/pipermail/freebsd-hardware/2007-September/004676.html

We'll investigate with the FreeBSD driver developer.

There has been mention here of an SAS backplane firmware patch that can 
only be applied in Windows and RHEL.

~BAS

On Fri, 28 Sep 2007, Patrick_Boyd <at> Dell.com wrote:


Patrick_Boyd | 1 Oct 2007 15:37

RE: PowerEdge 860 SAS5/iR mptlinux driver crashing repeatedly

1) There is no caching on the SAS 5/iR controllers. Therefore it will always be write-through.

2) What version of the driver are you using?

 

From: linux-poweredge-bounces <at> dell.com [mailto:linux-poweredge-bounces <at> dell.com] On Behalf Of Jobe Bittman
Sent: Friday, September 28, 2007 11:30 PM
To: linux-poweredge-Lists
Subject: PowerEdge 860 SAS5/iR mptlinux driver crashing repeatedly

 

I am having issues with the PowerEdge 860 SAS5/iR controller. I am running CentOS 5 64-bit with the latest update kernel, 2.6.18-8.1.14.el5. I have two 72G drives striped. I started out using the Linux-supplied driver, but dmesg always showed that write-through caching was being used. After installing OMSA 5.2 from the Dell hw/sw repos, I discovered the Linux RAID driver was hanging and crashing when attempting to connect to the OMSA web interface. I reloaded the machine and tried installing the mptlinux driver from the Dell repo. It seemed to work great for the day. I even saw that write-back caching was working. But now I'm running into issues while running bonnie++ to benchmark my I/O. The errors in /var/log/messages are below. I didn't capture the error with the Linux driver, but it was very similar.

Has anyone run into this?


Sep 28 21:07:26 san1-test1 kernel: mptscsih: ioc0: attempting task abort! (sc=ffff810051291e40)
Sep 28 21:07:26 san1-test1 kernel: sd 0:1:0:0:
Sep 28 21:07:26 san1-test1 kernel:         command: Write(10): 2a 00 01 d4 1a 0a 00 01 40 00
Sep 28 21:07:26 san1-test1 kernel: mptscsih: ioc0: WARNING - TM Handler for type=1: IOC Not operational (0x40001600)!
Sep 28 21:07:26 san1-test1 kernel:  Issuing HardReset!!
Sep 28 21:07:26 san1-test1 kernel: mptbase: Initiating ioc0 recovery
Sep 28 21:07:26 san1-test1 kernel: mptbase: ioc0: WARNING - IOC is in FAULT state!!!
Sep 28 21:07:26 san1-test1 kernel:            FAULT code = 1600h
Sep 28 21:07:28 san1-test1 kernel: mptbase: ioc0: Recovered from IOC FAULT
Sep 28 21:07:42 san1-test1 kernel: mptscsih: ioc0: task abort: FAILED (sc=ffff810051291e40)
Sep 28 21:07:43 san1-test1 kernel: mptscsih: ioc0: attempting target reset! (sc=ffff810051291e40)
Sep 28 21:07:43 san1-test1 kernel: sd 0:1:0:0:
Sep 28 21:07:43 san1-test1 kernel:         command: Write(10): 2a 00 01 d4 1a 0a 00 01 40 00
Sep 28 21:07:45 san1-test1 kernel: mptscsih: ioc0: target reset: SUCCESS (sc=ffff810051291e40)
Sep 28 21:09:26 san1-test1 kernel: mptbase: Initiating ioc0 recovery
Sep 28 21:09:36 san1-test1 kernel: BUG: soft lockup detected on CPU#0!
Sep 28 21:09:36 san1-test1 kernel:
Sep 28 21:09:36 san1-test1 kernel: Call Trace:
Sep 28 21:09:36 san1-test1 kernel:  <IRQ>  [<ffffffff800b2c30>] softlockup_tick+0xdb/0xed
Sep 28 21:09:36 san1-test1 kernel:  [<ffffffff800933ec>] update_process_times+0x42/0x68
Sep 28 21:09:36 san1-test1 kernel:  [<ffffffff80073d61>] smp_local_timer_interrupt+0x23/0x47
Sep 28 21:09:36 san1-test1 kernel:  [<ffffffff80074423>] smp_apic_timer_interrupt+0x41/0x47
Sep 28 21:09:36 san1-test1 kernel:  [<ffffffff8005bcc2>] apic_timer_interrupt+0x66/0x6c
Sep 28 21:09:36 san1-test1 kernel:  <EOI>  [<ffffffff8000c4d2>] __delay+0x8/0x10
Sep 28 21:09:36 san1-test1 kernel:  [<ffffffff880c2e4d>] :mptbase:WaitForDoorbellInt+0x5b/0x86
Sep 28 21:09:36 san1-test1 kernel:  [<ffffffff880c3023>] :mptbase:mpt_handshake_req_reply_wait+0x138/0x296
Sep 28 21:09:36 san1-test1 kernel:  [<ffffffff8000c4d2>] __delay+0x8/0x10
Sep 28 21:11:00 san1-test1 kernel:  [<ffffffff880c39df>] :mptbase:SendIocInit+0x229/0x310
Sep 28 21:11:01 san1-test1 shutdown[12201]: shutting down for system reboot
Sep 28 21:11:17 san1-test1 kernel:  [<ffffffff880c33a7>] :mptbase:GetIocFacts+0x7e/0x2d6
Sep 28 21:12:07 san1-test1 init: Switching to runlevel: 6
Sep 28 21:12:35 san1-test1 kernel:  [<ffffffff880c459f>] :mptbase:MakeIocReady+0x635/0xa29
Sep 28 21:12:37 san1-test1 kernel:  [<ffffffff880c71f6>] :mptbase:mpt_do_ioc_recovery+0xf0d/0xf4d
Sep 28 21:12:38 san1-test1 kernel:  [<ffffffff80072a51>] smp_send_reschedule+0x4e/0x53
Sep 28 21:12:38 san1-test1 kernel:  [<ffffffff8013b1b2>] __next_cpu+0x19/0x28
Sep 28 21:12:39 san1-test1 kernel:  [<ffffffff800857cf>] find_busiest_group+0x20d/0x621
Sep 28 21:12:39 san1-test1 kernel:  [<ffffffff8006290e>] __kprobes_text_start+0xfe/0x230
Sep 28 21:12:39 san1-test1 kernel:  [<ffffffff800627d1>] __reacquire_kernel_lock+0x2c/0x45
Sep 28 21:12:39 san1-test1 shutdown[12243]: shutting down for system reboot
Sep 28 21:12:39 san1-test1 kernel:  [<ffffffff80060b5f>] thread_return+0xb7/0xea
Sep 28 21:12:40 san1-test1 kernel:  [<ffffffff880c72e7>] :mptbase:mpt_HardResetHandler+0xb1/0x109
Sep 28 21:12:40 san1-test1 kernel:  [<ffffffff88220df1>] :mptctl:mptctl_timeout_expired+0x1b4/0x1dc
Sep 28 21:12:41 san1-test1 kernel:  [<ffffffff800613bf>] schedule_timeout+0x92/0xad
Sep 28 21:12:41 san1-test1 kernel:  [<ffffffff80092e02>] process_timeout+0x0/0x5
Sep 28 21:12:41 san1-test1 kernel:  [<ffffffff882225ce>] :mptctl:mptctl_do_mpt_command+0x7b6/0x998
Sep 28 21:12:42 san1-test1 kernel:  [<ffffffff8009b681>] autoremove_wake_function+0x0/0x2e
Sep 28 21:12:42 san1-test1 kernel:  [<ffffffff882290cb>] :mptctl:compat_mpctl_ioctl+0x230/0x31f
Sep 28 21:12:42 san1-test1 kernel:  [<ffffffff8822903b>] :mptctl:compat_mpctl_ioctl+0x1a0/0x31f
Sep 28 21:12:42 san1-test1 kernel:  [<ffffffff800e8cb8>] compat_sys_ioctl+0xc5/0x2b1
Sep 28 21:12:42 san1-test1 kernel:  [<ffffffff8005f013>] sysenter_do_call+0x1b/0x67
Sep 28 21:12:58 san1-test1 kernel:
Sep 28 21:12:59 san1-test1 kernel: mptscsih: ioc0: attempting task abort! (sc=ffff8101005429c0)
Sep 28 21:12:59 san1-test1 kernel: sd 0:1:0:0:
Sep 28 21:12:59 san1-test1 kernel:         command: Write(10): 2a 00 02 2a df 8a 00 01 40 00
Sep 28 21:12:59 san1-test1 kernel: mptscsih: ioc0: WARNING - TM Handler for type=1: IOC Not operational (0x40001600)!
Sep 28 21:12:59 san1-test1 kernel:  Issuing HardReset!!
Sep 28 21:12:59 san1-test1 kernel: mptbase: Initiating ioc0 recovery
Sep 28 21:12:59 san1-test1 kernel: mptbase: ioc0: WARNING - IOC is in FAULT state!!!
Sep 28 21:13:00 san1-test1 kernel:            FAULT code = 1600h
Sep 28 21:13:00 san1-test1 kernel: mptbase: ioc0: Recovered from IOC FAULT
Sep 28 21:13:00 san1-test1 kernel: mptscsih: ioc0: task abort: FAILED (sc=ffff8101005429c0)
Sep 28 21:13:00 san1-test1 kernel: mptscsih: ioc0: attempting target reset! (sc=ffff8101005429c0)
Sep 28 21:13:00 san1-test1 kernel: sd 0:1:0:0:
Sep 28 21:13:01 san1-test1 kernel:         command: Write(10): 2a 00 02 2a df 8a 00 01 40 00
Sep 28 21:13:01 san1-test1 kernel: mptscsih: ioc0: target reset: SUCCESS (sc=ffff8101005429c0)
Sep 28 21:13:01 san1-test1 kernel: mptbase: Initiating ioc0 recovery
Sep 28 21:13:01 san1-test1 kernel: BUG: soft lockup detected on CPU#0!
Sep 28 21:13:01 san1-test1 kernel:
Sep 28 21:13:01 san1-test1 kernel: Call Trace:
Sep 28 21:13:01 san1-test1 kernel:  <IRQ>  [<ffffffff800b2c30>] softlockup_tick+0xdb/0xed
Sep 28 21:13:01 san1-test1 kernel:  [<ffffffff800933ec>] update_process_times+0x42/0x68
Sep 28 21:13:02 san1-test1 kernel:  [<ffffffff80073d61>] smp_local_timer_interrupt+0x23/0x47
Sep 28 21:13:02 san1-test1 kernel:  [<ffffffff80074423>] smp_apic_timer_interrupt+0x41/0x47
Sep 28 21:13:02 san1-test1 kernel:  [<ffffffff8005bcc2>] apic_timer_interrupt+0x66/0x6c
Sep 28 21:13:02 san1-test1 kernel:  <EOI>  [<ffffffff8000c4d2>] __delay+0x8/0x10
Sep 28 21:13:02 san1-test1 kernel:  [<ffffffff880c2e4d>] :mptbase:WaitForDoorbellInt+0x5b/0x86
Sep 28 21:13:02 san1-test1 kernel:  [<ffffffff880c3023>] :mptbase:mpt_handshake_req_reply_wait+0x138/0x296
Sep 28 21:13:02 san1-test1 kernel:  [<ffffffff8000c4d2>] __delay+0x8/0x10
Sep 28 21:13:03 san1-test1 kernel:  [<ffffffff880c39df>] :mptbase:SendIocInit+0x229/0x310
Sep 28 21:13:03 san1-test1 kernel:  [<ffffffff880c33a7>] :mptbase:GetIocFacts+0x7e/0x2d6
Sep 28 21:13:03 san1-test1 kernel:  [<ffffffff880c459f>] :mptbase:MakeIocReady+0x635/0xa29
Sep 28 21:13:03 san1-test1 kernel:  [<ffffffff880c71f6>] :mptbase:mpt_do_ioc_recovery+0xf0d/0xf4d
Sep 28 21:13:03 san1-test1 kernel:  [<ffffffff80072a51>] smp_send_reschedule+0x4e/0x53
Sep 28 21:13:03 san1-test1 kernel:  [<ffffffff8011735a>] avc_has_perm+0x43/0x55
Sep 28 21:13:03 san1-test1 kernel:  [<ffffffff80117a1b>] ipc_has_perm+0x59/0x67
Sep 28 21:13:04 san1-test1 kernel:  [<ffffffff8006290e>] __kprobes_text_start+0xfe/0x230
Sep 28 21:13:04 san1-test1 kernel:  [<ffffffff800862e7>] dequeue_task+0x18/0x37
Sep 28 21:13:04 san1-test1 kernel:  [<ffffffff800627d1>] __reacquire_kernel_lock+0x2c/0x45
Sep 28 21:13:04 san1-test1 kernel:  [<ffffffff80060b5f>] thread_return+0xb7/0xea
Sep 28 21:13:04 san1-test1 kernel:  [<ffffffff880c72e7>] :mptbase:mpt_HardResetHandler+0xb1/0x109
Sep 28 21:13:04 san1-test1 kernel:  [<ffffffff88220df1>] :mptctl:mptctl_timeout_expired+0x1b4/0x1dc
Sep 28 21:13:04 san1-test1 kernel:  [<ffffffff800613bf>] schedule_timeout+0x92/0xad
Sep 28 21:13:05 san1-test1 kernel:  [<ffffffff80092e02>] process_timeout+0x0/0x5
Sep 28 21:13:05 san1-test1 kernel:  [<ffffffff882225ce>] :mptctl:mptctl_do_mpt_command+0x7b6/0x998
Sep 28 21:13:05 san1-test1 kernel:  [<ffffffff8009b681>] autoremove_wake_function+0x0/0x2e
Sep 28 21:13:05 san1-test1 kernel:  [<ffffffff8002dd9c>] __wake_up+0x38/0x4f
Sep 28 21:13:05 san1-test1 kernel:  [<ffffffff882290cb>] :mptctl:compat_mpctl_ioctl+0x230/0x31f
Sep 28 21:13:05 san1-test1 kernel:  [<ffffffff8822903b>] :mptctl:compat_mpctl_ioctl+0x1a0/0x31f
Sep 28 21:13:05 san1-test1 kernel:  [<ffffffff800e8cb8>] compat_sys_ioctl+0xc5/0x2b1
Sep 28 21:13:05 san1-test1 kernel:  [<ffffffff8005f013>] sysenter_do_call+0x1b/0x67
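
For anyone comparing traces like the one above, a rough sketch of how the log could be summarized: it counts IOC fault codes and soft lockups, and decodes each Write(10) command block into a starting block address and block count. The sample lines are copied from the log; the helper names (`decode_write10`, `summarize`) are made up for illustration, and the byte layout used is the standard SCSI WRITE(10) layout (opcode 0x2A, address in bytes 2-5, length in bytes 7-8).

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Sample lines copied from the log above; normally read from /var/log/messages.
SAMPLE = """\
Sep 28 21:07:26 san1-test1 kernel:         command: Write(10): 2a 00 01 d4 1a 0a 00 01 40 00
Sep 28 21:07:26 san1-test1 kernel:            FAULT code = 1600h
Sep 28 21:09:36 san1-test1 kernel: BUG: soft lockup detected on CPU#0!
Sep 28 21:12:59 san1-test1 kernel:         command: Write(10): 2a 00 02 2a df 8a 00 01 40 00
Sep 28 21:13:00 san1-test1 kernel:            FAULT code = 1600h
Sep 28 21:13:01 san1-test1 kernel: BUG: soft lockup detected on CPU#0!
""".splitlines()

FAULT_RE = re.compile(r"FAULT code = ([0-9a-fA-F]+)h")
CDB_RE = re.compile(r"command: Write\(10\): ((?:[0-9a-f]{2} ?){10})")


def decode_write10(cdb_hex: str) -> Tuple[int, int]:
    """Decode a WRITE(10) command block into (start block, block count)."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a WRITE(10) command block")
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2-5: block address
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length
    return lba, blocks


def summarize(lines: Iterable[str]) -> Tuple[Counter, int, List[Tuple[int, int]]]:
    """Return (fault code counts, soft lockup count, decoded writes)."""
    faults: Counter = Counter()
    lockups = 0
    writes: List[Tuple[int, int]] = []
    for line in lines:
        fault = FAULT_RE.search(line)
        if fault:
            faults[fault.group(1)] += 1
        if "soft lockup detected" in line:
            lockups += 1
        cdb = CDB_RE.search(line)
        if cdb:
            writes.append(decode_write10(cdb.group(1)))
    return faults, lockups, writes


faults, lockups, writes = summarize(SAMPLE)
print(dict(faults), lockups, writes)
```

Both failing writes in the log are 0x0140 = 320 blocks long, which would be 160 KiB each if the volume uses 512-byte blocks (an assumption; the log does not state the block size).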

--
Jobe Bittman

Kuba Ober | 1 Oct 2007 16:43

Re: Dell SC1425 and large (750GB) SATA drive not seen

On Sunday 23 September 2007, Jim Nelson wrote:
> Jason Ede wrote:
> > We've just tried putting 2 x 500GB drives in a SC1425 on a SATA RAID and
> > neither Windows nor Linux seems to be able to do anything with them... We
> > can assign partitions, but once we try to format or copy files it bombs out.
> >
> > Jason
>
> I dunno - We've got 2 SC1425's with 500 GB drives in Linux RAID 1 - no
> problems out of either one of 'em.

That tells me that the onboard RAID is too dumb to deal with such large 
drives. Just use the large drives without any hardware RAID on them and 
you'll be OK.

Cheers, Kuba


Chuck Remes | 1 Oct 2007 16:44

Re: failed install of OMSA on RHEL4 Update 5 x86_64 using up2date


On Sep 27, 2007, at 2:59 PM, Michael E Brown wrote:

> On Wed, Sep 26, 2007 at 02:09:21PM -0500, Chuck Remes wrote:
>>
>> [snip]
>> I was wrong about $basearch not being replaced in the URL. That is
>> working correctly. Here is the output of the up2date command after I
>> clear the caches in /var/spool/up2date.
>>
>> [root <at> jm-orc-dev-1 rhn]# up2date -i srvadmin-all
>> http://linux.dell.com/repo/hardware/mirrors.pl?osname=el4&basearch=x86_64&repo_config=$repo_config&dellsysidpluginver=$dellsysidpluginver
>> using mirror: http://linux.dell.com/repo/hardware/latest/platform_independent/rh40_64
>>
>> Fetching Obsoletes list for channel: rhel-x86_64-as-4...
>> ########################################
>>
>> Fetching Obsoletes list for channel: dell-hw-indep-repository...
>>
>> Fetching obsoletes list for http://linux.dell.com/repo/hardware/mirrors.pl?osname=el4&basearch=x86_64&repo_config=$repo_config&dellsysidpluginver=$dellsysidpluginver&redirect=1&redir_path=...
>> ####################################
>> Fetching rpm headers...
>> ########################################
>>
>> Name                                    Version        Rel
>> ----------------------------------------------------------
>>
>>
>> The following packages you requested were not found:
>> srvadmin-all
>
> Hmmm.
>
> What was the output of the bootstrap.cgi? Which system model are you
> running? The getSystemId should have been installed when it tried to
> install the dell-hw-specific-repository RPM (which you don't appear to
> have installed; that would be a bootstrap.cgi problem).
>
> So...
>
> Can you try running:
>   up2date -i libsmbios-bin
>
> and if that works, try running:
>   up2date -i dell-hw-specific-repository
>
> The exact output of the bootstrap.cgi would also be useful.

Michael,

I am running a PowerEdge 2950 with RHEL 4 Update 5. You can look it up
by service tag 7STXHD1.

Here's the exact output from bootstrap.cgi.

# wget -q -O - http://linux.dell.com/repo/hardware/bootstrap.cgi | bash
Downloading GPG key: http://linux.dell.com//repo/hardware/RPM-GPG-KEY-dell
     Importing key into RPM.
Downloading GPG key: http://linux.dell.com//repo/hardware/RPM-GPG-KEY-libsmbios
     Importing key into RPM.
Installing platform-independent RPM: dell-hw-indep-repository-1-14.el4.noarch.rpm

Installing platform-specific repository RPM.
http://linux.dell.com/repo/hardware/mirrors.pl?osname=el4&basearch=x86_64&repo_config=$repo_config&dellsysidpluginver=$dellsysidpluginver
using mirror: http://linux.dell.com/repo/hardware/latest/platform_independent/rh40_64

Fetching Obsoletes list for channel: rhel-x86_64-as-4...

Fetching Obsoletes list for channel: dell-hw-indep-repository...

Fetching rpm headers...
########################################

Name                                    Version        Rel
----------------------------------------------------------

The following packages you requested were not found:
dell-hw-specific-repository
Done!
=============================================================================
If you encounter problems, please read the FAQ at:
     http://linux.dell.com/wiki/index.php/Repository/FAQ
=============================================================================

Thanks for your help! I hope I can get this working on this box so I  
can roll it out to my other ones.


Brian A. Seklecki | 1 Oct 2007 16:45

RE: patrol reads and megaraid_sas (WAS: [Fwd: What is the "Patrol readstart/stop" snmp alert ?])

On Mon, 2007-10-01 at 08:30 -0500, Patrick_Boyd <at> Dell.com wrote:
> Ok another thing I would suggest checking is the Patrol Read Rate. I'm
> not sure where this is exposed on megacli but it should be set at <=
> 30%.

Yes you can set it via megacli and we have as part of the testing.

> As for the SAS Backplane firmware, you can use a live CD of linux to
> perform the flash. 

Ahh okay so LiveCD + Dell-PE Yum Repo subscribe, install, flash?

Did you have a recommendation on a liveCD?  I think we had a problem
with CentOS 4.5 and keyboard detection.

~BAS



Kuba Ober | 1 Oct 2007 16:46

Re: Poweredge SC1435 HOWTO Setup BIOS

> Well, not exactly.  I really don't care about the whole management
> framework and the hardware monitoring and such like that, which rides
> along with the whole OMSA package.  I really just want to be able to
> twiddle the bits you would normally have to press F2 on the console to get
> to.  It'd be nice if I could get hardware monitoring and such for the 1435s,
> but really they're horizontally scalable throw-away machines; I don't
> critically need "hardware monitoring" at less than the level of whole
> machine failures.  If omitting that makes them cheaper, I'm fine with
> losing that functionality (and I think things like ipmitool and smartd
> will get me enough of the monitoring that I need...).  Requiring humans to
> hit F2 to configure the machine, though, is a PITA when you're dealing
> with hundreds of them.  It's time-consuming and error-prone.

This could be reverse-engineered. I don't know if those systems support the 
old (IBM-AT era) CMOS RAM interface, but if they do it'd be a simple matter 
to reverse-engineer what all the different bits in the RAM mean.
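
A minimal sketch of that approach: dump the CMOS RAM before and after changing one setup option, then see which bytes and bits differ. The dumps here are made up for illustration; on Linux they could come from reading /dev/nvram, assuming the nvram driver is loaded and the BIOS really keeps its settings there, which is not confirmed for these systems. The 114-byte size and the 0x2D offset are likewise only placeholders.

```python
from typing import List, Tuple


def diff_dumps(before: bytes, after: bytes) -> List[Tuple[int, int, int]]:
    """Return (offset, old_byte, new_byte) for every byte that differs."""
    if len(before) != len(after):
        raise ValueError("dumps must be the same length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(before, after)) if a != b]


def changed_bits(old: int, new: int) -> List[int]:
    """Return the bit positions (0 = least significant) that differ."""
    diff = old ^ new
    return [bit for bit in range(8) if diff & (1 << bit)]


# Hypothetical dumps: pretend one setup option flipped bit 3 of byte 0x2D.
before = bytes(114)
after = bytearray(before)
after[0x2D] ^= 0x08

for offset, old, new in diff_dumps(before, bytes(after)):
    bits = changed_bits(old, new)
    print(f"offset 0x{offset:02x}: {old:#04x} -> {new:#04x}, bits {bits}")
```

On real hardware a checksum byte would probably change along with the setting, so expect more than one differing offset per toggle.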

Cheers, Kuba


Jobe Bittman | 1 Oct 2007 17:38

Re: PowerEdge 860 SAS5/iR mptlinux driver crashing repeatedly

4.00.00.01 mptlinux from Dell. I also got a similar crash with the CentOS 5-supplied mptlinux. I think there is something wrong with this controller. If I enable SAS in the BIOS, do I need to open up the server to configure it as 2 separate disks and use Linux software RAID?

On 10/1/07, Patrick_Boyd <at> dell.com <Patrick_Boyd <at> dell.com> wrote:

1)      There is no caching on the SAS 5/iR controllers. Therefore it will always be write-through.

2)      What version of the driver are you using?

 

--
Jobe Bittman
Chief Network Architect
Stage6