
SCSI / IDE Errors on i386, v2.4



Hello. 

In the last couple of days I've gotten an insane number of I/O errors from my
monitor station running OpenBSD 2.4 on i386 hardware. I use this box for
logging traffic in and out of our internal network. It's running Multi-Router
Traffic Grapher, so that I can get pretty pictures of packets from our Cisco
Catalyst switch, and NFR 2.0.3 to monitor everything else. I've compiled my
own kernel with a minimum set of drivers: 3Com fast ethernet, Adaptec 2940
SCSI, SCSI CD-ROM, E?IDE disk, and floppy drives.

My CD-ROM drive is a Yamaha CRW4416S CD writer (which works quite well...).
While reading a CD, though, I got a bunch of SCSI errors that made the machine
pretty much unusable. I had to reboot to clear them, and the kernel screamed
about SCSI problems even as it was about to reboot... /var/log/messages
extract below.

Then, this morning, I found the console full of errors to the effect of wdc0
losing the interrupt, starting at about 0130h. It seems to have hung like
that, since at 0159h xntpd reported stepping the time 264 seconds. wd0 is a
Quantum 12.7GB UDMA drive. Querying the Cisco switch, I found negligible
traffic during that window: a few DNS requests, a handful of email messages,
and a couple of NTP packets.

/etc/daily runs at 0130h, and running it manually does reproduce the error. I
think it's something in /etc/security. Oddly enough (yes, get ready to laugh),
it didn't do this right away. The SCSI errors are attributable to a microsloth
CD, but the date on the kernel is March 19, and the wdc0 errors didn't start
until March 24.

Any suggestions?

Be Well,
Chris

========================   dmesg   =========================
OpenBSD 2.4 (BLACKSTAR) #7: Fri Mar 19 16:04:21 MST 1999
    root@blackstar:/sys/arch/i386/compile/BLACKSTAR
cpu0: Cyrix 6x86 (486-class)
BIOS mem  = 654336 conventional, 133169152 extended
real mem  = 133824512
avail mem = 123125760
using 1659 buffers containing 6795264 bytes of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+(35) BIOS, date 05/26/97
bios0: pciinfo 0xf02eb00c apminfo 0xf02eb028 diskinfo 0xf02eb050 cksumlen 1 memmap 0xf02eb0cc
apm0 at bios0: Power Management spec V1.1
apm0: APM engage (device 1): power management disabled (1)
apm0: AC on, no battery
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 vendor "Silicon Integrated System", unknown product 0x5571 rev 0x00
pcib0 at pci0 dev 1 function 0 "Silicon Integrated System 85C503 ISA Bridge" rev 0x01
"Silicon Integrated System 5513 EIDE" rev 0xc0 at pci0 dev 1 function 1 not configured
vendor "Silicon Integrated System", unknown product 0x7001 (class serial bus, subclass USB, rev 0xb0) at pci0 dev 1 function 2 not configured
xl0 at pci0 dev 9 function 0 "3Com 3c905B 100Base-TX" rev 0x24: irq 11 address 00:10:5a:08:c1:87
xl0: autoneg complete, link status good (half-duplex, 10Mbps)
ahc0 at pci0 dev 10 function 0 "Adaptec AHA-2940 Ultra" rev 0x01: irq 9
ahc0: aic7880 Single Channel, SCSI Id=7, 16 SCBs
scsibus0 at ahc0: 8 targets
ahc0: target 0 synchronous at 8.0MHz, offset = 0xf
cd0 at scsibus0 targ 0 lun 0: <YAMAHA, CRW4416S, 1.0e> SCSI2 5/cdrom removable
"S3 Trio32/64" rev 0x00 at pci0 dev 12 function 0 not configured
isa0 at mainbus0
lpt0 at isa0 port 0x378-0x37b irq 7
isadma0 at isa0
wdc0 at isa0 port 0x1f0-0x1f7 irq 14
wd0 at wdc0 drive 0: <QUANTUM FIREBALL EX12.7A>
wd0: 12159MB, 16383 cyl, 16 head, 63 sec, 512 bytes/sec, 24901632 sec total
wd0: using 16-sector 16-bit pio transfers, lba addressing (418KB cache)
npx0 at isa0 port 0xf0-0xff: using exception 16
pccom0 at isa0 port 0x3f8-0x3ff irq 4: ns16550a, 16 byte fifo
pccom1 at isa0 port 0x2f8-0x2ff irq 3: ns16550a, 16 byte fifo
vt0 at isa0 port 0x60-0x6f irq 1: S3 764 (Trio64), 80 col, color, 8 scr, mf2-kbd
fdc0 at isa0 port 0x3f0-0x3f5 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
biomask 4240 netmask 4a40 ttymask 4ac2
root on wd0a
pctr: no performance counters in CPU
dkcsum: wd0 matched BIOS disk 80
rootdev=0x0 rrootdev=0x300 rawdev=0x302

======================   /var/log/messages   =======================

Mar 19 17:59:40 blackstar /bsd: cd0(ahc0:0:0): Check Condition on opcode 8
Mar 19 17:59:40 blackstar /bsd:     SENSE KEY: Illegal Request
Mar 19 17:59:40 blackstar /bsd:      ASC/ASCQ: Logical Block Address Out of Range

Mar 19 20:05:57 blackstar reboot: rebooted by root

Mar 20 16:38:32 blackstar /bsd: cd0(ahc0:0:0): Check Condition on opcode 8
Mar 20 16:38:32 blackstar /bsd:     SENSE KEY: Illegal Request
Mar 20 16:38:32 blackstar /bsd:      ASC/ASCQ: Logical Block Address Out of Range

Mar 22 15:07:13 blackstar /bsd: cd0(ahc0:0:0): timed out in dataout phase, SCSISIGI == 0x0
Mar 22 15:07:13 blackstar /bsd: cd0(ahc0:0:0): BUS DEVICE RESET message queued.
Mar 22 15:07:13 blackstar /bsd: Bus Device Reset Message Sent
Mar 22 15:07:48 blackstar /bsd: ahc0:A:0: no active SCB for reconnecting target - issuing ABORT
Mar 22 15:07:48 blackstar /bsd: SAVED_TCL == 0x0
Mar 22 15:23:05 blackstar /bsd: cd0(ahc0:0:0): Check Condition on opcode 8
Mar 22 15:23:06 blackstar /bsd:     SENSE KEY: Media Error
Mar 22 15:23:06 blackstar /bsd:    INFO FIELD: 29
Mar 22 15:23:06 blackstar /bsd:      ASC/ASCQ: No Seek Complete
Mar 22 15:23:24 blackstar /bsd: cd0(ahc0:0:0): Check Condition on opcode 8
Mar 22 15:23:24 blackstar /bsd:     SENSE KEY: Media Error
Mar 22 15:23:24 blackstar /bsd:    INFO FIELD: 29
Mar 22 15:23:24 blackstar /bsd:      ASC/ASCQ: No Seek Complete
Mar 22 15:23:42 blackstar /bsd: cd0(ahc0:0:0): Check Condition on opcode 8
Mar 22 15:23:42 blackstar /bsd:     SENSE KEY: Media Error
Mar 22 15:23:42 blackstar /bsd:    INFO FIELD: 29
Mar 22 15:23:42 blackstar /bsd:      ASC/ASCQ: No Seek Complete
Mar 22 15:24:01 blackstar /bsd: cd0(ahc0:0:0): Check Condition on opcode 8
Mar 22 15:24:01 blackstar /bsd:     SENSE KEY: Media Error
Mar 22 15:24:01 blackstar /bsd:    INFO FIELD: 29
Mar 22 15:24:01 blackstar /bsd:      ASC/ASCQ: No Seek Complete
Mar 22 15:24:19 blackstar /bsd: cd0(ahc0:0:0): Check Condition on opcode 8
Mar 22 15:24:19 blackstar /bsd:     SENSE KEY: Media Error
Mar 22 15:24:20 blackstar /bsd:    INFO FIELD: 29
Mar 22 15:24:20 blackstar /bsd:      ASC/ASCQ: No Seek Complete
Mar 22 15:24:38 blackstar /bsd: cd0(ahc0:0:0): Check Condition on opcode 8
Mar 22 15:24:38 blackstar /bsd:     SENSE KEY: Media Error
Mar 22 15:24:38 blackstar /bsd:    INFO FIELD: 29
Mar 22 15:24:38 blackstar /bsd:      ASC/ASCQ: No Seek Complete
Mar 22 15:25:19 blackstar /bsd: cd0(ahc0:0:0): Check Condition on opcode 8
Mar 22 15:25:19 blackstar /bsd:     SENSE KEY: Media Error
Mar 22 15:25:19 blackstar /bsd:    INFO FIELD: 29
Mar 22 15:25:19 blackstar /bsd:      ASC/ASCQ: No Seek Complete
Mar 22 15:25:57 blackstar /bsd: cd0(ahc0:0:0): Check Condition on opcode 8
Mar 22 15:25:57 blackstar /bsd:     SENSE KEY: Media Error
Mar 22 15:25:57 blackstar /bsd:    INFO FIELD: 29
Mar 22 15:25:57 blackstar /bsd:      ASC/ASCQ: No Seek Complete
Mar 22 15:26:35 blackstar /bsd: cd0(ahc0:0:0): Check Condition on opcode 8
Mar 22 15:26:35 blackstar /bsd:     SENSE KEY: Media Error
Mar 22 15:26:35 blackstar /bsd:    INFO FIELD: 29
Mar 22 15:26:35 blackstar /bsd:      ASC/ASCQ: No Seek Complete
Mar 22 15:27:24 blackstar /bsd: cd0(ahc0:0:0): Check Condition on opcode 8
Mar 22 15:27:24 blackstar /bsd:     SENSE KEY: Unit Attention
Mar 22 15:27:24 blackstar /bsd:      ASC/ASCQ: Power On, Reset, or Bus Device Reset Occurred
Mar 22 15:27:41 blackstar reboot: rebooted by root

Mar 24 01:31:30 blackstar /bsd: wdc0(wd0): lost interrupt
Mar 24 01:32:55 blackstar /bsd: wdc0(wd0): lost interrupt
Mar 24 01:32:56 blackstar /bsd: wdc0: reset failed
Mar 24 01:32:56 blackstar /bsd: wd0: wdccontrol: recal failed (1)
Mar 24 01:32:57 blackstar /bsd: wdc0: reset failed
Mar 24 01:32:57 blackstar /bsd: wdc0(wd0): lost interrupt
Mar 24 01:32:57 blackstar /bsd: wdc0(wd0): lost interrupt
Mar 24 01:32:57 blackstar /bsd: wdc0: reset failed
Mar 24 01:32:57 blackstar /bsd: wd0: wdccontrol: recal failed (1)
Mar 24 01:32:58 blackstar /bsd: wdc0: reset failed
Mar 24 01:32:58 blackstar /bsd: wd0: wdccontrol: recal failed (1)
Mar 24 01:32:58 blackstar /bsd: wdc0: reset failed
Mar 24 01:32:59 blackstar /bsd: wd0: wdccontrol: recal failed (1)
Mar 24 01:32:59 blackstar /bsd: wdc0(wd0): lost interrupt
Mar 24 01:33:00 blackstar /bsd: wdc0: reset failed
Mar 24 01:33:01 blackstar /bsd: wd0: wdccontrol: recal failed (1)
Mar 24 01:33:01 blackstar /bsd: wdc0: reset failed
Mar 24 01:33:01 blackstar /bsd: wd0: wdccontrol: recal failed (1)
Mar 24 01:33:01 blackstar /bsd: wdc0: reset failed
Mar 24 01:33:02 blackstar /bsd: wd0: wdccontrol: recal failed (1)
Mar 24 01:33:02 blackstar /bsd: wdc0(wd0): lost interrupt
Mar 24 01:33:02 blackstar /bsd: wdc0(wd0): lost interrupt
Mar 24 01:33:02 blackstar /bsd: wdc0: reset failed
Mar 24 01:33:03 blackstar /bsd: wd0: wdccontrol: recal failed (1)
Mar 24 01:33:03 blackstar /bsd: wdc0(wd0): lost interrupt
Mar 24 01:33:03 blackstar /bsd: wdc0: reset failed
Mar 24 01:33:03 blackstar /bsd: wd0: wdccontrol: recal failed (1)
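
==================   tallying the extract (sketch)   ==================

To make the timeline in the extract above easier to see (cd0/ahc0 trouble on
March 19-22, wdc0/wd0 trouble from March 24), here is a rough sketch that
counts kernel messages per day and per device. It assumes the syslog layout
shown above ("Mon DD HH:MM:SS host /bsd: devN..."); the names LINE_RE and
tally are made up for this example, and indented continuation lines such as
"SENSE KEY" are simply skipped rather than attributed to a device.

```python
import re
import sys
from collections import Counter

# Assumed layout: "Mar 24 01:31:30 blackstar /bsd: wdc0(wd0): lost interrupt".
# Only lines whose message starts with a device name (letters + unit number)
# are counted; continuation lines and non-kernel lines do not match.
LINE_RE = re.compile(
    r"^(?P<date>[A-Z][a-z]{2}\s+\d{1,2}) \d{2}:\d{2}:\d{2} \S+ /bsd: "
    r"(?P<dev>[a-z]+\d+)\b"
)


def tally(lines):
    """Count kernel messages keyed by (date, device)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[(match.group("date"), match.group("dev"))] += 1
    return counts


if __name__ == "__main__":
    # Usage: python tally.py /var/log/messages
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as log:
            for (date, dev), n in sorted(tally(log).items()):
                print(f"{date}  {dev:<6} {n}")
```

Run against the full /var/log/messages, this should show at a glance whether
any wdc0 messages predate March 24 or whether the two problems overlap.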