[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Timing issue with 3.6 SMP kernel



>Submitter-Id:	net
>Originator:	Michael Clarke
>Organization: Lancaster University
net
>Synopsis:	System clock regression with 3.6 STABLE SMP kernel
>Severity:	serious
>Priority:	medium
>Category:	kernel
>Class:		sw-bug
>Release:	3.6 STABLE
>Environment:
	System      : OpenBSD 3.6
	Architecture: OpenBSD.i386
	Machine     : i386
>Description:
	I've been trying to set up Nagios on OBSD 3.6, and noticed that
	occassionally the ping plugin would return an unknown value.
	Investigation pointed to the system ping returning a negative
	value for min rtt -- along the lines of -1.-057.
	
	We compilied a version of ping with some extra printf calls,
	which showed that ping was functioning correctly, but
	that if you call gettimeofday() twice, once before sending
	the echo request and again on receiving it, the second call
	could return an earlier time than the first one.
	
	I then compiled two new kernels from the STABLE tree -- one
	single CPU and one SMP. The problem manifests itself only
	with the SMP kernel.

	DMESG OUTPUT
	============

OpenBSD 3.6-stable (GENERIC.MP) #0: Thu Dec 23 09:56:35 GMT 2004
    clarkema_(_at_)_cleese_(_dot_)_lancs_(_dot_)_ac_(_dot_)_uk:/usr/src/sys/arch/i386/compile/GENERIC.MP
cpu0: Intel(R) Xeon(TM) CPU 3.06GHz ("GenuineIntel" 686-class) 3.06 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,CNXT-ID
real mem  = 2146902016 (2096584K)
avail mem = 1952944128 (1907172K)
using 4278 buffers containing 107446272 bytes (104928K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+(00) BIOS, date 04/14/04, BIOS32 rev. 0 @ 0xffe90
pcibios0 at bios0: rev 2.1 @ 0xf0000/0x10000
pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xfc4a0/144 (7 entries)
pcibios0: PCI Interrupt Router at 000:15:0 ("ServerWorks CSB5 SouthBridge" rev 0x00)
pcibios0: PCI bus #0 is the last bus
bios0: ROM list: 0xc0000/0x8000 0xc8000/0x4000 0xec000/0x4000!
mainbus0: Intel MP Specification (Version 1.4) (DELL     PE 014A     )
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: apic clock running at 132 MHz
cpu1 at mainbus0: apid 6 (application processor)
cpu1: Intel(R) Xeon(TM) CPU 3.06GHz ("GenuineIntel" 686-class) 3.06 GHz
cpu1: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,CNXT-ID
mainbus0: bus 0 is type PCI   
mainbus0: bus 1 is type PCI   
mainbus0: bus 2 is type PCI   
mainbus0: bus 3 is type PCI   
mainbus0: bus 4 is type PCI   
mainbus0: bus 5 is type ISA   
ioapic0 at mainbus0: apid 8 pa 0xfec00000, version 11, 16 pins
ioapic0: misconfigured as apic 0, remapped to apic 8
ioapic1 at mainbus0: apid 9 pa 0xfec01000, version 11, 16 pins
ioapic1: misconfigured as apic 0, remapped to apic 9
ioapic2 at mainbus0: apid 10 pa 0xfec02000, version 11, 16 pins
ioapic2: misconfigured as apic 0, remapped to apic 10
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
pchb0 at pci0 dev 0 function 0 "ServerWorks CNB20-HE" rev 0x33
pchb1 at pci0 dev 0 function 1 "ServerWorks CNB20-HE" rev 0x00
pci1 at pchb1 bus 1
pchb2 at pci0 dev 0 function 2 "ServerWorks CNB20-HE" rev 0x00
pci2 at pchb2 bus 3
vga1 at pci0 dev 14 function 0 "ATI Rage XL" rev 0x27
wsdisplay0 at vga1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
pchb3 at pci0 dev 15 function 0 "ServerWorks CSB5 SouthBridge" rev 0x93
pciide0 at pci0 dev 15 function 1 "ServerWorks CSB5 IDE" rev 0x93: DMA
atapiscsi0 at pciide0 channel 1 drive 0
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0: <TEAC, CD-224E, K.9A> SCSI0 5/cdrom removable
cd0(pciide0:1:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 2
ohci0 at pci0 dev 15 function 2 "ServerWorks OSB4/CSB5 USB" rev 0x05: apic 8 int 11 (irq 11), version 1.0, legacy support
ohci0: SMM does not respond, resetting
usb0 at ohci0: USB revision 1.0
uhub0 at usb0
uhub0: ServerWorks OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 4 ports with 4 removable, self powered
pcib0 at pci0 dev 15 function 3 "ServerWorks CSB5 PCI" rev 0x00
pchb4 at pci0 dev 16 function 0 "ServerWorks CIOB-E" rev 0x12
pchb5 at pci0 dev 16 function 2 "ServerWorks CIOB-E" rev 0x12
pci3 at pchb5 bus 2
bge0 at pci3 dev 0 function 0 "Broadcom BCM5704C" rev 0x02: apic 9 int 0 (irq 7) address 00:0f:1f:67:3d:c9
brgphy0 at bge0 phy 1: BCM5704 10/100/1000baseT PHY, rev. 0
bge1 at pci3 dev 0 function 1 "Broadcom BCM5704C" rev 0x02: apic 9 int 1 (irq 5) address 00:0f:1f:67:3d:cb
brgphy1 at bge1 phy 1: BCM5704 10/100/1000baseT PHY, rev. 0
pchb6 at pci0 dev 17 function 0 "ServerWorks CIOBX2" rev 0x05
pchb7 at pci0 dev 17 function 2 "ServerWorks CIOBX2" rev 0x05
pci4 at pchb7 bus 4
mpt0 at pci4 dev 5 function 0 "Symbios Logic 53c1030" rev 0x07: apic 9 int 2 (irq 7)
mpt0: sending FW Upload request to IOC (size: 36, img size: 40048)
scsibus1 at mpt0: 16 targets
sd0 at scsibus1 targ 0 lun 0: <MAXTOR, ATLAS10K4_146SCA, DFX1> SCSI3 0/direct fixed
sd0: 140014MB, 49070 cyl, 8 head, 730 sec, 512 bytes/sec, 286749480 sec total
sd1 at scsibus1 targ 1 lun 0: <MAXTOR, ATLAS10K4_146SCA, DFX1> SCSI3 0/direct fixed
sd1: 140014MB, 49070 cyl, 8 head, 730 sec, 512 bytes/sec, 286749480 sec total
sd2 at scsibus1 targ 2 lun 0: <MAXTOR, ATLAS10K4_146SCA, DFX1> SCSI3 0/direct fixed
sd2: 140014MB, 49070 cyl, 8 head, 730 sec, 512 bytes/sec, 286749480 sec total
uk0 at scsibus1 targ 6 lun 0: <PE/PV, 1x3 SCSI BP, 1.1> SCSI2 3/processor fixed
uk0: unknown device
mpt0: target 0 Synchronous at 160MHz width 16bit offset 127 QAS 0 DT 1 IU 1
mpt0: target 1 Synchronous at 160MHz width 16bit offset 127 QAS 0 DT 1 IU 1
mpt0: target 2 Synchronous at 160MHz width 16bit offset 127 QAS 0 DT 1 IU 1
mpt0: target 6 Asynchronous at 0MHz width 8bit offset 0 QAS 0 DT 0 IU 0
mpt1 at pci4 dev 5 function 1 "Symbios Logic 53c1030" rev 0x07: apic 9 int 3 (irq 5)
mpt1: sending FW Upload request to IOC (size: 36, img size: 40048)
scsibus2 at mpt1: 16 targets
isa0 at pcib0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pmsi0 at pckbc0 (aux slot)
pckbc0: using irq 12 for aux slot
wsmouse0 at pmsi0 mux 0
pcppi0 at isa0 port 0x61
midi0 at pcppi0: <PC speaker>
sysbeep0 at pcppi0
npx0 at isa0 port 0xf0/16: using exception 16
pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
biomask 0 netmask 0 ttymask 0
pctr: user-level cycle counter enabled
dkcsum: sd0 matched BIOS disk 80
dkcsum: sd1 matched BIOS disk 81
dkcsum: sd2 matched BIOS disk 82
root on sd0a
rootdev=0x400 rrootdev=0xd00 rawdev=0xd02


>How-To-Repeat:

	A minimum test case is:

/* grind.c */
#include <sys/time.h>
#include <stdio.h>

int main (void) {

    while ( 1 ) {
	struct timeval tod0;
	struct timeval tod1;
	long tnorm0;
	long tnorm1;
	
	gettimeofday( &tod0, (struct timezone *)0 );
	gettimeofday( &tod1, (struct timezone *)0 );
	tnorm0 = (tod0.tv_sec * 1000000) + tod0.tv_usec;
	tnorm1 = (tod1.tv_sec * 1000000) + tod1.tv_usec;

	if ( tnorm0 > tnorm1 ) {
	    printf( "Bad times %d.%d  %d.%d - %d %d\n",
		    tod0.tv_sec,
		    tod0.tv_usec,
		    tod1.tv_sec,
		    tod1.tv_usec,
		    tnorm0,
		    tnorm1 );
	}
    }
    
    return 0;
}

	Compile and run the above code. It should run without
	output on a single-CPU kernel. Running it on an SMP 
	kernel should give output similar to the following:

503 clarkema_(_at_)_cleese:~> ./grind 
Bad times 1103797726.163719  1103797726.163685 - -278973689 -278973723
Bad times 1103797726.183639  1103797726.183605 - -278953769 -278953803
Bad times 1103797726.193599  1103797726.193565 - -278943809 -278943843
Bad times 1103797726.213519  1103797726.213485 - -278923889 -278923923
Bad times 1103797726.233439  1103797726.233407 - -278903969 -278904001
Bad times 1103797726.283239  1103797726.283205 - -278854169 -278854203
Bad times 1103797726.293198  1103797726.293165 - -278844210 -278844243
Bad times 1103797726.303159  1103797726.303125 - -278834249 -278834283

>Fix:
	None known. My C-fu is not that strong ;)



Visit your host, monkey.org