
kern/72630: NFS copy of large file crash/hangs 5.2.1 while running Xorg/gnome2

>Number:         72630
>Category:       kern
>Synopsis:       NFS copy of large file crash/hangs 5.2.1 while running Xorg/gnome2
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Oct 13 09:00:38 GMT 2004
>Originator:     bob frazier
>Release:        5.2.1-RELEASE-p9
>Organization:   SFT Inc.
>Environment:
FreeBSD beater.SFT.local 5.2.1-RELEASE-p9 FreeBSD 5.2.1-RELEASE-p9 #0: Sat Aug 28 13:40:02 PDT 2004 root@beater.SFT.local:/usr/obj/usr/src/sys/GENERIC  i386

>Description:
Copying a large file (approximately 770 MB) to this machine over an NFS share causes the total number of buffers to increase slowly until the system hangs.  I have observed this on multiple occasions, once while running systat -vmstat.  When large files are copied TO the machine via NFS, the network appears to deliver data faster than the drive can write it, so buffers fill up without being written to disk.  Once the number of 'dirty' buffers reaches a certain point (I am not sure exactly where), the system hangs.  I have only observed this while Xorg with gnome2/sawfish (latest ports as of a week ago) is running at the same time.
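
To pin down the point at which the hang occurs, the dirty-buffer count could be sampled during the copy.  The following is only a minimal sketch: it assumes the vfs.numdirtybuffers and vfs.hidirtybuffers sysctls are available on this release and that they track the same counter systat -vmstat displays, neither of which has been verified here.  The function names are made up for illustration.

```sh
#!/bin/sh
# Sketch: print the dirty-buffer count once per second during the NFS copy.
# Assumption: the vfs.numdirtybuffers and vfs.hidirtybuffers sysctls exist.

# Format one sample: timestamp, current dirty count, high-water mark.
format_sample() {
    printf '%s dirty=%s hi=%s\n' "$1" "$2" "$3"
}

# Poll until interrupted, writing one sample per second to stdout.
monitor() {
    hi=$(sysctl -n vfs.hidirtybuffers)
    while :; do
        format_sample "$(date '+%H:%M:%S')" \
            "$(sysctl -n vfs.numdirtybuffers)" "$hi"
        sleep 1
    done
}

# Only start polling when explicitly asked, e.g.: sh dirtybuf.sh run
if [ "${1:-}" = "run" ]; then
    monitor
fi
```

Running it in an ssh session opened from a third machine would keep the last samples visible on that machine's terminal after the hang.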
>How-To-Repeat:
1. Create an NFS share on the first machine.  In this case it is on drive /dev/ad2s0d, on a system with 2 hard drives (/, /var, and /usr all mount from /dev/ad0xxx).
2. On the same machine, run Xorg with gnome2 (assuming it doesn't crash on startup, which is a different problem but worth mentioning) and open a few applications (xchat, mozilla, mozilla mail, xterm sessions).
3. On a separate FreeBSD machine (same OS, different hardware), mount the NFS share and copy an 800 MB file to the first machine through it.
4. Watch the number of buffers from a console running systat -vmstat.
More often than not, the system crashes or hangs to the point where it is non-responsive.
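
The client-side part of these steps can be sketched as a script.  This is an illustration only: the export path and mount point are hypothetical placeholders, the function names are invented, and a zero-filled file stands in for the original file, whose contents are not described here.

```sh
#!/bin/sh
# Sketch of the client-side steps.  EXPORT and MNT are hypothetical
# placeholders; SERVER defaults to the host name of the affected machine.
SERVER=${SERVER:-beater}
EXPORT=${EXPORT:-/export/share}
MNT=${MNT:-/mnt/nfstest}

# Create a zero-filled test file: $1 = path, $2 = size in megabytes.
make_test_file() {
    dd if=/dev/zero of="$1" bs=1048576 count="$2" 2>/dev/null
}

# Mount the share, copy the file onto it, then unmount.
copy_over_nfs() {
    mkdir -p "$MNT" &&
    mount -t nfs "$SERVER:$EXPORT" "$MNT" &&
    cp "$1" "$MNT/" &&
    umount "$MNT"
}

# Only touch the network when explicitly asked, e.g.: sh nfscopy.sh run
if [ "${1:-}" = "run" ]; then
    make_test_file /var/tmp/bigfile 800 && copy_over_nfs /var/tmp/bigfile
fi
```

The script must be run as root on the second machine, with Xorg/gnome2 already running on the first.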

>Fix:
I attempted to remedy the situation as follows: a) booted single user; b) ran fsck on all of the drives; c) added the line kern.maxbcache="32MB" to /boot/loader.conf; d) rebooted the system into multi-user, which allowed the other machine's copy process (again hung in an unkillable state) to terminate properly.  I then unmounted and remounted the NFS share and attempted the copy again, this time with Xorg/gnome2/sawfish NOT running.  The file copied correctly, without any problems.  I have not yet tried to reproduce the hang with Xorg/gnome2/sawfish running since making this change, however.