
Re: runaway shell (tcsh) processes and load average




Here's a guess: the user has a .logout file. I believe tcsh-6.08.00
fixes that problem.

-Costa

John Nitis <john@interscape.com> writes:

> Hello,
> 
> I'm running OpenBSD binkley 2.4 GENERIC#20 sparc.  Whenever someone
> doesn't properly log out of their shell (such as when they are
> disconnected due to network troubles), their shell becomes a runaway
> process that eats all available CPU.  As far as I know this happens across
> rlogin, telnet, and secure shell logins.  The shell being used is tcsh.
> 
> Recently I grabbed some debugging information on one of the runaway shells
> via ktrace and I've included a bit of the kdump output below.
> 
> The other question I had relates to the load average.  It's constantly
> between 0.10 and 0.20 even when there's nothing running that would bring
> the load average up that high.  Is there anything that may be causing this
> on OpenBSD/sparc? (110 MHz sparc4 with 64 MB RAM) `top' shows:
> 
> CPU states:  0.5% user,  0.0% nice,  1.0% system,  0.0% interrupt, 98.5% idle
> Memory: Real: 1736K/26M act/tot  Free: 35M  Swap: 1496K/119M used/tot
>   
>   PID USERNAME PRI NICE  SIZE   RES STATE WAIT     TIME    CPU COMMAND
> 24547 root       2    0  324K  148K idle  select  10:05  0.00% sshd1
> 10726 root       2    0  640K  176K sleep select   2:23  0.00% httpd
> ...
> 
> I'd appreciate it if someone could cc: my email address
> <john@interscape.com> if a solution is available, as I'm not a member of
> this mailing list.  Thanks in advance for any assistance.
> 
> 
> ktrace from runaway tcsh process:
> 
>  29237 tcsh     CALL  write(0x11,0x5ed18,0x9)
>  29237 tcsh     RET   write -1 errno 5 Input/output error
>  29237 tcsh     CALL  issetugid
>  29237 tcsh     RET   issetugid 0
>  29237 tcsh     CALL  open(0xf7ffec10,0,0)
>  29237 tcsh     NAMI  "/usr/share/nls/C/libc.cat"
>  29237 tcsh     RET   open 0
>  29237 tcsh     CALL  fstat(0,0xf7ffeb48)
>  29237 tcsh     RET   fstat 0
>  29237 tcsh     CALL  mmap(0,0xe5a,0x1,0x1,0,0,0,0)
>  29237 tcsh     RET   mmap 135708672/0x816c000
>  29237 tcsh     CALL  close(0)
>  29237 tcsh     RET   close 0
>  29237 tcsh     CALL  munmap(0x816c000,0xe5a)
>  29237 tcsh     RET   munmap 0
>  29237 tcsh     CALL  lseek(0x10,0,0,0,0x2,0)
>  29237 tcsh     RET   lseek -1 errno 9 Bad file descriptor
>  29237 tcsh     CALL  ioctl(0xf,TIOCSPGRP,0xf7ffefa4)
>  29237 tcsh     RET   ioctl -1 errno 25 Inappropriate ioctl for device
>  29237 tcsh     CALL  sigreturn(0x5cf18)
>  29237 tcsh     RET   sigreturn JUSTRETURN
>  29237 tcsh     CALL  sigprocmask(0x1,0)
>  29237 tcsh     RET   sigprocmask 2
>  29237 tcsh     CALL  sigprocmask(0x3,0)
>  29237 tcsh     RET   sigprocmask 2
>  29237 tcsh     CALL  close(0)
>  29237 tcsh     RET   close -1 errno 9 Bad file descriptor
>  29237 tcsh     CALL  close(0x1)
>  29237 tcsh     RET   close -1 errno 9 Bad file descriptor
>  29237 tcsh     CALL  close(0x2)
>  29237 tcsh     RET   close -1 errno 9 Bad file descriptor
> 
> ...and so on in numerical order until close(0x3f).  After the failed
> close(0x3f), the whole sequence restarts from the beginning.
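
For anyone reading the trace above: errno 9 (EBADF) on close() just means
the descriptor was already closed, so every pass after the first does no
useful work. The following is only an illustrative sketch in Python, not
tcsh's actual cleanup code; the helper name close_range_once is made up
for this example. It shows a cleanup pass that treats EBADF as "already
closed", runs exactly once, and returns instead of starting over.

```python
import errno
import os


def close_range_once(fds):
    """Try to close each descriptor exactly once.

    Returns (closed, already_closed). EBADF is treated as "already
    closed"; any other error is re-raised to the caller.
    """
    closed, already_closed = [], []
    for fd in fds:
        try:
            os.close(fd)
            closed.append(fd)
        except OSError as exc:
            if exc.errno != errno.EBADF:
                raise
            already_closed.append(fd)
    return closed, already_closed


# Demonstration on a private pipe, so stdin/stdout/stderr are untouched.
r, w = os.pipe()
first = close_range_once([r, w])   # both descriptors are open here
second = close_range_once([r, w])  # both now fail with EBADF, as in the trace
```

The second call mirrors what the kdump output shows for descriptors 0
through 0x3f: nothing is left to close, so a shell that keeps repeating
the pass is simply spinning.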