
Re: memory (mbuf) leak in fxp driver.



I checked on my machine, and I noticed that the 2 interfaces this happens on
are revision 0x08:

fxp1 at pci1 dev 8 function 0 "Intel 82557" rev 0x08: irq 11, address 00:d0:b7:bd:b6:89
inphy1 at fxp1 phy 1: i82555 10/100 media interface, rev. 4
fxp2 at pci1 dev 9 function 0 "Intel 82557" rev 0x08: irq 9, address 00:d0:b7:af:b9:6e
inphy2 at fxp2 phy 1: i82555 10/100 media interface, rev. 4

However, this interface (fxp0) has been continuously active, and it does not
seem to exhibit the memory leak. Maybe it's something to do with this revision
and with interfaces that use the phy? fxp0 on my machine only routes IP;
there is no sniffer or bridge running on it.

fxp0 at pci1 dev 1 function 0 "Intel 82557" rev 0x08: irq 10, address 00:d0:b7:e1:01:73
inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 4
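
(For comparison on other machines: the lines above come straight from the
boot messages, so something like

    dmesg | grep -e '^fxp' -e '^inphy'

should list the "rev" value and attached phy for every fxp interface.)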

Here are the rest of my interfaces, all rev 0x05:

fxp3 at pci2 dev 4 function 0 "Intel 82557" rev 0x05: irq 11, address 00:03:47:0d:9b:c8
inphy3 at fxp3 phy 1: i82555 10/100 media interface, rev. 0
fxp4 at pci2 dev 5 function 0 "Intel 82557" rev 0x05: irq 10, address 00:03:47:0d:9b:c9
inphy4 at fxp4 phy 1: i82555 10/100 media interface, rev. 0
ppb2 at pci1 dev 11 function 0 "DEC DECchip 21152 PCI-PCI" rev 0x03
pci3 at ppb2 bus 3
fxp5 at pci3 dev 4 function 0 "Intel 82557" rev 0x05: irq 10, address 00:03:47:0d:9a:22
inphy5 at fxp5 phy 1: i82555 10/100 media interface, rev. 0
fxp6 at pci3 dev 5 function 0 "Intel 82557" rev 0x05: irq 11, address 00:03:47:0d:9a:23
inphy6 at fxp6 phy 1: i82555 10/100 media interface, rev. 0

I think I can do a little more testing tomorrow... I'll fire up tcpdump on
fxp1 and fxp2 and see what happens.
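
Roughly what I plan to run, in case anyone wants to try the same thing (the
interface name and log path below are just examples; adjust to taste):

    # sniff one of the suspect rev 0x08 interfaces
    tcpdump -n -i fxp1 > /dev/null &

    # record mbuf usage once a minute so steady growth shows up
    while true; do
        date
        netstat -m | head -1
        sleep 60
    done >> /tmp/mbuf.log

If the "mbufs in use" count keeps climbing while tcpdump runs, and doesn't
drop back after tcpdump is killed, that should confirm the leak here too.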

--Doug

on 5/23/01 7:31 AM, Benninghoff, John at JABenninghoff@dainrauscher.com wrote:

> It takes about a week... but the 2 factors that really magnify the problem
> are:
> 
> 1. it's running tcpdump all the time
> 2. it's running at 10 MBit
> 
> there may be some additional factors here as well...
> 
> * tcpdump is being killed & restarted every hour
> * the leak also seems to happen faster on networks with LOW utilization
> (weird)
> * It may be related to the revision level of the card (Intel 82557 rev
> 0x08)
> 
> -----Original Message-----
> From: Rob Paisley [mailto:rsp5870@cs.rit.edu]
> Sent: Tuesday, May 22, 2001 11:09 PM
> To: Benninghoff, John
> Subject: Re: memory (mbuf) leak in fxp driver.
> 
> 
> John-
> How long does the box have to be up to fill the buffer?
> 
> I've got a card that uses the fxp driver, and it's gotten to uptimes of
> 100+ days without problem.  Now it's a firewall with TWO cards that use
> the fxp driver, and I still haven't seen any problem.
> 
> Not sure what to tell you.  netstat -m for me gives the following output:
> 
> 123 mbufs in use:
> 100 mbufs allocated to data
> 14 mbufs allocated to packet headers
> 9 mbufs allocated to socket names and addresses
> 102/342 mapped pages in use
> 699 Kbytes allocated to network (31% in use)
> 0 requests for memory denied
> 0 requests for memory delayed
> 0 calls to protocol drain routines
> 
> Tell me what ya think
> -Rob
> 
> 
> On Tue, 22 May 2001, Benninghoff, John wrote:
> 
>> Date: Tue, 22 May 2001 17:20:05 -0500
>> From: "Benninghoff, John" <JABenninghoff@dainrauscher.com>
>> To: "'tech@openbsd.org'" <tech@openbsd.org>
>> Subject: memory (mbuf) leak in fxp driver.
>> 
>> Hello all,
>> 
>> I'm fairly certain that I've uncovered a memory leak in the fxp driver, at
>> least for 2.8-stable. What I'm seeing is a steadily increasing number of
>> mbufs in use, until I get a "mb_map full" kernel error and the network
>> stops working.
>> 
>> At this point, netstat -m shows something like this:
>> 
>> 18390 mbufs in use:
>> 2265 mbufs allocated to data
>> 16124 mbufs allocated to packet headers
>> 1 mbuf allocated to socket names and addresses
>> 8190/8192 mapped pages in use
>> 18682 Kbytes allocated to network (99% in use)
>> 0 requests for memory denied
>> 0 requests for memory delayed
>> 1941 calls to protocol drain routines
>> 
>> As you can see, I've already set NMBCLUSTERS="8192" in my kernel
>> configuration. This really only delays the inevitable.
>> 
>> I searched for similar problems in the mailing list archives, and I noticed
>> that someone else was experiencing a similar problem when running a bridge
>> using fxp (Intel) NICs...
>> 
>> http://www.sigmasoft.com/~openbsd/archive/openbsd-tech/200104/msg00024.html
>> (note that this appears to be 10 meg, not 100)
>> 
>> After further testing / experimenting I noticed the following:
>> 
>> * running tcpdump (as I do) makes it worse; the mbufs fill up much faster.
>> I'm doing sniffing on heavily-utilized networks.
>> * cards running at 10 meg fill up much faster than cards running at 100 meg
>> (not what I would expect)
>> * the problem seems to exist in 2.7, 2.8, and 2.9 (beta). I haven't checked
>> -current.
>> * unplugging the network connection doesn't reduce the mbufs in use.
>> * I've seen similar problems reported on NetBSD and FreeBSD, perhaps because
>> they all share parent code (?)
>> * hard to say for sure, but other drivers, like xl, don't seem to behave the
>> same way.
>> 
>> It really looks like a leak in fxp, but I lack the expertise to find it in
>> the source code...
>> 
>> Any suggestions? Should I submit this as a bug report?
>> 
>> Here are the relevant lines from dmesg:
>> fxp0 at pci0 dev 1 function 0 "Intel 82557" rev 0x08: irq 11, address 00:b0:d0:e1:0b:68
>> inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 4
>> fxp1 at pci0 dev 2 function 0 "Intel 82557" rev 0x08: irq 10, address 00:b0:d0:e1:0b:69
>> inphy1 at fxp1 phy 1: i82555 10/100 media interface, rev. 4
>> inphy1 at fxp1 phy 1: i82555 10/100 media interface, rev. 4
>> 
>> ------------------------------------------
>> John A Benninghoff
>> mailto:jabenninghoff@dainrauscher.com