
Re: ZFS leaking vnodes (sort of)



On Sat, Jul 07, 2007 at 02:26:17PM +0100, Doug Rabson wrote:
> I've been testing ZFS recently and I noticed some performance issues 
> while doing large-scale port builds on a ZFS mounted /usr/ports tree. 
> Eventually I realised that virtually nothing ever ended up on the vnode 
> free list. This meant that when the system reached its maximum vnode 
> limit, it had to resort to reclaiming vnodes from the various 
> filesystems' active vnode lists (via vlrureclaim). Since those lists 
> are not sorted in LRU order, this led to pessimal cache performance 
> after the system got into that state.
> 
> I looked a bit closer at the ZFS code and poked around with DDB and I 
> think the problem was caused by a couple of extraneous calls to vhold 
> when creating a new ZFS vnode. On FreeBSD, getnewvnode returns a vnode 
> which is already held (not on the free list) so there is no need to 
> call vhold again.

Whoa! Nice catch... The patch works here - I did some pretty heavy
tests, so please commit it ASAP.

I also wonder if this can help with some of those 'kmem_map too small'
panics. I have observed that the ARC cannot reclaim memory, and this may
be because all vnodes, and thus their associated data, are being held.

To ZFS users having problems with performance and/or stability of ZFS:
Can you test the patch and see if it helps?

> This patch appears to fix the problem (only very lightly tested):
> 
> Index: zfs_vnops.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c,v
> retrieving revision 1.22
> diff -u -r1.22 zfs_vnops.c
> --- zfs_vnops.c	28 May 2007 02:37:43 -0000	1.22
> +++ zfs_vnops.c	7 Jul 2007 13:01:41 -0000
> @@ -3493,7 +3493,7 @@
>  		rele = 0;
>  	vp->v_data = NULL;
>  	ASSERT(vp->v_holdcnt > 1);
> -	vdropl(vp);
> +	VI_UNLOCK(vp);
>  	if (!zp->z_unlinked && rele)
>  		VFS_RELE(zfsvfs->z_vfs);
>  	return (0);
> Index: zfs_znode.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c,v
> retrieving revision 1.8
> diff -u -r1.8 zfs_znode.c
> --- zfs_znode.c	6 May 2007 19:05:37 -0000	1.8
> +++ zfs_znode.c	7 Jul 2007 13:17:32 -0000
> @@ -115,7 +115,6 @@
>  		ASSERT(error == 0);
>  		zp->z_vnode = vp;
>  		vp->v_data = (caddr_t)zp;
> -		vhold(vp);
>  		vp->v_vnlock->lk_flags |= LK_CANRECURSE;
>  		vp->v_vnlock->lk_flags &= ~LK_NOSHARE;
>  	} else {
> @@ -601,7 +600,6 @@
>  			ASSERT(err == 0);
>  			vp = ZTOV(zp);
>  			vp->v_data = (caddr_t)zp;
> -			vhold(vp);
>  			vp->v_vnlock->lk_flags |= LK_CANRECURSE;
>  			vp->v_vnlock->lk_flags &= ~LK_NOSHARE;
>  			vp->v_type = IFTOVT((mode_t)zp->z_phys->zp_mode);

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd_(_at_)_FreeBSD_(_dot_)_org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
