[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[patch] telldir(), etc: prevent memory leak



Hello all,

I'm kind of new to this mailing list, so please don't be too quick
to flame. Having said that, I'm certainly open to honest critique and feedback.


Are patches offered here automatically considered for future releases
of OpenBSD?

History:

Noted when running smbd (samba suite), would encounter large memory
leaks for certain operations, especially noticeable on larger directories. Tracked to use of telldir() and seekdir() within smbd.


See also:
  http://marc.theaimsgroup.com/?l=openbsd-misc&m=114213950013512&w=2

Basically, doing something like:

	struct dirent	*direntry;
	DIR *d = opendir("/my/big/directory/path");

	while (direntry = readdir(d))
	{
		long cookie = telldir(d);
		...
	}
	closedir(d);

Would leak 16 bytes for each entry in the referenced directory.

Arguably, smbd could probably be redesigned and patched to avoid the
behaviour of causing the memory leak; but, it seemed like telldir()
should also be recoded to safeguard against the memory leak, making
OpenBSD more resistant to this kind of usage of telldir() across
userland processes.

Solution:

Free telldir() cache when the directory is closed via closedir().

Basically, a variation on that proposed here:
  http://mail-index.netbsd.org/netbsd-bugs/2004/02/05/0008.html

I specifically didn't want to change the overall prototypes for
telldir() and struct _dirdesc, though if the overall opinion is
that a linked list attached to (DIR *) is a better way to go,
I'll be happy to rearrange the code to accomodate.

On application of the patch and rebuild of libc, the memory leak
attributable to telldir() disappeared from smbd (at least it did for
me), and testing using independent test code for the conditions outlined
above showed no leak. I can provide source for the tests if it is
desired.

Caveats:

Removal of phrasing in man page specifying that a telldir() offset
can be used after a closedir() as long as a readdir() isn't called.
When the patched code cleans up memory after the closedir(), all
previous telldir() values become invalidated. This patch will
probably break any code that is relying on this behaviour.

As far as I can tell, the POSIX standards for telldir() and
seekdir() do not specify accessibility after a re-open as necessary
or desired behaviour.

Patch:

(Note: this patch is cvs diff -u against OPENBSD_3_8_BASE and created
 from cwd = lib/libc/gen directory where all changes are isolated.
 If this is not standard procedure for creating patches, please let
 me know and I will re-create the patch in a different format or
 against a different cvs tag.)


Index: closedir.c =================================================================== RCS file: /cvs/src/lib/libc/gen/closedir.c,v retrieving revision 1.6 diff -u -r1.6 closedir.c --- closedir.c 2005/08/08 08:05:33 1.6 +++ closedir.c 2006/03/15 18:51:23 @@ -34,6 +34,9 @@ #include <unistd.h> #include "thread_private.h"

+/* function in telldir.c to hide underlying cache data structures */
+extern void __free_telldir(const DIR *);
+
 /*
  * close a directory.
  */
@@ -45,7 +48,10 @@

 	if ((ret = _FD_LOCK(dirp->dd_fd, FD_READ, NULL)) != 0)
 		return (ret);
-	seekdir(dirp, dirp->dd_rewind);	/* free seekdir storage */
+
+	/* free telldir storage */
+	__free_telldir(dirp);
+
 	fd = dirp->dd_fd;
 	dirp->dd_fd = -1;
 	dirp->dd_loc = 0;
Index: directory.3
===================================================================
RCS file: /cvs/src/lib/libc/gen/directory.3,v
retrieving revision 1.16
diff -u -r1.16 directory.3
--- directory.3	2003/06/02 20:18:34	1.16
+++ directory.3	2006/03/15 18:51:23
@@ -135,12 +135,6 @@
 If the directory is closed and then reopened, the
 .Fn telldir
 value may be invalidated due to undetected directory compaction.
-It is safe to use a previous
-.Fn telldir
-value immediately after a call to
-.Fn opendir
-and before any calls to
-.Fn readdir .
 .Pp
 The
 .Fn rewinddir
Index: rewinddir.c
===================================================================
RCS file: /cvs/src/lib/libc/gen/rewinddir.c,v
retrieving revision 1.7
diff -u -r1.7 rewinddir.c
--- rewinddir.c	2005/08/08 08:05:34	1.7
+++ rewinddir.c	2006/03/15 18:51:25
@@ -31,12 +31,10 @@
 #include <sys/types.h>
 #include <dirent.h>

-void __seekdir(DIR *, long);
-
 void
 rewinddir(DIR *dirp)
 {

-	__seekdir(dirp, dirp->dd_rewind);
+	seekdir(dirp, dirp->dd_rewind);
 	dirp->dd_rewind = telldir(dirp);
 }
Index: telldir.c
===================================================================
RCS file: /cvs/src/lib/libc/gen/telldir.c,v
retrieving revision 1.7
diff -u -r1.7 telldir.c
--- telldir.c	2005/08/08 08:05:34	1.7
+++ telldir.c	2006/03/15 18:51:25
@@ -33,88 +33,250 @@
 #include <stdlib.h>
 #include <unistd.h>

-/*
- * The option SINGLEUSE may be defined to say that a telldir
- * cookie may be used only once before it is freed. This option
- * is used to avoid having memory usage grow without bound.
+/* + * Caches the directory position details every time telldir() is
+ * called. This way, we can validate on seekdir() and also return
+ * a previously created "cookie" if called from the same directory
+ * location.
*/
-#define SINGLEUSE
+struct _dirpos {
+ long _dpseek; /* magic cookie returned by getdirentries */
+ long _dploc; /* offset of entry in buffer */
+};


/*
- * One of these structures is malloced to describe the current directory
- * position each time telldir is called. It records the current magic - * cookie returned by getdirentries and the offset within the buffer
- * associated with that return value.
- */
-struct ddloc {
- struct ddloc *loc_next;/* next structure in list */
- long loc_index; /* key associated with structure */
- long loc_seek; /* magic cookie returned by getdirentries */
- long loc_loc; /* offset of entry in buffer */
+ * To avoid changing the prototype and definition of struct dirdesc
+ * in dirent.h, we define a linked list of directories, indexed by + * file descriptor held statically, as was done historically in
+ * a (rather mem-leak prone) hash construct.
+ *
+ * Theoretically, we could remove some of the complexity here by
+ * attaching a linked list of _dirpos structs to the DIR * (and
+ * arguably, that would be more ... intuitive), but for large
+ * directories with many telldir() calls, traversing the list
+ * can get time and resource consuming.
+ *
+ * Alternately, we could attach the telldir cache to struct dirdesc (DIR *),
+ * instead of holding it statically here, but that would mean losing + * some control over implementation details and possibly expose the + * caller to caching structures that should be fully internal.
+ *
+ * DIRTELL_CACHE_BLOCK controls the block factor for memory directory cache
+ * reallocation. A higher value will make telldir() run faster at the + * expense of transitory memory usage. A smaller value will make + * telldir() run slower (more reallocs) but waste less memory.
+ */
+#define DIRTELL_CACHE_BLOCK (32)
+struct _dirtell {
+ int fd; /* directory file descriptor */
+ long cache_size; /* number of _dirpos allocated */
+ long cache_entries; /* number of _dirpos used in dlist */
+ struct _dirpos *dlist; /* ptr to cache */
+ struct _dirtell *dnxt; /* ptr to next _dirtell */
};
-
-#define NDIRHASH 32 /* Num of hash lists, must be a power of 2 */
-#define LOCHASH(i) ((i)&(NDIRHASH-1))


-static long	dd_loccnt;	/* Index of entry for sequential readdir's */
-static struct	ddloc *dd_hash[NDIRHASH];   /* Hash list heads for ddlocs */
-
-void __seekdir(DIR *, long);
+/*
+ * Pointer into our internal directory tell cache.
+ * Allocated via telldir(), const used by seekdir(), freed by __free_telldir()
+ */
+static struct _dirtell *_dirtells = NULL;

/*
+ * telldir()
+ *
* return a pointer into a directory
*/
long
telldir(const DIR *dirp)
{
- int index;
- struct ddloc *lp;
+ long index;
+ struct _dirtell *tp;
+ struct _dirpos *lp;
+
+ /*
+ * Find the _dirtell structure for this directory.
+ */
+ tp = _dirtells;
+ while (tp != NULL)
+ { + if (tp->fd == dirp->dd_fd)
+ { break;
+ }
+ tp = tp->dnxt;
+ }


- if ((lp = (struct ddloc *)malloc(sizeof(struct ddloc))) == NULL)
- return (-1);
- index = dd_loccnt++;
- lp->loc_index = index;
- lp->loc_seek = dirp->dd_seek;
- lp->loc_loc = dirp->dd_loc;
- lp->loc_next = dd_hash[LOCHASH(index)];
- dd_hash[LOCHASH(index)] = lp;
+ /*
+ * If there isn't a _dirtell, then allocate and initialise one.
+ */
+ if (tp == NULL)
+ {
+ if ((tp = (struct _dirtell *) malloc(sizeof(struct _dirtell))) == NULL)
+ { return(-1);
+ }
+
+ tp->fd = dirp->dd_fd;
+ tp->dlist = NULL;
+ tp->cache_size = 0;
+ tp->cache_entries = 0;
+ tp->dnxt = _dirtells;
+ _dirtells = tp;
+ }
+
+ /*
+ * Find the directory entry details.
+ */
+ lp = tp->dlist;
+ for (index = 0; index < tp->cache_entries; index++)
+ {
+ if (lp->_dpseek == dirp->dd_seek &&
+ lp->_dploc == dirp->dd_loc)
+ {
+ /* made before, reuse */
+ return index;
+ }
+
+ lp++;
+ }
+ + /*
+ * Allocate detail memory, if this is the first time through ...
+ * Reallocate space if the entries have hit the limit of the
+ * telldir() cache for this directory.
+ */
+ if (tp->cache_entries == tp->cache_size)
+ { + size_t m_size = + (tp->cache_size + DIRTELL_CACHE_BLOCK) * sizeof(struct _dirpos);
+
+ tp->dlist = (struct _dirpos *) (lp == NULL ?
+ malloc(m_size) : realloc(tp->dlist, m_size));
+ if (tp->dlist == NULL)
+ { return(-1);
+ }
+ tp->cache_size += DIRTELL_CACHE_BLOCK; + lp = tp->dlist + index;
+ }
+
+ lp->_dpseek = dirp->dd_seek;
+ lp->_dploc = dirp->dd_loc;
+ tp->cache_entries++;
+
return (index);
}


/*
- * seek to an entry in a directory.
- * Only values returned by "telldir" should be passed to seekdir.
+ * __free_telldir
+ *
+ * Internal. Should be called just prior to freeing the DIR * structure,
+ * in closedir().
+ *
+ * Frees all internal memory associated with any telldir() calls against
+ * dirp.
*/
-void
-__seekdir(DIR *dirp, long loc)
+void __free_telldir(const DIR *dirp)
{
- struct ddloc *lp;
- struct ddloc **prevlp;
+ struct _dirtell *tp, **link;
+
+ /*
+ * Find the _dirtell structure for this directory.
+ * Keep track of who is pointing to it so we can easily remove the
+ * structure from the linked list later.
+ */
+ tp = _dirtells;
+ link = &_dirtells;
+ while (tp != NULL)
+ { + if (tp->fd == dirp->dd_fd)
+ { break;
+ }
+ link = &(tp->dnxt);
+ tp = tp->dnxt;
+ }
+
+ /*
+ * If there isn't one, then there is nothing to free()
+ */
+ if (tp == NULL)
+ return;
+
+ /*
+ * Release the memory for the telldir() details cache
+ */
+ if (tp->dlist != NULL)
+ free((void *) tp->dlist);
+
+ /*
+ * Remove the directory entry from _telldirs linked list
+ */
+ *link = tp->dnxt;
+ + /*
+ * free() up our _telldir struct memory
+ */
+ free((void *) tp);
+}
+
+/*
+ * __seekdir
+ *
+ * Internal. Implements seekdir(). See seekdir.c.
+ */
+void __seekdir(DIR *dirp, long offset)
+{
+ struct _dirtell *tp;
+ struct _dirpos *lp;
struct dirent *dp;


- prevlp = &dd_hash[LOCHASH(loc)];
- lp = *prevlp;
- while (lp != NULL) {
- if (lp->loc_index == loc)
- break;
- prevlp = &lp->loc_next;
- lp = lp->loc_next;
+ /*
+ * Find the _dirtell structure for this directory.
+ */
+ tp = _dirtells;
+ while (tp != NULL)
+ { + if (tp->fd == dirp->dd_fd)
+ { break;
+ }
+ tp = tp->dnxt;
}
- if (lp == NULL)
+
+ /*
+ * Can't seek if telldir() hasn't been called.
+ * Check basic sanity of "offset"
+ */
+ if (tp == NULL || tp->dlist == NULL || + tp->cache_entries <= offset || offset < 0)
return;
- if (lp->loc_loc == dirp->dd_loc && lp->loc_seek == dirp->dd_seek)
- goto found;
- (void) lseek(dirp->dd_fd, (off_t)lp->loc_seek, SEEK_SET);
- dirp->dd_seek = lp->loc_seek;
+
+ /*
+ * Find the directory entry details.
+ * We can't really protect against someone using an index beyond the
+ * basic sanity checks above. If the offset is even marginally legit, + * we seek!
+ *
+ * Note also that we are using offset here as a pointer into the
+ * directory entry details cache, not necessarily as an indicator
+ * as to the entry position in the directory itself. "offset" is
+ * really an internal "cookie" and should be treated as such by the
+ * caller. See POSIX standards for seekdir() and telldir().
+ */
+ lp = tp->dlist + offset;
+
+ /*
+ * If we are already there, ignore the seek
+ */
+ if (lp->_dploc == dirp->dd_loc && lp->_dpseek == dirp->dd_seek)
+ return;
+
+ /*
+ * Return to the offset spec as provided by the directory details
+ * from the previous telldir().
+ */
+ dirp->dd_seek = lseek(dirp->dd_fd, (off_t) lp->_dpseek, SEEK_SET);
dirp->dd_loc = 0;
- while (dirp->dd_loc < lp->loc_loc) {
- dp = readdir(dirp);
- if (dp == NULL)
+ while (dirp->dd_loc < lp->_dploc) + {
+ if (readdir(dirp) == NULL)
break;
}
-found:
-#ifdef SINGLEUSE
- *prevlp = lp->loc_next;
- free((caddr_t)lp);
-#endif
}
+