
libkse / libthr bugs?



On Fri, 27 Jun 2003, Mike Makonnen wrote:
> On Fri, 27 Jun 2003 00:26:33 -0400 (EDT)
> Daniel Eischen <eischen_(_at_)_vigrid_(_dot_)_com> wrote:
> 
> > 
> > Signal handling and locks (low-level, CV, mutex, etc) are
> > somewhat difficult to deal with, especially when there are
> > mutexes in libc that the application doesn't even see.
> > 
> 
> Heh. no kidding :-)
> 
> > In general, signals can't be deferred around big locks
> > (mutexes, CVs, rwlocks, etc), but may be around low-level
> > locks (which is what I think your patch is doing).
> 
> Correct. It wasn't designed to solve Marcel's problem. It was to narrow the
> possible culprits down.
> 
> > It is valid for an application to have a thread blocked
> > in pthread_mutex_lock(), pthread_cond_timedwait(), etc,
> > receive a signal.  The signal handler should run, but
> > those functions should not return EINTR; they should
> > continue blocking when the handler returns.
> >
> > It is also valid for a thread to be blocked in fwrite()
> > (or some other libc function that has locking) and
> > receive a signal.
> 
> Yes, this is my understanding as well-- unless they attempt to use a pthreads
> facility or call non-async safe libc routines from signal handlers, in which
> case all bets are off. I believe the pthread_cancel() family of functions is the
> only one that's guaranteed to be async safe.
> 
> 
> > In either case, you also have to handle the thread
> > _not_ returning normally; it could [sig]longjmp() or
> > setcontext() out of the locked area.  So if you are keeping
> > any internal queues for mutexes, CVs, etc, you have
> > to ensure the thread is removed from the queue before
> > the signal handler is invoked, and then reinsert the
> > thread back into the queue if the signal handler
> > returns normally.
> 
> This is what I don't understand very well. Is it necessary in a 1:1 library? I
> can understand this sort of behind-the-scenes manipulation in libc_r or libkse,
> where the uts might need to be protected from those kinds of situations. But for
> libthr, where all scheduling and signaling is taken care of in the kernel I
> would be inclined to say "all bets are off." As I wrote earlier, it's my
> understanding that the only pthreads routines you can rely on to be async safe
> are pthread_cancel() and friends.
> 
> I admit I could be grossly misunderstanding the situation, in which case I
> would really appreciate a clarification.

To answer your first question, yes, I think it's necessary.
Here's an example:

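	static jmp_buf env;	/* Filled in by setjmp() in somenslookup(). */

	void
	sigalarmhandler(int sig)
	{
		/* Jump out of whatever was interrupted; it never returns normally. */
		longjmp(env, 1);
	}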

	int
	somenslookup(...)
	{
		...
		alarm(5);	/* Wait a maximum of 5 seconds. */
		if (setjmp(env) == 0) {
			nsdispatch(...);
			return (0);
		}
		return (-1);
	}

Now perhaps this isn't the best example; imagine using
something that uses malloc()/free() or any one of the other
locked libc functions.  Application-level locks should
behave the same way, but by using a libc example I avoid
any argument about "the application shouldn't be doing
that" :-)

It's also possible that the thread calls the same set
of libc functions again, and if it isn't removed from
the internal mutex queue, then you'd get the exact
error message Marcel was seeing (already on mutexq).
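
To make the remove/reinsert idea concrete, here is a minimal
sketch under some loud assumptions: it is a single-threaded toy
model, the "signal" is just a direct function call, and every
name in it (toy_thread, toy_mutex, mutexq_insert, and so on) is
hypothetical and not taken from libthr or libkse.  It only shows
the ordering: take the thread off the wait queue before the
handler runs, and put it back only if the handler returns.

```c
#include <setjmp.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; not the real libthr/libkse structures. */
struct toy_thread {
	struct toy_thread *next;
	bool on_mutexq;
};

struct toy_mutex {
	struct toy_thread *waiters;	/* singly linked list of waiters */
};

static jmp_buf env;

/* Returns -1 for the "already on mutexq" condition. */
static int
mutexq_insert(struct toy_mutex *m, struct toy_thread *t)
{
	if (t->on_mutexq)
		return (-1);
	t->next = m->waiters;
	m->waiters = t;
	t->on_mutexq = true;
	return (0);
}

static void
mutexq_remove(struct toy_mutex *m, struct toy_thread *t)
{
	struct toy_thread **pp;

	for (pp = &m->waiters; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == t) {
			*pp = t->next;
			t->next = NULL;
			t->on_mutexq = false;
			return;
		}
	}
}

/* Run a handler on behalf of a thread that is waiting on m. */
static int
run_handler_while_waiting(struct toy_mutex *m, struct toy_thread *t,
    void (*handler)(int), int sig)
{
	mutexq_remove(m, t);		/* The handler may never return. */
	handler(sig);
	return (mutexq_insert(m, t));	/* It returned: resume waiting. */
}

/* A handler that leaves the locked area, as in the alarm() example. */
static void
jumping_handler(int sig)
{
	(void)sig;
	longjmp(env, 1);
}

/* A handler that returns normally. */
static void
returning_handler(int sig)
{
	(void)sig;
}

/*
 * Block on m, then have a "signal" arrive.  Returns 1 if the handler
 * jumped out, 0 if the thread is waiting again, -1 on a queue error.
 */
static int
simulate_interrupted_wait(struct toy_mutex *m, struct toy_thread *t,
    void (*handler)(int))
{
	if (setjmp(env) != 0)
		return (1);
	if (mutexq_insert(m, t) != 0)
		return (-1);
	return (run_handler_while_waiting(m, t, handler, SIGALRM));
}
```

Calling simulate_interrupted_wait() twice in a row with
jumping_handler succeeds both times in this model, because the
thread is already off the queue when the handler jumps away.  If
the mutexq_remove() call were dropped, the second call would hit
the -1 path in mutexq_insert(), which is the toy equivalent of
the error above.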

-- 
Dan Eischen