DragonFly BSD
DragonFly bugs List (threaded) for 2011-04
[Date Prev][Date Next]  [Thread Prev][Thread Next]  [Date Index][Thread Index]

Re: panic: assertion: p->p_lock == 0 in kern_wait

From: Matthew Dillon <dillon@xxxxxxxxxxxxxxxxxxxx>
Date: Thu, 21 Apr 2011 11:36:10 -0700 (PDT)

:> After poking here and the, I think this KKASSERT() can simply go away
:> as proc_remove_zombie() will wait for p->p_lock to drop to zero anyway.
:The following is what I have in my tree.  What it does is to hold proc_token
:while waiting for p->p_lock to drop, just as done in proc_remove_zombie().
:If I don't hold the proc_token as in the first chunk, I see the
:  proc_remove_zombie: waiting for %p->p_lock to drop
:message a few times every hour on the console.  I guess it may also
:imply that the race is between a code which holds proc_token for
:a long time but not p->p_token, like all*_scan().

    It looks good, I would make two changes.   One would be to shortcut
    the case where p->p_lock is already 0 to avoid unnecessary contention
    with proc_token in the critical exit path.

    if (p->p_lock) {
	    while (p->p_lock)
		    tsleep(p, 0, "reap3", hz);

:@@ -661,6 +661,7 @@ proc_remove_zombie(struct proc *p)
: {
: 	lwkt_gettoken(&proc_token);
: 	while (p->p_lock) {
:+		kprintf("%s: waiting for %p->p_lock to drop\n", __func__, p);
: 		tsleep(p, 0, "reap1", hz / 10);
: 	}
: 	LIST_REMOVE(p, p_list); /* off zombproc */

     This one may unnecessarily spam the kprintf since the wait is 1/10
     of a second.  Maybe conditionalize it with a variable so it only issues
     one kprintf().


     With regards to getting rid of the timeout in the tsleep and using a
     proactive wakeup(), we have to avoid calling wakeup() for 1->0
     transitions unless someone is known to be waiting on p_lock.  This
     can be implementing by adding a WAITING flag to the field and using
     atomic_cmpset_int() to handle the (WAITING | 1) -> (0) transition and
     then calling wakeup() if WAITING was set.

     I will augment the sys/refcount.h API and add refcount_wait() and
     refcount_release_wakeup() which encapsulate the appropriate atomic
     ops.  I will leave it up to you if you want to then use the new API
     functions for PHOLD/PRELE, which would give the tsleep case a
     proactive wakeup() instead of having to wait for it to timeout.

					Matthew Dillon 

[Date Prev][Date Next]  [Thread Prev][Thread Next]  [Date Index][Thread Index]