Discussion:
lagg: a race between ioctl and clone_destroy
(too old to reply)
Andriy Gapon
2018-11-30 12:27:45 UTC
Permalink
I am working on analyzing a crash with the following stack trace:
_sx_slock_hard lagg_media_status ifmedia_ioctl lagg_ioctl ifioctl kern_ioctl
sys_ioctl
It appears that the crash happened because of a destroyed sx lock.
Please note that the code base is the CURRENT from just before the time when the
network stack started to use the epoch mechanism.

My theory is that the following race happened.
lagg_clone_destroy() and lagg_ioctl() were called concurrently.
lagg_clone_destroy() won a race to lock sc_sx while lagg_media_status() got
blocked on it. I think that after some adaptive spinning the thread was placed
on a sleep queue. Then, lagg_clone_destroy() unlocked the lock and proceeded to
destroy it. After the lagg_media_status() thread was waken up it found the lock
in the destroyed state and crashed on it in a typical fashion (trying to
dereference a NULL pointer as a struct thread pointer).

My knowledge of the network stack is rather shallow, so I have a few questions.

Q1. Is the described race plausible?

Q2. If yes, then has this class[*] of races been fixed by the epoch mechanism?
I suspect that lagg_ioctl() can still have that race if it's called concurrently
with lagg_clone_destroy().

Q3. Is there a way to protect against this type of a race in the code from
before the epoch mechanism?
I think that the safest thing to do would be to free/destroy the softc only
after if_refcount goes to zero, but I could not find any callback for that.
That is, I think that this code in if_free() ensures that ifp stays valid as
long as it's referenced and it's being accessed under the epoch protection:
if (refcount_release(&ifp->if_refcount))
epoch_call(net_epoch_preempt, &ifp->if_epoch_ctx, if_destroy);

But the code in lagg_clone_destroy() would destroy the softc without waiting on
anything:
LAGG_XUNLOCK(sc);

ifmedia_removeall(&sc->sc_media);
ether_ifdetach(ifp);
if_free(ifp); <---- ifp can still be used and valid after this

LAGG_LIST_LOCK();
SLIST_REMOVE(&V_lagg_list, sc, lagg_softc, sc_entries);
LAGG_LIST_UNLOCK();

LAGG_SX_DESTROY(sc);
free(sc, M_DEVBUF); <---- but sc is immediately destroyed in any case
}

[*] My impression is that the problem is (or, at least, was) not limited to
lagg. I think that other drivers can also have a similar race between a
clone_destroy callback and an operation on an interface that needs to access a
softc.
--
Andriy Gapon
Andriy Gapon
2018-12-03 12:34:32 UTC
Permalink
Post by Andriy Gapon
Q1. Is the described race plausible?
Q2. If yes, then has this class[*] of races been fixed by the epoch mechanism?
I suspect that lagg_ioctl() can still have that race if it's called concurrently
with lagg_clone_destroy().
So, to re-iterate, I think that the code currently allows for the following race
with respect to drivers that use if_clone_simple mechanism (at least).

Thread T1 calls ifioctl with, for instance, SIOCGIFMEDIA parameter.
T1 calls ifunit_ref(), finds the interface with !(if_flags & IFF_DYING).
T1 increments the interface's reference count and proceeds to call ifhwioctl()
and then the interface's (driver's) if_ioctl method.

Enter thread T2.
T2 calls ifioctl(SIOCIFDESTROY) on the same interface.
T2 invokes if_clone_destroy() that looks up the interface by name and increments
its reference count.
Then, T2 calls if_clone_destroyif() that calls ifc_simple_destroy().
The latter calls ifcs_destroy method on the interface.

From here on we consider driver-specific code that, obviously, varies from
driver to driver. But after having reviewed a handful of drivers that use
if_clone_simple I see that all of them have the same pattern.

So, T2 calls ifcs_destroy.
A driver's ifcs_destroy would handle its internal state.
Then, it would typically call if_free() on the interface.
Since the interface at this point has multiple outstanding references, including
one taken by T2 itself, it is not actually freed. It's just marked as
IFF_DYING. Also, its reference count is decremented by one, so that it can be
actually freed after T2 and T1 release their references.

Afterwards, ifcs_destroy would typically free if_softc.

At this point the driver's if_ioctl method is being executed by T1.
The method can access if_softc that has been freed by now.
So, that's the race.

Any internal locking on the driver level does not help, because the lock would
be destroyed and freed together with the if_softc in ifcs_destroy. So, if_ioctl
attempting to get that lock is the same kind of the problem.

Any thoughts and suggestions?

My idea is that there should be something like 'if_freed' or 'if_destroyed'
method that could be used by drivers for their cleaning up. That method would
be called from if_destroy() when we really know that the interface has no
references and, thus, no threads are accessing it. Then it must really be safe
to destroy the softc as well.
How does this sound?

Thanks!
--
Andriy Gapon
Andriy Gapon
2018-12-03 13:28:00 UTC
Permalink
Post by Andriy Gapon
My idea is that there should be something like 'if_freed' or 'if_destroyed'
method that could be used by drivers for their cleaning up. That method would
be called from if_destroy() when we really know that the interface has no
references and, thus, no threads are accessing it. Then it must really be safe
to destroy the softc as well.
How does this sound?
This is what I meant, in C: https://reviews.freebsd.org/D18420
--
Andriy Gapon
Loading...