DragonFly commits List (threaded) for 2005-07
Re: cvs commit: src/sys/kern uipc_socket.c uipc_socket2.c src/sys/sys socketvar.h
:Matthew Dillon wrote:
:> Fix a sockbuf race. Currently the m_free*() path can block, due to
:> objcache_put() blocking when it must access the global depot. This breaks
:> the critical section *AND the BGL during a time when the sockbuf state is
:> inconsistent. Another process accessing the same sockbuf would then
:> corrupt it. Since depot access is fairly rare, this bug typically required
:> a number of hours to reproduce.
:Would it work better if we would pass down flags to objcache_put, which
:can indicate that blocking is bad, and then either spin with trytoken or
:plainly destruct the object directly?
There are several problems here. First and foremost, the term
'atomicy' has a relative meaning here. Even the totally non-blocking
free() path has to send an IPI message some times and this can result
in the processing of incoming IPIS under certain circumstances, so
we wind up walking a very fine line over what we mean by 'atomic operation'
when we call m_free() or free(). IPI interactions are an issue even AFTER
we've fixed the BGL and blocking issues.
Jeff has suggested that mainline code (aka read()) use an IPI message
to obtain atomic access to the sockbuf interlocks against the protocol
stack's access. That would certainly work but I think it could seriously
impact performance. We might just have to redesign the sockbuf code,
perhaps by giving it a fronting FIFO using tail-chasing indexes so the
locking only occurs between competing read()'s rather then between a
read() and the protocol stack (or a write() and the protocol stack).
An option for the m_free*() path would be to guarentee that objcache_put
is non-blocking at all times (I think passing a flag is a bit hokey, I'd
rather just guarentee it). Such a guarentee is possible to make if we
do not access the underlying depot and instead just place the free mbuf
on a per-cpu list which a helper thread moves to the depot, or something
like that. But even here there are side-issues in m_free() itself, in
particular the fact that it has to free sidebar structures such as
In anycase, it requires a lot of thought which is one of the reasons why
the quick hack goes in now.