DragonFly kernel List (threaded) for 2005-02
Re: rc and smf
Dan Melomedman wrote:
Joerg Sonnenberger wrote:
Actually, this is exactly one of the situations where I don't want
automatic, silent restarts. It hides problems, which is in my position
even more problematic. "Magic restart" doesn't solve every problem.
Nothing solves every problem. Supervision solves the 'Oops, something
crashed, and needs to be restarted' problem. If my nearby nuclear power
plant's reactor monitoring software running on a Unix box gets killed
due to a memory leak, I want it restarted immediately rather than waiting
until the administrator finds out, by which time the reactor has melted down.
No you do not.
What you DO want, when *any* fault occurs of that nature, is for a
totally separate system - usually a 'state machine' - or even *gravity*
to take over and 'safe' that plant until the real cause is scrutinized
by a team of experts. Too much is at stake to blindly restart a daemon
OR the OS.
Unix has no more business running nuke power plants than Windows. That
is specialized RT OS ground. Or state machines monitored by specialized
computers. Or both.
tolerant systems have some kind of supervision in software.
All seriously critical ones have hardware / firmware fall-backs and
manual overrides as well.
All failures, be they at an oil refinery, chemical plant, power plant, or
web and mail servers, *should* be brought to human attention, examined and
attended to by folks with brains. That way we can fix them, not be
victims of them.