DragonFly BSD
DragonFly kernel List (threaded) for 2005-02

Re: rc and smf


From: Bill Hacker <wbh@xxxxxxxxxxxxx>
Date: Fri, 25 Feb 2005 03:03:20 +0800

Dan Melomedman wrote:

Joerg Sonnenberger wrote:

Actually, this is exactly one of the situations where I don't want
automatic, silent restarts. It hides problems, which is in my position
even more problematic. "Magic restart" doesn't solve every problem.

Joerg


Nothing solves every problem. Supervision solves the 'Oops, something
crashed, and needs to be restarted' problem. If my nearby nuclear power
plant's reactor monitoring software running on a Unix box gets killed
due to a memory leak, I want it restarted immediately, not wait for the
administrator to find out by the time the reactor melts down.

No you do not.


What you DO want, when *any* fault of that nature occurs, is for a totally separate system - usually a 'state machine' - or even *gravity* to take over and 'safe' the plant until a team of experts has scrutinized the real cause. Too much is at stake to blindly restart a daemon OR the OS.

Unix has no more business running nuclear power plants than Windows does. That is specialized real-time OS territory. Or state machines monitored by specialized computers. Or both.

All fault-tolerant systems have some kind of supervision in software.

All seriously critical ones have hardware / firmware fall-backs and manual overrides as well.


All failures, be they at oil refineries, chemical plants, power plants, or web and mail servers, *should* be brought to human attention, examined, and attended to by folks with brains. That way we can fix them, not be victims of them.

Bill


