DragonFly users List (threaded) for 2007-02

DragonFlyBSD Thread on osnews

From:	Jonathan Weeks <jbweeks [at] mac [dot] com>
Date:	Fri, 2 Feb 2007 16:35:58 -0800

FYI -- there was a DragonFlyBSD 1.8 announcement on osnews, with a thread discussing Linux scalability vs DragonFlyBSD, which might bear an educated response:

http://www.osnews.com/comment.php?news_id=17114&offset=30&rows=34&threshold=-1

I admit I'm not the most experienced kernel programmer in the world, but I have a few years of Linux and AIX kernel programming experience. Maybe you are more qualified, I don't know.

You say Linux scales up to 2048 CPUs, but on what kind of system?

The top end of the SGI Altix line of Linux supercomputers runs 4096 CPUs, and IBM validated Linux on a 2048-CPU System P. Linux scales to 1024 CPUs without any serious lock contention. At 2048 it shows some contention for root and /usr inode locks, but no serious performance impact. Directory traversal will be the first to suffer as we move toward 4096 CPUs and higher, so that's where the current work is focused.

Is this the same kernel I get on RHEL? Can I use this same kernel on a 4-CPU system? Which Linux version lets you mix any number of computers with any number of CPUs and treats them all as one logical computer while scaling linearly?

Choose the latest SMP kernel image from Red Hat. The feature that allows this massive scaling is called scheduler domains, introduced by Nick Piggin at version 2.6.7 (in 2004). There is no special kernel config flag or recompilation required to activate this feature, but there are some tunables you need to set (via a userspace interface) to reflect the topology of your supercomputer (i.e. grouping CPUs in a tree of domains).
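To make "grouping CPUs in a tree of domains" concrete, here is a toy model in Python. It is only a sketch of the idea and not the kernel's data structure or its userspace interface: the names `Domain`, `build_domains`, and `balance_path`, and the fan-out values, are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Domain:
    """Toy scheduling domain: a set of CPUs plus the smaller domains inside it."""
    level: int
    cpus: List[int]
    children: List["Domain"] = field(default_factory=list)


def build_domains(cpus, fanouts, level=0):
    """Split `cpus` into a tree. `fanouts` gives the number of groups per level, top first."""
    node = Domain(level=level, cpus=list(cpus))
    if not fanouts:
        return node
    groups = fanouts[0]
    if len(cpus) % groups != 0:
        raise ValueError("CPU count must divide evenly into groups")
    size = len(cpus) // groups
    for i in range(groups):
        chunk = cpus[i * size:(i + 1) * size]
        node.children.append(build_domains(chunk, fanouts[1:], level + 1))
    return node


def balance_path(root, cpu):
    """Return the domains that contain `cpu`, smallest first."""
    path = []
    node = root
    while True:
        path.append(node)
        inner = [child for child in node.children if cpu in child.cpus]
        if not inner:
            break
        node = inner[0]
    return list(reversed(path))


# Hypothetical 16-CPU machine: 2 nodes, each holding 4 pairs of CPUs.
root = build_domains(list(range(16)), [2, 4])
for domain in balance_path(root, 5):
    print(len(domain.cpus), domain.cpus)
```

The point of the tree is that work for a given CPU can be balanced inside its smallest domain first and only widened to the larger ones when needed, so most balancing decisions never touch the whole machine.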

Usually massive supercomputers are installed, configured, and tuned by the vendor. They'd probably compile a custom kernel instead of using the default RHEL image. But it could work out of the box if you really wanted it to.

...rather than rely on locking, spinning, threading processes to infinity, it will assign processes to CPUs and then allow the processes to communicate with each other through messages.

That's fine. It's just that nobody has proven that message passing is more efficient than fine-grained locking. It's my understanding (correct me if I'm wrong) that DF requires that, in order to modify the hardware page table, a process must send a message to all other CPUs and block waiting for responses from all of them. In addition, an interrupted process is guaranteed to resume on the same processor after return from interrupt even if the interrupt modified the local runqueue.

The result is that minor page faults (page is resident in memory but not in the hardware page table) become blocking operations. Plus, you have interrupts returning to threads that have become blocked by the interrupt (and must immediately yield), and the latency for waking up the highest priority thread on a CPU can be as high as one whole timeslice.
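For readers who want the two approaches side by side, here is a small userspace sketch in Python. It says nothing about how either kernel is implemented and proves nothing about performance; the function names are invented for the example. Note also that Python's `queue.Queue` uses a lock internally, so the second version only illustrates the ownership pattern (one thread owns the data, others send it requests), not a lock-free mechanism.

```python
import queue
import threading


def count_with_lock(workers, increments):
    """Shared counter: every thread updates the same data under one lock."""
    total = 0
    lock = threading.Lock()

    def work():
        nonlocal total
        for _ in range(increments):
            with lock:
                total += 1

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def count_with_messages(workers, increments):
    """Owned counter: only the owner thread touches it; others send messages."""
    inbox = queue.Queue()
    result = []

    def owner():
        total = 0
        while True:
            msg = inbox.get()
            if msg is None:  # sentinel: no more work
                break
            total += msg
        result.append(total)

    def work():
        for _ in range(increments):
            inbox.put(1)

    owner_thread = threading.Thread(target=owner)
    owner_thread.start()
    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    inbox.put(None)
    owner_thread.join()
    return result[0]


print(count_with_lock(4, 1000), count_with_messages(4, 1000))
```

Both versions reach the same total. The difference is where the waiting happens: on the lock in the first, and on the owner's inbox in the second.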

DF has serialization resources, but they are called tokens instead of locks. I'm not quite sure what the difference is. There also seems to be a highly-touted locking system that allows multiple writers to write to different parts of a file, which is interesting because Linux, FreeBSD, and even SVR4 have extent-based file locks that do the same thing. What's different about this method?
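As a userspace illustration of locking only part of a file, the sketch below uses POSIX byte-range locks through Python's `fcntl.lockf`. It shows the general idea of range-based locking as seen by applications on a Unix-like system, not DF's mechanism or any kernel's internal implementation; the helper names and the byte offsets are arbitrary choices for the example.

```python
import fcntl
import os
import tempfile


def try_lock_in_child(path, start, length):
    """Fork a child that attempts a non-blocking exclusive lock on a byte range.

    Returns True if the child obtained the lock, False if it was refused.
    """
    pid = os.fork()
    if pid == 0:
        code = 1
        try:
            fd = os.open(path, os.O_RDWR)
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, length, start, os.SEEK_SET)
            code = 0
        except OSError:
            code = 1
        finally:
            os._exit(code)  # leave the child without running parent cleanup
    _, status = os.waitpid(pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0


def demo():
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\0" * 200)
        # The parent holds an exclusive lock on bytes 0..99 only.
        fcntl.lockf(fd, fcntl.LOCK_EX, 100, 0, os.SEEK_SET)
        overlapping = try_lock_in_child(path, 50, 100)   # bytes 50..149
        disjoint = try_lock_in_child(path, 100, 100)     # bytes 100..199
        return overlapping, disjoint
    finally:
        os.close(fd)
        os.unlink(path)


print(demo())
```

The second process is refused when its range overlaps the locked bytes and succeeds when the ranges are disjoint, which is what lets two writers work on different parts of the same file at once.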

I hope I've addressed your questions adequately. Locks are evil, I know, but they seem to be doing quite well at the moment. Maybe by the time DF is ready for production use there will be machines that push other UNIX implementations beyond their capabilities. But for now, Linux is a free kernel for over a dozen architectures that scales better than some proprietary UNIX kernels do on their target architecture. That says a lot about the success of its design.

Follow-Ups:
- Re: DragonFlyBSD Thread on osnews
  - From: Petr Janda <elekktretterr@exemail.com.au>
