DragonFly users List (threaded) for 2004-10

Re: Support for Cluster of SMP (or workstation) in DragonFly

From:	"Noah Yan" <yanyh@xxxxxxxxx>
Date:	Thu, 14 Oct 2004 11:03:28 -0600

Thanks very much for your answer, Matt.

To continue my first question and your answer (maybe I did not get your
points fully): current high-performance clusters show a trend in which the
compute nodes do not run a general-purpose OS but a lightweight kernel
dedicated to computation and messaging (no file system), while a few
dedicated I/O nodes run a full-featured general-purpose OS such as Linux for
file access. On the compute nodes, the run-time systems for parallel
programming models (MPI, PVM, etc.) and the batch scheduler (LSF, PBS)
are installed and should be tightly integrated with the kernel (although
that is not currently the case). For scalability, it is very important to
optimize global synchronization and co-scheduling for parallel applications
whose processes may number in the hundreds or tens of thousands. I am very
interested in addressing these issues from the kernel up. Are these issues
considered in the design of these new features? I know it is hard, perhaps
impossible, to cover every possible situation at this level, yet I do
hope that these issues are addressed.
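
To make the global-synchronization point concrete, here is a minimal sketch
of a barrier, the basic primitive such applications rely on. It is only an
illustration under loud assumptions: POSIX threads on one machine stand in
for the processes of a parallel job, and every name in it (barrier_wait,
run_barrier_demo, and so on) is hypothetical, not a DragonFly or MPI
interface.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Illustrative only: threads on one machine stand in for the processes of
 * a parallel job.  All names are hypothetical, not DragonFly or MPI APIs. */
struct barrier {
    pthread_mutex_t lock;
    pthread_cond_t  released;
    int             parties;    /* participants per round */
    int             arrived;    /* participants waiting in this round */
    unsigned        generation; /* bumped each time the barrier opens */
};

static void barrier_init(struct barrier *b, int parties)
{
    pthread_mutex_init(&b->lock, NULL);
    pthread_cond_init(&b->released, NULL);
    b->parties = parties;
    b->arrived = 0;
    b->generation = 0;
}

static void barrier_wait(struct barrier *b)
{
    pthread_mutex_lock(&b->lock);
    unsigned gen = b->generation;
    if (++b->arrived == b->parties) {
        /* The last arrival opens the barrier for everyone. */
        b->arrived = 0;
        b->generation++;
        pthread_cond_broadcast(&b->released);
    } else {
        /* The loop guards against spurious wakeups. */
        while (gen == b->generation)
            pthread_cond_wait(&b->released, &b->lock);
    }
    pthread_mutex_unlock(&b->lock);
}

struct worker {
    struct barrier *b;
    int *phase;      /* shared: last phase finished by each worker */
    int id, nworkers, rounds, violations;
};

static void *worker_main(void *arg)
{
    struct worker *w = arg;
    for (int r = 1; r <= w->rounds; r++) {
        w->phase[w->id] = r;      /* stand-in for the compute phase */
        barrier_wait(w->b);       /* global synchronization point */
        for (int i = 0; i < w->nworkers; i++)
            if (w->phase[i] != r) /* a peer is not in phase r */
                w->violations++;
        barrier_wait(w->b);       /* nobody starts r+1 while peers check */
    }
    return NULL;
}

/* Returns the number of out-of-phase observations (0 expected),
 * or -1 if memory allocation fails. */
int run_barrier_demo(int nworkers, int rounds)
{
    struct barrier b;
    pthread_t *tids = calloc(nworkers, sizeof *tids);
    struct worker *ws = calloc(nworkers, sizeof *ws);
    int *phase = calloc(nworkers, sizeof *phase);
    int total = -1;

    if (tids != NULL && ws != NULL && phase != NULL) {
        barrier_init(&b, nworkers);
        for (int i = 0; i < nworkers; i++) {
            ws[i] = (struct worker){ &b, phase, i, nworkers, rounds, 0 };
            int rc = pthread_create(&tids[i], NULL, worker_main, &ws[i]);
            assert(rc == 0);
            (void)rc;
        }
        total = 0;
        for (int i = 0; i < nworkers; i++) {
            pthread_join(tids[i], NULL);
            total += ws[i].violations;
        }
        pthread_cond_destroy(&b.released);
        pthread_mutex_destroy(&b.lock);
    }
    free(phase);
    free(ws);
    free(tids);
    return total;
}
```

Calling run_barrier_demo(4, 50) runs four workers through fifty phases.
Every participant serializes on the single lock here, which is exactly the
kind of cost that grows with the number of processes and that I would like
to see addressed at the kernel level.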

One more question (just for discussion, and it may be more suitable for the
kernel group): is it possible to organize the OS in three levels: kernel
level, system level, and user level (like Minix)? The system level is
introduced to address the issues raised by OS clustering. Every context
switch (or system call) into the kernel level is heavy, but switches between
the user and system levels are lightweight. So in clustering, all cross-box
operations (mostly message passing) happen at the system level with a
lightweight context switch. In this way we can reduce the performance
overhead by avoiding the cost of a context switch into the kernel level. The
system level performs basic protection, which is less costly than
kernel-level protection but still differs from user-land operation, which
has no protection at all.

Thanks very much
Noah Yan


"Matthew Dillon" <dillon@xxxxxxxxxxxxxxxxxxxx> wrote in message 
news:200410140243.i9E2hieY001179@xxxxxxxxxxxxxxxxxxxxxxx
>
> :Hi, all
> :
> :I am new to DragonFly and just found this very interesting project. I read
> :some docs about it but still have several questions. (Some may be too
> :out of scope or too naive; please forgive me.)
> :
> :1. In the initial motivation, DFL seems to target SMP machines (memory is
> :hardware-shared in most cases, from my understanding). So is there any
> :consideration of how those ideas (like LWKT, IPI, messaging) support
> :clusters of SMPs (or workstations), where message passing is the most
> :common programming model, and co-scheduling of parallel applications and
> :global synchronization across the cluster are major issues?
>
>    DragonFly is designed to operate efficiently on both UP and SMP
>    machines.  That is, the algorithms we are choosing should theoretically
>    work equally well on either type of platform.  The ultimate goal is to
>    implement a transparent clustering capability.  This does not yet exist.
>    Likewise, many of the MP algorithms are in place but much of the code
>    remains under the Big Giant Lock, and will continue to remain under
>    that lock until we mostly finish the major infrastructure work (e.g.
>    VFS, which I am working on now).  We have to finish what subsystem
>    threading we intend to do before we start worrying about undoing the
>    Big Giant Lock.
>
> :2. Are there any benchmarks or performance evaluations of DFL against
> :other Unixes?
>
>    We haven't progressed far enough for benchmarks to really be meaningful,
>    but at the moment our performance is similar to FreeBSD-4.x on UP
>    machines, and hopefully a little better than 4.x on SMP machines.  On
>    SMP boxes we beat FreeBSD-5 and 6 on some things, they beat us on other
>    things, though I don't expect it to stay that way.  On UP boxes we are
>    far more efficient than FreeBSD-5/6.
>
>    For filesystem operations we are still using UFS, so our performance
>    will be roughly similar to FreeBSD-4, which is to say better than
>    FreeBSD-5 but not as good as Linux (at least not for directory
>    operations).  I think our NFS performance is very good, even when
>    compared with Linux.
>
> :3. How would introducing these new features affect the application
> :programmer?
>
>    The (not yet written) clustering is intended to be as transparent as
>    possible, so I'm hoping that the app programmer will simply be able
>    to rfork()/clone() as per normal... what will matter will be how much
>    cross-thread pollution there winds up actually being.  That will
>    ultimately govern how easily a program can be distributed in a
>    clustered environment.
>
> :4. Is there any consideration in DFL of taking advantage of new
> :architecture advancements, like hyperthreading and multi-core (or
> :chip multithreading) in P4/Xeon, UltraSPARC IV, Power 4, etc.?
>
>    Multi-core just looks like another cpu.  Hyperthreading only
>    introduces minor improvements for particular types of problems, which
>    is why Intel is getting the pollucks beat out of it by AMD.  Both
>    vendors are heading towards multi-core.  Hyperthreading is difficult
>    to deal with, so we don't really try other than to implement some minor
>    affinity.
>
> -Matt
> Matthew Dillon
> <dillon@xxxxxxxxxxxxx>
>
> :5. more may come ... :)
> :
> :Thanks very much in advance
> :Noah Yan
> :
> :
>
