DragonFly users List (threaded) for 2009-03
DragonFly BSD


From: Bill Hacker <wbh@xxxxxxxxxxxxx>
Date: Tue, 03 Mar 2009 13:27:14 +0800

Dmitri Nikulin wrote:
On Tue, Mar 3, 2009 at 1:08 PM, Mag Gam <magawake@gmail.com> wrote:
I was wondering if HAMMER will ever have network-based RAID 5. After
researching several file systems, it seems HAMMER is probably the
closest to solving this problem, which would make HAMMER a pioneer.

Intuitively I highly doubt network RAID5 is worth it. Even local disk RAID5 is unusable for many workloads.

In contrast, check out some of the more flexible RAID10 modes
available in Linux:
You can get N/M effective space (N raw storage / M copies) with
RAID0-like striping for all of it. It performs very well and certainly
much better than the parity-based RAID5.
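
The N/M figure above can be put as a tiny calculation. This is only a sketch: the function name and the disk sizes are made up for illustration, and it ignores metadata overhead and uneven disk sizes.

```python
def effective_capacity(raw_total_gb: float, copies: int) -> float:
    """Usable space when every block is stored `copies` times (N raw / M copies)."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return raw_total_gb / copies


# Hypothetical example: three 500 GB disks, two copies of every block.
print(effective_capacity(3 * 500, 2))  # 750.0
```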

Imagine how RAID5 would work with network devices:

Read old data block from one server
Read parity block from another server
Generate new parity block
Write data block to one server
Write parity block to another server

All with NO atomicity guarantees, so HAMMER would have to pick up the
slack. Even in the best case you have 8x the latency of a single trip
to a machine (4 request/response pairs of 2 IOs each). All compared to
one round trip (2 IOs) to write to a plain slave, or N round trips
for N redundant copies. What is an acceptable penalty on local disks
is pretty heavy for network storage.
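
A rough sketch of the small-write sequence listed above, to make the message count concrete. The Server class is a hypothetical stand-in for a remote block store, not any real HAMMER or vinum interface; it simply counts one request and one response per operation. The parity update itself is the standard XOR read-modify-write.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


class Server:
    """Hypothetical remote block store that counts one-way network messages."""

    def __init__(self, block: bytes):
        self.block = block
        self.messages = 0

    def read(self) -> bytes:
        self.messages += 2  # one request plus one response
        return self.block

    def write(self, block: bytes) -> None:
        self.messages += 2  # one request plus one acknowledgement
        self.block = block


def raid5_small_write(data_srv: Server, parity_srv: Server, new_data: bytes) -> None:
    old_data = data_srv.read()        # read old data block from one server
    old_parity = parity_srv.read()    # read parity block from another server
    # new parity = old parity XOR old data XOR new data
    new_parity = xor_blocks(xor_blocks(old_parity, old_data), new_data)
    data_srv.write(new_data)          # write data block to one server
    parity_srv.write(new_parity)      # write parity block to another server


d0 = Server(b"\x0f\x0f")                      # block being rewritten
d1 = Server(b"\xf0\x01")                      # untouched peer block in the stripe
p = Server(xor_blocks(d0.block, d1.block))    # parity for the two-block stripe

raid5_small_write(d0, p, b"\xaa\x55")
total = d0.messages + p.messages
print(total)  # 8
```

Nothing here is atomic: a crash between the two writes leaves data and parity out of step, which is the slack HAMMER would have to pick up.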

If you really want, you can use vinum over iSCSI to get networked
RAID5, but it will not perform well.

Adding to that (as we have spent the past 12+ months researching all this...)

- there IS prior art, and lots of it. [1]

- none of it is fast - even over local 'InfiniBand'

- the most practical compromise seems to be deferred background replication to 'pools' that are themselves *hardware* RAID5 (6 or 10).

- 'hammer mirror-stream', especially if done over something faster than ssh - e.g. locally over 10GigE, iSCSI, or e-SATA over raw Ethernet - is a primo candidate for having at least one rapid-restoration near-real-time snapshot.

But at the present state of the art, HAMMER is challenged with respect to quotas, subvolume-only selective replication, and r/w mounting of the mirrored snapshot(s).

Quite possibly there will be no 'one size fits all' solution. Too many compromises pull in opposing directions.

As has always been the case......



[1] Start with the Wikipedia article on distributed file systems, particularly the replicated and fault-tolerant ones.

Most are either IBM/Sun/Oracle/$AN-vendor, 'mainframe & big-bucks' class, ELSE Linux whole-damn-world-in-the kernel wannabees.

Among the contenders:

- Gluster (problematic getting it to work with fuse on FreeBSD)

- GFarm (wants to link in its own utils)

- MooseFS (compiles sweetly on FreeBSD - but sparse docs)

- Chiron (dirt-simple, but needs manual work if/as/when backends break)

- Ceph (relies on btrfs - which is scary as the btrfs developers claim 'not ready yet..')

Aside from Ceph, most of the others I mention use 'any POSIX fs' for eventual store.

Chiron, to name one of many, expects those to be already-mounted smbfs or NFS mounts.

AFAIK, 'POSIX' compatibility includes HAMMER fs, whether over sshfs, sshftp, NFS, SMBFS, or ...

so ...... 'possibilities abound'.

Speaking from the transpacific fiber private-network alpha test exposure, there ain't no magic to the network, though!

What folks forget is that the delays introduced by each router or switch add up - even at 'light speed' - to latency that 'puters do not like.
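
A back-of-envelope sketch of how that adds up. All the numbers are assumptions for illustration only: a 10,000 km path, 20 hops, and 50 microseconds per router or switch. The one physical constant is that light in fiber travels at roughly two thirds of its vacuum speed, about 2e8 m/s.

```python
FIBER_SPEED_M_PER_S = 2.0e8  # roughly 2/3 of c in glass fiber


def one_way_latency_s(distance_m: float, hops: int, per_hop_delay_s: float) -> float:
    """Propagation delay plus the delay added by each router or switch."""
    propagation = distance_m / FIBER_SPEED_M_PER_S
    return propagation + hops * per_hop_delay_s


# Assumed figures: 10,000 km of fiber, 20 hops, 50 microseconds per hop.
latency = one_way_latency_s(10_000_000, 20, 50e-6)
print(f"{latency * 1000:.1f} ms one way")  # 51.0 ms one way
```

Every request/response pair pays this twice, so anything that needs several sequential round trips per write multiplies it accordingly.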

One can hope for paired electron technology.... but not 'soon'

