DragonFly bugs List (threaded) for 2008-05

Re: File system panic on recent HEAD

From:	Matthew Dillon <dillon@xxxxxxxxxxxxxxxxxxxx>
Date:	Wed, 21 May 2008 10:52:42 -0700 (PDT)

:Hi,
:
:* Matthew Dillon wrote:
:> 
:>     Matthias, did that partition (/build) at any point in the past fill up
:>     to 100% ?
:
:No.  The partition has enough space left:
:
:/dev/ad10s2    256G    33G   203G    14%    /build
:
:>     In anycase, what I recommend is that you umount the partition and do
:>     another manual fsck of it, then be careful not to fill it up and see
:>     if the problem reoccurs.
:
:Manual fsck did not help.  I did it several times before and now again.
:Sadly the machine crashed again last night.  And again no coredump:
:
:dev = #ad/0x1e130057, block = 30408, fs = /build
:panic: ffs_blkfree: freeing free block
:mp_lock = 00000000; cpuid = 0
:boot() called on cpu#0
:
:syncing disks... Warning: vfsync_bp skipping dirty buffer 0xc4491550
:640 Warning: vfsync_bp skipping dirty buffer 0xc4491550
:619 Warning: vfsync_bp skipping dirty buffer 0xc4491550
:[...]
:
:Regards
:
:	Matthias

    Another possibility is that fsck is not able to repair the problem.
    Try completely wiping the filesystem by re-newfs'ing it.  If THAT
    doesn't work then we've ruled out the only currently known UFS bug,
    and something in the package build you are running is tickling
    a different bug.

    I am going to go ahead and bring in the fix from FreeBSD related to
    full UFS filesystems, but I don't think it fixes the issue you are
    hitting.

    Which leads us to a second option: Get the absolute latest HEAD running
    on that box and run a HAMMER filesystem for that partition instead of
    UFS, with the proviso that you absolutely cannot fill it up or HAMMER
    will blow up (that being the last bit of the filesystem that I am
    working on this week).  If there is any corruption occurring in the
    driver/DMA path HAMMER should be able to detect it.  Also make sure
    the latest versions of the hammer utility and newfs_hammer are installed
    from /usr/src/sbin/{newfs_hammer,hammer}.
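
    Since the partition must not be allowed to fill up, a small guard
    run from cron can warn well before that happens.  This is only a
    rough sketch: it assumes df accepts the POSIX -P flag, and the
    function names and the 80% default limit are made up for
    illustration, not anything HAMMER itself provides.

```shell
#!/bin/sh
# Sketch: warn when a mounted filesystem crosses a usage limit.
# Function names and the default limit are illustrative assumptions.

# Print the integer "Capacity" percentage for mount point $1.
# df -P guarantees one line per filesystem with Capacity in column 5.
fs_usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed if usage percentage $1 is at or above limit $2.
over_limit() {
    [ "$1" -ge "$2" ]
}

# $1 = mount point, $2 = limit in percent (defaults to 80).
# Returns 0 if below the limit, 1 if at or above it, 2 if df gave nothing.
check_fs() {
    pct="$(fs_usage_pct "$1")"
    [ -n "$pct" ] || return 2
    if over_limit "$pct" "${2:-80}"; then
        echo "WARNING: $1 is ${pct}% full" >&2
        return 1
    fi
    return 0
}
```

    Calling "check_fs /build 80" from an hourly cron job would print a
    warning on stderr once /build reaches 80%, which cron normally mails
    to the job's owner.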

    To create the filesystem, something like this:

	newfs_hammer -L BUILD /dev/ad10s2

    And in fstab:

	/dev/ad10s2    /build    hammer     rw,nohistory   2   2

    I'm presuming you wouldn't want it to retain historical information.
    However, even if you mount with the nohistory option it is still a 
    good idea to reblock the filesystem every so often to clean up loose
    ends.  Something like this in a once-a-day cron job would do the
    trick:

	hammer -t 600 -c /var/db/reblock-build.cycle reblock-inodes /build 100
	hammer -t 600 -c /var/db/reblock-build.cycle reblock-btree /build 100
	hammer -t 600 -c /var/db/reblock-build.cycle reblock-data /build 95
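
    The three commands above could be wrapped in one script so cron has
    a single entry to run.  This is a sketch only: the hammer options
    and reblock directives are the ones shown above, while the variable
    names, the function names and the DRY_RUN switch are assumptions
    added for illustration.

```shell
#!/bin/sh
# Sketch: daily reblock wrapper for a HAMMER filesystem.
# Set DRY_RUN=1 to print the commands instead of running them.

fs=/build                            # mount point to reblock
cycle=/var/db/reblock-build.cycle    # cycle file passed to hammer -c
timeout=600                          # seconds passed to hammer -t

# $1 = reblock directive, $2 = fill percentage
reblock() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "hammer -t $timeout -c $cycle $1 $fs $2"
    else
        hammer -t "$timeout" -c "$cycle" "$1" "$fs" "$2"
    fi
}

# Run the three passes in order, stopping at the first failure.
reblock_all() {
    reblock reblock-inodes 100 &&
    reblock reblock-btree 100 &&
    reblock reblock-data 95
}
```

    Add a final line calling reblock_all and point the daily cron entry
    at the script.  Running it once with DRY_RUN=1 shows exactly what
    would be executed.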

    In any case, I am not suggesting that you start using HAMMER for
    production stuff yet, but it might give us another data point on whether
    the corruption is due to a bug in UFS, or due to a bug in the rest of
    the kernel, or due to driver/DMA issues.  And there's a risk you might
    hit a HAMMER bug too and start tearing your hair out for real.

    I am going to also note that this particular bug has been reported
    sporadically on FreeBSD, DragonFly, NetBSD, and OpenBSD.


					-Matt
					Matthew Dillon 
					<dillon@backplane.com>
