From: "Thomas E. Spanjaard" <tgen@xxxxxxxxxxxxx>
Date: Fri, 15 Dec 2006 19:50:15 +0000
On Thu, Dec 14, 2006 at 09:13:17PM +0000, Thomas E. Spanjaard wrote:
YONETANI Tomokazu wrote:
On Tue, Dec 12, 2006 at 03:13:45PM +0000, Thomas E. Spanjaard wrote:

I fear this panic is unrelated, as Victor Balada Diaz is having the same on his 1.6 system. His /sbin/init is stuck in nanosleep, and apparently never jumped to.

YONETANI Tomokazu wrote:

If I boot a UP kernel, it proceeds to "start_init: trying /sbin/init", but then stuck there (the backtrace in DDB is from console handler).

The backtrace looks something like this:

Debugger(c02a1a77)
scgetc(c030e8a0,2,c017ba0b,0,c0306b40)
sckbdevent(c0306b40,0,c030e8a0)
atkbd_intr(c0306b40,0,cd682d84,c015d699,c0306b40)
atkbd_isa_intr(c0306b40,0)
ithread_handler(1,0,0,0,0)
lwkt_exit()

No, that backtrace was not from a panic, that was when I press ctrl+alt+esc after seeing "start_init: trying /sbin/init" message and it stuck (ctrl+T didn't print anything). And `call dumpsys' in DDB didn't start the dump, so I think /sbin/init wasn't even read from the disk.

Then I tried setting `set hw.ata.ata_dma=0' in the boot driver, and this time it made it to the login prompt (updated: http://les.ath.cx/DragonFly/asrock-dmesg.boot)

But sometimes random commands (ls, sysctl, ...) dump core and fail. Or ld command reports corruption of libraries when I try to build a new kernel. On SMP kernel it happens more frequently. On UP kernel, if I switch to a UDMAxx mode using natacontrol command, core dumping occurs more frequently.
I just experienced something odd, perhaps similar to your experience earlier. I (probably) experience a null deref when trying to open acd0c, as you can see on http://deviate.fi/~tgen/mountroot_1.png . It appears si_drv1 is NULL on line 218 in sys/dev/disk/nata/atapi-cd.c. Which is strange, because in acd_attach() I really do set si_drv1 on acd0. And, on the SCSI test system, I can open, read, write, etc /dev/acd0c without problems. And the code was able to find acd_open(), so the dev_ops have been registered, so it's not like it's passing the wrong device. Therefore I suspect something somewhere is scribbling over si_drv1, but I don't know where.

I haven't seen the panic in acd code, after your commit.
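
For reference, the failure mode discussed above (an open routine dereferencing a driver-private pointer that has gone NULL since attach) can be illustrated with a small userland sketch. Everything here is an assumption made for illustration: mock_dev, mock_softc and mock_acd_open are hypothetical stand-ins, not the real structures or the code in sys/dev/disk/nata/atapi-cd.c. A guard of roughly this shape would turn the null dereference into an error return, which may make it easier to see when the pointer was clobbered.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the driver's per-device state. */
struct mock_softc {
    int unit;
};

/* Hypothetical stand-in for a device node carrying a driver-private pointer. */
struct mock_dev {
    const char *name;
    void *si_drv1;   /* expected to be set once at attach time */
};

/*
 * Sketch of an open routine that checks si_drv1 before using it.
 * Returns 0 on success, or ENXIO if the private pointer is missing.
 */
static int
mock_acd_open(struct mock_dev *dev)
{
    struct mock_softc *sc = dev->si_drv1;

    if (sc == NULL) {
        /* Report instead of dereferencing a NULL pointer. */
        fprintf(stderr, "%s: si_drv1 is NULL in open\n", dev->name);
        return ENXIO;
    }
    return 0;
}
```

Calling mock_acd_open() on a device whose si_drv1 points at a valid mock_softc returns 0; calling it on one whose si_drv1 is NULL returns ENXIO. This only masks the symptom, of course, and says nothing about what overwrote the pointer.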
Cheers,
-- 
Thomas E. Spanjaard
email@example.com