cms on 20 Aug 2004 21:06:02 -0000



Re: [PLUG] Booting/disk "problem"?


On Wednesday 18 August 2004 16:43, eric@lucii.org wrote:
> I've been called in to do a sort of forensic analysis on a Linux server
> that won't boot (oh, they want me to fix it and make it work again too
> <grin>.)

What type of forensic analysis? Are you expected to repair the server? Or
recover data from the RAID array? Or simply replace the disk(s) and get the
machine back in service? Or all of the above?

> The machine is a Compaq server with a RAID array running Red Hat 8.0.  It
> refuses to boot citing: "Kernel panic: no init found".  I also see this
> error: pivotroot: pivot_root(/sysroot,/sysroot/initrd) failed: 2

What type of RAID array? Hardware or software? One disk or multiple disks?
Is there a backplane? What type of disk controller: SCSI or IDE? A SCSI RAID
controller? Judging by your output (the /dev/cciss/... device names), my
guess is that your machine is using a controller card specific to that
machine, or specific to Compaq (and HP, as they are usually the same). If
that is the case, you're going to need a driver for the controller that may
not be available on many bootable Linux CDs, such as Knoppix, Knoppix-STD,
Tom's Root Boot, Damn Small Linux, etc. It depends on which drivers each
CD's kernel includes.
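
One way to check whether a given rescue CD has the driver, once it has
booted, is to look for it in /proc/devices. The following is only a rough
sketch: it assumes the driver registers itself under a name beginning with
"cciss" (going by the /dev/cciss/... device names in your listing), and the
sample text in it is made up for illustration, not taken from your server.

```python
def block_drivers(proc_devices_text):
    """Return the driver names listed under 'Block devices:' in /proc/devices."""
    names = []
    in_block_section = False
    for line in proc_devices_text.splitlines():
        stripped = line.strip()
        # Section headers look like "Character devices:" and "Block devices:".
        if stripped.endswith("devices:"):
            in_block_section = stripped.startswith("Block")
            continue
        parts = stripped.split(None, 1)  # "<major> <name>"
        if in_block_section and len(parts) == 2 and parts[0].isdigit():
            names.append(parts[1])
    return names


def has_driver(proc_devices_text, prefix="cciss"):
    """True if any registered block driver name starts with the given prefix.

    The default prefix is an assumption based on the /dev/cciss/ paths.
    """
    return any(name.startswith(prefix) for name in block_drivers(proc_devices_text))


# Illustrative sample only; NOT output from the server being discussed.
SAMPLE = """Character devices:
  1 mem
  4 tty

Block devices:
  3 ide0
104 cciss0
"""

if __name__ == "__main__":
    try:
        with open("/proc/devices") as f:
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the made-up sample off Linux
    print("cciss driver registered:", has_driver(text))
```

If this reports False on a particular CD, that CD's kernel most likely has
no driver loaded for the controller, and the array will not show up at all.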

> I booted it with the CD ROM and it STILL won't run on the existing
> partitions.  I got into a shell and went mucking about.  Here's what
> I found with various tools like fdisk, e2label, and fsck:

What CD-ROM? Something provided by Compaq? Who installed RH 8.0? What do you
mean by "it STILL won't run on the existing partitions"? Do you mean you are
still unable to access any of the partitions on the disk(s) even AFTER
booting the server with a recovery disk (CD-R) of some type?

>   device              label    note
>   /dev/cciss/c0d0p1   /boot    appears fine
>   /dev/cciss/c0d0p2   /usr     appears fine
>   /dev/cciss/c0d0p5   /home    appears fine
>   /dev/cciss/c0d0p7   /var     appears fine
>   /dev/cciss/c0d0p6   /        Problem: -->
>
>      On the p6 partition, the is only: /bin, /boot, /home, /proc,
>      /usr and /var Since boot, home, usr, and var are mount points, they
>      are empty.  There are a number of files in the bin directory
>      including one called "all.tar" which is 122 MB and is truncated.
>      The tar file was created about the last time that the machine was
>      known to be working.
>
>     df -h shows:
>
>       Size     Used    Available  Use%
>       505.9M   505.9M     0       100%
>
> Also, the UPS went down at some point and may have just taken the server
> down.
>
> Given this limited set of evidence, can anybody come up with a plausible
> explanation for what happened?
>
> I theorize that the partition was too full for the user to build their
> all.tar file so they tried to perform a /bin/rm command but executed it
> in the wrong directory.  They were logged in as root :-(   By the time
> they realized it, it was too late.

What type of filesystem? Ext2, Ext3, ReiserFS, XFS, NFS? I realize I'm asking
a lot of questions and not really answering any of yours. With more info,
there are a lot of people on this list who can be invaluable. Also,
reconstructing what caused the server to shut down and doing a "forensic
analysis" on the server CAN be two completely different things.

Let me know if I can be of any help. I can do data recovery.
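
For what it's worth, the listing you posted shows no /sbin, /etc or /lib on
the root partition, and that alone would be consistent with both messages.
The 2.4-series kernels of that era try /sbin/init, /etc/init, /bin/init and
/bin/sh in turn before panicking with "no init found", and error 2 from
pivot_root is ENOENT, which fits a root with no /initrd directory. Here is
a hedged sketch for checking a root partition mounted from a rescue shell.
The mount point /mnt/sysroot and the list of expected directories are my
assumptions, not something taken from your machine.

```python
import errno
import os

# Paths a 2.4-series kernel tries, in order, before "no init found".
INIT_CANDIDATES = ("sbin/init", "etc/init", "bin/init", "bin/sh")

# Assumed typical top-level entries on a Red Hat root filesystem.
EXPECTED_ENTRIES = ("bin", "sbin", "etc", "lib", "dev", "initrd")


def find_init(root):
    """Return the init candidates present under a mounted root filesystem.

    lexists() is used so that an absolute symlink is not resolved against
    the rescue system's own root.
    """
    return [p for p in INIT_CANDIDATES if os.path.lexists(os.path.join(root, p))]


def missing_entries(root, expected=EXPECTED_ENTRIES):
    """Return the expected top-level entries that are absent under root."""
    return [d for d in expected if not os.path.lexists(os.path.join(root, d))]


if __name__ == "__main__":
    root = "/mnt/sysroot"  # hypothetical mount point for the damaged root
    print("pivot_root error 2 is", errno.errorcode[2])
    print("init candidates found:", find_init(root))
    print("missing top-level entries:", missing_entries(root))
```

An empty list of init candidates would back up the idea that the contents
of / were removed, rather than the disk or the array having failed.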

Chris/CMS
___________________________________________________________________________
Philadelphia Linux Users Group         --        http://www.phillylinux.org
Announcements - http://lists.phillylinux.org/mailman/listinfo/plug-announce
General Discussion  --   http://lists.phillylinux.org/mailman/listinfo/plug