[HOWTO] Troubleshooting boot problems

fanthom
Moderator Team
Posts: 5666
Joined: 28 Dec 2010, 02:42
Distribution: Porteus Kiosk
Location: Poland

Post#1 by fanthom » 10 Jan 2011, 06:26

This article has been moved to the main site, here.

Feel free to continue discussing it or asking questions about it in this thread.
Please add [Solved] to your thread title if the solution was found.

Ahau
King of Docs
Posts: 1331
Joined: 28 Dec 2010, 15:18
Distribution: LXDE & Xfce 32/64-bit
Location: USA

Post#2 by Ahau » 06 Jun 2011, 15:26

I'm bumping this article because I've merged it with one of brokenman's articles on the main site, and moved it there (see link above). Please review and let me know if you have any questions or corrections.

Thanks!
Please take a look at our online documentation, here. Suggestions are welcome!

Post#3 by fanthom » 06 Jun 2011, 18:55

@Ahau
one thing:
The /var/log/livedbg file is created at the end of the linuxrc script, once all the information has been gathered. It can be used to check whether changes, external modules and rootcopy were found, what the booting partition was, which partitions were recognized and which modules were loaded.

It cannot be used for debugging "sgn not found" failures. If the sgn file was not found, booting can't continue and /var/log/livedbg is never created (there would be no point). Instead, linuxrc fails with a 'fatal' message that tells users what to do next.
Please boot Porteus and run 'cat /mnt/live/fatal' to check out these instructions.

these three could be added to the log part:
- fstab: cp /etc/fstab /tmp/disk
- blkid output: blkid > /tmp/disk/blkid
- cheatcodes: cat /proc/cmdline > /tmp/disk/cheatcodes

Cheers

Post#4 by Ahau » 07 Jun 2011, 15:03

OK, fanthom -- so I know you made a tiny correction, but as I thought about the best way to implement it in the document, I kind of spiraled out of control and wound up rewriting most of it, and adding a section for errors that happen before LLS starts. Here it is in rough draft---I'm almost positive there is some faulty information here (especially when describing the boot process, lol), so again your help in fixing (yet another draft) is much appreciated. Once I get this cleared up, I'll replace the doc on the main site with this:

**********************************************


If Porteus just doesn't seem to want to boot for you, whether it stalls at a particular point during boot, the screen goes black or it just spits out a horrific error message, there are always things you can do to isolate when and where the problem is occurring.

The most critical information for solving your problem is your error messages and log files. If you're not an advanced user, you may not be able to figure out the cause of the problem on your own, but this document will help you gather the data you will need in order to seek support from the Porteus Community. If you are an advanced user, then this document will tell you how and where to find the various log files within Porteus.

------------------------

Depending on where in the boot process your system fails, you may have different tools available, and more log files are generated the further along in the boot process you get. It would serve us well at this point to have a general understanding of the Porteus Boot Process (for more detailed information, go here ((LINK TO COME))). Roughly speaking, after your computer turns on and completes its POST, the BIOS hands control to the bootloader (in Porteus, syslinux, extlinux and lilo are included in the default ISO). The bootloader starts up the Linux kernel, which initializes some basic hardware and unpacks the initrd (initial ramdisk). At this point, the system starts the linux-live script (linuxrc), which creates the live filesystem and mounts all of your modules into it. After the linux-live stage, the rest of your system hardware gets initialized, several cache files are generated, and the system starts the xorg-server, bringing up the graphical KDE or LXDE interface.

I'll refer to the first portion of this process (BIOS->Bootloader->Kernel/initrd) as Phase I, the second portion (the linux-live stage) as Phase II, and the remaining portion (hardware initialization and starting the GUI) as Phase III. Note that these names are arbitrary; I'm creating them solely for the purposes of this document.

Phase I problems:

If you never reach a boot menu asking you which Porteus mode to start, then something is wrong with either your hardware or your bootloader installation. For example, your BIOS may not be set to boot from the device to which you have installed Porteus, your BIOS may not support booting from that kind of device at all (some motherboards won't boot from USB devices, for example), your device may be corrupt, improperly formatted or otherwise not bootable, or your bootloader was not properly installed or configured. For more information on how to resolve these issues, please read our Official Porteus Installation Guide ((LINK)), try reformatting your device and reinstalling, or ask a question in the "General Chat" or "Newbie Questions" section of our forum.

If you get to a bootloader menu and select a mode for Porteus (e.g. "Graphics Mode (KDE)" or "Always Fresh"), but the system fails immediately after you make a selection, without displaying the text "Starting optimized linuxrc (inspired by http://www.linux-live.org)", then you likely have an incompatibility with the default Porteus kernel, or your bootloader is not correctly pointing to your kernel (this is only likely with custom installations). Please write down any error messages you receive, and post them in the relevant section of our forum to ask for more help.

Phase II problems:

If your system fails after displaying the "Starting optimized linuxrc..." message, but before displaying "Live system is ready now - starting Porteus", then your system is failing during Phase II, the 'linux-live stage'.
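As a rough illustration of how the two on-screen messages separate the three phases, here is a minimal shell sketch. boot_phase is a hypothetical helper (it is not part of Porteus), and it assumes you have typed or pasted the messages you saw into its standard input:

```shell
#!/bin/sh
# Sketch only: boot_phase is a hypothetical helper, not a Porteus tool.
# It reads the boot messages you saw (on stdin) and prints the phase in
# which the boot stopped, using the two marker messages quoted in this
# document.
boot_phase() {
    seen=$(cat)
    if printf '%s\n' "$seen" | grep -q "Live system is ready now"; then
        # The linux-live stage finished, so the failure is in Phase III.
        echo "III"
    elif printf '%s\n' "$seen" | grep -qi "starting optimized linuxrc"; then
        # linuxrc started but never reported a ready live system.
        echo "II"
    else
        # Neither marker was seen: BIOS, bootloader or kernel/initrd.
        echo "I"
    fi
}
```

For example, piping only the "Starting optimized linuxrc" line into boot_phase prints II.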

If Porteus fails during Phase II, you can use the "debug" cheatcode ((LINK TO CHEATCODES)) to check for errors. Press the TAB key when your menu comes up (what button to press for LILO?), and enter 'debug' (without the quotes) on the line of text that shows up at the bottom of the screen. This cheatcode causes linuxrc to stop every few functions throughout the script. A command prompt will be displayed, and you can press Ctrl+D to progress to the next step of the boot process. You can find the place right before the error occurs by counting how many times you need to press Ctrl+D before you get the error. Then you can reboot and go through the process again, this time pressing Ctrl+D one time fewer than it took to reach the error on your previous boot attempt.

Here are a couple of examples of specific troubles that may come up early in the boot process, during the 'linux-live stage':

Porteus can't find the *.sgn file
Insufficient space inside the initrd to copy /sbin/init to it

Debugging of these and other early boot errors can really be a pain because userland tools (i.e., all Porteus applications) are not available at this early stage and users are limited to the very few utilities that are provided within initrd (initial ramdisk). However, you still have the ability to gather some information to pass along to the community forum for some help. To do so, please follow these steps:

Try to store log files on writable media, such as a hard drive or USB drive. If you use the 'debug' cheatcode as described above, you will get a command prompt after every few steps in the boot process. From this command prompt you can mount your drive the same as you would in a normal system and copy log files to it, using these commands:

mkdir /tmp/disk
mount /dev/sdxY /tmp/disk #where sdxY is the drive letter and partition number of your drive, e.g. sda1
cp /etc/fstab /tmp/disk #copies your filesystem table (fstab)
blkid > /tmp/disk/blkid #copies your blkid output
cat /proc/cmdline > /tmp/disk/cheatcodes #provides a list of all cheatcodes used
dmesg > /tmp/disk/dmesg #this is a list of messages from the linux kernel
cat /var/log/livedbg > /tmp/disk/livedbg #this file may not be available early on in this phase, and gets renamed at the end of Phase II --see Phase III below.
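If you expect to repeat this on several boot attempts, the copy commands above could be wrapped in a small function. This is only a sketch: collect_boot_logs is a hypothetical name (nothing like it ships with Porteus), and it assumes you have already created and mounted the target directory as shown above.

```shell
#!/bin/sh
# Sketch only: collect_boot_logs is a hypothetical helper, not a Porteus tool.
# It copies the logs listed above into an existing (already mounted) directory
# and skips anything that is not available yet at this point of the boot.
collect_boot_logs() {
    dest="$1"
    if [ -z "$dest" ] || [ ! -d "$dest" ]; then
        echo "usage: collect_boot_logs <mounted directory>" >&2
        return 1
    fi
    # Filesystem table
    if [ -r /etc/fstab ]; then cp /etc/fstab "$dest/fstab"; fi
    # Block device list; '|| true' keeps going if blkid reports nothing
    if command -v blkid >/dev/null 2>&1; then
        blkid > "$dest/blkid" 2>/dev/null || true
    fi
    # Cheatcodes that were passed to the kernel
    if [ -r /proc/cmdline ]; then cat /proc/cmdline > "$dest/cheatcodes"; fi
    # Kernel messages; '|| true' keeps going if dmesg is missing or not permitted
    dmesg > "$dest/dmesg" 2>/dev/null || true
    # Only exists late in Phase II
    if [ -r /var/log/livedbg ]; then cat /var/log/livedbg > "$dest/livedbg"; fi
    return 0
}
```

After mounting your drive you would call it as: collect_boot_logs /tmp/disk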

You can then open these files on a working operating system and upload the content to an external site such as pastebin.com. Then, write a post in the relevant section of our forum describing your problems and any error messages received, with a link to your logs on the external site.

If it's not possible to store these logs on writable media (for example, if you're booting from read-only media and you don't have any writable devices around), please take a photo of the error or just write the output of the commands listed above on a piece of paper. To display this information on your screen instead of copying it to a text file, you can run the commands at the command prompt like this:

cat /etc/fstab
blkid
cat /proc/cmdline
dmesg | less (this command produces too much information to write down by hand, but by piping the output to 'less', you can scroll through the log with your arrow keys to look for problems, and write down anything that looks suspect)
cat /var/log/livedbg (again, this will not be available if your system is failing very early in the boot process)

Then post the output in a thread in the relevant section of our forum.
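When you have to copy things down by hand, it may help to narrow the dmesg output before reading it. The sketch below is one way to do that; suspect_lines is a hypothetical helper, and the keyword list is only my guess at what looks "suspect", so real problems can easily be worded differently.

```shell
#!/bin/sh
# Sketch only: suspect_lines is a hypothetical helper, and the keywords
# are an assumption, not an official list of Porteus error strings.
# It reads a log on stdin and prints matching lines, each prefixed with
# its line number in the full log (-n), ignoring upper/lower case (-i).
suspect_lines() {
    grep -n -i -E 'error|fail|fatal|not found'
}
```

Typical use would be: dmesg | suspect_lines | less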

NOTE: If you are using the 'changes=' cheatcode, then switch to 'always fresh' mode. This can help if your changes have been corrupted or are otherwise causing trouble.

NOTE: If you have modified your initrd, make sure that you have 600KB of free space in it. /sbin/init from 001-core must be copied to the ramdisk in order to allow the proper shutdown procedure, and 600KB is the minimum required amount of free space.
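One way to check the free space is from the 'debug' command prompt. This is a minimal sketch under a couple of assumptions: has_free_kb is a hypothetical helper, and it assumes that at the debug prompt '/' is the initrd itself and that the 'df' in your initrd accepts the -k and -P options.

```shell
#!/bin/sh
# Sketch only: has_free_kb is a hypothetical helper, not a Porteus tool.
# has_free_kb DIR [MIN_KB] succeeds when the filesystem holding DIR has
# at least MIN_KB kilobytes free (default 600, the figure from the note above).
has_free_kb() {
    dir="$1"
    min_kb="${2:-600}"
    # -k reports 1K blocks, -P keeps each filesystem on a single line;
    # the second line's fourth column is the available space.
    free_kb=$(df -k -P "$dir" 2>/dev/null | awk 'NR==2 {print $4}')
    [ -n "$free_kb" ] && [ "$free_kb" -ge "$min_kb" ]
}
```

For example: has_free_kb / 600 && echo "enough room for /sbin/init"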


Phase III problems:

On some systems, Porteus may get through the linux-live stage (you know Phase II is complete when the message "Live system is ready now - starting Porteus" is displayed) but fail to open up a display for KDE or LXDE (you might get a blank black screen, a garbled image, etc.). One of the most common reasons for this is that Porteus could not find the correct video driver for the hardware on which it is running. If this is the case, you can try two things:

1) Try booting into "Graphics VESA mode". This will skip the xorg autoconfiguration (xconf) and force the use of the standard VESA driver, at 1024x768 resolution. If your system boots into KDE or LXDE, you can open a console and check your log files as described in Phase II above.

2) If the "Graphics VESA mode" fails, try booting into "Text mode". You will be prompted for your username and password (root/toor by default) and then logged in to the Command Line Interface. You can then look at your log files as described in Phase II above.

NOTE: If your system gets through the linux-live stage, you will have some additional log files that may help. Please also use these commands:

cat /var/log/messages > /tmp/disk/messages
cp /var/log/porteus-livedbg /tmp/disk #note: this log file is created during Phase II as /var/log/livedbg, but is renamed to /var/log/porteus-livedbg at the end of Phase II.
(to copy these logs to text files on writable media)
or:

cat /var/log/porteus-livedbg
cat /var/log/messages | less

to display them on your screen.
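Because the livedbg file has a different name depending on how far the boot got, a small lookup can save some guessing. The sketch below tries the two locations named in this document plus /mnt/live/var/log/livedbg, which fanthom mentions in this thread. find_livedbg is a hypothetical helper, and its optional argument (a directory to prepend to each path) is my own addition so it can be tried against a copy of the files.

```shell
#!/bin/sh
# Sketch only: find_livedbg is a hypothetical helper, not a Porteus tool.
# It prints the first readable livedbg location and fails if none exists.
# The optional argument is prepended to every path (empty by default).
find_livedbg() {
    root="${1:-}"
    for candidate in \
        "$root/var/log/livedbg" \
        "$root/var/log/porteus-livedbg" \
        "$root/mnt/live/var/log/livedbg"
    do
        if [ -r "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}
```

For example: cat "$(find_livedbg)"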

If you can't figure out what the problem is by looking at these files yourself, please follow the instructions above to copy the logs, upload them, and then post a link with a description of your issues on our forum.

Post#5 by fanthom » 07 Jun 2011, 17:07

very well written Ahau!
just 2 things to be corrected:

1) "After the linux-live stage, your system hardware gets initialized, several cache files are generated, and the system starts the xorg-server, bringing up the graphical KDE or LXDE interface."
Hardware is initialized in 2 phases:
a) very early during the kernel run (only drivers compiled directly into the kernel), even before unpacking the initrd and the lls stage. If hardware weren't initialized here, then USB/HD/CDROM wouldn't be discovered and mounted at all :wink:
b) in the middle of the boot process, when rc.udev is called by rc.S (external drivers compiled as (M), which are present in 000-kernel.xzm)
When you go through /var/log/dmesg you will see something like:
"RAMDISK: xz image found at block 0"
and
"Freeing unused kernel memory: 516k freed"
Everything initialized above this point is done by the kernel with built-in drivers; everything below it is done by udev with drivers from 000-kernel.xzm.
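To look at the two halves separately, the log can be split at the "Freeing unused kernel memory" line. This is just a sketch: dmesg_stage is a hypothetical helper, and it assumes the marker line appears in your log with exactly that wording.

```shell
#!/bin/sh
# Sketch only: dmesg_stage is a hypothetical helper, not a Porteus tool.
# It reads dmesg text on stdin and prints one of the two stages:
#   builtin  lines up to and including "Freeing unused kernel memory"
#   udev     lines after that marker
dmesg_stage() {
    awk -v want="$1" '
        BEGIN { stage = "builtin" }
        # Print the line if it belongs to the requested stage ...
        { if (stage == want) print }
        # ... then switch stage once the marker line has been passed.
        /Freeing unused kernel memory/ { stage = "udev" }
    '
}
```

For example: dmesg | dmesg_stage udev | less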

BTW: this process explains what must be compiled directly into the kernel (USB drivers, HD controllers, filesystem support, etc.; everything that is needed to mount the booting media) and what can be left outside of it (VGA drivers, sound, less important network protocols, etc.).

2) "cat /var/log/livedbg"
is valid only for the initrd (lls) stage. When booted into Porteus it's saved as "/var/log/porteus-livedbg", so please modify the 'cat' command for Phase III:
"cat /mnt/live/var/log/livedbg"
or
"cat /var/log/porteus-livedbg"

Cheers

Post#6 by Ahau » 07 Jun 2011, 21:17

Thanks, fanthom! I've made edits to clarify, and I made them bold and blue so the changes would be easy to pick out.

I'll give this another day or so for folks to read over, and then replace the doc on the main site with it.
