================= |
|
Kexec/Kdump HOWTO |
|
================= |
|
|
|
|
|
Introduction |
|
============ |
|
|
|
Kexec and kdump are new features in the 2.6 mainstream kernel. These features |
|
are included in Red Hat Enterprise Linux 5. The purpose of these features |
|
is to ensure faster boot up and creation of reliable kernel vmcores for |
|
diagnostic purposes. |
|
|
|
|
|
Overview |
|
======== |
|
|
|
Kexec |
|
----- |
|
|
|
Kexec is a fastboot mechanism which allows booting a Linux kernel from the
context of an already running kernel without going through the BIOS. BIOS
initialization can be very time consuming, especially on big servers with lots
of peripherals. Skipping it can save a lot of time for developers who end up
booting a machine numerous times.
|
|
|
Kdump |
|
----- |
|
|
|
Kdump is a new kernel crash dumping mechanism. It is very reliable because
the crash dump is captured from the context of a freshly booted kernel and
not from the context of the crashed kernel. Kdump uses kexec to boot into
a second kernel whenever the system crashes. This second kernel, often called
a capture kernel, boots with very little memory and captures the dump image.
|
|
|
The first kernel reserves a section of memory that the second kernel uses
to boot. Kexec enables booting the capture kernel without going through the
BIOS, so the contents of the first kernel's memory are preserved, which is
essentially the kernel crash dump.
|
|
|
Kdump is supported on the i686, x86_64, ia64 and ppc64 platforms. The
standard kernel and capture kernel are one and the same on i686, x86_64,
ia64 and ppc64.
|
|
|
If you're reading this document, you should already have kexec-tools
installed. If not, you can install it via the following command:
|
|
|
# yum install kexec-tools |
|
|
|
Now load a kernel with kexec: |
|
|
|
# kver=`uname -r`
# kexec -l /boot/vmlinuz-$kver --initrd=/boot/initrd-$kver.img \
    --command-line="`cat /proc/cmdline`"
|
|
|
NOTE: The above will boot you back into the kernel you're currently running.
If you want to load a different kernel, substitute its version for `uname -r`.
|
|
|
Now reboot your system, taking note that it should bypass the BIOS: |
|
|
|
# reboot |
|
|
|
|
|
How to configure kdump |
|
====================== |
|
|
|
Again, we assume that if you're reading this document, you already have
kexec-tools installed. If not, you can install it via the following command:
|
|
|
# yum install kexec-tools |
|
|
|
To be able to do much of anything interesting in the way of debug analysis, |
|
you'll also need to install the kernel-debuginfo package, of the same arch |
|
as your running kernel, and the crash utility: |
|
|
|
# yum --enablerepo=\*debuginfo install kernel-debuginfo.$(uname -m) crash |
|
|
|
Next up, we need to modify some boot parameters to reserve a chunk of memory for
the capture kernel. With the help of grubby, it's very easy to append
"crashkernel=128M" to the end of your kernel boot parameters. In general the
parameter takes the form "crashkernel=XM", where X is the amount of memory, in
megabytes, to reserve for the capture kernel. Depending on the architecture and
system configuration, more than 128M may be needed. Experiment and test kdump;
if 128M is not sufficient, try reserving more memory.
|
|
|
# grubby --args="crashkernel=128M" --update-kernel=/boot/vmlinuz-`uname -r` |
|
|
|
Note that there is an alternative form in which to specify a crashkernel |
|
memory reservation, in the event that more control is needed over the size and |
|
placement of the reserved memory. The format is: |
|
|
|
crashkernel=range1:size1[,range2:size2,...][@offset] |
|
|
|
Where range<n> specifies a range of values that are matched against the amount |
|
of physical RAM present in the system, and the corresponding size<n> value |
|
specifies the amount of kexec memory to reserve. For example: |
|
|
|
crashkernel=512M-2G:64M,2G-:128M |
|
|
|
This line tells kexec to reserve 64M of ram if the system contains between |
|
512M and 2G of physical memory. If the system contains 2G or more of physical |
|
memory, 128M should be reserved. |
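
The range syntax can be easier to reason about with a small worked example.
The following shell sketch is illustrative only and is not part of kexec-tools:
the function names to_mb and crashkernel_size_for are hypothetical, only the M
and G suffixes are handled, and it assumes that the start of a range is
inclusive and the end exclusive, which matches the example above.

```shell
#!/bin/sh
# Illustrative only: mimic how a range-style crashkernel= value is matched
# against the amount of RAM. Function names are hypothetical.

# Convert a size such as 512M or 2G to MiB (only M and G suffixes handled).
to_mb() {
    case "$1" in
        *G) echo "$(( ${1%G} * 1024 ))" ;;
        *M) echo "${1%M}" ;;
        *)  return 1 ;;
    esac
}

# Usage: crashkernel_size_for <ram-in-MiB> <range1:size1[,range2:size2,...][@offset]>
# Prints the size of the first matching range; returns 1 if none matches.
crashkernel_size_for() {
    ck_ram=$1
    ck_rest="${2%%@*},"            # drop optional @offset, add a trailing comma
    while [ -n "$ck_rest" ]; do
        ck_entry=${ck_rest%%,*}    # next range:size pair
        ck_rest=${ck_rest#*,}
        ck_range=${ck_entry%%:*}
        ck_size=${ck_entry#*:}
        ck_start=$(to_mb "${ck_range%%-*}") || continue
        [ "$ck_ram" -ge "$ck_start" ] || continue
        ck_end=${ck_range#*-}
        if [ -z "$ck_end" ]; then  # open-ended range such as 2G-
            echo "$ck_size"; return 0
        fi
        ck_end=$(to_mb "$ck_end") || continue
        if [ "$ck_ram" -lt "$ck_end" ]; then
            echo "$ck_size"; return 0
        fi
    done
    return 1
}

crashkernel_size_for 1024 "512M-2G:64M,2G-:128M"   # 1G of RAM
crashkernel_size_for 4096 "512M-2G:64M,2G-:128M"   # 4G of RAM
```

With the example value from above, the two calls print 64M and then 128M.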
|
|
|
In addition, if KASLR is enabled, kdump needs to access /proc/kallsyms while
loading the capture kernel. Check /proc/sys/kernel/kptr_restrict to make sure
that the content of /proc/kallsyms is exposed correctly. We recommend setting
kptr_restrict to '1'; otherwise loading the capture kernel could fail.
|
|
|
After making said changes, reboot your system, so that the X MB of memory is |
|
left untouched by the normal system, reserved for the capture kernel. Take note |
|
that the output of 'free -m' will show X MB less memory than without this |
|
parameter, which is expected. You may be able to get by with less than 128M, but |
|
testing with only 64M has proven unreliable of late. On ia64, as much as 512M |
|
may be required. |
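
After the reboot it can be reassuring to confirm that the reservation took
effect. The sketch below is illustrative only: the function name
crashkernel_arg is hypothetical, and it assumes the running kernel exposes
/sys/kernel/kexec_crash_size, which reports the reserved size in bytes.

```shell
#!/bin/sh
# Illustrative only: report the crashkernel= setting found on a kernel
# command line. The function name is hypothetical.

# Usage: crashkernel_arg "<kernel command line>"
# Prints the value of the last crashkernel= parameter; returns 1 if absent.
crashkernel_arg() {
    case " $1 " in
        *" crashkernel="*)
            ca_tail=" $1"
            ca_tail=${ca_tail##* crashkernel=}   # text after the last occurrence
            echo "${ca_tail%% *}"                # cut at the next space
            ;;
        *) return 1 ;;
    esac
}

if value=$(crashkernel_arg "$(cat /proc/cmdline 2>/dev/null)"); then
    echo "crashkernel=$value is on the running kernel's command line"
    if [ -r /sys/kernel/kexec_crash_size ]; then
        echo "reserved bytes: $(cat /sys/kernel/kexec_crash_size)"
    fi
else
    echo "no crashkernel= parameter found; nothing is reserved"
fi
```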
|
|
|
Now that you've got that reserved memory region set up, you want to enable
the kdump service so that it starts on every boot:

# systemctl enable kdump.service
|
|
|
Then, start up kdump as well: |
|
|
|
# systemctl start kdump.service |
|
|
|
This should load your kernel-kdump image via kexec, leaving the system ready |
|
to capture a vmcore upon crashing. To test this out, you can force-crash |
|
your system by echo'ing a c into /proc/sysrq-trigger: |
|
|
|
# echo c > /proc/sysrq-trigger |
|
|
|
You should see some panic output, followed by the system restarting into |
|
the kdump kernel. When the boot process gets to the point where it starts |
|
the kdump service, your vmcore should be copied out to disk (by default, |
|
in /var/crash/<YYYY-MM-DD-HH:MM>/vmcore), then the system rebooted back into |
|
your normal kernel. |
|
|
|
Once back to your normal kernel, you can use the previously installed crash
utility in conjunction with the previously installed kernel-debuginfo to
perform postmortem analysis:
|
|
|
# crash /usr/lib/debug/lib/modules/2.6.17-1.2621.el5/vmlinux \
    /var/crash/2006-08-23-15:34/vmcore
|
|
|
crash> bt |
|
|
|
and so on... |
|
|
|
|
|
Notes on kdump |
|
============== |
|
|
|
When kdump starts, the kdump kernel is loaded together with the kdump |
|
initramfs. To save memory usage and disk space, the kdump initramfs is |
|
generated strictly against the system it will run on, and contains the |
|
minimum set of kernel modules and utilities to boot the machine to a stage |
|
where the dump target could be mounted. |
|
|
|
With the kdump service enabled, kdumpctl will try to detect system changes
and rebuild the kdump initramfs if needed. However, it cannot be guaranteed
to cover every possible case. So after a hardware change, disk migration,
storage setup update or any similar system-level change, it's highly
recommended to rebuild the initramfs manually with the following command:
|
|
|
# kdumpctl rebuild |
|
|
|
|
|
Saving vmcore-dmesg.txt |
|
======================= |
|
|
|
The kernel log buffer is one of the most important pieces of information
available in a vmcore. Before the vmcore is saved, the kernel log buffer is
extracted from /proc/vmcore and saved into a file named vmcore-dmesg.txt. The
vmcore itself is saved after vmcore-dmesg.txt. The destination disk and
directory for vmcore-dmesg.txt are the same as for the vmcore. Note that the
kernel log buffer will not be available if the dump target is a raw device.
|
|
|
|
|
Dump Triggering methods |
|
======================= |
|
|
|
This section talks about the various ways, other than a Kernel Panic, in which |
|
Kdump can be triggered. The following methods assume that Kdump is configured |
|
on your system, with the scripts enabled as described in the section above. |
|
|
|
1) AltSysRq C |
|
|
|
Kdump can be triggered with the combination of the 'Alt','SysRq' and 'C' |
|
keyboard keys. Please refer to the following link for more details: |
|
|
|
http://kbase.redhat.com/faq/FAQ_43_5559.shtm |
|
|
|
In addition, on PowerPC boxes, Kdump can also be triggered via Hardware |
|
Management Console(HMC) using 'Ctrl', 'O' and 'C' keyboard keys. |
|
|
|
2) NMI_WATCHDOG |
|
|
|
In case a machine has a hard hang, it is quite possible that it does not
respond to keyboard interrupts. As a result, the 'Alt-SysRq' keys will not help
trigger a dump. In such scenarios the NMI watchdog feature can prove useful.
The following link has more details on configuring the NMI watchdog option.
|
|
|
http://kbase.redhat.com/faq/FAQ_85_9129.shtm |
|
|
|
Once this feature has been enabled in the kernel, any lockup will result in an
OOPs message being generated, followed by Kdump being triggered.
|
|
|
3) Kernel OOPs |
|
|
|
If we want to generate a dump every time the kernel OOPSes, we can achieve this
by setting the 'Panic On OOPs' option as follows:
|
|
|
# echo 1 > /proc/sys/kernel/panic_on_oops |
|
|
|
This is enabled by default on RHEL5. |
|
|
|
4) NMI(Non maskable interrupt) button |
|
|
|
In cases where the system is in a hung state, and is not accepting keyboard |
|
interrupts, using NMI button for triggering Kdump can be very useful. NMI |
|
button is present on most of the newer x86 and x86_64 machines. Please refer |
|
to the User guides/manuals to locate the button, though in most occasions it |
|
is not very well documented. In most cases it is hidden behind a small hole |
|
on the front or back panel of the machine. You could use a toothpick or some |
|
other non-conducting probe to press the button. |
|
|
|
For example, on the IBM X series 366 machine, the NMI button is located behind |
|
a small hole on the bottom center of the rear panel. |
|
|
|
To enable this method of dump triggering using NMI button, you will need to set |
|
the 'unknown_nmi_panic' option as follows: |
|
|
|
# echo 1 > /proc/sys/kernel/unknown_nmi_panic |
|
|
|
5) PowerPC specific methods: |
|
|
|
On IBM PowerPC machines, issuing a soft reset invokes the XMON debugger (if
XMON is configured). To configure XMON, either compile the kernel with the
CONFIG_XMON and CONFIG_XMON_DEFAULT options, or compile it with CONFIG_XMON
and boot the kernel with the xmon=on option.
|
|
|
The following are ways to remotely issue a soft reset on PowerPC boxes, which
drops you into XMON. Pressing 'X' (capital letter X) followed by 'Enter' at
the XMON prompt will trigger the dump.
|
|
|
5.1) HMC |
|
|
|
The Hardware Management Console (HMC) available on Power4 and Power5 machines
allows partitions to be reset remotely. This is especially useful in hang
situations where the system is not accepting any keyboard input.
|
|
|
Once you have HMC configured, the following steps will enable you to trigger |
|
Kdump via a soft reset: |
|
|
|
On Power4 |
|
Using GUI |
|
|
|
* In the right pane, right click on the partition you wish to dump. |
|
* Select "Operating System->Reset". |
|
* Select "Soft Reset". |
|
* Select "Yes". |
|
|
|
Using HMC Commandline |
|
|
|
# reset_partition -m <machine> -p <partition> -t soft |
|
|
|
On Power5 |
|
Using GUI |
|
|
|
* In the right pane, right click on the partition you wish to dump. |
|
* Select "Restart Partition". |
|
* Select "Dump". |
|
* Select "OK". |
|
|
|
Using HMC Commandline |
|
|
|
# chsysstate -m <managed system name> -n <lpar name> -o dumprestart -r lpar |
|
|
|
5.2) Blade Management Console for Blade Center |
|
|
|
To initiate a dump operation, go to the Power/Restart option under "Blade Tasks"
in the Blade Management Console. Select the blade for which you want to
initiate the dump and then click "Restart blade with NMI". This issues a
system reset and invokes the xmon debugger.
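
Methods 3) and 4) above depend on two sysctl switches, so it can help to look
at both in one place before relying on them. The sketch below is illustrative
only: the function name show_dump_triggers and its optional directory argument
are hypothetical and exist mainly so the function can be tried against a test
directory.

```shell
#!/bin/sh
# Illustrative only: show the sysctl switches that methods 3) and 4) rely on.
# The function name and the optional directory argument are hypothetical.

# Usage: show_dump_triggers [sysctl-dir]   (default: /proc/sys/kernel)
show_dump_triggers() {
    st_dir=${1:-/proc/sys/kernel}
    for st_name in panic_on_oops unknown_nmi_panic; do
        if [ -r "$st_dir/$st_name" ]; then
            echo "$st_name=$(cat "$st_dir/$st_name")"
        else
            echo "$st_name=unavailable"
        fi
    done
}

show_dump_triggers
```

A value of 1 means the corresponding event leads to a panic, and therefore to
a dump when kdump is loaded.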
|
|
|
|
|
Dump targets |
|
============ |
|
|
|
In addition to being able to capture a vmcore to your system's local file |
|
system, kdump can be configured to capture a vmcore to a number of other |
|
locations, including a raw disk partition, a dedicated file system, an NFS |
|
mounted file system, or a remote system via ssh/scp. Additional options |
|
exist for specifying the relative path under which the dump is captured, |
|
what to do if the capture fails, and for compressing and filtering the dump |
|
(so as to produce smaller, more manageable, vmcore files, see "Advanced Setups" |
|
for more detail on these options). |
|
|
|
In theory, dumping to a location other than the local file system should be |
|
safer than kdump's default setup, as it's possible the default setup will try
|
dumping to a file system that has become corrupted. The raw disk partition and |
|
dedicated file system options allow you to still dump to the local system, |
|
but without having to remount your possibly corrupted file system(s), |
|
thereby decreasing the chance a vmcore won't be captured. Dumping to an |
|
NFS server or remote system via ssh/scp also has this advantage, as well |
|
as allowing for the centralization of vmcore files, should you have several |
|
systems from which you'd like to obtain vmcore files. Of course, note that |
|
these configurations could present problems if your network is unreliable. |
|
|
|
Kdump target and advanced setups are configured via modifications to |
|
/etc/kdump.conf, which out of the box, is fairly well documented itself. |
|
Any alterations to /etc/kdump.conf should be followed by a restart of the |
|
kdump service, so the changes can be incorporated in the kdump initrd. |
|
Restarting the kdump service is as simple as '/sbin/systemctl restart kdump.service'. |
|
|
|
There are two ways to configure the dump target: using only "path", or
specifying the dump target explicitly. The interpretation of "path" differs
between the two styles.
|
|
|
Config dump target only using "path" |
|
------------------------------------ |
|
|
|
You can change the dump target by setting "path" to a mount point where |
|
dump target is mounted. When there is no explicitly configured dump target, |
|
"path" in kdump.conf represents the current file system path in which vmcore |
|
will be saved. Kdump will automatically detect the underlying device of |
|
"path" and use that as the dump target. |
|
|
|
In fact, upon dump, kdump creates a directory $hostip-$date within "path"
and saves the vmcore there. So in practice the dump is saved in
$path/$hostip-$date/.
|
|
|
Kdump only checks the current mount status for the mount entry corresponding
to "path", so please ensure the dump target is mounted on "path" before the
kdump service starts.
|
|
|
NOTES: |
|
|
|
- It's strongly recommended to put a mount entry for "path" in /etc/fstab
and have it auto mounted on boot. This makes sure the dump target is
reachable from the machine and kdump's configuration is stable.
|
|
|
EXAMPLES: |
|
|
|
- path /var/crash/ |
|
|
|
This is the default configuration. Assuming there is no disk mounted |
|
on /var/ or on /var/crash, dump will be saved on disk backing rootfs |
|
in directory /var/crash. |
|
|
|
- path /var/crash/ (A separate disk mounted on /var/crash) |
|
|
|
Say a disk /dev/sdb is mounted on /var/crash. In this case dump target will
become /dev/sdb and path will become "/" and dump will be saved
in the top-level directory of sdb ("sdb:/").
|
|
|
- path /var/crash/ (NFS mounted on /var) |
|
|
|
Say foo.com:/export/tmp is mounted on /var. In this case dump target is |
|
nfs server and path will be adjusted to "/crash" and dump will be saved to |
|
foo.com:/export/tmp/crash/ directory. |
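
The adjustment of "path" in the examples above amounts to stripping the mount
point from the front of the configured path. The sketch below is illustrative
only and is not the code kdump runs: the function name path_on_target is
hypothetical, and the mount point is passed in by hand rather than detected.

```shell
#!/bin/sh
# Illustrative only: derive the directory used on the dump target when only
# "path" is configured. The function name is hypothetical.

# Usage: path_on_target <path> <mount-point-that-contains-path>
path_on_target() {
    pt_path=${1%/}                 # drop one trailing slash
    pt_mnt=${2%/}                  # "/" becomes the empty string
    pt_rel=${pt_path#"$pt_mnt"}    # strip the mount point prefix
    echo "${pt_rel:-/}"
}

path_on_target /var/crash/ /            # rootfs holds /var/crash
path_on_target /var/crash/ /var/crash   # separate disk on /var/crash
path_on_target /var/crash/ /var         # NFS export mounted on /var
```

The three calls print /var/crash, / and /crash respectively.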
|
|
|
Config dump target explicitly
-----------------------------
|
|
|
You can set the dump target explicitly in kdump.conf, and "path" will be |
|
the relative path in the specified dump target. For example, if dump |
|
target is "ext4 /dev/sda", then dump will be saved in "path" directory |
|
on /dev/sda. |
|
|
|
Same is the case for nfs dump. If user specified "nfs foo.com:/export/tmp/" |
|
as dump target, then dump will effectively be saved in |
|
"foo.com:/export/tmp/var/crash/" directory. |
|
|
|
If the dump target is "raw", then "path" is ignored. |
|
|
|
If it's a filesystem target, kdump needs to know the right mount options.
Kdump checks the current mount status, and then /etc/fstab, for mount options
corresponding to the specified dump target and uses them. If special mount
options are required for the dump target, they can be set by putting
an entry in fstab.

If there is no related mount entry, the mount option is set to "defaults".
|
|
|
NOTES: |
|
|
|
- It's recommended to put an entry for the dump target in /etc/fstab
and have it auto mounted on boot. This makes sure the dump target is
reachable from the machine and kdump won't fail.

- Kdump ignores some mount options, including "noauto" and "ro". This
makes it possible to keep the dump target unmounted or read-only
when not used.
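
The fstab lookup described above can be sketched as follows. This is
illustrative only: the function name dump_target_mount_opts is hypothetical,
the real kdump scripts also consult the current mount status first, and only
the "noauto" and "ro" options named above are dropped here.

```shell
#!/bin/sh
# Illustrative only: look up the mount options for a dump target in an
# fstab-style file. The function name is hypothetical.

# Usage: dump_target_mount_opts <device-or-export> [fstab-file]
dump_target_mount_opts() {
    awk -v dev="$1" '
        $1 == dev {
            n = split($4, opts, ",")
            out = ""
            for (i = 1; i <= n; i++) {
                # "noauto" and "ro" are ignored for the dump target
                if (opts[i] == "noauto" || opts[i] == "ro") continue
                out = (out == "" ? opts[i] : out "," opts[i])
            }
            found = 1
            exit
        }
        END { print (found && out != "" ? out : "defaults") }
    ' "${2:-/etc/fstab}"
}

# Try it against a throwaway fstab-style file.
fstab=$(mktemp)
printf '%s\n' 'foo.com:/export/tmp /mnt/dump nfs noauto,nolock 0 0' > "$fstab"
dump_target_mount_opts foo.com:/export/tmp "$fstab"   # entry found
dump_target_mount_opts /dev/sdb1 "$fstab"             # no entry
rm -f "$fstab"
```

The first call prints nolock, and the second falls back to defaults.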
|
|
|
EXAMPLES: |
|
|
|
- ext4 /dev/sda (mounted) |
|
path /var/crash/ |
|
|
|
In this case the dump target is set to /dev/sda, path is the absolute path
"/var/crash" in /dev/sda, and the vmcore will be saved in the
"sda:/var/crash" directory.
|
|
|
- nfs foo.com:/export/tmp (mounted) |
|
path /var/crash/ |
|
|
|
In this case the dump target is the nfs server, path is the absolute path
"/var/crash" within the export, and the vmcore will be saved in the
"foo.com:/export/tmp/var/crash/" directory.
|
|
|
- nfs foo.com:/export/tmp (not mounted) |
|
path /var/crash/ |
|
|
|
Same as the above case; kdump will use "defaults" as the mount option
for the dump target.
|
|
|
- nfs foo.com:/export/tmp (not mounted, entry with option "noauto,nolock" exists in /etc/fstab) |
|
path /var/crash/ |
|
|
|
In this case the dump target is the nfs server, the vmcore will be saved in the
"foo.com:/export/tmp/var/crash/" directory, and kdump will inherit the "nolock" option.
|
|
|
Dump target and mkdumprd |
|
------------------------ |
|
|
|
Mkdumprd is the tool used to create the kdump initramfs, and it may change
the mount status of the dump target under some conditions.
|
|
|
Usually the dump target should be used only for kdump. If you are worried
about someone using the filesystem for something other than dumping vmcores,
you can mount it read-only or make it a noauto mount. Mkdumprd will
mount/remount it read-write to create the dump directory and will
move it back to its original state afterwards.
|
|
|
Supported dump target types and requirements |
|
-------------------------------------------- |
|
|
|
1) Raw partition |
|
|
|
Raw partition dumping requires that a disk partition in the system, at least |
|
as large as the amount of memory in the system, be left unformatted. Assuming |
|
/dev/vg/lv_kdump is left unformatted, kdump.conf can be configured with |
|
'raw /dev/vg/lv_kdump', and the vmcore file will be copied via dd directly |
|
onto partition /dev/vg/lv_kdump. Restart the kdump service via |
|
'/sbin/systemctl restart kdump.service' to commit this change to your kdump |
|
initrd. The dump target should be a persistent device name, such as an LVM or
device mapper canonical name.
|
|
|
2) Dedicated file system |
|
|
|
Similar to raw partition dumping, you can format a partition with the file
system of your choice. Again, it should be at least as large as the amount
of memory in the system. Assuming /dev/vg/lv_kdump has been
|
formatted ext4, specify 'ext4 /dev/vg/lv_kdump' in kdump.conf, and a |
|
vmcore file will be copied onto the file system after it has been mounted. |
|
Dumping to a dedicated partition has the advantage that you can dump multiple |
|
vmcores to the file system, space permitting, without overwriting previous ones, |
|
as would be the case in a raw partition setup. Restart the kdump service via |
|
'/sbin/systemctl restart kdump.service' to commit this change to |
|
your kdump initrd. Note that for local file systems ext4 and ext2 are |
|
supported as dumpable targets. Kdump will not prevent you from specifying |
|
other filesystems, and they will most likely work, but their operation |
|
cannot be guaranteed. For instance, specifying a vfat filesystem or msdos
|
filesystem will result in a successful load of the kdump service, but during |
|
crash recovery, the dump will fail if the system has more than 2GB of memory |
|
(since vfat and msdos filesystems do not support more than 2GB files). |
|
Be careful of your filesystem selection when using this target. |
|
|
|
It is recommended to use persistent device names or UUID/LABEL for file system |
|
dumps. One example of persistent device is /dev/vg/<devname>. |
|
|
|
3) NFS mount |
|
|
|
Dumping over NFS requires an NFS server configured to export a file system |
|
with full read/write access for the root user. All operations done within |
|
the kdump initial ramdisk are done as root, and to write out a vmcore file, |
|
we obviously must be able to write to the NFS mount. Configuring an NFS |
|
server is outside the scope of this document, but either the no_root_squash
or anonuid options on the NFS server side are likely of interest to permit
the kdump initrd operations to write to the NFS mount as root.
|
|
|
Assuming you're exporting /dump on the machine nfs-server.example.com,
|
once the mount is properly configured, specify it in kdump.conf, via |
|
'nfs nfs-server.example.com:/dump'. The server portion can be specified either |
|
by host name or IP address. Following a system crash, the kdump initrd will |
|
mount the NFS mount and copy out the vmcore to your NFS server. Restart the |
|
kdump service via '/sbin/systemctl restart kdump.service' to commit this change |
|
to your kdump initrd. |
|
|
|
4) Special mount via "dracut_args" |
|
|
|
You can utilize "dracut_args" to pass "--mount" to kdump; see the dracut manpage
for details about the format of "--mount". If any "--mount" is specified
via "dracut_args", kdump will build it as the mount target without doing any
validation (mounting, or checking mount options, fs size, save path, etc.),
so you must test it yourself to ensure it is correct. You cannot use other targets
in /etc/kdump.conf if you use "--mount" in "dracut_args". You also cannot specify
multiple "--mount" targets via "dracut_args".
|
|
|
One use case of "--mount" in "dracut_args" is when you do not want to mount the
dump target before the kdump service starts, for example, to reduce the burden
on a shared nfs server. For example:
|
dracut_args --mount "192.168.1.1:/share /mnt/test nfs4 defaults" |
|
|
|
NOTE: |
|
- <mountpoint> must be specified as an absolute path. |
|
|
|
5) Remote system via ssh/scp |
|
|
|
Dumping over ssh/scp requires setting up passwordless ssh keys for every |
|
machine you wish to have dump via this method. First up, configure kdump.conf |
|
for ssh/scp dumping, adding a config line of 'ssh user@server', where 'user' |
|
can be any user on the target system you choose, and 'server' is the host |
|
name or IP address of the target system. Using a dedicated, restricted user |
|
account on the target system is recommended, as there will be keyless ssh |
|
access to this account. |
|
|
|
Once kdump.conf is appropriately configured, issue the command |
|
'kdumpctl propagate' to automatically set up the ssh host keys and transmit |
|
the necessary bits to the target server. You'll have to type in 'yes' |
|
to accept the host key for your target server if this is the first time
|
you've connected to it, and then input the target system user's password |
|
to send over the necessary ssh key file. Restart the kdump service via |
|
'/sbin/systemctl restart kdump.service' to commit this change to your kdump initrd. |
|
|
|
Advanced Setups |
|
=============== |
|
|
|
About /etc/sysconfig/kdump |
|
--------------------------
|
|
|
Currently, there are a few options in /etc/sysconfig/kdump, which are
usually used to control the behavior of the kdump kernel. All of
these options have default values that usually do not need to be changed,
but sometimes we may modify them to better control the behavior
of the kdump kernel, for example when debugging.
|
|
|
-KDUMP_BOOTDIR |
|
|
|
Usually the kdump kernel is the same as the 1st kernel, so kdump will try to
find the kdump kernel under /boot according to /proc/cmdline. E.g. we execute
the command below and get this output:
|
cat /proc/cmdline |
|
BOOT_IMAGE=/xxx/vmlinuz-3.yyy.zzz root=xxxx ..... |
|
|
|
Then the kdump kernel will be /boot/xxx/vmlinuz-3.yyy.zzz. However, this option
lets the user point to a different directory if the kdump kernel is kept elsewhere.
|
|
|
-KDUMP_IMG |
|
|
|
This represents the image type used for kdump. The default value is "vmlinuz". |
|
|
|
-KDUMP_IMG_EXT |
|
|
|
This represents the image's extension. Relocatable kernels don't have one.
|
Currently, it is a null string by default. |
|
|
|
-KEXEC_ARGS |
|
|
|
Any additional kexec arguments required. For example: |
|
KEXEC_ARGS="--elf32-core-headers". |
|
|
|
In most situations, this should be left empty. But when additional debugging
information about kexec loading is needed, we can add the '-d' option.
|
|
|
-KDUMP_KERNELVER |
|
|
|
This is a kernel version string for the kdump kernel. If the version is not |
|
specified, the init script will try to find a kdump kernel with the same |
|
version number as the running kernel. |
|
|
|
-KDUMP_COMMANDLINE |
|
|
|
The value of 'KDUMP_COMMANDLINE' will be passed to kdump kernel as command |
|
line parameters, this will likely match the contents of the grub kernel line. |
|
|
|
In general, if a command line is not specified, which means that it is a null |
|
string such as KDUMP_COMMANDLINE="", the default will be taken automatically |
|
from the '/proc/cmdline'. |
|
|
|
-KDUMP_COMMANDLINE_REMOVE |
|
|
|
This option allows us to remove arguments from the current kdump command line.
If we don't specify any parameters for KDUMP_COMMANDLINE, it will inherit
all values from '/proc/cmdline', which is not always desirable: some
default kernel parameters could affect kdump, and could even cause
the kdump kernel to fail to boot.
|
|
|
In addition, the option is also helpful to debug the kdump kernel, we can use |
|
this option to change kdump kernel command line. |
|
|
|
For more kernel parameters, please refer to kernel document. |
|
|
|
-KDUMP_COMMANDLINE_APPEND |
|
|
|
This option allows us to append arguments to the current kdump command line
after it has been processed by KDUMP_COMMANDLINE_REMOVE. For the kdump kernel,
some specific modules and features need to be disabled (mce, cgroup, numa,
hest_disable, etc.). They may waste memory or are simply not needed by the
kdump kernel; furthermore, they may affect the kdump kernel boot.
|
|
|
Just like the option above, it can be used to disable or enable some kernel
modules so that we can rule them out as a source of errors in the kdump kernel,
which is very useful for debugging.
|
|
|
-KDUMP_STDLOGLVL | KDUMP_SYSLOGLVL | KDUMP_KMSGLOGLVL |
|
|
|
These variables are used to control the kdump log level in the first kernel. |
|
In the second kernel, kdump will use the rd.kdumploglvl option to set the log |
|
level in the above KDUMP_COMMANDLINE_APPEND. |
|
|
|
Logging levels: no logging(0), error(1), warn(2), info(3), debug(4) |
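
The interplay of KDUMP_COMMANDLINE, KDUMP_COMMANDLINE_REMOVE and
KDUMP_COMMANDLINE_APPEND described above can be pictured with a small sketch.
It is illustrative only and is not the code kdump runs: the function name
build_kdump_cmdline is hypothetical, and the parameter values in the example
call are sample values rather than recommended settings.

```shell
#!/bin/sh
# Illustrative only: apply a remove list and an append string to a kernel
# command line. The function name is hypothetical.

# Usage: build_kdump_cmdline "<base cmdline>" "<names to remove>" "<args to append>"
build_kdump_cmdline() (
    set -f                                 # no filename expansion while splitting
    bk_out=""
    for bk_arg in $1; do                   # split the base line on whitespace
        bk_keep=yes
        for bk_name in $2; do
            case "$bk_arg" in
                "$bk_name"|"$bk_name"=*) bk_keep=no ;;   # "name" or "name=value"
            esac
        done
        if [ "$bk_keep" = yes ]; then
            bk_out="$bk_out $bk_arg"
        fi
    done
    if [ -n "$3" ]; then
        bk_out="$bk_out $3"
    fi
    echo "${bk_out# }"                     # drop the leading space
)

build_kdump_cmdline "ro root=/dev/sda1 crashkernel=128M quiet rhgb" \
    "crashkernel quiet rhgb" "irqpoll nr_cpus=1 reset_devices"
```

The example call prints "ro root=/dev/sda1 irqpoll nr_cpus=1 reset_devices":
the removal is applied first and the appended arguments are added afterwards.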
|
|
|
Kdump Post-Capture Executable |
|
----------------------------- |
|
|
|
It is possible to specify a custom script or binary you wish to run following |
|
an attempt to capture a vmcore. The executable is passed an exit code from |
|
the capture process, which can be used to trigger different actions from |
|
within your post-capture executable. |
|
If the /etc/kdump/post.d directory exists, all files in the directory are
collectively sorted and executed in lexical order, before the binary or script
specified by the kdump_post parameter is executed.
|
|
|
In these scripts, the reference to the storage or network device should adhere |
|
to the section 'Supported dump target types and requirements' |
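
A post-capture executable can be as small as the sketch below. It is
illustrative only: it assumes the exit code of the capture process arrives as
the first argument, and the function name report_capture_status and the
messages are hypothetical.

```shell
#!/bin/sh
# Illustrative only: a post-capture executable that reacts to the exit code
# of the capture process, assumed to be passed as the first argument.
# The function name and messages are hypothetical.

report_capture_status() {
    if [ "${1:-1}" -eq 0 ]; then
        echo "vmcore captured successfully"
    else
        echo "vmcore capture failed with exit code ${1:-unknown}"
    fi
}

report_capture_status "$@"
```

Any extra tools such a script calls must also be present in the kdump initrd.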
|
|
|
Kdump Pre-Capture Executable |
|
---------------------------- |
|
|
|
It is possible to specify a custom script or binary you wish to run before |
|
capturing a vmcore. Exit status of this binary is interpreted: |
|
0 - continue with dump process as usual |
|
non 0 - run the final action (reboot/poweroff/halt) |
|
If the /etc/kdump/pre.d directory exists, all files in the directory are collectively
sorted and executed in lexical order, after the binary or script specified by the
kdump_pre parameter is executed.
Even if a binary or script in the /etc/kdump/pre.d directory returns a non 0
exit status, processing continues.
|
|
|
In these scripts, the reference to the storage or network device should adhere |
|
to the section 'Supported dump target types and requirements' |
|
|
|
Extra Binaries |
|
-------------- |
|
|
|
If you have specific binaries or scripts you want to have made available |
|
within your kdump initrd, you can specify them by their full path, and they |
|
will be included in your kdump initrd, along with all dependent libraries. |
|
This may be particularly useful for those running post-capture scripts that |
|
rely on other binaries. |
|
|
|
Extra Modules |
|
------------- |
|
|
|
By default, only the bare minimum of kernel modules will be included in your |
|
kdump initrd. Should you wish to capture your vmcore files to a non-boot-path |
|
storage device, such as an iscsi target disk or clustered file system, you may |
|
need to manually specify additional kernel modules to load into your kdump |
|
initrd. |
|
|
|
Failure action |
|
-------------- |
|
|
|
Failure action specifies what to do when dumping to the configured dump target
fails. By default, the failure action is "reboot", that is, the system reboots
if the attempt to save the dump to the dump target fails.
|
|
|
There are other failure actions available though. |
|
|
|
- dump_to_rootfs |
|
This option tries to mount root and save dump on root filesystem |
|
in a path specified by "path". This option will generally make |
|
sense when dump target is not root filesystem. For example, if |
|
dump is being saved over network using "ssh" then one can specify |
|
failure action to "dump_to_rootfs" to try saving dump to root |
|
filesystem if dump over network fails. |
|
|
|
- shell |
|
Drop into a shell session inside initramfs. |
|
|
|
- halt |
|
Halt system after failure |
|
|
|
- poweroff |
|
Poweroff system after failure. |
|
|
|
Compression and filtering |
|
------------------------- |
|
|
|
The 'core_collector' parameter in kdump.conf allows you to specify a custom |
|
dump capture method. The most common alternate method is makedumpfile, which |
|
is a dump filtering and compression utility provided with kexec-tools. On |
|
some architectures, it can drastically reduce the size of your vmcore files, |
|
which becomes very useful on systems with large amounts of memory. |
|
|
|
A typical setup is 'core_collector makedumpfile -F -l --message-level 7 -d 31', |
|
but check the output of '/sbin/makedumpfile --help' for a list of all available |
|
options (-i and -g don't need to be specified, they're automatically taken care |
|
of). Note that use of makedumpfile requires that the kernel-debuginfo package |
|
corresponding with your running kernel be installed. |
|
|
|
Core collector command format depends on dump target type. Typically for |
|
filesystem (local/remote), core_collector should accept two arguments. |
|
First one is source file and second one is target file. For ex. |
|
|
|
- ex1. |
|
|
|
core_collector "cp --sparse=always" |
|
|
|
Above will effectively be translated to: |
|
|
|
cp --sparse=always /proc/vmcore <dest-path>/vmcore |
|
|
|
- ex2. |
|
|
|
core_collector "makedumpfile -l --message-level 7 -d 31" |
|
|
|
Above will effectively be translated to: |
|
|
|
makedumpfile -l --message-level 7 -d 31 /proc/vmcore <dest-path>/vmcore |
|
|
|
For dump targets such as raw and ssh, the core collector should in general
expect one argument (the source file) and write the processed core to
standard output. (The one exception, "scp", is discussed later.) The standard
output is then saved to the destination using the appropriate commands.
|
|
|
raw dumps core_collector examples: |
|
|
|
- ex3. |
|
|
|
core_collector "cat" |
|
|
|
Above will effectively be translated to:
|
|
|
cat /proc/vmcore | dd of=<target-device> |
|
|
|
- ex4. |
|
|
|
core_collector "makedumpfile -F -l --message-level 7 -d 31" |
|
|
|
Above will effectively be translated to:

makedumpfile -F -l --message-level 7 -d 31 /proc/vmcore | dd of=<target-device>
|
|
|
ssh dumps core_collector examples: |
|
|
|
- ex5. |
|
|
|
core_collector "cat" |
|
|
|
Above will effectively be translated to:
|
|
|
cat /proc/vmcore | ssh <options> <remote-location> "dd of=path/vmcore" |
|
|
|
- ex6. |
|
|
|
core_collector "makedumpfile -F -l --message-level 7 -d 31" |
|
|
|
Above will effectively be translated to:

makedumpfile -F -l --message-level 7 -d 31 /proc/vmcore | ssh <options> <remote-location> "dd of=path/vmcore"
|
|
|
There is one exception to the standard output rule for ssh dumps: scp.
Because scp handles ssh destinations for file transfers itself, one can
specify "scp" as the core collector for ssh targets (nothing is written to
stdout).
|
|
|
- ex7. |
|
|
|
core_collector "scp" |
|
|
|
Above will effectively be translated to:
|
|
|
scp /proc/vmcore <user@host>:path/vmcore |
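
The translation rules in ex1 through ex7 can be summarized in a small shell
sketch. The function name expand_core_collector and the target labels "fs",
"raw" and "ssh" are made up for this illustration and are not part of
kexec-tools. The placeholders are printed literally and nothing is executed,
so this only shows how the pieces are assumed to fit together.

```shell
#!/bin/sh
# Hypothetical helper: print the effective command for a target type ($1)
# and a core_collector string ($2), following ex1-ex7 above.
expand_core_collector() {
    case "$1" in
        fs)
            # Filesystem targets: source file and target file are appended.
            printf '%s /proc/vmcore <dest-path>/vmcore\n' "$2" ;;
        raw)
            # Raw targets: collector output is piped to dd.
            printf '%s /proc/vmcore | dd of=<target-device>\n' "$2" ;;
        ssh)
            if [ "$2" = "scp" ]; then
                # scp is the exception: it copies the file itself.
                printf 'scp /proc/vmcore <user@host>:path/vmcore\n'
            else
                printf '%s /proc/vmcore | ssh <options> <remote-location> "dd of=path/vmcore"\n' "$2"
            fi ;;
        *)
            echo "unknown target type: $1" >&2
            return 1 ;;
    esac
}

expand_core_collector fs "cp --sparse=always"
expand_core_collector raw "cat"
expand_core_collector ssh "scp"
```

Keeping the expansion in one function makes the difference between the three
target types easy to compare side by side.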
|
|
|
About default core collector |
|
---------------------------- |
|
|
|
Default core_collector for ssh/raw dump is: |
|
"makedumpfile -F -l --message-level 7 -d 31". |
|
Default core_collector for other targets is: |
|
"makedumpfile -l --message-level 7 -d 31". |
|
|
|
Even if the core_collector option is commented out in kdump.conf,
makedumpfile is the default core collector and kdump uses it internally.

To use a different core collector, specify it explicitly with the
core_collector option.
|
|
|
Note: If "makedumpfile -F" is used, the resulting vmcore is in flattened
format (vmcore.flat). Use "makedumpfile -R" to rearrange the dump data from
standard input into a normal dumpfile that analysis tools can read.
For example: "makedumpfile -R vmcore < vmcore.flat"
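
As a rough illustration of the defaults and of the "-F" note, the sketch below
picks the default collector for a target type and reports whether the saved
vmcore would need "makedumpfile -R" afterwards. Both function names are
hypothetical, and "nfs" is used only as an example of a non-ssh, non-raw
target.

```shell
#!/bin/sh
# Hypothetical helpers, not part of kexec-tools.

# Print the default core_collector stated above for a target type
# ("ssh", "raw", or anything else for the other targets).
default_core_collector() {
    case "$1" in
        ssh|raw) echo "makedumpfile -F -l --message-level 7 -d 31" ;;
        *)       echo "makedumpfile -l --message-level 7 -d 31" ;;
    esac
}

# Succeed if the collector command line contains -F, meaning the saved
# vmcore is flattened and needs "makedumpfile -R" before analysis.
needs_rearrange() {
    case " $1 " in
        *" -F "*) return 0 ;;
        *)        return 1 ;;
    esac
}

for target in ssh raw nfs; do
    collector=$(default_core_collector "$target")
    if needs_rearrange "$collector"; then
        echo "$target: $collector (flattened; run makedumpfile -R afterwards)"
    else
        echo "$target: $collector"
    fi
done
```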
|
|
|
|
|
Caveats |
|
======= |
|
|
|
Console frame-buffers and X are not properly supported. If you typically run
with something along the lines of "vga=791" on your kernel command line or
have X running, console video will be garbled when a kernel is booted via
kexec. Note that the kdump kernel should still be able to create a dump,
and when the system reboots, video should be restored to normal.
|
|
|
|
|
Notes |
|
===== |
|
|
|
Notes on resetting video: |
|
------------------------- |
|
|
|
Video is a notoriously difficult issue with kexec. Video cards contain ROM
code that controls their initial configuration and setup. This code is
nominally accessed and executed from the BIOS, and is otherwise not safely
executable. Since the purpose of kexec is to reboot the system without
re-executing the BIOS, it is difficult, if not impossible, to reset video
cards with kexec. As a result, if a system crashes while running in a
graphical mode (e.g. running X), the screen may appear to be frozen while the
dump capture is taking place. A serial console will of course reveal that the
system is operating and capturing a vmcore image, but a casual observer will
see the system as hung until the dump completes and a true reboot is
executed.
|
|
|
There are two possible workarounds for this issue. The first is to add
--reset-vga to the kexec command line options in /etc/sysconfig/kdump. This
tells kdump to write some reasonable default values to the video card
register file, in the hope of returning it to a text mode in which boot
messages are visible on the screen. It does not work with all video cards,
however. The second is to try adding vga16fb.ko to the extra_modules list in
/etc/kdump.conf. This attempts to use the video card in framebuffer mode,
which can blank the screen prior to the start of a dump capture.
|
|
|
Notes on rootfs mount |
|
--------------------- |
|
|
|
Dracut is designed to mount the rootfs by default, and refuses to continue
if mounting it fails. Kdump therefore currently leaves rootfs mounting to
dracut. For the time being, we assume that a proper root= option is passed
to the dracut initramfs on the kernel command line. If you need to modify
"KDUMP_COMMANDLINE=" in /etc/sysconfig/kdump, make sure that the appropriate
root= options are copied from /proc/cmdline. In general, it is best to append
command line options using "KDUMP_COMMANDLINE_APPEND=" instead of replacing
the original command line completely.
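
If "KDUMP_COMMANDLINE=" does have to be replaced, the root-related options
can be picked out of the running kernel's command line first. The following is
a minimal sketch; the function name extract_root_args is hypothetical, and
keeping rootflags= and rootfstype= in addition to root= is an assumption of
this example rather than something kdump requires.

```shell
#!/bin/sh
# Hypothetical helper: print the root-related options found in a kernel
# command line string. Besides root=, it also keeps rootflags= and
# rootfstype=, which is an assumption of this sketch.
extract_root_args() (
    set -f                      # do not glob-expand words of the command line
    out=""
    for word in $1; do
        case "$word" in
            root=*|rootflags=*|rootfstype=*) out="${out:+$out }$word" ;;
        esac
    done
    echo "$out"
)

# Example with a made-up command line; on a real system one would pass
# "$(cat /proc/cmdline)" instead.
extract_root_args "BOOT_IMAGE=/vmlinuz root=/dev/mapper/vg-root ro rootflags=subvol=root quiet"
```

The printed options can then be placed at the start of the replacement
"KDUMP_COMMANDLINE=" value.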
|
|
|
Notes on watchdog module handling |
|
--------------------------------- |
|
|
|
If a watchdog is active in the first kernel, its module must be loaded in
the crash kernel, so that the watchdog is either deactivated or kicked in
the second kernel. Otherwise, a watchdog reboot may occur while the vmcore
is being saved. When the dracut watchdog module is enabled, it installs the
kernel watchdog module of the active watchdog device in the initrd.
kexec-tools always adds "-a watchdog" to dracut_args if there is at least
one active watchdog and the user has not explicitly added "-o watchdog" to
dracut_args in kdump.conf. If a watchdog module (such as hp_wdt) has not
been written for the watchdog-core framework, this option has no effect and
the module is not added. Please note that only the systemd watchdog daemon
is supported as the watchdog kick application.
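
The rule for adding "-a watchdog" can be sketched as a small decision
function. This is only an illustration of the rule as described here, not the
code kexec-tools actually uses; the function name is hypothetical, and the
number of active watchdogs is passed in as a plain argument instead of being
detected.

```shell
#!/bin/sh
# Hypothetical helper: $1 is the number of active watchdog devices,
# $2 is the user's dracut_args string from kdump.conf.
# Prints the dracut_args that would result from the rule above.
watchdog_dracut_args() {
    case " $2 " in
        *" -o watchdog "*)
            # The user explicitly omitted the dracut watchdog module.
            echo "$2"
            return 0 ;;
    esac
    if [ "$1" -ge 1 ]; then
        # At least one active watchdog: add the dracut watchdog module.
        echo "${2:+$2 }-a watchdog"
    else
        echo "$2"
    fi
}

watchdog_dracut_args 1 "--debug"
watchdog_dracut_args 1 "-o watchdog"
```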
|
|
|
Notes for disk images |
|
--------------------- |
|
|
|
The kdump initramfs is a critical component for capturing the crash dump,
but it is generated strictly for the machine it will run on and is not
portable. If you install a new machine from an existing disk image
(e.g. VMs created from a disk image or snapshot), kdump can easily break
because of hardware changes or disk ID changes. It is therefore strongly
recommended not to include the kdump initramfs in the disk image in the
first place. This saves space, and kdumpctl builds the initramfs
automatically if it is missing. If you have already installed a machine from
a disk image that has a kdump initramfs embedded, rebuild the initramfs
manually with the "kdumpctl rebuild" command, or kdump may not work as
expected.
|
|
|
Notes on encrypted dump target |
|
------------------------------ |
|
|
|
Currently, kdump does not work well with an encrypted dump target.
First, the user has to enter the password manually in the capture kernel,
so a working interactive terminal is required there. Second, and more
seriously, certain encryption setups cause OOM problems. For example, the
default LUKS2 setup uses a memory-hard key derivation function to mitigate
brute force attacks, so the memory needed to mount the encrypted target
cannot be reduced. In such cases, you have to either reserve enough memory
for the crash kernel accordingly or update your encryption setup.
It is recommended to use a non-encrypted target (e.g. a remote target)
instead.
|
|
|
Notes on device dump |
|
-------------------- |
|
|
|
Device dump allows drivers to append dump data to the vmcore, so that you
can collect driver-specific debug information. Drivers can append data
without any limit, and the data is stored in memory, which may put
significant pressure on memory. Device dump is therefore disabled by default
by passing the "novmcoredd" command line option to the kdump capture kernel.
If you want to collect debug data with device dump, modify the
"KDUMP_COMMANDLINE_APPEND=" value in /etc/sysconfig/kdump and remove the
"novmcoredd" option. You may also need to increase the "crashkernel=" value
accordingly to avoid OOM issues.
In addition, the kdump initramfs does not automatically include device
drivers that support device dump; only the device drivers required for the
dump target setup are included. To ensure that the device dump data is
included in the vmcore, force the relevant device drivers to be included
with the "extra_modules" option in /etc/kdump.conf.
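
Removing "novmcoredd" from the append string is a simple word filter. The
sketch below shows one way to do it on a string; the function name
remove_cmdline_opt is hypothetical, the sample value is shortened for
readability, and the result still has to be written back to
/etc/sysconfig/kdump by hand or by a tool of your choice.

```shell
#!/bin/sh
# Hypothetical helper: print the command line in $2 with every word that
# is exactly equal to $1 removed.
remove_cmdline_opt() (
    set -f                      # do not glob-expand words of the command line
    out=""
    for word in $2; do
        if [ "$word" != "$1" ]; then
            out="${out:+$out }$word"
        fi
    done
    echo "$out"
)

# Shortened sample value, for illustration only.
append="irqpoll nr_cpus=1 reset_devices novmcoredd rd.kdumploglvl=3"
echo "KDUMP_COMMANDLINE_APPEND=\"$(remove_cmdline_opt novmcoredd "$append")\""
```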
|
|
|
|
|
Parallel Dumping Operation |
|
========================== |
|
|
|
Kexec allows the kdump capture kernel to use multiple cpus, so parallel
processing can accelerate dumping substantially, especially compression and
filtering. For example:

1. "makedumpfile -c --num-threads [THREAD_NUM] /proc/vmcore dumpfile"
2. "makedumpfile -c /proc/vmcore dumpfile"

Command 1 performs better than command 2 if THREAD_NUM is larger than two
and the number of usable cpus is larger than THREAD_NUM.
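
The rule of thumb above can be written as a small chooser. This is a sketch
only: the function name is hypothetical, the thread and cpu counts are passed
in as arguments, and the command is printed rather than run.

```shell
#!/bin/sh
# Hypothetical helper: $1 is the desired thread count, $2 the number of
# usable cpus. Prints the parallel command only when the rule of thumb
# above says it should perform better.
makedumpfile_cmd() {
    if [ "$1" -gt 2 ] && [ "$2" -gt "$1" ]; then
        echo "makedumpfile -c --num-threads $1 /proc/vmcore dumpfile"
    else
        echo "makedumpfile -c /proc/vmcore dumpfile"
    fi
}

makedumpfile_cmd 4 8    # more than two threads and spare cpus
makedumpfile_cmd 2 8    # too few threads to benefit
```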
|
|
|
Notes on how to use multiple cpus on a capture kernel on x86 system: |
|
|
|
Make sure that the capture kernel supports the disable_cpu_apicid kernel
option, which is needed to avoid an x86-specific hardware issue (*). The
disable_cpu_apicid kernel option is automatically appended by the kdumpctl
script and is ignored if the kernel doesn't support it.
|
|
|
Specify how many cpus the capture kernel should use with the nr_cpus kernel
option in /etc/sysconfig/kdump. nr_cpus is 1 by default.

Use only as many cpus as the capture kernel needs.
Warning: Don't use too many cpus in the capture kernel, or it may panic due
to running out of memory.
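
Changing nr_cpus amounts to replacing one word in the capture kernel command
line. A minimal sketch follows, assuming the option lives in a
KDUMP_COMMANDLINE_APPEND-style string; the function name set_nr_cpus and the
shortened sample value are made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the command line in $2 with nr_cpus set to $1.
# An existing nr_cpus= word is replaced; otherwise one is appended.
set_nr_cpus() (
    set -f                      # do not glob-expand words of the command line
    out=""
    found=0
    for word in $2; do
        case "$word" in
            nr_cpus=*) word="nr_cpus=$1"; found=1 ;;
        esac
        out="${out:+$out }$word"
    done
    [ "$found" -eq 1 ] || out="${out:+$out }nr_cpus=$1"
    echo "$out"
)

# Shortened sample value, for illustration only.
set_nr_cpus 4 "irqpoll nr_cpus=1 reset_devices"
```

Remember that each additional cpu also costs memory in the capture kernel, so
the "crashkernel=" reservation may need to grow with it.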
|
|
|
(*) Without the disable_cpu_apicid kernel option, the capture kernel may
hang, reset the system or power off at boot, depending on your system and
the runtime situation at the time of the crash.
|
|
|
|
|
Debugging Tips |
|
============== |
|
|
|
- One can drop into a shell before or after saving the vmcore by using the
kdump_pre/kdump_post hooks. Use the following in one of the pre/post scripts
to drop into a shell.
|
|
|
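#!/bin/bash
# Serial console to attach the shell to; adjust for your system.
_ctty=/dev/ttyS0
# Start an interactive login shell in a new session on that terminal.
setsid /bin/sh -i -l 0<>"$_ctty" 1<>"$_ctty" 2<>"$_ctty"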
|
|
|
One might have to change the terminal depending on what they are using. |
|
|
|
- Serial console logging for virtual machines |
|
|
|
I generally use "virsh console <domain-name>" to get to the serial console.
I noticed that after the dump is saved, the system reboots, and when the grub
menu shows up some of the previously logged messages are no longer there.
That means any important debugging info at the end will be lost.
|
|
|
One can log serial console as follows to make sure messages are not lost. |
|
|
|
virsh ttyconsole <domain-name> |
|
ln -s <name-of-tty> /dev/modem |
|
minicom -C /tmp/console-logs |
|
|
|
Now minicom should be logging serial console in file console-logs. |
|
|
|
- Using the logger to output kdump log messages |
|
|
|
You can configure the kdump log level for the first kernel in
/etc/sysconfig/kdump. For example:
|
|
|
KDUMP_STDLOGLVL=3 |
|
KDUMP_SYSLOGLVL=0 |
|
KDUMP_KMSGLOGLVL=0 |
|
|
|
The above configuration means that kdump messages are printed to the
console at level 3 (info), because KDUMP_STDLOGLVL is set to 3, while
KDUMP_SYSLOGLVL and KDUMP_KMSGLOGLVL are set to 0 (no logging). These are
also the current default log levels in the first kernel.
|
|
|
For the second kernel, you can set the log level by adding the
'rd.kdumploglvl=X' option to KDUMP_COMMANDLINE_APPEND in
/etc/sysconfig/kdump, where 'X' is the logging level. The default log level
in the second kernel is 3 (info). For example:
|
|
|
# cat /etc/sysconfig/kdump |grep rd.kdumploglvl |
|
KDUMP_COMMANDLINE_APPEND="irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off udev.children-max=2 panic=10 acpi_no_memhotplug transparent_hugepage=never nokaslr hest_disable novmcoredd rd.kdumploglvl=3" |
|
|
|
Logging levels: no logging(0), error(1), warn(2), info(3), debug(4)
|
|
|
The ERROR level designates error events that might still allow the application |
|
to continue running. |
|
|
|
The WARN level designates potentially harmful situations. |
|
|
|
The INFO level designates informational messages that highlight the progress |
|
of the application at coarse-grained level. |
|
|
|
The DEBUG level designates fine-grained informational events that are most |
|
useful to debug an application. |
|
|
|
Note: setting a log level to 0 disables the corresponding log output
entirely.
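
For quick reference, the numeric levels can be mapped back to their names
with a few lines of shell. The function name kdump_loglvl_name is
hypothetical; the loop simply decodes the first-kernel defaults quoted
earlier.

```shell
#!/bin/sh
# Hypothetical helper: print the name of a kdump log level number.
kdump_loglvl_name() {
    case "$1" in
        0) echo "no logging" ;;
        1) echo "error" ;;
        2) echo "warn" ;;
        3) echo "info" ;;
        4) echo "debug" ;;
        *) echo "invalid log level: $1" >&2; return 1 ;;
    esac
}

# Decode the first-kernel defaults quoted above.
for setting in KDUMP_STDLOGLVL=3 KDUMP_SYSLOGLVL=0 KDUMP_KMSGLOGLVL=0; do
    echo "${setting%%=*}: $(kdump_loglvl_name "${setting#*=}")"
done
```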
|
|
|
At present, the logger works in both the first kernel(kdump service debugging) |
|
and the second kernel. |
|
|
|
In the first kernel, you can find the historical logs with the journalctl |
|
command and check kdump service debugging information. In addition, the |
|
'kexec -d' debugging messages are also saved to /var/log/kdump.log in the |
|
first kernel. For example: |
|
|
|
[root@ibm-z-109 ~]# ls -al /var/log/kdump.log |
|
-rw-r--r--. 1 root root 63238 Oct 28 06:40 /var/log/kdump.log |
|
|
|
If you want debugging information about building the kdump initramfs, enable
the '--debug' option in dracut_args in /etc/kdump.conf, and then rebuild the
kdump initramfs as below:
|
|
|
# systemctl restart kdump.service |
|
|
|
That will rebuild the kdump initramfs and generate some logs in journald. You
can find the dracut logs with the journalctl command.
|
|
|
In the second kernel, kdump automatically saves kexec-dmesg.log in the same
directory as the vmcore. The log file includes debugging messages such as
dmesg and journald logs. For example:
|
|
|
[root@ibm-z-109 ~]# ls -al /var/crash/127.0.0.1-2020-10-28-02\:01\:23/ |
|
drwxr-xr-x. 2 root root 67 Oct 28 02:02 . |
|
drwxr-xr-x. 6 root root 154 Oct 28 02:01 .. |
|
-rw-r--r--. 1 root root 21164 Oct 28 02:01 kexec-dmesg.log |
|
-rw-------. 1 root root 74238698 Oct 28 02:01 vmcore |
|
-rw-r--r--. 1 root root 17532 Oct 28 02:01 vmcore-dmesg.txt |
|
|
|
If you want more debugging information from the second kernel, add the
'rd.debug' option to KDUMP_COMMANDLINE_APPEND in /etc/sysconfig/kdump, and
then reload kdump so that the change takes effect.
|
|
|
In addition, you can add the 'rd.memdebug=X' option to
KDUMP_COMMANDLINE_APPEND in /etc/sysconfig/kdump to output additional
information about kernel module memory consumption during loading.
|
|
|
For more details, please refer to /etc/sysconfig/kdump, or the man pages of
dracut.cmdline and kdump.conf.