On my system the following initrd-release is generated:
...
VERSION="4 dracut-048 dracut-048"
...
VERSION is not defined in /etc/os-release, so the variable is
concatenated with its previous value:
* "4" comes from the kernel build system since dracut is called from the
kernel install hook ("4" is a major kernel version);
* first "dracut-048" comes from the "systemd-initrd" module;
* second "dracut-048" comes from the "base" module.
When extra devices are added, initqueue should be enabled to make sure
those devices are present, so following services and routines could
use those devices.
See PR #442 for more detail.
If a process (maybe plymouth) was still pinning /oldroot, then shutdown
would
- kill -9 $pid
- umount_a
- umount_a
in a very short timeframe. A small sleep hopefully lets the scheduler free
up /oldroot in the mean time.
JobRunningTimeoutSec now affects how long can start jobs for device
units stay in the "running" state. Disabling default job timeout via
JobTimeoutSec=0 doesn't disable running state timeout. We need to set
running state timeout as well.
Note that doing this the other way around has effect on generic timeout,
i.e. disabling running state timeout disables generic timeout. But doing
it this way we would create implicit dependency on fairly new
systemd-234. However, by setting both options we don't create dependency
on specific systemd version.
Extend "rd.memdebug" to "4", and "make_trace_mem" to "4+:komem".
Add new "cleanup_trace_mem" to cleanup the trace if active.
Signed-off-by: Xunlei Pang <xlpang@redhat.com>
The current method for memory debug is to use "rd.memdebug=[0-3]",
it is not enough for debugging kernel modules. For example, when we
want to find out which kernel module consumes a large amount of memory,
"rd.memdebug=[0-3]" won't help too much.
A better way is needed to achieve this requirement, this is useful for
kdump OOM debugging.
The principle of this patch is to use kernel trace to track slab and
buddy allocation calls during kernel module loading(module_init), thus
we can analyze all the trace data and get the total memory consumption.
As for large slab allocation, it will probably fall into buddy allocation,
thus tracing "mm_page_alloc" alone should be enough for the purpose(this
saves quite some trace buffer memory, also large free is quite unlikey
during module loading, we neglect those memory free events).
The trace events include memory calls under "tracing/events/":
kmem/mm_page_alloc
We also inpect the following events to detect the module loading:
module/module_load
module/module_put
Since we use filters to trace events, the final trace data size won't
be too big. Users can adjust the trace buffer size via "trace_buf_size"
kernel boot command line as needed.
We can get the module name and task pid from "module_load" event which
also mark the beginning of the loading, and module_put called by the
same task pid implies the end of the loading. So the memory events
recorded in between by the same task pid are consumed by this module
during loading(i.e. modprobe or module_init()).
With these information, we can record the rough total memory(the larger,
the more precise the result will be) consumption involved by each kernel
module loading.
Thus we introduce this shell script to find out which kernel module
consumes a large amount of memory during loading. Use "rd.memdebug=4"
as the tigger.
After applying this patch and specifying "rd.memdebug=4", during booting
it will print out something extra like below:
0 pages consumed by "pata_acpi"
0 pages consumed by "ata_generic"
1 pages consumed by "drm"
0 pages consumed by "ttm"
0 pages consumed by "drm_kms_helper"
835 pages consumed by "qxl"
0 pages consumed by "mii"
6 pages consumed by "8139cp"
0 pages consumed by "virtio"
0 pages consumed by "virtio_ring"
9 pages consumed by "virtio_pci"
1 pages consumed by "8139too"
0 pages consumed by "serio_raw"
0 pages consumed by "crc32c_intel"
199 pages consumed by "virtio_console"
0 pages consumed by "libcrc32c"
9 pages consumed by "xfs"
From the print, we see clearly that "qxl" consumed the most memory.
This file will be installed as a separate executable named "tracekomem"
in the following patch.
Signed-off-by: Xunlei Pang <xlpang@redhat.com>
If the initramfs was built with prefix=/run/... /run can't be mounted
with noexec, otherwise no binary can be run.
Guard against it by looking where /bin/sh is really located.
How to reproduce:
host# ./dracut.sh -o 'dracut-systemd systemd systemd-initrd' --local -f ./initramfs.img
host# qemu-system-x86_64 -initrd ./initramfs.img \
-append 'root=/dev/sda1 rd.cmdline=ask rd.hostonly=0' \
...
Enter additional kernel command line parameter (end with ctrl-d or .)
> rd.break
> .
...
There is no "Break before switch_root"
...
Signed-off-by: Evgeny Vereshchagin <evvers@ya.ru>
crypt/parse-crypt.sh generate initqueue job which always call
dev_unit_name() with an argument beginning with "-". This results
in the following error:
dracut-initqueue[307]: + systemd-escape -p -cfb4aa43-2f02-4c6b-a313-60ea99288087
dracut-initqueue[307]: systemd-escape: invalid option -- 'c'
When systemd's crypttab generator parsed crypttab, it tells
systemd about several devices which may not appear until later
in the boot sequence, and which are not needed while dract is running.
This can particularly happen when an md array is encrypted,
and the array is newly degraded so that it doesn't appear until
dracut runs mdraid_start.sh.
This can result in systemd printing warning messages which are
inappropriate.
So tell systemd that the timeout for each of these is zero.
This is involves splitting some functionality out of wait_for_dev()
That function does two things:
- creates 'finished' hooks so that dracut will wait for the device,
and
- sets the systemd timeout for the device to zero, so systemd doesn't
wait.
We only want the second of these for most encrypted devices.
So split that out into a new function set_systemd_timeout_for_dev(),
and call it from parse-crypt.sh
Signed-off-by: NeilBrown <neilb@suse.de>
--
This version fixes the missing redirect from /etc/crypttab
NeilBrown
The only reason we add swap devices to host-only mode (added in
dd5875499e) is to allow us to process
resume= arguments passed on the kernel command line when the swap
partition lives on something slightly more complex than a normal
partion (e.g. in an LVM or RAID setup).
By adding the device to host_devs, the necessary LVM and RAID hooks
are added and thus the underlying storage will be initialised OK, and
the 95resume module handles the waiting for the device (via udev rules
creating the /dev/resume symlink).
So ultimately, we do not need to hard-code the waiting for the swap
devices into the initramfs at build time as the waiting part can be
dynamic.
This makes things more resiliant to swap partitions disappearing and
being reformatted etc.
Inspired by a patch by Martin Whitaker on Mageia bug:
https://bugs.mageia.org/show_bug.cgi?id=12305
Dracut will generate systemd units for additional devices that should be
brought up during boot, e.g. swap devices. These unit files are broken
symlinks with \ in the filename, e.g.
/etc/systemd/system/initrd.target.wants/dev-disk-by\x2duuid-e6a54f99\x2da4fd\x2d4931\x2da956\x2d1c642bcfee5e.device.
Both the backslash and the broken symlink causes problems for shell
scripts, [ -e "$file" ] isn't enough and read requires the additional -r
argument to not react on the \.
nvidia driver needs this via modprobe script.
Needs to do change the group after a device node got created.
Add chown instead of chgrp which can also change the owner of a file.
Ask Stefand Dirsch <sndirsch@suse.de> for details.
Signed-off-by: Thomas Renninger <trenn@suse.de>
In case of systemd is used the timeout already is set to 180s, compare
with file: modules.d/98systemd/dracut-initqueue.sh
Do the same if systemd is not used, e.g. in kdump case.
Signed-off-by: Thomas Renninger <trenn@suse.de>
When 'initqueue' is called with an invalid command it'll generate
invalid job scripts. This will lead to confusing error messages
later on.
So abort in these cases and print out a warning.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Add more corner cases from systemd's
unit_name_from_path_instance() C function.
Signed-off-by: Thorsten Behrens <tbehrens@suse.com>
Signed-off-by: Thomas Renninger <trenn@suse.de>
When generating units for devices the administrator might
want to use a different timeout than the default.
So implement a new parameter 'rd.timeout' for this.
References: bnc#878770
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Thomas Renninger <trenn@suse.de>
This mimicks the similar move of os-release which was done in systemd. These
files are not configuration, but part of the OS.
Still symlinks are in place for compatibility, but those should probably be
dropped eventually.
some programs e.g. systemd-journald expect a directory in /var/log as
the marker to do some actions. Here journald tries to flush
/run/log/journal to /var/log/journal, if the directory is seen.
/var/log is now a symlink to /run/initramfs/log.