On mission-critical systems, whatever task you are about to perform (migration, troubleshooting, configuration or just a simple reboot), it is always a good idea to capture the current run-time parameters first, especially on systems you do not work with on a day-to-day basis. This step alone can save you hours of troubleshooting if things go south.
Check-list
- Current kernel version
- Current runlevel
- Process list and resource utilization
- Environment variables
- Loaded modules
- Mounts
- Network configuration
- Firewall rules
- Sysctl values
- Logged in users
- Current date/time
- Open network connections
- Service run-time variables
- Attached hardware
Current kernel version
Sometimes a system has more than one kernel installed, and a non-default one might be the only one your application currently supports. Check the running kernel release using
uname -r
command and the current boot parameters using
cat /proc/cmdline
command.
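To see at a glance whether the running kernel is only one of several installed, something like the sketch below may help. It assumes kernel images are stored as /boot/vmlinuz-<release>, which is common but not universal, and list_kernels is a hypothetical helper name rather than a standard tool.

```shell
# list_kernels [BOOT_DIR] [RUNNING_RELEASE]
# Prints every kernel release found in BOOT_DIR and marks the running one.
# Assumption: images are named vmlinuz-<release> (typical, not guaranteed).
list_kernels() {
    boot_dir=${1:-/boot}
    running=${2:-$(uname -r)}
    for image in "$boot_dir"/vmlinuz-*; do
        [ -e "$image" ] || continue          # the glob matched nothing
        release=${image#"$boot_dir"/vmlinuz-}
        if [ "$release" = "$running" ]; then
            echo "$release (running)"
        else
            echo "$release"
        fi
    done
}

list_kernels
```

If the release marked as running is not the one your boot loader starts by default, a plain reboot will bring the system up on a different kernel.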
Loaded modules
Some system components could depend on kernel modules loaded manually after boot. If this goes unnoticed, you might end up with a missing interface or another component failing after a reboot. Check loaded modules with lsmod
command.
Current runlevel
Some services might start only at a specific runlevel. Check the current runlevel using either
runlevel
or
who -r
command.
Process list and resource utilization
Some services that keep the cogs turning might have been started manually and do not have an appropriate unit/init.d definition. Check the current process list (with all arguments) using one of the following commands:
ps -efww
or
pstree
.
Pro-tip:
If the host runs VMs or containers, you may want to use their native tools to list them in a more human-readable form.
Understanding the resource profile of some processes might also come in handy. Use top -b -n 1
command to get a snapshot. To collect statistics over a period of time you can use top -b -d S -n N
- replace S
with the interval in seconds and N
with the number of samples to capture (alternatively, you can use the atop
tool to achieve similar results).
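A minimal sketch wrapping the sampling command above. sample_top is a hypothetical helper name, and the interval, count and output path in the example are arbitrary.

```shell
# sample_top INTERVAL COUNT OUTFILE
# Records COUNT batch-mode iterations of top, INTERVAL seconds apart,
# into OUTFILE.
sample_top() {
    interval=$1
    count=$2
    outfile=$3
    top -b -d "$interval" -n "$count" > "$outfile"
}

# Example: six samples taken ten seconds apart.
#   sample_top 10 6 /tmp/top-samples.txt
```

Each iteration starts with the summary header of top, so individual samples are easy to tell apart in the output file.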
Environment variables
While reading the environment variables of the current shell session is as easy as running env
command (set
also works, but it additionally lists unexported shell variables and functions), reviewing the environment variables of running Linux processes requires a bit of knowledge of the /proc
file-system. To check the environment variables of a running process, you first need to know its PID (use pgrep
or pidof
commands with the name of the process, or just run ps auxw
and grep the process name from the output). Knowing the PID, you can read the environment variables of that process from the /proc/<PID>/environ
file. Entries in this file are separated by NUL bytes rather than newlines, and it reflects the environment the process was started with.
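Because the entries in /proc/<PID>/environ are NUL-separated, printing the file directly gives one long unreadable line. A small sketch that makes it readable follows; proc_env is a hypothetical helper name.

```shell
# proc_env PID
# Prints the environment of process PID, one NAME=value pair per line,
# by translating the NUL separators into newlines.
# Reading a process owned by another user requires root.
proc_env() {
    tr '\0' '\n' < "/proc/$1/environ"
}

# Example: environment the current shell was started with.
proc_env "$$" | sort
```

Combined with pgrep this becomes a one-liner, for example proc_env "$(pgrep -o NAME)" for the oldest process matching NAME.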
Mounts (and swap)
Apart from safeguarding yourself before rebooting the system, this one is especially useful when migrating an application to a different host, as the target host might lack network access to your NAS. Check the currently mounted partitions (including networked ones) along with their mount options using mount
command. Active swap areas are listed in the /proc/swaps
file.
Note:
Pay attention to encrypted file-systems: they may need a passphrase or key to be unlocked before they can be mounted again.
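To spot the mounts that depend on network access, the mount table can be filtered by file-system type. The sketch below reads /proc/mounts (the kernel's view of the current mounts) instead of parsing the output of mount. network_mounts is a hypothetical helper name, and the list of types is an assumption that you will likely need to extend for your environment.

```shell
# network_mounts [MOUNTS_FILE]
# Prints mount point, type and source of every network file system listed
# in MOUNTS_FILE (default: /proc/mounts).
# Assumption: only the types named in the pattern below count as networked.
network_mounts() {
    awk '$3 ~ /^(nfs|nfs4|cifs|fuse\.sshfs)$/ { print $2, $3, $1 }' \
        "${1:-/proc/mounts}"
}

network_mounts
```

Each line of output names a share the target host has to be able to reach before the application is moved.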
Network configuration
A lot of networking can be configured at run-time and lost after a reboot or migration. Use the following commands to capture it:
- Current interfaces with IP addresses:
ip address list
- Current routing configuration:
ip route list
- Current routing rules:
ip rule list
Pro-tip:
You may need to do this for all network namespaces. Get the list of named namespaces using ip netns
command.
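The three commands and the namespace tip can be combined roughly as follows. This is a sketch: snapshot_network and the output file names are hypothetical, and only namespaces registered with ip netns are covered (namespaces created by other tools, such as container runtimes, may not appear in that list).

```shell
# snapshot_network OUTDIR
# Stores addresses, routes and routing rules of the default namespace and
# of every named namespace in OUTDIR. Entering namespaces requires root.
snapshot_network() {
    outdir=$1
    mkdir -p "$outdir"

    ip address list > "$outdir/address.default.txt"
    ip route list   > "$outdir/route.default.txt"
    ip rule list    > "$outdir/rule.default.txt"

    # `ip netns list` prints the namespace name in the first column.
    for ns in $(ip netns list 2>/dev/null | awk '{ print $1 }'); do
        ip netns exec "$ns" ip address list > "$outdir/address.$ns.txt"
        ip netns exec "$ns" ip route list   > "$outdir/route.$ns.txt"
        ip netns exec "$ns" ip rule list    > "$outdir/rule.$ns.txt"
    done
}

# Example (run as root):
#   snapshot_network /tmp/net-snapshot
```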
Firewall rules
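The easiest way to capture those is to run iptables-save
command and redirect its output to a file. This works on Ubuntu and RedHat/CentOS (including firewalld) as well as on other Netfilter-based distributions. On newer releases where rules are managed natively through nftables, iptables-save may not show everything; nft list ruleset gives the complete picture there.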
Sysctl values
To get a snapshot of current sysctl settings, use sysctl -a
command.
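A snapshot is most useful when it can be compared with a later one. The sketch below sorts the output so that two captures line up for diff; sysctl_snapshot and the file paths are hypothetical names.

```shell
# sysctl_snapshot OUTFILE
# Writes all readable sysctl settings to OUTFILE in sorted order.
# Errors from keys that only root may read are discarded.
sysctl_snapshot() {
    sysctl -a 2>/dev/null | sort > "$1"
}

# Before the change:
#   sysctl_snapshot /tmp/sysctl.before
# After the change or reboot:
#   sysctl_snapshot /tmp/sysctl.after
#   diff /tmp/sysctl.before /tmp/sysctl.after
```

Expect some noise in the diff: a few keys, such as kernel.random.uuid or the counters in fs.dentry-state, change on their own between reads.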
Logged in users
To list currently logged-in users use either w
or who
command.
Current date/time
This one might look unimportant at first glance, but it can save you a bunch of time when trying to understand why you observe log entries from the future (or similar anomalies). Just run date
command and keep the output for the sake of your own sanity.
Open network connections
The output might be huge on a heavily loaded system, but it can help you understand what is missing if things go south. To capture it, use one of the following:
netstat -anop
lsof -i4 -n -P
Service run-time variables
Some services can be reconfigured at run-time without persisting the changes. Examples of such services include MySQL and HAProxy. Make sure you capture those settings as well (refer to the specific product documentation for more information on listing the running configuration).
Attached hardware
Before migrating applications that require specific hardware components (like GSM modems), you may need to check what is currently available using lspci
and lsusb
commands or just lshw
.
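Most of the checklist can be collected in one go. The script below is a sketch rather than a finished tool: SNAP_DIR, capture and the output file names are hypothetical, tools that are not installed are recorded as skipped, and several commands (iptables-save in particular) only return complete output when run as root.

```shell
#!/bin/sh
# Hypothetical snapshot directory; set SNAP_DIR to store files elsewhere.
SNAP_DIR="${SNAP_DIR:-/tmp/snapshot-$(uname -n)-$(date +%Y%m%d-%H%M%S)}"
mkdir -p "$SNAP_DIR"

# capture NAME COMMAND [ARGS...]
# Runs COMMAND and stores its stdout and stderr in $SNAP_DIR/NAME.txt.
# A missing tool is noted in the file instead of aborting the run.
capture() {
    name=$1
    shift
    if command -v "$1" >/dev/null 2>&1; then
        "$@" > "$SNAP_DIR/$name.txt" 2>&1 || true
    else
        echo "skipped: $1 not found" > "$SNAP_DIR/$name.txt"
    fi
    return 0
}

capture date        date
capture kernel      uname -a
capture cmdline     cat /proc/cmdline
capture runlevel    who -r
capture processes   ps -efww
capture modules     lsmod
capture mounts      mount
capture swaps       cat /proc/swaps
capture addresses   ip address list
capture routes      ip route list
capture rules       ip rule list
capture firewall    iptables-save
capture sysctl      sysctl -a
capture users       who
capture connections netstat -anop
capture pci         lspci
capture usb         lsusb

echo "Snapshot stored in $SNAP_DIR"
```

Per-process environment variables and service run-time settings are left out on purpose, since they depend on which processes and products matter on the given host.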