On mission-critical systems, whatever task you are about to perform (migration, troubleshooting, configuration, or just a simple reboot), it is always a good idea to capture the current run-time parameters first, especially on systems you do not work with closely day to day. This step alone can save you hours of troubleshooting if things go south.

Checklist

  • Current kernel version
  • Current runlevel
  • Process list and resource utilization
  • Environment variables
  • Loaded modules
  • Mounts
  • Network configuration
  • Firewall rules
  • Sysctl values
  • Logged in users
  • Current date/time
  • Open network connections
  • Service run-time variables
  • Attached hardware

Current kernel version

Some systems have more than one kernel installed, and a non-default one might be the only version your application supports. Record the running kernel release with uname -r, and check the current boot parameters with cat /proc/cmdline.

Loaded modules

Some system components may depend on kernel modules that were loaded manually after boot. If this goes unnoticed, you might end up with a missing interface or another component failing after a reboot. Check the loaded modules with the lsmod command.

Current runlevel

Some services start only at a specific runlevel. Check the current runlevel using either the runlevel or the who -r command.

Process list and resource utilization

Some of the services that keep the cogs turning might have been started manually and have no unit or init.d definition. Check the current process list (with all arguments) using one of the following commands: ps -efhax or pstree -a (the -a flag makes pstree show command-line arguments).

Pro-tip:

If you prefer a more human-readable list of VMs or containers, use their native tools (for example, virsh list --all or docker ps).

Understanding the resource profile of some processes might also come in handy. Use top -b -n 1 to get a single snapshot. To collect statistics over a period of time, use top -b -d S -n N, replacing S with the interval in seconds and N with the number of intervals to capture. Alternatively, the atop tool can achieve similar results.

Environment variables

Reading the environment variables of the current shell session is as easy as running the env command (set works too, but it also lists shell-local variables and functions). Reviewing the environment of another running Linux process requires a bit of knowledge of the /proc file system. You first need to know the PID of the process: use pgrep or pidof with the process name, or run ps -auxw and grep for the name in the output. Knowing the PID, you can read the environment variables of that process from the /proc/<PID>/environ file.
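The entries in /proc/<PID>/environ are separated by NUL bytes rather than newlines, so the raw file is awkward to read. A minimal sketch of a wrapper follows; the function name print_proc_env is a hypothetical choice, not a standard tool.

```shell
# print_proc_env <PID>
# Prints the environment of a running process, one VAR=value per line.
# /proc/<PID>/environ uses NUL bytes as separators, so translate them to newlines.
# (print_proc_env is an illustrative name, not an existing command.)
print_proc_env() {
    tr '\0' '\n' < "/proc/$1/environ"
}

# Example: dump the environment of the current shell.
print_proc_env "$$"
```

Reading the file for a process owned by another user normally requires root. Note also that the file reflects the environment the process was started with, so variables changed by the process later on will usually not show up.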

Mounts (and swap)

Apart from safeguarding you before a reboot, this one is especially useful when migrating an application to a different host, as the target host might lack network access to your NAS. Check the currently mounted partitions (including networked ones) along with their mount options using the mount command. Active swap areas can be listed with cat /proc/swaps.

Note:

Pay attention to encrypted file systems: they may need a passphrase or key file to be unlocked again after a reboot.
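For the migration case, it can help to single out the network file systems from the full mount list. The sketch below filters /proc/mounts by file system type; the function name list_network_mounts and the set of types it matches are assumptions, so extend the pattern to whatever your environment uses.

```shell
# list_network_mounts [mounts-file]
# Prints device, mount point, type and options of network file systems.
# Reads /proc/mounts unless another file in the same format is given.
# The type list below is an assumption; adjust it as needed.
list_network_mounts() {
    awk '$3 ~ /^(nfs|nfs4|cifs|fuse\.sshfs)$/ { print $1, $2, $3, $4 }' \
        "${1:-/proc/mounts}"
}

# Example: show network mounts on the current host (prints nothing if there are none).
list_network_mounts
```

Saving this output next to the full mount list gives a short list of remote exports that the target host must be able to reach.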

Network configuration

A lot of networking can be configured at run time and is lost after a reboot or migration. Use the following commands to capture it:

  • Current interfaces with IP addresses: ip address list
  • Current routing configuration: ip route list
  • Current routing rules: ip rule list

Pro-tip:

You may need to do this for all network namespaces. Get the list of named namespaces with the ip netns list command.
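Repeating the three commands for every namespace can be scripted. The following is a sketch only: capture_netns_config is a hypothetical helper name, and it assumes the namespaces are named ones that ip netns list reports.

```shell
# capture_netns_config
# Prints addresses, routes and routing rules for every named network namespace.
# (capture_netns_config is an illustrative name, not an existing command.)
capture_netns_config() {
    # "ip netns list" may print "name (id: N)"; keep only the name.
    ip netns list | awk '{ print $1 }' | while read -r ns; do
        [ -n "$ns" ] || continue
        echo "### namespace: $ns"
        # </dev/null keeps the inner commands from consuming the loop's input.
        ip netns exec "$ns" ip address list < /dev/null
        ip netns exec "$ns" ip route list < /dev/null
        ip netns exec "$ns" ip rule list < /dev/null
    done
}

# Example (as root): capture_netns_config > netns-config.txt
```

Namespaces created by container runtimes are often not registered under a name, so they may not appear in this list.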

Firewall rules

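The easiest way to capture these is to run the iptables-save command as root and redirect its output to a file. This works on Ubuntu and RedHat/CentOS (including firewalld) as well as on other Netfilter-based distributions. On newer systems where the firewall is managed through nftables, iptables-save may not show every rule, so also capture the output of nft list ruleset.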

Sysctl values

To get a snapshot of the current sysctl settings, use the sysctl -a command.

Logged in users

To list the currently logged-in users, use either the w or the who command.

Current date/time

This one might look unimportant at first glance, but it can save you a lot of time when you are trying to understand why you see log entries from the future (or similar anomalies). Just run the date command and keep the output for the sake of your own sanity.

Open network connections

The output might be huge on a heavily loaded system, but it can help you understand what is missing if things go south. To capture it, use one of the following:

  • netstat -anop
  • lsof -i4 -n -P

Service run-time variables

Some services can be reconfigured at run time without those changes being persisted. Examples include MySQL and HAProxy. Make sure you capture those settings as well (refer to the specific product documentation for how to list the running configuration).

Attached hardware

Before migrating applications that require specific hardware components to run (such as GSM modems), check what is currently available using the lspci and lsusb commands, or just lshw.
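Most of the checklist can be collected in one pass. The sketch below writes the output of each command to its own file; the function names capture and take_snapshot, the file names and the default directory are all illustrative assumptions. It uses ps -ef as a simpler stand-in for the process listing, and it records missing tools or failed commands in the output file instead of stopping.

```shell
# capture <name> <command> [args...]
# Stores stdout and stderr of the command in "$SNAPSHOT_DIR/<name>.txt".
# A missing tool or a non-zero exit status is noted in the file.
capture() {
    name=$1
    shift
    if command -v "$1" > /dev/null 2>&1; then
        "$@" > "$SNAPSHOT_DIR/$name.txt" 2>&1 ||
            echo "exit status: $?" >> "$SNAPSHOT_DIR/$name.txt"
    else
        echo "skipped: $1 not found" > "$SNAPSHOT_DIR/$name.txt"
    fi
}

# take_snapshot [directory]
# Collects the run-time state into the directory and prints its path.
take_snapshot() {
    SNAPSHOT_DIR=${1:-"./snapshot-$(date +%Y%m%d-%H%M%S)"}
    mkdir -p "$SNAPSHOT_DIR" || return 1
    capture date        date
    capture kernel      uname -r
    capture cmdline     cat /proc/cmdline
    capture runlevel    who -r
    capture modules     lsmod
    capture processes   ps -ef
    capture mounts      mount
    capture swaps       cat /proc/swaps
    capture addresses   ip address list
    capture routes      ip route list
    capture rules       ip rule list
    capture firewall    iptables-save
    capture sysctl      sysctl -a
    capture users       who
    capture connections netstat -anop
    capture pci         lspci
    capture usb         lsusb
    echo "$SNAPSHOT_DIR"
}

# Example (as root): take_snapshot /root/snapshots/before-reboot
```

Run it as root, since iptables-save and the process column of netstat need elevated privileges. Taking a second snapshot after the change and comparing the two directories with diff -ru is a quick way to spot what went missing. Per-process environment variables and service run-time settings are not covered here and still need to be captured separately.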