Introduction
Access/User Management
Controlling Processes
  Monitor Process
File System
Disk Management
Scripting and Shell
  Bash Scripting
Python Scripting
Tools - Cron
Troubleshooting
Account provisioning
Performing backups
Troubleshooting
Access Management
User Management
Controlling Processes
File System
Disk Management
Password Management
Log Management
Job Scheduling
Performance Issues
Certificate Management
Package Management
Introduction:
KeyNotes:
Keywords:
Identity Management
Best Practices:
su doesn’t record the commands executed as root, but it does create a log entry that states who
became root and when
/etc/passwd - user identification numbers (UIDs for short) are mapped to usernames
/etc/security
Process ownership
SUDO
RBAC Model –
Password encryption
Password Vaults
SU (Examples)
Commands:
passwd
/etc/shadow
/etc/group
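The UID-to-username mapping kept in /etc/passwd can be looked up with a few lines of shell. This is a minimal sketch, assuming the usual colon-separated passwd layout; the function name uid_to_name and its optional file argument are hypothetical, added only for illustration.

```shell
#!/bin/bash
# Print the username that owns a given UID, reading a passwd-format file.
# Fields are colon-separated: name:password:UID:GID:comment:home:shell
uid_to_name() {
    local uid="$1"
    local passwd_file="${2:-/etc/passwd}"   # optional second argument (assumption)
    awk -F: -v uid="$uid" '$3 == uid { print $1; exit }' "$passwd_file"
}

# Example: UID 0 is conventionally root
if [ -r /etc/passwd ]; then
    uid_to_name 0
fi
```

Reading the file directly only covers local accounts; users that come from a directory service would not appear in it.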
Controlling Processes
Keywords:
Process ID
Life cycle of process – Runnable, Sleeping (IO or CPU Cycles), Zombie, Stopped
Multithreading
Monitor Process
ps aux
ps -eaf
top
strace
kill -9 pid
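Because kill -9 (SIGKILL) gives a process no chance to clean up, a common pattern is to try SIGTERM first and escalate only if the process is still there. The sketch below assumes that pattern; the function name stop_process and the default grace period of 5 seconds are hypothetical choices, not something these notes prescribe.

```shell
#!/bin/bash
# Ask a process to exit with SIGTERM, then fall back to SIGKILL (kill -9)
# if it is still alive after a grace period.
stop_process() {
    local pid="$1"
    local grace="${2:-5}"                       # seconds to wait before escalating
    kill -TERM "$pid" 2>/dev/null || return 1   # no such process, or not permitted
    local i=0
    while [ "$i" -lt "$grace" ]; do
        kill -0 "$pid" 2>/dev/null || return 0  # signal 0 only tests for existence
        sleep 1
        i=$((i + 1))
    done
    kill -KILL "$pid" 2>/dev/null               # SIGKILL cannot be caught or ignored
    return 0
}
```

A typical use would be to find the PID with ps aux or top and then call, for example, stop_process 1234 10 (1234 being a placeholder PID).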
File System
Keywords:
Commands:
touch
mount, umount
chmod - change permissions
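As a small illustration of touch and chmod together, the sketch below creates an empty file in a temporary directory and sets octal mode 640 on it. The directory variable demo_dir and the file name report.txt are hypothetical.

```shell
#!/bin/bash
# Create an empty file and restrict it to: owner read/write, group read, others none.
demo_dir=$(mktemp -d)
touch "$demo_dir/report.txt"        # creates the file if it does not exist
chmod 640 "$demo_dir/report.txt"    # 6 = rw- (owner), 4 = r-- (group), 0 = --- (others)
ls -l "$demo_dir/report.txt"        # mode column should read -rw-r-----
```

Each octal digit is the sum of read (4), write (2), and execute (1) for owner, group, and others in that order.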
Disk Management
Partitioning
LVM (PV, VG, LV) – Logical volume management
fdisk -l
Environment variables
Bash Scripting
#!/bin/bash
echo "Hello, world!"
$ chmod +x helloworld
$ ./helloworld
Python Scripting
Tools - Cron
standard tool for scheduling tasks
crontab -l
crontab -e
Common use cases:
Simple reminders
Backup
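The backup use case might look like the following sketch: a small script that archives a directory into a dated tarball, plus a crontab line that runs it nightly. The script path /usr/local/bin/backup.sh, the directories in the crontab comment, and the function name backup_dir are all hypothetical.

```shell
#!/bin/bash
# Archive a directory into a dated tarball; intended to be run from cron.
# Hypothetical crontab entry (added with `crontab -e`) to run it daily at 02:00:
#   0 2 * * * /usr/local/bin/backup.sh /home/alice /var/backups
backup_dir() {
    local src="$1"
    local dest="$2"
    local stamp
    stamp=$(date +%Y%m%d)
    local archive
    archive="$dest/$(basename "$src")-$stamp.tar.gz"
    mkdir -p "$dest" || return 1
    # -C switches to the parent directory so the archive holds relative paths
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    echo "$archive"
}

# Run only when called with a source and a destination directory
if [ "$#" -eq 2 ]; then
    backup_dir "$1" "$2"
fi
```

The five crontab fields are minute, hour, day of month, month, and day of week, so "0 2 * * *" means 02:00 every day.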
Troubleshooting
Level of troubleshooting
Collaborate - conference calls, email, direct conversation, chat rooms (Jabber, Spark)
Best Practices
Know What Changed - One of the largest sources of problems in a system is change. When everything
has been running smoothly for a long time and then a problem appears, one of the first things you
should ask is “What changed?”
System load average is probably the fundamental metric you start from when troubleshooting a sluggish
system.
The three numbers after "load average" in the uptime output (2.03, 20.17, and 15.09 in the example)
are the 1-, 5-, and 15-minute load averages on the machine, respectively.
$ uptime
13:35:03 up 103 days, 8 min, 5 users, load average: 2.03, 20.17, 15.09
Explanation:
A single-CPU system with a load average of 1 means the single CPU is under constant load. If that
single-CPU system has a load average of 4, there is four times as much load as the system can handle,
so three out of four processes are waiting for resources. The reported load average is not adjusted
for the number of CPUs, so on a two-CPU system a load average of 1 means the equivalent of one of the
two CPUs is busy at all times; that is, the system is 50% loaded. So a load of 1 on a single-CPU
system is the same as a load of 4 on a four-CPU system in terms of the share of available resources
used.
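The arithmetic in this explanation is just load average divided by CPU count. A minimal sketch, assuming a Linux system with /proc/loadavg and the nproc command available; the function name load_per_cpu is hypothetical.

```shell
#!/bin/bash
# Divide a load average by the CPU count to get load per CPU.
# A result near 1.00 means every CPU is, on average, fully busy.
load_per_cpu() {
    awk -v load="$1" -v cpus="$2" 'BEGIN { printf "%.2f\n", load / cpus }'
}

# On Linux, the 1-minute load average is the first field of /proc/loadavg
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    read -r one_min rest < /proc/loadavg
    echo "1-minute load per CPU: $(load_per_cpu "$one_min" "$(nproc)")"
fi
```

With the figures from the explanation, load_per_cpu 1 2 gives 0.50 (the two-CPU case) and load_per_cpu 4 4 gives 1.00.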
A system that runs out of RAM resources often appears to have I/O-bound load, since once the system
starts using swap storage on the disk, it can consume disk resources and cause a downward spiral as
processes slow to a halt
Scenarios
- Kill - If you notice a process consuming all of your CPU and want to kill it, note its PID in top or ps and pass that PID to kill.
Before diagnosing specific system problems, it’s important to be able to rule out memory issues.
Mem: 1024176k total, 997408k used, 26768k free, 85520k buffers
Swap: 1004052k total, 4360k used, 999692k free, 286040k cached
Note: The Linux kernel also has an out-of-memory (OOM) killer that can kick in if the system runs
dangerously low on RAM. When a system is almost out of RAM, the OOM killer will start killing
processes to free memory.
When you see high I/O wait, one of the first things you should check is whether the machine is using a
lot of swap.
$ sudo iostat
$ sudo iotop
$ sar
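To check swap usage without reading top interactively, the same figures can be pulled from /proc/meminfo. This is a sketch under the assumption of a Linux system that exposes SwapTotal and SwapFree there; the function name swap_used_kb and its optional file argument are hypothetical.

```shell
#!/bin/bash
# Report swap in use (kB) from a meminfo-format file: SwapTotal minus SwapFree.
swap_used_kb() {
    local meminfo="${1:-/proc/meminfo}"
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { free = $2 }
         END           { print total - free }' "$meminfo"
}

if [ -r /proc/meminfo ]; then
    echo "Swap used: $(swap_used_kb) kB"
fi
```

For the sample top output above (1004052k total, 999692k free), the subtraction gives the 4360k shown as used.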
Booting Issues
BIOS
GRUB
Ping
Is the interface up?
Network routes
Firewall rules - /sbin/iptables
DNS issues - /etc/resolv.conf
Packet capture tools – tcpdump, wireshark
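For the DNS step of this checklist, a first look is usually at which name servers the resolver is configured to use. The sketch below assumes the standard resolv.conf format with "nameserver" lines; the function name list_nameservers is hypothetical.

```shell
#!/bin/bash
# List the DNS servers configured in a resolv.conf-format file.
list_nameservers() {
    local conf="${1:-/etc/resolv.conf}"
    # Only lines whose first word is "nameserver" are printed; comments are skipped
    awk '$1 == "nameserver" { print $2 }' "$conf"
}

if [ -r /etc/resolv.conf ]; then
    list_nameservers
fi
```

If the list is empty or points at an unreachable address, name lookups can fail even when ping to an IP address works.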
Bootstrapping
Init process
Single-user mode does not allow network operation; you need physical access to the system console to
use it.
You can type <Control-D> instead of a password to bypass single-user mode and continue with a normal
boot.
The fsck command is run during a normal boot to check and repair filesystems.