20 Linux + Networking Troubleshooting Labs (with commands + comments)

 



🐧 Linux Labs


✅ Lab 1 — Disk Full Issue

Problem

Server says: No space left on device

df -h                            # check disk usage per filesystem
du -sh /* 2>/dev/null            # find which top-level folder is large
cd /var/log                      # logs usually grow big
du -sh * | sort -rh | head       # biggest logs first
rm old.log                       # delete old logs (space is freed only once no process holds the file open)
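The df check above can be turned into something scriptable. This is a minimal sketch, not part of the original lab: the helper name `over_threshold` and the 90% limit in the example are assumptions chosen for illustration.

```shell
# Hypothetical helper: print mount points at or above a usage limit (percent).
# Reads `df -P` output on stdin; -P keeps each filesystem on one line.
# Assumes mount points contain no spaces.
over_threshold() {
  awk -v limit="$1" 'NR > 1 {
    use = $5
    sub(/%/, "", use)
    if (use + 0 >= limit + 0) print $6, use "%"
  }'
}

# Example: df -P | over_threshold 90
```

Reading from stdin keeps the helper easy to try against saved df output from another server.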

✅ Lab 2 — High CPU Usage

Problem

Server slow

top                              # find high CPU process
ps aux --sort=-%cpu | head       # sort by CPU usage
kill PID                         # ask the process to stop (SIGTERM)
kill -9 PID                      # force kill (SIGKILL) only if it ignores SIGTERM
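The "kill, then kill -9 if needed" escalation can be sketched as one function. The name `stop_gracefully` and the 10-second default are assumptions for illustration, not something the lab prescribes.

```shell
# Hypothetical helper: send SIGTERM, wait up to N seconds, then SIGKILL.
# $1 = PID, $2 = seconds to wait (default 10).
stop_gracefully() {
  pid="$1"
  timeout="${2:-10}"
  kill "$pid" 2>/dev/null || return 1    # no such process, or not permitted
  i=0
  while [ "$i" -lt "$timeout" ]; do
    kill -0 "$pid" 2>/dev/null || return 0   # process is gone
    sleep 1
    i=$((i + 1))
  done
  kill -9 "$pid" 2>/dev/null             # still alive: force kill
}

# Example: stop_gracefully 1234 5
```

Giving the process time to handle SIGTERM lets it flush files and close connections, which SIGKILL never allows.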

✅ Lab 3 — High Memory Usage

free -m                          # memory summary in MiB
top                              # check memory hog process (press Shift+M to sort by memory)
ps aux --sort=-%mem | head       # sort by memory
kill PID                         # stop process
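For scripts, the "available" figure that free reports can be read straight from /proc/meminfo. A minimal sketch, assuming a Linux kernel that exposes MemAvailable; the function name `mem_available_pct` is hypothetical.

```shell
# Hypothetical helper: print available memory as a whole percentage of total.
# $1 = meminfo-style file (defaults to /proc/meminfo).
mem_available_pct() {
  awk '/^MemTotal:/     { total = $2 }
       /^MemAvailable:/ { avail = $2 }
       END { if (total > 0) printf "%d\n", avail * 100 / total }' "${1:-/proc/meminfo}"
}

# Example: [ "$(mem_available_pct)" -lt 10 ] && echo "low memory"
```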

✅ Lab 4 — Service Not Starting

systemctl status nginx           # check failure reason
journalctl -xe                   # detailed logs
journalctl -u nginx              # service logs only
systemctl restart nginx          # restart after fix

✅ Lab 5 — Port Already in Use

Error: Address already in use

ss -lntp                         # show listening TCP ports (run as root to see all process names)
lsof -i :80                      # find process using port 80
kill PID                         # stop it
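When several sockets are listed, picking the PID out by eye is error-prone. The sketch below extracts it from `ss -lntp` output; the helper name `pids_on_port` is an assumption, and the parsing relies on the usual `users:(("name",pid=N,fd=M))` column that ss prints.

```shell
# Hypothetical helper: list PIDs listening on a TCP port.
# Reads `ss -lntp` output on stdin; $1 = port number.
pids_on_port() {
  awk -v port="$1" '
    $4 ~ (":" port "$") {              # 4th column is Local Address:Port
      s = $0
      while (match(s, /pid=[0-9]+/)) { # one socket can list several PIDs
        print substr(s, RSTART + 4, RLENGTH - 4)
        s = substr(s, RSTART + RLENGTH)
      }
    }' | sort -un
}

# Example: sudo ss -lntp | pids_on_port 80
```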

✅ Lab 6 — Permission Denied Error

ls -l file                       # check permissions and owner
chmod 755 file                   # rwxr-xr-x: owner can write, everyone can read and execute
chown user:user file             # change owner and group

✅ Lab 7 — File Deleted by Mistake

history                          # check what you deleted
ls -a                            # check hidden files
# restore from backup if one exists

👉 Lesson: always backup


✅ Lab 8 — Log Monitoring Live

tail -f /var/log/syslog          # watch logs live (/var/log/messages on RHEL-based systems)
grep ERROR file.log              # filter errors

✅ Lab 9 — Find Large Files

du -ahx / 2>/dev/null | sort -rh | head   # biggest files and directories first (-x stays on one filesystem)

✅ Lab 10 — Process Running in Background

jobs                             # list jobs
fg %1                            # bring job 1 to the foreground
bg %1                            # resume stopped job 1 in the background


🌐 Networking Labs


✅ Lab 11 — Server Not Reachable

ping 8.8.8.8                     # check internet connectivity
ip a                             # verify IP assigned
ip route                         # check gateway

✅ Lab 12 — DNS Not Resolving

nslookup google.com              # test DNS
dig google.com                   # detailed DNS info
cat /etc/resolv.conf             # check DNS server
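To see which configured resolver is failing, each nameserver can be queried directly. A minimal sketch: the helper name `list_nameservers` is hypothetical, and on systems using systemd-resolved the file may list only a local stub address such as 127.0.0.53.

```shell
# Hypothetical helper: print the nameserver addresses from a resolv.conf file.
# $1 = file to read (defaults to /etc/resolv.conf).
list_nameservers() {
  awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Example: query each resolver directly, 2-second timeout, one try.
# for ns in $(list_nameservers); do
#   if dig +short +time=2 +tries=1 @"$ns" google.com >/dev/null; then
#     echo "$ns ok"
#   else
#     echo "$ns FAILED"
#   fi
# done
```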

✅ Lab 13 — Website Not Opening

curl -I http://localhost         # check response headers
systemctl status nginx           # verify service running
ss -lntp                         # check port listening

✅ Lab 14 — Port Blocked

telnet ip 80                     # test port (replace ip with the server address)
nc -zv ip 80                     # check if open, without sending data

✅ Lab 15 — Trace Network Path

traceroute google.com # find where packets drop

✅ Lab 16 — Test API

curl -v https://api.site.com # verbose HTTP debug

✅ Lab 17 — SSH Login Failed

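ssh -v user@ip                   # attempt login; -v shows where it fails
chmod 400 key.pem                # fix key permissions (ssh rejects private keys readable by others)
cat /etc/ssh/sshd_config         # check ssh config
systemctl restart sshd           # restart ssh (the service is named ssh on Debian/Ubuntu)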

✅ Lab 18 — Find Open Ports

netstat -tulnp                   # all listening TCP and UDP ports (legacy tool)
ss -tulnp                        # modern equivalent

✅ Lab 19 — Check Packet Traffic

tcpdump -i eth0                  # capture packets live (needs root; check your interface name with ip a)

👉 Helps debug API or network issue


✅ Lab 20 — Auto Restart Service Script

#!/bin/bash
# monitor.sh: start nginx if it is not running
if ! systemctl is-active --quiet nginx; then   # if not running
    systemctl start nginx                      # start service
fi

Save it as monitor.sh, then:

chmod +x monitor.sh              # make executable
./monitor.sh                     # run script
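The same check can be written as a function that takes the service name and logs when it acts. This is a sketch under assumptions: the name `ensure_active` and the log path in the example are hypothetical.

```shell
# Hypothetical helper: start a systemd service only if it is not active,
# printing a timestamped line when it has to act.
ensure_active() {
  svc="$1"
  if systemctl is-active --quiet "$svc"; then
    return 0
  fi
  echo "$(date '+%Y-%m-%d %H:%M:%S') $svc not active, starting it"
  systemctl start "$svc"
}

# Example: ensure_active nginx >> /var/log/monitor.log
```

Run from cron or a systemd timer, the log shows how often the service actually needed a restart, which is the real problem to chase.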

🎯 FINAL CHALLENGE (DO ALL WITHOUT HELP)

👉 Install nginx
👉 Host webpage
👉 Break it
👉 Fix it
👉 Fill disk
👉 Clean it
👉 Stop DNS
👉 Fix it
👉 Kill service
👉 Auto-restart

If you can do this calmly →
🔥 You’re already ahead of many DevOps engineers



The 20 labs above cover the basic commands. The next set is scenario-based troubleshooting: the kind of thing that actually happens at 2 AM on production 😅

Each scenario gives:

👉 Scenario
👉 Symptoms
👉 Commands (with comments)
👉 What to think about while debugging


🚀 Advanced Scenario-Based Linux + Networking Labs


🐧 System / Linux Scenarios


✅ Scenario 21 — Server Very Slow (Load issue)

Symptoms

  • SSH slow

  • Commands lagging

uptime                           # check load average
top                              # live CPU/memory view
htop                             # easier visualization (may need installing)
ps aux --sort=-%cpu | head       # find CPU hog
ps aux --sort=-%mem | head       # find memory hog
kill PID                         # stop bad process

🧠 Think: CPU? Memory? Too many processes?


✅ Scenario 22 — Too Many Processes (fork bomb or leak)

ps -e | wc -l                    # count total processes
top                              # check runaway processes
ulimit -u                        # max user processes limit
pkill process_name               # kill all by name

✅ Scenario 23 — Disk Inodes Full (hidden problem)

Symptoms

Disk shows free but cannot create files

df -i                            # check inode usage
find / -xdev -type f | wc -l     # count files on the root filesystem

👉 Too many small files → clean /tmp or logs
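Counting all files says inodes are used up but not where. A minimal sketch that ranks directories by file count; the helper name `files_per_dir` is hypothetical and it assumes file names without newlines.

```shell
# Hypothetical helper: show the directories holding the most files.
# $1 = directory to scan (default /), $2 = how many rows to show (default 10).
files_per_dir() {
  find "${1:-/}" -xdev -type f 2>/dev/null \
    | sed 's|/[^/]*$||' \
    | sort | uniq -c | sort -rn | head -n "${2:-10}"
}

# Example: files_per_dir /var 5
```

The sed step strips the file name from each path, so uniq -c ends up counting files per directory.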


✅ Scenario 24 — App Keeps Crashing After Restart

systemctl status app             # crash reason
journalctl -u app -f             # live logs
dmesg | tail                     # kernel errors (OOM etc)

🧠 Look for: "Out of memory: Killed process" (the kernel OOM killer) or "segfault" in the kernel log
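Scanning the kernel log for those messages can be sketched as a small filter. The helper name `crash_hints` is hypothetical, and the patterns cover the common kernel wording only, so treat an empty result as "nothing obvious", not proof of health.

```shell
# Hypothetical helper: keep only kernel log lines that hint at an OOM kill
# or a segfault. Reads log text on stdin.
crash_hints() {
  grep -Ei 'out of memory|oom-kill|segfault'
}

# Examples:
#   dmesg | crash_hints | tail
#   journalctl -k -b -1 | crash_hints   # kernel log of the previous boot
```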


✅ Scenario 25 — Server Rebooted Automatically

last reboot                      # reboot history
uptime                           # since when up
journalctl -b -1                 # previous boot logs (needs persistent journal storage)

✅ Scenario 26 — File Changed Recently (who changed?)

stat file                        # file timestamps
ls -lt                           # recently modified files first
history                          # your own shell history (other users have their own)

✅ Scenario 27 — CPU Spikes Every Minute

crontab -l                       # user cron jobs
ls /etc/cron*                    # system cron jobs
top                              # catch process

👉 Cron scripts often cause spikes


✅ Scenario 28 — User Cannot Login

id user                          # check user exists
passwd -S user                   # account status
tail -f /var/log/auth.log        # login logs (/var/log/secure on RHEL-based systems)
chmod 700 /home/user             # fix home permission

✅ Scenario 29 — App Cannot Write to Folder

ls -ld folder                    # check permission
whoami                           # current user
chown appuser folder             # fix ownership
chmod 755 folder                 # allow access

✅ Scenario 30 — Server Time Wrong (critical for cloud)

date                             # current time
timedatectl                      # time settings
timedatectl set-ntp true         # enable NTP sync


🌐 Networking / Cloud Scenarios


✅ Scenario 31 — Cannot Reach Internet

ping 8.8.8.8                     # test raw connectivity
ip route                         # check gateway
route -n                         # routing table (legacy net-tools command)

👉 If ping to the IP works but hostnames fail → likely a DNS issue


✅ Scenario 32 — DNS Slow

time nslookup google.com         # measure DNS time
cat /etc/resolv.conf             # DNS server config

✅ Scenario 33 — Website Returns 500 Error

curl -I localhost                # check HTTP code
tail -f /var/log/nginx/error.log # backend errors
systemctl status nginx           # is nginx itself healthy?

✅ Scenario 34 — Website Returns 502 Bad Gateway

systemctl status backend         # backend service running?
ss -lntp                         # backend port open?
curl localhost:3000              # test backend directly

✅ Scenario 35 — HTTPS Not Working

curl -v https://site.com                                       # TLS debug
openssl s_client -connect site.com:443 -servername site.com   # certificate details (-servername sends SNI)

✅ Scenario 36 — Network Packet Loss

ping -c 50 google.com            # check packet loss %
mtr google.com                   # hop-by-hop loss
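For alerting, the loss figure can be pulled out of the ping summary line. A minimal sketch, assuming the Linux iputils summary format ("..., 4% packet loss, ..."); the helper name `loss_percent` is hypothetical.

```shell
# Hypothetical helper: extract the packet loss percentage from ping output.
# Reads ping output on stdin and prints just the number.
loss_percent() {
  sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Example: ping -c 50 google.com | loss_percent
```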

✅ Scenario 37 — Firewall Blocking Port

ufw status                       # firewall rules
iptables -L -n                   # iptables rules (-n skips slow DNS lookups)
ufw allow 8080                   # allow port

✅ Scenario 38 — Service Listening Only on Localhost

App works locally but not externally

ss -lntp # check listening IP

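If the Local Address column shows:

127.0.0.1:8080

👉 the app only accepts connections from the server itself. Change its bind/listen address in the app config to:

0.0.0.0

(all interfaces) or to the server's specific IP, then restart the app.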
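Spotting loopback-only listeners can be sketched as a filter over the ss output. The helper name `loopback_only` is hypothetical; it checks the Local Address column for 127.x.x.x and [::1].

```shell
# Hypothetical helper: print listening sockets bound only to loopback.
# Reads `ss -lnt` (or `ss -lntp`) output on stdin.
loopback_only() {
  awk 'NR > 1 && ($4 ~ /^127\./ || $4 ~ /^\[::1\]/) { print $4 }'
}

# Example: ss -lnt | loopback_only
```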

✅ Scenario 39 — Large File Transfer Slow

iftop                            # bandwidth usage
nload                            # live network traffic
scp -C file user@ip:/tmp         # compress transfer

✅ Scenario 40 — Continuous Monitoring Script

watch -n 1 df -h                 # auto refresh disk
watch -n 1 free -m               # auto refresh memory
watch -n 1 "ss -s"               # sockets summary

🎯 Pro Engineer Tip

When debugging:

Always think in order:

1. Is service running?
2. Is port open?
3. Is network reachable?
4. Is DNS working?
5. Are logs clean?

👉 Most issues are solved with just this flow.
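The ordered flow above can be sketched as a runner that stops at the first failing step. Everything here is illustrative: the helper name `first_failure`, the `label|command` input format, and the nginx, port 80, and example.com values in the usage comment are assumptions. The last step (reading logs) stays manual.

```shell
# Hypothetical helper: run checks in order and stop at the first failure.
# Reads lines of the form "label|command" on stdin.
first_failure() {
  while IFS='|' read -r label cmd; do
    # </dev/null stops the command from eating the remaining check lines
    if sh -c "$cmd" </dev/null >/dev/null 2>&1; then
      echo "ok: $label"
    else
      echo "FAIL: $label"
      return 1
    fi
  done
}

# Example:
# first_failure <<'EOF'
# service running|systemctl is-active --quiet nginx
# port listening|ss -lnt | grep -q ':80 '
# network reachable|ping -c 1 -W 2 8.8.8.8
# DNS working|getent hosts example.com
# EOF
```

Stopping at the first failure mirrors the mental order: there is no point testing DNS while the service itself is down.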
