Collectl (collect for Linux) is a single tool that integrates the functions of various tools: sar, iostat, mpstat, top, slabtop, netstat, nfsstat, ps, and others.
– Supported: Linux
– Requirement: Perl
Collectl features:
– Run from the command line or as a daemon
– Various output formats: raw, gnuplot, gexpr (ganglia), sexpr, lexpr, csv (--sep ,)
– Send data to other programs (ganglia) remotely via socket instead of writing to a file
– IPMI monitoring for fans and temperature sensors
– Support module (Perl scripts) for customized checks
– Monitor per-process disk reads/writes and find the top processes keeping the disk busy
The last one is the most impressive feature; I haven't found another Linux tool that can do it. (DTrace can on Solaris.)
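For context on where per-process disk figures can come from: Linux kernels built with task I/O accounting expose counters in /proc/<pid>/io. The sketch below only reads that file directly; it is my assumption, not something shown in this post, that this is the kind of data a tool like collectl builds on. The function name proc_io_kb is made up for illustration.

```shell
# proc_io_kb FILE: print cumulative disk read/write totals in KB
# from a /proc/<pid>/io style file (defaults to the current process).
# Assumes lines of the form "read_bytes: N" and "write_bytes: N".
proc_io_kb() {
    awk -F': *' '
        $1 == "read_bytes"  { r = $2 }   # bytes actually fetched from storage
        $1 == "write_bytes" { w = $2 }   # bytes actually sent to storage
        END { printf "read_kb=%d write_kb=%d\n", r / 1024, w / 1024 }
    ' "${1:-/proc/self/io}"
}
```

For example, "proc_io_kb /proc/1234/io" would report the totals for PID 1234 (a placeholder PID), provided you have permission to read that file. These are running totals, not per-second rates.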
collectl examples
#extended help, all options
$collectl -x
#-s: what to monitor, e.g. c = cpu, d = disk; list all with "collectl --showsubsys"
#-c 5: collect 5 samples and exit
#-i 2: sample every 2 seconds
#-oT: preface output with time only; list all with "collectl --showoptions"
$collectl -sc -c5 -i2 --verbose -oT
waiting for 2 second sample...
# CPU SUMMARY (INTR, CTXSW & PROC /sec)
#Time User Nice Sys Wait IRQ Soft Steal Idle CPUs Intr Ctxsw Proc RunQ Run Avg1 Avg5 Avg15
12:39:34 0 0 0 0 0 1 0 97 1 1082 23 0 76 1 0.42 0.42 0.44
12:39:36 0 0 0 0 0 1 0 97 1 1088 24 0 76 1 0.42 0.42 0.44
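Since this output is plain whitespace-separated text, it is easy to post-process. Here is a minimal sketch that averages the Idle column, assuming the exact column layout shown above (with -oT the time is field 1, so Idle is field 9). The function name avg_idle is hypothetical, not part of collectl.

```shell
# avg_idle: read "collectl -sc --verbose -oT" style output on stdin
# and print the mean of the Idle column (field 9 in the layout above).
avg_idle() {
    awk '
        # Only data rows start with an HH:MM:SS timestamp; header lines
        # ("#...") and the "waiting for ..." message do not match.
        $1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ { sum += $9; n++ }
        END { if (n) printf "%.1f\n", sum / n }
    '
}
```

Usage would look like "collectl -sc -c5 -i2 --verbose -oT | avg_idle". If you change the -o options or the subsystems, the field number will need adjusting.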
The following demonstrates how collectl identifies the process reading or writing the most data to disk.
#Hammer the disk by writing about 50 MB of data with dd
$dd if=/dev/urandom of=test bs=1k count=50000
#collectl identifies the "dd" process
#in top mode, sort by iokb (total I/O KB); list sort keys with "collectl --showtopopts"
$collectl -i2 --top iokb
TOP PROCESSES sorted by iokb (counters are /sec) 12:50:31
# PID User PR PPID THRD S VSZ RSS CP SysT UsrT Pct AccuTime RKB WKB MajF MinF Command
6861 root 18 6784 0 R 3M 572K 0 0.91 0.00 45 0:00.91 0 3680 0 97 dd
1 root 15 0 0 S 2M 632K 0 0.00 0.00 0 0:28.21 0 0 0 0 init
2 root RT 1 0 S 0 0 0 0.00 0.00 0 0:00.00 0 0 0 0 migration/0
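To pull the heaviest writer out of a snapshot like the one above, a small awk filter is enough. This is a sketch that assumes the 18-column layout shown here (WKB is field 15, Command is field 18) and that the snapshot has been saved as plain text; the function name top_writer is made up for illustration.

```shell
# top_writer: read a saved collectl top-mode snapshot on stdin and print
# "PID WKB Command" for the process with the highest WKB value.
top_writer() {
    awk '
        # Process rows start with a numeric PID; the title line and the
        # "# PID ..." header line do not, so they are skipped.
        $1 ~ /^[0-9]+$/ && NF >= 18 {
            if ($15 + 0 > max) { max = $15 + 0; pid = $1; cmd = $18 }
        }
        # Prints nothing when no process wrote anything in the interval.
        END { if (pid != "") print pid, max, cmd }
    '
}
```

Only the first word of the command is kept, so a command line with arguments will be truncated to the program name.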