Collectl, an all-in-one tool for collecting Linux statistical data

http://honglus.blogspot.com/search/label/Troubleshooting

Collectl (collect for Linux) is a single tool that integrates the functions of various tools: sar, iostat, mpstat, top, slabtop, netstat, nfsstat, ps, and others.
– Supported: Linux 
– Requirement: Perl 
Collectl features: 
– Runs from the command line or as a daemon
– Various output formats: raw, gnuplot, gexpr (ganglia), sexpr, lexpr, csv (--sep ,)
– Send data to other programs (ganglia) remotely via socket instead of writing to a file 
– IPMI monitoring for fans and temperature sensors 
– Supports modules (Perl scripts) for customized checks
– Monitors per-process disk reads/writes to find the top processes keeping the disk busy
The last one is the most impressive feature; I haven’t found another Linux tool that can do it (DTrace can on Solaris).
collectl  examples

#help, all options
$collectl -x
#-s: what to monitor, e.g. c – cpu, d – disk; for the full list see “collectl --showsubsys”
#-c 5: collect 5 samples and exit
#-i2: take a sample every 2 seconds
#-oT: preface output with time only; for other options see “collectl --showoptions”
$collectl -sc -c5 -i2 --verbose -oT
waiting for 2 second sample...
# CPU SUMMARY (INTR, CTXSW & PROC /sec)
#Time      User  Nice   Sys  Wait   IRQ  Soft Steal  Idle  CPUs  Intr  Ctxsw  Proc  RunQ   Run   Avg1  Avg5 Avg15
12:39:34      0     0     0     0     0     1     0    97     1  1082     23     0    76     1   0.42  0.42  0.44
12:39:36      0     0     0     0     0     1     0    97     1  1088     24     0    76     1   0.42  0.42  0.44
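
If you want to post-process this kind of output in a script, the column header makes it easy to turn each sample into a record. Below is a minimal Python sketch, assuming the verbose CPU summary layout shown above; the names SAMPLE, parse_cpu_summary and busy_percent are my own illustrations, not part of collectl.

```python
# Sample text copied from the collectl output above.
SAMPLE = """\
waiting for 2 second sample...
# CPU SUMMARY (INTR, CTXSW & PROC /sec)
#Time      User  Nice   Sys  Wait   IRQ  Soft Steal  Idle  CPUs  Intr  Ctxsw  Proc  RunQ   Run   Avg1  Avg5 Avg15
12:39:34      0     0     0     0     0     1     0    97     1  1082     23     0    76     1   0.42  0.42  0.44
12:39:36      0     0     0     0     0     1     0    97     1  1088     24     0    76     1   0.42  0.42  0.44
"""


def parse_cpu_summary(text):
    """Return one dict per sample line, keyed by the names in the #Time header."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Time"):
            # Column names, with the leading '#' removed.
            header = line.lstrip("#").split()
            continue
        if line.startswith("#") or header is None:
            # Banner lines and anything printed before the header.
            continue
        fields = line.split()
        if len(fields) != len(header):
            # Not a sample line in the expected layout; skip it.
            continue
        rows.append(dict(zip(header, fields)))
    return rows


def busy_percent(row):
    """Hypothetical helper: CPU time that was not idle, in percent."""
    return 100 - int(row["Idle"])


if __name__ == "__main__":
    for row in parse_cpu_summary(SAMPLE):
        print(row["Time"], "busy:", busy_percent(row), "ctxsw/s:", row["Ctxsw"])
```

All values are kept as strings and converted only where needed, since the columns mix integers, floats and timestamps.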

The following demonstrates how collectl identifies the process reading or writing the most data to disk.
#Hammer the disk by writing about 50 MB of data with dd

$dd if=/dev/urandom of=test bs=1k count=50000
#collectl identifies the “dd” process
#in top mode, sort by “iokb” (total I/O KB); for other sort keys see “collectl --showtopopts”
$collectl -i2 --top iokb
TOP PROCESSES sorted by iokb (counters are /sec) 12:50:31
# PID  User     PR  PPID THRD S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
 6861  root     18  6784    0 R    3M  572K  0  0.91  0.00  45   0:00.91    0 3680    0   97 dd
    1  root     15     0    0 S    2M  632K  0  0.00  0.00   0   0:28.21    0    0    0    0 init
    2  root     RT     1    0 S     0     0  0  0.00  0.00   0   0:00.00    0    0    0    0 migration/0
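
The same idea works for the top-mode listing: read the RKB and WKB columns and pick the process with the largest sum. This is only a sketch, assuming the column layout shown above; TOP_SAMPLE, parse_top and busiest_by_io are hypothetical names of mine, not collectl features.

```python
# Sample text copied from the collectl --top output above.
TOP_SAMPLE = """\
TOP PROCESSES sorted by iokb (counters are /sec) 12:50:31
# PID  User     PR  PPID THRD S   VSZ   RSS CP  SysT  UsrT Pct  AccuTime  RKB  WKB MajF MinF Command
6861  root     18  6784    0 R    3M  572K  0  0.91  0.00  45   0:00.91    0 3680    0   97 dd
1  root     15     0    0 S    2M  632K  0  0.00  0.00   0   0:28.21    0    0    0    0 init
2  root     RT     1    0 S     0     0  0  0.00  0.00   0   0:00.00    0    0    0    0 migration/0
"""


def parse_top(text):
    """Return one dict per process line, keyed by the names in the '# PID' header."""
    header = None
    procs = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#") and "PID" in line:
            header = line.lstrip("# ").split()
            continue
        if header is None or not line:
            continue
        # Split at most len(header) - 1 times so that a command containing
        # spaces stays in the final Command field.
        fields = line.split(None, len(header) - 1)
        if len(fields) != len(header) or not fields[0].isdigit():
            continue
        procs.append(dict(zip(header, fields)))
    return procs


def busiest_by_io(procs):
    """Process with the highest read + write KB/sec (procs must be non-empty)."""
    return max(procs, key=lambda p: int(p["RKB"]) + int(p["WKB"]))


if __name__ == "__main__":
    top = busiest_by_io(parse_top(TOP_SAMPLE))
    print(top["PID"], top["Command"], "WKB/s:", top["WKB"])
```

collectl already sorts the listing by iokb, so the first process line carries the same answer; parsing is only worthwhile when you want to log or alert on the values.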