IO Monitoring

I/O Monitoring and Disk Bottlenecks

  • Disk Performance Problems:
    • Disk performance problems are often coupled with other factors, such as insufficient memory or inadequate network hardware and tuning.
    • A system is considered I/O-bound when the CPU sits idle waiting for I/O to complete or for network buffers to clear.
    • Symptoms can be misleading: slow I/O can look like insufficient memory, because memory buffers fill up or empty too slowly.
    • Network transfers may also wait on I/O, which reduces network throughput.
    • Real-time monitoring and tracing are essential for locating and mitigating disk bottlenecks.
    • Rare or non-repeating problems complicate the troubleshooting process.
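
One rough way to check whether the CPU is idling on I/O is to compare two samples of the aggregate "cpu" line in /proc/stat and see what share of the elapsed ticks was spent in iowait. The sketch below assumes the usual Linux field order (user, nice, system, idle, iowait, irq, softirq, steal); the function names are illustrative and not part of any tool.

```python
import time


def parse_cpu_line(line):
    """Return the first eight tick counters from the aggregate 'cpu' line."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
    # Assumed order: user nice system idle iowait irq softirq steal.
    # Guest time is skipped because it is already counted inside user time.
    return [int(value) for value in fields[1:9]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two 'cpu' lines."""
    first = parse_cpu_line(before)
    second = parse_cpu_line(after)
    deltas = [b - a for a, b in zip(first, second)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter


def sample_iowait(interval=1.0):
    """Read /proc/stat twice, 'interval' seconds apart (Linux only)."""
    with open("/proc/stat") as stat_file:
        before = stat_file.readline()
    time.sleep(interval)
    with open("/proc/stat") as stat_file:
        after = stat_file.readline()
    return iowait_percent(before, after)
```

Calling sample_iowait(5) on a Linux host returns the iowait share over five seconds. A persistently high value suggests the system is I/O-bound, but it does not say which device or process is responsible.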

iostat

iostat is the basic workhorse utility for monitoring I/O device activity on the system.

It can generate reports with a lot of information, with the precise content controlled by options. The general form of the command is:

    iostat [OPTIONS] [devices] [interval] [count]

Commonly used options include:

  • -k shows results in KB instead of blocks.
  • -m shows results in MB.
  • -x produces a detailed (extended) report.

Run with no arguments, iostat reports statistics accumulated since boot:

    iostat
    Linux 6.4.4-200.fc38.x86_64 (fedora) 	27/07/23 	_x86_64_	(8 CPU)
    
    avg-cpu:  %user   %nice %system %iowait  %steal   %idle
            4.44    0.00    2.10    0.06    0.00   93.40
    
    Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
    dm-0             29.41       469.53       331.00       148.89   30633024   21594968    9713760
    nvme0n1           0.05         0.55         1.01         0.00      35624      65600          0
    sda              23.09       469.75       331.00       148.89   30647163   21595269    9713760
    zram0            61.34        82.42       162.96         0.00    5377404   10631948          0

After a brief summary of CPU utilization, I/O statistics are given:

  • tps: I/O transactions per second (multiple logical requests can be merged into one actual request).
  • Data read, written, and discarded per second, followed by the cumulative totals. The sample above reports kilobytes (the kB_ columns); when iostat reports in blocks, a block is generally a 512-byte sector.
  • Information is broken out by device (and, if LVM is being used, also by dm, or device mapper, logical devices).
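
To compare devices programmatically, the device table can be parsed into one record per device. This is a minimal sketch that assumes the default layout shown in the sample report (a header line beginning with "Device", whitespace-separated columns, and "." as the decimal separator); parse_device_report and SAMPLE are illustrative names.

```python
SAMPLE = """\
Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
dm-0             29.41       469.53       331.00       148.89   30633024   21594968    9713760
nvme0n1           0.05         0.55         1.01         0.00      35624      65600          0
sda              23.09       469.75       331.00       148.89   30647163   21595269    9713760
zram0            61.34        82.42       162.96         0.00    5377404   10631948          0
"""


def parse_device_report(text):
    """Turn the device table of an iostat report into a list of dicts."""
    devices = []
    columns = None
    for line in text.splitlines():
        parts = line.split()
        if columns is None:
            # Skip everything (banner, avg-cpu block) before the table header.
            if parts and parts[0].rstrip(":") == "Device":
                columns = parts[1:]
            continue
        if not parts:
            break  # a blank line ends the device table
        row = {"device": parts[0]}
        for column, value in zip(columns, parts[1:]):
            row[column] = float(value)
        devices.append(row)
    return devices


devices = parse_device_report(SAMPLE)
# Pick the device with the highest transaction rate.
busiest = max(devices, key=lambda row: row["tps"])
print(busiest["device"], busiest["tps"])
```

With the sample data, the busiest device by tps is zram0. Sorting on kB_read/s or kB_wrtn/s instead ranks devices by throughput rather than by request count.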

iotop

  • Another very useful utility is iotop, which must be run as root. It displays a table of current I/O usage and updates periodically, like top.

  • In the PRIO column, be stands for best effort and rt stands for real time.

    Total DISK READ:         0.00 B/s | Total DISK WRITE:         0.00 B/s
    Current DISK READ:       0.00 B/s | Current DISK WRITE:       0.00 B/s
        TID  PRIO  USER        DISK READ   DISK WRITE>    COMMAND
          1 be/4   root        0.00 B/s    0.00 B/s       systemd --switched-root --system --deserialize=35 rhgb
          2 be/4   root        0.00 B/s    0.00 B/s       [kthreadd]
          3 be/0   root        0.00 B/s    0.00 B/s       [rcu_gp]
          4 be/0   root        0.00 B/s    0.00 B/s       [rcu_par_gp]
          5 be/0   root        0.00 B/s    0.00 B/s       [slub_flushwq]
          6 be/0   root        0.00 B/s    0.00 B/s       [netns]
          8 be/0   root        0.00 B/s    0.00 B/s       [kworker/0:0H-events_highpri]
         11 be/0   root        0.00 B/s    0.00 B/s       [mm_percpu_wq]
         13 be/4   root        0.00 B/s    0.00 B/s       [rcu_tasks_kthread]
         14 be/4   root        0.00 B/s    0.00 B/s       [rcu_task]
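
The PRIO values in the listing above combine an I/O scheduling class with a priority level, as in be/4. The sketch below splits such a field into its two parts. It assumes the class abbreviations be and rt described above, plus idle for the idle class (which carries no level); decode_prio is an illustrative helper, not part of iotop.

```python
# Assumed mapping from PRIO abbreviations to I/O scheduling classes.
IO_CLASSES = {
    "be": "best effort",
    "rt": "real time",
    "idle": "idle",
}


def decode_prio(field):
    """Split a PRIO field such as 'be/4' into (class name, level or None)."""
    abbreviation, _, level = field.strip().partition("/")
    if abbreviation not in IO_CLASSES:
        raise ValueError(f"unrecognized I/O class: {field!r}")
    # The idle class is shown without a numeric level.
    numeric_level = int(level) if level else None
    return (IO_CLASSES[abbreviation], numeric_level)


for prio in ("be/4", "be/0", "rt/0", "idle"):
    print(prio, "->", decode_prio(prio))
```

Within the best-effort and real-time classes, a lower level number means higher priority, so be/0 threads are served ahead of be/4 threads in the same class.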