101DaysofDevops
Debugging Performance Issue using SAR

  • Posted by lakhera2020
  • Date April 21, 2022
  • Comments 2 comments

What is SAR?

SAR is a utility used to collect and report system activity. It collects data relating to most core system functions and writes those metrics to binary data files.

Installing SAR

# yum -y install sysstat
  • To enable SAR on boot
# systemctl enable sysstat

How SAR works

  • When we install the sysstat package, it places a cron file at /etc/cron.d/sysstat
# cat /etc/cron.d/sysstat
# Run system activity accounting tool every 10 minutes
*/10 * * * * root /usr/lib64/sa/sa1 1 1
# 0 * * * * root /usr/lib64/sa/sa1 600 6 &
# Generate a daily summary of process accounting at 23:53
53 23 * * * root /usr/lib64/sa/sa2 -A

This file sets up two cron jobs:

  • The 1st job records statistics every 10 minutes.
  • The 2nd job writes the binary sa## file to a text sar## file once a day (typically right before midnight), where ## is the day of the month.
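As a small illustration of where the first job's output ends up, the sketch below builds the path of one day's binary file and shows, in comments, how it might be replayed with sar -f. The directory /var/log/sa and the saDD naming are assumptions based on RHEL-style defaults, and the helper name sa_file_for_day is made up for this example; other distributions and newer sysstat versions may use a different location or file name.

```shell
# Hypothetical helper: print the binary data file path for a day of the month.
# Assumes the RHEL-style default directory /var/log/sa and saDD file names.
sa_file_for_day() {
    day=${1#0}    # drop one leading zero so printf does not read "08" as octal
    printf '/var/log/sa/sa%02d\n' "$day"
}

# Example usage (needs recorded data for that day, so it is left commented out):
#   sar -q -f "$(sa_file_for_day 21)"
#   sar -q -f "$(sa_file_for_day 21)" -s 04:00:00 -e 05:00:00
```

The -s and -e options narrow the report to a start and end time, which helps when you already know roughly when the problem occurred.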

Additional configuration can be placed in /etc/sysconfig/sysstat:

# cat /etc/sysconfig/sysstat
# sysstat-10.1.5 configuration file.

# How long to keep log files (in days).
# If value is greater than 28, then log files are kept in
# multiple directories, one for each month.
HISTORY=28

# Compress (using gzip or bzip2) sa and sar files older than (in days):
COMPRESSAFTER=31

# Parameters for the system activity data collector (see sadc manual page)
# which are used for the generation of log files.
SADC_OPTIONS="-S DISK"

# Compression program to use.
ZIP="bzip2"

How SAR is useful

  • sar data is useful for pinpointing which system resource (networking, memory, I/O, CPU, etc.) is causing a performance issue.
  • sar data contains several sections that show how the system was performing at a given time. By default, data is collected at ten-minute intervals. Information on any of the fields used below (such as runq-sz, kbmemused, etc.) can be found by searching for the field name in man sar.

Debugging High Load Average

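  • Load: information about how many tasks are running or waiting to run on the system. With an interval and count, as in the command below, sar samples live (every 1 second, 4 times); the ten-minute samples shown here are the kind of output sar -q prints from the recorded daily data when run without them.
# sar -q 1 4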
00:00:01  runq-sz  plist-sz  ldavg-1  ldavg-5  ldavg-15
[...]
04:20:01        1      2751     0.08     0.34      0.42
04:30:01        0       955     2.18     1.37      1.01
04:40:03       10      1645     8.18     6.58      3.83
04:50:14       25      1704   170.37   113.04     55.24
Average:        2      2605     6.67     4.64      2.51
  • From the above output, we can see a massive load spike around 04:50: the 1-minute load average jumps from 8.18 to 170.37, and the run queue grows to 25. As a rule of thumb, a system's load average should stay at or below roughly 70% of the number of cores. If the load is consistently above that, there may be performance degradation, and if it rises above the number of cores, tasks are queuing for CPU and a significant slowdown is likely.
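The rule of thumb above can be written as a small check. This is only a sketch: the function name load_status and the labels it prints are invented for illustration, and the core count of 8 in the examples is an assumption, since the post does not say how many cores the sampled system had.

```shell
# Hypothetical helper: classify a load average against the core count.
# Usage: load_status LOAD CORES
load_status() {
    awk -v load="$1" -v cores="$2" 'BEGIN {
        if (load > cores)            print "overloaded"   # more runnable tasks than cores
        else if (load > 0.7 * cores) print "high"         # above the 70% rule of thumb
        else                         print "ok"
    }'
}

# Values taken from the ldavg-1 column above, with an assumed 8 cores:
load_status 0.08 8
load_status 170.37 8

# Live check on a Linux host (left commented out):
#   load_status "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```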

Debugging High Memory Utilization

  • Memory — information regarding memory and swap usage.
# sar -r 1 3
00:00:01  kbmemfree  kbmemused  %memused  kbbuffers  kbcached  kbswpfree  kbswpused  %swpused  kbswpcad
[...]
04:30:01   62716272  201531148     76.27      15952   1556572    2048236         12      0.00         4
04:40:03     191904  264055516     99.93       2692     28908          0    2048248    100.00      8496
04:50:14     184100  264063320     99.93       1388     10600          0    2048248    100.00         0
Average:    4415719  259831701     98.33    1357749  20307185    1906978     141270      6.90       297
  • Linux uses otherwise idle memory for buffers and cache, so %memused near 99% is not a problem by itself. Active swapping, however, usually means the system is under memory pressure. Here swap usage goes from 12 kB at 04:30 to fully used (100%) at 04:40, while kbcached drops sharply, so this system was experiencing memory issues during that window.
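To pick the swap-heavy samples out of a long report, something like the following sketch could be used. It reads sar text on stdin and locates the %swpused column by its header name, so it assumes a report that contains that column (as in the layout above; newer sysstat versions print it under sar -S instead) and 24-hour timestamps. The function name swap_pressure and the default threshold of 50 are made up for this example.

```shell
# Hypothetical helper: print "time %swpused" for samples at or above a threshold.
# Usage: sar -r | swap_pressure [THRESHOLD]
swap_pressure() {
    awk -v limit="${1:-50}" '
        # Header line: remember which column holds %swpused, then skip the line.
        { for (i = 1; i <= NF; i++) if ($i == "%swpused") { col = i; next } }
        # Data lines: ignore the Average row and lines too short to have the column.
        col && NF >= col && $1 != "Average:" && $col + 0 >= limit + 0 { print $1, $col }
    '
}

# Example usage (left commented out because it needs sysstat data):
#   sar -r | swap_pressure 80
```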

Debugging High I/O Utilization

  • IO — information pertaining to the number of disk accesses.
# sar -b 1 3
00:00:01     tps    rtps    wtps   bread/s   bwrtn/s
[...]
04:30:01   67.95   51.84   16.12   4303.82   4664.14
04:40:03  564.60  227.07  337.52  34338.84  87719.02
04:50:14   51.05   40.25   10.80   1326.32    245.06
Average:   31.12   11.00   20.12   1383.64   3346.65
  • The number of disk reads and writes will vary based on the underlying hardware; however, we can establish what is 'normal' for this system by examining the data over a period of time, and then look for spikes. We can see a large spike at 04:40, where tps rises from 67.95 to 564.60 and both reads and writes increase dramatically. By 04:50 the values are back down, indicating that the burst had ended.
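One rough way to look for such spikes is to compare each sample's tps with the Average row of the same report. The sketch below does that; the function name io_spikes and the default factor of 5 are arbitrary choices for illustration, not sysstat features, and it assumes 24-hour timestamps so that tps is the second field.

```shell
# Hypothetical helper: print "time tps" for samples whose tps exceeds
# FACTOR times the value on the Average row of the same sar -b report.
# Usage: sar -b | io_spikes [FACTOR]
io_spikes() {
    awk -v factor="${1:-5}" '
        $1 == "Average:"            { avg = $2 + 0; next }
        $2 ~ /^[0-9]+(\.[0-9]+)?$/  { ts[++n] = $1; tps[n] = $2 }   # numeric rows only
        END {
            for (i = 1; i <= n; i++)
                if (avg > 0 && tps[i] + 0 > factor * avg) print ts[i], tps[i]
        }
    '
}

# Example usage (left commented out because it needs sysstat data):
#   sar -b | io_spikes 5
```

A fixed multiple of the daily average is a crude baseline; comparing against the same hour on previous days would be a reasonable refinement.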

Debugging High CPU Utilization

  • CPU — information regarding where each of the system’s cycles are spent.
# sar -u 1 10
00:00:01  CPU  %user  %nice  %system  %iowait  %steal  %idle
[...]
04:40:03  all  10.45   0.00     1.67     0.89    0.00  86.99
04:50:14  all   0.19   0.00    62.06     1.98    0.00  35.78
  • Spending time in %user is expected behavior, as this is where all non-system tasks are accounted for. If cycles are mostly being spent in %system, much of the execution time is in kernel code; in the sample above, %system reaches 62.06 at 04:50. If %iowait is high, processes are waiting on disk accesses, which points to storage as a bottleneck on the system.
  • In addition, sar data may be viewed graphically by downloading and using the 'kSar' tool. This tool is not provided by Red Hat; however, it may be useful in pinpointing problematic times on the system. The project page is at https://sourceforge.net/projects/ksar/
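For the %user, %system and %iowait reading described above, a short filter can show which non-idle column dominates each sample. This is a sketch only: the function name cpu_hotspot is invented for the example, and it assumes 24-hour timestamps and the "all" rows of sar -u text output on stdin.

```shell
# Hypothetical helper: for each "all" sample, print the time, the busiest
# non-idle CPU column, and its value.
# Usage: sar -u | cpu_hotspot
cpu_hotspot() {
    awk '
        # Header line: remember the column names.
        $2 == "CPU" { for (i = 3; i <= NF; i++) name[i] = $i; cols = NF; next }
        # Data lines: pick the largest column other than %idle.
        $2 == "all" && cols && $1 != "Average:" {
            best = 3
            for (i = 4; i <= cols; i++)
                if (name[i] != "%idle" && $i + 0 > $best + 0) best = i
            print $1, name[best], $best
        }
    '
}

# Example usage (left commented out because it needs sysstat data):
#   sar -u | cpu_hotspot
```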

Tag:devops


    2 Comments

  1. Anonymous
    April 29, 2022

    what does 1 and 4 mean in sar -q 1 4 command ?

    • lakhera2020
      May 1, 2022

      1 is the interval (the number of seconds between each report) and 4 is the count (the number of reports generated).
