Monitoring tools for AWS Graviton

Listed below are some monitoring and profiling tools supported on AWS Graviton, along with some differences from the tools available on x86 processor architectures.

Many of the most commonly used tools, such as top, htop, iostat, lstopo, hwloc, dmidecode, lmbench and Linux perf, are supported on AWS Graviton processors. Some tools, such as Intel MLC, Intel VTune Profiler and PCM, are supported only on Intel processors, and others, such as turbostat, are supported only on x86 processors.

Details

Info utilities

lscpu utility

The lscpu utility shows details of the processor features such as architecture, number of cores, active CPUs, caches, NUMA, instruction support and optionally CPU frequency.

$ lscpu

| Attribute | Value |
| --- | --- |
| Architecture | aarch64 |
| CPU(s) | 64 |
| On-line CPU(s) list | 0-63 |
| L1d | 4 MiB (64 instances) |
| ... | ... |
| NUMA node(s) | 1 |
| ... | ... |

dmidecode to get CPU frequency info

The dmidecode utility is a tool for listing details of the system's hardware components.

$ sudo dmidecode | less

| Attribute | Value |
| --- | --- |
| Socket Designation | CPU00 |
| Type | Central Processor |
| Family | ARMv8 |
| Manufacturer | AWS |
| ... | ... |
| Max Speed | 2600 MHz |
| Current Speed | 2600 MHz |
| ... | ... |
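To pull out only the frequency fields rather than paging through the full listing, a small filter can help. The sketch below assumes the raw dmidecode text, where each field is printed as `Name: value` (for example `Max Speed: 2600 MHz`); the function name `cpu_speeds` is made up for this example.

```shell
# Print the "Max Speed" and "Current Speed" lines from dmidecode-style
# text on stdin, with leading indentation removed.
# cpu_speeds is a hypothetical helper name, not part of dmidecode.
cpu_speeds() {
    grep -E '^[[:space:]]*(Max|Current) Speed:' | sed -E 's/^[[:space:]]+//'
}

# Usage (needs root; -t processor limits output to processor records):
#   sudo dmidecode -t processor | cpu_speeds
```

Reading from stdin keeps the filter usable on saved dmidecode output as well as on a live system.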

hwloc and lstopo utilities

Shown below is sample output for these utilities on a c6g.2xlarge instance.

$ hwloc-info

depth 0:           1 Machine (type #0)
    depth 1:          1 Package (type #1)
        depth 2:         1 L3Cache (type #6)
            depth 3:        8 L2Cache (type #5)
                depth 4:       8 L1dCache (type #4)
                    depth 5:      8 L1iCache (type #9)
        ...

$ lstopo

Machine (15GB total)
    Package L#0
        NUMANode L#0 (P#0 15GB)
            L3 L#0 (32MB)
            L2 L#0 (1024KB) + L1d L#0 (64KB) + L1i L#0 (64KB) + Core L#0 + PU L#0 (P#0)
            L2 L#1 (1024KB) + L1d L#1 (64KB) + L1i L#1 (64KB) + Core L#1 + PU L#1 (P#1)
            L2 L#2 (1024KB) + L1d L#2 (64KB) + L1i L#2 (64KB) + Core L#2 + PU L#2 (P#2)
    ...

Perf monitoring utilities

On AWS Graviton processors, the Linux perf tool comes in handy for collecting an application execution profile and hardware performance counters. Much of the functionality provided by tools such as Intel VTune Profiler and PCM is available in Linux perf. Below are some details of its usage and syntax.

Collect basic CPU statistics for the specified command or system wide

$ perf stat command

Shown below are Linux perf stats collected system wide on a c6g.2xlarge instance.

$ perf stat

Performance counter stats for 'system wide':

      87692.26 msec cpu-clock                 #    8.000 CPUs utilized
           441      context-switches          #    5.029 /sec
            13      cpu-migrations            #    0.148 /sec
             2      page-faults               #    0.023 /sec
      25115021      cycles                    #    0.000 GHz
      28853592      instructions              #    1.15  insn per cycle
         68126      branch-misses

  10.961122204 seconds time elapsed

Collect basic or specific CPU hardware counters for a specific command or system wide, for 10 seconds

Hardware events can be counted for a single application, for a specific CPU (-C), for a PID (-p) or system wide (-a). The example below counts system wide for 10 seconds:

$ perf stat -e cycles,instructions,cache-references,cache-misses,bus-cycles -a sleep 10

Performance counter stats for 'system wide':

     161469308      cycles                                                        (80.01%)
     120685678      instructions              #    0.75  insn per cycle           (79.97%)
      42132948      cache-references                                              (80.01%)
       2001520      cache-misses              #    4.750 % of all cache refs      (80.02%)
     160016796      bus-cycles                                                    (60.00%)

  10.002896494 seconds time elapsed
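The "insn per cycle" figure above is just instructions divided by cycles (120685678 / 161469308 ≈ 0.75). As a minimal sketch, the same ratio can be computed from the CSV output that `perf stat -x,` produces. It assumes the counter value is in field 1 and the event name in field 3, and that the event names are exactly `cycles` and `instructions`; on some systems perf prefixes event names with a PMU name, in which case the match would need adjusting. The function name `ipc_from_csv` is made up for this example.

```shell
# Read `perf stat -x,` CSV on stdin and print instructions per cycle.
# Assumption: field 1 is the counter value and field 3 is the event name,
# spelled exactly "cycles" and "instructions".
ipc_from_csv() {
    awk -F, '
        $3 == "cycles"       { cycles = $1 }
        $3 == "instructions" { insns  = $1 }
        END {
            if (cycles > 0) {
                printf "%.2f\n", insns / cycles
            } else {
                print "no cycles counted" > "/dev/stderr"
                exit 1
            }
        }'
}

# Usage (perf stat writes its counters to stderr, hence 2>&1):
#   perf stat -x, -e cycles,instructions -a sleep 10 2>&1 | ipc_from_csv
```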

View the profile using the perf report command

$ perf report

| Overhead | Command | Shared Object | Symbol |
| --- | --- | --- | --- |
| 72.44% | functionA | functionA | classA::functionA |
| 7.66% | functionB | libB.so | classB::functionB |
| ... | | | |
| 0.81% | functionA | libc-2.31.so | memcmp |
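`perf report` reads the `perf.data` file that `perf record` writes, so a profile has to be recorded first. The sketch below shows one way to narrow a text report down to the hottest entries. It assumes the `perf report --stdio` layout, where data rows start with a percentage and header lines start with `#`; the function name `hot_symbols` and the workload `./my_app` are placeholders.

```shell
# Keep only `perf report --stdio` rows whose overhead is at least MIN
# percent (default 5). hot_symbols is a hypothetical helper name.
hot_symbols() {
    min="${1:-5}"
    awk -v min="$min" '
        /^#/ { next }                      # skip header/comment lines
        $1 ~ /^[0-9.]+%$/ {
            pct = $1
            sub(/%$/, "", pct)             # strip the trailing percent sign
            if (pct + 0 >= min + 0) print
        }'
}

# Usage (./my_app is a placeholder workload):
#   perf record ./my_app
#   perf report --stdio | hot_symbols 5
```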

More details on how to use the Linux perf utility on AWS Graviton processors are available here.

Summary: Utilities on AWS Graviton vs. Intel x86 architectures

| Processor | x86 | Graviton2, Graviton3 |
| --- | --- | --- |
| CPU frequency listing | lscpu, /proc/cpuinfo, dmidecode | dmidecode |
| turbostat support | Yes | No |
| hwloc support | Yes | Yes |
| lstopo support | Yes | Yes |
| i7z support | Yes | No |
| lmbench | Yes | Yes |
| Intel MLC | Yes | No |
| Performance monitoring tools | VTune Profiler and PCM, Linux perf | Linux perf, Arm Forge |

Utilities such as lmbench are available here and can be built for AWS Graviton processors to obtain latency and bandwidth stats.

Notes:

1. The Arm Linux kernel community has decided not to put CPU frequency in /proc/cpuinfo, so it is not available there to tools such as lscpu or to a direct read of the file.

2. AWS Graviton2/3 processors do not support Turbo, so utilities such as turbostat are neither supported nor relevant on the Arm architecture, AWS Graviton included. Tools such as i7z, which report CPU frequency, turbo, sockets and other information, are supported only on Intel processors. Intel MLC is a memory latency checker utility that is likewise supported only on Intel processors.
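The first note can be checked on a running instance. The following is a minimal sketch that assumes the x86 convention of a `cpu MHz` field in /proc/cpuinfo; the function name `has_cpu_mhz` is made up for this example.

```shell
# Succeed if the given cpuinfo-style file (default /proc/cpuinfo) has a
# "cpu MHz" field. has_cpu_mhz is a hypothetical helper name.
has_cpu_mhz() {
    grep -q '^cpu MHz' "${1:-/proc/cpuinfo}"
}

if has_cpu_mhz; then
    # Typical on x86: print the first reported frequency.
    grep -m1 '^cpu MHz' /proc/cpuinfo
else
    # Expected on Graviton: fall back to dmidecode for frequency info.
    echo "no cpu MHz field in /proc/cpuinfo; try: sudo dmidecode -t processor"
fi
```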