Record event counts per thread

perf record can record event counts for each thread when run with the -s option. Consider the following multi-threaded program:

  $ cat sum.c
  #include <omp.h>

  #define N 100000000
  #define THREAD_NUM 8

  int values[N];

  int main(void)
  {
      int sum[THREAD_NUM];

      /* OpenMP distributes the THREAD_NUM iterations across threads. */
  #pragma omp parallel for
      for (int i = 0; i < THREAD_NUM; i++)
      {
          int local_sum = 0;  /* initialize before accumulating */

          for (int j = 0; j < N; j++)
          {
              local_sum += values[j] >> i;
          }
          sum[i] = local_sum;
      }
      return 0;
  }

Build the program, then run it under perf record -s to collect event counts for each thread:

  # gcc -fopenmp -g sum.c -o sum
  # perf record -s ./sum
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.404 MB perf.data (8757 samples) ]

Use perf report to analyze the recorded data (the -T option displays per-thread event counts):

  # perf report -T
  ......
  #  PID   TID  cycles:uppp
    9960  9963    751824252
    9960  9961    750625307
    9960  9965    749742594
    9960  9967    749228142
    9960  9968            0
    9960  9970      9857914
    9960  9974      9150516
    9960  9977      8928628
  ......
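
To compare threads at a glance, the per-thread counts can be turned into percentages of the total. The following is a minimal sketch, not part of perf: thread_share is a hypothetical helper name, and it assumes each data row consists of exactly three integers (PID, TID, count) as in the table above. Header lines and any other text are skipped.

```shell
# Hypothetical helper (not a perf feature): reads rows of the form
# "PID TID COUNT" on stdin and prints each TID with its share of the total.
thread_share() {
  awk '
    # Keep only rows made of exactly three unsigned integers;
    # this skips the "#" header and any surrounding report text.
    NF == 3 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ {
      n++
      tid[n] = $2
      count[n] = $3
      total += $3
    }
    END {
      for (i = 1; i <= n; i++) {
        # Guard against division by zero when every count is 0.
        share = (total > 0) ? 100 * count[i] / total : 0
        printf "%s %.1f%%\n", tid[i], share
      }
    }
  '
}
```

One possible way to use it is to pipe the text report into the function, for example: perf report -T --stdio | thread_share. If the report layout on your perf version differs from the three-column table assumed here, the row filter would need adjusting.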