xeptore/issue7 #8

Draft
wants to merge 11 commits into
base: main
from
102 changes: 102 additions & 0 deletions Benchmarks.md
@@ -0,0 +1,102 @@
# Benchmark Results

## System Specification

### Hardware

```txt
Memory: RAM: total: 30.62 GiB
CPU: Topology: 16-Core (4-Die) model: AMD EPYC (with IBPB) bits: 64 type: MCP MCM arch: Zen rev: 2
L2 cache: 8192 KiB
flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 bogomips: 79849
Speed: 2495 MHz min/max: N/A Core speeds (MHz): 1: 2495 2: 2495 3: 2495 4: 2495 5: 2495 6: 2495 7: 2495
8: 2495 9: 2495 10: 2495 11: 2495 12: 2495 13: 2495 14: 2495 15: 2495 16: 2495
```

### Software

```txt
System: Host: ubuntu-32gb-nbg1-1 Kernel: 5.4.0-72-generic x86_64 bits: 64 compiler: gcc v: 9.3.0 Console: N/A
dm: N/A Distro: Ubuntu 20.04.2 LTS (Focal Fossa)
Compiler: Ubuntu clang version 11.0.0-2~ubuntu20.04.1
Target: x86_64-pc-linux-gnu
Thread model: posix
```

## Results

Execution times are averaged over **100 iterations** of processing a **43680px x 4160px** image.

### pthread

![pthread benchmark diagram](/benchmark_diagrams/pthread.png)

### OpenMP

![OpenMP benchmark diagram](/benchmark_diagrams/openmp.png)

### pthread vs. OpenMP

![pthread vs. OpenMP 1 worker benchmark diagram](/benchmark_diagrams/01w.png)

![pthread vs. OpenMP 2 workers benchmark diagram](/benchmark_diagrams/02w.png)

![pthread vs. OpenMP 3 workers benchmark diagram](/benchmark_diagrams/03w.png)

![pthread vs. OpenMP 4 workers benchmark diagram](/benchmark_diagrams/04w.png)

![pthread vs. OpenMP 5 workers benchmark diagram](/benchmark_diagrams/05w.png)

![pthread vs. OpenMP 6 workers benchmark diagram](/benchmark_diagrams/06w.png)

![pthread vs. OpenMP 7 workers benchmark diagram](/benchmark_diagrams/07w.png)

![pthread vs. OpenMP 8 workers benchmark diagram](/benchmark_diagrams/08w.png)

![pthread vs. OpenMP 9 workers benchmark diagram](/benchmark_diagrams/09w.png)

![pthread vs. OpenMP 10 workers benchmark diagram](/benchmark_diagrams/10w.png)

![pthread vs. OpenMP 11 workers benchmark diagram](/benchmark_diagrams/11w.png)

![pthread vs. OpenMP 12 workers benchmark diagram](/benchmark_diagrams/12w.png)

![pthread vs. OpenMP 13 workers benchmark diagram](/benchmark_diagrams/13w.png)

![pthread vs. OpenMP 14 workers benchmark diagram](/benchmark_diagrams/14w.png)

![pthread vs. OpenMP 15 workers benchmark diagram](/benchmark_diagrams/15w.png)

![pthread vs. OpenMP 16 workers benchmark diagram](/benchmark_diagrams/16w.png)

![pthread vs. OpenMP 17 workers benchmark diagram](/benchmark_diagrams/17w.png)

![pthread vs. OpenMP 18 workers benchmark diagram](/benchmark_diagrams/18w.png)

![pthread vs. OpenMP 19 workers benchmark diagram](/benchmark_diagrams/19w.png)

![pthread vs. OpenMP 20 workers benchmark diagram](/benchmark_diagrams/20w.png)

![pthread vs. OpenMP 21 workers benchmark diagram](/benchmark_diagrams/21w.png)

![pthread vs. OpenMP 22 workers benchmark diagram](/benchmark_diagrams/22w.png)

![pthread vs. OpenMP 23 workers benchmark diagram](/benchmark_diagrams/23w.png)

![pthread vs. OpenMP 24 workers benchmark diagram](/benchmark_diagrams/24w.png)

![pthread vs. OpenMP 25 workers benchmark diagram](/benchmark_diagrams/25w.png)

![pthread vs. OpenMP 26 workers benchmark diagram](/benchmark_diagrams/26w.png)

![pthread vs. OpenMP 27 workers benchmark diagram](/benchmark_diagrams/27w.png)

![pthread vs. OpenMP 28 workers benchmark diagram](/benchmark_diagrams/28w.png)

![pthread vs. OpenMP 29 workers benchmark diagram](/benchmark_diagrams/29w.png)

![pthread vs. OpenMP 30 workers benchmark diagram](/benchmark_diagrams/30w.png)

![pthread vs. OpenMP 31 workers benchmark diagram](/benchmark_diagrams/31w.png)

![pthread vs. OpenMP 32 workers benchmark diagram](/benchmark_diagrams/32w.png)
4 changes: 4 additions & 0 deletions CMakeLists.txt
@@ -4,6 +4,10 @@ set(CMAKE_C_COMPILER /usr/bin/clang)
set(CMAKE_CXX_COMPILER /usr/bin/clang++)
set(CMAKE_EXPORT_COMPILE_COMMANDS true)

set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -fopenmp")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -fopenmp")
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -fopenmp")

project(blurrifier VERSION 0.1.0)

add_executable(blurrifier main.c)
Binary file added benchmark_diagrams/01w.png
Binary file added benchmark_diagrams/02w.png
Binary file added benchmark_diagrams/03w.png
Binary file added benchmark_diagrams/04w.png
Binary file added benchmark_diagrams/05w.png
Binary file added benchmark_diagrams/06w.png
Binary file added benchmark_diagrams/07w.png
Binary file added benchmark_diagrams/08w.png
Binary file added benchmark_diagrams/09w.png
Binary file added benchmark_diagrams/10w.png
Binary file added benchmark_diagrams/11w.png
Binary file added benchmark_diagrams/12w.png
Binary file added benchmark_diagrams/13w.png
Binary file added benchmark_diagrams/14w.png
Binary file added benchmark_diagrams/15w.png
Binary file added benchmark_diagrams/16w.png
Binary file added benchmark_diagrams/17w.png
Binary file added benchmark_diagrams/18w.png
Binary file added benchmark_diagrams/19w.png
Binary file added benchmark_diagrams/20w.png
Binary file added benchmark_diagrams/21w.png
Binary file added benchmark_diagrams/22w.png
Binary file added benchmark_diagrams/23w.png
Binary file added benchmark_diagrams/24w.png
Binary file added benchmark_diagrams/25w.png
Binary file added benchmark_diagrams/26w.png
Binary file added benchmark_diagrams/27w.png
Binary file added benchmark_diagrams/28w.png
Binary file added benchmark_diagrams/29w.png
Binary file added benchmark_diagrams/30w.png
Binary file added benchmark_diagrams/31w.png
Binary file added benchmark_diagrams/32w.png
Binary file added benchmark_diagrams/openmp.png
Binary file added benchmark_diagrams/pthread.png
58 changes: 50 additions & 8 deletions diagram.py
@@ -1,16 +1,58 @@
import plotly.express as px
import plotly.io as pio
import pandas as pd

df = pd.read_csv("./results.csv")
pio.kaleido.scope.default_width = 700 * 1
pio.kaleido.scope.default_height = 500 * 1
pio.kaleido.scope.default_width = 900 * 1
pio.kaleido.scope.default_height = 1200 * 1
pio.kaleido.scope.default_scale = 5

df['time'] = df['time'] * 1e-9
pthread_dataframe = pd.read_csv("./results.csv")

fig = px.line(
df,
title="Kernel Radius Per Number of Workers Processing Times",
pthread_dataframe['time'] = pthread_dataframe['time'] * 1e-9

pthread_fig = px.line(
pthread_dataframe,
title="Processing Time For Different Kernel Radii (pthread)",
x="radius",
y="time",
color="workers",
labels={ "time": "Duration (seconds)", "radius": "Kernel Radius" }
color="Workers",
labels={"time": "Duration (seconds)", "radius": "Kernel Radius"},
range_x=[2, 20],
range_y=[0, 700]
)
fig.show()
pthread_fig.write_image(f"./benchmark_diagrams/pthread.png")

pthread_dataframe["Threading"] = "pthread"

openmp_dataframe = pd.read_csv("./openmp.csv")
openmp_dataframe['time'] = openmp_dataframe['time'] * 1e-9

openmp_fig = px.line(
openmp_dataframe,
title=f"Processing Time For Different Kernel Radii (OpenMP)",
x="radius",
y="time",
color="Workers",
labels={"time": "Duration (seconds)", "radius": "Kernel Radius"},
range_x=[2, 20],
range_y=[0, 700]
)
openmp_fig.write_image(f"./benchmark_diagrams/openmp.png")

openmp_dataframe["Threading"] = "OpenMP"

for i in range(32):
    fig = px.line(
        pd.concat([pthread_dataframe.where(pthread_dataframe["Workers"] == i + 1).loc[i*10:(i+1) * 10 - 1],
                   openmp_dataframe.where(openmp_dataframe["Workers"] == i + 1).loc[i*10:(i+1) * 10 - 1]]),
        title=f"Processing Time For Different Kernel Radii Using {i + 1} Worker{'s' if i > 0 else ''}",
        x="radius",
        y="time",
        color="Threading",
        labels={"time": "Duration (seconds)", "radius": "Kernel Radius"},
        range_x=[2, 20],
        range_y=[0, 700]
    )
    fig.write_image(f"./benchmark_diagrams/{i + 1:02}w.png")
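A note on the per-worker selection in the loop above: as far as I can tell, `where(...)` keeps every row and blanks the non-matching ones, so the `.loc[i*10:(i+1) * 10 - 1]` slice only lines up if each CSV holds exactly ten consecutive rows per worker count. A minimal sketch of a boolean-mask alternative follows. The `rows_for_worker` helper and the sample values are hypothetical, made up for illustration; only the column names (`Workers`, `radius`, `time`, `Threading`) come from the script.

```python
import pandas as pd


def rows_for_worker(frame: pd.DataFrame, workers: int) -> pd.DataFrame:
    """Return only the rows measured with the given worker count."""
    # A boolean mask selects by value, so it does not depend on row order
    # or on how many radii were measured per worker count.
    return frame[frame["Workers"] == workers]


# Made-up sample data shaped like the columns diagram.py reads.
pthread_sample = pd.DataFrame({
    "Workers": [1, 1, 2, 2],
    "radius": [2, 4, 2, 4],
    "time": [4.0, 9.0, 2.1, 4.7],
})
pthread_sample["Threading"] = "pthread"
openmp_sample = pthread_sample.assign(time=[4.1, 9.2, 2.0, 4.5], Threading="OpenMP")

# Combine both backends for a single worker count, ready for px.line(..., color="Threading").
combined = pd.concat(
    [rows_for_worker(pthread_sample, 2), rows_for_worker(openmp_sample, 2)],
    ignore_index=True,
)
```

In the loop this would replace both `where(...).loc[...]` expressions with `rows_for_worker(pthread_dataframe, i + 1)` and `rows_for_worker(openmp_dataframe, i + 1)`.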
31 changes: 12 additions & 19 deletions main.c
@@ -5,6 +5,7 @@
#include <memory.h>
#include <jpeglib.h>
#include <math.h>
#include <omp.h>
#include <pthread.h>
#include <sys/time.h>
#include "config.h"
@@ -114,9 +115,7 @@ void copy_kernel(double destination[KERNEL_HEIGHT][KERNEL_WIDTH], const double s
};
}

void *transform_rows(void *serialized_params) {
struct transform_row_params *params = (struct transform_row_params *)serialized_params;

void transform_rows(struct transform_row_params *params) {
for (size_t i = params->start_row; i < params->start_row + params->num_rows; i++) {
for (size_t j = 0; j < params->IMAGE_WIDTH; j++) {
struct pixel_components components_multiplication_sum = {
@@ -145,8 +144,6 @@ void *transform_rows(void *serialized_params) {
params->output_image[i][INPUT_IMAGE_COMPONENTS_NUMBER * j + 2] = round(components_multiplication_sum.blue / kernel_cells_sum);
}
}

return NULL;
}

int transform(
@@ -215,6 +212,12 @@ int transform(
const unsigned int remainder = decompressor.image_height % NUM_THREADS;

unsigned long total_assigned_rows = 0U;

struct timespec start_time, end;

timespec_get(&start_time, TIME_UTC);

#pragma omp parallel for
for (size_t i = 0; i < NUM_THREADS; i++) {
const unsigned long int worker_quotient = (i < remainder) ? (quotient + 1) : (quotient);
struct transform_row_params *params = (struct transform_row_params *)&buffer[2 * IMAGE_SIZE_IN_BYTES + i * sizeof(struct transform_row_params)];
@@ -225,25 +228,13 @@ int transform(
params->IMAGE_WIDTH = IMAGE_WIDTH;
params->num_rows = worker_quotient;
params->start_row = total_assigned_rows;
thread_params_refs[i] = params;
total_assigned_rows += worker_quotient;
}

struct timespec start, end;

timespec_get(&start, TIME_UTC);

for (size_t i = 0; i < NUM_THREADS; i++) {
(void)pthread_create(&thread_ids[i], NULL, transform_rows, thread_params_refs[i]);
}

for (size_t i = 0; i < NUM_THREADS; i++) {
(void)pthread_join(thread_ids[i], NULL);
transform_rows(params);
}

timespec_get(&end, TIME_UTC);

unsigned long int time_in_nano_seconds = (end.tv_sec - start.tv_sec) * 1e9 + (end.tv_nsec - start.tv_nsec);
unsigned long int time_in_nano_seconds = (end.tv_sec - start_time.tv_sec) * 1e9 + (end.tv_nsec - start_time.tv_nsec);
printf("total:%lu", time_in_nano_seconds);

while (compressor.next_scanline < compressor.image_height) {
@@ -266,6 +257,8 @@ int transform(
}

int main() {
omp_set_num_threads(NUM_THREADS);

return transform(
INPUT_IMAGE_FILENAME,
OUTPUT_IMAGE_FILENAME,
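One thing worth checking in the `#pragma omp parallel for` loop in `transform`: each iteration reads `total_assigned_rows` to set `params->start_row` and then increments it. That variable is declared outside the loop, so it appears to be shared between threads, and the start rows would then depend on the order in which iterations happen to run. A closed-form start row that depends only on the worker index avoids the shared counter. The sketch below shows the arithmetic in Python for brevity (it should port directly to the C loop); `partition_rows` is a hypothetical helper, not something in this PR.

```python
def partition_rows(image_height: int, num_workers: int) -> list:
    """Return one (start_row, num_rows) pair per worker, computed from the index alone."""
    quotient, remainder = divmod(image_height, num_workers)
    parts = []
    for i in range(num_workers):
        # Same rule as worker_quotient in main.c: the first `remainder`
        # workers take one extra row each.
        num_rows = quotient + 1 if i < remainder else quotient
        # Worker i starts after i full quotients plus the extra rows handed
        # to the workers before it, which is min(i, remainder).
        start_row = i * quotient + min(i, remainder)
        parts.append((start_row, num_rows))
    return parts


example = partition_rows(10, 3)
```

With 10 rows and 3 workers this gives start rows 0, 4 and 7 with 4, 3 and 3 rows respectively, so the ranges are contiguous and cover every row without any state shared across iterations.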