Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use direct inc/dec for ++/-- op #3179

Closed
wants to merge 2 commits into from

Conversation

jordalgo
Copy link
Contributor

@jordalgo jordalgo commented May 17, 2024

This will save a call to bpf_map_update_elem

Issue: #3175

Also make the add operation in CreateMapElemAdd atomic.

Checklist
  • Language changes are updated in man/adoc/bpftrace.adoc
  • User-visible and non-trivial changes updated in CHANGELOG.md
  • The new behaviour is covered by tests

This will save a call to bpf_map_update_elem

Issue: bpftrace#3175
@jordalgo jordalgo marked this pull request as ready for review May 17, 2024 03:37
@netoptimizer
Copy link

Compiled bpftrace from this branch.

Running this bpftrace oneliner:
sudo src/bpftrace -e 'kprobe:cgroup_rstat_flush_locked {@flush_calls++}'

The resulting byte-code now looks like this:

int64 kprobe_cgroup_rstat_flush_locked_1(int8 * ctx):
   0: (b7) r1 = 0
   1: (7b) *(u64 *)(r10 -16) = r1
   2: (bf) r2 = r10
   3: (07) r2 += -16
   4: (18) r1 = map[id:1391]
   6: (85) call __htab_map_lookup_elem#267824
   7: (15) if r0 == 0x0 goto pc+1
   8: (07) r0 += 56
   9: (15) if r0 == 0x0 goto pc+4
  10: (79) r1 = *(u64 *)(r0 +0)
  11: (07) r1 += 1
  12: (7b) *(u64 *)(r0 +0) = r1
  13: (05) goto pc+10
  14: (b7) r1 = 1
  15: (7b) *(u64 *)(r10 -8) = r1
  16: (bf) r2 = r10
  17: (07) r2 += -16
  18: (bf) r3 = r10
  19: (07) r3 += -8
  20: (18) r1 = map[id:1391]
  22: (b7) r4 = 1
  23: (85) call htab_map_update_elem#281184
  24: (b7) r0 = 0
  25: (95) exit

This looks like it fixes the bug in #3175.
As line 13 jumps over the htab_map_update_elem call in line 23.

@netoptimizer
Copy link

(Maybe for another PR)

Looking at above BPF byte-code.

  • Why use a hash-map for storing this value?

This could easily be a simple array-map.
The hash-map comes with unnecessary overhead.

netoptimizer added a commit to xdp-project/xdp-project that referenced this pull request May 17, 2024
…omic

Fix comment about increment operator being atomic.

It is both slow and contains data races as described here:
 - bpftrace/bpftrace#3175

Maybe this will soon get fixed via:
 - bpftrace/bpftrace#3179
 - bpftrace#3179

Signed-off-by: Jesper Dangaard Brouer <[email protected]>
@jordalgo
Copy link
Contributor Author

This could easily be a simple array-map.
The hash-map comes with unnecessary overhead.

Yeah true.

Additionally, I think we might be causing data races in this implementation by updating the pointer directly instead of utilizing __sync_fetch_and_add for these operations to keep them atomic.

Missed this in the initial implementation but this add
instruction needs to be atomic because the map value itself
is not protected by a spin lock and therefore requires
synchronization.
expr_ = b_.CreateLoad(b_.GetType(map.type), newval);
b_.CreateLifetimeEnd(newval);
AllocaInst *pre_post_val = b_.CreateAllocaBPF(map.type,
map.ident + "pre_post_val");

Check notice

Code scanning / CodeQL

Unused local variable Note

Variable scoped_key_deleter is not used.
@netoptimizer
Copy link

Additionally, I think we might be causing data races in this implementation by updating the pointer directly instead of utilizing __sync_fetch_and_add for these operations to keep them atomic.

True, I think it would be preferred that we use atomic inc/dec operations, as it is correct that data races still exists without them (still after removing the unnecessary map_update_elem).

As an end-user of bpftrace, I (falsely) assumed/expected these were atomic inc/dec operators (++/--).

@danobi
Copy link
Member

danobi commented May 18, 2024

Hashmap is indeed suboptimal. We can definitely do some additional specialization here:

map_type = libbpf::BPF_MAP_TYPE_HASH;

As for ++ vs count(), we have in the docs some details about that -- we can make the difference more clear though. https://github.com/bpftrace/bpftrace/blob/master/man/adoc/bpftrace.adoc#count

I'm not sure using atomic ops for ++/-- is best, though. I think you can still lose concurrent updates, right? And using atomic op in non-contended case is additional overhead I believe (HW locks cache line). To play devil's advocate, in C you wouldn't expect ++/-- to do anything special, right?

In your comment in xdp-project/xdp-project@010d6b1, I see your motivation is to be able to printf() a value, right? As a workaround, you could consider doing:

 $ sudo ./build/src/bpftrace -e 'BEGIN { @ = count(); print(("val", @)) }'
Attaching 1 probe...
(val, 1)
^C

@: 1

The reason you can't print a map with printf() yet is b/c printf() is synchronous whereas for print() the values we pull out are asynchronous. I believe it should be possible to synchronously print out count() cuz the kernel supports loops over maps elements now. We should support this.

@danobi
Copy link
Member

danobi commented May 18, 2024

I've filed #3187 and #3186

@jordalgo
Copy link
Contributor Author

@danobi We already have this issue filed for our per-cpu functions: #3126

I'm not sure using atomic ops for ++/-- is best, though. And using atomic op in non-contended case is additional overhead I believe (HW locks cache line).

Yes, you're right. I didn't realize all the places we were calling CreateMapElemAdd were using per-cpu maps, which make atomic operations unnecessary. I'll probably close this PR for now and fix up inc/dec after working on #3126

@jordalgo jordalgo closed this May 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants