Seg fault using latest bladerf #24

Open
alphafox02 opened this issue Jun 24, 2023 · 14 comments
Labels: bug (Something isn't working)

@alphafox02

I ran a quick comparison between commit 41ef634 and the latest commit, c123d83, from the bladeRF GitHub repository. I installed it using the /usr prefix along with the lib/x86_64-linux-gnu CMake options.

With commit c123d83 and the all-channels option on Ubuntu 22.04 with the bladeRF xA9, there is an immediate segmentation fault after the program loads and attempts to use the bladeRF. With the previous commit, 41ef634, it runs fine. The gdb output below was captured with the newer libbladeRF in place; I thought it might help. I'm dropping back to the older libbladeRF for now.

Starting program: /usr/src/ice9-bluetooth-sniffer/build/ice9-bluetooth -l -i bladerf -a -w /home/dragon/all_channels.pcap
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff55ff640 (LWP 17143)]
[WARNING @ host/libraries/libbladeRF/src/board/bladerf2/bladerf2.c:1185] bandwidth assignements with oversample feature enabled yields unkown results
[New Thread 0x7ffff3bf9640 (LWP 17144)]
[New Thread 0x7ffff33f8640 (LWP 17145)]
[New Thread 0x7ffff2bf7640 (LWP 17146)]
[New Thread 0x7ffff23f6640 (LWP 17147)]
[New Thread 0x7ffff1bf5640 (LWP 17148)]
[New Thread 0x7ffff13f4640 (LWP 17149)]
[New Thread 0x7ffff0bf3640 (LWP 17150)]
[New Thread 0x7fffdbfff640 (LWP 17151)]
[New Thread 0x7fffdb7fe640 (LWP 17152)]
[New Thread 0x7fffdaffd640 (LWP 17153)]
[New Thread 0x7fffda7fc640 (LWP 17154)]
[New Thread 0x7fffd9ffb640 (LWP 17155)]
[New Thread 0x7fffd97fa640 (LWP 17156)]
[New Thread 0x7fffd8ff9640 (LWP 17157)]
[New Thread 0x7fffbbfff640 (LWP 17158)]
[New Thread 0x7fffbb7fe640 (LWP 17159)]
[New Thread 0x7fffbaffd640 (LWP 17160)]
[New Thread 0x7fffba7fc640 (LWP 17161)]
[New Thread 0x7fffb9ffb640 (LWP 17162)]
[New Thread 0x7fffb97fa640 (LWP 17163)]
[New Thread 0x7fffb8ff9640 (LWP 17164)]
[New Thread 0x7fff9bfff640 (LWP 17165)]
[New Thread 0x7fff9b7fe640 (LWP 17166)]
[New Thread 0x7fff9affd640 (LWP 17167)]
[New Thread 0x7fff9a7fc640 (LWP 17168)]
[New Thread 0x7fff99ffb640 (LWP 17169)]
[New Thread 0x7fff997fa640 (LWP 17170)]
[New Thread 0x7fff98ff9640 (LWP 17171)]
[New Thread 0x7fff7bfff640 (LWP 17172)]
[New Thread 0x7fff7b7fe640 (LWP 17173)]
[New Thread 0x7fff7affd640 (LWP 17174)]
[New Thread 0x7fff7a7fc640 (LWP 17175)]
[New Thread 0x7fff79ffb640 (LWP 17176)]
[New Thread 0x7fff797fa640 (LWP 17177)]
[New Thread 0x7fff78ff9640 (LWP 17178)]
[New Thread 0x7fff5bfff640 (LWP 17179)]
[New Thread 0x7fff5b7fe640 (LWP 17180)]
[New Thread 0x7fff5affd640 (LWP 17181)]
[New Thread 0x7fff5a7fc640 (LWP 17182)]
[New Thread 0x7fff59ffb640 (LWP 17183)]
[New Thread 0x7fff597fa640 (LWP 17184)]
[New Thread 0x7fff58ff9640 (LWP 17185)]
[New Thread 0x7fff3bfff640 (LWP 17186)]
[New Thread 0x7fff3b7fe640 (LWP 17187)]
[WARNING @ host/libraries/libbladeRF/src/board/bladerf2/common.c:356] The total sample throughput for the 1 active channel, 96 Msps, is greater than the recommended maximum sample throughput, 80 Msps. You may experience dropped samples unless the sample rate is reduced, or some channels are deactivated.

Thread 46 "ice9-bluetooth" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff3b7fe640 (LWP 17187)]
bladerf_rx_cb (bladerf=<optimized out>, stream=<optimized out>, meta=<optimized out>, samples=0x7ffff6944010, num_samples=393216, user_data=<optimized out>) at /usr/src/ice9-bluetooth-sniffer/bladerf.c:105                                                                                      
105             s->samples[i] = d[i];
(gdb) trace 
Tracepoint 1 at 0x55555555aa60: file /usr/src/ice9-bluetooth-sniffer/bladerf.c, line 105.                                
(gdb) backtrace                                                                                                          
#0  bladerf_rx_cb (bladerf=<optimized out>, stream=<optimized out>, meta=<optimized out>, samples=0x7ffff6944010,        
    num_samples=393216, user_data=<optimized out>) at /usr/src/ice9-bluetooth-sniffer/bladerf.c:105                      
#1  0x00007ffff7e4c239 in lusb_stream_cb () from /lib/x86_64-linux-gnu/libbladeRF.so.2                                   
#2  0x00007ffff6d0c5f5 in ?? () from /lib/x86_64-linux-gnu/libusb-1.0.so.0                                               
#3  0x00007ffff6d0d104 in ?? () from /lib/x86_64-linux-gnu/libusb-1.0.so.0                                               
#4  0x00007ffff6d0d661 in ?? () from /lib/x86_64-linux-gnu/libusb-1.0.so.0                                               
#5  0x00007ffff6d0e4ec in ?? () from /lib/x86_64-linux-gnu/libusb-1.0.so.0                                               
#6  0x00007ffff6d0fcd8 in libusb_handle_events_timeout_completed () from /lib/x86_64-linux-gnu/libusb-1.0.so.0           
#7  0x00007ffff7e4c8db in lusb_stream () from /lib/x86_64-linux-gnu/libbladeRF.so.2                                      
#8  0x00007ffff7e21242 in async_run_stream () from /lib/x86_64-linux-gnu/libbladeRF.so.2                                 
#9  0x00007ffff7e36a73 in bladerf2_stream () from /lib/x86_64-linux-gnu/libbladeRF.so.2                                  
#10 0x0000555555559b14 in bladerf_stream_thread (arg=0x55555560b260) at /usr/src/ice9-bluetooth-sniffer/bladerf.c:147    
#11 0x00007ffff6694b43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442                              
#12 0x00007ffff6726a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81                                     
(gdb) 
@AdamLaurie

I'm seeing a similar problem. Setup:

  Board:                    Nuand bladeRF 2.0 (bladerf2)
  Serial #:                 [redacted]
  VCTCXO DAC calibration:   0x201b
  FPGA size:                49 KLE
  FPGA loaded:              yes
  Flash size:               32 Mbit
  USB bus:                  4
  USB address:              7
  USB speed:                SuperSpeed
  Backend:                  libusb
  Instance:                 0

bladeRF> version

  bladeRF-cli version:        1.9.0-git-86d68eed-dirty
  libbladeRF version:         2.5.0-git-86d68eed-dirty

  Firmware version:           2.4.0-git-a3d5c55f
  FPGA version:               0.15.0 (configured by USB host)

Anything over 28 channels gives:

  [WARNING @ host/libraries/libbladeRF/src/board/bladerf2/bladerf2.c:1185] bandwidth assignements with oversample feature enabled yields unkown results
Segmentation fault (core dumped)

@mikeryan
Owner

mikeryan commented Oct 4, 2023

This is caused by a workaround in my code to fix an issue with the 2.5.0 release of libbladeRF. I should have bumped the patch version of that library so that the code in bladerf.c only implements the workaround on that version. I'm working with the bladeRF team to resolve this at the root.

In the meantime, if you're experiencing this issue you can disable the workaround by setting num_samples_workaround = 0; in the line from bladerf.c below:

num_samples_workaround = 1;
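
The version gating mentioned above could be sketched roughly as follows. This is illustrative Python, not the project's C code: `parse_version` and `needs_workaround` are hypothetical names, and the idea that only the numeric version 2.5.0 is affected is an assumption taken from the comment above. One caveat visible in this thread: a git build that segfaults reports itself as `2.5.0-git-86d68eed-dirty`, so the numeric part alone may not separate the release from later builds.

```python
import re

# Assumption from the comment above: only the 2.5.0 release needs the workaround.
AFFECTED_VERSION = (2, 5, 0)

def parse_version(text):
    """Extract (major, minor, patch) from a string such as '2.5.0-git-86d68eed-dirty'."""
    m = re.match(r"\s*(\d+)\.(\d+)\.(\d+)", text)
    if m is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(g) for g in m.groups())

def needs_workaround(text):
    """True when the numeric part of the version matches the affected release."""
    return parse_version(text) == AFFECTED_VERSION
```

Because the suffix after the numeric part is ignored here, a real check would probably need an extra signal (for example a bumped patch number upstream) before it could be relied on.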

@mikeryan mikeryan self-assigned this Oct 4, 2023
@mikeryan mikeryan added the bug Something isn't working label Oct 4, 2023
@XenoKovah

I may be having the same problem. On Ubuntu 22.04.02 I kept getting the error:

test@host02:~/ice9-bluetooth-sniffer/build$ sudo ./ice9-bluetooth -l -i bladerf0 -w sniff_2402-2432.pcap -s -c 2416 -C 32
[INFO @ host/libraries/libbladeRF/src/helpers/version.c:103] FPGA version (v0.15.3) is newer than entries in libbladeRF's compatibility table. Please update libbladeRF if problems arise.
Segmentation fault

I also tried reverting to the FPGA version v0.15.0 that I had on a machine from testing in the past where I know it worked. But that didn't help.

So I did a "sudo apt-get upgrade -y". Then it worked for exactly one run...

test@host02:~/ice9-bluetooth-sniffer/build$ cat /etc/os-release 
PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
<snip>
test@host02:~/ice9-bluetooth-sniffer/build$ sudo ./ice9-bluetooth -l -i bladerf0 -w sniff_2402-2432.pcap -s -c 2416 -C 32
[INFO @ host/libraries/libbladeRF/src/helpers/version.c:103] FPGA version (v0.15.0) is newer than entries in libbladeRF's compatibility table. Please update libbladeRF if problems arise.
ch   0.4  samp/sec (  0% realtime); agc 114.7 Msamp/sec (191% realtime)
Channelizer too slow, use fewer channels
ch  32.0 Msamp/sec (100% realtime); agc 113.6 Msamp/sec (189% realtime)
ch  32.9 Msamp/sec (103% realtime); agc 114.3 Msamp/sec (191% realtime)
ch  36.0 Msamp/sec (112% realtime); agc 113.3 Msamp/sec (189% realtime)
<ctrl-c>
...

After that I tried increasing the channels which gave an error, and when I went back down in terms of number of channels it continued to error out:

test@host02:~/ice9-bluetooth-sniffer/build$ sudo ./ice9-bluetooth -l -i bladerf0 -w sniff_2402-2432.pcap -s -c 2416 -C 80
[INFO @ host/libraries/libbladeRF/src/helpers/version.c:103] FPGA version (v0.15.0) is newer than entries in libbladeRF's compatibility table. Please update libbladeRF if problems arise.
[ERROR @ host/libraries/libbladeRF/src/board/bladerf2/bladerf2.c:996] bladerf2_set_rational_sample_rate: dev->board->set_sample_rate(dev, ch, integer_rate, &actual_integer_rate) failed: Provided parameter was out of the allowable range
ice9-bluetooth: Unable to set bladeRF sample rate: Provided parameter was out of the allowable range
test@host02:~/ice9-bluetooth-sniffer/build$ sudo ./ice9-bluetooth -l -i bladerf0 -w sniff_2402-2432.pcap -s -c 2416 -C 48
[INFO @ host/libraries/libbladeRF/src/helpers/version.c:103] FPGA version (v0.15.0) is newer than entries in libbladeRF's compatibility table. Please update libbladeRF if problems arise.
Segmentation fault
test@host02:~/ice9-bluetooth-sniffer/build$ sudo ./ice9-bluetooth -l -i bladerf0 -w sniff_2402-2432.pcap -s -c 2416 -C 32
[INFO @ host/libraries/libbladeRF/src/helpers/version.c:103] FPGA version (v0.15.0) is newer than entries in libbladeRF's compatibility table. Please update libbladeRF if problems arise.
Segmentation fault

So I manually installed the 2023.2 release of libbladeRF from source. And while that got rid of the warning, I was still getting the segfaults.

test@host02:~/ice9-bluetooth-sniffer/build$ sudo ./ice9-bluetooth -l -i bladerf0 -w sniff_2402-2432.pcap -s -c 2416 -C 32
Segmentation fault
test@host02:~/ice9-bluetooth-sniffer/build$ sudo ./ice9-bluetooth -l -i bladerf0 -w sniff_2402-2432.pcap -s -c 2416 -C 32
Segmentation fault

This ticket says to set num_samples_workaround = 0; as a workaround. I did that, and now I get no segfault, but the channelizer is instead capped at 50% realtime for some reason, even if I drop below 28 channels. I believe this is an error, since I'm testing on a fairly beefy desktop with 24 physical / 48 hyperthreaded "CPUs".

test@host02:~/ice9-bluetooth-sniffer/build$ sudo ./ice9-bluetooth -l -i bladerf0 -w sniff_2402-2432.pcap -s -c 2416 -C 32
[WARNING @ host/libraries/libbladeRF/src/board/bladerf2/bladerf2.c:916] Oversample feature gain limit reached. RF Gain clamped to 11.
ch   0.4  samp/sec (  0% realtime); agc 124.2 Msamp/sec (207% realtime)
Channelizer too slow, use fewer channels
ch  16.0 Msamp/sec ( 50% realtime); agc 121.6 Msamp/sec (203% realtime)
Channelizer too slow, use fewer channels
ch  16.0 Msamp/sec ( 50% realtime); agc 122.3 Msamp/sec (204% realtime)

So if I can help debug this new 50% channelizer bug related to the segfault bug, LMK.


(As an aside, and a reminder for future-me: when I went to rebuild with the bladerf.c change after installing the newer libbladeRF from source, I got this error:

[  5%] Linking C executable ice9-bluetooth
/usr/bin/ld: /tmp/cc7OnnSy.ltrans0.ltrans.o: in function `main':
/home/test/ice9-bluetooth-sniffer/bladerf.c:67: undefined reference to `bladerf_enable_feature'
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/ice9-bluetooth.dir/build.make:330: ice9-bluetooth] Error 1
make[1]: *** [CMakeFiles/Makefile2:84: CMakeFiles/ice9-bluetooth.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

I thrashed around a bit, but I believe the success path was first removing the apt-installed copy with sudo apt-get remove libbladerf-dev, after which I got this error:

[ 94%] Building C object CMakeFiles/ice9-bluetooth.dir/fftw/fft.c.o
make[2]: *** No rule to make target '/usr/lib/x86_64-linux-gnu/libbladeRF.so', needed by 'ice9-bluetooth'.  Stop.
make[1]: *** [CMakeFiles/Makefile2:84: CMakeFiles/ice9-bluetooth.dir/all] Error 2
make: *** [Makefile:136: all] Error 2

which I resolved with sudo ln -s /usr/local/lib/libbladeRF.so /usr/lib/x86_64-linux-gnu/libbladeRF.so)

@mikeryan
Owner

mikeryan commented May 1, 2024

@XenoKovah here's what I did to get everything working on a clean install of Ubuntu 22.04. First make sure to remove any bladeRF packages you installed using apt or manually (check for /usr/local/lib/libbladeRF.so*). Then:

  • download and load (or flash) the latest FPGA image
  • install bladeRF host from the master branch
  • run sudo ldconfig
  • set num_samples_workaround to 0 here
  • compile and build ice9-bluetooth

If the bladeRF starts giving you lip again, I always suggest replugging (and reloading FPGA) to reset state which should clear up the type of errors you were seeing.

@XenoKovah

XenoKovah commented May 1, 2024

The first thing I did was unplug, replug, and re-run sudo bladeRF-cli -l ~/hostedxA9-latest.rbf. However I still saw the 50% cap after that.

The next thing I did was remove the 2023.02 tag branch's code with sudo make uninstall from the bladeRF/host/build directory, which I confirmed removes /usr/local/lib/libbladeRF.so*. I then checked out the master branch and pulled to confirm I was up to date. After make clean and sudo make install and sudo ldconfig I then found that sudo bladeRF-cli -l ~/hostedxA9-latest.rbf results in a segfault.

sudo bladeRF-cli -l ~/hostedxA9-latest.rbf
Loading FPGA...
Segmentation fault

Checking with GDB says it's crashing at some function pointer usage in bladerf2_load_fpga() as shown below.

   0x00007ffff7f4abc4 <+148>:	js     0x7ffff7f4ac60 <bladerf2_load_fpga+304>
   0x00007ffff7f4abca <+154>:	xor    %r12d,%r12d
   0x00007ffff7f4abcd <+157>:	pop    %rbx
   0x00007ffff7f4abce <+158>:	mov    %r12d,%eax
   0x00007ffff7f4abd1 <+161>:	pop    %rbp
   0x00007ffff7f4abd2 <+162>:	pop    %r12
   0x00007ffff7f4abd4 <+164>:	pop    %r13
   0x00007ffff7f4abd6 <+166>:	pop    %r14
   0x00007ffff7f4abd8 <+168>:	ret    
   0x00007ffff7f4abd9 <+169>:	nopl   0x0(%rax)
   0x00007ffff7f4abe0 <+176>:	mov    0x310(%rbx),%rax
   0x00007ffff7f4abe7 <+183>:	mov    %rbp,%rdi
=> 0x00007ffff7f4abea <+186>:	call   *0x28(%rax)
   0x00007ffff7f4abed <+189>:	mov    %eax,%r12d
   0x00007ffff7f4abf0 <+192>:	test   %eax,%eax
   0x00007ffff7f4abf2 <+194>:	jns    0x7ffff7f4ab97 <bladerf2_load_fpga+103>

To confirm this was specific to the master branch, I then re-removed, re-checked out tag 2023.02 and confirmed again that the sudo bladeRF-cli -l hostedxA9-latest.rbf works from that branch.

sudo bladeRF-cli -l ~/hostedxA9-latest.rbf
Loading FPGA...
Successfully loaded FPGA bitstream!

However, it's reproducible that if I go back to the 2023.02 branch and load the FPGA that way, ice9 is capped at 50%.

sudo ./ice9-bluetooth -l -i bladerf0 -w sniff_2400-2432.pcap -s -c 2416 -C 32
[WARNING @ host/libraries/libbladeRF/src/board/bladerf2/bladerf2.c:916] Oversample feature gain limit reached. RF Gain clamped to 11.
ch   0.4  samp/sec (  0% realtime); agc 118.5 Msamp/sec (197% realtime)
Channelizer too slow, use fewer channels
ch  16.0 Msamp/sec ( 50% realtime); agc 114.6 Msamp/sec (191% realtime)
Channelizer too slow, use fewer channels
ch  16.0 Msamp/sec ( 50% realtime); agc 115.6 Msamp/sec (193% realtime)
Channelizer too slow, use fewer channels
ch  16.0 Msamp/sec ( 50% realtime); agc 114.0 Msamp/sec (190% realtime)
Channelizer too slow, use fewer channels
ch  16.0 Msamp/sec ( 50% realtime); agc 114.8 Msamp/sec (191% realtime)
Channelizer too slow, use fewer channels

(Note because it will be relevant to a future question: I can confirm sniff_2400-2432.pcap has a size proportionate to runtime:

ls -la sniff_2400-2432.pcap 
-rw-r--r-- 1 root root 2243 May  1 08:39 sniff_2400-2432.pcap

)

Q1: So you didn't see a segfault with sudo bladeRF-cli -l from the latest (2fbae2c38b377cfbee98c281789cd43d1f1b55e4) master branch code when you run it?

(I went ahead and filed this ticket just in case it's a real bug.)


As an alternative approach, while on the master branch and getting the segfault, I found I can use -L instead of -l.

sudo bladeRF-cli -L ~/hostedxA9-latest.rbf
Writing FPGA to flash for autoloading...
[INFO @ host/libraries/libbladeRF/src/backend/usb/usb.c:504] Erasing 197 blocks starting at block 4
<snip>
Successfully wrote FPGA bitstream to flash!

At this point, it acts like it's working...

sudo ./ice9-bluetooth -l -i bladerf0 -w sniff_2400-2432.pcap -s -c 2416 -C 32
[WARNING @ host/libraries/libbladeRF/src/board/bladerf2/bladerf2.c:1194] bandwidth assignements with oversample feature enabled yields unkown results
ch   0.4  samp/sec (  0% realtime); agc 131.6 Msamp/sec (219% realtime)
Channelizer too slow, use fewer channels
ch  32.1 Msamp/sec (100% realtime); agc 131.6 Msamp/sec (219% realtime)
ch  32.0 Msamp/sec (100% realtime); agc 131.6 Msamp/sec (219% realtime)
ch  32.0 Msamp/sec (100% realtime); agc 131.5 Msamp/sec (219% realtime)
ch  32.1 Msamp/sec (100% realtime); agc 131.4 Msamp/sec (219% realtime)
ch  32.0 Msamp/sec (100% realtime); agc 131.2 Msamp/sec (219% realtime)

But in reality the sniff_2400-2432.pcap is not getting updated and is always 24 bytes large regardless of runtime.

ls -lah sniff_2400-2432.pcap
-rw-r--r-- 1 root root 24 May  1 08:47 sniff_2400-2432.pcap

Q2: When it seemed to be working, did you check that the pcap was getting filled in with valid data?
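
A 24-byte file is consistent with a capture that contains only the classic pcap global header, which is 24 bytes long, and no packet records. The sketch below counts the records in a classic (non-pcapng) pcap so an empty capture can be spotted quickly. `count_pcap_records` is a hypothetical helper, and treating the output as classic pcap is an assumption about what ice9-bluetooth writes.

```python
import struct

GLOBAL_HEADER_LEN = 24  # magic, version major/minor, thiszone, sigfigs, snaplen, linktype
RECORD_HEADER_LEN = 16  # ts_sec, ts_frac, incl_len, orig_len

# Classic pcap magic numbers as they appear on disk, mapped to struct byte order.
_MAGIC_ENDIAN = {
    b"\xd4\xc3\xb2\xa1": "<",  # little-endian, microsecond timestamps
    b"\x4d\x3c\xb2\xa1": "<",  # little-endian, nanosecond timestamps
    b"\xa1\xb2\xc3\xd4": ">",  # big-endian, microsecond timestamps
    b"\xa1\xb2\x3c\x4d": ">",  # big-endian, nanosecond timestamps
}

def count_pcap_records(data):
    """Return the number of complete packet records in classic pcap bytes."""
    if len(data) < GLOBAL_HEADER_LEN:
        raise ValueError("shorter than a pcap global header")
    endian = _MAGIC_ENDIAN.get(data[:4])
    if endian is None:
        raise ValueError("not a classic pcap file")
    offset = GLOBAL_HEADER_LEN
    count = 0
    while offset + RECORD_HEADER_LEN <= len(data):
        _sec, _frac, incl_len, _orig_len = struct.unpack_from(endian + "IIII", data, offset)
        offset += RECORD_HEADER_LEN + incl_len
        if offset > len(data):
            break  # truncated final record, do not count it
        count += 1
    return count
```

Usage would be along the lines of `count_pcap_records(open("sniff_2400-2432.pcap", "rb").read())`; a result of 0 means only the header was written.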

@XenoKovah

Update: Tested with the latest master branch after the ticket was fixed, but the behavior is still the same: with the latest, the pcap doesn't get written. Possibly due to that [WARNING @ host/libraries/libbladeRF/src/board/bladerf2/bladerf2.c:1194] bandwidth assignements with oversample feature enabled yields unkown results?

@mikeryan
Owner

mikeryan commented May 2, 2024

It's possible (likely) that you aren't seeing any data with your current flags. You can keep using those flags but also add -v to show if any individual packets are getting detected (and subsequently rejected).

The most basic test you can run is 4 channels around an advertising channel. Try ice9-bluetooth -l -i bladerf0 -C 4 -c 2428 -s -v -w 2426.pcap. If you see output between the channelizer stats, we're headed in the right direction.

You can also try to run it with the default flags (all 80 channels @ 96 MHz), because that's my usual testing setup: ice9-bluetooth -l -i bladerf0 -s -v -w danger_zone.pcap

Interestingly I was able to repro the segfault in bladerf-cli master: if the FPGA is already configured (via flash or USB) then it reliably segfaults. You must have burned your flash at some point before you reflashed it.

@XenoKovah

XenoKovah commented May 2, 2024

I don't think lack of packets is the problem. But just to confirm that, I stuck a known advertising Zephyr-based peripheral on the same box and confirmed it was being seen by bluetoothctl scan on (BDADDR 11:22:33:44:55:66 below, but you can see other devices are seen as well).

$ bluetoothctl scan on
Discovery started
[CHG] Controller 00:1A:55:44:22:11 Discovering: yes
[NEW] Device 11:22:33:44:55:66 ZOTS
[CHG] Device 11:22:33:44:55:66 RSSI: -17
[NEW] Device 6D:4A:13:2C:C3:A9 6D-4A-13-2C-C3-A9
[NEW] Device 50:C9:C9:F4:C9:C6 50-C9-C9-F4-C9-C6
[CHG] Device 11:22:33:44:55:66 RSSI: -26
[CHG] Device 11:22:33:44:55:66 RSSI: -18
[NEW] Device C9:B8:72:15:86:B8 C9-B8-72-15-86-B8
[CHG] Device 11:22:33:44:55:66 RSSI: -26
[CHG] Device 11:22:33:44:55:66 RSSI: -18
[CHG] Device 11:22:33:44:55:66 RSSI: -26
[CHG] Device 11:22:33:44:55:66 RSSI: -18
[CHG] Device 11:22:33:44:55:66 RSSI: -26
[NEW] Device 14:14:53:7B:17:14 14-14-53-7B-17-14
[CHG] Device 11:22:33:44:55:66 RSSI: -18
[CHG] Device 14:14:53:7B:17:14 RSSI: -54
[CHG] Device 14:14:53:7B:17:14 TxPower: 12
[CHG] Device 14:14:53:7B:17:14 Name: host
...

I have confirmed in the past with Sniffle that 11:22:33:44:55:66 is advertising on all 3 channels.

Most of the time when I run sudo ./ice9-bluetooth -l -i bladerf0 -C 4 -c 2428 -s -v -w 2426.pcap I see the following:

 sudo ./ice9-bluetooth -l -i bladerf0 -C 4 -c 2428 -s -v -w 2426.pcap
[WARNING @ host/libraries/libbladeRF/src/board/bladerf2/bladerf2.c:1194] bandwidth assignements with oversample feature enabled yields unkown results
[ERROR @ host/libraries/libbladeRF/src/board/bladerf2/bladerf2.c:1000] bladerf2_set_rational_sample_rate: dev->board->set_sample_rate(dev, ch, integer_rate, &actual_integer_rate) failed: Provided parameter was out of the allowable range
burst 2426-0000 cfo -0.002599 deviation 0.053800 
ch   0.0  samp/sec (  0% realtime); agc  39.8 Msamp/sec (995% realtime)
Channelizer too slow, use fewer channels
ch  21.5 Msamp/sec (537% realtime); agc  44.5 Msamp/sec (1113% realtime)
ch  21.5 Msamp/sec (538% realtime); agc  44.5 Msamp/sec (1113% realtime)
ch  21.8 Msamp/sec (544% realtime); agc  45.0 Msamp/sec (1126% realtime)
ch  21.8 Msamp/sec (545% realtime); agc  45.1 Msamp/sec (1127% realtime)
ch  21.8 Msamp/sec (545% realtime); agc  45.1 Msamp/sec (1127% realtime)
ch  21.8 Msamp/sec (546% realtime); agc  45.2 Msamp/sec (1129% realtime)
ch  21.8 Msamp/sec (545% realtime); agc  45.1 Msamp/sec (1127% realtime)
...
ch  23.9 Msamp/sec (598% realtime); agc  50.1 Msamp/sec (1252% realtime)
WARNING: dropped samples on the floor. try fewer channels or a bigger buffer.
WARNING: dropped samples on the floor. try fewer channels or a bigger buffer.
ch  23.8 Msamp/sec (595% realtime); agc  50.1 Msamp/sec (1252% realtime)
WARNING: dropped samples on the floor. try fewer channels or a bigger buffer.
WARNING: dropped samples on the floor. try fewer channels or a bigger buffer.
WARNING: dropped samples on the floor. try fewer channels or a bigger buffer.
...

Once it starts hitting those WARNINGs it spits them out very fast and I just ctrl-c it.

However, exactly 1 time in perhaps 10 tests I saw the following before it hit the WARNINGs:

sudo ./ice9-bluetooth -l -i bladerf0 -C 4 -c 2428 -s -v -w 2428.pcap
[WARNING @ host/libraries/libbladeRF/src/board/bladerf2/bladerf2.c:1194] bandwidth assignements with oversample feature enabled yields unkown results
[ERROR @ host/libraries/libbladeRF/src/board/bladerf2/bladerf2.c:1000] bladerf2_set_rational_sample_rate: dev->board->set_sample_rate(dev, ch, integer_rate, &actual_integer_rate) failed: Provided parameter was out of the allowable range
burst 2426-0000 cfo 0.015532 deviation 0.103555 
ch   0.0  samp/sec (  0% realtime); agc  39.9 Msamp/sec (997% realtime)
Channelizer too slow, use fewer channels
ch  21.7 Msamp/sec (541% realtime); agc  45.0 Msamp/sec (1124% realtime)
ch  21.8 Msamp/sec (546% realtime); agc  45.2 Msamp/sec (1130% realtime)
ch  21.9 Msamp/sec (547% realtime); agc  45.3 Msamp/sec (1133% realtime)
ch  21.8 Msamp/sec (545% realtime); agc  45.1 Msamp/sec (1129% realtime)
ch  21.9 Msamp/sec (548% realtime); agc  45.4 Msamp/sec (1135% realtime)
ch  21.2 Msamp/sec (531% realtime); agc  45.3 Msamp/sec (1134% realtime)
ch  21.9 Msamp/sec (547% realtime); agc  45.3 Msamp/sec (1132% realtime)
ch  21.8 Msamp/sec (545% realtime); agc  45.3 Msamp/sec (1131% realtime)
ch  22.0 Msamp/sec (551% realtime); agc  45.9 Msamp/sec (1148% realtime)
ch  24.3 Msamp/sec (608% realtime); agc  51.0 Msamp/sec (1276% realtime)
ch  24.2 Msamp/sec (605% realtime); agc  51.2 Msamp/sec (1280% realtime)
ch  24.5 Msamp/sec (612% realtime); agc  51.4 Msamp/sec (1284% realtime)
ch  24.2 Msamp/sec (604% realtime); agc  51.4 Msamp/sec (1285% realtime)
ch  24.2 Msamp/sec (606% realtime); agc  51.3 Msamp/sec (1283% realtime)
ch  24.2 Msamp/sec (606% realtime); agc  51.2 Msamp/sec (1281% realtime)
ch  24.1 Msamp/sec (603% realtime); agc  51.4 Msamp/sec (1284% realtime)
ch  24.1 Msamp/sec (602% realtime); agc  51.3 Msamp/sec (1284% realtime)
ch  24.1 Msamp/sec (603% realtime); agc  51.4 Msamp/sec (1285% realtime)
ch  24.2 Msamp/sec (605% realtime); agc  51.3 Msamp/sec (1282% realtime)
burst 2426-0335 cfo 0.038842 deviation 0.189306 
burst 2426-0336 cfo 0.053311 deviation 0.205164 
ch  24.3 Msamp/sec (607% realtime); agc  51.5 Msamp/sec (1286% realtime)
ch  24.1 Msamp/sec (602% realtime); agc  51.5 Msamp/sec (1288% realtime)
ch  24.2 Msamp/sec (606% realtime); agc  51.0 Msamp/sec (1274% realtime)
ch  24.4 Msamp/sec (610% realtime); agc  51.5 Msamp/sec (1286% realtime)
ch  24.5 Msamp/sec (612% realtime); agc  51.5 Msamp/sec (1287% realtime)
burst 2426-0617 cfo 0.053109 deviation 0.203644 aa 12010009
ch  24.1 Msamp/sec (602% realtime); agc  51.6 Msamp/sec (1289% realtime)
burst 2426-0624 cfo 0.057693 deviation 0.211489 
burst 2426-0629 cfo 0.034952 deviation 0.195347 
burst 2426-0634 cfo 0.060942 deviation 0.212326 
burst 2428-16615 cfo 0.041602 deviation 0.191559 
burst 2426-0641 cfo 0.024661 deviation 0.197648 
burst 2428-16855 cfo 0.026818 deviation 0.129185 
burst 2426-0644 cfo 0.052059 deviation 0.207090 
ch  24.0 Msamp/sec (599% realtime); agc  51.5 Msamp/sec (1289% realtime)
burst 2426-0663 cfo 0.043445 deviation 0.216559 
ch  24.1 Msamp/sec (603% realtime); agc  51.2 Msamp/sec (1279% realtime)

And I confirmed that it did lead to a non-24-byte output pcap.

Q1: Is that [ERROR @ host/libraries/libbladeRF/src/board/bladerf2/bladerf2.c:1000] bladerf2_set_rational_sample_rate: dev->board->set_sample_rate(dev, ch, integer_rate, &actual_integer_rate) failed: Provided parameter was out of the allowable range an actual error that's causing problems?

Q2: How can I increase the buffer size that's being complained about in the warning? (Because it's clearly not an option to decrease the channels down from the 4 that's used in that command.)

@XenoKovah

Misc other thing I noticed. If I change from 4 channels to 8 channels, the channelization stat goes down drastically. I.e. above you can see w/ 4 channels it's ~21-24 Msamp/sec ~600% realtime. If I do 8 the Msamp/sec is capped at 8.0 for some reason... Seems like another separate possible bug?

sudo ./ice9-bluetooth -l -i bladerf0 -C 8 -c 2426 -s -v -w 2426.pcap
[WARNING @ host/libraries/libbladeRF/src/board/bladerf2/bladerf2.c:1194] bandwidth assignements with oversample feature enabled yields unkown results
burst 2426-0000 cfo -0.004094 deviation 0.062801 
ch   0.1  samp/sec (  0% realtime); agc  44.5 Msamp/sec (371% realtime)
Channelizer too slow, use fewer channels
ch   8.0 Msamp/sec (100% realtime); agc  44.0 Msamp/sec (367% realtime)
ch   8.0 Msamp/sec (100% realtime); agc  44.1 Msamp/sec (367% realtime)
ch   8.0 Msamp/sec (100% realtime); agc  44.1 Msamp/sec (367% realtime)
ch   8.0 Msamp/sec (100% realtime); agc  44.1 Msamp/sec (367% realtime)
ch   8.0 Msamp/sec (100% realtime); agc  43.9 Msamp/sec (366% realtime)
...

p.s. I think adv channel 38 is 2426MHz so I corrected from the above which I just copied and pasted from the example which was set to 2428 (unless there was some reason to test specifically with 2428?)
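
For reference, the Bluetooth LE channel-index-to-frequency mapping can be sketched as below. The three advertising channels (indices 37, 38, 39) sit at 2402, 2426, and 2480 MHz, so channel 38 is indeed 2426 MHz. `ble_channel_to_mhz` is a hypothetical helper name, not something from ice9-bluetooth.

```python
def ble_channel_to_mhz(index):
    """Map a BLE channel index (0-39) to its center frequency in MHz."""
    advertising = {37: 2402, 38: 2426, 39: 2480}
    if index in advertising:
        return advertising[index]
    if 0 <= index <= 10:
        return 2404 + 2 * index          # data channels between 2402 and 2426
    if 11 <= index <= 36:
        return 2428 + 2 * (index - 11)   # data channels between 2426 and 2480
    raise ValueError("BLE channel index must be in 0..39")
```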

@XenoKovah

XenoKovah commented May 2, 2024

Misc other datapoint. The "burst" outputs are saying it saw a packet right?

If so, when I do the same above command but up the channels to 80, I see a decent number of bursts, despite not being able to keep up (which I'm still skeptical about on this ubuntu server w/ lots of cores, so I filed #32 to discuss separately), but the pcap size doesn't increase.

sudo ./ice9-bluetooth -l -i bladerf0 -C 80 -c 2426 -s -v -w 2426.pcap
[WARNING @ host/libraries/libbladeRF/src/board/bladerf2/bladerf2.c:1194] bandwidth assignements with oversample feature enabled yields unkown results
burst 2426-0000 cfo -0.000071 deviation 0.009830 
burst 2402-0002 cfo -0.005716 deviation 0.159508 
ch   0.9  samp/sec (  0% realtime); agc 238.2 Msamp/sec (153% realtime)
Channelizer too slow, use fewer channels
burst 2402-0003 cfo -0.011729 deviation 0.156868 
ch  38.6 Msamp/sec ( 48% realtime); agc 240.7 Msamp/sec (154% realtime)
Channelizer too slow, use fewer channels
burst 2402-0005 cfo -0.009979 deviation 0.159363 
ch  37.3 Msamp/sec ( 47% realtime); agc 242.6 Msamp/sec (156% realtime)
Channelizer too slow, use fewer channels
burst 2402-0008 cfo -0.007519 deviation 0.160051 
burst 2402-0009 cfo -0.009345 deviation 0.162484 
ch  37.0 Msamp/sec ( 46% realtime); agc 241.2 Msamp/sec (155% realtime)
Channelizer too slow, use fewer channels
burst 2402-0011 cfo -0.011570 deviation 0.168530 
burst 2402-0012 cfo -0.001944 deviation 0.157497 
ch  37.1 Msamp/sec ( 46% realtime); agc 240.2 Msamp/sec (154% realtime)
Channelizer too slow, use fewer channels
burst 2402-0014 cfo -0.002771 deviation 0.156555 
ch  38.3 Msamp/sec ( 48% realtime); agc 242.3 Msamp/sec (155% realtime)
Channelizer too slow, use fewer channels
ch  38.4 Msamp/sec ( 48% realtime); agc 242.4 Msamp/sec (155% realtime)
Channelizer too slow, use fewer channels
ch  38.2 Msamp/sec ( 48% realtime); agc 243.3 Msamp/sec (156% realtime)
Channelizer too slow, use fewer channels
ch  38.7 Msamp/sec ( 48% realtime); agc 237.7 Msamp/sec (152% realtime)
Channelizer too slow, use fewer channels
burst 2402-0021 cfo 0.001669 deviation 0.159307 
burst 2402-0022 cfo -0.009556 deviation 0.164196 
...

The pcap is still only 24 bytes after that.

And the same basic behavior is seen with the -a flag

sudo ./ice9-bluetooth -l -i bladerf0 -s -v -w danger_zone.pcap
[WARNING @ host/libraries/libbladeRF/src/board/bladerf2/bladerf2.c:1194] bandwidth assignements with oversample feature enabled yields unkown results
[WARNING @ host/libraries/libbladeRF/src/board/bladerf2/common.c:356] The total sample throughput for the 1 active channel, 96 Msps, is greater than the recommended maximum sample throughput, 80 Msps. You may experience dropped samples unless the sample rate is reduced, or some channels are deactivated.
ch   1.1  samp/sec (  0% realtime); agc 279.6 Msamp/sec (149% realtime)
Channelizer too slow, use fewer channels
ch  39.1 Msamp/sec ( 41% realtime); agc 303.4 Msamp/sec (161% realtime)
...
Channelizer too slow, use fewer channels
burst 2454-0215 cfo 0.104460 deviation 0.222255 
burst 2454-0225 cfo 0.105447 deviation 0.225993 
burst 2468-0408 cfo 0.069146 deviation 0.255209 
burst 2468-0410 cfo 0.105634 deviation 0.209870 
burst 2454-0235 cfo 0.106266 deviation 0.222205 
ch  37.1 Msamp/sec ( 39% realtime); agc 277.4 Msamp/sec (148% realtime)
Channelizer too slow, use fewer channels
burst 2458-0242 cfo 0.126710 deviation 0.201212 
burst 2468-0433 cfo 0.118328 deviation 0.210188 
burst 2460-0250 cfo 0.137735 deviation 0.195445 
ch  37.0 Msamp/sec ( 38% realtime); agc 250.9 Msamp/sec (133% realtime)
Channelizer too slow, use fewer channels
ch  37.1 Msamp/sec ( 39% realtime); agc 288.6 Msamp/sec (154% realtime)
Channelizer too slow, use fewer channels
ch  37.1 Msamp/sec ( 39% realtime); agc 288.8 Msamp/sec (154% realtime)
Channelizer too slow, use fewer channels
^C[ERROR @ host/libraries/libbladeRF/src/backend/usb/libusb.c:1089] Transfer timed out for buffer 0x7f416189a010
test@host02:~/ice9-bluetooth-sniffer/build$ ls -lah danger_zone.pcap 
-rw-r--r-- 1 root root 24 May  2 08:15 danger_zone.pcap
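
For context on the 24-byte figure: a classic pcap file begins with a 24-byte global header, so a 24-byte capture is a header with zero packet records. The sketch below counts records in such a file. It assumes the sniffer writes the classic pcap format (not pcapng), and the function name is made up for illustration.

```python
import struct

PCAP_GLOBAL_HEADER_LEN = 24              # magic, version, thiszone, sigfigs, snaplen, linktype
PCAP_RECORD_HEADER_LEN = 16              # ts_sec, ts_frac, incl_len, orig_len
PCAP_MAGICS = {0xA1B2C3D4, 0xA1B23C4D}   # microsecond / nanosecond timestamp variants


def count_pcap_records(data: bytes) -> int:
    """Count complete packet records in a classic pcap byte string (hypothetical helper)."""
    if len(data) < PCAP_GLOBAL_HEADER_LEN:
        raise ValueError("shorter than a pcap global header")
    # The magic number tells us which byte order the writer used.
    if struct.unpack("<I", data[:4])[0] in PCAP_MAGICS:
        endian = "<"
    elif struct.unpack(">I", data[:4])[0] in PCAP_MAGICS:
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")

    offset = PCAP_GLOBAL_HEADER_LEN
    count = 0
    while offset + PCAP_RECORD_HEADER_LEN <= len(data):
        _, _, incl_len, _ = struct.unpack(
            endian + "IIII", data[offset:offset + PCAP_RECORD_HEADER_LEN]
        )
        end = offset + PCAP_RECORD_HEADER_LEN + incl_len
        if end > len(data):
            break  # truncated final record, do not count it
        offset = end
        count += 1
    return count


# A header-only file, like the 24-byte danger_zone.pcap above.
# The linktype value (1) is a placeholder, not what the sniffer actually writes.
header_only = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
print(len(header_only), count_pcap_records(header_only))  # 24 0
```

To check a real capture, pass `open("danger_zone.pcap", "rb").read()` to `count_pcap_records`.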

@mikeryan
Owner

mikeryan commented May 2, 2024

OK, various aspects of this project are swapping back in. Seeing the channelizer hover around 100% is the normal, expected behavior when your CPU is able to keep up with the samples as they're being served from the bladeRF. You can't exceed 100% in this scenario because you'd be receiving samples from the future. You should only ever see it exceed 100% when reading from a file.

Which raises the question: why are we seeing it above 200% with -C 4? I think you're right: that error message must indicate a failure, and my code goes haywire when it fails to handle it. So for now, yes, stick to -C 8 or higher.
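
As a rough illustration of the "realtime" percentage, here is a minimal sketch. It assumes the figure is simply the processed throughput divided by the SDR sample rate, which is inferred from the log lines in this thread and not taken from the sniffer source. The function names are hypothetical, and the 8 Msps rate for -C 8 is likewise read off the "ch 8.0 Msamp/sec (100% realtime)" lines.

```python
def realtime_percent(processed_msps: float, sample_rate_msps: float) -> float:
    """Processed throughput as a percentage of the SDR sample rate (assumed definition)."""
    return 100.0 * processed_msps / sample_rate_msps


def keeps_up(processed_msps: float, sample_rate_msps: float, tolerance: float = 1.0) -> bool:
    """True if the channelizer is within `tolerance` percent of realtime."""
    return realtime_percent(processed_msps, sample_rate_msps) >= 100.0 - tolerance


# 96 Msps capture from the log above: the channelizer falls behind.
print(round(realtime_percent(37.1, 96.0)), keeps_up(37.1, 96.0))  # 39 False

# -C 8 run, assumed to sample at 8 Msps: the channelizer keeps up.
print(round(realtime_percent(8.0, 8.0)), keeps_up(8.0, 8.0))      # 100 True
```

With a live SDR as the source, anything well above 100% by this definition would mean processing more samples than the radio delivered, which is why it points at a bug and not at a fast CPU.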

Bursts just refer to RF energy observed on a channel, which may be due to BLE, classic BT, Wi-Fi, or other narrowband transceivers (such as the nRF24 series used in Logitech receivers). Seeing periodic bursts is an indicator that the data is flowing all the way through the signal processing chain correctly. There's still likely some problem with the data as it comes through.

If a burst is correctly decoded as a BLE packet, we invoke libbtbb to dump its contents. If you've ever used ubertooth, the output will look similar to that with the full packet contents displayed in the terminal.

If you want to follow along at home as I debug this, set base_name to a string like "test". This will output the individual bursts from each channel into files that can be plotted with inspectrum.

For the center channel, we actually want to use one of the odd-numbered channels so that the DC offset from the SDR lies between two BLE channels. I typically use 2427. With all that said, here's the command I recommend for smoke testing:

ice9-bluetooth -l -i bladerf0 -C 8 -c 2427 -s -v -w test.pcap
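
To make the center-frequency reasoning concrete, here is a small sketch. BLE RF channels sit on even MHz values from 2402 to 2480, so an odd center frequency puts the DC spike 1 MHz away from the nearest channel center. The `covered_channels` layout (channels spaced 2 MHz apart, symmetric about the center) is an assumption for illustration, not something confirmed from the sniffer source, and both function names are hypothetical.

```python
BLE_CHANNEL_FREQS_MHZ = range(2402, 2481, 2)  # 40 BLE RF channels, 2 MHz apart


def distance_to_nearest_ble_channel(center_mhz: int) -> int:
    """Distance in MHz from the SDR center (where the DC offset sits) to the nearest BLE channel."""
    return min(abs(center_mhz - f) for f in BLE_CHANNEL_FREQS_MHZ)


def covered_channels(center_mhz: int, num_channels: int) -> list[int]:
    """Assumed layout: num_channels channels, 2 MHz apart, symmetric about the center."""
    first = center_mhz - (num_channels - 1)
    return [first + 2 * i for i in range(num_channels)]


print(distance_to_nearest_ble_channel(2427))  # 1 -> DC offset falls between two channels
print(distance_to_nearest_ble_channel(2426))  # 0 -> DC offset lands on a BLE channel
print(covered_channels(2427, 8))
# [2420, 2422, 2424, 2426, 2428, 2430, 2432, 2434]
```

Under that assumed layout, -c 2427 -C 8 would cover 2426, which is the channel the advertising packets show up on in the output later in this thread.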

@XenoKovah

Changed base_name to "test" and recompiled and re-ran.

Currently, with an advertising peripheral plugged in next to it again, I can run sudo ./ice9-bluetooth -l -i bladerf0 -C 8 -c 2427 -s -v -w test.pcap and it will run for a long time (e.g. 2+ minutes) without seeing any bursts. It also does not print the "WARNING: dropped samples on the floor. try fewer channels or a bigger buffer." error, so that is probably specific to the -C 4 setting. If I instead use -a, I do see some bursts as before, along with the resulting output files.

@mikeryan
Owner

OK just sanity checking here. Here's my current setup:

  • Ubuntu 22.04 native (no VM)
  • bladeRF xA4 running hostedxA4-latest.rbf (downloaded today, substitute appropriately for your model)
  • bladeRF host tools master
  • ice9-bluetooth-sniffer master

In the sniffer I set num_samples_workaround = 0 at line 60 in bladerf.c. If I run the following command, I see output similar to this:

$ ./ice9-bluetooth -l -c 2427 -C 8 -sv -i bladerf0
burst 2426-0035 cfo 0.000177 deviation 0.154761 aa 8e89bed6
burst 2424-0607 cfo 0.013212 deviation 0.190828
burst 2428-0046 cfo 0.010062 deviation 0.167405
burst 2426-0036 cfo 0.010988 deviation 0.154553 aa 8e89bed6
burst 2424-0609 cfo 0.013281 deviation 0.184390
burst 2426-0040 cfo 0.040153 deviation 0.147100
burst 2424-1002 cfo 0.056386 deviation 0.257270
ch   8.0 Msamp/sec (100% realtime); agc  61.3 Msamp/sec (511% realtime)

The lines with aa 8e89bed6 are successfully decoded BLE advertising packets.
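
If you want to tally decoded packets from a saved log, something like the sketch below could work. 0x8E89BED6 is the fixed access address BLE uses on its advertising channels. The interpretation of the burst ID (four-digit channel frequency in MHz, then a counter) is my reading of the output, not something confirmed from the source, and the helper names are hypothetical.

```python
import re

BLE_ADV_ACCESS_ADDRESS = 0x8E89BED6  # access address used by BLE advertising packets

# Matches lines like: "burst 2426-0035 cfo 0.000177 deviation 0.154761 aa 8e89bed6"
BURST_RE = re.compile(
    r"^burst (?P<freq>\d{4})-(?P<seq>\d+) "
    r"cfo (?P<cfo>-?\d+\.\d+) deviation (?P<dev>-?\d+\.\d+)"
    r"(?: aa (?P<aa>[0-9a-f]{8}))?\s*$"
)


def parse_burst(line: str):
    """Return a dict for a burst line, or None for any other log line."""
    m = BURST_RE.match(line)
    if m is None:
        return None
    aa = m.group("aa")
    return {
        "freq_mhz": int(m.group("freq")),   # assumed meaning of the first ID field
        "seq": int(m.group("seq")),         # assumed meaning of the second ID field
        "cfo": float(m.group("cfo")),
        "deviation": float(m.group("dev")),
        "aa": int(aa, 16) if aa else None,
    }


log_lines = [
    "burst 2426-0035 cfo 0.000177 deviation 0.154761 aa 8e89bed6",
    "burst 2424-0607 cfo 0.013212 deviation 0.190828",
    "burst 2426-0036 cfo 0.010988 deviation 0.154553 aa 8e89bed6",
    "ch   8.0 Msamp/sec (100% realtime); agc  61.3 Msamp/sec (511% realtime)",
]

bursts = [b for b in map(parse_burst, log_lines) if b is not None]
adverts = [b for b in bursts if b["aa"] == BLE_ADV_ACCESS_ADDRESS]
print(f"{len(adverts)} of {len(bursts)} bursts decoded as advertising")
# 2 of 3 bursts decoded as advertising
```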

Can you run bladeRF-cli --flash-fpga X (not a typo, literally capital X) to disable FPGA auto-loading and load the FPGA manually? That's the only remaining thing I can think of that's different between our two setups.

@XenoKovah

Ubuntu 22.04 native (no VM)
bladeRF xA4 running hostedxA4-latest.rbf (downloaded today, substitute appropriately for your model)
bladeRF host tools master
ice9-bluetooth-sniffer master

Yes to all of those, with A9 instead of A4.

I think bladeRF-cli --flash-fpga X made the difference (in conjunction with reloading the FPGA with sudo bladeRF-cli -l ~/hostedxA9-latest.rbf, of course), because now I am also seeing far more successful captures of bursts when I've got the advertiser right next to it. 👍

./ice9-bluetooth -l -c 2427 -C 8 -sv -i bladerf0 -w ice9CH38.pcap
[WARNING @ host/libraries/libbladeRF/src/board/bladerf2/bladerf2.c:1194] bandwidth assignements with oversample feature enabled yields unkown results
burst 2426-0009 cfo 0.024305 deviation 0.150941 
burst 2426-0014 cfo 0.016020 deviation 0.170329 
ch   0.1  samp/sec (  0% realtime); agc  43.7 Msamp/sec (364% realtime)
Channelizer too slow, use fewer channels
burst 2426-0017 cfo 0.024765 deviation 0.164000 
burst 2426-0018 cfo 0.019900 deviation 0.152570 
ch   8.0 Msamp/sec (100% realtime); agc  43.4 Msamp/sec (362% realtime)
burst 2426-0021 cfo 0.013077 deviation 0.166726 aa 8e89bed6
burst 2426-0022 cfo 0.020734 deviation 0.149532 
ch   8.0 Msamp/sec (100% realtime); agc  43.5 Msamp/sec (362% realtime)
burst 2426-0025 cfo 0.010012 deviation 0.151067 
burst 2426-0026 cfo 0.004412 deviation 0.157726 
ch   8.0 Msamp/sec (100% realtime); agc  43.5 Msamp/sec (363% realtime)
burst 2426-0030 cfo 0.016502 deviation 0.162652 
burst 2426-0037 cfo 0.009266 deviation 0.155733 
ch   8.0 Msamp/sec (100% realtime); agc  43.3 Msamp/sec (361% realtime)
burst 2426-0044 cfo 0.017054 deviation 0.156401 
burst 2426-0051 cfo 0.026337 deviation 0.148755 aa 89fed6a0

I think we can probably close this ticket, since it has diverged a bit from the original segfault discussion. I will open a new one to discuss the fact that it still seems to be missing lots of packets compared to Sniffle's pcap over the same time period.
