Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Audio recording and playback #163

Draft
wants to merge 37 commits into
base: master
Choose a base branch
from
Draft

WIP: Audio recording and playback #163

wants to merge 37 commits into from

Conversation

dpgeorge
Copy link
Collaborator

This PR adds audio/microphone recording and playback capabilities, as per the docs proposal: bbcmicrobit/micropython#791

It currently supports:

  • audio.AudioFrame(size) to create large audio frames
  • audio.play(audio_frame) can now take a single AudioFrame to play
  • audio.set_rate(rate) can set the playback rate on the fly
  • audio.sound_level() returns the current sound level being played
  • microphone.record_into(buffer, rate, wait) to record audio
  • microphone.is_recording() to check if recording is ongoing
  • microphone.stop_recording() to stop recording

There is a test program called src/test_record.py which uses all the above to show how it works.

@dpgeorge dpgeorge marked this pull request as draft November 14, 2023 01:46
@dpgeorge
Copy link
Collaborator Author

@microbit-carlos Please try out the test program and see how it goes.

@dpgeorge
Copy link
Collaborator Author

I have updated this PR with the following additions/changes:

  • added microphone.record(duration, rate=7812)
  • added a "used length" member to AudioFrame, so that it remembers how much valid recording there is; this value is set when recording, and when playing back only that much data is played
  • added a "rate" member to AudioFrame, and set_rate() / get_rate() members
  • calling microphone.record_into() will either use the rate of the given AudioFrame, or set the rate on that AudioFrame, if the rate is passed into this record method
  • playing an AudioFrame will play at its rate, and changing the rate (via audio_frame.set_rate()) while it's playing will dynamically adjust the playback rate
  • the AudioFrame constructor is changed to AudioFrame(duration=-1, rate=7812), so you can just pass in a duration in the simple case eg AudioFrame(2000)

@dpgeorge
Copy link
Collaborator Author

Some tweaks have been made to how the "used_size" entry in AudioFrame is used/set, the default microphone gain set to 0.2, and this PR has been rebased on the latest master which includes an update to the latest MicroPython version.

@microbit-carlos this is now ready for wider testing.

@microbit-carlos
Copy link
Contributor

microbit-carlos commented Feb 22, 2024

Thanks Damien!
Btw, the next CODAL tag has been released yesterday, which should resolve some of the issues with changing the sampling rate to non-proportional values: https://github.com/lancaster-university/codal-microbit-v2/releases/tag/v0.2.65

Diff:

@dpgeorge
Copy link
Collaborator Author

@microbit-carlos I have now updated this PR to use CODAL v0.2.66. It's working well.

Comment on lines 204 to 237
microbit_hal_microphone_start_recording(audio_frame->data, audio_frame->alloc_size, &audio_frame->used_size, audio_frame->rate);

return mp_const_none;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is missing the wait flag implementation.

For example, with this bit of code, the display shows "R" inmediately:

my_recording = audio.AudioFrame(duration=3000)
microphone.record_into(my_recording, wait=True)
display.show("R")

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait is now implemented.

@microbit-carlos
Copy link
Contributor

microbit-carlos commented Mar 1, 2024

With this example the playback rate doesn't seem to change for playback.

from microbit import *

RECORDING_SAMPLING_RATE = 7812

while True:
    if pin_logo.is_touched():
        # Record and play back at the same rate
        my_recording = microphone.record(duration=3000, rate=RECORDING_SAMPLING_RATE)
        audio.play(my_recording)
        del my_recording

    if button_a.is_pressed():
        # Play back at half the sampling rate
        my_recording = microphone.record(duration=3000, rate=RECORDING_SAMPLING_RATE)
        audio.set_rate(RECORDING_SAMPLING_RATE // 2)
        audio.play(my_recording)
        del my_recording

    if button_b.is_pressed():
        # Play back at twice the sampling rate
        my_recording = microphone.record(duration=3000, rate=RECORDING_SAMPLING_RATE)
        audio.set_rate(RECORDING_SAMPLING_RATE * 2)
        audio.play(my_recording)
        del my_recording

    sleep(200)

Edit: Ah! the type was that this example was using audio.set_rate(), which I guess we need to remove from this branch?

@microbit-carlos
Copy link
Contributor

Also, I think the microphone.set_sensitivity(gain) still needs to be implemented.

@microbit-carlos
Copy link
Contributor

microbit-carlos commented Mar 4, 2024

Another small one, if the AudioFrame rate is set to a negative number it overflows the calculation (well, maybe it's converting it to an unsigned int at some point):

>>> AudioFrame(1000, -1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError: memory allocation failed, allocating 4294990 bytes

It could throw a ValueError instead, like setting it to zero:

>>> AudioFrame(1000, 0)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: rate out of bounds

@microbit-carlos
Copy link
Contributor

microbit-carlos commented Mar 4, 2024

Looks like set_rate() overflows/converts as well:

>>> af = AudioFrame()
>>> af.set_rate(2**16 - 1)
>>> af.get_rate()
65535
>>> af.set_rate(2**16)
>>> af.get_rate()
0
>>> af.set_rate(2**16 + 1)
>>> af.get_rate()
1

And on that note, should the rate be limited to 16 bits? Should it be increased to 32?

@microbit-carlos
Copy link
Contributor

Should copyfrom() also duplicate the length from the source?
Before these changes AudioFrame was always 32 bytes, but now we can have different size AudioFrames.

I was also wondering if we should also make it a static function to be able to do new = AudioFrame.copyfrom(old).
Otherwise we now we have to figure out a way to match the length of the old instance somehow when creating the new, as we cannot even do something like new = AudioFrame(size=len(old)); new.copyfrom(old).

@microbit-carlos
Copy link
Contributor

Does the AudioFrame returned by microphone.record() purposely does not have a size that is a multiple of 32 bytes, or is len(AudioFrame()) returning the "used size" instead of the "total size"?

>>> len(microphone.record(duration=1000))
7812
>>> 7812 / 32
244.125

@microbit-carlos
Copy link
Contributor

microbit-carlos commented Mar 6, 2024

Passing a negative value to the microphone.record() duration or rate parameters seems to lock up and never return.
So microphone.record(-1) or microphone.record(1000, -1)

And setting the rate to zero returns a zero-length AudioFrame:

>>> len(microphone.record(1000, 0))
0

@dpgeorge
Copy link
Collaborator Author

Ah! the type was that this example was using audio.set_rate(), which I guess we need to remove from this branch?

I have now removed audio.set_rate().

@dpgeorge
Copy link
Collaborator Author

I think the microphone.set_sensitivity(gain) still needs to be implemented.

Yes, correct, it still needs to be implemented.

Would this just call CODAL's uBit.audio.processor->setGain()? If so, what is the argument to set_sensitivity(), is it a float or integer, and is the value inverted before passing through to setGain()? And what values should we use for the SENSITIVITY_xxx constants?

@dpgeorge
Copy link
Collaborator Author

Another small one, if the AudioFrame rate is set to a negative number it overflows the calculation

This was a mistake, it was supposed to be checking for negative rates. Now fixed.

Looks like set_rate() overflows/converts as well:

I also fixed this, and increased the internal rate variable to 32-bits (I thought 16-bit would be enough for the rate, but it doesn't cost anything to make it 32-bits).

@microbit-carlos
Copy link
Contributor

Would this just call CODAL's uBit.audio.processor->setGain()? If so, what is the argument to set_sensitivity(), is it a float or integer, and is the value inverted before passing through to setGain()? And what values should we use for the SENSITIVITY_xxx constants?

Yes, only need to call the uBit.audio.processor->setGain() and the values are floats and we can use the same three levels as MakeCode:

  • SENSITIVITY_LOW = 0.079
  • SENSITIVITY_MEDIUM = 0.2 (the default)
  • SENSITIVITY_HIGH = 1.0

@microbit-matt-hillsdon
Copy link
Contributor

The existing sim implementation of playing AudioBuffers has always been problematic because of the 4ms chunk size. For audio frames of longer duration I had more hope, but at the moment the chunk size used is the same even though we likely have a much larger buffer in the frame itself.

Our work-in-progress record/playback sim implementation works OK if you increase LOG_AUDIO_CHUNK_SIZE from 5 to 6. (it's possible that on slower computers a bigger buffer still might help, I've not had a chance to test yet). We can't just do that as I think it means regular audio frames are pitch/rate shifted but it might show a way forward for the sim.

The sim changes without the LOG_AUDIO_CHUNK_SIZE change can be seen here:
https://review-python-simulator.usermbit.org/beta-updates/demo.html (microbit-foundation/micropython-microbit-v2-simulator#113) - see sample "Record". For now you'll have to rebuild locally to see the benefit of the buffer size change. We've not yet looked at edge cases etc. but I think this problem will remain.

@microbit-carlos
Copy link
Contributor

microbit-carlos commented Apr 12, 2024

@dpgeorge We've been testing the latest version from this PR and came up with a few discussion topics and bug reports. As there is a few of them, I've created individual GH issues to be able to track each of them individually, and all of them are in the v2.2.0-beta.1 milestone.

Some of the issues are bug reports or fairly straightforward questions, but the following issues require further discussion:

@jaustin @microbit-matt-hillsdon your input on these would also be very welcomed.

@dpgeorge
Copy link
Collaborator Author

The following changes have been made to this branch/PR:

  • Added microphone.set_sensitivity(gain) along with the constants microphone.SENSITIVITY_LOW, SENSITIVITY_MEDIUM, SENSITIVITY_HIGH.
  • Round up AudioFrame size to 32 bytes when created by microphone.record().
  • Raise exceptions if duration or rate arguments to microphone.record() are not positive.

@dpgeorge
Copy link
Collaborator Author

Should copyfrom() also duplicate the length from the source?

This needs discussion. copyfrom() actually allows copying from any object with the buffer protocol (eg bytes, memoryview) and they won't necessarily have the used_size attribute.

This method should probably just set used_size to the total bytes copied fro the buffer-like object. The only issue there is copying from another AudioFrame it will copy the total allocated size, not the used_size, of the other AudioFrame.

I was also wondering if we should also make it a static function to be able to do new = AudioFrame.copyfrom(old). Otherwise we now we have to figure out a way to match the length of the old instance somehow when creating the new, as we cannot even do something like new = AudioFrame(size=len(old)); new.copyfrom(old).

How about allowing passing an AudioFrame to the AudioFrame constructor? This is how list, dict etc work in Python. (Did we already discuss point?)

Make the constructor take duration and rate arguments, and add get_rate() /
set_rate() methods.

Signed-off-by: Damien George <[email protected]>
Signed-off-by: Damien George <[email protected]>
Signed-off-by: Damien George <[email protected]>
No longer needs to be a multiple of 32.

Signed-off-by: Damien George <[email protected]>
And allow duration to be a float, for more accurate sizing.

Signed-off-by: Damien George <[email protected]>
This is more efficient than mp_sched_schedule(), and can never fail.

Signed-off-by: Damien George <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants