Tie microphone button to a specific stream & only open stream when activated #1445

Open
esand opened this issue Apr 16, 2024 · 5 comments

@esand

esand commented Apr 16, 2024

This is a feature request to enhance how 2-way audio and the microphone would work using the frigate-card.

What I am suggesting would be that if a user enables the microphone button in the menu, that it also asks for the sub-stream specific to using the 2-way audio. It could be the same stream as the camera view, or it could be a separate stream (allowing for whatever configuration options are necessary, such as go2rtc stream names, etc.).

Having the microphone button tied to a specific stream would allow that stream to be audio-only and could be opened/closed only when the microphone button is activated. Hopefully it's possible such that the card requests microphone access from the browser/client but doesn't have to attach the 2-way audio stream at that time. When the mic button is activated though, it would dynamically bind the 2-way audio stream, and unbind it once the mic button is deactivated. This would allow you to request microphone access as soon as the view is opened, but only open any backchannel audio streams when the mic feature is needed.

This feature is something I mentioned in #1235 and I believe may be a viable solution to a number of little issues people are having with setting up 2-way audio in an easy to manage/use manner in Home Assistant.

Criticism welcome if there are reasons this wouldn't work in all scenarios.

@dermotduffy
Owner

> What I am suggesting would be that if a user enables the microphone button in the menu, that it also asks for the sub-stream specific to using the 2-way audio.

Just to confirm, you are suggesting a configurable microphone stream, that is only used when the microphone connection is activated?

> Having the microphone button tied to a specific stream would allow that stream to be audio-only and could be opened/closed only when the microphone button is activated. Hopefully it's possible such that the card requests microphone access from the browser/client but doesn't have to attach the 2-way audio stream at that time. When the mic button is activated though, it would dynamically bind the 2-way audio stream, and unbind it once the mic button is deactivated. This would allow you to request microphone access as soon as the view is opened, but only open any backchannel audio streams when the mic feature is needed.

Interesting idea! I do plan to revisit 2-way audio in general, since some people are struggling with it -- but haven't quite planned what to do yet. What you're describing sounds similar to always_connected as it exists today, in that the microphone stream is just always connected to the microphone. When the button is pressed, it's unmuted; when it's let go, it's re-muted. How is what is described better than what already exists?

@esand
Author

esand commented Apr 17, 2024

> Just to confirm, you are suggesting a configurable microphone stream, that is only used when the microphone connection is activated?

Yes, since a number of people have issues where opening the backchannel somehow blocks features on the device. In the case of doorbells, which seem to be the most common, some models block the button press from working while the backchannel is active/connected.

> Interesting idea! I do plan to revisit 2-way audio in general, since some people are struggling with it -- but didn't quite plan what to do yet. What you're describing sounds similar to always_connected as it exists today, in that the microphone stream is just always connected to the microphone. When the button is pressed, it's unmuted, when it's let go it's re-muted. How is what is described better than what already exists?

Ok, hopefully I can explain my thought process a bit better...

With a dedicated microphone stream configured, you know what stream to use (open/connect to) when 2-way audio is required. For cases where the 2-way audio needs to be disconnected, it's easy to figure out that you simply close that stream, but not the main camera stream used for viewing (and possibly regular play-back audio).

Currently, you can only configure a single stream for a camera, which in the case of requiring 2-way audio means that the 2-way audio stream is being consumed at all times when that camera is in view - there's no way to drop the 2-way audio but keep the video feed. This is the first part of the issue that my proposal is hoping to improve.
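As a rough illustration of this first part, here is a minimal TypeScript sketch of how a card might resolve a dedicated microphone stream. The field names `videoStream` and `microphoneStream` and the function `resolveStreams` are hypothetical, chosen for this example only; they are not existing frigate-card options.

```typescript
// Hypothetical config shape: these field names are illustrative only and
// are NOT existing frigate-card options.
interface CameraStreamConfig {
  // Stream name (e.g. a go2rtc stream) used for viewing video.
  videoStream: string;
  // Optional stream name used only for 2-way audio.
  microphoneStream?: string;
}

interface ResolvedStreams {
  video: string;
  microphone: string;
  // True when the 2-way audio stream can be closed without touching video.
  independent: boolean;
}

// Fall back to the video stream when no dedicated microphone stream is set,
// matching the "same stream or a separate stream" idea in the proposal.
function resolveStreams(config: CameraStreamConfig): ResolvedStreams {
  const microphone = config.microphoneStream ?? config.videoStream;
  return {
    video: config.videoStream,
    microphone,
    independent: microphone !== config.videoStream,
  };
}
```

With a separate `microphoneStream`, the card would know exactly which connection to drop when 2-way audio is no longer needed, while the video stream stays open.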

The second half is how and when 2-way audio is used, and that touches on the always_connected feature. Presently, in order for 2-way audio to work, you need two things - the microphone permission in the client browser/app, and the stream to connect to. Having always_connected: true is definitely easier to use since the mic button doesn't require extra presses and works as you'd expect it to work - but it currently means the 2-way stream is connected right away, even if you don't plan to speak; as soon as the camera video feed is opened, so is the 2-way audio. This is where the issue comes in with common doorbells; as soon as there's a consumer for the backchannel, some features stop working on the device.

My proposal hinges on the possibility of establishing microphone access in the client browser/app without needing to open the 2-way stream. If this is possible, you could effectively hard code always_connected: true, since if the user has enabled the mic in the menu (with a stream as per part one of my idea), they obviously want the ability to speak - you just don't know if/when yet.

Once the mic button is activated, you would establish the connection between the microphone and the 2-way stream. Since the person has activated the mic, they want to speak - so let them. As soon as they are done, they deactivate the mic button and you close the stream connection. This releases the consumer and brings the device back to normal operation.

Since the 2-way stream would only be consumed (not just muted!) ad-hoc when the mic is going to be used, it would limit functionality issues on these devices to only when the person is trying to communicate through 2-way audio. Also, since you'd know what specific 2-way stream to use independently of the video, you could disconnect the 2-way audio feature (for timeouts or whatever purpose) without any impact to the video stream for the device which could remain active.
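The open-on-press, close-on-release behaviour described above could be sketched roughly as follows in TypeScript. The `Backchannel` interface and `PushToTalkController` class are hypothetical names for this example and say nothing about how the card is actually implemented. In a browser, microphone permission would typically come from `navigator.mediaDevices.getUserMedia({ audio: true })`; here that step is represented by a plain method call so the sketch stays self-contained.

```typescript
// Hypothetical abstraction over whatever opens/closes the 2-way audio stream.
interface Backchannel {
  open(): void;
  close(): void;
}

class PushToTalkController {
  private readonly backchannel: Backchannel;
  private micGranted = false;
  private connected = false;

  constructor(backchannel: Backchannel) {
    this.backchannel = backchannel;
  }

  // Call once the client has granted microphone permission.
  // Deliberately does NOT open the backchannel.
  onMicrophoneGranted(): void {
    this.micGranted = true;
  }

  // Mic button activated: open the backchannel only now.
  // Returns true if a connection was opened by this call.
  press(): boolean {
    if (!this.micGranted || this.connected) {
      return false;
    }
    this.backchannel.open();
    this.connected = true;
    return true;
  }

  // Mic button deactivated: close the stream (not just mute it),
  // releasing the consumer on the device.
  release(): void {
    if (!this.connected) {
      return;
    }
    this.backchannel.close();
    this.connected = false;
  }

  isConnected(): boolean {
    return this.connected;
  }
}
```

The key design point is that permission and connection are tracked separately, so the device only sees a backchannel consumer between `press()` and `release()`.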

I think this would allow people to set up "doorbell" views in Home Assistant where they can have the view active, watching the video, and then only when necessary, activate 2-way audio to speak to people. As soon as they're done talking, doorbell presses would work again and everything would be fine; it all hinges on releasing the backchannel consumer - nothing to do with microphone access and muting.

Does that explain it better?

@esand
Author

esand commented Apr 17, 2024

A final bit of extra info...

Presently, setting up a doorbell view in HA requires 2 views. The first uses video only for its stream, has no microphone access, and is used just to watch the video feed in case someone is approaching.

You would then need a secondary view that enables microphone and binds to the 2-way audio backchannel so that you can use the microphone to communicate with someone - but you cannot activate this view if the approaching person has not yet pressed the doorbell and is about to.

If the approaching person needs to ring the doorbell - there cannot be any backchannel consumers. As soon as there's something consuming that 2-way audio stream, doorbell presses do not work and the doorbell goes into a communications mode that limits functionality.

You have to wait for the person to ring the doorbell, and only then can you go to the secondary 2-way audio view, which lets you use the microphone to communicate with them. When you are done, you need to close that view (and/or whatever else is necessary) to disconnect the backchannel consumer so that the doorbell reverts back to normal operation, allowing other doorbell presses. Failure to do so keeps the backchannel open, blocking all other doorbell presses from even registering. So you can see the issue here... if nothing times out that backchannel connection, you could inadvertently keep it open for days, and anyone pressing the doorbell to alert you of their presence would get no response at all (the doorbell doesn't ring).
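One way to guard against a backchannel being left open for days is a simple timeout. The TypeScript sketch below is only an illustration: `BackchannelWatchdog` and its methods are hypothetical names, and the clock is passed in explicitly (rather than read from a timer) so the logic is easy to follow and check.

```typescript
// Hypothetical watchdog that force-closes a backchannel left open too long.
class BackchannelWatchdog {
  private readonly maxOpenMs: number;
  private readonly closeBackchannel: () => void;
  private openedAt: number | null = null;

  constructor(maxOpenMs: number, closeBackchannel: () => void) {
    this.maxOpenMs = maxOpenMs;
    this.closeBackchannel = closeBackchannel;
  }

  // Record when the backchannel was opened (time in milliseconds).
  markOpened(now: number): void {
    this.openedAt = now;
  }

  // Record a normal close so the watchdog stops tracking.
  markClosed(): void {
    this.openedAt = null;
  }

  // Call periodically with the current time in milliseconds.
  // Returns true if this call force-closed the backchannel.
  tick(now: number): boolean {
    if (this.openedAt === null) {
      return false;
    }
    if (now - this.openedAt < this.maxOpenMs) {
      return false;
    }
    this.closeBackchannel();
    this.openedAt = null;
    return true;
  }
}
```

In practice `tick` might be driven by `setInterval` with `Date.now()`, and the timeout value would presumably be user-configurable.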

Does this sound like a horrible design flaw for the doorbell? Absolutely. Sadly though, they seem to be fairly common, and at least for those of us with them, I'm hoping that some changes can make our lives a bit easier. This should have no negative impact to anyone with a properly functioning device, so it should be a win-win.

@felipecrs
Contributor

I am just reading this issue's title: this is kinda doable already. You can use automations to automatically switch to a different camera when the microphone is connected, and switch back otherwise.

Also you can use overrides to tweak the UI like disabling the mic button when some given camera is visible or whatever.

But I encountered some problems doing so:

One is #1458

Another was trying to use substreams (hide: true), with which I found many other issues. If I remember well, the conditions don't work: the card always thinks it's streaming the main camera, not the substream (maybe an extra condition like substream: true would be nice). There were also other issues I don't remember now.

@esand
Author

esand commented May 27, 2024

I am hoping for something more easily used and configured than something "kinda doable". It's the minor caveats (the "kinda" piece) that can trip people up or prevent them from creating a setup that's just what they're hoping for and that works properly all the time.

As mentioned, the issue is with binding to the substream on certain cameras; the card can't assume that it's OK to consume the substream all the time, and having full control over when and how the card consumes any substreams would be a big step towards easier 2-way audio setups.

End result is hopefully far fewer support requests asking for help and people being able to ditch cloud-only apps just to use their cameras.


3 participants