.Net: Add AssemblyAI connector #5392

Swimburger · 2024-03-08T21:36:17Z

Motivation and Context

AssemblyAI is a speech AI company offering AI models through APIs.
Adding a connector will help users integrate AssemblyAI easily with Semantic Kernel.

Description

This issue tracks the implementation progress of the AssemblyAI connector.
Current implementation: AssemblyAI branch

TODO

AudioToTextService

GetTextContentsAsync using AudioContent (.Net: Add AssemblyAI connector for Audio-to-text #5094)
GetTextContentsAsync using AudioStreamContent (.Net: Add AssemblyAI connector for Audio-to-text #5094) (deprecated in favor of file service)
Add DI extensions (.Net: Add AssemblyAI connector for Audio-to-text #5094)
Add AssemblyAI file service to upload files (.Net: Add AssemblyAI file service #5964)
Return typed class in TextContent.InnerContent
Add all transcript parameters to AssemblyAIAudioToTextExecutionSettings

Potential additions

Add real-time speech-to-text


Swimburger · 2024-03-08T21:41:39Z

I noticed that the IAudioToTextService.GetTextContentsAsync method returns multiple TextContent objects.
We have one API that returns the transcript as sentences and another that returns it as paragraphs.
Would it make sense to add options to AssemblyAIAudioToTextExecutionSettings, which would control whether the transcript is returned as a single TextContent, or a TextContent for each sentence, or a TextContent for each paragraph?
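
For illustration only, a minimal sketch of how such an option could behave. Every name here (TranscriptGranularity, Transcript, TranscriptSplitter) is hypothetical and not part of the connector or of Semantic Kernel; it assumes the sentence and paragraph lists have already been fetched from the respective APIs, and it returns plain strings where the real service would presumably wrap each one in a TextContent.

```csharp
using System.Collections.Generic;

// Hypothetical setting: how the transcript is split into results.
public enum TranscriptGranularity { Full, Sentences, Paragraphs }

// Hypothetical holder for a transcript and its pre-fetched splits.
public sealed record Transcript(
    string Text,
    IReadOnlyList<string> Sentences,
    IReadOnlyList<string> Paragraphs);

public static class TranscriptSplitter
{
    // Returns one string per result item the service would emit.
    public static IReadOnlyList<string> ToContents(
        Transcript transcript,
        TranscriptGranularity granularity) =>
        granularity switch
        {
            TranscriptGranularity.Sentences => transcript.Sentences,
            TranscriptGranularity.Paragraphs => transcript.Paragraphs,
            // Default: the whole transcript as a single item.
            _ => new[] { transcript.Text },
        };
}
```

With this shape, the default keeps today's single-result behavior, and callers opt in to finer-grained results through the setting.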

Krzysztof318 · 2024-03-08T22:04:57Z

I would also add full realtime transcription to the TODO: you send AudioContent or AudioStreamContent and get back IAsyncEnumerable<StreamingTextContent>.

Swimburger · 2024-03-08T22:29:46Z

Quoting Krzysztof318: "I would add to todo also full realtime transcribing, so you send AudioContent or AudioStreamContent and you get IAsyncEnumerable<StreamingTextContent>"

I want to add realtime, but I want to finalize and release non-realtime transcription first.

Our realtime solution uses a WebSocket connection, expects raw audio bytes to be sent continuously, and responds with partial and final transcript objects. This is mostly consistent with other realtime transcription services.
I'd be happy to work with y'all in figuring out how to create a good abstraction that'll work for us and other realtime services.
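
As a starting point for that discussion, here is a rough sketch of one possible shape for a realtime abstraction: audio chunks go in as an async stream, and partial and final transcript updates come out as another. All types below (TranscriptUpdate, IRealtimeAudioToText, FakeRealtimeAudioToText) are hypothetical and exist in neither Semantic Kernel nor the AssemblyAI SDK. The fake implementation only counts bytes and stands in for a real WebSocket client.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical result type: a partial or final piece of transcript.
public sealed record TranscriptUpdate(string Text, bool IsFinal);

// Hypothetical abstraction: raw audio chunks in, transcript updates out.
public interface IRealtimeAudioToText
{
    IAsyncEnumerable<TranscriptUpdate> TranscribeAsync(
        IAsyncEnumerable<ReadOnlyMemory<byte>> audioChunks,
        CancellationToken cancellationToken = default);
}

// Stand-in for a real service: emits one partial update per chunk,
// then a single final update once the audio stream ends.
public sealed class FakeRealtimeAudioToText : IRealtimeAudioToText
{
    public async IAsyncEnumerable<TranscriptUpdate> TranscribeAsync(
        IAsyncEnumerable<ReadOnlyMemory<byte>> audioChunks,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        long totalBytes = 0;
        await foreach (var chunk in audioChunks.WithCancellation(cancellationToken))
        {
            totalBytes += chunk.Length;
            yield return new TranscriptUpdate($"received {totalBytes} bytes", IsFinal: false);
        }

        yield return new TranscriptUpdate($"received {totalBytes} bytes", IsFinal: true);
    }
}
```

A connector built on this shape could map each update onto whatever streaming content type the kernel settles on; the IsFinal flag is one way to carry the partial/final distinction mentioned above.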

Swimburger · 2024-05-01T20:29:37Z

Instead of using the AudioStreamContent, I'm introducing an AssemblyAI file service for users to upload their files to AssemblyAI. #5964

In the future, we can use a streaming audio content class for Streaming STT.

Swimburger · 2024-06-10T13:51:47Z

Now that we have the AssemblyAIAudioToTextService and AssemblyAIFileService in, I think we can release the initial version of this connector. What would the next steps be?

markwallace-microsoft added the .NET and triage labels Mar 8, 2024

github-actions bot changed the title from ".NET: Add AssemblyAI connector" to ".Net: Add AssemblyAI connector" Mar 8, 2024

Swimburger mentioned this issue Mar 8, 2024

.Net: Add AssemblyAI connector for Audio-to-text #5094

Merged

4 tasks

markwallace-microsoft removed the triage label Mar 12, 2024

markwallace-microsoft assigned RogerBarreto Mar 12, 2024

Swimburger mentioned this issue May 1, 2024

Add AssemblyAIAudioToTextService.cs AssemblyAI/assemblyai-semantic-kernel#8

Closed
