me.colab_run() does not handle microphone permissions for a web component #1152

Open
williamito opened this issue Dec 18, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@williamito
Contributor

williamito commented Dec 18, 2024

Describe the bug
I am trying to create a web component to handle microphone input. I have code that works when Mesop is run from the command line but fails when run in Colab.

To Reproduce
This code correctly requests microphone permission when run directly. In Colab it fails silently, as seen in the screenshot below. The only change in Colab is adding !pip install mesop as the first cell.

Screenshots
Image

Workaround

Colab seems to use some heuristic to decide when to ask for permissions. If I first run the following code in a separate cell, it triggers the microphone request flow, and after that things seem to work.

%%javascript
navigator.mediaDevices.getUserMedia({audio: true});
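A slightly more defensive variant of the same workaround is sketched below. Because the original failure was silent, it reports why the request failed, and it releases the stream once permission has been obtained. The function name primeMicrophonePermission and the shape of its return value are hypothetical, chosen only for illustration; the only browser APIs used are getUserMedia, getTracks and stop. The media-devices object is passed in as an argument so the function can be exercised outside a browser.

```javascript
// Hypothetical helper (not part of Mesop or Colab): trigger the microphone
// permission prompt, then release the stream immediately.
async function primeMicrophonePermission(mediaDevices) {
  if (!mediaDevices || typeof mediaDevices.getUserMedia !== 'function') {
    return {granted: false, reason: 'getUserMedia unavailable'};
  }
  try {
    const stream = await mediaDevices.getUserMedia({audio: true});
    // Only the permission grant is needed here, so stop every track.
    stream.getTracks().forEach((track) => track.stop());
    return {granted: true};
  } catch (err) {
    // Surface the error name (e.g. NotAllowedError) instead of failing silently.
    return {granted: false, reason: err.name};
  }
}

// In a browser (for example a %%javascript cell), run it against the real API.
if (typeof navigator !== 'undefined' && navigator.mediaDevices) {
  primeMicrophonePermission(navigator.mediaDevices).then((r) => console.log(r));
}
```

The result is logged to the browser console, so the developer tools need to be open to see it.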
williamito added the bug label on Dec 18, 2024
@wwwillchen
Collaborator

Thanks for filing the issue. I actually implemented the microphone permission request flow in Colab 😅. As you've discovered, Colab's permission check is a somewhat rough heuristic. Because Mesop apps run inside an iframe within the Colab cell output (which is itself an iframe), Colab cannot properly perform the permission check.

I think the workaround that you've described is reasonable and should work (once the permission has been granted, it's given for all the cells in a notebook).
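One way to confirm that the grant is in place before starting the Mesop app is to query the browser's Permissions API. The sketch below is only illustrative: the function name microphonePermissionState is hypothetical, not every browser accepts 'microphone' as a permission name (in which case the query rejects), and it is an assumption that the state reported in one frame matches what the nested Mesop iframe will see.

```javascript
// Hypothetical helper: report the microphone permission state, or 'unknown'
// when the Permissions API is missing or does not support this name.
async function microphonePermissionState(permissions) {
  if (!permissions || typeof permissions.query !== 'function') {
    return 'unknown';
  }
  try {
    const status = await permissions.query({name: 'microphone'});
    return status.state; // 'granted', 'denied', or 'prompt'
  } catch (err) {
    // Some browsers reject the query for unsupported permission names.
    return 'unknown';
  }
}

// In a browser, run it against the real API and log the result.
if (typeof navigator !== 'undefined' && navigator.permissions) {
  microphonePermissionState(navigator.permissions).then((s) => console.log(s));
}
```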

There isn't a straightforward way to fix this (it'd likely take work in both Colab and Mesop).

@williamito
Contributor Author

Thanks for confirming!

I've updated the workaround in my initial report to use %%javascript cell magic to make things cleaner.

This came up while I was using Mesop to experiment with voice-to-voice agents and the Gemini Multimodal Live API, so other users might find the workaround helpful. I just opened #1158 to add it to the docs until there's a proper fix.
