tracelarue

Adds Gemini Live for low-latency, bidirectional voice interactions with ros-mcp-server.

  • Added gemini_live to examples/2_gemini.
  • Enables audio input from the user and audio output from Gemini.
  • Enables Gemini Live to use ros-mcp-server.
  • Tested on Ubuntu 22.04, Python 3.10, ROS 2 Humble.

@tracelarue changed the title from "Voice interactions with Gemini Live and ros-mcp-server added to Gemini example." to "Add voice interactions with Gemini Live and ros-mcp-server to Gemini example." on Sep 24, 2025
@stex2005 stex2005 requested review from stex2005, lpigeon and rjohn-v and removed request for lpigeon September 24, 2025 16:05
@stex2005
Contributor

Thank you for your contribution, @tracelarue. I will give it a try soon.

@rjohn-v — I’d suggest adding a client/ folder in the repository to store installation packages and runnable clients. This would help keep different client implementations (e.g., Gemini API client) and their installation steps organized in one place. I’m not sure I’d keep this under examples/.

@stex2005 stex2005 linked an issue Sep 24, 2025 that may be closed by this pull request
@stex2005
Contributor

I linked issue #62 to this PR.
Afterwards, we can close it and reopen it for other APIs.

@stex2005
Contributor

stex2005 commented Sep 26, 2025

image

Installation raises this error; it seems I need a system package: portaudio.

I would recommend adding this to the dependencies in the README.md, along with the command to install it:

sudo apt install portaudio19-dev
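
A preflight check along these lines could surface the missing system package with a readable hint before the client starts. This is only a sketch and not part of the PR: the helper name `portaudio_hint` and the hint text are illustrative assumptions. It uses `ctypes.util.find_library` from the standard library, which only tells us whether a PortAudio shared library is visible on the system; it does not verify that the development headers needed to build a Python audio package are present.

```python
import ctypes.util


def portaudio_hint(find_library=ctypes.util.find_library):
    """Return None if a PortAudio shared library is found, else an install hint.

    portaudio_hint is a hypothetical helper. The find_library parameter is
    injectable only so the check can be exercised without touching the real
    system.
    """
    if find_library("portaudio") is not None:
        return None
    return "PortAudio not found; on Ubuntu try: sudo apt install portaudio19-dev"


if __name__ == "__main__":
    hint = portaudio_hint()
    print(hint or "PortAudio shared library found")
```

Returning a hint string instead of raising keeps the check easy to call from a README-style setup script.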

@stex2005
Contributor

image

I couldn't run uv run on my WSL Ubuntu. Please specify that this works only on Ubuntu; I will try on Ubuntu soon.
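
If the client should warn users on WSL up front, a heuristic like the following might help. It is a sketch under the assumption that WSL kernels report "microsoft" in their release string, which is common but not guaranteed; `is_wsl` is a hypothetical helper name, not something in the PR.

```python
import platform


def is_wsl(release=None):
    """Heuristic WSL check: look for 'microsoft' in the kernel release string.

    is_wsl is a hypothetical helper. The optional release argument lets the
    check be run against a known string instead of the current machine.
    """
    release = platform.uname().release if release is None else release
    return "microsoft" in release.lower()


if __name__ == "__main__":
    if is_wsl():
        print("Running under WSL; this example was only tested on Ubuntu.")
```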

@stex2005
Contributor

@tracelarue @rjohn-v Another good next step would be to provide a dockerized version of the Gemini client, so it can be run more easily in different environments. Since the client is only a tool within this project, I don’t think we should invest too much effort in tightly integrating it into the repo. A simplified version of client_gemini (without audio support) would already be a good, lightweight solution and would fit well in the clients folder.

@stex2005
Contributor

I couldn't run uv run on my WSL Ubuntu. Please specify that this works only on Ubuntu; I will try on Ubuntu soon.

image

I get the same error when I try to run mcp_handler.py.

@tracelarue
Author

@stex2005 Thank you for the review and feedback. I'll work on getting these changes and fixes implemented.

@mokcontoro
Contributor

@tracelarue Wow, voice commands sound super cool. Thanks for your contributions. I can't wait to try this feature!

@tracelarue tracelarue requested a review from stex2005 October 4, 2025 19:14
@tracelarue
Author

@stex2005 Ready for review

Changes made:

  • Removed mcp_config.json; users are now instructed to create their own.
  • Removed pyproject.toml and uv.lock. Dependencies are now installed with uv pip install into the existing ros-mcp .venv.
  • Updated README.md to specify the .env location.
  • Renamed gemini_live.py to gemini_client.py.
  • Specified in README.md that it only works on Ubuntu (try WSL again after these changes).
  • Removed mcp_handler.py in favor of Google's latest method for connecting to MCP servers.
  • Other gemini_client.py improvements and README.md updates.
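
Since the README now points users to a .env file, here is a rough sketch of how a client could read simple KEY=VALUE pairs from one using only the standard library. It is not taken from gemini_client.py: `read_env_file` is a hypothetical helper, and the key name `GOOGLE_API_KEY` in the usage comment is an assumption about what the file contains.

```python
from pathlib import Path


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments.

    read_env_file is a hypothetical helper; it handles only the basic
    .env form and strips one layer of surrounding quotes from values.
    """
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Usage (assumed key name and file location):
#   api_key = read_env_file(".env").get("GOOGLE_API_KEY")
```

Returning a plain dict, rather than writing into os.environ, leaves it to the caller to decide whether the values should affect the process environment.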

Development

Successfully merging this pull request may close these issues.

Add example for running a local/on-prem LLM with MCP