This guide demonstrates how to use the Ubiq-Genie framework to create an application that generates textures based on voice commands. Users also have the option to specify target objects using ray-based selection.
To run this sample, you'll need a computer capable of running Stable Diffusion 2.0. This typically requires a GPU with at least 4 GB of VRAM and 16 GB of system RAM. The sample has been tested on Windows 11 with an NVIDIA GeForce RTX 3080 Ti GPU and 64 GB of RAM. macOS is also supported through PyTorch's MPS backend (requires Apple Silicon).
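To verify which accelerator PyTorch can use on your machine before launching the sample, a quick check like the following can help. This is an illustrative sketch, not part of the Ubiq-Genie codebase; it assumes only that PyTorch may be installed in the active environment and falls back gracefully if it is not.

```python
# Sketch: report which device PyTorch would use for Stable Diffusion.
# All names here are illustrative, not part of the Ubiq-Genie API.
def detect_device() -> str:
    try:
        import torch
    except ImportError:
        return "pytorch-not-installed"
    if torch.cuda.is_available():  # NVIDIA GPU (CUDA)
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():  # Apple Silicon (MPS)
        return "mps"
    return "cpu"  # CPU fallback (very slow for Stable Diffusion 2.0)

if __name__ == "__main__":
    print(f"PyTorch device: {detect_device()}")
```

If this reports `cpu` on a machine with an NVIDIA GPU, check that a CUDA-enabled PyTorch build is installed in the active conda or venv environment.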
Additionally, an Azure Speech Services subscription is required; a free-tier subscription can be created through the Azure portal.
> **Important:** Before proceeding, ensure the Ubiq-Genie framework and the necessary dependencies for this sample are correctly installed. For further details, please see the root README file.
We recommend using VS Code to run and modify the server application, with the `Node` folder as the workspace root. To run the server application:

- Open a terminal and navigate to the `Node/apps/texture_generation` directory. Ensure that your conda or venv environment is activated.
- Execute `npm start texture_generation`, which will guide you through the configuration process, including setting up the server information and the required environment variables. Configuration will only run the first time you start the application. Ensure that you apply the same server configuration to the Unity client (in `Unity/Assets/ServerConfig.asset`).

  If you need to reconfigure the application, you can run `npm start texture_generation configure`. You may also configure the application manually by editing the `config.json` and `.env` files: `config.json` contains the server configuration, while `.env` contains the environment variables for the Azure Speech Services subscription key and region (`SPEECH_KEY` and `SPEECH_REGION`).
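When configuring manually, the `.env` file holds the two Azure variables named above. A minimal example might look like the following; both values are placeholders that must be replaced with your own subscription key and service region:

```
SPEECH_KEY=your-azure-speech-subscription-key
SPEECH_REGION=your-service-region
```

Keep this file out of version control, since the subscription key grants access to your Azure resource.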
- Launch Unity and navigate to the `Unity/Assets/Apps/TextureGeneration` directory. Open the `TextureGeneration.unity` scene.
- Ensure the `Room Client` under the `Network Scene` object has the correct IP address and port for the server. If the server is running on the same machine as the Unity Editor, the IP address should be `localhost`. This should correspond to the configuration on the server side.
- In the Unity Editor, press the `Play` button to launch the application, or deploy it to a VR headset.
- Press and hold the trigger button to activate the voice command system.
- Say a command like "make this look like lava" while pointing at your target object in the scene.
For a demonstration video of this sample, please refer to the Ubiq-Genie demo video.
For any questions or issues, please use the Discussions tab on GitHub or send a message in the ubiq-genie channel in the Ubiq Discord server.