This repository contains the Python and Unity code for a paper titled "Mitigating Response Delays in Free-Form Conversations with LLM-powered Intelligent Virtual Agents" to appear in the Proceedings of the 7th ACM Conference on Conversational User Interfaces (CUI '25). If you use this code or Unity environments in your research, please cite our paper (see Citation section below).
A list of licenses for third-party code and assets used in this project can be found in the ASSET_LICENSES.md file. All scenes are located in iva-cui-unity/Assets/Scenes/:
- `City_Scene.unity` -> Scenario 1
- `Hotel_Scene.unity` -> Scenario 2
- `Museum_Scene.unity` -> Scenario 3
- Unity version: 2022.3.21
- Run the Python backend before running the Unity scenes.
- Both VR and Desktop (non-VR) modes are supported; follow the instructions in the Desktop Mode and VR Mode sections below.
- To speak with agents, toggle the mic on before you speak and toggle it off after (see Controls). Set the microphone on the `SceneControls` gameobject in the scene hierarchy (see the screenshot below and the Desktop Mode and VR Mode sections).
- Agents respond after a short delay. If no agent can hear you, or an agent is currently thinking or speaking, you will hear a broken-mic sound.
### Desktop Mode

- Enable the `WASD Player` gameobject in the hierarchy.
- Disable the `XR Interaction Setup` gameobject in the hierarchy.
- On the `SceneControls` gameobject, set a working microphone.
### VR Mode

- Enable the `XR Interaction Setup` gameobject in the hierarchy.
- Disable the `WASD Player` gameobject in the hierarchy.
- On the `SceneControls` gameobject, set the microphone to `Oculus Virtual Audio Device` (or your device's equivalent).
### Controls

| Action | VR Mode | Desktop Mode |
|---|---|---|
| Toggle microphone | A | M |
| Move | Left Stick | WASD |
| Look around | Right Stick | Mouse |
| Sprint | – | Left Shift |
| Interact with objects | Side Trigger (Grab) | – |
The backend (which we also call "middleware") handles requests from Unity, processes audio files, and interacts with the LLM server. It is located in the iva-cui-backend directory.
The outcome of following these instructions should be (a quick way to verify is sketched after the list):

- A local LLM server running on port `8082` (or `11434` for Ollama)
- A local ASR server running on port `8083`
- A local Python middleware server running on port `8000`
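To confirm all three servers are up, you can probe the ports from the list above. A minimal sketch (localhost and the Ollama port swap are assumptions; adjust as needed):

```python
import socket

# Port numbers from the list above; use 11434 for the LLM entry if you
# talk to Ollama directly.
SERVERS = [("LLM", 8082), ("ASR", 8083), ("middleware", 8000)]

for name, port in SERVERS:
    with socket.socket() as s:
        s.settimeout(1.0)
        up = s.connect_ex(("127.0.0.1", port)) == 0
    print(f"{name} server on port {port}: {'up' if up else 'down'}")
```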
By default, the backend runs using Ollama, which we recommend; however, OpenAI API-style LLM server endpoints and locally deployed options (llamafile and LM Studio) are also supported. LLM API endpoints are specified in iva-cui-backend/python_middleware/llm_backends.py. To switch to OpenAI-style endpoints, change the `LLM_BACKEND` variable in iva-cui-backend/python_middleware/app.py.
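For reference, the values used in the steps below all go through that single variable. A sketch of the assignment (the exact surrounding code in app.py may differ):

```python
# In iva-cui-backend/python_middleware/app.py (sketch):
LLM_BACKEND = "ollama"  # or "llamafile_llama3", "openai_4", "openai_4mini"
```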
- Download and install Ollama.
- Run `ollama run llama3.1:8b-instruct-q5_K_M` (a quick check that the model responds is sketched below).
- Set the `LLM_BACKEND` variable in iva-cui-backend/python_middleware/app.py to `ollama`.
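To verify that Ollama is serving the model outside of the middleware, a minimal check with the `ollama` Python package (installed during the middleware setup below) could look like this; the prompt is just a placeholder:

```python
import ollama

# Ask the model pulled above for a one-line reply via the local Ollama
# server (default port 11434).
resp = ollama.chat(
    model="llama3.1:8b-instruct-q5_K_M",
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
)
print(resp["message"]["content"])
```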
- Download, install, and run LM Studio.
- Download the model `lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF/Meta-Llama-3.1-8B-Instruct-Q5_K_M.gguf`.
- Set the UI mode to "Developer" or "Power User" (bottom left corner).
- Go to the "Developer" tab -> Settings -> Server Port and set it to `8082`.
- Start the server by toggling the switch in the top left corner.
- Set the `LLM_BACKEND` variable in iva-cui-backend/python_middleware/app.py to `llamafile_llama3`.
- Download llamafile-0.9.0.
- Rename `llamafile-0.9.0` to `llamafile-0.9.0.exe`.
- Download `Meta-Llama-3.1-8B-Instruct-Q5_K_M.gguf` from Hugging Face.
- Run `llamafile-0.9.0.exe --server -ngl 9999 -m Meta-Llama-3.1-8B-Instruct-Q5_K_M.gguf --host 0.0.0.0 --port 8082`.
- Set the `LLM_BACKEND` variable in iva-cui-backend/python_middleware/app.py to `llamafile_llama3` (a quick request against the local server is sketched below).
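Both LM Studio and llamafile serve an OpenAI-compatible endpoint on port `8082`, so you can verify either server with the `openai` package before starting the middleware. A sketch (some local servers ignore the model name, and the API key is unused):

```python
from openai import OpenAI

# Point the client at the local OpenAI-compatible server on port 8082.
client = OpenAI(base_url="http://localhost:8082/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="Meta-Llama-3.1-8B-Instruct-Q5_K_M",
    messages=[{"role": "user", "content": "Reply with one short sentence."}],
)
print(resp.choices[0].message.content)
```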
- Create the file mentioned in the `load_openai_key()` function in iva-cui-backend/python_middleware/llm_backends.py and put your OpenAI API key there. The file should contain only the key, no other text. Alternatively, modify that function to load the key from an environment variable (a sketch follows this list). You can also make the function return the key directly in code (not recommended).
- Set the `LLM_BACKEND` variable in iva-cui-backend/python_middleware/app.py to `openai_4` or `openai_4mini`. You can also use other models by directly setting `model="gpt-4o"` in the appropriate class in the iva-cui-backend/python_middleware/llm_backends.py file.
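If you prefer the environment-variable route, a minimal replacement for `load_openai_key()` might look like this (a sketch; the variable name `OPENAI_API_KEY` is an assumption, not something the repository prescribes):

```python
import os

def load_openai_key() -> str:
    # Read the key from the environment instead of a key file.
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key
```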
Set up and run the Python middleware (Windows commands shown):

```
# create and activate a virtual environment
python -m venv venv
venv\Scripts\activate

# install the required packages
pip install openai ollama edge-tts FastAPI[all]

# navigate to the directory and run the server
cd iva-cui-backend\python_middleware
uvicorn app:app --reload
```
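Once uvicorn is running, you can confirm the middleware is reachable; FastAPI serves interactive API docs at `/docs` by default (assuming the app does not disable them):

```python
import urllib.request

# Expect HTTP 200 from the default FastAPI docs page on port 8000.
with urllib.request.urlopen("http://127.0.0.1:8000/docs") as resp:
    print(resp.status)
```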
Set up and run the ASR server (Linux commands shown):

```
# create a virtual environment
sudo apt update
sudo apt install python3-venv
python3 -m venv venv

# activate the virtual environment
source venv/bin/activate

# install the required packages
pip install nvidia-cublas-cu12 nvidia-cudnn-cu12==9.*
export LD_LIBRARY_PATH=`python -c 'import os; import nvidia.cublas.lib; import nvidia.cudnn.lib; print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__))'`
pip install faster_whisper FastAPI[all]

# navigate to the directory and run the ASR server
cd iva-cui-backend/transcription_server
python whisper_server.py
```
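The ASR server is built on faster-whisper. Independent of whisper_server.py, a minimal transcription call looks like this (a sketch; the model size, device, and audio path are placeholders rather than the server's actual settings):

```python
from faster_whisper import WhisperModel

# Load a Whisper model onto the GPU; the cuBLAS/cuDNN wheels installed
# above provide the CUDA libraries faster-whisper needs.
model = WhisperModel("base", device="cuda", compute_type="float16")

segments, info = model.transcribe("sample.wav")
for seg in segments:
    print(f"[{seg.start:.2f}s -> {seg.end:.2f}s] {seg.text}")
```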
To test the backend without Unity, run the included test script:

```
cd iva-cui-backend\python_middleware
python test_conv.py
```
### Authors

Mykola Maslych, Mohammadreza Katebi, Christopher Lee, Yahya Hmaiti, Amirpouya Ghasemaghaei, Christian Pumarada, Janneese Palmer, Esteban Segarra Martinez, Marco Emporio, Warren Snipes, Ryan P. McMahan, Joseph J. LaViola Jr.
### Citation

If you use this code in your research, please cite our paper:
```bibtex
@inproceedings{Maslych2025Mitigating,
  author = {Maslych, Mykola and Katebi, Mohammadreza and Lee, Christopher and Hmaiti, Yahya and Ghasemaghaei, Amirpouya and Pumarada, Christian and Palmer, Janneese and Segarra Martinez, Esteban and Emporio, Marco and Snipes, Warren and McMahan, Ryan P. and LaViola Jr., Joseph J.},
  title = {Mitigating Response Delays in Free-Form Conversations with LLM-powered Intelligent Virtual Agents},
  year = {2025},
  isbn = {9798400715273},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3719160.3736636},
  doi = {10.1145/3719160.3736636},
  booktitle = {Proceedings of the 7th ACM Conference on Conversational User Interfaces},
  articleno = {49},
  numpages = {15},
  month = {jul},
  series = {CUI '25},
  location = {Waterloo, ON, Canada},
}
```