Update voice assistant LP with multimodal functionality #2368
base: main
Thanks for upgrading the Learning Path, Nina. It looks great overall!
KleidiAI simplifies development by abstracting away low-level optimization: developers can write high-level code while the KleidiAI library selects the most efficient implementation at runtime based on the target hardware. This is possible thanks to its deeply optimized micro-kernels tailored for Arm architectures.

As newer versions of the architecture become available, KleidiAI becomes even more powerful: simply updating the library allows applications like the Voice Assistant to take advantage of the latest architectural improvements - such as SME2 - without requiring any code changes. This means better performance on newer devices with no additional effort from developers.
For example, "multi-modal Voice Assistant": just rephrase things to reflect the latest multi-modal enablement.
thanks, done
- Kotlin
- C++
operatingsystems:
- Android
Should we say that the multi-modal voice assistant example is based on Android, but that for benchmarking the individual building blocks we support multiple operating systems (and list them)?
https://gitlab.arm.com/kleidi/kleidi-examples/large-language-models
```

and build for various platforms:
And build for various platforms to test the individual building blocks. For the full pipeline, the example runs on Android.
Thanks Gemma. I've updated this in two places, for the STT and LLM modules, to say that you can build for various platforms to benchmark the functionality independently. I also updated the last line on this page to mention that the example is based on Android. Hopefully this clarifies things, but I'm happy to make more changes if anything is unclear.
* added update for multi-modal use
* updated learning objectives and overview pages to specify supported platforms
Before submitting a pull request for a new Learning Path, please review Create a Learning Path
Please do not include any confidential information in your contribution. This includes confidential microarchitecture details and unannounced product information.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the Creative Commons Attribution 4.0 International License.