Add an intelligent smart home chat bot to the UI #2995
Comments
Originally posted by @GiviMAD in #2275 (comment) The enhanced ChatGPT binding PR is linked above.
Originally posted by @GiviMAD in #2275 (comment) Fully agreed, though I haven't had a look at the code yet.
In the other thread I brought up HABot because it already seems to render its results using Main UI F7 widgets. If I ask for a single Item, I get that Item's default stand-alone widget. However, if I ask for more than one Item, I don't think I get my default list item widgets. But there has to be something there already. From a usability perspective, if you do not intend to use the semantic model in any way, how does the model know where devices are located? Will there be yet another set of metadata to add to the Items to encode this information?
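For reference, openHAB's existing semantic model already encodes location through group membership and semantic tags, so in principle no extra metadata would be needed if the chat bot relied on it. A minimal sketch of how location is modelled today (all item names here are made up for illustration):

```
// Made-up example: Location -> Equipment -> Point via group membership and semantic tags
Group   gLivingRoom       "Living Room"                          ["LivingRoom"]       // Location
Group   gLivingRoomLight  "Ceiling Light"    (gLivingRoom)       ["Lightbulb"]        // Equipment
Switch  LivingRoom_Light  "Ceiling Light"    (gLivingRoomLight)  ["Switch", "Light"]  // Point
```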
Very excited about all the work @GiviMAD has done on voice (really!), and thanks @florian-h05 for now also diving into this. If I was not already working on the Matter binding and iOS client (plus some myopenhab stuff), this would be my next top priority. About a year ago (err, maybe much longer) I spent several weeks prototyping an integration using an LLM to do basic control of openHAB. I ended up getting distracted and having to move on to other things, but I think there is a TON of opportunity there. I know @rkoshak mentioned this in another thread, but one thing that quickly became apparent is the importance of the semantic model. It's not actually something I use on my home system, so I did not start with it. But..... it became very, very necessary when trying to describe my openHAB setup to an LLM. You quickly end up needing a concept of rooms and equipment, including the room the user is currently in (if they are speaking in the living room, then "lights on" should only turn on those lights). This also helps discard all the non-essential Items that are usually not included in the model (Items for rules, sensors, etc.). And I think keeping an eye on the emerging open-source voice hardware is also super important. I would have no problem dropping my 10+ Alexas if there was a viable on-prem solution through openHAB, even if that means spending a lot more on hardware (including some GPUs). Very excited!
That's kind of interesting - I know HABot is built with Quasar (which uses Vue as well), but I would expect it to be independent from Main UI, as it has been around for longer, IIRC.
I just checked the ChatGPT binding code, and the location is injected into the request prompt together with the Item name, type, label & state.
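Roughly, that kind of prompt construction could look like the following sketch. This is not the binding's actual code, just an illustration of the idea; the type and method names are invented:

```java
// Hypothetical sketch (not the binding's actual code) of flattening Item metadata,
// including location, into the text prompt sent to the LLM.
import java.util.List;
import java.util.stream.Collectors;

record ItemInfo(String name, String type, String label, String state, String location) { }

class ItemPromptBuilder {
    // Produces one line per Item, e.g.: LivingRoom_Light (Switch, "Ceiling Light") in Living Room is ON
    static String describe(List<ItemInfo> items) {
        return items.stream()
                .map(i -> String.format("%s (%s, \"%s\") in %s is %s",
                        i.name(), i.type(), i.label(), i.location(), i.state()))
                .collect(Collectors.joining(System.lineSeparator()));
    }
}
```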
Same for me - I currently have one HomePod Mini that very often does not understand me. It is my only smart speaker, as I have privacy concerns with using Alexa or Google Assistant (mainly speech data being transmitted, as well as my devices being controlled over their cloud), so having TTS and STT locally and only the interpretation of the text in the cloud would be very nice (I am not planning to buy GPUs ;-)). From my experience, the smaller 7B models are pretty usable for everyday stuff (I chat with them locally on my laptop from time to time). Unfortunately, the small Llama models in Ollama either always or never call tools, and the suggested system prompt didn't fix that for me, so I gave up on using a small Llama for the time being (it wouldn't run on "embedded" hardware like a Pi or NAS though, so no chance for me to run it 24/7).
Yeah, I was also switching between local and OpenAI. I ended up on a strategy of continuing with the bigger cloud models for development, making sure I was not using anything in the API that would not be portable to other models. I figured that by the time the functionality was complete, there would be better, smaller models available, they would only improve over time, and we would be able to offer them as a choice at some point. I have not played with the Llama 3.2 models for anything serious yet, but the 3B model seemed promising for such a use case.
Not to keep promoting the Matter binding, but I am now using it for Alexa/Siri integration for everything, which means all control is local; it's just the speech data, as you mention, that's being sent back and forth through the cloud (control is noticeably snappier now). In any case, either my system is getting more complex and harder to process, or their systems are getting less accurate, but I feel like voice accuracy has been declining as of late.
Hello. As an aside, for those looking for a device to power AI locally: I bought an Nvidia Orin Nano some months ago, which can be found right now for over 300 euros. It only has 8 GB of RAM, but it works quite well with the Whisper large turbo model and with Llama 3.2 3B through Ollama. Just in case it matches what you are looking for.

I was looking at the ChatGPT binding PR and I think we could start by adding some classes from there to core (something similar to the ChatFunction and ChatMessage classes) and then also add a new interface that extends HumanLanguageInterpreter (maybe LLMInterpreter) with an interpret method that allows passing the chat messages and the chat tools on each execution. After that, I don't know what the better way to proceed is. The idea that I like most is allowing different agents (we can see an agent as the combination of system prompt + available tools + conversation expiration time + LLMInterpreter). I'll try to explain it with an example: you open the chat UI component and it lets you choose whether you'll talk with the built-in "openhab agent" or with another agent that you have defined yourself, for example with your workout routine in the system prompt and a longer expiration time so it covers the time you need for your training. And I think a future update could be choosing the agent automatically. Do you think something like that makes sense?
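To make that proposal a bit more concrete, here is a rough, hypothetical sketch of what such types could look like. None of these names or shapes are an agreed-upon API; they are only meant to illustrate the "interpret with messages and tools" idea and the agent concept:

```java
// Hypothetical sketch only - names and shapes are up for discussion, not an actual core API.
import java.time.Duration;
import java.util.List;

// Roughly the kind of classes that could move from the ChatGPT binding into core.
record ChatMessage(String role, String content) { }
record ChatFunction(String name, String description, String parametersJsonSchema) { }

// In openHAB core this would presumably extend HumanLanguageInterpreter; it receives the
// full conversation plus the tools the LLM may call on each execution.
interface LLMInterpreter {
    ChatMessage interpret(List<ChatMessage> conversation, List<ChatFunction> tools);
}

// An "agent" as described above: system prompt + available tools + expiration + interpreter.
record Agent(String systemPrompt,
             List<ChatFunction> tools,
             Duration conversationExpiration,
             LLMInterpreter interpreter) { }
```

Under this sketch, the built-in "openhab agent" and any user-defined agents would simply be different Agent instances offered for selection in the chat UI.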
Just chiming in to give some thoughts and context (this could be more relevant to #2275, but this issue is newer) about my journey and what I've tried to achieve and build over the years. For the record, I don't use voice control at all these days - I might have been traumatized :)

It started when I figured I could try to use Snips (since acquired by Sonos), as they seemed to be making really great strides in local STT on cheap SOCs (I didn't want to use more than that for home automation). I added an MQTT broker and a "flows" UI I had built, and the goal was to bring voice control to my living room with an omni mic and a speaker for feedback. It didn't work nearly well enough though, mostly due to the limited local STT and keyword spotting (and perhaps the mic quality, ambient noise, etc.). Amazon Alexa devices were just way ahead on all of these, but I insisted on something local, so cloud-based was a no-no.

So then I made HABot, mostly for the use case of getting a piece of UI relevant to your situation and current need (for example, I could pull my phone out of my pocket and ask "show me the lights in the living room" to get a bunch of switches and sliders and adjust with those, instead of directing by voice, which would fail half of the time, and also instead of navigating through pages of pre-made sitemaps which may not even be what I need). Of course, OpenNLP allowed HABot to have more skills than getting those pieces of UI: it could do stuff like sending various commands to Items, but then you hit the frustrating and sometimes embarrassing outcomes again - like when you ask "turn off the lights in X" and it didn't understand "X", so the intended action simply doesn't happen.

What I ultimately want to say is that I've been burned multiple times by expecting too much from local NLP, but I really do still want it to happen, and LLMs open a new frontier, so I will keep apprised and possibly get involved.
All of this is super interesting (I'll admit I'm behind the curve).
I am thinking of having a smart home chatbot for openHAB 5, a bit like HABot but more intelligent, integrated into Main UI and not limited only to smart home-related stuff.
This would require the following bits:
Unfortunately, anything LLM-based is unlikely to work locally on embedded hardware like Pis or NASes; however, OpenAI's GPT-4o mini is relatively cheap, and I don't have large privacy concerns, as it is only used to interpret a text prompt - it does not have direct access to openHAB.