Poor VAD end with background noise #113

dslugPX · 2023-05-26T12:46:49Z

As mentioned in issue 112 we have a relatively noisy environment as we have music running 24/7.

We have noticed that Willow sometimes seems to be listening to the music in addition to our voices.
I believe it may be causing some of the issues mentioned in 112, in particular scenario 3.

I have also seen it (only once) pick up what I think was a drum beat as the command "No no no".

Happy to help by providing whatever I can for you.

Also - should mention I have two ESP32s in flight now, and one more in a box still so I can certainly try some different settings and the like as well.

Cheers!

kristiankielhofner · 2023-05-26T13:55:44Z

Thank you for filing separate issues, we'll be addressing them in commits for you to test with later today.

As I've noted previously, of all of the reports we are getting you seem to be having the most usability issues. It's reassuring to us that even with these initial and very early problems your experience is still positive enough to order more devices!

dslugPX · 2023-05-26T14:09:42Z

Oh yeah, this is great. I'll buy one more once stock is high again for these too. Gonna put one outside too! Be REALLY nice to control stuff with voice while in the pool (we live in AZ so summer time is spent in water if we are in the yard, though it will be interesting to see how they do in the heat here, eeek).

I'm presuming I'm seeing more issues for three reasons:

1. My network is a complete mess. It's cobbled together with a half dozen unmanaged switches in addition to a few different mesh network endpoints and some VLAN chaos that's probably not helping, at all.
2. I'm probably the worst (or best, dunno) kind of tester for you right now. Someone willing to kind of muddle through, but not 100% certain what they are doing. I mean, I'm well versed in parts of this, but for others I just followed a shit ton of tutorials to get something running and didn't even really try to retain much. I think, so far, I've done things the way you would expect though.
3. I have tinnitus, and bad. As such, and as you know by now, we keep music on constantly because it helps drown out some of the constant ringing. I have to presume this is a factor, but you seem confident it can be dialed in, and I have no reason whatsoever to doubt you.

We're happy to help knock these things out, as once you get the easy install script stuff done, you are going to have a LOT more users like me around. And my wife is simply over the moon with how easy it is for me to make her little things she can do now. I mean, over the moon.

kristiankielhofner · 2023-05-26T14:26:09Z

Wow, yeah... Now that I'm hearing about the network situation I suppose I'm even happier Willow works as well as it does, especially for hardware that is 2.4 GHz only. Do you have any plans to address some of that? I wouldn't ask you to do it for Willow - after all, we aim to be the best speech solution in the world and it's good to know it's being used in environments that are... Let's just go with "suboptimal" for a wireless network connected speech recognition device. Don't take this the wrong way but from the sounds of it someone couldn't purposely design more of an environmental nightmare for a solution like Willow ;). I'm almost surprised it works at all.

I'm very sorry to hear you have severe tinnitus. I don't have it myself but from what I understand it's dramatically life-impacting.

Yes, background noise is always a challenge. The ESP BOX and the various libraries do (IMO) a very good job with it but at the end of the day you can start to run out of magic. That said we have plenty of knobs to tweak and we'll get the full set to you later today.

dslugPX · 2023-05-26T14:48:31Z

Well, I went with a little hyperbole there. I have wired backhaul to each access point and the 2.4 GHz network has its own VLAN. But it's definitely in the ballpark of "messy but good enough most of the time" :)

I always have plans to make everything better. But who knows when that will happen on the network side. It was a nearly 6 month journey to get whole home audio and video working perfectly. Audio was sort of easy, but once you add in keeping things in sync with video too, it gets dicey fast and it took a while. That's when a lot of "I'll just run ethernet through this closet and add a switch" kind of crap came into play. Sonos would have done the trick, but then I'd have Sonos quality sound (sub par) with easy controls. Now I have good sound and easy controls. In fact the very first commands Willow ever used were: Switch to Music and Switch to TV.

Anyway, if I put a server in to run WIS, I'll move everything else over to it, and that will take a ton of the weird routes on the network out of the mix, instead of Plex here, HA there, and so forth and so on. (PopOS looks perfect by the way.)

Anyway - our house is for sure a good example of the kind of "real world" folks you're ultimately targeting. Perhaps an extreme one even :) Just hope it's not too early for you to begin dealing with this stuff. And if it is, *please* don't put a ton of effort into helping me specifically, you have much more important stuff to work on, but I sure don't mind giving you feedback so you have it.

OK... Day job time!

stintel · 2023-05-30T14:30:00Z

My apartment is also relatively noisy and I too run into AUDIO_REC_VAD_END not triggering. With 7da6d73 we force the stream to end after CONFIG_WILLOW_STREAM_TIMEOUT seconds, which avoids an endless stream; setting it to 5 works around the problem somewhat.
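
For readers following along, the forced-timeout idea can be sketched roughly as below. This is only an illustration and not the code from 7da6d73: `STREAM_TIMEOUT_S` stands in for CONFIG_WILLOW_STREAM_TIMEOUT, and `stream_action_t` and `stream_check` are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for CONFIG_WILLOW_STREAM_TIMEOUT (seconds). */
#define STREAM_TIMEOUT_S 5

typedef enum {
    STREAM_CONTINUE,    /* keep sending audio */
    STREAM_END_VAD,     /* VAD reported end of speech */
    STREAM_END_TIMEOUT  /* VAD never fired, give up after the timeout */
} stream_action_t;

/* Decide what to do with an active audio stream.
 * start_ms and now_ms are monotonic timestamps in milliseconds;
 * vad_end_seen is true once the recorder has reported end of speech. */
stream_action_t stream_check(int64_t start_ms, int64_t now_ms, bool vad_end_seen)
{
    assert(now_ms >= start_ms);

    if (vad_end_seen) {
        return STREAM_END_VAD;
    }
    if (now_ms - start_ms >= (int64_t)STREAM_TIMEOUT_S * 1000) {
        return STREAM_END_TIMEOUT;
    }
    return STREAM_CONTINUE;
}
```

With background music the VAD end event may never arrive, so in this sketch the timeout branch is what eventually closes the stream.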

Last night I wondered if reducing the mic gain would help in noisy environments, so I added a Kconfig option to set mic gain in 00f0d1b. Could you please test if reducing the mic gain helps in noisy environments? I'm currently travelling so can't test myself.
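
To make the reasoning concrete, here is a toy model of why lower gain might help. It assumes speech detection compares a frame's level against a fixed threshold, which may not match how the real VAD behaves (hence the request to test). `apply_gain` and `frame_level` are hypothetical helpers for illustration, not functions from Willow or from 00f0d1b.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scale 16-bit PCM samples in place by gain_num/gain_den, saturating
 * at the int16_t limits instead of wrapping around. */
void apply_gain(int16_t *pcm, size_t n, int32_t gain_num, int32_t gain_den)
{
    assert(gain_num >= 0 && gain_den > 0);

    for (size_t i = 0; i < n; i++) {
        int64_t v = ((int64_t)pcm[i] * gain_num) / gain_den;
        if (v > INT16_MAX) v = INT16_MAX;
        if (v < INT16_MIN) v = INT16_MIN;
        pcm[i] = (int16_t)v;
    }
}

/* Mean absolute amplitude of a frame, a crude loudness measure. */
int32_t frame_level(const int16_t *pcm, size_t n)
{
    assert(n > 0);

    int64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += pcm[i] < 0 ? -(int32_t)pcm[i] : (int32_t)pcm[i];
    }
    return (int32_t)(sum / (int64_t)n);
}
```

In this model, halving the gain halves the level of everything the mic hears, so quiet background music can drop below a fixed threshold while nearby speech stays above it. A level-adaptive detector would not necessarily behave this way.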

kristiankielhofner · 2023-06-05T11:47:08Z

@dslugPX As shown in the commit reference I also just added a parameter exposed under "Advanced Configuration" to configure the "aggressiveness" of VAD - higher values mean it will be more selective in considering what constitutes speech. In my initial testing VAD_MODE_4 (most aggressive) helps with this issue, but you may want to play with the various levels in your environment.
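
As a rough illustration of what "aggressiveness" means here, the toy detector below treats a frame as speech only if its level clears a per-mode threshold, and declares end of speech after a few consecutive non-speech frames. Everything in it is an assumption made for the example: the `toy_` names, the threshold values, and the hangover length are invented and are not taken from the actual VAD implementation or from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical modes, loosely mirroring "0 = least, 4 = most aggressive". */
typedef enum {
    TOY_VAD_MODE_0,
    TOY_VAD_MODE_1,
    TOY_VAD_MODE_2,
    TOY_VAD_MODE_3,
    TOY_VAD_MODE_4
} toy_vad_mode_t;

/* Illustrative per-mode level thresholds; not from any real VAD. */
static const int32_t toy_threshold[] = {200, 400, 600, 800, 1000};

/* Consecutive non-speech frames required before declaring end of speech. */
#define TOY_HANGOVER_FRAMES 3

/* Return the index of the frame at which end of speech is declared,
 * or -1 if it is never declared within the n frame levels given. */
int toy_find_speech_end(const int32_t *levels, size_t n, toy_vad_mode_t mode)
{
    assert((int)mode >= 0 && (int)mode <= (int)TOY_VAD_MODE_4);

    bool in_speech = false;
    int quiet = 0;

    for (size_t i = 0; i < n; i++) {
        if (levels[i] >= toy_threshold[mode]) {
            /* Frame counts as speech: reset the quiet counter. */
            in_speech = true;
            quiet = 0;
        } else if (in_speech && ++quiet >= TOY_HANGOVER_FRAMES) {
            return (int)i;
        }
    }
    return -1;
}
```

Given three loud speech frames followed by steady music at level 500, the least aggressive mode keeps classifying the music as speech and never ends, while the most aggressive mode ends three frames after the speech stops. That is the behavior this issue is about, in miniature.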

dslugPX · 2023-06-07T21:40:47Z

Nice. I'll have a little time coming up in the next few days to do some updates and try a few more things.
We are still using this daily and trying to note things we are finding. Drums are definitely a source of trouble, but I only did the one update since we put them online so most of your more interesting changes aren't in use yet.
Will follow up again soon!

btw - a bunch of ESP32 boxes hit Adafruit this afternoon, so I'm guessing you will have a new run of users coming at you soon!

kristiankielhofner · 2023-06-08T15:30:02Z

Thanks, appreciate it!

Yep, we saw a bunch come into Mouser too!

kristiankielhofner added a commit that referenced this issue Jun 5, 2023

Expose VAD aggressiveness parameters in Advanced Configuration #113

8dc4167

kristiankielhofner added a commit that referenced this issue Jun 15, 2023

Expose VAD aggressiveness parameters in Advanced Configuration #113

ce7a086

kristiankielhofner added the audio, dynamic configuration, and 1.0 labels Jun 16, 2023
