Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent sound control(userspace) throughout Mycroft software stack. #53

Open
j1nx opened this issue Nov 13, 2019 · 8 comments
Open

Comments

@j1nx
Copy link

j1nx commented Nov 13, 2019

Who is doing what? Using which userspace sound system?

Linux and sound is and has always been a pain in the @$$ if you ask me personally. Sound control within the Mycroft software stack is no exception, although A LOT has been improved lately. However at the moment there are some inconsistencies in who is doing what and by using which userland system.

I would like to open a dialog about this, your ideas and figure out the best way forward.

Information

At this moment we can force mycroft-core to use PulseAudio by changing;

"play_wav_cmdline": "aplay -Dhw:0,0 %1"
"play_mp3_cmdline": "mpg123 -a hw:0,0 %1"

to;

"play_wav_cmdline": "aplay %1"
"play_mp3_cmdline": "mpg123 %1"

This all works fine, and WAV and MP3 are played using the default sink of PulseAudio. However the volume control of mycroft itself is handled by this skill (hence the reason I open the dialog here), BUT is forced to use ALSA and its mixers or nothing at all.

Looking at the code of the skill, there are a couple things that jump out;
https://github.com/MycroftAI/skill-volume/blob/19.08/__init__.py#L26

It it using the enclosure - platform config parameter to push it towards using ALSA or not. However if it not pushed into the ALSA_PLATFORM array, there is no fallback other than telling the mycroft.messagebus about it. This should then be picked up by something else. I believe the enclosure code should listen to the messagebus and do the volume stuff as stated here;
https://github.com/MycroftAI/skill-volume/blob/19.08/__init__.py#L132

The strange thing is, there aren't many enclosures at the moment. You have the mark-1 enclosure, but that one is forced to use ALSA here in the skill. There was no picroft enclosure, although a dialog has been started about it and a initial enclosure has been created with two simple controls for shut down and reboot;
https://github.com/MycroftAI/enclosure-picroft/blob/09488ffa30872b032a4e29cb5ef5355ef1d46400/home/pi/enclosure/PicroftEnclosure.py

But again nothing about the volume and again, it is being pushed into using ALSA within the skill.

The mark-2 has its own enclosure code I believe, however volume control of it is being handled by a skill instead of the enclosure code;
https://github.com/MycroftAI/skill-mark-2-pi/blob/master/__init__.py#L185

I don't know why volume for the mark-2 is handled by directly communication with the I2S bus instead of using ALSA or PulseAudio. Probably because volume control via PulseAudio is not supported.

My idea / opinion

I strongly believe this volume skill should support both ALSA and PulseAudio.
How this could be done in a sane matter is a bit unclear though;

  • Create a PULSE_PLATFORM setting and extend the IF, THEN, ELSE structure with pulsectl functions?
  • Look at the "play_wav_cmdline" and "play_mp3_cmdline" parameters and figure it out from there itself?
  • ...??

Background

Perhaps a bit of backgrounud information about how I come to this topic.
For MycroftOS I am setting up the system to ONLY use PulseAudio. This to have better control about the sinks and sources using the UDEV modules and let PulseAudio kook up straight into the kernel without the need of the ALSA userland system.
This is preperation for other plugins such as resampling and/or AEC Webrtc of PulseAudio.
For MycroftOS I have set the enclosure config parameter to "MycroftOS" and with that parameter set, this volume skill was not doing anything anymore.
Patched this skill to include my enclosure tag into the ALSA_PLATFORM string

For now I am forcing ALL ALSA requests to pulseaudio using the alsa-plugins, which works just fine.
https://github.com/j1nx/MycroftOS/blob/develop/buildroot-external/rootfs-overlay/etc/asound.conf

@j1nx
Copy link
Author

j1nx commented Nov 14, 2019

Thinking about it a bit more, I now believe; "moving the actual hardware handling towards a / the enclosure code" would be the best way forward in my humble opinion.

I believe that was already the way forward, looking at the commented code.

But with that statement, the Mark-2 is the inconsistent one with the volume control via the mark-2-skill.

You could wondering if the whole actual system volume handling, shouldn't be moved to the enclosure code completely and let the skill-volume just handle the requests and feedback, where as the enclosure code is doing the actual changes. A bit similar as the Common Play stuff.

What are your ideas? Original thoughts?

@forslund
Copy link
Collaborator

The idea started with the Mark-2 was to start pushing the volume to the messagebus allowing an enclosure to handle the volume. During the experimentation stages we put the volume updating into the Mark-2 skill so it could be quickly iterated upon, but in the long term it was intended to be moved into the enclosure. It hasn't gotten there due to issues getting a production hardware solution so we could finalize what's needed.

The plan is to remove the direct dependency on alsa from the skill. It's quite clunky to work with but it does work on most Linux systems. (I didn't think you could have pulse without an alsa backend).

We also wish to break out the enclosure code from mycroft-core and making it easier to create standalone enclosures. (Also in the idea phase, but we're starting to create dependent packages such as a mycroft bus client

@penrods
Copy link
Collaborator

penrods commented Nov 14, 2019

One other point of clarification: the Seeed ReSpeaker Mic v2 (like what is being used in the Mark 2 prototype) has no volume control at the system level. The algorithms built in to the mic firmware wants full-volume audio, expecting volume to be controlled after the audio goes out from the mic board's output. This, alsamixer is useless.

The prototype controls the final volume level via an I2C control protocol.

@j1nx
Copy link
Author

j1nx commented Nov 14, 2019

Right, than indeed my believe is on the right path and inline with what you guys wanted; Volume control should be handled by the Enclosure code. I will follow that path then...

@forslund
Like I said, sound and linux has always been a thingy IMHO, but lately I started to hate ALSA userland more and more. It is just very hard to support multiple sound hardware in a sane matter with ALSA userland. It requires a lot of scripting, while PulseAudio works very nicely with UDEV now a days (both detecting the ALSA kernel part and configuring PulseAudio)

To comment on your statement; "I didn't think you could have pulse without an alsa backend". I had the same believe. That confusion comes from that ALSA itself consist of two parts. The kernel part and the userland part. From the pulseAudio documentation;

"PulseAudio is a general purpose sound server intended to run as a middleware between your applications and your hardware devices, either using ALSA or OSS. It also offers easy network streaming across local devices using Avahi if enabled. While its main purpose is to ease audio configuration, its modular design allows more advanced users to configure the daemon precisely to best suit their needs.

Note: Some confusion may occur between ALSA and PulseAudio. ALSA includes a Linux kernel component with sound card drivers, as well as a userspace component, libalsa. PulseAudio builds only on the kernel component, but offers compatibility with libalsa through pulseaudio-alsa."

Anyhow...

I already saw the quick picroft enclosure start;
https://github.com/MycroftAI/enclosure-picroft/tree/feature/picroft-enclosure
I will hook up on that system, creating a mycrofos enclosure and then look at the messagebus, volume control functions of the skill-mark2-pi code.

@penrods
Can't you use PulseAudio to do a similar trick? The PCM output of the RPi is also very "dumb" where you need (DMIX or SoftVol) mixers to control the volume. Perhaps I need to buy one, to add support for it in my own work.

@forslund
Copy link
Collaborator

I agree pulseaudio works very well and can be configured without losing ones mind :)

@j1nx
Copy link
Author

j1nx commented Jun 7, 2020

Would like to bump this issue / discussion.

The whole Mycroft A.I. Software stack is now using PulseAudio as default, however the volume skill still uses Alsa?!?

Shouldn't it be converted to also use PulseAudio? Makes using Mycroft on desktops also less painfull as desktops also use PulseAudio. Then a default Mycroft desktop install is able to control the volume out of the box.

The alsa_platform logic can then be reverted to add device to the "do nothing" logic so enclosure codes can do it's thing (mark2)

@forslund
Copy link
Collaborator

forslund commented Jun 7, 2020

We could definitely test it out at least. It would (hopefully) remove a bunch of the clutter that has been added to handle alsa properly on all different platforms.

If the Mark-1 has issues we can with not too many lines of code move the alsa specific things to the Mark-1 platform. If you don't have something set up I can try to post a wip branch this evening / tomorrow.

@j1nx
Copy link
Author

j1nx commented Jun 7, 2020

I now just forces all alsa to pulse by using alsa-utils pulse addon.

I was looking at using jarbas-utils it's pulseaudio control to be integrated within the enclosure skill/code, but then scratched my head thinking, why...

If this skill uses pulseaudio as the rest of mycroft now, everyone benefits. Mostly all those desktop installs. Almost 100% sure that plasma also uses Pulseaudio, then there is just one volume skill that just handles all of them.

Not really sure how the Mark-1 fits into this though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants