[RFC] Follow idea of immutable /usr vs. mutable overrides in /etc #5

Open
jnpkrn opened this issue Nov 21, 2017 · 18 comments

Comments

@jnpkrn
Contributor

jnpkrn commented Nov 21, 2017

There are many practical reasons to adopt this increasingly popular
scheme while still letting users modify the agents to suit their
needs, for instance:

  • keeping solely static data in /usr allows sharing it as a
    read-only (or sparsely utilized copy-on-write) mount point with
    VMs and containers, which saves space

  • no conflicts when packages are updated

Hence my expectation is that the OCF standard will address this,
presumably in resource-agent-api.md, by replacing

The Resource Agents are located in subdirectories under
/usr/ocf/resource.d.

with something like

The Resource Agents are located in subdirectories under
/usr/ocf/resource.d. An OCF X.Y compliant RM shall first consult
the /etc/ocf/resource.d path for the requested agent, which,
when present, takes precedence in the agent lookup.
This allows convenient customization of existing agents without
altering them at the stated standard location, which in turn
simplifies reverting to the stock configuration, coexistence with
package updates, and possibly locked-down use of the /usr mount
point. The agent lookup is decided by file presence alone,
regardless of any further issue such as the file not being executable.
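
The lookup order in the proposed wording could be sketched roughly as follows. This is only an illustration of the rule, not code from any resource manager: the function name `find_agent` and the `roots` parameter are hypothetical, and the two paths are the ones named in the text above.

```python
from pathlib import Path
from typing import Optional, Sequence

# Search order from the proposed wording: the override tree first, then the
# standard location. Both paths come from the text above.
DEFAULT_ROOTS = ("/etc/ocf/resource.d", "/usr/ocf/resource.d")


def find_agent(provider: str, agent: str,
               roots: Sequence[str] = DEFAULT_ROOTS) -> Optional[Path]:
    """Return the first existing provider/agent file, or None if absent."""
    for root in roots:
        candidate = Path(root) / provider / agent
        # Presence alone decides the lookup. Whether the file is executable
        # is deliberately not checked, so a broken override still wins.
        if candidate.exists():
            return candidate
    return None
```

With this rule, removing the file under /etc is enough to revert to the stock agent, and an override that is present but not executable would fail at run time rather than fall back silently.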

@jnpkrn jnpkrn changed the title [RFC] Follow idea of immutable /usr vs mutable overrides in /etc [RFC] Follow idea of immutable /usr vs. mutable overrides in /etc Nov 21, 2017
@krig
Contributor

krig commented Nov 21, 2017

Sounds good to me. 👍

@kgaillot
Contributor

I'm uncomfortable with putting executables in /etc, and I strongly think users shouldn't reuse the same provider+agent name when modifying an agent, as it greatly complicates troubleshooting.

The currently recommended approach for modifying resource agents is to create a new, custom provider under /usr/lib/ocf/resource.d. I could see extending the standard to allow providers in an alternate location, such as /usr/local, /opt, or /srv (followed by ocf/resource.d), or even allowing an OCF_RA_PATH environment variable. I'm not convinced it's a good idea though, as custom OCF scripts are not any more mutable than the commonly distributed ones. In production, few users are going to modify custom scripts directly; they are going to have a development environment, and then push changes to all production nodes (comparable to updating the resource-agents package).
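
To make the alternate-location idea concrete, here is a minimal sketch of how a provider search path might be assembled. Everything about OCF_RA_PATH here is an assumption: the variable is only floated in this thread, the colon-separated format is borrowed from PATH, and `provider_roots` is a hypothetical name.

```python
import os
from typing import List, Mapping

# Standard provider location as named in this comment.
STANDARD_ROOT = "/usr/lib/ocf/resource.d"


def provider_roots(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return directories to search for providers, standard location first."""
    roots = [STANDARD_ROOT]
    # Assumed format: colon-separated directories, like PATH.
    for entry in env.get("OCF_RA_PATH", "").split(":"):
        # Skip empty entries and duplicates, keeping first-seen order.
        if entry and entry not in roots:
            roots.append(entry)
    return roots
```

Keeping the standard location first means extra directories can add providers but cannot shadow an existing provider+agent name.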

@jnpkrn
Contributor Author

jnpkrn commented Nov 21, 2017 via email

@dmuhamedagic

dmuhamedagic commented Nov 22, 2017 via email

@oalbrigt

And you can already make custom or similarly named directories in /usr/lib/ocf/resource.d/heartbeat to avoid clashing with the agents provided by the distro.

@jnpkrn
Contributor Author

jnpkrn commented Nov 22, 2017 via email

@jnpkrn
Contributor Author

jnpkrn commented Nov 22, 2017 via email

@krig
Contributor

krig commented Nov 23, 2017

The main benefit as I see it would be enabling the sysadmin to add their own agents on top of a read-only /usr file system delivered by a transactional update mechanism.

@jnpkrn
Contributor Author

jnpkrn commented Nov 27, 2017 via email

@jnpkrn
Contributor Author

jnpkrn commented Nov 27, 2017

Thinking about that, /etc/ocf should indeed be subdir-namespaced per
resource manager, possibly reserving a chosen name (ANY?) that applies to all.

@krig
Contributor

krig commented Nov 27, 2017

> The other practical value is that administrator would (one wants to
> say, finally) gain power to defuse OCF-based resources

I'm sorry, but I don't understand this argument at all. Why is the administrator trying to prevent the administrator from configuring resources?

I also don't recall any actual argument for why the anything agent is problematic...

@jnpkrn
Contributor Author

jnpkrn commented Nov 27, 2017 via email

@krig
Contributor

krig commented Nov 27, 2017

Yeah, I think I follow what you're saying. Of course the apache agent might not be the best example to allow when trying to avoid privilege escalation, since it can be trivially configured to execute arbitrary executables. Though that might be an argument for fixing apache. ;)

@kgaillot
Contributor

> There's a bunch of executable glue scripts already (including /etc/rc.d/init.d ones for non-systemd systems), which is exactly what resource agents are meant to be. I see no conflict here.

While there are common existing cases of executables under /etc, they are exceptions, not the rule. System administrators expect /etc to contain configuration, and executables to be located elsewhere, except in unusual cases.

I believe this is recommended in the LSB, with good reason. For example, resource agents do not necessarily need to be scripts; they can be compiled, but /etc is architecture-independent.

> On the other hand, let's not fall into the fallacy that current situation is a breeze in the "which agent variant was run, exactly" matter, at least with pacemaker in particular:

The main goal is whether enterprise support personnel can reasonably determine whether a particular agent is supported, not the exact agent code used. If the user can override an OS-provided agent, extra steps must be taken with every support case to check whether that has happened. The current recommendation of using a different provider name makes it immediately clear.

Also, the provider name is intended to indicate who provided the agent. If a custom script reuses a provider name, it obscures that indicator. The current recommendation of using a different provider name when modifying a script makes it clear where the agent came from.

> The main benefit as I see it would be enabling the sysadmin to add their own agents on top of a read-only /usr file system delivered by a transactional update mechanism.

I don't believe this accomplishes that. When users modify or create resource agents, they typically get them working, then rarely or never touch them again. They tend to change less frequently than OS-supplied resource agents. Custom agents don't prevent /usr from being read-only any more than OS-supplied ones do. In either case, there has to be a mechanism to temporarily make /usr writeable during updates.

Even if a non-/usr location is perceived to be desirable, I would argue for using a custom provider name, and have the non-/usr location be where to look for additional providers.

> I'm sorry, but I don't understand this argument at all. Why is the administrator trying to prevent the administrator from configuring resources?
>
> I also don't recall any actual argument for why the anything agent is problematic...

I agree. Disabling particular resource agents is no different than disabling particular binaries provided with any other package. If someone wishes to disable unused resource agents, likely they want to disable unused binaries from other packages as well, and already have a generic mechanism for doing so.

Also, this is a security risk, not a mitigation. Being able to write a script into /etc that is automatically run as root without having to touch the pacemaker configuration destroys any security gained by mounting /usr read-only. And I can't imagine any scenario where a security compromise that allows an unused OCF agent as a vector doesn't have an easier vector elsewhere. Pacemaker runs as root and can run arbitrary executables. They don't have to be in the OCF agent directory.

Regarding non-production agents such as Dummy, anything, etc., it is up to each distribution to decide which agents are installed by which packages. For example, RHEL already removes some agents distributed upstream. Any distribution could move such agents to a resource-agents-testing package, for example, or create a separate package for each resource agent, allowing users to install only the ones they need. Similarly, users who compile their own can build packages as they like.

Bottom line, I could see some value in having alternate locations for providers, but I think users should be shepherded into using a unique provider name if they modify or create an agent.

@jnpkrn
Contributor Author

jnpkrn commented Nov 27, 2017 via email

@jnpkrn
Contributor Author

jnpkrn commented Nov 27, 2017 via email

@kgaillot
Contributor

> > While there are common existing cases of executables under /etc,
> > they are exceptions, not the rule.
>
> This is then a subjectively inferred rule, not a given fact.
> And I am not cheered when that's used as a base to naysay what
> I believe is a good, versatile mechanism.

From http://refspecs.linuxbase.org/FHS_2.3/fhs-2.3.html#PURPOSE6 :

"The /etc hierarchy contains configuration files. A 'configuration file' is a local file used to control the operation of a program; it must be static and cannot be an executable binary."

The existence of exceptions to this is simply a result of decades of organic growth, before any standards existed (even POSIX). The subjectiveness of system administrators' expectation that /etc does not normally contain executables does not reduce the legitimacy of the expectation. Following common expectations, even loosely subjective ones, helps system administrators do their jobs.

> You are talking about the happy cases (users caring about what was written on the topic, etc.) while I am about the pessimistic scenarios. And for these, there's next to no difference, except that with proper tooling support, it may be immediately clear that /etc override is what's in use (cf. hidden in-situ changes in the agents).

If a user directly modifies a script deployed by an OS package, the next OS update of that package will overwrite it. That's an effective enforcement mechanism that quickly educates anyone who didn't pay attention to the documentation.

If an administrator or someone else troubleshooting a cluster problem wants to look at the resource agent code, they're going to go to the standard location first. If the behavior doesn't fit the code they see, they'll just get confused. There won't be any obvious indication that there's an override.
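
For the sake of discussion, the kind of tooling check being debated could look something like this: list every provider+agent pair that exists in both trees. This is a sketch under assumptions, not an existing pacemaker feature; `shadowed_agents` is a hypothetical name, and the caller supplies both directory roots.

```python
from pathlib import Path
from typing import List, Tuple


def shadowed_agents(override_root: str, stock_root: str) -> List[Tuple[str, str]]:
    """List (provider, agent) pairs present in both the override and stock trees."""
    shadowed: List[Tuple[str, str]] = []
    override = Path(override_root)
    if not override.is_dir():
        return shadowed
    for provider_dir in sorted(override.iterdir()):
        if not provider_dir.is_dir():
            continue
        for agent in sorted(provider_dir.iterdir()):
            stock = Path(stock_root) / provider_dir.name / agent.name
            # Only a file that also exists in the stock tree is an override;
            # a file found only under the override root is a new agent.
            if agent.is_file() and stock.exists():
                shadowed.append((provider_dir.name, agent.name))
    return shadowed
```

Someone would still have to know to run such a check on every support case, which is the extra step at issue here.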

> What I had in mind, say, a bunch of VMs could share the same /usr so as to save space.

That's feasible regardless of where custom agents are, and regardless of whether users can override an existing provider or require a unique provider name.

> > Also, this is a security risk, not a mitigation. Being able to write
> > a script into /etc that is automatically run as root without having
> > to touch the pacemaker configuration destroys any security gained by
> > mounting /usr read-only.
>
> I don't follow. How is a normal user privileged to write to /etc?

The point of mounting /usr read-only is to disallow root from writing to it. The vulnerability is to exploits that allow only writing files as root, as opposed to full shell access. If the attacker can replace a common command with a trojan, it will end up being executed. Allowing that same attacker to write an OCF override to /etc, and having pacemaker automatically run it without any configuration change required, provides a way around a read-only /usr.

Existing scripts under /etc could be attacked in the same way, which is a good reason why they shouldn't be there, and are there only for historical reasons. From a security standpoint, mounting /usr read-only is stronger when paired with all other filesystems being mounted ro and/or noexec. As an example, Gentoo recommends mounting /etc read-only as well, with symlinks for files that need to be updated:

https://wiki.gentoo.org/wiki/Filesystem/Security#Mount_options

The bottom line from a security standpoint is that all executables should be on read-only partitions, otherwise the protection is only partial. (This is one reason this is not a common setup.)
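
As a rough illustration of that point, the following sketch reports mount points that are neither read-only nor noexec, given text in the Linux /proc/mounts format. The function names and the sample data are made up for the example, and it does not handle the octal escapes /proc/mounts uses for spaces in mount points.

```python
from typing import Dict, List, Set

# Made-up sample in /proc/mounts format: device, mount point, type, options.
SAMPLE = """\
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sda2 /usr ext4 ro,relatime 0 0
/dev/sda3 /etc ext4 rw,noexec,relatime 0 0
tmpfs /tmp tmpfs rw,nosuid,nodev 0 0
"""


def mount_options(mounts_text: str) -> Dict[str, Set[str]]:
    """Map each mount point to its set of mount options."""
    options: Dict[str, Set[str]] = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4:
            # The fourth field holds the comma-separated options.
            options[fields[1]] = set(fields[3].split(","))
    return options


def writable_exec_mounts(mounts_text: str) -> List[str]:
    """Return mount points where files can be both written and executed."""
    return sorted(
        point for point, opts in mount_options(mounts_text).items()
        if "ro" not in opts and "noexec" not in opts
    )
```

On the sample data, `writable_exec_mounts(SAMPLE)` returns `['/', '/tmp']`: /usr is excluded because it is read-only and /etc because it is noexec. On a real system the input would be the contents of /proc/mounts.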

@jnpkrn
Contributor Author

jnpkrn commented Nov 28, 2017 via email
