Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rtld: Always check LD_CHERI/LD_64 evironment variables #874

Draft
wants to merge 2 commits into
base: dev
Choose a base branch
from

Conversation

arichardson
Copy link
Member

With this change a pure-capability RTLD will check LD_CHERI_* and if those
are not set fall back to checking LD_*. This makes it possible to have
scripts that set variables such as LD_64_LIBRARY_PATH and work on both
a non-CHERI 64-bit system and a pure-capability world with a compat64
rtld. Without this change we would have to check what the native ABI is
first and then either set LD_64_LIBRARY_PATH (purecap) or LD_LIBRARY_PATH
(hybrid/non-CHERI).

If you agree this makes sense, I'll try to upstream this so that we can
also use LD_64_LIBRARY_PATH for upstream FreeBSD.

With this change a pure-capability RTLD will check LD_CHERI_* and if those
are not set fall back to checking LD_*. This makes it possible to have
scripts that set variables such as LD_64_LIBRARY_PATH and work on both
a non-CHERI 64-bit system and a pure-capability world with a compat64
rtld. Without this change we would have to check what the native ABI is
first and then either set LD_64_LIBRARY_PATH (purecap) or LD_LIBRARY_PATH
(hybrid/non-CHERI).

If you agree this makes sense, I'll try to upstream this so that we can
also use LD_64_LIBRARY_PATH for upstream FreeBSD.
This removes all remaining warnings when building RTLD.
Copy link
Collaborator

@bsdjhb bsdjhb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had some crazy hacks at Y! though in a bit of a different direction (the 32-bit libc would rewrite attempts to setenv() LD_FOO to LD_32_FOO so that applications that set LD_FOO variables still worked when run under compat32).

I think that you want to only check the fallback for the "native" ABI. That is, if any COMPAT_FOO is set you want to sabotage LD_FALLBACK (so COMPAT_CHERI for a ld-elf-cheri on a hybrid world, or COMPAT_64BIT for ld-elf64 on a purecap world). Only ld-elf.so.1 should check LD_*.

I would also be tempted to make the LD_ the "fallback" that is checked second in the native linker with LD_* having precedence, but that's probably a question for the upstream review. I can see arguments both ways.

@arichardson
Copy link
Member Author

We had some crazy hacks at Y! though in a bit of a different direction (the 32-bit libc would rewrite attempts to setenv() LD_FOO to LD_32_FOO so that applications that set LD_FOO variables still worked when run under compat32).

I think that you want to only check the fallback for the "native" ABI. That is, if any COMPAT_FOO is set you want to sabotage LD_FALLBACK (so COMPAT_CHERI for a ld-elf-cheri on a hybrid world, or COMPAT_64BIT for ld-elf64 on a purecap world). Only ld-elf.so.1 should check LD_*.

I would also be tempted to make the LD_ the "fallback" that is checked second in the native linker with LD_* having precedence, but that's probably a question for the upstream review. I can see arguments both ways.

I originally wanted to use that approach. However, the problem I see with not doing it for COMPAT_* is that it can break 3rd-party non-CHERI software on a purecap CHERI system:
I've seen a few Linux programs that bundle specific versions of libraries and are launched using a wrapper scripts that add the lib/ directory with LD_LIBRARY_PATH. This would no longer work if your world is purecap then since it will only consider LD_64_* and no longer load the libraries. My guess would be that this also exists on FreeBSD.

@jrtc27
Copy link
Member

jrtc27 commented Dec 30, 2020

Yeah in the GNU world there is no LD_32 etc it's just LD. The current FreeBSD approach is more in keeping with how it does ABI in the kernel, but does require scripts know what the native ABI is in order to correctly set environment variables. I don't think you want ld-elf32 and ld-elf64 to differ in semantics though, regardless of what approach is deemed best.

@arichardson
Copy link
Member Author

I don't think you want ld-elf32 and ld-elf64 to differ in semantics though, regardless of what approach is deemed best.

I'd be happy to also read LD_* variables for ld-elf32, I was just concerned that upstream wouldn't want a change in behaviour.

@bsdjhb
Copy link
Collaborator

bsdjhb commented Jan 4, 2021

I think LD_PRELOAD in particular is a bit dangerous if all the linkers honor it. If i do `env LD_PRELOAD=foo.so ' where said program executes a purecap program (or vice versa), it's a bit of a mess, hence why upstream is explicit and does not honor "plain" LD_* for compat ABIs. Yes, it means that scripts might have to be patched, and/or the compat libc for native ABIs might need to adopt the Y! approach of rewriting the names of environment variables transparently, but it avoids crossing the streams in that way. LD_LIBRARY_PATH is probably kind of harmless since we would fail the dlopen() and try the next library search path, but LD_PRELOAD is different.

@jrtc27
Copy link
Member

jrtc27 commented Jan 4, 2021

I think LD_PRELOAD in particular is a bit dangerous if all the linkers honor it. If i do `env LD_PRELOAD=foo.so ' where said program executes a purecap program (or vice versa), it's a bit of a mess, hence why upstream is explicit and does not honor "plain" LD_* for compat ABIs. Yes, it means that scripts might have to be patched, and/or the compat libc for native ABIs might need to adopt the Y! approach of rewriting the names of environment variables transparently, but it avoids crossing the streams in that way. LD_LIBRARY_PATH is probably kind of harmless since we would fail the dlopen() and try the next library search path, but LD_PRELOAD is different.

Well there are two cases:

  1. You give a file name, in which case it'll be as if the binary had an extra DT_NEEDED and everything just works.
  2. You give a path, in which case if the ABIs aren't compatible you'll just get an error.

In both cases you get what you ask for. With glibc LD_PRELOAD is always honoured and it works just fine.

@arichardson
Copy link
Member Author

arichardson commented Jan 5, 2021

I think LD_PRELOAD in particular is a bit dangerous if all the linkers honor it. If i do `env LD_PRELOAD=foo.so ' where said program executes a purecap program (or vice versa), it's a bit of a mess, hence why upstream is explicit and does not honor "plain" LD_* for compat ABIs. Yes, it means that scripts might have to be patched, and/or the compat libc for native ABIs might need to adopt the Y! approach of rewriting the names of environment variables transparently, but it avoids crossing the streams in that way. LD_LIBRARY_PATH is probably kind of harmless since we would fail the dlopen() and try the next library search path, but LD_PRELOAD is different.

Well there are two cases:

  1. You give a file name, in which case it'll be as if the binary had an extra DT_NEEDED and everything just works.
  2. You give a path, in which case if the ABIs aren't compatible you'll just get an error.

In both cases you get what you ask for. With glibc LD_PRELOAD is always honoured and it works just fine.

And if you have a mixed environment you can always set the LD_64_PRELOAD/LD_32_PRELOAD/LD_CHERI_PRELOAD so that binaries with a different ABI ignore it.
I personally much prefer allowing LD_* for all ABIs over libc getenv/putenv hacks.

arichardson added a commit to arichardson/freebsd that referenced this pull request Jan 19, 2021
Summary:
With this change a 64-bit RTLD will check LD_64_* first and if those are
not set it will fall back to checking LD_*. Similarly, the 32-bit RTLDs
check LD_32_*, and then fall back to LD_*.

The main motivation here is CheriBSD, where we have an additional rtld for
CHERI binaries that uses LD_CHERI_*. Therefore, if you have existing 64-bit
software package that ships with a script that sets LD_LIBRARY_PATH to find
the local libraries, this would not work since currently LD_* is only used
by the native RTLD (i.e. CHERI pure-capability for us). To handle this case
this change makes RTLD check the ABI-suffixed variables first and if not
set falls back to LD_*. This still allows for different paths for different
ABI, but handles the program with bundled libraries + launcher script case
better.

Obtained from:	CheriBSD (CTSRD-CHERI/cheribsd#874)

Reviewers: kib, jhb, jrtc27, brooks, emaste

Subscribers: bdragon

Differential Revision: https://reviews.freebsd.org/D28220
arichardson added a commit to arichardson/freebsd that referenced this pull request Jan 19, 2021
Summary:
With this change a 64-bit RTLD will check LD_64_* first and if those are
not set it will fall back to checking LD_*. Similarly, the 32-bit RTLDs
check LD_32_*, and then fall back to LD_*.

The main motivation here is CheriBSD, where we have an additional rtld for
CHERI binaries that uses LD_CHERI_*. Therefore, if you have existing 64-bit
software package that ships with a script that sets LD_LIBRARY_PATH to find
the local libraries, this would not work since currently LD_* is only used
by the native RTLD (i.e. CHERI pure-capability for us). To handle this case
this change makes RTLD check the ABI-suffixed variables first and if not
set falls back to LD_*. This still allows for different paths for different
ABI, but handles the program with bundled libraries + launcher script case
better.

Obtained from:	CheriBSD (CTSRD-CHERI/cheribsd#874)

Reviewers: kib, jhb, jrtc27, brooks, emaste

Subscribers: bdragon

Differential Revision: https://reviews.freebsd.org/D28220
arichardson added a commit to arichardson/freebsd that referenced this pull request Jan 25, 2021
Summary:
With this change a 64-bit RTLD will check LD_64_* first and if those are
not set it will fall back to checking LD_*. Similarly, the 32-bit RTLDs
check LD_32_*, and then fall back to LD_*.

The main motivation here is CheriBSD, where we have an additional rtld for
CHERI binaries that uses LD_CHERI_*. Therefore, if you have existing 64-bit
software package that ships with a script that sets LD_LIBRARY_PATH to find
the local libraries, this would not work since currently LD_* is only used
by the native RTLD (i.e. CHERI pure-capability for us). To handle this case
this change makes RTLD check the ABI-suffixed variables first and if not
set falls back to LD_*. This still allows for different paths for different
ABI, but handles the program with bundled libraries + launcher script case
better.

Obtained from:	CheriBSD (CTSRD-CHERI/cheribsd#874)

Reviewers: kib, jhb, jrtc27, brooks, emaste

Subscribers: bdragon

Differential Revision: https://reviews.freebsd.org/D28220
arichardson added a commit to arichardson/freebsd that referenced this pull request Jan 25, 2021
Summary:
With this change a 64-bit RTLD will check LD_64_* first and if those are
not set it will fall back to checking LD_*. Similarly, the 32-bit RTLDs
check LD_32_*, and then fall back to LD_*.

The main motivation here is CheriBSD, where we have an additional rtld for
CHERI binaries that uses LD_CHERI_*. Therefore, if you have existing 64-bit
software package that ships with a script that sets LD_LIBRARY_PATH to find
the local libraries, this would not work since currently LD_* is only used
by the native RTLD (i.e. CHERI pure-capability for us). To handle this case
this change makes RTLD check the ABI-suffixed variables first and if not
set falls back to LD_*. This still allows for different paths for different
ABI, but handles the program with bundled libraries + launcher script case
better.

Obtained from:	CheriBSD (CTSRD-CHERI/cheribsd#874)

Reviewers: kib, jhb, jrtc27, brooks, emaste

Subscribers: bdragon

Differential Revision: https://reviews.freebsd.org/D28220
@arichardson arichardson marked this pull request as draft January 10, 2023 13:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants