[question] Toolchain as a package that injects compiler args #17825

Open
1 task done
rohel01 opened this issue Feb 21, 2025 · 10 comments
@rohel01

rohel01 commented Feb 21, 2025

What is your question?

Hello,

I am using Conan 2.12.2.

I have a use case similar to the one explained in the tutorial "Creating a Conan package for a toolchain". In my case though, the toolchain must also inject specific arguments on the compiler command lines: specific includes, link libraries and scripts for example.

So, I modified the package_info method from the example as follows:

def package_info(self):
        toolchain = self.conan_data["toolchain"]
        cpu = self.conan_data["cpu"]

        # Add the package bin dir to the build path
        self.cpp_info.bindirs.append(os.path.join(self.package_folder, toolchain, "bin"))

        # Path to the RTEMS libraries shipped in the package. Computing it once also
        # avoids nesting double quotes inside f-strings (only valid on Python 3.12+).
        rtems_lib = os.path.join(self.package_folder, toolchain, cpu, "lib")

        # Instruct C and C++ compiler to build for RTEMS, providing path to RTEMS tools
        self.cpp_info.cflags.extend([f"-B{rtems_lib}", "-qrtems"])
        self.cpp_info.cxxflags.extend([f"-B{rtems_lib}", "-qrtems"])
        # Add include directory to RTEMS headers
        self.cpp_info.includedirs.append(os.path.join(rtems_lib, "include"))
        self.cpp_info.defines.append("_RTEMS_5_")

        # We link for RTEMS, using provided paths and an explicit list of system libs
        self.cpp_info.exelinkflags = [f"-B{rtems_lib}", "-qrtems", "-nodefaultlibs"]
        self.cpp_info.system_libs = ["rtemscpu", "rtemsbsp", "gcc", "c"]

        # Configure compiler binaries for the build system
        self.conf_info.define("tools.build:compiler_executables", {
            "c":   f"{toolchain}-gcc",
            "cpp": f"{toolchain}-g++",
            "ar": f"{toolchain}-ar",
            "strip": f"{toolchain}-strip"
        })

However, these compiler flags do not end up being used in consuming packages. Furthermore, I cannot just put them in a profile, since they use the path of the toolchain package.

Any ideas?

Have you read the CONTRIBUTING guide?

  • I've read the CONTRIBUTING guide
@memsharded memsharded self-assigned this Feb 21, 2025
@memsharded
Member

Hi @rohel01

Thanks for your question.

The self.cpp_info information is intended for regular libraries, that are discoverable via find_package() or similar build system approaches.

When talking about toolchains and compilers, this is rarely the case, and there is no such thing as find_package(compiler), because the compiler is detected even earlier, in the project() call.

So the idea is that toolchain tool-requires configuration flags should also go via conf_info, so something like:

def package_info(self):
     self.conf_info.append("tools.build:cxxflags", ["-I my/include/path", ...])
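
A minimal sketch of how the flags from the original question could be moved over to conf_info. The helper only computes the values, so it runs without Conan installed. The name `rtems_conf_values` and the example paths are made up for illustration. The conf names are existing Conan 2 confs, but passing the include directory as a `-I` flag is an assumption of this sketch, and the system libraries are left out on purpose because the thread does not say how they should be passed.

```python
import os


def rtems_conf_values(package_folder, toolchain, cpu):
    """Return {conf name: list of values} for the RTEMS flags from the question."""
    lib_dir = os.path.join(package_folder, toolchain, cpu, "lib")
    include_dir = os.path.join(lib_dir, "include")
    # Flags shared by the C compiler, the C++ compiler and the link step.
    common = [f"-B{lib_dir}", "-qrtems"]
    return {
        "tools.build:cflags": common + [f"-I{include_dir}"],
        "tools.build:cxxflags": common + [f"-I{include_dir}"],
        "tools.build:defines": ["_RTEMS_5_"],
        "tools.build:exelinkflags": common + ["-nodefaultlibs"],
    }


# Inside the toolchain recipe this would be used roughly as:
#
#     def package_info(self):
#         values = rtems_conf_values(self.package_folder,
#                                    self.conan_data["toolchain"],
#                                    self.conan_data["cpu"])
#         for name, flags in values.items():
#             self.conf_info.append(name, flags)

if __name__ == "__main__":
    # Hypothetical package folder, toolchain prefix and cpu directory.
    for name, flags in rtems_conf_values("/pkg", "riscv-rtems6", "rv32").items():
        print(name, flags)
```

Keeping the computation in a plain function makes it easy to check the resulting flag lists before wiring them into package_info.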

Please let me know if this helps.

@rohel01
Author

rohel01 commented Feb 21, 2025

Thanks for your quick answer @memsharded !

Yes, this helps. I think what tripped me up is that binary dirs are configured via cpp_info:

        self.cpp_info.bindirs.append(os.path.join(self.package_folder, toolchain, "bin"))

I guess it is because binaries are discoverable by the build system as you mentioned, but it would be more consistent from a UX point of view to configure them via conf_info.

One more question, if you do not mind.

When using the meson toolchain, I found some compiler executables are not picked up.

In my recipe:

    def package_info(self):
        toolchain = self.conan_data["toolchain"]
        # Skipped code

        # Configure compiler binaries for the build system
        self.conf_info.define("tools.build:compiler_executables", {
            "c":   f"{toolchain}-gcc",
            "cpp": f"{toolchain}-g++",
            "ar": f"{toolchain}-ar",
            "strip": f"{toolchain}-strip"
        })

In the generated conan_meson_cross.ini

[binaries]
c = 'riscv-rtems6-gcc'
cpp = 'riscv-rtems6-g++'

[built-in options]
# ...

I found out about it because of the following meson warning during install

User defined options
Cross files: /home/michael/proj/pfgen/endurance-flight-software/build/conan_meson_cross.ini
prefix : /

Found ninja-1.12.1 at /home/michael/.pixi/envs/meson/bin/ninja
WARNING: Cross file does not specify strip binary, result will not be stripped.
WARNING: Cross file does not specify strip binary, result will not be stripped.

It should work, right?
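
A quick way to spot this without waiting for the Meson warning is to inspect the generated cross file. The sketch below is only an illustration: `missing_binaries` is a made-up helper and the list of required entries is an assumption, not something Conan or Meson defines.

```python
import configparser

# Entries this sketch expects in [binaries]; adjust to the project's needs.
REQUIRED = ("c", "cpp", "ar", "strip")


def missing_binaries(ini_text, required=REQUIRED):
    """Return the entries from `required` that are absent from [binaries]."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    present = parser["binaries"] if parser.has_section("binaries") else {}
    return [name for name in required if name not in present]


# Same content as the generated conan_meson_cross.ini shown above.
SAMPLE = """
[binaries]
c = 'riscv-rtems6-gcc'
cpp = 'riscv-rtems6-g++'

[built-in options]
"""

if __name__ == "__main__":
    print(missing_binaries(SAMPLE))
```

For a real build the text would be read from the cross file path that Meson reports under "Cross files".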

@rohel01
Author

rohel01 commented Feb 21, 2025

Looking at the meson toolchain code, this behavior seems intentional. In tools/meson/toolchain.py, only the c and cpp binaries are computed from "tools.build:compiler_executables"

@memsharded
Member

self.cpp_info.bindirs.append(os.path.join(self.package_folder, toolchain, "bin"))

Yes, the thing is that bindirs can contain executables and shared libraries. The shared libraries might be needed by regular requires consumers to be able to run correctly, so it plays a role in find_package(). Also for executables, there are packages that contain tools that actually create CMake imported executable targets like protobuf::protoc, and are discovered via find_package() too. That bindir is also leveraged by VirtualBuildEnv and VirtualRunEnv to inject those directories into the PATH env-vars so that things can be found.

But toolchains are a bit different, as they are not a regular dependency.

I guess it is because binaries are discoverable by the build system as you mentioned, but it would be more consistent from a UX point of view to configure them via conf_info.

conf_info is only injected from direct tool_requires, but not from regular requires, while the cpp_info is propagated for direct and transitive regular requires.

In other words, if the toolchain was not inside a Conan package, it would still be possible to define the tools.build:cxxflags directly in the profile [conf], but this is not the case for regular packages cpp_info information.

The relevant code for the compilers in Meson is:

        compilers_by_conf = self._conanfile_conf.get("tools.build:compiler_executables", default={},
                                                     check_type=dict)
        # Read the VirtualBuildEnv to update the variables
        build_env = self._conanfile.buildenv_build.vars(self._conanfile) if native else (
            VirtualBuildEnv(self._conanfile, auto_generate=True).vars())
        #: Sets the Meson ``c`` variable, defaulting to the ``CC`` build environment value.
        #: If provided as a blank-separated string, it will be transformed into a list.
        #: Otherwise, it remains a single string.
        self.c = compilers_by_conf.get("c") or self._sanitize_env_format(build_env.get("CC")) or default_comp

So this should work, unless you also defined that conf in your profile. Where does the riscv-rtems6-gcc come from? Where is it defined? Conan is not defining this automatically; is it coming from the environment?
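
The lookup order in the snippet above can be reproduced in isolation. This is a simplified sketch, not the MesonToolchain code: `sanitize_env_format` is a stand-in whose behaviour is inferred from the comment in the quoted source (blank-separated strings become lists), and `resolve_c_compiler` is a hypothetical name.

```python
def sanitize_env_format(value):
    """Blank-separated strings become lists; single words and None pass through."""
    if value is None or isinstance(value, list):
        return value
    parts = value.strip().split()
    return parts if len(parts) > 1 else value


def resolve_c_compiler(conf_executables, build_env, default_comp):
    """Pick the C compiler: the conf entry first, then CC, then the default."""
    return (conf_executables.get("c")
            or sanitize_env_format(build_env.get("CC"))
            or default_comp)


if __name__ == "__main__":
    # The conf entry wins even when CC is set in the build environment.
    print(resolve_c_compiler({"c": "riscv-rtems6-gcc"}, {"CC": "gcc"}, "cc"))
    # Without a conf entry, CC is used and split into a list if it has blanks.
    print(resolve_c_compiler({}, {"CC": "ccache gcc"}, "cc"))
```

Under this model a value in the cross file can only come from the conf (recipe or profile), the build environment, or the default.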

@rohel01
Author

rohel01 commented Feb 21, 2025

There are two separate albeit related issues.

I opened this ticket because I first failed to forward compilation and link flags from a toolchain package to consuming packages via [tool_requires]. Your proposal to use conf_info instead of cpp_info in the package_info method works. The MesonToolchain now correctly injects the toolchain flags in the generated conan_meson_cross.ini.

Now that this is solved, I noticed another issue. At the end of the package_info method of the toolchain package, I override some compilation executables, as demonstrated by the ARM toolchain tutorial.

self.conf_info.define("tools.build:compiler_executables", {
            "c":   f"{toolchain}-gcc",
            "cpp": f"{toolchain}-g++",
            "ar": f"{toolchain}-ar",
            "strip": f"{toolchain}-strip"
        })

(Note: toolchain is read from the conandata.yml file in the toolchain recipe)

After conan install, I notice only the c and cpp entries are picked up by the Meson toolchain generator. My other entries (ar and strip) are ignored. Looking at the MesonToolchain code on GitHub, only the c and cpp compiler keys are indeed read from the configuration, as you showed. For example, here is how the strip variable is computed:

        #: Defines the Meson ``strip`` variable. Defaulted to ``STRIP`` build environment value
        self.strip = build_env.get("STRIP")

Is it intentional? Usually, toolchains need to override a lot more than the c and cpp compilers.

@memsharded
Member

That is correct, the MesonToolchain does:

        #: Defines the Meson ``ar`` variable. Defaulted to ``AR`` build environment value
        self.ar = build_env.get("AR")
        #: Defines the Meson ``strip`` variable. Defaulted to ``STRIP`` build environment value
        self.strip = build_env.get("STRIP")
        #: Defines the Meson ``as`` variable. Defaulted to ``AS`` build environment value

So it is not picking it up from compiler_executables. I think there was already some discussion about it (in general, build systems will automatically pick those up from the compiler definitions), but I don't recall where and can't find it. Adding @franramirez688 and @jcar87 to have a look into it.
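
To make the asymmetry concrete, here is a simplified model of how the [binaries] entries end up being chosen. It is a sketch based only on the snippets quoted in this thread: the function names are hypothetical, defaults are omitted, and the `CXX` fallback for `cpp` is assumed by analogy with `CC`.

```python
def meson_binaries(conf_executables, build_env):
    """Collect [binaries] entries following the quoted MesonToolchain snippets."""
    candidates = {
        # Compilers: the conf wins, the environment is the fallback.
        "c": conf_executables.get("c") or build_env.get("CC"),
        "cpp": conf_executables.get("cpp") or build_env.get("CXX"),
        # ar and strip: only the environment is consulted, conf keys are ignored.
        "ar": build_env.get("AR"),
        "strip": build_env.get("STRIP"),
    }
    return {name: value for name, value in candidates.items() if value}


def render_binaries(binaries):
    """Format the entries like the [binaries] section of a Meson cross file."""
    lines = ["[binaries]"]
    lines += [f"{name} = '{value}'" for name, value in binaries.items()]
    return "\n".join(lines)


if __name__ == "__main__":
    conf = {"c": "riscv-rtems6-gcc", "cpp": "riscv-rtems6-g++",
            "ar": "riscv-rtems6-ar", "strip": "riscv-rtems6-strip"}
    # With an empty build environment, ar and strip are dropped.
    print(render_binaries(meson_binaries(conf, {})))
```

With `AR` and `STRIP` present in the build environment, the same function returns all four entries.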

@jcar87
Contributor

jcar87 commented Feb 24, 2025

We have been reviewing this.

Usually, toolchains need to override a lot more than the c and cpp compilers.

Actually not quite! What we have found in "most cases" is the opposite - that most build systems are able to derive the name of the tools directly from the compiler and/or the target platform. For instance, for a gcc compiler toolchain, if I set the compiler to aarch64-linux-gnu-g++-10 - both autotools and cmake will be able to pick up the correct ar, nm, strip, etc tools without having to manually specify them - this is preferable to doing it manually.
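
As a rough illustration of the kind of heuristic meant here (this is not the actual CMake or Autotools logic, which is more elaborate), the tool names can be derived from the compiler name by keeping its prefix. `derive_tools` and the regular expression are assumptions made for this sketch.

```python
import re

# Matches a trailing GCC-style driver name with an optional version suffix,
# e.g. "gcc", "g++-10" or "gcc-13.2".
_DRIVER = re.compile(r"(gcc|g\+\+)(-\d+(\.\d+)*)?$")


def derive_tools(compiler, tools=("ar", "nm", "strip", "ranlib")):
    """Guess prefixed binutils names from a GCC-style compiler name."""
    name = compiler.rsplit("/", 1)[-1]  # drop any directory part
    match = _DRIVER.search(name)
    if match is None:
        return {}  # not a GCC-style name, nothing to guess
    prefix = name[:match.start()]
    return {tool: prefix + tool for tool in tools}


if __name__ == "__main__":
    print(derive_tools("aarch64-linux-gnu-g++-10"))
    print(derive_tools("riscv-rtems6-gcc"))
```

The guessed names are only useful if the corresponding executables can actually be found, for instance in PATH.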

Some other tools like assembler and linker tend to be invoked via the compiler - so defining them doesn't make sense for most users and is more likely to cause issues e.g.:

  • we've seen cases where as or ld are defined but they don't have the same CLI as when invoked from the compiler, so some checks could fail or the build could just be wrong
  • LD tends to not make sense to define, because the compiler is invoked so it could be completely ignored, causing confusion to users (-fuse-ld needs to be passed as a linker flag instead)

It may be that:

  • some toolchains may have different named prefix conventions (e.g. llvm) - and may not be handled by all build systems. I think some of the llvm- prefixed tooling could fall in this category, but we would like to see specific examples.
  • some build systems may not derive the tools automatically, for example this is the case with meson and strip, where CMake and Autotools are actually able to guess it correctly in most cases (see here)

For your case, where I can see

        # Configure compiler binaries for the build system
        self.conf_info.define("tools.build:compiler_executables", {
            "c":   f"{toolchain}-gcc",
            "cpp": f"{toolchain}-g++",
            "ar": f"{toolchain}-ar",
            "strip": f"{toolchain}-strip"
        })

I suspect that defining only gcc and g++ would work for the vast majority of "compliant" build systems and scripts (CMake, Autotools), with the build systems correctly deriving the tooling without having to specify them.

I have a feeling that the only reason ar and strip are needed is Meson, which seems to follow different rules to guess these files (looking at the Meson codebase and also the discussion in mesonbuild/meson#14172).

When using the meson toolchain, I found some compiler executables are not picked up.

Looking at the meson toolchain code, this behavior seems intentional. In tools/meson/toolchain.py, only the c and cpp binaries are computed from "tools.build:compiler_executables"

So it is not picking it up from compiler_executables. I think there was already some discussion about it, like in general build systems will automatically pick those from the compiler definitions, but I don't recall where and can't find it. Adding @franramirez688 and @jcar87 to have a look-into

Possibly the implementation in MesonToolchain came before the tools.build:compiler_executables

As it stands right now, we don't have this for ar. We may have to document this better, but I would advise simply defining it as an env var for the time being, e.g.:

        self.conf_info.define("tools.build:compiler_executables", {
            "c":   f"{toolchain}-gcc",
            "cpp": f"{toolchain}-g++",
        })

        self.buildenv_info.define("AR", f"{toolchain}-ar")
        self.buildenv_info.define("STRIP", f"{toolchain}-strip")
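
One way to check this split without running Conan is to factor it into a function and call it with recording stand-ins. `Recorder` and `declare_toolchain` are hypothetical names for this sketch; in a real recipe the function would be called from package_info with self.conf_info and self.buildenv_info.

```python
class Recorder:
    """Hypothetical stand-in for conf_info / buildenv_info: records define() calls."""

    def __init__(self):
        self.values = {}

    def define(self, name, value):
        self.values[name] = value


def declare_toolchain(conf_info, buildenv_info, toolchain):
    """Same split as the workaround: compilers via conf, ar and strip via env vars."""
    conf_info.define("tools.build:compiler_executables", {
        "c": f"{toolchain}-gcc",
        "cpp": f"{toolchain}-g++",
    })
    buildenv_info.define("AR", f"{toolchain}-ar")
    buildenv_info.define("STRIP", f"{toolchain}-strip")


if __name__ == "__main__":
    conf, env = Recorder(), Recorder()
    declare_toolchain(conf, env, "riscv-rtems6")
    print(conf.values)
    print(env.values)
```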

@rohel01
Author

rohel01 commented Feb 24, 2025

We have been reviewing this.

Usually, toolchains need to override a lot more than the c and cpp compilers.

Actually not quite! What we have found in "most cases" is the opposite - that most build systems are able to derive the name of the tools directly from the compiler and/or the target platform. For instance, for a gcc compiler toolchain, if I set the compiler to aarch64-linux-gnu-g++-10 - both autotools and cmake will be able to pick up the correct ar, nm, strip, etc tools without having to manually specify them - this is preferable to doing it manually.

The toolchain we are currently using for a RISC-V SoC overrides gcc, g++, addr2line, ar, gcov, gprof, ranlib, objcopy, size, strings and strip. I mean override in the sense that the toolchain provides specific (prefixed) versions of these executables. I agree users should provide minimal information so as to prevent inconsistencies between tools involved in the same workflow.

Some other tools like assembler and linker tend to be invoked via the compiler - so defining them doesn't make sense for most users and is more likely to cause issues e.g.:

* we've seen cases where `as` or `ld` are defined but they don't have the same CLI as when invoked from the compiler, so some checks could fail or the build could just be wrong

* LD tends to not make sense to define, because the compiler is invoked so it could be completely ignored, causing confusion to users (`-fuse-ld` needs to be passed as a linker flag instead)

It may be that:

* some toolchains may have different named prefix conventions (e.g. llvm) - and may not be handled by all build systems. I think some of the `llvm-` prefixed tooling could fall in this category, but we would like to see specific examples.

* some build systems may not derive the tools automatically, for example this is the case with meson and strip, where CMake and Autotools are actually able to guess it correctly in most cases (see [here](https://github.com/mesonbuild/meson/issues/14172#issuecomment-2654554761))

ar is overridden in the Conan ARM toolchain tutorial :) I like that, by default, we can rely on some compiler driver heuristic to select consistent tools to be involved in the build. My surprise here is that there is a documented way to be explicit, but then my request is ignored.

The ability to selectively override tools is useful there because

  1. I still want to use Conan to formalize my build configuration (reproducible builds)
  2. Heuristics can change or be affected by external variables (PATH, PKG_CONFIG_SYSROOT_DIR...)
  3. Tools can be buggy... (examples below)

If I cannot override and the tool is guessing wrong, then I start fighting the very tools I wanted to rely on. There are concrete examples where overriding is or would have been useful:

  • Meson tends to inject -D_FILE_OFFSET_BITS aggressively in the build, which just does not work on some 32-bit targets.
  • pkg-config does not properly handle multiple sysroots.

For your case, where I can see

        # Configure compiler binaries for the build system
        self.conf_info.define("tools.build:compiler_executables", {
            "c":   f"{toolchain}-gcc",
            "cpp": f"{toolchain}-g++",
            "ar": f"{toolchain}-ar",
            "strip": f"{toolchain}-strip"
        })

I suspect that defining only gcc and g++ would work for the vast majority of "compliant" build systems and scripts (CMake, Autotools), with the build systems correctly deriving the tooling without having to specify them.

I have a feeling that the only reason ar and strip are needed is Meson, which seems to follow different rules to guess these files (looking at the Meson codebase and also the discussion in mesonbuild/meson#14172).

What is a "compliant" build system? One of the key reasons I want to deploy Conan in my company is that I can use profiles to enforce consistent build constraints (flags, toolchain dependencies) in a build-system-independent way: I do not need to duplicate profile specifications across each (internal or external) team's build system and yet, each of those teams can pick the build system of their choice (as long as it is supported by Conan) and still use the company's profiles.

Conan acting as a consistency layer between non-cooperating stakeholders and technologies is a huge benefit for me. Here, we have a concrete case where the behavior depends on the actual build backend. I do not think this is something we want. Maybe MesonToolchain should take care of the differences?

When using the meson toolchain, I found some compiler executables are not picked up.

Looking at the meson toolchain code, this behavior seems intentional. In tools/meson/toolchain.py, only the c and cpp binaries are computed from "tools.build:compiler_executables"

So it is not picking it up from compiler_executables. I think there was already some discussion about it, like in general build systems will automatically pick those from the compiler definitions, but I don't recall where and can't find it. Adding @franramirez688 and @jcar87 to have a look-into

Possibly the implementation in MesonToolchain came before the tools.build:compiler_executables

As it stands right now, we don't have this for ar - we may have to document this better, but I would advise simply defining it as an env var for the time being, eg.

        self.conf_info.define("tools.build:compiler_executables", {
            "c":   f"{toolchain}-gcc",
            "cpp": f"{toolchain}-g++",
        })

        self.buildenv_info.define("AR", f"{toolchain}-ar")
        self.buildenv_info.define("STRIP", f"{toolchain}-strip")

Yes, I will test that tomorrow. I expect this workaround will work. Thank you for suggesting it.

@jcar87
Contributor

jcar87 commented Feb 25, 2025

The toolchain we are currently using for a RISC-V SoC overrides gcc, g++, addr2line, ar, gcov, gprof, ranlib, objcopy, size, strings and strip. I mean override in the sense that the toolchain provides specific (prefixed) versions of these executables. I agree users should provide minimal information so as to prevent inconsistencies between tools involved in the same workflow.

In our experience, most build systems know the gcc prefixing conventions and are able to derive all those tools with little information (either the compiler alone, the --host value when in autotools, etc) - provided those tools are available in PATH.

The problem that we are seeing, is that when users are "too" specific - they tend to pick the wrong executables, or not in the right way, which causes issues. So on the one hand, yes - being explicit could be beneficial, but on the other hand, it's causing users to shoot themselves in the foot more often than not.

ar is overridden in the Conan ARM toolchain tutorial :)

The assembler is, not the archiver, although from what I can see, it should be removed, as most scripts would use the compiler as the assembler CLI anyway.

My surprise here is that there is a documented way to be explicit, but then my request is ignored.

I think there's some confusion here. ar and strip (as per your example) are not documented keys in compiler_executables, so perhaps we can argue what Conan should do in those instances, but I would not expect ar or strip to be passed to any underlying build system when the documentation doesn't mention them.

"tools.build:compiler_executables": "Defines a Python dict-like with the compilers path to be used. Allowed keys {'c', 'cpp', 'cuda', 'objc', 'objcxx', 'rc', 'fortran', 'asm', 'hip', 'ispc'}",

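A recipe or a CI check could flag such keys early. The sketch below copies the allowed keys from the description quoted above, so the set may differ between Conan versions; `unsupported_keys` is a hypothetical helper and not part of Conan.

```python
# Allowed keys as listed in the conf description quoted above.
ALLOWED_KEYS = frozenset({"c", "cpp", "cuda", "objc", "objcxx", "rc",
                          "fortran", "asm", "hip", "ispc"})


def unsupported_keys(compiler_executables):
    """Return, sorted, the keys that the conf description does not list."""
    return sorted(set(compiler_executables) - ALLOWED_KEYS)


if __name__ == "__main__":
    requested = {
        "c": "riscv-rtems6-gcc",
        "cpp": "riscv-rtems6-g++",
        "ar": "riscv-rtems6-ar",
        "strip": "riscv-rtems6-strip",
    }
    print(unsupported_keys(requested))
```
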
pkg-config does not properly handle multiple sysroots (pkgconf/pkgconf#213).

I think more specifically, it doesn't handle finding .pc files both in a sysroot and outside of the sysroot, and this applies to both pkg-config and pkgconf.

However, this can be solved by following a variation of pkg-config's own recommendations when cross-building; see an example here:
https://github.com/jcar87/conan-crossbuild-with-sysroot/blob/main/support_files/pkg-config

What most users seem to miss is that pkg-config is pre-baked with paths/subpaths (sometimes compiled into the executable itself, sometimes in a separate configuration file), and also with the list of paths that are "pruned" from the output (-I and -L should never be emitted for system paths). Most users make assumptions about pkg-config/pkgconf, end up using the wrong thing, and don't read the documentation of those tools. Some toolchains that provide a sysroot will provide a properly (or better) configured pkg-config.
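
For reference, the generic cross-building setup for pkg-config relies on three environment variables. The sketch below is not the content of the linked file; it only builds the environment a wrapper would typically export, and the sysroot layout (usr/lib/pkgconfig and usr/share/pkgconfig) is an assumption. It does not address the mixed case of .pc files both inside and outside the sysroot.

```python
import os


def pkg_config_cross_env(sysroot):
    """Environment for running pkg-config against a single sysroot."""
    pc_dirs = [
        os.path.join(sysroot, "usr", "lib", "pkgconfig"),
        os.path.join(sysroot, "usr", "share", "pkgconfig"),
    ]
    return {
        # No extra search paths inherited from the build machine.
        "PKG_CONFIG_PATH": "",
        # Replace the built-in default search path with the sysroot's directories.
        "PKG_CONFIG_LIBDIR": os.pathsep.join(pc_dirs),
        # Prefix the emitted -I and -L paths with the sysroot.
        "PKG_CONFIG_SYSROOT_DIR": sysroot,
    }


if __name__ == "__main__":
    # Hypothetical sysroot location.
    for name, value in pkg_config_cross_env("/opt/sysroot").items():
        print(f"{name}={value}")
```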

Conan acting as a consistency layer between non cooperating stakeholders and technologies is a huge benefit for me.

Happy to hear this!

Here, we have a concrete case were the behavior depends on the actual build backend. I do not think this is something we want. Maybe MesonToolchain should take care of the differences?

We have a behaviour that is just not implemented

I think more specifically, as a summary:

  • Injecting compiler args in a toolchain recipe (this very issue) is already possible, as shown in an earlier comment on this issue (#17825)
  • The compiler_executables conf does not support ar or strip
  • Meson (and only meson, as far as we can see), does not autoderive the location of ar and strip:
    • it may locate the build machine's upstream, which is likely to work, but not guaranteed
    • when cross building, it won't locate the prefixed "strip", and it will not strip binaries. Unwanted, but the binaries will still build and will still work.
  • There is a way of defining AR and STRIP for meson, which is via buildenv vars.
