-
A section in our manual page may answer your question. Quote from https://github.com/rui314/mold/blob/main/docs/mold.md#archive-symbol-resolution

Archive symbol resolution

Traditionally, Unix linkers are sensitive to the order in which input files appear on the command line. They process input files from the first (leftmost) file to the last (rightmost) file one-by-one. While reading input files, they maintain sets of defined and undefined symbols. When visiting an archive file (.a files), they pull out object files to resolve as many undefined symbols as possible and move on to the next input file. Object files that weren't pulled out will never have a chance for a second look.

Due to this behavior, you usually have to add archive files at the end of a command line, so that when a linker reaches archive files, it knows what symbols remain as undefined. If you put archive files at the beginning of a command line, a linker doesn't have any undefined symbols, and thus no object files will be pulled out from archives. You can change the processing order by using the --start-group and --end-group options, though they make a linker slower.

mold, as well as the LLVM lld(1) linker, takes a different approach. They remember which symbols can be resolved from archive files instead of forgetting them after processing each archive. Therefore, mold and lld(1) can "go back" in a command line to pull out object files from archives if they are needed to resolve remaining undefined symbols. They are not sensitive to the input file order.

--start-group and --end-group are still accepted by mold and lld(1) for compatibility with traditional linkers, but they are silently ignored.
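To make the traditional behavior concrete, here is a minimal Python sketch of a left-to-right linker. It is a toy model, not how any real linker is implemented: the data layout, the function name `link_traditional`, and the file names (`main.o`, `libfoo.a`, `foo.o`) are all assumptions made up for illustration. It only tracks the sets of defined and undefined symbols described above.

```python
def link_traditional(inputs):
    """Toy model of a traditional left-to-right linker.

    Each input is either
      ("obj", name, defs, refs)            a plain object file, or
      ("archive", name, [(member, defs, refs), ...])
    where defs and refs are sets of symbol names.
    Returns (linked file names, symbols still undefined).
    """
    defined, undefined, linked = set(), set(), []

    def add(label, defs, refs):
        # Link one object: record its definitions, then its unresolved references.
        linked.append(label)
        defined.update(defs)
        undefined.difference_update(defs)
        undefined.update(r for r in refs if r not in defined)

    for item in inputs:
        if item[0] == "obj":
            _, name, defs, refs = item
            add(name, defs, refs)
            continue
        _, name, members = item
        remaining = list(members)
        pulled = True
        while pulled:  # keep scanning this one archive until nothing new is pulled
            pulled = False
            for member in list(remaining):
                mname, defs, refs = member
                if undefined & defs:  # member defines something we still need
                    remaining.remove(member)
                    add(f"{name}({mname})", defs, refs)
                    pulled = True
        # Members left in `remaining` are forgotten once we move on.
    return linked, undefined


main_o = ("obj", "main.o", {"main"}, {"foo"})
libfoo = ("archive", "libfoo.a", [("foo.o", {"foo"}, set())])

print(link_traditional([main_o, libfoo]))  # archive last: foo.o is pulled in
print(link_traditional([libfoo, main_o]))  # archive first: foo stays undefined
```

In this model, putting the archive first leaves `foo` unresolved because nothing was undefined yet when the archive was visited, which is the order sensitivity the manual describes.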
-
I have a C library that includes a libpthread, and the two export partially overlapping sets of symbols. In particular, there is an internal function that initializes FILE* structs for fopen. The pthread variant also initializes a mutex.
Building a test program that calls fopen without pthread will pull fdglue2.o from libc.a, which exports the function.
Building the same test program with -pthread tells the linker to link in libpthread.a first and then libc.a.
With GNU ld, the internal function is pulled from libc.a even though libpthread.a exports it and is named first on the command line.
I developed this code many years ago, before mold existed. I'm pretty sure I tested this and it worked with GNU ld, which was and still is my default system linker. Using -fuse-ld=lld or -fuse-ld=mold will pull the symbol from libpthread.a as expected (by me at least).
Now, is this a bug in GNU ld? Or have I been relying on undefined behavior for over a decade and now it bites me in the ass?
The .o file that pulls in the function is in libc.a. Maybe GNU ld tries to satisfy the reference from the same library first? That sounds vaguely plausible but breaks my assumptions. I opened a ticket with GNU binutils but I wonder whether there is a specification that actually describes how a linker is supposed to resolve symbols if there is ambiguity.
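One way to reason about this is to model both strategies on the exact scenario. The Python sketch below is only a toy model under stated assumptions, not a description of GNU ld or mold internals. The symbol name `__fdglue2_init` and the member names `fopen.o` and `pthread_fdglue2.o` are hypothetical placeholders; only `fdglue2.o`, `libc.a` and `libpthread.a` come from the post. It also assumes that a traditional linker rescans the archive it is currently visiting until no new members are pulled, and that a mold/lld-style linker prefers the first archive on the command line when several define the same symbol.

```python
from collections import deque

# Hypothetical names except fdglue2.o, libc.a and libpthread.a.
TEST_O = ("test.o", {"main"}, {"fopen"})
LIBPTHREAD = ("libpthread.a", [("pthread_fdglue2.o", {"__fdglue2_init"}, set())])
LIBC = ("libc.a", [
    ("fopen.o", {"fopen"}, {"__fdglue2_init"}),
    ("fdglue2.o", {"__fdglue2_init"}, set()),
])


def link_in_order(obj, archives):
    """Traditional model: visit archives left to right and never go back."""
    name, defs, refs = obj
    linked, defined = [name], set(defs)
    undefined = set(refs) - defined
    for lib, members in archives:
        remaining = list(members)
        progress = True
        while progress:  # rescan this archive until nothing new is pulled
            progress = False
            for member in list(remaining):
                mname, mdefs, mrefs = member
                if undefined & mdefs:
                    remaining.remove(member)
                    linked.append(f"{lib}({mname})")
                    defined |= mdefs
                    undefined = (undefined | mrefs) - defined
                    progress = True
    return linked


def link_remembering(obj, archives):
    """mold/lld-style model: remember all archive symbols before resolving."""
    lazy = {}  # symbol -> providing member; earliest archive wins (assumption)
    for lib, members in archives:
        for mname, mdefs, mrefs in members:
            for sym in mdefs:
                lazy.setdefault(sym, (f"{lib}({mname})", mdefs, mrefs))
    name, defs, refs = obj
    linked, defined = [name], set(defs)
    queue = deque(sorted(refs))
    while queue:
        sym = queue.popleft()
        if sym in defined or sym not in lazy:
            continue
        label, mdefs, mrefs = lazy[sym]
        linked.append(label)
        defined |= mdefs
        queue.extend(sorted(mrefs))
    return linked


print(link_in_order(TEST_O, [LIBPTHREAD, LIBC]))
print(link_remembering(TEST_O, [LIBPTHREAD, LIBC]))
```

In this model the in-order linker skips libpthread.a entirely, because the internal symbol is not yet undefined when that archive is visited; the reference only appears once fopen.o is pulled from libc.a, and by then libpthread.a is behind it. The remembering linker picks the libpthread.a member instead. If the model matches what GNU ld does here, the result would follow from visit order rather than from a "same library first" preference.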