Trick compile error with gcc 11.1 #1553

Open
keithvetter opened this issue Aug 24, 2023 · 21 comments

Comments

@keithvetter
Contributor

Running with gcc 11.1, the Trick compile errors out while trying to ICG. Here is an edited output of the error:

/bin/trick-ICG -sim_services -m /include/trick/files_to_ICG.hh

In file included from include/trick/GetTimeOfDayClock.hh:19:
In file included from include/trick/Clock.hh:11:
In file included from gcc/11.1.0/include/c++/11.1.0/string:40:
In file included from gcc/11.1.0/include/c++/11.1.0/bits/char_traits.h:39:
In file included from gcc/11.1.0/include/c++/11.1.0/bits/stl_algobase.h:64:
In file included from gcc/11.1.0/include/c++/11.1.0/bits/stl_pair.h:59:
In file included from gcc/11.1.0/include/c++/11.1.0/bits/move.h:57:
gcc/11.1.0/include/c++/11.1.0/type_traits:906:49: error: '_Tp' does not refer to a value : public __bool_constant<__is_constructible(_Tp, _Args...)>

@keithvetter changed the title from "Trick compile error on gcc 11.1" to "Trick compile error with gcc 11.1" on Aug 24, 2023
@sharmeye
Contributor

Let us look into whether this is an ICG problem, a Keith problem, or a third, as-yet-unknown problem. Which distro are you running, and did you install gcc 11 on it manually?

@keithvetter
Contributor Author

keithvetter commented Aug 24, 2023

Two distros exhibit the same issue with gcc 11.1 through 11.4. Here are the culprits:

  1. Ubuntu 22.04.2 LTS, gcc 11.4.0, llvm 14, clang 14
  2. CentOS 7.9.209, gcc 11.1.0, llvm ?, clang 3.4.2

CentOS 7.9 is what the FSL uses. In the FSL, one can choose gcc11.1 via a module command. This one is arguably an oddity found by using a new compiler on an older distro.

The Ubuntu problem is straight out of the box.

@sharmeye
Contributor

Our Ubuntu CI isn't failing so we'll try it manually with an Ubuntu container. It might be difficult to test the CentOS 7.9 one though, so we'll go after Ubuntu first

@jdeans289
Contributor

Hi, I couldn't replicate this in a fresh Ubuntu 22.04 container with gcc 11.4.0 and clang 14.

What version of Trick are you using?

Is it possible that there are some other environment settings that could be causing this?

@keithvetter
Contributor Author

keithvetter commented Aug 24, 2023

The Ubuntu version actually exhibits a different, though similar, error. This one isn't fatal, so the sims are still able to run. Here is that error:

/bin/trick-ICG -sim_services -m ... -/include/trick/files_to_ICG.hh
In file included from /include/trick/files_to_ICG.hh:5:
In file included from include/trick/GetTimeOfDayClock.hh:19:
In file included from include/trick/Clock.hh:11:
In file included from /usr/include/c++/11/string:40:
In file included from /usr/include/c++/11/bits/char_traits.h:40:
In file included from /usr/include/c++/11/bits/postypes.h:40:
In file included from /usr/include/c++/11/cwchar:44:
/usr/include/wchar.h:155:24: error: 'malloc' attribute takes no arguments
__attribute_malloc__ __attr_dealloc_free;

@jdeans289
Contributor

I do see the malloc error, but the compilation completes successfully.

@keithvetter
Contributor Author

We ran with the latest Trick, just cloned it fresh yesterday.

The malloc "error" is not fatal, but that "error" is in the model builds and makes an otherwise pristine compile of ES's sims look like there are errors. It's not just when building Trick, it's when you build sims.

I think the fatal error is only on that CentOS7.9 gcc11.1, but maybe in this case we can just not use gcc11.1 since it is special.

@jdeans289
Contributor

I think it might be some issue of trying to compile gcc system libraries with Clang (which is what ICG does). That would also explain why the Centos7 version with the older clang version can't handle it.

I'm not sure what the solution would be here, I'll look into it.

@alexlin0
Contributor

The compilation may complete successfully with the ICG errors, but there could be missing io code.

@jdeans289
Contributor

I think there might be a fundamental incompatibility with clang 3.4.2 and gcc 8+ that explains the Centos7 issue.

@jdeans289
Contributor

The issue is between the glibc that comes with gcc11 and Clang. The only solution I can think of is to downgrade the gcc standard library.

@keithvetter
Contributor Author

keithvetter commented Aug 25, 2023

Thanks for looking into this and trying out a docker container. I need to learn how to set up an Ubuntu container on my Redhat8 box.

Would there be a way to tell ICG not to process the #include <string> header? I tried setting an ICG_EXCLUDE_DIR env var to the gcc11 area, but to no avail.

Another avenue might be to set some define so that ICG doesn't process the block of code in the header that it's having issues with.

I think it's time for me to get an Ubuntu machine or spin up a container.

@ddj116
Contributor

ddj116 commented Aug 28, 2023

The error signature appears a bit different but I ran into some issues with ICG and gcc 8.3.0 in FSL on CentOS 7.9 that turned out to be gcc not being configured with the correct ABI. See this issue.

We are currently using gcc 8.3.0 and clang 3.4.2 with no issues in FSL. We have not tried gcc 11 yet.

@keithvetter
Contributor Author

keithvetter commented Aug 28, 2023

Thanks, ddj116. I don't believe we have any issues in the FSL unless we choose gcc11 via the module command. The workaround you found for that one issue sounds hopeful.

The non-fatal error that we'd really like to solve is on Ubuntu 22.04.2 LTS, gcc 11.4.0, llvm 14, clang 14. Since the error is non-fatal, it's more of an annoyance. Every time a sim is built, the errors catch the eye and one has to scroll up to make sure it's just that ICG malloc "error". ES worked hard at eliminating all warnings, so this is the only thing in an otherwise perfect compile.

@keithvetter
Contributor Author

keithvetter commented Aug 29, 2023

For the record, I said in an earlier post, " I don't believe we have any issues in the FSL unless we choose gcc-11.1 via the module command." Well, I said that, but didn't realize that some of what I thought was in the FSL was actually on Briscoe's Ubuntu machine. The following shows FSL errors on gcc > 8.3.0:

FSL, Trick-19.5.1 with gcc-9.2:
object_Linux_9.2_x86_64/ClassVisitor.o: In function `clang::NamedDecl::getNameAsString[abi:cxx11]() const':
/usr/include/clang/AST/Decl.h:144: undefined reference to `clang::DeclarationName::getAsString[abi:cxx11]() const'
etc.

FSL, Trick-19.5.1 with gcc 11.1:
io_src/io_IntegLoopManager.cpp:52:46: error: ‘class Trick::IntegrationManager::JobClassJobs’ is private within this context
52 | return sizeof(Trick::IntegrationManager::JobClassJobs) ;
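The gcc 11.1 message above is a standard C++ access-control diagnostic: code outside a class names a nested type that is private at that point. A minimal sketch of the same situation follows. The names `DemoManager`, `DemoJobs`, and `demo_jobs_size` are hypothetical and are not part of Trick; this illustrates the rule only and is not a proposed fix.

```cpp
#include <cstddef>

// Hypothetical stand-in for a class with a private nested type,
// loosely shaped like Trick::IntegrationManager::JobClassJobs.
class DemoManager {
    struct DemoJobs {   // private, because class members default to private
        int count;
        double period;
    };

public:
    // Members may name the private type, so they can expose its size.
    static std::size_t demo_jobs_size() { return sizeof(DemoJobs); }
};

// From outside the class, the following line would be rejected with an
// access error similar to the one in the log above:
//     std::size_t n = sizeof(DemoManager::DemoJobs);
```

Calling `DemoManager::demo_jobs_size()` from any translation unit is fine, because the private name is only spelled inside the class.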

In the end, I think there are errors in the FSL if you use gcc > 8.3.0. In order to use the latest Trick (as of Aug 29, 2023), you need swig >= 3.0. The FSL has swig-2.0.10.

All of that said, I still want to see if I can work around that one Ubuntu 22.04.2, gcc-11 "error". I don't have Ubuntu so will need to use docker or find a machine to work it.

@ddj116
Contributor

ddj116 commented Aug 29, 2023

That first gcc 9.2 error is almost certainly the way gcc was configured in FSL - the abi:cxx11 as part of the message is the red flag. That could potentially be resolved by FSL admins if you need it to be. They fixed up gcc 8.3.0 for me because I had the same abi error.
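An undefined reference that mentions `[abi:cxx11]` usually means the two sides of the link disagree on the libstdc++ dual ABI. One way to see which setting a given compiler uses is to read the `_GLIBCXX_USE_CXX11_ABI` macro that libstdc++ defines. A small sketch, where the function name `libstdcxx_abi_setting` is made up for illustration:

```cpp
#include <string>  // any libstdc++ header defines _GLIBCXX_USE_CXX11_ABI

// Reports the libstdc++ dual-ABI setting for this translation unit:
//    1 -> new ABI (symbols carry the [abi:cxx11] tag)
//    0 -> old, pre-C++11 ABI
//   -1 -> the macro is not defined (for example, not libstdc++)
int libstdcxx_abi_setting() {
#if defined(_GLIBCXX_USE_CXX11_ABI)
    return _GLIBCXX_USE_CXX11_ABI;
#else
    return -1;
#endif
}
```

If the value printed from a build with the gcc in question differs from the setting the clang libraries were built with, that would be consistent with the link error shown earlier. This is only a diagnostic aid, not a statement about how the FSL compilers are actually configured.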

"The FSL has swig-2.0.10."

FSL has newer swig versions installed if you know where to look, I'll send those paths to you directly.

@keithvetter
Contributor Author

Thanks. Well, I'm hopping out of the FSL since I think they can just run with gcc-8.3.0 and be fine. The Ubuntu 22.04 malloc error is what I looked at this morning. I was able to use podman on my Redhat 8 box to create an Ubuntu 22.04 image and recreate the malloc error. I didn't fix it, but at least I can recreate it. I was hoping that I could dig in and give some #define or something to ICG to stop the error, but it's not that simple. I don't really know what ICG is doing, but I'm guessing it has something to do with clang and gcc not being compatible, as jdeans289 pointed out.

@dandexter

I also see this with gcc 11.4 on Ubuntu, but the error is for malloc. Trick does compile and link, but I don't think I can trust this for any production work.

  • Ubuntu 22.04.3 LTS
  • gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

/home/ddexter/projects/trick/bin/trick-ICG -sim_services -m -I/home/ddexter/projects/trick/trick_source -I/home/ddexter/projects/trick/include -I/home/ddexter/projects/trick/include/trick/compat -DTRICK_VER=19 -DTRICK_MINOR=7 -fpic -DUSE_ER7_UTILS_INTEGRATORS /home/ddexter/projects/trick/include/trick/files_to_ICG.hh
In file included from /home/ddexter/projects/trick/include/trick/files_to_ICG.hh:5:
In file included from include/trick/GetTimeOfDayClock.hh:19:
In file included from include/trick/Clock.hh:11:
In file included from /usr/include/c++/11/string:40:
In file included from /usr/include/c++/11/bits/char_traits.h:40:
In file included from /usr/include/c++/11/bits/postypes.h:40:
In file included from /usr/include/c++/11/cwchar:44:
/usr/include/wchar.h:155:24: error: 'malloc' attribute takes no arguments
  __attribute_malloc__ __attr_dealloc_free;
                       ^
/usr/include/x86_64-linux-gnu/sys/cdefs.h:691:30: note: expanded from macro '__attr_dealloc_free'
# define __attr_dealloc_free __attr_dealloc (__builtin_free, 1)
                             ^
/usr/include/x86_64-linux-gnu/sys/cdefs.h:690:21: note: expanded from macro '__attr_dealloc'
    __attribute__ ((__malloc__ (dealloc, argno)))
                    ^
In file included from /home/ddexter/projects/trick/include/trick/files_to_ICG.hh:5:
In file included from include/trick/GetTimeOfDayClock.hh:19:
In file included from include/trick/Clock.hh:11:
In file included from /usr/include/c++/11/string:55:
In file included from /usr/include/c++/11/bits/basic_string.h:6608:
In file included from /usr/include/c++/11/ext/string_conversions.h:41:
In file included from /usr/include/c++/11/cstdlib:75:
/usr/include/stdlib.h:566:5: error: 'malloc' attribute takes no arguments
__attr_dealloc_free;
^
/usr/include/x86_64-linux-gnu/sys/cdefs.h:691:30: note: expanded from macro '__attr_dealloc_free'

(and there are many more)

@keithvetter
Contributor Author

keithvetter commented Aug 29, 2023

Hi dandexter, the malloc error that you are seeing is what I'm seeing as well, and the main thing I want to fix. That said, the error is non-fatal, and the sims run.

I found that I could get rid of the error by adding this to the top of .../trick/include/trick/files_to_ICG.hh:

#define __attribute__(xyz)

That's a hack, but it worked for me. I did that after seeing this in /usr/include/x86_64-linux-gnu/sys/cdefs.h:

/* GCC and clang have various useful declarations
   that can  be made with the '__attribute__' syntax.  
   All of the ways we use this do fine if they are 
   omitted for compilers that don't understand it.  */
#if !(defined __GNUC__ || defined __clang__)
# define __attribute__(xyz)     /* Ignore */
#endif
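The cdefs.h guard above drops attributes for compilers that do not understand them at all. The same idea can be applied per feature. Below is a minimal sketch, assuming a hypothetical macro `DEMO_ATTR_DEALLOC` and hypothetical functions `demo_copy` and `demo_release` (none of these come from Trick or glibc), that uses the two-argument `malloc` attribute only where the compiler is known to accept it:

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical guard, modeled on the cdefs.h pattern: use the two-argument
// malloc attribute only on GCC 11 or newer. Clang also defines __GNUC__,
// so it is excluded explicitly.
#if defined(__GNUC__) && !defined(__clang__) && (__GNUC__ >= 11)
#  define DEMO_ATTR_DEALLOC(fn, argno) __attribute__((malloc(fn, argno)))
#else
#  define DEMO_ATTR_DEALLOC(fn, argno) /* ignored */
#endif

// Illustrative deallocator; it must be declared before it is named below.
void demo_release(char* p);

// On GCC 11+, this tells the compiler that the returned pointer should be
// released by passing it as argument 1 of demo_release.
DEMO_ATTR_DEALLOC(demo_release, 1)
char* demo_copy(const char* text);

void demo_release(char* p) { std::free(p); }

char* demo_copy(const char* text) {
    const std::size_t n = std::strlen(text) + 1;
    char* out = static_cast<char*>(std::malloc(n));
    if (out != nullptr) {
        std::memcpy(out, text, n);
    }
    return out;
}
```

With a guard like this, a parser that rejects `malloc` with arguments never sees the attribute, while GCC 11 keeps the allocator/deallocator pairing. It only helps for headers you control; it does not change what the system headers expand to.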

@keithvetter
Contributor Author

keithvetter commented Sep 1, 2023

Okay, after much reading (and talking with ChatGPT!) I think the fix for the malloc error is in LLVM/clang's court. I think we'll just have to wait for a fix and live with the non-fatal error until LLVM/clang accommodates gcc 11's malloc attribute with arguments.

I see no workaround. The last comment I gave didn't work for S_source.hh.

Here is a post on the gcc11 change:
https://developers.redhat.com/blog/2021/04/30/detecting-memory-management-bugs-with-gcc-11-part-1-understanding-dynamic-allocation

Here are two open LLVM issues to fix it:
llvm/llvm-project#51607
llvm/llvm-project#53152

@sharmeye
Contributor

Not closing this issue yet; we are waiting on a status change for the LLVM issues referenced in the previous post.
