libtcc1.a: abort, bad checksum #205

Open
nickalcock opened this issue Sep 21, 2022 · 10 comments

@nickalcock

Seen with trunk in my first attempt to do a live-bootstrap with this package. 64-bit x86-64 box, building with bwrap via:

PATH=/usr/src/live-bootstrap/bwrap:$PATH ./rootfs.py --bwrap

(I have to point PATH through a directory that contains a non-setuid bwrap because the setuid one refuses to allow CAP_SETPCAP wrapping.)

Here's the end of the bootstrap process, including at least one thing that had a correctly-validated checksum:

 +> cp boot3-tcc /usr/bin 
 +> chmod 755 /usr/bin/boot3-tcc 
 +> cd ../mes-aa5f1533e1736a89e60d2c34c2a0ab3b01f8d037 
 +> boot3-tcc -c -D HAVE_CONFIG_H=1 -I include -I include/linux/x86 -o /usr/lib/mes/crt1.o lib/linux/x86-mes-gcc/crt1.c 
 +> boot3-tcc -c -D HAVE_CONFIG_H=1 -I include -I include/linux/x86 -o /usr/lib/mes/crtn.o lib/linux/x86-mes-gcc/crtn.c 
 +> boot3-tcc -c -D HAVE_CONFIG_H=1 -I include -I include/linux/x86 -o /usr/lib/mes/crti.o lib/linux/x86-mes-gcc/crti.c 
 +> boot3-tcc -c -D HAVE_CONFIG_H=1 -D HAVE_FLOAT=1 -D HAVE_LONG_LONG=1 -I include -I include/linux/x86 lib/libtcc1.c 
 +> boot3-tcc -c -D TCC_TARGET_I386=1 ../tcc-0.9.26-1136-g5bba73cc/lib/libtcc1.c 
 +> boot3-tcc -ar cr /usr/lib/mes/tcc/libtcc1.a libtcc1.o 
 +> boot3-tcc -c -D HAVE_CONFIG_H=1 -I include -I include/linux/x86 -o unified-libc.o unified-libc.c 
unified-libc.c:2000: warning: assignment makes integer from pointer without a cast
unified-libc.c:2000: warning: assignment makes pointer from integer without a cast
unified-libc.c:2284: warning: SYS_exit redefined
unified-libc.c:2502: warning: SYS_write redefined
unified-libc.c:5877: warning: assignment from incompatible pointer type
unified-libc.c:6767: warning: assignment from incompatible pointer type
 +> boot3-tcc -ar cr /usr/lib/mes/libc.a unified-libc.o 
 +> cd ../tcc-0.9.26-1136-g5bba73cc 
 +> boot3-tcc -version 
tcc version 0.9.26 (i386 Linux)
 +> boot3-tcc -g -v -static -o boot4-tcc -D BOOTSTRAP=1 -D HAVE_BITFIELD=1 -D HAVE_FLOAT=1 -D HAVE_LONG_LONG=1 -D HAVE_SETJMP=1 -I . -I /usr/include -D TCC_TARGET_I386=1 -D CONFIG_TCCDIR="/usr/lib/mes/tcc" -D CONFIG_TCC_CRTPREFIX="/usr/lib/mes" -D CONFIG_TCC_ELFINTERP="/mes/loader" -D CONFIG_TCC_LIBPATHS="/usr/lib/mes:/usr/lib/mes/tcc" -D CONFIG_TCC_SYSINCLUDEPATHS="/usr/include" -D TCC_LIBGCC="/usr/lib/mes/libc.a" -D TCC_LIBTCC1="libtcc1.a" -D CONFIG_TCCBOOT=1 -D CONFIG_TCC_STATIC=1 -D CONFIG_USE_LIBGCC=1 -D TCC_MES_LIBC=1 -D TCC_VERSION="0.9.26" -D ONE_SOURCE=1 -L . tcc.c 
tcc version 0.9.26 (i386 Linux)
-> tcc.c
<- boot4-tcc
 +> cp boot4-tcc /usr/bin 
 +> chmod 755 /usr/bin/boot4-tcc 
 +> cd ../mes-aa5f1533e1736a89e60d2c34c2a0ab3b01f8d037 
 +> boot4-tcc -c -D HAVE_CONFIG_H=1 -I include -I include/linux/x86 -o /usr/lib/mes/crt1.o lib/linux/x86-mes-gcc/crt1.c 
 +> boot4-tcc -c -D HAVE_CONFIG_H=1 -I include -I include/linux/x86 -o /usr/lib/mes/crtn.o lib/linux/x86-mes-gcc/crtn.c 
 +> boot4-tcc -c -D HAVE_CONFIG_H=1 -I include -I include/linux/x86 -o /usr/lib/mes/crti.o lib/linux/x86-mes-gcc/crti.c 
 +> boot4-tcc -c -D HAVE_CONFIG_H=1 -D HAVE_FLOAT=1 -D HAVE_LONG_LONG=1 -I include -I include/linux/x86 lib/libtcc1.c 
 +> boot4-tcc -c -D TCC_TARGET_I386=1 ../tcc-0.9.26-1136-g5bba73cc/lib/libtcc1.c 
 +> boot4-tcc -ar cr /usr/lib/mes/tcc/libtcc1.a libtcc1.o 
 +> boot4-tcc -c -D HAVE_CONFIG_H=1 -I include -I include/linux/x86 -o unified-libc.o unified-libc.c 
unified-libc.c:2000: warning: assignment makes integer from pointer without a cast
unified-libc.c:2000: warning: assignment makes pointer from integer without a cast
unified-libc.c:2284: warning: SYS_exit redefined
unified-libc.c:2502: warning: SYS_write redefined
unified-libc.c:5877: warning: assignment from incompatible pointer type
unified-libc.c:6767: warning: assignment from incompatible pointer type
 +> boot4-tcc -ar cr /usr/lib/mes/libc.a unified-libc.o 
 +> cd ../tcc-0.9.26-1136-g5bba73cc 
 +> boot4-tcc -version 
tcc version 0.9.26 (i386 Linux)
 +> boot4-tcc -g -v -static -o boot5-tcc -D BOOTSTRAP=1 -D HAVE_BITFIELD=1 -D HAVE_FLOAT=1 -D HAVE_LONG_LONG=1 -D HAVE_SETJMP=1 -I . -I /usr/include -D TCC_TARGET_I386=1 -D CONFIG_TCCDIR="/usr/lib/mes/tcc" -D CONFIG_TCC_CRTPREFIX="/usr/lib/mes" -D CONFIG_TCC_ELFINTERP="/mes/loader" -D CONFIG_TCC_LIBPATHS="/usr/lib/mes:/usr/lib/mes/tcc" -D CONFIG_TCC_SYSINCLUDEPATHS="/usr/include" -D TCC_LIBGCC="/usr/lib/mes/libc.a" -D TCC_LIBTCC1="libtcc1.a" -D CONFIG_TCCBOOT=1 -D CONFIG_TCC_STATIC=1 -D CONFIG_USE_LIBGCC=1 -D TCC_MES_LIBC=1 -D TCC_VERSION="0.9.26" -D ONE_SOURCE=1 -L . tcc.c 
tcc version 0.9.26 (i386 Linux)
-> tcc.c
<- boot5-tcc
 +> cp boot5-tcc /usr/bin 
 +> chmod 755 /usr/bin/boot5-tcc 
 +> cd ../mes-aa5f1533e1736a89e60d2c34c2a0ab3b01f8d037 
 +> boot5-tcc -c -D HAVE_CONFIG_H=1 -I include -I include/linux/x86 -o /usr/lib/mes/crt1.o lib/linux/x86-mes-gcc/crt1.c 
 +> boot5-tcc -c -D HAVE_CONFIG_H=1 -I include -I include/linux/x86 -o /usr/lib/mes/crtn.o lib/linux/x86-mes-gcc/crtn.c 
 +> boot5-tcc -c -D HAVE_CONFIG_H=1 -I include -I include/linux/x86 -o /usr/lib/mes/crti.o lib/linux/x86-mes-gcc/crti.c 
 +> boot5-tcc -c -D HAVE_CONFIG_H=1 -D HAVE_FLOAT=1 -D HAVE_LONG_LONG=1 -I include -I include/linux/x86 lib/libtcc1.c 
 +> boot5-tcc -c -D TCC_TARGET_I386=1 ../tcc-0.9.26-1136-g5bba73cc/lib/libtcc1.c 
 +> boot5-tcc -ar cr /usr/lib/mes/tcc/libtcc1.a libtcc1.o 
 +> boot5-tcc -c -D HAVE_CONFIG_H=1 -I include -I include/linux/x86 -o unified-libc.o unified-libc.c 
unified-libc.c:2000: warning: assignment makes integer from pointer without a cast
unified-libc.c:2000: warning: assignment makes pointer from integer without a cast
unified-libc.c:2284: warning: SYS_exit redefined
unified-libc.c:2502: warning: SYS_write redefined
unified-libc.c:5877: warning: assignment from incompatible pointer type
unified-libc.c:6767: warning: assignment from incompatible pointer type
 +> boot5-tcc -ar cr /usr/lib/mes/libc.a unified-libc.o 
 +> boot5-tcc -version 
tcc version 0.9.26 (i386 Linux)
 +> cp /usr/bin/boot5-tcc /usr/bin/tcc 
 +> chmod 755 /usr/bin/tcc 
 +> cp /usr/bin/tcc /usr/bin/tcc-0.9.26 
 +> chmod 755 /usr/bin/tcc-0.9.26 
 +> tcc -c -D HAVE_CONFIG_H=1 -I include -I include/linux/x86 lib/posix/getopt.c 
 +> tcc -ar cr /usr/lib/mes/libgetopt.a getopt.o 
 +> cd ../.. 
 +> if match xFalse xTrue 
/usr/bin/mes-tcc: OK
/usr/bin/boot0-tcc: OK
/usr/bin/boot1-tcc: OK
/usr/bin/boot2-tcc: OK
/usr/bin/boot3-tcc: OK
/usr/bin/boot4-tcc: OK
/usr/bin/tcc: OK
/usr/lib/mes/libc.a: OK
/usr/lib/mes/libgetopt.a: OK
/usr/lib/mes/crt1.o: OK
/usr/lib/mes/crti.o: OK
/usr/lib/mes/crtn.o: OK
/usr/lib/mes/tcc/libtcc1.a: FAILED
Wanted:   ac11f09698f092ed76ae40ebcd56cf3f2b903ea1333ef7537a00673dd6f73da7
Received: 2adce9b440aefc6dd458b046582f3584de224dc2e0cf3cf1c0c17e98182beaa5
Subprocess error 1
ABORTING HARD
Subprocess error 1
ABORTING HARD
Subprocess error 1
ABORTING HARD
Subprocess error
ABORTING HARD
Bootstrapping failed

I don't know where to start debugging this because I don't have a working instance to compare against. Clearly codegen is broken, but where? (I can provide the probably-broken binaries to anyone who wants them.)

@stikonas
Collaborator

stikonas commented Sep 21, 2022

Yes, it would be good to compare binaries with good ones. diffoscope can often show what's wrong.
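Where diffoscope is not at hand, a very rough first check is to find the offset at which two builds of the same file start to differ. The following is only a minimal sketch of that idea, not a substitute for diffoscope; the function name `first_difference` is made up for illustration.

```python
from typing import Optional


def first_difference(a: bytes, b: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical.

    If one input is a strict prefix of the other, the returned offset is
    the length of the shorter input (i.e. where the extra bytes begin).
    """
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        return min(len(a), len(b))
    return None
```

In practice the two inputs would be read with something like `open(path, "rb").read()` from a known-good and a suspect artifact; a difference that starts exactly at the end of the shorter file points at appended bytes rather than changed content.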

Do you have root on some box? Then you could also check whether chroot and qemu modes work.

P.S. also feel free to visit #bootstrappable on libera.chat if you want more interactive help.

@nickalcock
Author

Exactly. I'll see if I can generate good binaries via qemu, though I'd be surprised if that worked. I do wonder if the problem is file sort order or something similar, i.e. down to the underlying filesystem (xfs on the host versus whatever qemu mode uses). Hmm, that's easy to test; will do. (I have root across this local network, so that should be good enough. It looks like qemu mode doesn't do anything dangerous or crazy. Well, more dangerous and crazy than this project as a whole :) )

@nickalcock
Author

Confirmed that it only goes wrong under --bwrap. Still trying to figure out where --qemu mode writes to so I can diffoscope the artifacts: it's not writing to tmp or sysc/tmp or even sysc/tmp/disk.img even with the tmpfs mounting forcibly disabled.

@stikonas
Collaborator

> Confirmed that it only goes wrong under --bwrap. Still trying to figure out where --qemu mode writes to so I can diffoscope the artifacts: it's not writing to tmp or sysc/tmp or even sysc/tmp/disk.img even with the tmpfs mounting forcibly disabled.

qemu mode runs in tmpfs during the sysa stage and later on the virtual disk in the sysc stage, so getting artifacts out is a bit tricky. You would have to transfer them to sysc first... Perhaps it would be easier if I published a good file somewhere.

What about chroot mode? Does that give you the correct checksum?

But it might indeed be related to the underlying filesystem...

@stikonas
Collaborator

@nickalcock
Author

A bit more info: not creating a tmpfs in the bwrap stage makes the error go away! (to be replaced by another error, which isn't too surprising after I did that). So this must be a difference in the behaviour of tmpfs between the sysa qemu image (which for me is based on a 5.10.0 defconfig kernel) and the host kernel (5.16.19 tmpfs, 64-bit). I'll arrange to copy the file in question off the tmpfs before deleting it... let's see.

@nickalcock
Author

Failure confirmed intermittent, happening about 50% of the time. The difference is that my faulty copy of libtcc1.a has four more null bytes at the end. This seems to be pure padding: it's not represented in the size of the archive's lone element at all. This almost has to be something up with tcc 0.9.26's tcc_tool_ar, I'd think.
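To make the "padding not accounted for by any member" observation checkable, here is a minimal sketch that walks an archive and returns whatever bytes are left after the last well-formed member. It assumes the common ar layout (an 8-byte `!<arch>\n` magic, 60-byte member headers ending in a backquote and newline, a decimal size field at header bytes 48..57, and members padded to an even offset). It has not been checked against what tcc 0.9.26's tcc_tool_ar actually emits, and the names `ar_trailing_bytes` and `build_member` are hypothetical.

```python
AR_MAGIC = b"!<arch>\n"
HEADER_LEN = 60
HEADER_END = b"`\n"


def ar_trailing_bytes(data: bytes) -> bytes:
    """Return the bytes that follow the last well-formed ar member."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    pos = len(AR_MAGIC)
    while pos + HEADER_LEN <= len(data):
        header = data[pos:pos + HEADER_LEN]
        if header[58:60] != HEADER_END:
            break  # not a member header: the rest is unaccounted for
        size = int(header[48:58].decode("ascii").strip())
        end = pos + HEADER_LEN + size
        if end > len(data):
            raise ValueError("member extends past end of file")
        pos = end + (end % 2)  # member data is padded to an even offset
    return data[pos:]


def build_member(name: str, payload: bytes) -> bytes:
    """Build one ar member (header, data, padding) for demonstration only."""
    header = f"{name:<16}{0:<12}{0:<6}{0:<6}{'644':<8}{len(payload):<10}`\n"
    body = header.encode("ascii") + payload
    if len(payload) % 2:
        body += b"\n"
    return body
```

For a clean archive the function returns an empty bytes object. For an archive with extra bytes appended after the last member, it returns exactly those bytes, so a result of four null bytes would match the symptom described above.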

@stikonas
Collaborator

Hmm, that is strange. I was expecting something like an ordering issue and not padding.

Perhaps another useful data point would be to check whether that happens with all libtcc1.a stages. Unfortunately, we only checksum the last one, but the mescc->tcc-0.9.26 step actually involves 5 rebuilds.

@nickalcock
Author

Good idea: I can use the same "stuff a cp into sysc_image/tmp" kludge I used for this to smuggle all five out in both the failing build and the qemu build. I'll look at it once this stupid cold has gone away :(
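Once the copies from both builds are out, comparing them stage by stage could look roughly like the sketch below. It assumes each build's copies were saved under matching file names in two directories; that layout and the function names are assumptions made for illustration, not something live-bootstrap provides.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def differing_files(good_dir, bad_dir):
    """List file names in good_dir that are missing or different in bad_dir."""
    good_dir, bad_dir = Path(good_dir), Path(bad_dir)
    mismatches = []
    for good in sorted(good_dir.iterdir()):
        if not good.is_file():
            continue
        bad = bad_dir / good.name
        if not bad.is_file() or sha256_of(good) != sha256_of(bad):
            mismatches.append(good.name)
    return mismatches
```

The earliest stage that shows up in the returned list is the first point at which the two builds diverge, which narrows down which rebuild to look at.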

@nanonyme
Contributor

nanonyme commented Dec 1, 2023

@nickalcock does this still reproduce?
