Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

builtin.inc: fix build with -Woverlength-strings #2979

Closed
wants to merge 4 commits into from

Conversation

emanuele6
Copy link
Member

C99 only allows string literals with length up to 4095 characters.

This patch introduces a gen_jq_builtins generator program that generates the jq_builtins array in builtin.c as an array literal instead of a string literal to make compilation not fail when building with -Woverlength-strings.

Also add -Woverlength-strings to CFLAGS to verify that the fix works.

Fixes #1481

@emanuele6
Copy link
Member Author

I need to add src/gen_jq_builtins.c to dist

@emanuele6 emanuele6 force-pushed the literalarray branch 2 times, most recently from 9eaa7b5 to 75d0c92 Compare December 11, 2023 18:18
@emanuele6
Copy link
Member Author

I think I need to pass more flags to make it build on non-x86 systems.
And I also need to define main() in a special way to make it build on windows.

@emanuele6 emanuele6 marked this pull request as draft December 11, 2023 18:34
@emanuele6 emanuele6 added the lint Code cleanup; style fixes; small refactors; dead code removal; etc. label Dec 11, 2023
@emanuele6
Copy link
Member Author

emanuele6 commented Dec 11, 2023

I fixed the macos (amd64) build =)

@emanuele6 emanuele6 force-pushed the literalarray branch 3 times, most recently from d052f52 to d68438c Compare December 11, 2023 19:15
Makefile.am Outdated Show resolved Hide resolved
@wader
Copy link
Member

wader commented Dec 11, 2023

Huh strange failure cannot execute binary file: Exec format error, can't really see how that could happen... cross compile flags somehow?

@emanuele6
Copy link
Member Author

emanuele6 commented Dec 11, 2023

I fixed the windows builds, now I will try to remove -lshlwapi

@wader
Copy link
Member

wader commented Dec 11, 2023

Remembered now that we only run and tests on amd64 https://github.com/jqlang/jq/blob/master/.github/workflows/ci.yml#L168-L172 but it does not correlate with the failured ci jobs hmm. Thinking if gen_builtins is built as cross target binary somehow?

@emanuele6
Copy link
Member Author

It works without -lshlwapi, we should check if jq actually needs it later.

@emanuele6 emanuele6 force-pushed the literalarray branch 4 times, most recently from 9782323 to 02c8e5b Compare December 11, 2023 20:35
@emanuele6 emanuele6 force-pushed the literalarray branch 4 times, most recently from 49f60fc to 760bfe8 Compare December 11, 2023 22:08
@emanuele6
Copy link
Member Author

CI / linux (mipsr6el):

  CCLD     src/gen_jq_builtins
  GEN      src/version.h
file -- src/gen_jq_builtins
src/gen_jq_builtins: ELF 32-bit LSB executable, MIPS, MIPS32 rel6 version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-mipsn8.so.1, BuildID[sha1]=6ae16e6a26ff4d90fa6e4c3e1b4dcdb949d11a63, for GNU/Linux 4.5.0, not stripped
  GEN      src/builtin.inc
/bin/bash: line 1: src/gen_jq_builtins: cannot execute binary file: Exec format error

🤔

@emanuele6
Copy link
Member Author

I think CIs are all running on amd64, and just cross compiling for a given architecture

@wader
Copy link
Member

wader commented Dec 11, 2023

configure: WARNING: using cross tools not prefixed with host triplet 🤔

@emanuele6
Copy link
Member Author

We should use an host compiler to build gen_jq_builtins somehow

@wader
Copy link
Member

wader commented Dec 11, 2023

We should use an host compiler to build gen_jq_builtins somehow

Hmm define a CC_HOST in the ci env?

@emanuele6 emanuele6 force-pushed the literalarray branch 2 times, most recently from 7480327 to 189586c Compare December 11, 2023 23:02
C99 only allows string literals with length up to 4095 characters.

This patch introduces a gen_jq_builtins generator program that generates
the jq_builtins array in builtin.c as an array literal instead of a
string literal to make compilation not fail when building with
-Woverlength-strings.

Add -Woverlength-strings to CFLAGS to verify that the fix works.

Also introduce HOSTCC autoconf variable to get a C compiler to build
gen_jq_builtin on the host when cross-compiling.

Fixes jqlang#1481
@emanuele6 emanuele6 marked this pull request as ready for review December 11, 2023 23:39
@emanuele6 emanuele6 removed windows lint Code cleanup; style fixes; small refactors; dead code removal; etc. labels Dec 11, 2023
#include <stringapiset.h>
#include <wchar.h>

#define DEFINE_MAIN(ARGC, ARGV) \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't notice this code before. Is this used when building native for windows, not via emulation layers like msys2 or cygwin?

Copy link
Member Author

@emanuele6 emanuele6 Dec 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, when you compile with -municode on windows, wmain() is called instead of main() as entry point with wchar_t* strings instead of char* strings; then that code in wmain() converts the wchar_t* strings to char* strings encoding with utf-8.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume it is probably an hack to make the program get the correct utf-8 encoding on unicode characters you pass through cmd.exe because windows encodings are weird.
For example, recently, I added a test to curl that verifies that utf-8 bytes get correctly encoded to JSON; and I had to modify my test to get the utf-8 string from a file instead of from command line arguemnts because the windows CI was not passing the right bytes to curl. :/
curl/curl#12434

@emanuele6
Copy link
Member Author

I didn't anticipate that adding a generator program would be so complex.
I think this PR is too complex for what it is trying to solve, and creates annoyances when cross-compiling.
It is probably cleaner to just rewrite the def math: error; functions in C instead of jq, and just serialise src/builtin.jq to a literal C array with sed as I mentioned in #1481.

@emanuele6 emanuele6 closed this Dec 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

builtin.inc exceeds maximum length for C99
2 participants