Segfault with very large extended glob patterns #207
Comments
I don't think I can fix this as I believe it involves rewriting the libast pattern matching engine to avoid using so much recursion. That is simply beyond me. Someone else will need to have a go at that. |
The excessive recursion happens here. (Ignore # 0 as that just happens to be where the stack finally overflowed that time. Recursion backtrace starts from # 1.)
|
See also #144 which was due to the same design flaw. The fix there was to simply refuse to match excessively long command names to a regex. Of course such a workaround can't apply here. |
Since this involves a design flaw in the regex engine I'm not sure if this is of much help, but I'll note that this crash first occurs in 2005-05-22 ksh93q+ (earlier versions don't segfault when running the reproducer). Edit: Using the previous version of expand.c with |
I think there are 2 problems here.
This is fixable by 2 means
Note that ksh handles ERE correctly on long input
Only [[ == ]] is abused. I could proceed with any of those fixes if you want me to. Cheers. |
I just checked bash: it crashes the same way. Its limit is higher, but it still core dumps.
With a stack overflow. So maybe we should just leave it as it is (close it?) with 'bash bug compatible' status. My preferred fix is 2.2 though, if we want to address this. |
This is not correct. Globs in Bourne and derived shells have never been exclusively for path names, they've always been used to match arbitrary values (e.g., in the
My understanding is that it is basically impossible to check this reliably at runtime in C. Do you have knowledge otherwise? If this can be done well, it would be good to do this in sh_funscope() too. But, AFAIK, no shell does this, which makes me suspect it's not feasible. But, returning to the bug at hand, there should also be a 3.3: somehow find a way to modify the algorithm so that it doesn't use four recursive C function calls for every consecutive |
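To illustrate what "not recursing per consecutive character" could look like in the simplest case, here is a minimal C sketch. It is not libast code: `match_plus_iter` is a hypothetical helper that only handles a pattern of the form `+(c)` for a single literal character `c`, and it uses a loop so that stack usage stays constant regardless of input length. A general glob engine with alternation and backtracking would need more than this (for example an explicit heap-allocated stack).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns 1 if s consists of one or more repetitions of c, else 0.
 * A run of identical characters is consumed by a loop, so the C stack
 * depth does not grow with the length of s.
 */
static int match_plus_iter(const char *s, char c)
{
    size_t n = 0;

    assert(s != NULL);
    /* c == '\0' can never match a non-empty run, so stop immediately. */
    while (c != '\0' && s[n] == c)
        n++;
    /* Match only if at least one c was seen and nothing else follows. */
    return n > 0 && s[n] == '\0';
}
```

With this shape, a subject string of 6801 or 680100 `a` characters costs the same amount of stack; only the running time grows.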
You are right, and I was wrong. I learned the semantics of globbing the hard way. Bear in mind that English is not my mother tongue: it took me a while to understand that globbing is yet another kind of RE, like RE, ERE and PCRE. Because it is the only one not named RE, I thought it was special, i.e. for file path matching :-) Regarding 3.3, I'm not sure it is good enough. Even if the 4 functions are flattened (I'm not even sure they can be), it would just bump the number of iterations before the crash: basically 4x more calls would be possible, but overflowing the stack is still what will happen. Since crashing is bad IMHO, a developer may want to recover (bookkeeping) before exiting. A limit on recursion depth is what is done so far for shell function calls: the recursion breaks before crashing the whole thing. That's why I proposed setting a hard limit on the RE recursion level. This is simple to do, costs an integer check on each recursive function entry, and doesn't change the algorithm (besides the depth check). |
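The depth check proposed above can be sketched in C as follows. This is a toy example under stated assumptions, not the libast matcher: `match_plus`, `MAX_MATCH_DEPTH` and the `match_result` values are all hypothetical names, and the limit of 4096 is an arbitrary placeholder. The function deliberately recurses once per character to mimic how stack depth grows with input length, and it returns a distinct "too deep" result instead of overflowing the stack.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limit; a real engine would derive this from the stack size. */
#define MAX_MATCH_DEPTH 4096

enum match_result { MATCH_TOO_DEEP = -1, MATCH_NO = 0, MATCH_YES = 1 };

/*
 * Toy matcher for +(c): one or more repetitions of the character c.
 * One recursive call per input character, guarded by an integer check
 * on entry. The algorithm itself is unchanged by the check.
 */
static enum match_result match_plus(const char *s, char c, int depth)
{
    assert(s != NULL);
    if (depth > MAX_MATCH_DEPTH)
        return MATCH_TOO_DEEP;      /* give up before the stack can overflow */
    if (*s != c)
        return MATCH_NO;            /* also covers the empty string */
    if (s[1] == '\0')
        return MATCH_YES;           /* consumed the last character */
    return match_plus(s + 1, c, depth + 1);
}
```

A caller would start with `match_plus(subject, 'a', 0)` and treat `MATCH_TOO_DEEP` as an error to report (for instance, a "pattern too complex" message) rather than as a failed match, which leaves room for cleanup before exiting.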
There is another excessive recursion problem in When the regression tests are run on a ksh compiled with AddressSanitizer a.k.a. ASan (which increases C heap usage and thus reduces maximum recursion before segfault), this test fails: ksh/src/cmd/ksh93/tests/arrays.sh Lines 609 to 630 in 70a0032
The crash happens on line 628, as the When we remove the 2>/dev/null from line 630 to see what's going on, we see this ASan stack trace:
So the excessive recursion happens in regcomp.c on line 2683: ksh/src/lib/libast/regex/regcomp.c Line 2683 in 70a0032
When the |
To bring this memory fault back to light, as it was buried under other material from att#1464 by @jghub: on my macOS box, this still happens:
A little redo of the test to help me find my memory fault spot:
$ x=6801 ksh -c $'[[ $( printf \'a%.0s\' {0..$x} ) == +(a) ]] && print match!'
Memory fault
with a similar original test:
$ ksh -c 'v=a; s=; for ((i=0; i < 6801; ++i)); do s+=$v; done; [[ $s == +($v) ]]'
Memory fault
Not sure whether this can still be fixed. I would hope that if some sort of recursion limit (as suggested in the original issue) were reached, ksh would error out with a message.