Skip to content

Conversation

@cyangle
Copy link
Contributor

@cyangle cyangle commented Nov 1, 2025

Fixes #16210

@ysbaddaden @straight-shoota So this fix was also generated by google gemini AI in vscode.

I actually tried github copilot first but failed.

@Blacksmoke16 Blacksmoke16 added kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:compiler:interpreter labels Nov 1, 2025
@ysbaddaden ysbaddaden changed the title Try to fix bug #16210 Fix: interpreter handles self in inlined method with arguments Nov 1, 2025
Copy link
Contributor

@ysbaddaden ysbaddaden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. It makes sense to handle the args first, and then leave self on the stack, instead of handling args last and have them maybe mess with the stack.

@cyangle cyangle marked this pull request as ready for review November 1, 2025 13:28
@straight-shoota straight-shoota added this to the 1.19.0 milestone Nov 3, 2025
@cyangle
Copy link
Contributor Author

cyangle commented Nov 4, 2025

Here's the chat history with chain of thoughts to help understand the fix:

@ysbaddaden @straight-shoota

Chao Yang object.cr This repo contains code for crystal lang's compiler which is also an interpreter. The interpreter should behave the same as executing the compiled binary. Read and modify relevant code files to fix an error that only happens when interpreting a program while working fine with compiled binary. It incorrectly returns the method parameter as self, as comparison it works if not passing param to the not_nil! This suggests that something's wrong with stack operation or method scope.

Gemini
stat_1
Thinking
Considering the Bug's Core
I'm focusing now on the erroneous self return within the local/self_test.cr file. I'm leaning heavily towards a stack operation problem or potentially a scoping issue within the method itself. The user’s comment about src/object.cr seems like a valuable area to investigate, hinting at the core object representation.

Inspecting the Test File
Okay, I've got the local/self_test.cr file open, and I'm carefully reviewing the code to pinpoint exactly where the self return is going wrong. It's a series of simple tests, which is helpful. Now, I'm setting up a local Crystal environment to run these tests. I'll need to compile and interpret to see if the bug manifests consistently. I'm focusing on the different ways self is used within the test methods.

Okay, I will fix the interpreter bug. First, I'll examine local/self_test.cr to understand the issue, then analyze src/object.cr and related files to correct the interpreter's handling of self.

check
ReadFile...cal/self_test.cr
stat_minus_1
Thinking
Okay, I'm on it. The user's provided test case in local/self_test.cr fails in Crystal's interpreter but not when compiled, pointing to a self handling bug. The problem seems to be with the not_nil! method. I'll now find its definition in src/object.cr to debug the issue.

check
SearchTextsrc/object.cr def not_nil!
stat_minus_1
Thinking
Okay, I've confirmed two not_nil! definitions in src/object.cr, both returning self. The test case calls the one with a message, x.not_nil!("test"), where x is "hello". The interpreter incorrectly returns the argument "test" instead of self ("hello"), while the compiled code is correct. This points to a stack or scope issue in the interpreter's method handling. I'll now search the project for "interpreter" or "eval" to locate the relevant interpreter code.

check
SearchTextinterpreter
stat_minus_1
Thinking
"Interpreter" search results point to src/compiler/crystal/interpreter/. interpreter.cr seems central, with instructions.cr and compiler.cr also relevant. The bug—x.not_nil!("test") returning "test" instead of x—suggests the interpreter is incorrectly using the argument as the return value instead of self. I'll now examine interpreter.cr to understand method call handling.

check
ReadFile...r/interpreter.cr
stat_minus_1
Thinking
I've reviewed interpreter.cr. The interpret method's main loop executes opcodes from instructions.cr. The call macro sets up the stack for a method call, and the leave macro, with its size argument, handles the return value. The problem with x.not_nil!("test") returning "test" instead of x suggests the argument is on top of the stack at return, not self. I'll now examine instructions.cr to understand method call instruction generation.

close
ReadFileread_file.../instructions.cr ...ompiler/crystal/
stat_minus_1
Thinking
Wrong path. It's src/compiler/crystal/interpreter/instructions.cr. Reading it now.

check
ReadFile.../instructions.cr
stat_1
Thinking
Interpreting Instruction Details
I've been analyzing the Crystal interpreter's instruction set, specifically in instructions.cr. The call and leave instructions have grabbed my attention. The call instruction, crucial for method invocations, utilizes a compiled_def operand and invokes the call macro. I'm focusing on how compiled_def impacts call execution.

Tracing Return Values
I've pinpointed the issue: the leave instruction is popping an incorrect value. Delving into the Crystal interpreter, particularly instructions.cr, I see the call and leave instructions. I'm now examining how the compiler, in compiler.cr, handles returning self. The not_nil! method's structure helps clarify the problem, illustrating that the compiler must correctly manage self on the stack.

Okay, I've reviewed instructions.cr. The call and leave instructions, with their respective compiled_def and size operands, are key. The issue is the wrong value on the stack at leave: an argument instead of self. Since self is local variable 0, and not_nil!'s body is just self, the compiler should be getting local variable 0, pushing it, then calling leave. I'll now examine compiler.cr to see how self expressions are compiled.

check
ReadFile...eter/compiler.cr
stat_1
Thinking
Considering Self Compilation
I've been poring over the Crystal interpreter's compiler code, zeroing in on the handling of self. I've located relevant code in visit(node : Var), but I need to understand more before I can provide a comprehensive update. The initial assessment suggests I am headed in the right direction.

Investigating Call Compilation
I've examined the Call visitor in the compiler, particularly where self figures in the method dispatch process. The visit(node : Call) method appears crucial. This also touches on Var nodes, where self is explicitly handled. The interplay between these is becoming clearer.

Understanding put_self
I've homed in on the visit(node : Call) method and the crucial section dealing with calls to self, specifically how put_self(node: node) is handled. This is linked to the not_nil! call. I'm now tracing the execution path that leads to this instruction, which should clarify how self is placed on the stack.

Tracing request_value
The visit(node : Call) method, particularly the inlining of self-returning methods, is being re-examined. I've pinpointed that the call request_value(obj) is the crucial step. It seems that request_value simplifies down to accept(obj). I'm now verifying the stack manipulation implications. The path to visit(node: Var) is now much clearer.

Examining the Argument Handling
I've been meticulously examining the behavior of discard_value and its effect on argument evaluation. The visit(node : Call) method, when inlining self, calls discard_value on the arguments. I've confirmed that the arguments are indeed visited with @wants_value = false, preventing their values from being pushed onto the stack. I'm now certain that the issue does not reside there.

Debugging Discard Logic
I'm now tracing through the discard_value process again, and it appears my prior assumptions were flawed. Despite my previous conclusions, I'm now certain that discard_value does visit the node, which means "test" is being pushed to the stack. This contradicts the expected behavior, indicating a bug in how arguments are handled during inline self calls. I am now focusing on stack manipulation surrounding string literals.

Isolating the String Literal
I've realized the problem lies in the argument handling during the self inlining process. Specifically, the discard_value method's behavior with string literals seems to be the culprit. The investigation now centers on why "test" is still being pushed onto the stack, despite the discard_value instruction. This is likely due to an unexpected side effect or a misinterpretation of @wants_value within the context of string literal compilation.

Pinpointing the Culprit
I'm now zeroing in on the interaction between request_value and discard_value in the Call visitor. The core of the issue seems to be how the arguments of the not_nil! method are being handled when self is inlined. My attention has now drifted towards StringLiteral nodes and discard_value.

Rethinking Argument Evaluation
I'm now reconsidering the flow of execution within discard_value and re-evaluating my earlier conclusions. I'm focusing specifically on how the arguments of the not_nil! method are being handled and whether they're actually being discarded, as I initially assumed, or if a different behavior is occurring. This is where the issue seems to reside.

Tracing the Execution Path
I'm now back to the fundamental process of how arguments, particularly string literals, are handled during the self inlining phase of compilation. I am meticulously retracing each step in discard_value, now convinced that the stack is being incorrectly modified. I am determined to isolate the precise instruction causing the misbehavior.

Revising Stack Assumptions
I'm now deeply examining the stack's state and operation during argument evaluation within the context of the inlined self call. My focus is on the impact of discard_value on stack management, particularly in the case of string literals. I suspect an underlying issue with instruction sequencing related to the stack.

Refining the Theory
I'm now revisiting the logic within discard_value and its effect on the evaluation of arguments during the self inlining process. My current theory is that, despite my earlier assessments, the arguments of the not_nil! call are somehow still being evaluated, specifically with the string literal "test". It appears that the @wants_value = false setting might be misinterpreted in certain scenarios.

Isolating the Root Cause
I've isolated the core issue to the interaction between request_value for the receiver (x) and discard_value applied to the arguments of not_nil!. The string literal's behavior is the key. The problem is that the string literal "test" is incorrectly remaining on the stack, which is not expected. This seems to be the source of my challenges.

Reflecting on the Stack
I am now thoroughly immersed in the stack management aspects, particularly focusing on how request_value interacts with discard_value during the inlining of self-returning methods. The analysis is centered around the handling of string literals and how they are pushed to the stack. The investigation is now heavily involved in stack manipulation.

Tracing the Stack Again
I'm now revisiting the stack and the order of operations when inlining self and its method arguments. The focus has sharpened on the stack's state after request_value(obj) and just before the argument processing. The core challenge is the incorrect stack manipulation.

Analyzing Bytecode Generation
I'm now tracing the bytecode that is produced for the failing test case. Specifically, I'm examining the output of the compiler for x.not_nil!("test"). I'm going to look at the differences between the compiled output for both x.not_nil!("test") and x.not_nil!. This is the crucial step now.

Tracing Bytecode Generation
I'm now tracing the bytecode that is produced for the failing test case. I'm focusing on the get_local instruction for self and verifying the argument pushing, starting with x.not_nil!("test"). I am now comparing the results with the behavior of x.not_nil!. The bytecode difference should reveal the stack problem.

Modifying the Call Order
I've altered the argument and self evaluation order in the visit(node : Call) method, to test if that's influencing the stack. Instead of pushing self and then discarding arguments, I'm now discarding arguments first, and then pushing self. I am hoping to resolve the issue.

Testing the Order Change
I've just modified the call argument evaluation order within the visit(node : Call) method, re-ordering things. The change involves processing the arguments before pushing self. The goal is to correct a stack imbalance that appears to be caused by discard_value. I am now running tests to verify if this has resolved the issue.

Testing Order of Operation
I've just modified the call argument evaluation order within the visit(node : Call) method and the tests are running now. The reordering involves processing the arguments before pushing self onto the stack, addressing a potential stack imbalance caused by an error in discard_value. I am awaiting the test results.

Testing Order of Operation
I've just modified the call argument evaluation order within the visit(node : Call) method and the tests are running now. The reordering involves processing the arguments before pushing self onto the stack, addressing a potential stack imbalance caused by an error in discard_value. I am awaiting the test results.

I'm still unsure why discard_value is failing, but I have a fix. The issue is that when inlining self-returning methods, the argument is being left on the stack, causing it to be returned instead of self. My theory is that even though discard_value is used, the argument's value is still being pushed. To fix this, I'll reorder the operations in visit(node : Call). I'll evaluate and discard the arguments before pushing self onto the stack. This ensures self is the last value pushed, making it the return value, which aligns with standard call evaluation order. I'll now implement this change.

check
Editcompiler.cr -3, +3
stat_minus_1
Thinking
Fix applied. Now to verify. I'll run local/self_test.cr with the interpreter. I need to find the correct crystal command flag, so I'll check bin/crystal --help since the Makefile's interpreter_spec isn't what I need.

stat_minus_1
Thinking
I can't run make interpreter_spec. I've implemented the fix but can't verify it due to security constraints.

@cyangle
Copy link
Contributor Author

cyangle commented Nov 4, 2025

Calling discard_value in inlined self is causing a stack imbalance. While we have a workaround, the root cause is still a mystery. The Gemini AI is also refusing to dig deeper into the problem, which feels like 'cheating' to me.

@ysbaddaden
Copy link
Contributor

The full chat log ain't very useful. Only the following bit is valuable as it explains what was the issue and why it implemented the fix. With some edit/markdown style to improve readability it could have become the PR description:

I'm still unsure why discard_value is failing, but I have a fix.

The issue is that when inlining self-returning methods, the argument is being left on the stack, causing it to be returned instead of self. My theory is that even though discard_value is used, the argument's value is still being pushed.

To fix this, I'll reorder the operations in visit(node : Call). I'll evaluate and discard the arguments before pushing self onto the stack. This ensures self is the last value pushed, making it the return value, which aligns with standard call evaluation order.

@ysbaddaden
Copy link
Contributor

ysbaddaden commented Nov 4, 2025

Both patches (here and #16303) are interesting and welcomed. I'm impressed that an AI could pinpoint the actual issues to be honest 👌

Now, when pushing a PR, one should try to understand what the AI did so they can explain it back (that's the ideal situation) and to at least extract the most interesting bit out of the chat log, as I did above for example.

Here, it was sufficient (discard args before pushing self 👍), in #16303 it needed additional knowledge to validate the issue (harder) which @HertzDevil provided.

If unsure, I can only encourage to first comment on the original issue what the AI identified (only the conclusion, not the full chat log, please 🙇) and the potential patch.

@cyangle
Copy link
Contributor Author

cyangle commented Nov 4, 2025

@ysbaddaden It usually doesn't work with a single try with AI, I think it's still valuable to provide the AI prompt(user's initial input) to help others use it more effectively.

Specifically, in this prompt, the key is below:

It incorrectly returns the method parameter as self, as comparison it works if not passing param to the not_nil! This suggests that something's wrong with stack operation or method scope.

I was also shocked by the potential capabilities of AI, but it still needs humans to validate the results.

@straight-shoota straight-shoota merged commit ba2499c into crystal-lang:master Nov 5, 2025
40 checks passed
@cyangle cyangle deleted the fix_self_with_arg branch November 7, 2025 04:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:compiler:interpreter

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Interpreter segfault when resuming fiber from nilable ivar

4 participants