Skip to content

Conversation

Stevengre
Copy link
Contributor

@Stevengre Stevengre commented Sep 24, 2025

  • This PR is to tackle issue: [FEATURE] Hard-wire conversions between [u8;8] and u64 solana-token#6
    Changes includes:
  • kmir/src/kmir/kdist/mir-semantics/lemmas/kmir-lemmas.md: simp rules to make other simp rules' require clause can be satisfied.
  • kmir/src/kmir/kdist/mir-semantics/rt/data.md:
    • castAux function to bring TypeInfo into a function without configuration, allowing simplification rules for #cast.
    • A rewrite rule to transmute from [u8; 8] to u64
  • kmir/src/kmir/kdist/mir-semantics/symbolic/p-token.md:
    • make #asU8s's rule concrete to simplify the result for symbolic value
    • rules for #asU8s(X)

<types> TYPEMAP </types>
requires TY_TARGET in_keys(TYPEMAP)
andBool isTypeInfo(TYPEMAP[TY_TARGET])
[priority(160), preserves-definedness, symbolic]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous one without symbolic leads to integration test timeout.

I don't know if this works, but just give it a try.

If you find something suspicious to slow down the symbolic/concrete execution, please let me know.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be more specific. symbolic requires all variable bindings to be symbolic, but CASTKIND is never symbolic, and neither is TY_TARGET. Please try symbolic(1) here.
Interesting that it starts looping without the attribute...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting that it starts looping without the attribute...

Interesting... But why? Should we have an issue to store this problem and investigate it later?

Copy link
Contributor Author

@Stevengre Stevengre Sep 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please try symbolic(1) here.

I'm always confused with the number here. It starts with 1 rather than 0? But sure, let me try to constraint VAL.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used symbolic(VAL) here: 2aa8c0c

@Stevengre Stevengre marked this pull request as ready for review September 25, 2025 01:44
Comment on lines +130 to +131
rule [int-and-255-geq-0]: 0 <=Int _ &Int 255 => true [simplification, smt-lemma]
rule [int-and-255-lt-256]: _ &Int 255 <Int 256 => true [simplification, smt-lemma]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to make sure I understand, this is stating: the result of any integers bitwise Anded with 255 will be bounded between 0 (inclusive) and 256 (not inclusive)?

I don't really understand how this works under the hood, but straight away I wonder if there should be any restriction on the integers. How does this work with negative numbers? Does this make assumptions about 2's complement?

I might be highlighting some misunderstanding I have by asking these questions, but it feels a bit suspicious to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this work with negative numbers? Does this make assumptions about 2's complement?

I think the answer is yes. We've checked and introduced similar simp rules for riscv-semantics.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No restriction on the integers. The way &Int works in K is that negative numbers are assumed to have an infinite prefix of ...111111 but that prefix becomes zero when applying &Int N with a positive N. So negative numbers become positive.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These two simp rules (smt-lemma) are used to return a true for the following expression.
andBool 0 <=Int B0 andBool B0 <Int 256
If we don't have these two rules, the corresponding rule cannot be applied.

I'm not sure the implementation detail of the backend. I think it might be a rewrite to the substitued terms to get this true. Another thing that I guess the backend can do is to interpret it in SMT, and let SMT solver to return true or false. I think haskell backend should have both way to tackle the require clause, like apply rewrites first, then give it to Z3. Am I right? @jberthold

<types> TYPEMAP </types>
requires TY_TARGET in_keys(TYPEMAP)
andBool isTypeInfo(TYPEMAP[TY_TARGET])
[priority(160), preserves-definedness, symbolic]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be more specific. symbolic requires all variable bindings to be symbolic, but CASTKIND is never symbolic, and neither is TY_TARGET. Please try symbolic(1) here.
Interesting that it starts looping without the attribute...

<types> TYPEMAP </types>
requires TY_SOURCE in_keys(TYPEMAP) andBool isTypeInfo(TYPEMAP[TY_SOURCE])
andBool TY_TARGET in_keys(TYPEMAP) andBool isTypeInfo(TYPEMAP[TY_TARGET])
[priority(160), preserves-definedness, symbolic]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, please try symbolic(1)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used symbolic(VAL) here: 2aa8c0c

Comment on lines 1307 to 1309
// Transmute from [u8; 8] to u64
rule <k> #cast(Range(ListItem(Integer(B0:Int, 8, false))
ListItem(Integer(B1:Int, 8, false))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add more rules for the case of casting a byte array to an unsigned integer. And this could be handled by a function inside numbers.md maybe, so we don't keep adding to data.md?
Not going to block merging this single case first if it gets us further, though, but we should try to structure the K code more..

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've checked the semantics in numbers.md and still think this rule should be here for now, unless we move all the #cast rules into a single cell. My reason for this is:

  1. Currently, I don't understand the main point of number.md. It seems to gather things about Integer and Float. But for example, the number cast is still in data.md. And it will become hard to read if we move the number cast from data.md to numbers.md.
  2. All the #cast rules are in data.md. It's easy to find the rule and understand the rules in this location.

I think it should be a refactor to move all the cast things into one file. And it will be easy to refactor if this rule is in this location.

| #asU8List ( List , Int ) [function]
// -------------------------------------
rule #asU8s(X) => #asU8List(.List, X)
rule #asU8s(X) => #asU8List(.List, X) [concrete]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good idea. Requires testing, of course, but we want to avoid building up the list of bit-shifts and enable round-tripping, which this should do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works on test_process_burn.

Comment on lines 1325 to 1334
requires TY_TARGET in_keys(TYPEMAP)
andBool {TYPEMAP[TY_TARGET]}:>TypeInfo ==K typeInfoPrimitiveType(primTypeUint(uintTyU64))
andBool 0 <=Int B0 andBool B0 <Int 256
andBool 0 <=Int B1 andBool B1 <Int 256
andBool 0 <=Int B2 andBool B2 <Int 256
andBool 0 <=Int B3 andBool B3 <Int 256
andBool 0 <=Int B4 andBool B4 <Int 256
andBool 0 <=Int B5 andBool B5 <Int 256
andBool 0 <=Int B6 andBool B6 <Int 256
andBool 0 <=Int B7 andBool B7 <Int 256
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this rule should only fire when the B are concrete

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea and added concrete(B0,B1,B2,B3,B4,B5,B6,B7). But after adding this, the following term will be unchanged, is that what we want? Actually, I think it should be simplier if we do this rewrite for the symbolic Bxs.

ListItem (typedValue (
    thunk ( #cast ( Range ( ListItem (Integer ( ARG_UINT214:Int , 8 , false ))
                ListItem (Integer ( ARG_UINT215:Int , 8 , false ))
                ListItem (Integer ( ARG_UINT216:Int , 8 , false ))
                ListItem (Integer ( ARG_UINT217:Int , 8 , false ))
                ListItem (Integer ( ARG_UINT218:Int , 8 , false ))
                ListItem (Integer ( ARG_UINT219:Int , 8 , false ))
                ListItem (Integer ( ARG_UINT220:Int , 8 , false ))
                ListItem (Integer ( ARG_UINT221:Int , 8 , false ))
         ) , castKindTransmute , ty ( 600087 ) , ty ( 600086 ) ) ) ,
    ty ( 600086 ),
    mutabilityNot )
)

After this analysis, I would like to delete this concrete attribute now. What about you?


```k
syntax Evaluation ::= #cast( Evaluation, CastKind, MaybeTy, Ty ) [strict(1)]
syntax Evaluation ::= castAux ( Value, CastKind, MaybeTypeInfo, TypeInfo ) [function]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add no-evaluators to this function because it does not have equations, only simplifications

jberthold
jberthold previously approved these changes Sep 25, 2025
@Stevengre Stevengre self-assigned this Sep 25, 2025
Comment on lines 1070 to 1074
syntax Evaluation ::= castAux ( Value, CastKind, MaybeTypeInfo, TypeInfo ) [function, no-evaluators]
syntax MaybeTypeInfo ::= "Maybe" "(" TypeInfo ")" | "TypeInfoUnknown"
// ----------------------------------------------------------------
rule <k> #cast(VAL, CASTKIND, TyUnknown, TY_TARGET)
=> castAux(VAL, CASTKIND, TypeInfoUnknown, {TYPEMAP[TY_TARGET]}:>TypeInfo)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

two small suggestions:

  1. syntax MaybeTypeInfo ::= TypeInfo | NoTypeInfo
  2. maybe these rules should move to p-token?

Copy link
Contributor Author

@Stevengre Stevengre Sep 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I applied the first one.

For the second one, I think it will be good to refactor like this way. Without castAux, we cannot introduce any simp rules for #cast. So, I want to leave these rules here.

I'll mark the pr as automerge, but please let me know if you have any other thoughts on it.

@jberthold
Copy link
Member

Something is wrong with the latest changes in the PR, the integration test is running for >2h now. Removing automerge

@jberthold jberthold dismissed their stale review September 25, 2025 10:58

Code caused the integration test proofs to loop, revision in progress

Introduce concrete rule for castAux for thunk
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants