Logbook 2025 H1
- When working on partial commits I noticed something weird: occasionally you get an H4 validator error from evaluating the increment tx, and it seems to have something to do with the amount of generated tokens for the deposit.
- If I just seed more ada the test is always green, but if I deposit 2 or 3 ada together with some tokens then I experience an H4 or `HeadValueIsNotPreserved` error.
- I understand that transferring large quantities of tokens can impact the tx fee, but why we would get the validator error beats me!
- What I can do is try to debug and figure out what is causing this. I have a pretty good idea where this happens (`Wallet.hs`): when we try to `estimateScriptsCost`, for some reason we get `ErrScriptExecutionFailed`, which is then re-mapped to `ScriptExecutionFailure` in `Direct/Handlers.hs`.
- This means that `evalTxExUnits` is also running the Head script and the `checkIncrement` validator function.
- It is worth mentioning that we either get H4 or:
1) Test.EndToEnd, End-to-end on Cardano devnet, single party hydra head, can deposit partial UTxO
uncaught exception: FaucetException
FaucetFailedToBuildTx {reason = TxBodyErrorMinUTxONotMet (TxOutInAnyEra ConwayEra (TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraConway) (ShelleyAddress Testnet (KeyHashObj (KeyHash {unKeyHash = "d99b3e7c965d67b5e182e93ca8723fc3b61eeb5d9276cd3c2e461a5e"})) StakeRefNull)) (TxOutValueShelleyBased ShelleyBasedEraConway (MaryValue (Coin 2000000) (MultiAsset (fromList [(PolicyID {policyID = ScriptHash "dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75"},fromList [("006cc09c1b4e884bd6265c5604c5c8b0a162ee09b0c7cf4bde379f9aebec",1),("35bf4a530304089dbac2fed26fccedf510327acf9ebb25",7889008373497912565),("36",2),("37",8130754424926331177),("38",6083698601569809568),("48d44378c0cff4ba1a13e3f6eaa0d738e335d905550aacca67f70688",2),("4bbdd3d3264e6f85c3a65b819a32bc056942b43d8c0f",1319783183065313421),("71a21566f4f89c51730021c4b21d1ab3c35b169d22ca",8985411846134803463),("92261938b47e873dc018600f5f3dadfb5c20754b679c9810d4be81099ac0",8588183237463905639),("9481e44d64abd5d5471d84b12842",7629825169502544102),("a13aa4ac445c248686e3b3594263f770d45ced6caa826499cb265c917450f933",2),("a3e4e135016a6ddc59d2655001a7c044eca612272aefa5b2c1",404060153465380944),("a6e62d57edb79f60",3271098189746201174),("beba62802da1bd771798cc138337360927608e7901b0",5896190021190470668),("cf3a6e",2),("e4a201af38419bf51bb1c37fb30a3c",2)])])))) TxOutDatumNone ReferenceScriptNone)) (Coin 2689440)}
To rerun use: --match "/Test.EndToEnd/End-to-end on Cardano devnet/single party hydra head/can deposit partial UTxO/" --seed 364461330
- So sometimes we get failures when trying to mint tokens from faucet too.
- There is also `evalTxExUnitsWithLogs` which provides logs useful for debugging!
- I sprinkled some logging and used `evalTxExUnitsWithLogs` (not sure how useful it is to print only when we get a Right result) using the seed:
cabal test hydra-cluster --test-options '--match "can deposit partial UTxO" --seed 185090575'
What is weird is that I got this seed from the first unsuccessful run (H4), but running it a second time I see:
test/Test/EndToEndSpec.hs:330:7:
1) Test.EndToEnd, End-to-end on Cardano devnet, single party hydra head, can deposit partial UTxO
uncaught exception: FaucetException
FaucetFailedToBuildTx {reason = TxBodyErrorMinUTxONotMet (TxOutInAnyEra ConwayEra (TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraConway) (ShelleyAddress Testnet (KeyHashObj (KeyHash {unKeyHash = "fecd3d8bda4e7f450205cf5c2756d39b79aa81378f4bc9e8171cfcd8"})) StakeRefNull)) (TxOutValueShelleyBased ShelleyBasedEraConway (MaryValue (Coin 2000000) (MultiAsset (fromList [(PolicyID {policyID = ScriptHash "dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75"},fromList [("2676d60be23ff11bad4666202ad03aefab9316",1),("2a17115569c616af8053",2),("2cabddce782763098b1b39901e0d39a94f0915e669",1),("314b8161d03d3638",1),("32",6886952548160811876),("33",4498128670021801312),("36",1),("36c74fd7aadc801a595f66286b8c98c7899ec7d06ec269f7",2),("39",13812362742214938515),("44e59ea5093c3403b4373b2e3c52112c6231f6",2),("653c9f0794b0",7611254910991885205),("86f0d82b5d570b8ed0139015be250878b34ca5c9934b08",7709398943983541368),("876ab1d4d53f9bea81a90e1ffcf4a6a56c",273705404968363380),("d2063b5e504dd4ae1c1fc64e89c67220f9cdc94f2297977dd3864fdf6d32",2),("e1252e9d6a1fb84b32e9f5204818d321a44a",487284472556764299)])])))) TxOutDatumNone ReferenceScriptNone)) (Coin 2228270)}
- So the second time the faucet failed to build the minting tx because of `TxBodyErrorMinUTxONotMet`.
- When printing the increment and deposit txs I noticed that the increment is always using the first deposit output, while in this test we have two outputs since we give the change containing the leftover assets back to the user! This could be the cause of all the problems in case these outputs end up in reverse order (see the sketch below).
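A hedged sketch, purely to illustrate the hypothesis above (all names are hypothetical, this is not the actual wallet code): select the deposit output by its script address instead of by position, so an extra change output cannot shadow it.

```haskell
import Data.List (find)

-- Pick the output sitting at the deposit script address rather than blindly
-- taking the first output of the deposit transaction.
findDepositOutput :: Eq addr => addr -> [(addr, value)] -> Maybe (addr, value)
findDepositOutput depositAddress = find ((== depositAddress) . fst)
```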
- This is the deposit UTxO we tried to consume in the increment:
DEPOSIT UTxO: fromList
[
( TxIn "39eda4cd0fd667dd311d2266d022ff98770962e264b89c1a76a92f509b829e30"
( TxIx 0 )
, TxOut
( AddressInEra ( ShelleyAddressInEra ShelleyBasedEraConway )
( ShelleyAddress Testnet
( ScriptHashObj
( ScriptHash "ae01dade3a9c346d5c93ae3ce339412b90a0b8f83f94ec6baa24e30c" )
) StakeRefNull
)
)
( TxOutValueShelleyBased ShelleyBasedEraConway
( MaryValue
( Coin 2935110 )
( MultiAsset
( fromList
[
( PolicyID
{ policyID = ScriptHash "dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75" }
, fromList
[
( "35"
, 1832708093535842422
)
,
( "36"
, 5955814720656890312
)
,
( "73dc5dc233d70942c576a06c06c2a727adbbfa5b"
, 6167589719126987963
)
,
( "e47cfedc29b00ab35ac2da15be40b2ad184d39e01dd0bee65a2eb5c0"
, 3397913103788772526
)
,
( "ee1483eef61368f5a22f341e18c3a12bcdd4f5b61a33ca0c"
, 8511226188589623859
)
]
)
]
)
)
)
)
- This is what the deposit tx looks like:
"39eda4cd0fd667dd311d2266d022ff98770962e264b89c1a76a92f509b829e30"
== INPUTS (2)
- 04e36b0eeb5d97be19e543583e3b736640b45ceaf5ece1d37f5db9ac88143f20#1
- bdf0688131efb9c476553b5d60d1a52f4814e11cc9359f3f19f0345619683f12#0
ShelleyAddress Testnet (KeyHashObj (KeyHash {unKeyHash = "884edab29cc9791346a0d93881c678d136542dade7929132b7b315da"})) StakeRefNull
10000000 lovelace
2 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.03dd2e426d18a75446
2 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.34
1832708093535842422 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.35
5955814720656890312 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.36
2 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.37
6167589719126987963 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.73dc5dc233d70942c576a06c06c2a727adbbfa5b
4 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.74ad1fc65024c6
3 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.823b659d6d43680c260d0109c4712a5f767b82
1 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.a8165b9ada218c52340576a3caaba0ca70c272ba
2 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.c57dfc865840dccc5feff758db02c09e2d9ed34f7035154ce8
3397913103788772526 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.e47cfedc29b00ab35ac2da15be40b2ad184d39e01dd0bee65a2eb5c0
1 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.e54e1616e3abbd03bfb2
8511226188589623859 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.ee1483eef61368f5a22f341e18c3a12bcdd4f5b61a33ca0c
4 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.f85c0b51cb3a3f67a227b9b89d6d985dbb5f25787230
TxOutDatumNone
ReferenceScriptNone
== COLLATERAL INPUTS (1)
- 04e36b0eeb5d97be19e543583e3b736640b45ceaf5ece1d37f5db9ac88143f20#1
== REFERENCE INPUTS (0)
== OUTPUTS (3)
Total number of assets: 15
- ShelleyAddress Testnet (ScriptHashObj (ScriptHash "ae01dade3a9c346d5c93ae3ce339412b90a0b8f83f94ec6baa24e30c")) StakeRefNull
2935110 lovelace
1832708093535842422 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.35
5955814720656890312 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.36
6167589719126987963 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.73dc5dc233d70942c576a06c06c2a727adbbfa5b
3397913103788772526 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.e47cfedc29b00ab35ac2da15be40b2ad184d39e01dd0bee65a2eb5c0
8511226188589623859 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.ee1483eef61368f5a22f341e18c3a12bcdd4f5b61a33ca0c
TxOutDatumInline [0,["0xe9e21b6e4f59b3c62721c99109e76abb46e018e4fc184b16e58d650a",1758033905909,[[0,[[0,["0xbdf0688131efb9c476553b5d60d1a52f4814e11cc9359f3f19f0345619683f12",0]],"0xd8799fd8799fd8799f581c884edab29cc9791346a0d93881c678d136542dade7929132b7b315daffd87a80ffa240a1401a0016e360581cdcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75a541351b196f164d0c2ff07641361b52a74deee4cbd9c85473dc5dc233d70942c576a06c06c2a727adbbfa5b1b5597ae2c18ea54bb581ce47cfedc29b00ab35ac2da15be40b2ad184d39e01dd0bee65a2eb5c01b2f27cff708f98cae5818ee1483eef61368f5a22f341e18c3a12bcdd4f5b61a33ca0c1b761df31fc5e2ea33d87980d87a80ff"]]]]]
- ShelleyAddress Testnet (KeyHashObj (KeyHash {unKeyHash = "884edab29cc9791346a0d93881c678d136542dade7929132b7b315da"})) StakeRefNull
8500000 lovelace
2 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.03dd2e426d18a75446
2 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.34
2 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.37
4 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.74ad1fc65024c6
3 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.823b659d6d43680c260d0109c4712a5f767b82
1 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.a8165b9ada218c52340576a3caaba0ca70c272ba
2 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.c57dfc865840dccc5feff758db02c09e2d9ed34f7035154ce8
1 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.e54e1616e3abbd03bfb2
4 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.f85c0b51cb3a3f67a227b9b89d6d985dbb5f25787230
TxOutDatumNone
- ShelleyAddress Testnet (KeyHashObj (KeyHash {unKeyHash = "f8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d"})) StakeRefNull
20141053 lovelace
TxOutDatumNone
== TOTAL COLLATERAL
TxTotalCollateralNone
== RETURN COLLATERAL
TxReturnCollateralNone
== FEE
TxFeeExplicit ShelleyBasedEraConway (Coin 210953)
== VALIDITY
TxValidityNoLowerBound
TxValidityUpperBound ShelleyBasedEraConway (Just (SlotNo 39))
== MINT/BURN
0 lovelace
== SCRIPTS (0)
Total size (bytes): 0
== DATUMS (0)
== REDEEMERS (0)
== REQUIRED SIGNERS
[]
== METADATA
TxMetadataInEra ShelleyBasedEraConway (TxMetadata {unTxMetadata = fromList [(55555,TxMetaText "HydraV1/DepositTx")]})
- Hmm, so far the deposit UTxO is picked correctly... I am debugging the wallet code since that is where I should find the problem.
- Nothing suspicious wherever I look.
- What I could do is increase the seeded ADA, which will make the test pass, and then look at the same print debugging to try to spot where the difference is.
- This was just a fluke, probably something weird that happened once.
- I completely commented out `mustIncreaseValue`, which was the failing check, and then the test fails on the step where we want to see the `HeadIsFinalized` message.
- It is worth mentioning that I occasionally also see:
test/Test/EndToEndSpec.hs:330:7:
1) Test.EndToEnd, End-to-end on Cardano devnet, single party hydra head, can deposit partial UTxO
uncaught exception: Error
BearerClosed "<socket: 17> closed when reading data, waiting on next header True"
- Ok, since bumping the ada amount solves the issue and I couldn't find anything else problematic, it has to be the ADA amount that is off in the validator check.
- Indeed! The deposit output gets auto-balanced to 2512730 lovelace instead of the 150000 I deliberately specified. This makes the validator check fail, so it seems that explains everything.
- I am glad I did this even though it wasn't strictly necessary since everything works - at least now I know the reason, and perhaps this should be mentioned somewhere in the documentation since it can be a foot gun.
- Still trying to figure out what is happening with deposits. What I see after posting a deposit tx is no activity in the hydra-nodes apart from PeerConnected messages. It seems like an exception is raised somewhere (since I also see userInterrupt in stdout) but for the life of me I can't figure out where it is.
- Running other deposit related tests to see if they are broken too now.
- Ok, other tests still work..phew
- Seems like we see a deposit tx arrive from the chain component but there is no subsequent observation and increment posting. Sprinkling some debugging into the observation to see why the deposit was not observed.
- Ok, I am on to something. We have a check to make sure all deposit inputs are spent in a deposit tx, but now with partial deposits this check is not working anymore since we potentially consume some inputs but return change to the user.
- This check no longer works since I am bumping the TxIx when adding tokens to the outputs. Perhaps I should re-think this and merge UTxO contents instead, if possible?
- In general - is it safe to split complex UTxO like I am doing now? I think this is bound to have problems when users start committing script UTxO with some tokens that also rely on correct datum/redeemer in the UTxO - it is very easy to omit datum/redeemer which can cause problems down the line.
- I see H4 when trying to post the Increment tx, so there are a couple of problems with my implementation so far.
- Let's try to focus on the first one: not having to bump TxIn indices, by introducing a function to merge UTxO contents (mappend (<>) just ignores subsequent entries in case a TxIn is already present). A rough sketch follows below.
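A minimal, self-contained sketch of that merge idea over a plain Map (the real hydra UTxO type and its helpers may differ): unlike `(<>)` on a Map, which keeps only the first entry for a duplicate key, this combines the values of outputs that share a TxIn.

```haskell
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- Merge two UTxO sets, combining the values of entries that share a key
-- instead of silently dropping one of them.
mergeUTxO :: (Ord txIn, Semigroup value) => Map txIn value -> Map txIn value -> Map txIn value
mergeUTxO = Map.unionWith (<>)
```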
- After writing the `mergeUTxO` function I see H4 on the Increment transaction, so let's fix this one next.
- In the meantime I had to juggle the utxo to deposit a bit, since I needed to alter the seeded value and reduce the ADA amount (since we specified a partial commit for both ADA and some assets).
- I am currently debugging Increment tx so I left debug prints in code. Everywhere I look the values are correct so I need to dig deeper :TM:
- BTW it seems like the ada value in the increment is off by 1 ADA? Let's explore tomorrow.
- Continuing where I left off - it seems like there are still problems with the deposit construction since I am not seeing any tokens at the deposit output. Debugging.
- The deposit output with tokens was squashed into the first UTxO containing only ada because they were using the same TxIn. I made sure to bump the index and now I correctly see tokens in the deposit output.
- Another problem is splitting what is left and returning it to the user. `withoutUTxO` is not removing the tokens for the same reason - now the TxIn is unique so `withoutUTxO` is happy to keep the tokens around.
- What I did was to not bump indices until the very end. This allows `withoutUTxO` to remove what is to be committed token-wise and then create an additional leftover output in case we do not want to commit all tokens in the UTxO.
- I also noticed that we keep the lovelace for the UTxO with tokens, which produces the wrong outcome - if the user specified any lovelace amount to keep, this should be respected.
- This whole issue feels like it needed grooming before I started to work on it but the problem, I believe, is that we started with only lovelace - now we want to limit tokens as well and this needed a careful approach.
- A period of time dedicated to getting rid of our technical debt would be in order.
- It is even hard to explain in a couple of sentences the problems we may have. I think I should perhaps unify the functions that:
- split the utxo by the specified amount
- split the utxo by the specified assets
- This way we would avoid unnecessary juggling with the available UTxO
- In general it feels like it would be useful to write a function `squashUTxO` that merges all assets under a single TxIn. After having this we should write another function that operates on this single-entry UTxO and splits it into two - what we want to commit and what should be given back to the user. A rough sketch is below.
- In the middle of trying things out I realize there will be use cases where people require datums to be present at an exact output and similar, so it is not as easy as it sounds.
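A self-contained sketch of the `squashUTxO` idea over a plain Map (hypothetical shape; the real hydra UTxO type differs): collapse all entries into a single one keyed by a chosen TxIn, merging their values; a second, asset-aware function would then split that single value into the part to commit and the change.

```haskell
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- Merge every entry's value into a single one, keyed by a chosen TxIn.
squashUTxO :: Monoid value => txIn -> Map txIn value -> Map txIn value
squashUTxO anchor utxo = Map.singleton anchor (mconcat (Map.elems utxo))
```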
- Been running in circles forever! Seems like the deposit tx is finally accepted by the node, but when waiting on the `CommitApproved` message I am unable to see it.
- Ironically, after the test times out I do see the `CommitApproved` message in stdout!
- Is this a timing issue? Some quirk where I need to pick correct values for the deposit/contestation period?
- It seems there is an issue with the partial tokens feature where the specified assets are not to be found in the deposit output.
- I'll take a look at Deposit.hs first to glance over it, trying to spot the problem.
- Immediately I spot this line:
let leftoverUTxO = (leftoverUTxO' `withoutUTxO` tokensToDepositUTxO)
- Here leftoverUTxO' should contain only non-ada assets, and then we are also trying to remove any tokens we want to commit?
- We need a proper e2e test for this.
- In order to do this successfully I need to make sure we can mint some assets when requesting funds from faucet.
- At first I used an extra argument for the assets that should be minted in the transaction, but realized this can be more elegant if I just search for any non-ada values and, if any are present, use `dummyValidatorScript` as the script witness for minting (see the sketch a few lines below).
- For some reason I am getting negative values in the tx outputs even though I made sure to filter out the negatives:
1) Test.EndToEnd, End-to-end on Cardano devnet, single party hydra head, can deposit partial UTxO
uncaught exception: FaucetException
FaucetFailedToBuildTx {reason = TxBodyErrorBalanceNegative (Coin 899849338161) (MultiAsset (fromList [(PolicyID {policyID = ScriptHash "2e12c5e499e0521b13837391beed1248a2e36117370662ee75918b56"},fromList [("2affd3bbefc1d1b8fd5b684c69f5f2a7f3",-1)]),(PolicyID {policyID = ScriptHash "707aab0cefa84f37cb8ac4c6e7f02de5c3fbda4534b59b53490e433b"},fromList [("33",-2)]),(PolicyID {policyID = ScriptHash "8f461954fe2f18fee1dca233f358907e643ff839ed1f995e4bf325e3"},fromList [("32",-6410724724660021339)])]))}
- This happens because I was setting the minting values correctly but only passed in a UTxO that contains only lovelace.
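A hedged sketch of the "only attach a minting witness when non-ada assets are requested" decision mentioned above, assuming the cardano-api `Value` helpers (`valueToList`, `AdaAssetId`) as I recall them - not the actual faucet code:

```haskell
import Cardano.Api (AssetId (..), Value, valueToList)

-- A seed transaction only needs a minting witness (e.g. built from
-- dummyValidatorScript) when the requested value contains anything besides ada.
needsMintingWitness :: Value -> Bool
needsMintingWitness requested =
  any (\(assetId, _quantity) -> assetId /= AdaAssetId) (valueToList requested)
```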
- Had to attach a minting script and optimize it using the plutus-tx plugin pragma since the tx got too big (made sure all minted tokens use the same script hash for the PolicyId, but still get a `MaxTxSizeUTxO` error):
uncaught exception: FaucetException
FaucetFailedToBuildTx {reason = TxBodyScriptExecutionError [(ScriptWitnessIndexMint 0,ScriptErrorMissingScript (ScriptWitnessIndexMint 0) (ResolvablePointers ShelleyBasedEraConway (fromList [(ConwayMinting (AsIx {unAsIx = 0}),(ConwayMinting (AsItem {unAsItem = PolicyID {policyID = ScriptHash "54c25c25088dad1c2421be9b91aa48ca787a18db1d29f38319b8e998"}}),Nothing,ScriptHash "54c25c25088dad1c2421be9b91aa48ca787a18db1d29f38319b8e998")),(ConwayMinting (AsIx {unAsIx = 1}),(ConwayMinting (AsItem {unAsItem = PolicyID {policyID = ScriptHash "8894a9e32b4c206bcc49250d2bcc048466b1dd3dde49ca6fcab8566a"}}),Nothing,ScriptHash "8894a9e32b4c206bcc49250d2bcc048466b1dd3dde49ca6fcab8566a"))]))),(ScriptWitnessIndexMint 1,ScriptErrorMissingScript (ScriptWitnessIndexMint 1) (ResolvablePointers ShelleyBasedEraConway (fromList [(ConwayMinting (AsIx {unAsIx = 0}),(ConwayMinting (AsItem {unAsItem = PolicyID {policyID = ScriptHash "54c25c25088dad1c2421be9b91aa48ca787a18db1d29f38319b8e998"}}),Nothing,ScriptHash "54c25c25088dad1c2421be9b91aa48ca787a18db1d29f38319b8e998")),(ConwayMinting (AsIx {unAsIx = 1}),(ConwayMinting (AsItem {unAsItem = PolicyID {policyID = ScriptHash "8894a9e32b4c206bcc49250d2bcc048466b1dd3dde49ca6fcab8566a"}}),Nothing,ScriptHash "8894a9e32b4c206bcc49250d2bcc048466b1dd3dde49ca6fcab8566a"))])))]}
- It seems like there are two script hashes in the minting redeemer. I'll debug to figure it out.
- Yes, the code to set the specified policy id was wrong.
- After seeing retries when submitting a faucet seed transaction I had to debug again to see the actual error. It was missing collateral, since we now have scripts in our plain old faucet tx.
- After adding collateral inputs I finally have a regression test that fails with this error for the deposit tx:
test/Test/EndToEndSpec.hs:324:7: 1) Test.EndToEnd, End-to-end on Cardano devnet, single party hydra head, can deposit partial UTxO uncaught exception: SubmitTransactionException SubmitTxValidationError (TxValidationErrorInCardanoMode (ShelleyTxValidationError ShelleyBasedEraConway (ApplyTxError (ConwayUtxowFailure (UtxoFailure (ValueNotConservedUTxO (Mismatch {mismatchSupplied = MaryValue (Coin 35583235) (MultiAsset (fromList [(PolicyID {policyID = ScriptHash "dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75"},fromList [("32",2),("7fe47286bc0adcd5cbd0a4013bc051d36773c9f0974b809ab4d6",2),("cda7455757cb034b1bd8321b567e2a6b71b13eeb",1)])])), mismatchExpected = MaryValue (Coin 35583235) (MultiAsset (fromList []))}))) :| []))))
- Looking at the rendered deposit tx we can see that indeed there are no tokens in the output:
"7c9d858e80fde98d61968ccabba46491ab965dbe97e4c8fb36e592ce7f15a97b"
== INPUTS (2)
- 73adac7c3908f8d14a872d525db39bf9772f56b64f99b5eb686708b1916bb07e#1
- e953a530ff8d02a9e7efbed8eadd0eccc7bb4e988f80da529e7186ef7a774e20#0
ShelleyAddress Testnet (KeyHashObj (KeyHash {unKeyHash = "4ab68d8804c41aafe7c0bba0bc781a1497b53b430f39e552e71316af"})) StakeRefNull
7512730 lovelace
2 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.32
2 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.7fe47286bc0adcd5cbd0a4013bc051d36773c9f0974b809ab4d6
1 dcf5fdd1d01c04b0e6262bba173a89c4b81b6570211f08bc059c8a75.cda7455757cb034b1bd8321b567e2a6b71b13eeb
TxOutDatumNone
ReferenceScriptNone
== COLLATERAL INPUTS (1)
- 73adac7c3908f8d14a872d525db39bf9772f56b64f99b5eb686708b1916bb07e#1
== REFERENCE INPUTS (0)
== OUTPUTS (3)
Total number of assets: 1
- ShelleyAddress Testnet (ScriptHashObj (ScriptHash "ae01dade3a9c346d5c93ae3ce339412b90a0b8f83f94ec6baa24e30c")) StakeRefNull
2000000 lovelace
TxOutDatumInline [0,["0xe31c2cd8e640e8d8b902dcc3c4625ffea708e4a475189c06557b2225",1756914066710,[[0,[[0,["0xe953a530ff8d02a9e7efbed8eadd0eccc7bb4e988f80da529e7186ef7a774e20",0]],"0xd8799fd8799fd8799f581c4ab68d8804c41aafe7c0bba0bc781a1497b53b430f39e552e71316afffd87a80ffa140a1401a001e8480d87980d87a80ff"]]]]]
- ShelleyAddress Testnet (KeyHashObj (KeyHash {unKeyHash = "4ab68d8804c41aafe7c0bba0bc781a1497b53b430f39e552e71316af"})) StakeRefNull
5512730 lovelace
TxOutDatumNone
- ShelleyAddress Testnet (KeyHashObj (KeyHash {unKeyHash = "1052386136b347f3bb7c67fe3f2ee4ef120e1836e5d2707bb068afa6"})) StakeRefNull
27881156 lovelace
TxOutDatumNone
== TOTAL COLLATERAL
TxTotalCollateralNone
== RETURN COLLATERAL
TxReturnCollateralNone
== FEE
TxFeeExplicit ShelleyBasedEraConway (Coin 189349)
== VALIDITY
TxValidityNoLowerBound
TxValidityUpperBound ShelleyBasedEraConway (Just (SlotNo 37))
== MINT/BURN
0 lovelace
== SCRIPTS (0)
Total size (bytes): 0
== DATUMS (0)
== REDEEMERS (0)
== REQUIRED SIGNERS
[]
== METADATA
TxMetadataInEra ShelleyBasedEraConway (TxMetadata {unTxMetadata = fromList [(55555,TxMetaText "HydraV1/DepositTx")]})
- Nice!
- Now it is time to debug again and figure out where the problem is.
- Seems like capUTxO is using only lovelace!
- Yup, that was the issue; not sure how it could go undetected like this. A small illustration of the pitfall is below.
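A hedged illustration of the bug class with toy types (not the real `capUTxO`): capping by lovelace alone loses every other asset, whereas the cap should only touch the ada component and carry the tokens through.

```haskell
-- Toy value type, standing in for a real multi-asset value.
data Value = Value {lovelace :: Integer, assets :: [(String, Integer)]}
  deriving (Show)

capValueLovelaceOnly :: Integer -> Value -> Value -- buggy: the tokens disappear
capValueLovelaceOnly cap v = Value (min cap (lovelace v)) []

capValue :: Integer -> Value -> Value -- keeps the tokens intact
capValue cap v = v {lovelace = min cap (lovelace v)}
```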
- Finally it is time to put the visualize-logs executable to the test! I had to add `apiTransactionTimeout` to `RunOptions` in the logs since this user is on a prior hydra-node version, but after adding this I can see pretty logs ordered by timestamp.
- The problem is the deposit and the H4 that occurs on posting an increment tx.
- I want to add `ToPost` lines to the logs so I can see and debug the deposit tx.
- This is what was observed as deposited:
OnDepositTx {headId = UnsafeHeadId "\244\219p\156\221\"g\177\255\203\204\223\&6\212f\STXM\198t\228\&1\166\&8^\251:\164\171", depositTxId = "530ab09e96885ec211c3834059bde26ebf0ee3ec14826c6ece3eaf87decb5d91", deposited = fromList [(TxIn "164833d9c6456cc65305c441880ce5a1bfd47145f457839c2159c4ff85e15873" (TxIx 0),TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraConway) (ShelleyAddress Testnet (KeyHashObj (KeyHash {unKeyHash = "e06f2ae361f33815f775b224789025dccc4b6413599224e70841eebf"})) (StakeRefBase (KeyHashObj (KeyHash {unKeyHash = "eee790c7bb9c497f716fcec7d92a4d68c27b48c9e301b7c547c653cd"}))))) (TxOutValueShelleyBased ShelleyBasedEraConway (MaryValue (Coin 152000000) (MultiAsset (fromList [(PolicyID {policyID = ScriptHash "0836587ed7cee3c0790e24c930c67f31fc2511a3c25aa66ed205e05f"},fromList [("74525053",1000)]),(PolicyID {policyID = ScriptHash "9649407b4f02a38d98b1d9de2457eff522c47c87e22533377dcc70c4"},fromList [("446f6e67",10),("546f6b656e31",20),("55444f",10)])])))) TxOutDatumNone ReferenceScriptNone)], created = 2025-08-14 07:49:27 UTC, deadline = 2025-08-14 07:57:59.999 UTC}
OnDepositTx {headId = UnsafeHeadId "\244\219p\156\221\"g\177\255\203\204\223\&6\212f\STXM\198t\228\&1\166\&8^\251:\164\171", depositTxId = "530ab09e96885ec211c3834059bde26ebf0ee3ec14826c6ece3eaf87decb5d91", deposited = fromList [(TxIn "164833d9c6456cc65305c441880ce5a1bfd47145f457839c2159c4ff85e15873" (TxIx 0),TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraConway) (ShelleyAddress Testnet (KeyHashObj (KeyHash {unKeyHash = "e06f2ae361f33815f775b224789025dccc4b6413599224e70841eebf"})) (StakeRefBase (KeyHashObj (KeyHash {unKeyHash = "eee790c7bb9c497f716fcec7d92a4d68c27b48c9e301b7c547c653cd"}))))) (TxOutValueShelleyBased ShelleyBasedEraConway (MaryValue (Coin 152000000) (MultiAsset (fromList [(PolicyID {policyID = ScriptHash "0836587ed7cee3c0790e24c930c67f31fc2511a3c25aa66ed205e05f"},fromList [("74525053",1000)]),(PolicyID {policyID = ScriptHash "9649407b4f02a38d98b1d9de2457eff522c47c87e22533377dcc70c4"},fromList [("446f6e67",10),("546f6b656e31",20),("55444f",10)])])))) TxOutDatumNone ReferenceScriptNone)], created = 2025-08-14 07:49:27 UTC, deadline = 2025-08-14 07:57:59.999 UTC}
- This is the deposit tx in the explorer https://preprod.cexplorer.io/tx/530ab09e96885ec211c3834059bde26ebf0ee3ec14826c6ece3eaf87decb5d91?tab=content
- So what is locked should be 164833d9c6456cc65305c441880ce5a1bfd47145f457839c2159c4ff85e15873#0
- Requested snapshot also looks correct:
utxoToCommit = Just (fromList [(TxIn "164833d9c6456cc65305c441880ce5a1bfd47145f457839c2159c4ff85e15873" (TxIx 0),TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraConway) (ShelleyAddress Testnet (KeyHashObj (KeyHash {unKeyHash = "e06f2ae361f33815f775b224789025dccc4b6413599224e70841eebf"})) (StakeRefBase (KeyHashObj (KeyHash {unKeyHash = "eee790c7bb9c497f716fcec7d92a4d68c27b48c9e301b7c547c653cd"}))))) (TxOutValueShelleyBased ShelleyBasedEraConway (MaryValue (Coin 152000000) (MultiAsset (fromList [(PolicyID {policyID = ScriptHash "0836587ed7cee3c0790e24c930c67f31fc2511a3c25aa66ed205e05f"},fromList [("74525053",1000)]),(PolicyID {policyID = ScriptHash "9649407b4f02a38d98b1d9de2457eff522c47c87e22533377dcc70c4"},fromList [("446f6e67",10),("546f6b656e31",20),("55444f",10)])])))) TxOutDatumNone ReferenceScriptNone)])
- This is what the increment tx looks like:
{
"auxiliary scripts": null,
"certificates": null,
"collateral inputs": [],
"currentTreasuryValue": null,
"datums": [],
"era": "Conway",
"fee": "0 Lovelace",
"governance actions": [],
"inputs": [
"02b2a8a41fa13712c675d069ac84cfc194cbbe2c4e4b6331289c9dbc747fbc98#0",
"530ab09e96885ec211c3834059bde26ebf0ee3ec14826c6ece3eaf87decb5d91#0"
],
"metadata": {
"55555": "HydraV1/IncrementTx"
},
"mint": null,
"outputs": [
{
"address": "addr_test1wzlxa0r5ggyvvc9lplwpe7a4z4680nfsthjmzam72awtknqz5cf7v",
"address era": "Shelley",
"amount": {
"lovelace": 250094340,
"policy 0836587ed7cee3c0790e24c930c67f31fc2511a3c25aa66ed205e05f": {
"asset 74525053 (tRPS)": 2000
},
"policy 9649407b4f02a38d98b1d9de2457eff522c47c87e22533377dcc70c4": {
"asset 446f6e67 (Dong)": 10,
"asset 546f6b656e31 (Token1)": 20,
"asset 55444f (UDO)": 10
},
"policy f4db709cdd2267b1ffcbccdf36d466024dc674e431a6385efb3aa4ab": {
"asset 3e31597e53ccb0a84c071a70f51ce38ec26175cb1d4671d0f6c5425c": 1,
"asset 4879647261486561645631 (HydraHeadV1)": 1,
"asset 58bc0a0b52009ff85b46233edd5966b2e0ae0524c4dc67f78af92396": 1,
"asset 67b499913a7169d504d7887c99b1ff862d980b6f3ad5683c6394cde0": 1
}
},
"datum": {
"constructor": 1,
"fields": [
{
"constructor": 0,
"fields": [
{
"bytes": "f4db709cdd2267b1ffcbccdf36d466024dc674e431a6385efb3aa4ab"
},
{
"list": [
{
"bytes": "957698848d51dc52da009e088c85a6b85dd30230d279b7a468c2ac4f509ba227"
},
{
"bytes": "7701af5180c40cb7cbdfdc2621dd4674166ff958e6d0e4d723003c3f6f0e553f"
},
{
"bytes": "efd9a68d3dfe2981fe4672ecd8bfc764202310abfd3e4523209b162e14577b0e"
}
]
},
{
"constructor": 0,
"fields": [
{
"int": 60000
}
]
},
{
"int": 2
},
{
"bytes": "1e53176a1b9ae1620d8b1ddb536836ca3b8297e5b9ced616c408e8893b929b19"
}
]
}
]
},
"network": "Testnet",
"payment credential script hash": "be6ebc744208c660bf0fdc1cfbb5157477cd305de5b1777e575cbb4c",
"reference script": null,
"stake reference": null
}
],
"redeemers": [
{
"purpose": {
"spending script witnessed input": "02b2a8a41fa13712c675d069ac84cfc194cbbe2c4e4b6331289c9dbc747fbc98#0"
},
"redeemer": {
"data": "Constr 1 [Constr 0 [List [B \"\\RS\\ETX\\190e%\\153\\195X\\148\\186\\155\\175\\239\\192\\170z\\EOT\\n\\239\\EOT\\207L\\178-:+#v$*\\DC2\\204\\199%\\DC1J\\166\\224\\176\\213\\EM0\\NAK\\ACK\\191\\149\\206\\168\\148\\ENQ\\209z\\204\\209\\253<\\202\\254^\\157\\140\\204\\&4\\v\",B \"t\\141\\238\\n\\211=yH(\\234\\234F:\\173\\247\\178(\\EOT\\177\\229\\144`\\215\\EOT\\NAK\\US\\184o\\221\\222\\151\\214\\130\\197\\RS\\181\\215F\\185\\SI\\131\\191\\vA-1/\\153\\176\\181\\248\\174\\241\\169s\\RS\\134a\\255\\&2\\173#\\234\\r\",B \"r\\ESC0\\134T\\FS\\DC1y(\\USQ\\134#\\203\\DC3\\218`8`\\213I\\222l]\\132?\\ETB\\RS\\166z\\223\\182\\178A\\129\\EM[\\t\\221\\199f\\142k\\174\\137\\STX\\\\T\\226\\228\\&0\\205\\ESC\\162JR\\213\\231\\186\\188i#\\138\\ETX\"],I 624,Constr 0 [B \"S\\n\\176\\158\\150\\136^\\194\\DC1\\195\\131@Y\\189\\226n\\191\\SO\\227\\236\\DC4\\130ln\\206>\\175\\135\\222\\203]\\145\",I 0]]]",
"execution units": {
"memory": 0,
"steps": 0
}
}
},
{
"purpose": {
"spending script witnessed input": "530ab09e96885ec211c3834059bde26ebf0ee3ec14826c6ece3eaf87decb5d91#0"
},
"redeemer": {
"data": "Constr 0 [B \"\\244\\219p\\156\\221\\\"g\\177\\255\\203\\204\\223\\&6\\212f\\STXM\\198t\\228\\&1\\166\\&8^\\251:\\164\\171\"]",
"execution units": {
"memory": 0,
"steps": 0
}
}
}
],
"reference inputs": [
"48bd29e43dd01d12ab464f75fe40eed80e4051c8d3409e1cb20b8c01120b425e#0"
],
"required signers (payment key hashes needed for scripts)": [
"67b499913a7169d504d7887c99b1ff862d980b6f3ad5683c6394cde0"
],
"return collateral": null,
"scripts": [
{
"script data": {
"plutus version": "PlutusV3",
"script": "59044e59044b010100323232323232323232322533300332323232325332330093001300b37540042646644a66666602800c2646464a66601e6006002264a66602800201e264a666666032002020020020020264a66602c603200600a0226eb8004c058004c048dd50048a9998079803800899299980a000807899299999980c800808008008099299980b180c8018028089bad00101030160013012375401201c60206ea802054ccc034c004c03cdd5001099191919191929998099803980a9baa00d153330133300330044c103d87e80003371e6eb8c008c058dd50031bae30193016375401a2660066008980103d87980003322325333016300e30183754002266e24dd6980e180c9baa00100213300630074c103d87a80004a0600860306ea8c00cc060dd50011802980b1baa00e375a6002602c6ea8018528099299980a198021802a6103d87c80003322325333017300f30193754002266e20008dd6980e980d1baa00113300730084c103d87b80004a0600a60326ea8c014c064dd50011803180b9baa00f375a6004602e6ea801c4c8c8c8c8cc020c02530103d87d80003371e6e48ccc00ccc008ccc004004dd61802180d9baa0130052376600291010022337140040026e48ccc00ccc008c8cc004004dd61802980e1baa00c22533301e00114bd700991919800800998020021811801912999810800899811001a5eb804cc894ccc07ccdd79991192999811180d18121baa00113322533302433710004002298103d879800015333024337100020042980103d87b800014c103d87a8000375a6020604a6ea800cdd6980818129baa002100133225333023337200040022980103d8798000153330233371e0040022980103d87a800014c103d87b8000375c602060486ea8008dd7180818121baa001300e3022375400a601c60446ea800930103d8798000133024005003133024002330040040013023001302400130200012375c600e60386ea8005221002233714004002444a66603466e24005200014bd700a99980f0010a5eb804cc07cc080008ccc00c00cc084008cdc0000a400244646600200200644a66603c002297ae013301f37526006604000266004004604200244464666002002008006444a66603e004200226660060066044004660080026eb8c0840088c06cc070c0700045281bad30193016375401a4603260340024603000244a666024002294454cc04c008528119299980898028008a490344303100153330113009001149010344303200153330113370e90020008a490344303300153330113370e90030008a490344303400153330113370e90040008a49034430350014910344303600301237540024602a602c602c602c602c602c602c602c002602660206ea800854cc039241054c35353b350016370e900000580580580598080009808180880098061baa002370e90010b1806980700198060011805801180580098031baa00114984d95854cc0092401054c35313b3500165734ae7155ceaab9e5573eae815d0aba257481",
"type": "plutus"
},
"script hash": "ae01dade3a9c346d5c93ae3ce339412b90a0b8f83f94ec6baa24e30c"
}
],
"total collateral": null,
"treasuryDonation": 0,
"update proposal": null,
"validity range": {
"lower bound": null,
"upper bound": 99474878
},
"voters": {},
"withdrawals": null,
"witnesses": []
}
- I checked the inputs/outputs and they look good, so nothing really major that would make this increment tx fail. Let's dig deeper to see why this tx would fail.
- I thought it might be related to how the datum is constructed but I see that there is an inline datum which is fine.
- I can't tell why we get H4. It is also not clear if the locked deposit hash matches, which would reveal if we locked the commit wrong, but that check happens after the failing one so we don't know.
- I decoded the increment redeemer and found out that the TxOutRef inside indeed matches the deposit input we are trying to spend.
- The user mentioned that they used `--contestation-period 60` and `--deposit-period 200`, and if they use bigger values the problem goes away!
- So how can it be that we see H4 and right after that `DepositExpired`? H4 is unexpected here.
- After spending time observing the test runs, debugging and trying to reason about the problem at hand, I think that as a first step we should make sure that we can load persisted items on start. This can be achieved by just making sure that the queue capacity is large enough to hold all items - the easiest thing to do (see the sketch below). We could also load and process items incrementally and then start, but this requires more changes.
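A hedged sketch of that sizing idea (hypothetical names, a plain `TBQueue` standing in for the real `PersistentQueue`): pick the capacity as the maximum of the configured capacity and the number of items already on disk, so re-enqueueing persisted items on startup can never block.

```haskell
import Control.Concurrent.STM (TBQueue, atomically, newTBQueueIO, writeTBQueue)
import Numeric.Natural (Natural)

-- Create the queue with enough room for everything loaded from disk and
-- re-enqueue the persisted items before the node starts processing inputs.
loadQueue :: Natural -> [a] -> IO (TBQueue a)
loadQueue configuredCapacity persisted = do
  let capacity = max configuredCapacity (fromIntegral (length persisted))
  queue <- newTBQueueIO capacity
  mapM_ (atomically . writeTBQueue queue) persisted
  pure queue
```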
- It feels like `PersistentQueue` is bolted onto `InputQueue`, which is not how we use `PersistentQueue` in the `Etcd` module, for example.
- Perhaps it is worthwhile rewriting `createPersistInputQueue`?
- Let's tackle the first problem of making sure we can actually load all items from disk. I know this is not optimal but I want to get results quickly (tick-tock, we don't have much time).
- I also noticed we create the queue in `race` deep in some CPS-style code - why? The queue should be created only once?
- Doing this makes the test red consistently.
- Next, what we seem to want to do in the test is spam the node with many `NewTx` inputs and expect this to work out. This part was added:
foldM_
( \utxo _i -> do
let Just (aliceTxIn, aliceTxOut) = UTxO.find (isVkTxOut aliceCardanoVk) utxo
let Right selfTx =
mkSimpleTx
(aliceTxIn, aliceTxOut)
(mkVkAddress testNetworkId aliceCardanoVk, txOutValue aliceTxOut)
aliceCardanoSk
send node $ input "NewTx" ["transaction" .= selfTx]
pure utxo
)
initialUTxO
[1 .. capacity * 10]
and then we get:
uncaught exception: IOException of type ResourceVanished
writev: resource vanished (Connection reset by peer)
So it seems this kills hydra-node
- If I just spam the node without exceeding the queue capacity I get:
test/Test/EndToEndSpec.hs:820:11:
1) Test.EndToEnd, End-to-end on Cardano devnet, withHydraNode, load persistent queue with capacity exceeded
hydra-node (nodeId = 1) exited with failure code: 1
thread blocked indefinitely in an STM transaction
CallStack (from HasCallStack):
wrapBlockedIndefinitely, called at src/Control/Monad/Class/MonadSTM/Internal.hs:546:16 in io-classes-1.5.0.0-CQ6SF7CN7ju8okIkiv0F3i:Control.Monad.Class.MonadSTM.Internal
atomically, called at src/Hydra/PersistentQueue.hs:91:3 in hydra-node-0.22.2-inplace:Hydra.PersistentQueue
writePersistentQueue, called at src/Hydra/Node/InputQueue.hs:97:19 in hydra-node-0.22.2-inplace:Hydra.Node.InputQueue
- We use the `offline` chain config in the test and we are not opening the head, but we still send a bunch of `NewTx` inputs. I'll refactor the test so that we have the head opened, to make sure we don't get any exceptions that could kill the queue thread.
- This change makes the test green even with the capacity exceeded 10x.
- Ok, so it seems like when two parties commit nothing to a Head and then one tries to increment with 1 ada (contestation-period 300s, deposit-deadline 600s), the increment transaction fails with H4 `HeadValueIsNotPreserved`.
- I'll walk through the code to try to reason out why this happens.
- Let's go backwards from the increment tx construction.
- The increment transaction looks like this:
{
"auxiliary scripts": null,
"certificates": null,
"collateral inputs": [],
"currentTreasuryValue": null,
"era": "Conway",
"fee": "0 Lovelace",
"governance actions": [],
"inputs": [
"ac6ab83cf1a766fedc68e26ca33dbd5f457ecbb00997f34bf178f757d7152c81#0",
"cf95ebccdc7cc42eb8957e3f34c96722f9022ea9e0a2dc64e30e82d342990de4#0"
],
"metadata": {
"55555": "HydraV1/IncrementTx"
},
"mint": null,
"outputs": [
{
"address": "addr_test1wzlxa0r5ggyvvc9lplwpe7a4z4680nfsthjmzam72awtknqz5cf7v",
"address era": "Shelley",
"amount": {
"lovelace": 5663420,
"policy 6714b1424806d7436033096cfe829fef534436c6f98992a4bd0f43ae": {
"asset 4879647261486561645631 (HydraHeadV1)": 1,
"asset 4fb53f7aa3e07e3f3ba8516ed345119e9841822385c0307311209b41": 1,
"asset 5ecf89bf0e00eaf8667d9ff9bb92c6faac2fbcc6ff71ce9a8d6c5033": 1
}
},
"datum": {
"constructor": 1,
"fields": [
{
"constructor": 0,
"fields": [
{
"bytes": "6714b1424806d7436033096cfe829fef534436c6f98992a4bd0f43ae"
},
{
"list": [
{
"bytes": "a65dd1af5d938e394b05198b97870a33343d3dd2ec8feb8906a2cbbd2cc0ac96"
},
{
"bytes": "32bce7668b68bcc66750963217e50244fb7a9495053fb787fd5480aa1e509366"
}
]
},
{
"constructor": 0,
"fields": [
{
"int": 300000
}
]
},
{
"int": 1
},
{
"bytes": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
}
]
}
]
},
"network": "Testnet",
"payment credential script hash": "be6ebc744208c660bf0fdc1cfbb5157477cd305de5b1777e575cbb4c",
"reference script": null,
"stake reference": null
}
],
"redeemers": [
{
"purpose": {
"spending script witnessed input": "ac6ab83cf1a766fedc68e26ca33dbd5f457ecbb00997f34bf178f757d7152c81#0"
},
"redeemer": {
"data": "Constr 0 [B \"g\\DC4\\177BH\\ACK\\215C`3\\tl\\254\\130\\159\\239SD6\\198\\249\\137\\146\\164\\189\\SIC\\174\"]",
"execution units": {
"memory": 0,
"steps": 0
}
}
},
{
"purpose": {
"spending script witnessed input": "cf95ebccdc7cc42eb8957e3f34c96722f9022ea9e0a2dc64e30e82d342990de4#0"
},
"redeemer": {
"data": "Constr 1 [Constr 0 [List [B \"\\143\\&5\\138P\\189\\233\\\\\\230\\156\\NAK\\bI\\176\\137\\178\\233r\\185\\202\\156\\134%?:\\219p\\178\\213\\&8\\204\\r\\179I<\\NUL\\236\\228\\255~\\199\\229\\194\\232\\207\\243\\200\\ETB\\138\\143\\f1H:k~\\f$\\209\\219a\\221\\246\\NUL\\EOT\",B \"\\163\\163\\CAN<='\\150\\FS\\254PaS\\EMNA\\180\\131\\185\\219V3\\176\\156\\173\\208\\189U\\160A\\216\\209\\v\\225\\f'\\241\\153\\ACK\\135g\\247usn\\STX\\SO\\US\\209\\172UO\\191\\207Y\\164\\176t2i\\228\\165\\212w\\b\"],I 1,Constr 0 [B \"\\172j\\184<\\241\\167f\\254\\220h\\226l\\163=\\189_E~\\203\\176\\t\\151\\243K\\241x\\247W\\215\\NAK,\\129\",I 0]]]",
"execution units": {
"memory": 0,
"steps": 0
}
}
}
],
"reference inputs": [
"39924b477411b6bd33258f63dd705e0712b513a21ea9bc422a23ec0be797ba03#0"
],
"required signers (payment key hashes needed for scripts)": [
"4fb53f7aa3e07e3f3ba8516ed345119e9841822385c0307311209b41"
],
"return collateral": null,
"total collateral": null,
"treasuryDonation": 0,
"update proposal": null,
"validity range": {
"lower bound": null,
"upper bound": 84879619
},
"voters": {},
"withdrawals": null,
"witnesses": []
}
And these are the transaction inputs: https://preview.cexplorer.io/tx/ac6ab83cf1a766fedc68e26ca33dbd5f457ecbb00997f34bf178f757d7152c81 https://preview.cexplorer.io/tx/cf95ebccdc7cc42eb8957e3f34c96722f9022ea9e0a2dc64e30e82d342990de4
- So these are the collect com and deposit outputs spent in the increment tx, and the deposit output needs to end up in the single Head output.
- I noticed that we tried to lock 1 ADA but the output contains 1.53 ada, so the theory is that the 1 ada gets auto-balanced and locked into the deposit contract as 1.53 ada while the off-chain code thinks the locked value is just 1 ada. This causes H4 later on since the amounts do not match.
- We tested this assumption and it proves correct! All we need to do now is reject too-low deposits so we don't scratch our heads about this again. A minimal sketch of such a check is below.
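A minimal, self-contained sketch of the "reject too-low deposits" idea (plain Integers, hypothetical names): compare the requested lovelace against the ledger's min-UTxO value for the deposit output before building the deposit tx, so the balancer never has to bump it behind our back.

```haskell
newtype Lovelace = Lovelace Integer
  deriving (Eq, Ord, Show)

data DepositError = DepositTooLow {requested :: Lovelace, minimumRequired :: Lovelace}
  deriving (Show)

-- Fail early instead of letting auto-balancing raise the locked value, which
-- is what produced the off-chain/on-chain mismatch and the H4 error above.
checkDepositAmount :: Lovelace -> Lovelace -> Either DepositError ()
checkDepositAmount minimumRequired requestedAmount
  | requestedAmount < minimumRequired = Left (DepositTooLow requestedAmount minimumRequired)
  | otherwise = Right ()
```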
- Loaded two node logs in vim and I filtered out etcd messages.
- This is the deposit they tried to do https://preview.cardanoscan.io/transaction/4dea5f74880a761a3d312e122b9701e9ad2ed2d74880de5b47bef087ee10de14?tab=utxo
- The user experienced H4 - HeadValueIsNotPreserved upon trying to post an increment that looks like this:
{
"auxiliary scripts": null,
"certificates": null,
"collateral inputs": [],
"currentTreasuryValue": null,
"era": "Conway",
"fee": "0 Lovelace",
"governance actions": [],
"inputs": [
"4dea5f74880a761a3d312e122b9701e9ad2ed2d74880de5b47bef087ee10de14#0",
"843676b219d25b597ae6f74dfc3a14ff11996f5e4bc282db2e847837e6abff4d#0"
],
"metadata": {
"55555": "HydraV1/IncrementTx"
},
"mint": null,
"outputs": [
{
"address": "addr_test1wzlxa0r5ggyvvc9lplwpe7a4z4680nfsthjmzam72awtknqz5cf7v",
"address era": "Shelley",
"amount": {
"lovelace": 6107270,
"policy 361b4ebba480bf5f279e513401bd1eb1bb1b1117000f9876d0b84401": {
"asset 546573744e4654 (TestNFT)": 1
},
"policy 5ea251107632c490977d94ba46f2d5466aac9ce643bdc87e2b61e8e3": {
"asset 14d26549b376a0a7b8ed8449948b28bd00d179527edfe57c6b5c5859": 1,
"asset 4879647261486561645631 (HydraHeadV1)": 1,
"asset ff6fd6496610513e984c82c611866d052e8183bba8e09db1c2fb6688": 1
}
},
"datum": {
"constructor": 1,
"fields": [
{
"constructor": 0,
"fields": [
{
"bytes": "5ea251107632c490977d94ba46f2d5466aac9ce643bdc87e2b61e8e3"
},
{
"list": [
{
"bytes": "ca64297d89d65052b090bde90e5a5f97b56e572187157c7360900257d02a0f38"
},
{
"bytes": "8c6e2f026318fc925cfd6de9913ba71325a805e95d2f8d258e2e2f82baf66c0b"
}
]
},
{
"constructor": 0,
"fields": [
{
"int": 300000
}
]
},
{
"int": 1
},
{
"bytes": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
}
]
}
]
},
"network": "Testnet",
"payment credential script hash": "be6ebc744208c660bf0fdc1cfbb5157477cd305de5b1777e575cbb4c",
"reference script": null,
"stake reference": null
}
],
"redeemers": [
{
"purpose": {
"spending script witnessed input": "4dea5f74880a761a3d312e122b9701e9ad2ed2d74880de5b47bef087ee10de14#0"
},
"redeemer": {
"data": "Constr 0 [B \"^\\162Q\\DLEv2\\196\\144\\151}\\148\\186F\\242\\213Fj\\172\\156\\230C\\189\\200~+a\\232\\227\"]",
"execution units": {
"memory": 0,
"steps": 0
}
}
},
{
"purpose": {
"spending script witnessed input": "843676b219d25b597ae6f74dfc3a14ff11996f5e4bc282db2e847837e6abff4d#0"
},
"redeemer": {
"data": "Constr 1 [Constr 0 [List [B \"\\180\\t\\SUB\\b\\210\\179\\203\\238\\180\\FS[\\f\\175\\b\\199\\211\\CAN\\221/ptu__\\161\\188\\n1|\\131Pd\\252`\\130<wN\\183\\187\\182\\210)\\150\\214\\147(C\\232\\173\\157\\153r\\235\\174\\188\\154\\177\\208\\241\\254\\DC4\\227\\ENQ\",B \"\\183\\250\\219\\217\\133\\138\\128\\250,\\199\\&0v\\175\\215\\161\\238\\SUB2u\\175\\231\\165\\SI\\EM\\218\\137\\246\\NUL\\254\\172W\\STX\\250\\162\\250\\234\\177\\209\\CAN\\129m\\f>xW2\\177\\225\\204'\\235/\\150\\184\\225\\SOF\\SI\\216\\181{\\188W\\a\"],I 1,Constr 0 [B \"M\\234_t\\136\\nv\\SUB=1.\\DC2+\\151\\SOH\\233\\173.\\210\\215H\\128\\222[G\\190\\240\\135\\238\\DLE\\222\\DC4\",I 0]]]",
"execution units": {
"memory": 0,
"steps": 0
}
}
}
],
"reference inputs": [
"39924b477411b6bd33258f63dd705e0712b513a21ea9bc422a23ec0be797ba03#0"
],
"required signers (payment key hashes needed for scripts)": [
"14d26549b376a0a7b8ed8449948b28bd00d179527edfe57c6b5c5859"
],
"return collateral": null,
"total collateral": null,
"treasuryDonation": 0,
"update proposal": null,
"validity range": {
"lower bound": null,
"upper bound": 83567522
},
"voters": {},
"withdrawals": null,
"witnesses": []
}
- The input UTxOs look like this:
"843676b219d25b597ae6f74dfc3a14ff11996f5e4bc282db2e847837e6abff4d#0": { "address": "addr_test1wzlxa0r5ggyvvc9lplwpe7a4z4680nfsthjmzam72awtknqz5cf7v", "datum": null, "referenceScript": null, "value": { "5ea251107632c490977d94ba46f2d5466aac9ce643bdc87e2b61e8e3": { "14d26549b376a0a7b8ed8449948b28bd00d179527edfe57c6b5c5859": 1, "4879647261486561645631": 1, "ff6fd6496610513e984c82c611866d052e8183bba8e09db1c2fb6688": 1 }, "lovelace": 4663420 } } "4dea5f74880a761a3d312e122b9701e9ad2ed2d74880de5b47bef087ee10de14#0": { "address": "addr_test1wzhqrkk782wrgm2ujwhreceegy4epg9clqlefmrt4gjwxrqckxj2j", "datum": null, "referenceScript": null, "value": { "361b4ebba480bf5f279e513401bd1eb1bb1b1117000f9876d0b84401": { "546573744e4654": 1 }, "lovelace": 2348950 } },
- So this is the deposit tx https://preview.cardanoscan.io/transaction/4dea5f74880a761a3d312e122b9701e9ad2ed2d74880de5b47bef087ee10de14?tab=utxo
- We see that the locked-up UTxO is the one coming from the snapshot too, so how come this one, 4dea5f74880a761a3d312e122b9701e9ad2ed2d74880de5b47bef087ee10de14, is used as the input? I think that since we construct the increment from the `spendableUTxO`, for some reason we find the wrong one, which also means the observation could contain some bug, but I want to be able to figure all this out from the logs.
- We have a user having problems with committing a script into a head incrementally. They commit an empty UTxO for two parties and then try to increment using a blueprint tx, which leaves them unable to close and fanout.
- They see H4 `HeadValueIsNotPreserved` on the increment tx, and then on fanout this leads to H54 `FanoutUTxOToCommitHashMismatch`, but that should just be the consequence of the wrong increment.
- First I want to take a look at the logs and detect the state of things before the actual increment takes place.
- This is the deposit of node A together with the spendable utxo:
Deposit A
{
"timestamp": "2025-05-15T08:27:38.493319047Z",
"threadId": 92,
"namespace": "HydraNode-\"1\"",
"message": {
"node": {
"by": {
"vkey": "0227964e8ce2091ab78f431bfa8c468ae7edb58b6626830847e75f2b4fb0f064"
},
"input": {
"chainEvent": {
"newChainState": {
"recordedAt": {
"blockHash": "dc9a6d8cd0b2b7f199ee524df54777e021fb69cef73f66b453c6e5ff27ec348e",
"slot": 80641658,
"tag": "ChainPoint"
},
"spendableUTxO": {
"58a19e03bfc97f79b6d99f9a1ddc21d19f67d6f0ee2b717cea3bfea506207a2f#0": {
"address": "addr_test1wzhqrkk782wrgm2ujwhreceegy4epg9clqlefmrt4gjwxrqckxj2j",
"datum": null,
"inlineDatum": {
"constructor": 0,
"fields": [
{
"bytes": "c3d43601333ee3b9904de784b11cb51848f121849dd918d33ce17e0a"
},
{
"int": 1747297945581
},
{
"list": [
{
"constructor": 0,
"fields": [
{
"constructor": 0,
"fields": [
{
"bytes": "92328d0e0f67fb11a62155dc6ca3b9c8905e7afadb87d67231cbce9613d9cc6a"
},
{
"int": 0
}
]
},
{
"bytes": "d8799fd8799fd8799f581c84819073648a05b2a17ba66a37f9adbdd888a5908d0bbc2a7f64261effd8799fd8799fd8799f581cea1a07cebb935acfe6fe49e9cfbd2864e21e3d6385a4391d29782f5cffffffffa240a1401a0016080a581c41a1a2139e297b0af14d55506569157a6a258c49ce2907bf7ec78b54a147546573744e465401d87b9fd8799fa2435f706b581c84819073648a05b2a17ba66a37f9adbdd888a5908d0bbc2a7f64261e446e616d654d54657374204d696e74204e465401ffffd87a80ff"
}
]
}
]
}
]
},
"inlineDatumRaw": "d8799f581cc3d43601333ee3b9904de784b11cb51848f121849dd918d33ce17e0a1b00000196d312c7ed9fd8799fd8799f582092328d0e0f67fb11a62155dc6ca3b9c8905e7afadb87d67231cbce9613d9cc6a00ff5f5840d8799fd8799fd8799f581c84819073648a05b2a17ba66a37f9adbdd888a5908d0bbc2a7f64261effd8799fd8799fd8799f581cea1a07cebb935acfe6fe49e9cf5840bd2864e21e3d6385a4391d29782f5cffffffffa240a1401a0016080a581c41a1a2139e297b0af14d55506569157a6a258c49ce2907bf7ec78b54a1475465737458404e465401d87b9fd8799fa2435f706b581c84819073648a05b2a17ba66a37f9adbdd888a5908d0bbc2a7f64261e446e616d654d54657374204d696e74204e46544701ffffd87a80ffffffffff",
"inlineDatumhash": "7072708fc774009787ea7ce1d0d4398fd99f0485f443cacd293086259e27f922",
"referenceScript": null,
"value": {
"41a1a2139e297b0af14d55506569157a6a258c49ce2907bf7ec78b54": {
"546573744e4654": 1
},
"lovelace": 2348950
}
},
"58a19e03bfc97f79b6d99f9a1ddc21d19f67d6f0ee2b717cea3bfea506207a2f#1": {
"address": "addr_test1vqg3qe4umvdju0nxjs77rcv98rzmk4z9cshced8ysz520dqta4a9t",
"datum": null,
"datumhash": null,
"inlineDatum": null,
"inlineDatumRaw": null,
"referenceScript": null,
"value": {
"lovelace": 990666095
}
},
"79d547486be78ea86594dc9e4bc855df4ae42472ed375f4555b309ff0ee72055#0": {
"address": "addr_test1wq8r2y26937p835wekxhfeyc0szdg5u7xdmy803qhve8f0gsqwq93",
"datum": null,
"inlineDatum": {
"constructor": 1,
"fields": [
{
"constructor": 0,
"fields": [
{
"bytes": "c3d43601333ee3b9904de784b11cb51848f121849dd918d33ce17e0a"
},
{
"list": [
{
"bytes": "0227964e8ce2091ab78f431bfa8c468ae7edb58b6626830847e75f2b4fb0f064"
},
{
"bytes": "c5c1e857c0dbad1fb55a51acb46ad753889bd4aabb4cf1d5831a5379fe3af48b"
}
]
},
{
"constructor": 0,
"fields": [
{
"int": 300000
}
]
},
{
"int": 0
},
{
"bytes": "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
}
]
}
]
},
"inlineDatumRaw": "d87a9fd8799f581cc3d43601333ee3b9904de784b11cb51848f121849dd918d33ce17e0a9f58200227964e8ce2091ab78f431bfa8c468ae7edb58b6626830847e75f2b4fb0f0645820c5c1e857c0dbad1fb55a51acb46ad753889bd4aabb4cf1d5831a5379fe3af48bffd8799f1a000493e0ff005820e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855ffff",
"inlineDatumhash": "17460f2bd89e876a0d6bfb75f10971262c3d7c9dd5bb3d382f75edf814127c0a",
"referenceScript": null,
"value": {
"c3d43601333ee3b9904de784b11cb51848f121849dd918d33ce17e0a": {
"0861f42c304f2eaecf578f4631684dc338749ddd1a556d8bf8eed360": 1,
"111066bcdb1b2e3e66943de1e18538c5bb5445c42f8cb4e480a8a7b4": 1,
"4879647261486561645631": 1
},
"lovelace": 4663420
}
},
"79d547486be78ea86594dc9e4bc855df4ae42472ed375f4555b309ff0ee72055#1": {
"address": "addr_test1vqyxrapvxp8jatk02785vvtgfhpnsayam5d92mvtlrhdxcqp4jnw7",
"datum": null,
"datumhash": null,
"inlineDatum": null,
"inlineDatumRaw": null,
"referenceScript": null,
"value": {
"lovelace": 996105253
}
}
}
},
"observedTx": {
"deadline": "2025-05-15T08:32:25.581Z",
"depositTxId": "58a19e03bfc97f79b6d99f9a1ddc21d19f67d6f0ee2b717cea3bfea506207a2f",
"deposited": {
"92328d0e0f67fb11a62155dc6ca3b9c8905e7afadb87d67231cbce9613d9cc6a#0": {
"address": "addr_test1qzzgryrnvj9qtv4p0wnx5dle4k7a3z99jzxsh0p20ajzv8h2rgruawuntt87dljfa88m62ryug0r6cu95su362tc9awq660nfq",
"datum": null,
"inlineDatum": {
"constructor": 0,
"fields": [
{
"map": [
{
"k": {
"bytes": "5f706b"
},
"v": {
"bytes": "84819073648a05b2a17ba66a37f9adbdd888a5908d0bbc2a7f64261e"
}
},
{
"k": {
"bytes": "6e616d65"
},
"v": {
"bytes": "54657374204d696e74204e4654"
}
}
]
},
{
"int": 1
}
]
},
"inlineDatumRaw": "d8799fa2435f706b581c84819073648a05b2a17ba66a37f9adbdd888a5908d0bbc2a7f64261e446e616d654d54657374204d696e74204e465401ff",
"inlineDatumhash": "2e733e63e1f1f8231eee004ce50670a90624b6b0a0d16260f92a485a1a5d86a1",
"referenceScript": null,
"value": {
"41a1a2139e297b0af14d55506569157a6a258c49ce2907bf7ec78b54": {
"546573744e4654": 1
},
"lovelace": 1443850
}
}
},
"headId": "c3d43601333ee3b9904de784b11cb51848f121849dd918d33ce17e0a",
"tag": "OnDepositTx"
},
"tag": "Observation"
},
"tag": "ChainInput"
},
"inputId": 39,
"tag": "BeginInput"
},
"tag": "Node"
}
}
and the same on preview explorer https://preview.cardanoscan.io/transaction/58a19e03bfc97f79b6d99f9a1ddc21d19f67d6f0ee2b717cea3bfea506207a2f?tab=utxo
- It seems there is a difference in what we observe as deposit and the spendable utxo we have locally. We observe a deposit of:
{
"92328d0e0f67fb11a62155dc6ca3b9c8905e7afadb87d67231cbce9613d9cc6a#0": {
"address": "addr_test1qzzgryrnvj9qtv4p0wnx5dle4k7a3z99jzxsh0p20ajzv8h2rgruawuntt87dljfa88m62ryug0r6cu95su362tc9awq660nfq",
"datum": null,
"inlineDatum": {
"constructor": 0,
"fields": [
{
"map": [
{
"k": {
"bytes": "5f706b"
},
"v": {
"bytes": "84819073648a05b2a17ba66a37f9adbdd888a5908d0bbc2a7f64261e"
}
},
{
"k": {
"bytes": "6e616d65"
},
"v": {
"bytes": "54657374204d696e74204e4654"
}
}
]
},
{
"int": 1
}
]
},
"inlineDatumRaw": "d8799fa2435f706b581c84819073648a05b2a17ba66a37f9adbdd888a5908d0bbc2a7f64261e446e616d654d54657374204d696e74204e465401ff",
"inlineDatumhash": "2e733e63e1f1f8231eee004ce50670a90624b6b0a0d16260f92a485a1a5d86a1",
"referenceScript": null,
"value": {
"41a1a2139e297b0af14d55506569157a6a258c49ce2907bf7ec78b54": {
"546573744e4654": 1
},
"lovelace": 1443850
}
}
},
https://preview.cexplorer.io/tx/92328d0e0f67fb11a62155dc6ca3b9c8905e7afadb87d67231cbce9613d9cc6a
But the locked deposit looks like this one:
"58a19e03bfc97f79b6d99f9a1ddc21d19f67d6f0ee2b717cea3bfea506207a2f#0": {
"address": "addr_test1wzhqrkk782wrgm2ujwhreceegy4epg9clqlefmrt4gjwxrqckxj2j",
"datum": null,
"inlineDatum": {
"constructor": 0,
"fields": [
{
"bytes": "c3d43601333ee3b9904de784b11cb51848f121849dd918d33ce17e0a"
},
{
"int": 1747297945581
},
{
"list": [
{
"constructor": 0,
"fields": [
{
"constructor": 0,
"fields": [
{
"bytes": "92328d0e0f67fb11a62155dc6ca3b9c8905e7afadb87d67231cbce9613d9cc6a"
},
{
"int": 0
}
]
},
{
"bytes": "d8799fd8799fd8799f581c84819073648a05b2a17ba66a37f9adbdd888a5908d0bbc2a7f64261effd8799fd8799fd8799f581cea1a07cebb935acfe6fe49e9cfbd2864e21e3d6385a4391d29782f5cffffffffa240a1401a0016080a581c41a1a2139e297b0af14d55506569157a6a258c49ce2907bf7ec78b54a147546573744e465401d87b9fd8799fa2435f706b581c84819073648a05b2a17ba66a37f9adbdd888a5908d0bbc2a7f64261e446e616d654d54657374204d696e74204e465401ffffd87a80ff"
}
]
}
]
}
]
},
"inlineDatumRaw": "d8799f581cc3d43601333ee3b9904de784b11cb51848f121849dd918d33ce17e0a1b00000196d312c7ed9fd8799fd8799f582092328d0e0f67fb11a62155dc6ca3b9c8905e7afadb87d67231cbce9613d9cc6a00ff5f5840d8799fd8799fd8799f581c84819073648a05b2a17ba66a37f9adbdd888a5908d0bbc2a7f64261effd8799fd8799fd8799f581cea1a07cebb935acfe6fe49e9cf5840bd2864e21e3d6385a4391d29782f5cffffffffa240a1401a0016080a581c41a1a2139e297b0af14d55506569157a6a258c49ce2907bf7ec78b54a1475465737458404e465401d87b9fd8799fa2435f706b581c84819073648a05b2a17ba66a37f9adbdd888a5908d0bbc2a7f64261e446e616d654d54657374204d696e74204e46544701ffffd87a80ffffffffff",
"inlineDatumhash": "7072708fc774009787ea7ce1d0d4398fd99f0485f443cacd293086259e27f922",
"referenceScript": null,
"value": {
"41a1a2139e297b0af14d55506569157a6a258c49ce2907bf7ec78b54": {
"546573744e4654": 1
},
"lovelace": 2348950
}
},
This means that when the increment is constructed using the spendable utxo it is just wrong, and we get a mismatch between the values we expect in the head input + deposit and the head output. But the question is: why do we try to construct the increment with the wrong UTxO?
- Deposit observation does not depend on the spendable UTxO at all; it cares only about the deposit transaction, which then leads to posting the increment tx, which looks at the spendable UTxO and tries to find some deposit utxo to spend. Clearly this leads to problems if we have different UTxO in the spendable utxo and in the observation!
- Thinking about how to reproduce this in some test. I wonder how we ended up with some deposit utxo in our spendable/Head UTxO? Perhaps we need to do what SN suggested and tie the tx together with its UTxO - what we call `ResolvedTx` (rough sketch below)? That way the observed UTxO is always tied to some tx we observed.
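A hedged sketch of that `ResolvedTx` idea (hypothetical shape, not an existing type): carry the transaction together with the UTxO it was observed against, so later steps such as building the increment cannot pick inputs from a diverging spendable UTxO.

```haskell
-- Pair an observed transaction with the UTxO it was resolved against at
-- observation time.
data ResolvedTx tx utxo = ResolvedTx
  { resolvedTx :: tx
  , resolvedInputs :: utxo
  }
  deriving (Show)
```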
- Time to finalize the blockfrost PR. I still need to see fanout work but I expect no problems.
- There are a couple of ideas related to improving the design of this integration. I'll look up adr3 or so, and if that sounds like it will take too much time I'll default to having two instances of `BackendOps` - one for the direct and one for the blockfrost backend.
- I don't see the close tx being on chain. Need to inspect.
- Yup, the close tx is never ending up on chain for some reason and I don't see any errors. Wat? I did see a deposit observation in between which is not related to this head. Simplifying the snapshot to only close whatever we committed is fine I think, so I'll do that. Waiting on the test results is really taking its time.
- So the tx did not even end up in mempool! Wat? How come I don't see any error?
- Perhaps the submission client stopped and is silently not submitting txs? But I already saw `PostedTx` for this close.
- Waiting for a long time yielded this error:
uncaught exception: APIBlockfrostError
BlockfrostError "ServantClientError (ConnectionError (HttpExceptionRequest Request {\n host = \"cardano-preview.blockfrost.io\"\n port = 443\n secure = True\n requestHeaders = [(\"Accept\",\"application/json;charset=utf-8,application/json\"),(\"User-Agent\",\"blockfrost-haskell/0.12.2.0\"),(\"project_id\",\"w25SicgUreG35eWDNnSx6ZgYhvPrRpS8\")]\n path = \"/api/v0/blocks/26ae9bb9162c694c7aac69dcfbcff931f260a17badc1e3cb811f7698af8ef1da\"\n queryString = \"\"\n method = \"GET\"\n proxy = Nothing\n rawBody = False\n redirectCount = 10\n responseTimeout = ResponseTimeoutDefault\n requestVersion = HTTP/1.1\n proxySecureMode = ProxySecureWithConnect\n}\n (InternalException (HandshakeFailed (Error_Misc \"Network.Socket.recvBuf: resource vanished (Connection reset by peer)\")))))"
- so it seems blockfrost chokes sometimes...
- I'll just wait for the test to either finish or die since I don't have any bright ideas on what could cause this problem with close when all other txs go through and are observed.
- Maybe I do the typeclass changes and have somebody from the team to look at this with me? I am kinda tired of doing this thing for so long.
- I gave up on the idea since I am not satisfied with these two separate instances. I used an Either type for the chain backend in the Options and it is not making things look pretty. Am I missing something? I would like to work with someone on this since I don't want to lose any more time.
- Looking at close again I figured out what the problem could be - tx validity. Indeed the close tx validity looks like this:
TxValidityLowerBound AllegraEraOnwardsConway (SlotNo 80486130)
TxValidityUpperBound ShelleyBasedEraConway (Just (SlotNo 80486140))
but the tick immediately after submitting close tx is:
Tick
{ chainTime = 2025-05-13 13:15:25 UTC
, chainSlot = ChainSlot 80486125
}
So I can't get the exact slot of the posted tx, but what I believe is that the close tx gets discarded because of its validity range and never actually ends up on-chain. This would explain why I don't see it in the explorer either.
- Hypothesis was right! Bumping the contestation deadline, which is what the upper validity of the transaction is calculated from, did the trick. I finally have a green test!
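Roughly what seems to have happened, using the numbers from the log above (illustrative only, not the actual node code):

```haskell
-- The close tx validity window was [80486130, 80486140] while the tick
-- right after submission was at slot 80486125, so the tx only had a
-- ~10 slot window to make it into a block before becoming invalid.
closeLower, closeUpper, tickSlot, windowWidth :: Integer
closeLower = 80486130
closeUpper = 80486140
tickSlot   = 80486125
windowWidth = closeUpper - closeLower -- == 10 slots
```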
- Today I want to get to a working test finally and for that to work out I need to find a bug in the observation code that prevents me from observing commit transaction.
- I've sprinkled some tracing in the observation code to be able to figure out why a tx was not observed.
- I see that the observation check that we are spending the initial output is failing for whatever reason.
- This happens because resolving input UTxO comes out empty. But why exactly?
- Perhaps I need to wait to see the published scripts first before observing but that doesn't make too much sense to me. TxIds I pass into direct chain should then not be valid and I should get a different error? Anyway, I'll give it a shot.
- Nah, that was not it. I noticed one important thing: the slots keep going back. Observed slots should always move forward, otherwise we update the spendable UTxO again, rewind it back to the init observation state, and can no longer observe the following transactions.
- This is definitely a bug I am experiencing and I can't tell for now why we go back to the first starting slot after observing the init.
- Do I need to implement `onRollBackward` too? I can't see where I go back to an already seen slot, argh.
- Just by using a TVar to record a block hash things started working! Wat? I was under the impression I was using the recursion correctly? Is it the retry that was screwing things up?
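A minimal sketch of the TVar trick (hypothetical names; the real chain follower differs), assuming the follow loop may be retried and must not re-process a block it has already handled:

```haskell
import Control.Concurrent.STM (TVar, atomically, readTVar, writeTVar)

-- Hypothetical block type standing in for whatever the follower yields.
data Block = Block { blockHash :: String, blockSlot :: Integer }

-- Only hand the block to the callback if its hash differs from the last
-- recorded one; this makes the step idempotent under retries.
processIfNew :: TVar (Maybe String) -> (Block -> IO ()) -> Block -> IO ()
processIfNew lastHashVar onNewBlock blk = do
  alreadySeen <- atomically $ do
    lastHash <- readTVar lastHashVar
    if lastHash == Just (blockHash blk)
      then pure True
      else writeTVar lastHashVar (Just (blockHash blk)) >> pure False
  if alreadySeen then pure () else onNewBlock blk
```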
- Now got a fuel utxo error related to collect yay!
- Duh, I was using Alice's keys for everything; fixed the error by using an externally generated key for committing and actually having enough funds.
- I need to see why I can't query by address and txId and get UTxO for both faucet and alice?
- Ok, this was happening because there is a LOT of UTxO at both the faucet and alice addresses. Blockfrost gives us at most 100 entries sorted ascending, so I couldn't find the last UTxO.
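The paging issue in a nutshell - blockfrost returns at most 100 entries per page, so to see the latest UTxO you either have to walk all pages or sort descending. A sketch with a hypothetical page-based query:

```haskell
-- Walk pages until a short page is returned; 'queryPage' is a stand-in
-- for the real paged address-UTxO query.
fetchAllUTxO :: (Int -> IO [utxo]) -> IO [utxo]
fetchAllUTxO queryPage = go 1
 where
  pageSize = 100
  go n = do
    page <- queryPage n
    if length page < pageSize
      then pure page
      else (page <>) <$> go (n + 1)
```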
- I've changed the `awaitUTxO` query to return only the last UTxO for a specific address and then filter by txid.
- This worked and I got further! Now the commit tx fails because of insufficient collateral.
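Roughly the shape of such an `awaitUTxO` helper (a sketch with a hypothetical address query and simple polling, not the actual implementation):

```haskell
import Control.Concurrent (threadDelay)

-- Stand-in types for the sketch.
type Address = String
type TxId = String
data TxOutInfo = TxOutInfo { utxoTxId :: TxId, utxoLovelace :: Integer }

-- Poll the address until an output created by the expected tx shows up,
-- or give up after the given number of attempts.
awaitUTxO
  :: (Address -> IO [TxOutInfo]) -- ^ e.g. a blockfrost-backed address query
  -> Address
  -> TxId                        -- ^ tx expected to have produced the output
  -> Int                         -- ^ remaining attempts
  -> IO (Maybe TxOutInfo)
awaitUTxO queryAddressUTxO addr expectedTx attempts
  | attempts <= 0 = pure Nothing
  | otherwise = do
      outs <- queryAddressUTxO addr
      case filter ((== expectedTx) . utxoTxId) outs of
        (o : _) -> pure (Just o)
        [] -> do
          threadDelay 1000000 -- one second between polls
          awaitUTxO queryAddressUTxO addr expectedTx (attempts - 1)
```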
- This seems random and it worked on the second try? I added more ada and now it seems to work, but then observing the commit failed with:
expected: OnCommitTx {headId = UnsafeHeadId "\212\164\159K\128/\130j\169\170\211\155\186\128}\182z\205KC\STX\162;b;K\"\DC3", party = Party {vkey = HydraVerificationKey (VerKeyEd25519DSIGN "d5bf4a3fcce717b0388bcc2749ebc148ad9969b23f45ee1b605fd58778576ac4")}, committed = fromList [(TxIn "9bae27ff84d84af2a808432341801edab64c75dc8ce38664b09ee54fa8abfa2f" (TxIx 0),TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraConway) (ShelleyAddress Testnet (KeyHashObj (KeyHash {unKeyHash = "f8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d"})) StakeRefNull)) (TxOutValueShelleyBased ShelleyBasedEraConway (MaryValue (Coin 7000000) (MultiAsset (fromList [])))) TxOutDatumNone ReferenceScriptNone)]}
but got: OnInitTx {headId = UnsafeHeadId "\212\164\159K\128/\130j\169\170\211\155\186\128}\182z\205KC\STX\162;b;K\"\DC3", headSeed = UnsafeHeadSeed "\"b511295ef9050a5f4df3c52029abce4c9e3dd703bc6e606746640f46aca5679c#0\"", headParameters = HeadParameters {contestationPeriod = 10s, parties = [Party {vkey = HydraVerificationKey (VerKeyEd25519DSIGN "d5bf4a3fcce717b0388bcc2749ebc148ad9969b23f45ee1b605fd58778576ac4")}]}, participants = [UnsafeOnChainId "\248\166\140\209\142Y\166\172\232H\NAKZ\SO\150z\246OM\NUL\207\138\206\232\173\201Zk\r"]}
- I already wait for 100 seconds which is way too much but perhaps I should wait some more to see if the observation goes through.
- Bumping to 150 yields the same error so perhaps I need to look at observation code. Feels weird that we would observe init but fail to observe commit. Problems never stop appearing it seems.
- Since we are just submitting a commit tx and then expect to see the observation, I added code to await the UTxO appearing at the address before waiting on the observation.
- I also added a failure after some number of seconds when awaiting a UTxO since we don't want tests running forever.
- Now I see unexpected deposit observation after commit which means I need to wait some more for the deposit to settle.
- Waiting more didn't help - I am confused since I can observe InitTx just fine and this observation appears again when I am trying to observe Commit after sending external commit tx.
- On a separate note - if I run the test, ctrl-c it and run again, the faucet UTxO is not correct since we didn't re-observe it. There should be a way of knowing whether a certain UTxO is likely the latest one and that there is no pending tx in progress for a certain address. I need to explore the blockfrost api to see if I can find something suitable (mempool diagnostics or similar)
- Nice there is https://docs.blockfrost.io/#tag/cardano--mempool/GET/mempool/addresses/{address} and https://docs.blockfrost.io/#tag/cardano--mempool/GET/mempool/{hash}
- I will look into these later on once I see the freaking green test.
- Running on public networks is also a pain in the ass for testing. There is this thing I could try https://github.com/blockfrost/blockfrost-backend-ryo but it would require a lot of fiddling around and I need a solution I can ship with hydra for testing purposes.
- Another idea I just got is why don't I run some hydra-chain-observer test using the code from blockfrost observations? It might be faster for me to figure out where the error is? Argh, no extensive testing is in there, just a bit of observation testing.
- I noticed that slots seem to be off when following the chain:
new block hash: BlockHash "a014462e775f5df1a2830a175b06b1b4d98e0fa89999622eee32a74bc3a64c2e"
new slot number: 80051427
new block hash: BlockHash "c08a8b97f9d6389913ec69c5634d33b903e8b241bac5e1e54035bc7dde7227b1"
new slot number: 80051427
- I see in the blockfrost `Block` that there is a `_blockSlot` and a `_blockEpochSlot`. Not sure what the difference is, I'll try using `_blockEpochSlot` to see how it looks.
- Ah no, this is wrong. Let's compare the block hash vs. slot number on cardano explorer and blockfrost. Btw `SlotNo` on cardano is a `Word64` while blockfrost uses `Integer`.
- Maybe this was just me being tired, block slots seem fine. Perhaps I need to somehow implement `onRollBackward`?
- I added 5 block confirmations instead of 1 - right now I am very tired and it doesn't seem I will find the problem like this. A small break is in order. Argh, more confirmations require more waiting on the UTxO.
- Before that I wanted to try filtering the UTxO I get from `queryUTxOForTxIn`
  - for some reason this doesn't seem to work - I get a `BadInputsUTxO` error. Need to raise this with the blockfrost people.
- I will leave this for now just to refresh my brain and instead focus on making my silly typeclass a bit better.
- I managed to finally get to a working tx submission by using the `CostModelsRaw` from blockfrost. The plain `CostModels` contain the field names and they are not in the order I was expecting them to be in. Now we can finally evaluate txs from the wallet.
- After this fix I noticed I am missing the whole part of the code that actually submits transactions in the `Direct` module.
- I copy/pasted the relevant code from the cardano implementation, which worked out of the box.
- Now I had to bump the time it takes to observe some tx in the `DirectChainSpec` since blockfrost takes a bit more time.
- And now I see `BadInputs` again after observing the initialization, or sometimes we fail to observe the initialization.
- Pretty sure the code I copy/pasted with some changes is causing this - this should be the final step before the blockfrost integration starts working.
- The `BadInputsUTxO` failure happens when trying to incrementally commit some UTxO. I'll render the tx to see what it looks like first.
- By looking at the tx it is obvious that `seedFromFaucetBlockfrost` needs to await the faucet UTxO, but it also needs to return whatever UTxO we seeded to some actor.
- For some reason just awaiting with the recipient address hangs - I would expect to see the output with the corresponding input txid. Perhaps there is another bug related to how we construct the cardano UTxO (there is a related TODO about the two almost identical ways of doing it).
-
Let's see if publishing of scripts still works and start from there. I should inspect how
EraHistory
is assembled since that is related for tx evaluation which is where the error comes from. -
Problem is that tx evaluation thinks
PlutusV3
is not enabled yet and that information comes from looking atEraHistory
. -
There is still a problem with flaky waiting on
UTxO
from blockfrost which sometimes results inBadInput
error, so I might also choose to fix that problem first. -
Script publishing (with waiting) still works, good.
-
Let's focus first on
BadInput
error. This is how the current wallet UTxO looks like:
ubuntu@ip-10-1-6-128:~$ sudo ./cardano-cli-x86_64-linux query utxo --socket-path preview/node.socket --address addr_test1vztc80na8320zymhjekl40yjsnxkcvhu58x59mc2fuwvgkc332vxv --testnet-magic 2
TxHash TxIx Amount
--------------------------------------------------------------------------------------
dcec59151ed3616b0836bdb5d3e77f99cbe8525c83683030f1868cf4b2108ad3 1 143491445598 lovelace + 1 13d1f7feab83ff4db444bf96b8677949c5bf9c709671f30ff8f33ab3.487964726120446f6f6d202d2033726420506c6163652054726f706879 + 1 19c98d04cdb6e1e782a73e693697d4a46ca9820d5d490a3bf6470a07.487964726120446f6f6d202d20326e6420506c6163652054726f706879 + 1 1a22028742629f3cf38b3d1036a088fea59eb30237a675420fb25c11.2331 + 1 6d92350897706b14832c62c5b5644e918f0b6b3b63ffc00a1a463828.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 ad39d849181dc206488fd726240c00b55547153ffdca8c079e1e34d9.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 bfe4ab531fd625ef33ea355fd85953eb944bffa401af767666ff411c.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 c953682b6eb5891c0bda35718c5261587d57e5e408079cbeb8cf881a.2331 + 1 cd6076d9d0098da4c7670c08f230e4efe31d666263c9db5196805d6e.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 d0c91707d75011026193c0fce742443dde66fa790936981ece5d9f8b.2331 + 69918000000 d8906ca5c7ba124a0407a32dab37b2c82b13b3dcd9111e42940dcea4.0014df105553444d + 1 dd7e36888a487f8b27687f65abd93e6825b4eb3ce592ee5f504862df.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 fa10c5203512eeeb92bf79547b09f5cdb2e008689864b0175cca6fee.487964726120446f6f6d202d2034746820506c6163652054726f706879 + TxOutDatumNone
-
So more than enough ada but also some tokens all in one output. Why would this be problematic?
-
Running the test I get:
FaucetBlockfrostError {blockFrostError = "BlockfrostBadRequest \"{\\\"contents\\\":{\\\"contents\\\":{\\\"contents\\\":{\\\"era\\\":\\\"ShelleyBasedEraConway\\\",\\\"error\\\":[\\\"ConwayUtxowFailure (UtxoFailure (ValueNotConservedUTxO (MaryValue (Coin 0) (MultiAsset (fromList []))) (MaryValue (Coin 143491445598) (MultiAsset (fromList [(PolicyID {policyID = ScriptHash \\\\\\\"13d1f7feab83ff4db444bf96b8677949c5bf9c709671f30ff8f33ab3\\\\\\\"},fromList [(\\\\\\\"487964726120446f6f6d202d2033726420506c6163652054726f706879\\\\\\\",1)]),(PolicyID {policyID = ScriptHash \\\\\\\"19c98d04cdb6e1e782a73e693697d4a46ca9820d5d490a3bf6470a07\\\\\\\"},fromList [(\\\\\\\"487964726120446f6f6d202d20326e6420506c6163652054726f706879\\\\\\\",1)]),(PolicyID {policyID = ScriptHash \\\\\\\"1a22028742629f3cf38b3d1036a088fea59eb30237a675420fb25c11\\\\\\\"},fromList [(\\\\\\\"2331\\\\\\\",1)]),(PolicyID {policyID = ScriptHash \\\\\\\"6d92350897706b14832c62c5b5644e918f0b6b3b63ffc00a1a463828\\\\\\\"},fromList [(\\\\\\\"487964726120446f6f6d202d2031737420506c6163652054726f706879\\\\\\\",1)]),(PolicyID {policyID = ScriptHash \\\\\\\"ad39d849181dc206488fd726240c00b55547153ffdca8c079e1e34d9\\\\\\\"},fromList [(\\\\\\\"487964726120446f6f6d202d2031737420506c6163652054726f706879\\\\\\\",1)]),(PolicyID {policyID = ScriptHash \\\\\\\"bfe4ab531fd625ef33ea355fd85953eb944bffa401af767666ff411c\\\\\\\"},fromList [(\\\\\\\"487964726120446f6f6d202d2031737420506c6163652054726f706879\\\\\\\",1)]),(PolicyID {policyID = ScriptHash \\\\\\\"c953682b6eb5891c0bda35718c5261587d57e5e408079cbeb8cf881a\\\\\\\"},fromList [(\\\\\\\"2331\\\\\\\",1)]),(PolicyID {policyID = ScriptHash \\\\\\\"cd6076d9d0098da4c7670c08f230e4efe31d666263c9db5196805d6e\\\\\\\"},fromList [(\\\\\\\"487964726120446f6f6d202d2031737420506c6163652054726f706879\\\\\\\",1)]),(PolicyID {policyID = ScriptHash \\\\\\\"d0c91707d75011026193c0fce742443dde66fa790936981ece5d9f8b\\\\\\\"},fromList [(\\\\\\\"2331\\\\\\\",1)]),(PolicyID {policyID = ScriptHash \\\\\\\"d8906ca5c7ba124a0407a32dab37b2c82b13b3dcd9111e42940dcea4\\\\\\\"},fromList [(\\\\\\\"0014df105553444d\\\\\\\",69918000000)]),(PolicyID {policyID = ScriptHash \\\\\\\"dd7e36888a487f8b27687f65abd93e6825b4eb3ce592ee5f504862df\\\\\\\"},fromList [(\\\\\\\"487964726120446f6f6d202d2031737420506c6163652054726f706879\\\\\\\",1)]),(PolicyID {policyID = ScriptHash \\\\\\\"fa10c5203512eeeb92bf79547b09f5cdb2e008689864b0175cca6fee\\\\\\\"},fromList [(\\\\\\\"487964726120446f6f6d202d2034746820506c6163652054726f706879\\\\\\\",1)])])))))\\\",\\\"ConwayUtxowFailure (UtxoFailure (BadInputsUTxO (fromList [TxIn (TxId {unTxId = SafeHash \\\\\\\"dcec59151ed3616b0836bdb5d3e77f99cbe8525c83683030f1868cf4b2108ad3\\\\\\\"}) (TxIx {unTxIx = 1})])))\\\"],\\\"kind\\\":\\\"ShelleyTxValidationError\\\"},\\\"tag\\\":\\\"TxValidationErrorInCardanoMode\\\"},\\\"tag\\\":\\\"TxCmdTxSubmitValidationError\\\"},\\\"tag\\\":\\\"TxSubmitFail\\\"}\""}
-
So
dcec59151ed3616b0836bdb5d3e77f99cbe8525c83683030f1868cf4b2108ad3#1
seems invalid? Why? This is exactly the single UTxO I would like to be able to spend. -
This is the same tx in the explorer https://preview.cexplorer.io/tx/dcec59151ed3616b0836bdb5d3e77f99cbe8525c83683030f1868cf4b2108ad3 and it seems like after this tx we have the following UTxO https://preview.cexplorer.io/address/addr_test1vztc80na8320zymhjekl40yjsnxkcvhu58x59mc2fuwvgkc332vxv/tx#data
-
It seems like we return wrong next UTxO from publishing scripts since I see that the last UTxO I spent was the one seeding 100 ada. Then we ran the test which published the scripts but then seeding first time worked and again failed after the initial 100 ada seed.
-
Both scripts publishing and seeding from UTxO call the same
awaitTransactionId
function. Altering the script publishing to await after every tx results in the same error. -
Perhaps I could try just seeding three UTxO and print out the actual utxo so I can try to pinpoint where the error comes from.
-
I see in the explorer that two transactions worked but the error I see is related to the first UTxO. How can it be that the two txs succeeded but the error I see comes from the UTxO I saw after the first tx? Third seeding can't possibly take in as the input some UTxO which is the output of a first tx?
-
I noticed that
await
usesqueryUTxOByTxIn
while getting the faucet utxo usesqueryUTxO
. There could be a problem wherequeryUTxO
reports the wrong utxo. There is alsotoCardanoUTxO
and its primed version that need to be unified.
In general
queryUTxO
result can't be trusted. Seems like we get better UTxO information when we query transaction UTxO. We could try to chain these seed operations so if we provideNothing
to the seeding function it tries to query the UTxO - otherwise we re-use the UTxO we already know. -
This could work but it is not exactly what I want since for example if you publish the scripts first (or do any other information) how do you know that after that queried UTxO will be the last one really?
-
In general we need a secure way to know this is the latest UTxO of specific address.
-
After chaining of the txs I get:
FaucetBlockfrostError {blockFrostError = "BlockfrostBadRequest \"{\\\"contents\\\":{\\\"contents\\\":{\\\"contents\\\":{\\\"era\\\":\\\"ShelleyBasedEraConway\\\",\\\"error\\\":[\\\"ConwayUtxowFailure (MissingVKeyWitnessesUTXOW (fromList [KeyHash {unKeyHash = \\\\\\\"f8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d\\\\\\\"}]))\\\"],\\\"kind\\\":\\\"ShelleyTxValidationError\\\"},\\\"tag\\\":\\\"TxValidationErrorInCardanoMode\\\"},\\\"tag\\\":\\\"TxCmdTxSubmitValidationError\\\"},\\\"tag\\\":\\\"TxSubmitFail\\\"}\""}
- So we are missing a sig somewhere. This happens after the initial 100 ADA seed.
- What I want is to be able to query the UTxO and know it is the actual
updated UTxO after the last sent tx. I could provide
Tx
hash I expect to see in the utxo and filter like that.
- Let's do one round of tracing to try to spot where the problem is. We start with faucet utxo "6c2fcbb6b7d79a08eee4588feabeb79e2d8b3d77d4c916291c39ba514bb449f8#1". From the script publishing we see:
303f49141d3042bc0c25105ab0686ca274e92aa23ac83a65b260b98961c2108b#0
https://preview.cexplorer.io/tx/303f49141d3042bc0c25105ab0686ca274e92aa23ac83a65b260b98961c2108b
9d86e58a5a9afb24bfd22dfef0f171cb36496e7c510ac784b948259c4c4fd5ac#0
https://preview.cexplorer.io/tx/9d86e58a5a9afb24bfd22dfef0f171cb36496e7c510ac784b948259c4c4fd5ac
8a0243a3ab24d0acd9d64111f6a826485173236c36a77c54a6b53d425b1e81c7#0
https://preview.cexplorer.io/tx/8a0243a3ab24d0acd9d64111f6a826485173236c36a77c54a6b53d425b1e81c7
-
When we try to find faucet utxo after that we get
6c2fcbb6b7d79a08eee4588feabeb79e2d8b3d77d4c916291c39ba514bb449f8#1
which is the utxo we got BEFORE publishing of scripts. So this means either the await in the script publishing is not doing its job or the two different queries to get the latest blockfrost utxo are behaving differently.
I tried using new function
awaitUTxO
that also tries to wait for specific utxo with txid but it seems that it hangs forever. Need to fix that one first. -
Ok, the function seems to be working - it was just a matter of waiting on the last tx id from publishing txs instead of trying to find all three (which makes sense, I was just stupid).
-
In general it seems like the blockfrost api behaves differently when you have a tx hash and want to get the UTxO for it versus when you have the address and just want to get the UTxO for that address.
-
One problem solved, although I think heavy refactoring should be next in line, but let's focus on another problem at hand first.
-
I am not able to post an init tx because of:
LedgerLanguageNotAvailableError
-
This should be tightly related to how we construct
EpochInfo
(and tightly relatedEraHistory
) so I need to revisit this and compare what we get from a real cardano-node and blockfrost api. -
I noticed that the filtering was reversed! This should fix the last problem but I still see occasional
BadInputs
and this is because we can't know what is the latest fresh faucet UTxO on each test run! This is so annoying - the endpoint that gives you utxo for address is not updated immediately so if you don't know the exact txid to await for then we need to wait a bit after each test run in order to hope that the UTxO we see after hydra scripts publishing is correct (or perhaps even the script publishing was failing because of the same problem). -
I noticed that major and minor version in
ProtVer
are reversed 😱 This must be the problem I am facing. -
Ha! found it. Now I get the error related to overspending the budget which I need to look into next.
-
I noticed that memory and steps in MaxTxExecutionUnits are reversed! Another error found! BUT I still see an overspent budget.
-
Last step to make the node safe against too-soon deadlines is to fix it to not accept snapshots if the deadline is too soon.
-
This is logic that needs to go into the
ReqSn
handling. -
I wondered why does the
ReqSn
contain UTxO if the node needs to know about the deposit anyways? Make this aTxId
too like the normal requested transaction ids to confirm. -
As I added a
created :: SlotNo
to theDepositObservation
, I realized this is the first of its kind where we want to observe a slot from the chain while we actually need its wall clock equivalent in theHeadLogic
. Where to convert? -
Problem:
convertObservation
is also used by thehydra-chain-observer
and we can only convert time in ahydra-node
(we only need it converted in the node).
- Could: front load conversion of slot to time to deposit creation, put it into the datum and only observe if it matches?
- Does not make sense.. we could convert on observation right away..
- Providing a
TimeHandle
toconvertObservation
makes sense if we can just switch thehydra-chain-observer
to use theHeadObservation
type directly.
-
Finally, to make the deposit only be signed after becoming active (settled), I reached for the
Wait
outcome to have the snapshot request be handled later, once the deposit is active. However, this requires a significant increase in `TTL`
and wait between re-enqueuing to ensure we can wait long enough for the deposit to become active (> deposit period). -
Increasing the TTL is likely not a good idea because the input queue is not persisted (right now?). An alternative could be to shift the burden onto the snapshot leader and have them re-submit snapshot requests until the head moves forward again… this sounds actually quite nice as it would make the overall protocol more robust?
- Continuing from yesterday I need to make blockfrost awaiting work. The UTxO after publishing scripts is not good and I have duplicated logic for when you get script outputs and regular pub key outputs I need to unify.
- After unifying these two functions I see that await works but sometimes it
also gives back the wrong utxo which causes next tx in line to fail with
BadInputs
.
- This is a bit weird since it seems api calls to blockfrost are somehow flaky?
- Even after publishing scripts and seeding from a faucet works - when
assembling a
InitTx
in wallet I get this dreaded error:
ScriptFailedInWallet {redeemerPtr = "ConwayMinting (AsIx {unAsIx = 0})", failureReason = "ValidationFailure (WrapExUnits {unWrapExUnits = ExUnits' {exUnitsMem' = 0, exUnitsSteps' = 0}}) (CodecError (LedgerLanguageNotAvailableError {sdeAffectedLang = PlutusV3, sdeIntroPv = 9, sdeThisPv = 0})) [] (PlutusWithContext {pwcProtocolVersion = Version 0, pwcScript = Left (Plutus {plutusBinary = \"WRWiAQEAMzMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyMjIyIiIykwAQApEUgAyJQGpLJksmRkZGSqZmrmgARGAEImSqZmrmgARGAAImBkBiZuHSACADM3DpAAABGqueACNVc6ACbqgBImYESSAQNNMDYAWTMwICIyMzVzQAMACAEZuPACABMCEAEwI5ABkRABpCZgXkSyADGABIhkAKRLJkZmrmgAYAEAIzcQACkABEwBgA4wAgADGAEiZmZkZEREplJmBSkhA00wMQAyMwNiJZABjACkQrKzIBEikASRGRmauaABgAQAjNx4AgARGQCJFIAkiMjM1c0ADAAgBGbhwAwASMAJGABEwBAAQAEzA1IlkAGMACRDADkAKRABJgCAAkqADIiIiIiIiIiIBBMjIyMjIVkzAuSRA00wMgAyMzVzQAMACAEZuHMlkzMC0iMjM1c0ADAAgBGbjwAgAVAGMDBQB5CYAQAMUgABkZkRkRgBAAmB6RLIAMQA5EMgBSIzAIABMAYAMkAGRABJABkQAKACRGbgAAgAYG4zcAoAKQAUVkzAuSQBA00wMwAzIyIwAgATA7IlkAGMAKRDIAUiMlUzNXNAAiJgZgBCJgDgCGbjwAgBxQBZUAORAAorKyZgXJIBA00wNAAyMzVzQAMACAEZuHUAEzALESIyIwAgATA8IlkAGIAciEzAGACMAQAEoAkTMDMlkzMC0iMjM1c0ADAAgBGbjwAgAVAGABkKyADEwO0kAQNNMDgAkQyAFIlkAOKyZGZq5oAGABACM3DgApABRgARMD9JAQNNMDgASITBBSQBA00wOAARMDpJAQNNMDcAKAJGAEismYGZLIAMVAxkKyagbAAyEzAxSQEDTTEyADIzNXNAAwAIARm48AFQCImB2kkDTTEyAEhUDIZgdESyADGABIhgByAFIgAkwBAAQzAMAKUAaJmBckkDTTA1AFkyMzVzQAMACAEZuPlQApEQAZQBYmSzIAMikAuRGRmauaABgAQAjNx4AgARGQAZFIBciMjM1c0ADAAgBGbhwAwASMAJKgBSIgAkYASMAJGAEjACRgBBlQAZEQAIVmVABkQASKgYSFZMzMzA0MjIyMiIyUyUzNXNABCKyZmB7ACIoAMAEAKACAByGQAZEsmAUAFIVkzMEGAERQAYAIAUAEACkMgAyJZMwDADQApCsmZgiwAiKADABACgAgAUhkAGRLJqCMAFIVkzMEmAERQAQAgAUhWTMzMwSSIyVTM1c0ACIrJmYJsAIigAwAQAoAIAFIZABkSyagnABSFZMzBRgBEUAEAIAFIVk1AdABkMAGACAHABIwAkYASMAIjACCMAIzcOkAAAEUAIoARQAigBAAyGADAFALgDwA4AJEwDkmRMA1JkTAMSYiYBKTImAQkxEwBUmRMARJiJgApMETABSYyIyVTM1c0ACIrJmYH8AIigAgBABZCsmZmZgfkRkqmZq5oAERWTMwQ4ARFABgAgBQAQAKQyADIlk1BEACkKyZmCPACIoAMAEAKACABSGQAZEsmYCQCYAUhWTMwS4ARFABgAgBQAQAKQyADIlkwGAApCsmZgnwAiKADABACgAgAUhkAGRLJqA6AFIVkzMFOAERQAQAgAUhWTUFIAGQwAYAIB8AuAPADgAkYASMAJGAERgBIwAiMAJGAERgBIwAiMAIIwAjNw6QAAARQAigBFACKAEADIYAMAcAEiYAiTImAGkwRMANJjNw6QAQApkRkqmZq5oAERWTMwP4ARFABACACyFZMzMzA/IjJVMzVzQAIismZghwAiKADABACgAgAUhkAGRLJqCIAFIVkzMEeAERQAYAIAUAEACkMgAyJZMwEgEwApCsmZglwAiKADABACgAgAUhkAGRLJgMABSFZMzBPgBEUAGACAFABAApDIAMiWTUB0AKQrJmYKcAIigAwAQAoAIAFIZABkSyagQgBSFZMzBXgBEUAGACAFABAApDIAMiWTUFgAKQrJmYLcAIigAwAQAoAIAFIZABkSyaguABSFZMzBfgBEUAGACAFABAApDIAMiWTUGAAKQrJmYMcAIigAwAQAoAIAFIZABkSyZgXKDIAFIVkzMGeAERQAQAgAUhWTUDMAGQwAYAIEcB+A3AXgJwD4BcAeAHABIwAkYASMAIjACRgBEYASMAIjACRgBEYASMAIjACRgBEYASMAIjACRgBEYASMAIjACCMAIzcOkAAAEUAIoARQAigBAAyGADABABImAIkyJgBpMETADSYzcOkAIAKRkqmZq5oAERgAwAgjACM3DpADACGbh0gAAAiMjMzMDkigBFACABKAEUAJTMD0iEzA7gAwABEsmAKAFIVk1MAUSABACkMAGAGAFABIwAkTKAEamAIJAAgAhIAElk1A2ABkMAEAFGAEIzMzMDYiMlUzNXNAAiKyZmB1ACIoAIAQAKQrJqAMADIYAIAKMAJGAEEYARm4dIAAAIoARQAigBFACABEzMzA0IoARQAigBFABABFACKAEUAIoARQAgAZCsgAyFQMoqBjIiIYAIAUAGZgHiJEZEYAQAJggESyADEAORCZgDABGAIACADSFQMkTA4SQQNNMDkASFQMQrJmASAMoAcTA2SRA00xMQCRCsgAxACkQmB0kgEDTTExAATMDYiWQAYwAJEMAOQApEACmAIACGYBAAygBCYFAAUYAQZABkRABiM1UAEiM3AAApABJAAERmBgRLIAMYAEiEyWZADkRFIAkikAKUAKRkqmZq5oAERgAwAQBoA4IwAjNx4AIBwkMAMAGAEiACMAQAFIAMiIiIiIiIiIgDgBQBIBAAIwBQBopNFQGUiUBqSgMyJQGpKAyEZGRkZKpmauaABETKACZGRkZKpmauaABETIyMjIyMjIyMjIyMjIygAmRqA8brAATV0ICEyNQHzdYACauhAOmRqBCbrAATV0IBs3WmroQDJmYErrjMCV1zrTV0IBcyNQIzdYACauhAKmZgSgUOtNXQgEzNTIyEiMjIyMlUzNXNAAiMAEyMjIyVTM1c0ACIwATMAwAo1dCAFMAs1dCauiACCJglglGbh0gAAAjVXPABGqudABN1Rq6EAKZGRkZKpmauaABEYAJmAYAUauhACmAWauhNXRABBEwSwSjNw6QAAARqrn
gAjVXOgAm6o1dCauiACCJgjgjGbh0gAAAjVXPABGqudABN1QAJGRkZGSqZmrmgARGAEImSqZmrmgARGAAImCMCKZuHSACADM3DpAAABGqueACNVc6ACbqgASIyMjIyVTM1c0ACIwAhEyVTM1c0ACIwATAHNXQgBhEyVTM1c0ACIwBBEwRwRjNw6QAgAhm4dIAIAMzcOkAAAEaq54AI1VzoAJuqABdaauhAImYGDrjV0IA8zMCUjIyMjJVMzVzQAIjADN1xq6EAIImSqZmrmgARGASYF5q6EAMImSqZmrmgARGAOYF5q6EAQImSqZmrmgARGACbrTV0IAswLjV0Jq6IAUImSqZmrmgARGAWYGBq6EAYImSqZmrmgARGAKbrTV0IA8wLjV0Jq6IAcImCSCQZuHSAKAHM3DpAEADGbh0gBgBTNw6QAgAhm4dIAIAMzcOkAAAEaq54AI1VzoAJuqABAtNXQgDTMwJXXAWmroQBZuuNXQgCTMwJQJzMCUC8jIyMjJVMzVzQAIjACETJVMzVzQAIjAEETJVMzVzQAIjAAETBGBFM3DpACACGbh0gAgAzNw6QAAARqrngAjVXOgAm6oAE1dCAHMjUCQ3WAAmroQApmBc601dCADMwLnWmroTV0QAI1dEACauiABNXRAAmrogATV0QAJq6IAE1dEACauiABNXRAAmrogATV0QAJq6IAE1dEACauiACETAxAwM3DpAAABGqueACNVc6ACbqjV0IAc1dCADMjIyMlUzNXNAAiMAM3XGroQAgiZKpmauaABEYBJgOGroQA5mBCBAauhNXRABhEyVTM1c0ACIwBzAcNXQgCBEyVTM1c0ACIwATdaauhAFmA2auhNXRAChEyVTM1c0ACIwCzAdNXQgDBEyVTM1c0ACIwBTdaauhAHmA2auhNXRADhEwNgNTNw6QBQA5m4dIAgAYzcOkAMAKZuHSAEAEM3DpABABmbh0gAAAjVXPABGqudABN1Rq6E1dEACNXRABCJgWAVmbh0gAAAjVXPABGqudABN1QAJmBGRCZgQwACKAGYAwAU1MAQSABABCQAJmBERCZgQQACKAGYAoAU1MAQSABABCQAJGRkZGSqZmrmgARGACYB5q6EAKYA5q6E1dEAEETAlAkM3DpAAABGqueACNVc6ACbqgATMCAiEzAegAEUAMwBQApqYAgkACACEgASMjIyMlUzNXNAAiJkZQATIyMjJVMzVzQAIjABMBI1dCAFMwGCMjIyMlUzNXNAAiMAEwFzV0IAQRMlUzNXNAAiJlADN1pq6EASbrTV0IAM3WmroTV0QAI1dEAGImBeBcZuHSACADM3DpAAABGqueACNVc6ACbqgATV0Jq6IAIImBSBQZuHSAAACNVc8AEaq50AE3VGroQBJmYBbrjMAt1zrTV0IAUyMjIyVTM1c0ACIwABEyVTM1c0ACIwBTdcauhADCJkqmZq5oAERgBmroQBAiYFYFRm4dIAQAQzcOkAEAGZuHSAAACNVc8AEaq50AE3VGroQAZmAo641dCauiABGrogATV0QAQiYEYERm4dIAAAI1VzwARqrnQATdUACZgPEQmYDkAAigBmAQAFNTAEEgAQAQkACZgOkQmYDcAAigBmAKAFNTAEEgAQAQkACRkZGRkqmZq5oAERMoAJutNXQgBzAKNXQgAzIyMjJVMzVzQAIiZQCTMBUBY1dCAHNXQgAzMBV1xq6E1dEACNXRABCJkqmZq5oAERgAmYCoCxq6EAOZGRkZKpmauaABEYAJutNXQgBTdaauhNXRABBEwKgKTNw6QAAARqrngAjVXOgAm6o1dCauiADCJkqmZq5oAERgFmZgGgIOtNXQgCTMBZ1xq6E1dEAIETJVMzVzQAIjAHMwFwGDV0IAoRMlUzNXNAAiJkZQDTMBoBs1dCARMwHAFDV0IAUzMBEBR1pq6EAHJkZGRkqmZq5oAERgAm601dCAFN1pq6E1dEAEETAvAuM3DpAAABGqueACNVc6ACbqjV0Jq6IAGRGYDAAQAIauiABNXRADCJkqmZq5oAERgCmYDIDRq6EAeZGRkZKpmauaABETMB11xq6EAIRMC4C0zcOkAAAEaq54AI1VzoAJuqNXQmrogBwiZKpmauaABEYAQiYFYFRm4dIAwAgzcOkAUAOZuHSAIAGM3DpADACmbh0gBABDNw6QAQAZm4dIAAAI1VzwARqrnQATdUauhNXRAAjV0QAQiYEAD5m4dIAAAI1VzwARqrnQATdUACRGRGoARurABMwHSITMBuAARQA4AJgDGqudACmAKaq54AJNTAEEgAQAQkACZEZGRkZKpmauaABEYBpgEGroQApmAc601dCauiACCJkqmZq5oAERgJmASauhADmYB7rTV0Jq6IAMImSqZmrmgARGAGYBRq6EASYBBq6E1dEAIETJVMzVzQAIiZQCzAMNXQgDTAKNXQgAzdaauhNXRAAjV0QAoiZKpmauaABEYBJgGGroQBputNXQmrogBgiZKpmauaABEYCpgGmroQBwiZKpmauaABEYCJgHGroQCJutNXQmrogCAiZKpmauaABEYApuuNXQgEzdcauhNXRAEhEyVTM1c0ACIwBzdcauhAKm601dCauiAKCJkqmZq5oAERgAmAiauhALmAiauhNXRAFhEyVTM1c0ACIwDzASNXQgGBEwKQKDNw6QCgBhm4dIBIAszcOkAgAUZuHSAOAJM3DpAGAEGbh0gCgBzNw6QBAAxm4dIAYAUzcOkAIAIZuHSACADM3DpAAABGqueACNVc6ACbqgATIjIyMjJVMzVzQAIjABN1xq6EAIImSqZmrmgARGAKYA5q6EAMImSqZmrmgARGAGbrjV0IAkwCDV0Jq6IAQImBCBAZuHSAEAEM3DpABABmbh0gAAAjVXPABGqudABN1QAJGRkZGSqZmrmgARGACYA5q6EAIImSqZmrmgARGAEImSqZmrmgARGAIImBAA+ZuHSAEAEM3DpABABmbh0gAAAjVXPABGqudABN1QAJGRkZGSqZmrmgARGACYAxq6EAIImSqZmrmgARGAGYA5q6EAMImSqZmrmgARGAKbrjV0IAgRMB8B4zcOkAIAIZuHSACADM3DpAAABGqueACNVc6ACbqgASMjIyMlUzNXNAAiMAE3XGroQAgiZKpmauaABEYAZuuNXQgBhEwHQHDNw6QAQAZm4dIAAAI1VzwARqrnQATdUACRkZGRkqmZq5oAERgAm641dCAFN1pq6E1dEAEETAbAaM3DpAAABGqueACNVc6ACbqgATAWIiMlUzNXNAAiJgMpIBA1BUMwARMlUzNXNAAiJmAKZuBAYAEM3AgMABiJlABM3CACgAzNwgAgAIzAGAEADM3EABALmbhwAQFjAVIiMlUzNXNAAiIAYiZgCABGbhgAwAjNw4AIComAmkgQNQVDUAIAEiMjIyMlUzNXNAAiMAIRMlUzNXNAAiMAEwBzV0IAYRMBgBczcOkAAAGZuHSACACNVc8AEaq50AE3VAAkZGRkZKpmauaABEYAJuuNXQgBTdaau
hNXRABBEwFQFDNw6QAAARqrngAjVXOgAm6oAEjIjUAI3WAAmYCJEJmAfAAIoAZgCgBTUwBBIAEAEJAAmAeRLIAMYAUiGQApEZKpmauaABETIzNXNAAwAIARm4cAJIAIRMAcAQzceAEkULSHlkcmFIZWFkVjEACIyIwAgATAQIlkAGMAKRDIAUiWTMAgAIAeMAEAFEwBgAwZFIAMiKQAZEoA8gAZEoA8lAGkSgDyUAYEyYzVziSQJMaACAARLIAMYAETNXOABQAhSADIiIiIiIiIiIAwJgEpIBA00xMgATAISRA00wOQATAHSRA00xMAAjIjACABMAgiWQAYwAJEKyYAoAUTAEABjACBMzMwASKAEUAIoARQAigAgAiIiIlMzMzV0gAImRmAOaq50AE1VzwAJuqABEwBTdWACJgCG6wAETADN1oAImAEbrgASIlMzVXPgAiAGJmAEauhABNXRAApAAJIEDUFQxACMmM1c4ADAAIyMAEAEjACIzACACABSJHMihAaXIrEgWsNzrWc4x/CJY44fego8Clh0vIEUASIEcDjURWix8E8aOzY105Jh8BNRTnjN2Q74guzJ0vQAzM1EiACIoAIAUAEQlIAUAEkUgDFnmis0iW8eo2Ro3sYNEBZlBGtKsH/zhQPkoxLw1HU0ASAAB\"}), pwcScriptHash = ScriptHash \"c48ad46525f11f52e9f67ef942a3e6e727966b89c43e7c323de5ad55\", pwcArgs = ScriptContext {scriptContextTxInfo = TxInfo {txInfoInputs = [TxInInfo {txInInfoOutRef = TxOutRef {txOutRefId = 0c59e68acd225bc7a8d91a37b183440599411ad2ac1ffce140f928c4bc351d4d, txOutRefIdx = 0}, txInInfoResolved = TxOut {txOutAddress = Address {addressCredential = PubKeyCredential f8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d, addressStakingCredential = Nothing}, txOutValue = Value {getValue = Map {unMap = [(,Map {unMap = [(\"\",100000000)]})]}}, txOutDatum = NoOutputDatum, txOutReferenceScript = Nothing}}], txInfoReferenceInputs = [], txInfoOutputs = [TxOut {txOutAddress = Address {addressCredential = ScriptCredential 0e35115a2c7c13c68ecd8d74e4987c04d4539e337643be20bb3274bd, addressStakingCredential = Nothing}, txOutValue = Value {getValue = Map {unMap = [(,Map {unMap = [(\"\",0)]}),(c48ad46525f11f52e9f67ef942a3e6e727966b89c43e7c323de5ad55,Map {unMap = [(\"HydraHeadV1\",1)]})]}}, txOutDatum = OutputDatum (Datum {getDatum = Constr 0 [Constr 0 [I 10000],List [B \"\\213\\191J?\\204\\231\\ETB\\176\\&8\\139\\204'I\\235\\193H\\173\\153i\\178?E\\238\\ESC`_\\213\\135xWj\\196\"],B \"\\196\\138\\212e%\\241\\USR\\233\\246~\\249B\\163\\230\\231'\\150k\\137\\196>|2=\\229\\173U\",Constr 0 [B \"\\fY\\230\\138\\205\\\"[\\199\\168\\217\\SUB7\\177\\131D\\ENQ\\153A\\SUB\\210\\172\\US\\252\\225@\\249(\\196\\188\\&5\\GSM\",I 0]]}), txOutReferenceScript = Nothing},TxOut {txOutAddress = Address {addressCredential = ScriptCredential c8a101a5c8ac4816b0dceb59ce31fc2258e387de828f02961d2f2045, addressStakingCredential = Nothing}, txOutValue = Value {getValue = Map {unMap = [(,Map {unMap = [(\"\",0)]}),(c48ad46525f11f52e9f67ef942a3e6e727966b89c43e7c323de5ad55,Map {unMap = [(0xf8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d,1)]})]}}, txOutDatum = OutputDatum (Datum {getDatum = B \"\\196\\138\\212e%\\241\\USR\\233\\246~\\249B\\163\\230\\231'\\150k\\137\\196>|2=\\229\\173U\"}), txOutReferenceScript = Nothing}], txInfoFee = 0, txInfoMint = UnsafeMintValue (Map {unMap = [(c48ad46525f11f52e9f67ef942a3e6e727966b89c43e7c323de5ad55,Map {unMap = [(\"HydraHeadV1\",1),(0xf8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d,1)]})]}), txInfoTxCerts = [], txInfoWdrl = Map {unMap = []}, txInfoValidRange = Interval {ivFrom = LowerBound NegInf True, ivTo = UpperBound PosInf True}, txInfoSignatories = [], txInfoRedeemers = Map {unMap = [(Minting c48ad46525f11f52e9f67ef942a3e6e727966b89c43e7c323de5ad55,Redeemer {getRedeemer = Constr 0 []})]}, txInfoData = Map {unMap = []}, txInfoId = 71ebd1ece4ecc2001ca38f6d5a56b182f9103c58bfb3e3afa12c24f8308b523c, txInfoVotes = Map {unMap = []}, txInfoProposalProcedures = [], txInfoCurrentTreasuryAmount = Nothing, txInfoTreasuryDonation = Nothing}, scriptContextRedeemer = Redeemer {getRedeemer = Constr 0 []}, 
scriptContextScriptInfo = MintingScript c48ad46525f11f52e9f67ef942a3e6e727966b89c43e7c323de5ad55}, pwcExUnits = WrapExUnits {unWrapExUnits = ExUnits' {exUnitsMem' = 10000000000, exUnitsSteps' = 14000000}}, pwcCostModel = CostModel PlutusV3 [100788,420,1,1,100181,726,719,0,1,1000,173,0,1,1000,59957,4,1,11183,32,207616,8310,4,201305,8356,4,962335,18,2780678,6,442008,1,52538055,3756,18,267929,18,76433006,8868,18,52948122,18,1995836,36,3227919,12,901022,1,166917843,4307,36,284546,36,158221314,26549,36,74698472,36,333849714,1,254006273,72,2174038,72,1006041,43623,251,0,1,16000,100,16000,100,16000,100,16000,100,16000,100,16000,100,16000,100,16000,100,100,100,16000,100,94375,32,132994,32,61462,4,107878,680,0,1,72010,178,0,1,22151,32,107490,3298,1,91189,769,4,2,85848,123203,7305,-900,1716,549,57,85848,0,1,1,1000,42921,4,2,24548,29498,38,1,898148,27279,1,51775,558,1,39184,1000,60594,1,106057,655,1,141895,32,83150,32,15299,32,76049,1,13169,4,1293828,28716,63,0,1,2261318,64571,4,22100,10,28999,74,1,28999,74,1,43285,552,1,44749,541,1,33852,32,68246,32,72362,32,7243,32,7391,32,11546,32,85848,123203,7305,-900,1716,549,57,85848,0,1,90434,519,0,1,74433,32,100181,726,719,0,1,85848,123203,7305,-900,1716,549,57,85848,0,1,95336,1,1,85848,123203,7305,-900,1716,549,57,85848,0,1,180194,159,1,1,1964219,24520,3,159378,8813,0,1,955506,213312,0,2,270652,22588,4,1457325,64566,4,158519,8942,0,1,20467,1,4,0,141992,32,100788,420,1,1,81663,32,59498,32,20142,32,24588,32,20744,32,25933,32,24623,32,43053543,10,53384111,14333,10,43574283,26308,10,281145,18848,0,1,100181,726,719,0,1]})"}
- Seems like this is happening when evaluating scripts and for some reason the intro protocol version is 9 (PlutusV3) but the current protocol version is zero.
-
evalTxExUnits
is relevant here since it is responsible for throwing this error. There is alsoevalTxExUnitsWithLogs
which should give more context.
- I think I should focus first on making the blockfrost UTxO querying bulletproof before I go further into wallet errors.
- Currently there is only one UTxO at the faucet address:
TxHash TxIx Amount
--------------------------------------------------------------------------------------
f9d95b0f651d772ad4a98d4855d7f7f2ad259f8946e2145bafe96c97b45a8144 1 133920252523 lovelace + 1 13d1f7feab83ff4db444bf96b8677949c5bf9c709671f30ff8f33ab3.487964726120446f6f6d202d2033726420506c6163652054726f706879 + 1 19c98d04cdb6e1e782a73e693697d4a46ca9820d5d490a3bf6470a07.487964726120446f6f6d202d20326e6420506c6163652054726f706879 + 1 1a22028742629f3cf38b3d1036a088fea59eb30237a675420fb25c11.2331 + 1 6d92350897706b14832c62c5b5644e918f0b6b3b63ffc00a1a463828.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 ad39d849181dc206488fd726240c00b55547153ffdca8c079e1e34d9.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 bfe4ab531fd625ef33ea355fd85953eb944bffa401af767666ff411c.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 c953682b6eb5891c0bda35718c5261587d57e5e408079cbeb8cf881a.2331 + 1 cd6076d9d0098da4c7670c08f230e4efe31d666263c9db5196805d6e.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 d0c91707d75011026193c0fce742443dde66fa790936981ece5d9f8b.2331 + 69918000000 d8906ca5c7ba124a0407a32dab37b2c82b13b3dcd9111e42940dcea4.0014df105553444d + 1 dd7e36888a487f8b27687f65abd93e6825b4eb3ce592ee5f504862df.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 fa10c5203512eeeb92bf79547b09f5cdb2e008689864b0175cca6fee.487964726120446f6f6d202d2034746820506c6163652054726f706879 + TxOutDatumNone
and now I get constant failure with BadInput
error! Clearly blockfrost
doesn't like this utxo at all. Let me see if I seed from the faucet again and
have two UTxO if the test will work always.
- Oh look at this one:
TxHash TxIx Amount
--------------------------------------------------------------------------------------
642d74562744eb5739f36845be0a6462c34496c758a58b0e3a91ec0d8eb70ffb 0 10000000000 lovelace + TxOutDatumNone
98a404a9e0d0fceec5c3c0d9ea45802df09db9f5c0b8f98dfbefa2b3da7e3b9f 1 133838102650 lovelace + 1 13d1f7feab83ff4db444bf96b8677949c5bf9c709671f30ff8f33ab3.487964726120446f6f6d202d2033726420506c6163652054726f706879 + 1 19c98d04cdb6e1e782a73e693697d4a46ca9820d5d490a3bf6470a07.487964726120446f6f6d202d20326e6420506c6163652054726f706879 + 1 1a22028742629f3cf38b3d1036a088fea59eb30237a675420fb25c11.2331 + 1 6d92350897706b14832c62c5b5644e918f0b6b3b63ffc00a1a463828.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 ad39d849181dc206488fd726240c00b55547153ffdca8c079e1e34d9.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 bfe4ab531fd625ef33ea355fd85953eb944bffa401af767666ff411c.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 c953682b6eb5891c0bda35718c5261587d57e5e408079cbeb8cf881a.2331 + 1 cd6076d9d0098da4c7670c08f230e4efe31d666263c9db5196805d6e.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 d0c91707d75011026193c0fce742443dde66fa790936981ece5d9f8b.2331 + 69918000000 d8906ca5c7ba124a0407a32dab37b2c82b13b3dcd9111e42940dcea4.0014df105553444d + 1 dd7e36888a487f8b27687f65abd93e6825b4eb3ce592ee5f504862df.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 fa10c5203512eeeb92bf79547b09f5cdb2e008689864b0175cca6fee.487964726120446f6f6d202d2034746820506c6163652054726f706879 + TxOutDatumNone
ubuntu@ip-10-1-6-128:~$ sudo ./cardano-cli-x86_64-linux query utxo --socket-path preview/node.socket --address addr_test1vztc80na8320zymhjekl40yjsnxkcvhu58x59mc2fuwvgkc332vxv --testnet-magic 2
TxHash TxIx Amount
--------------------------------------------------------------------------------------
25727480009bb6d3b560914d821da88304ccec98a7439a694ce5ed53bbabab6e 1 143673796876 lovelace + 1 13d1f7feab83ff4db444bf96b8677949c5bf9c709671f30ff8f33ab3.487964726120446f6f6d202d2033726420506c6163652054726f706879 + 1 19c98d04cdb6e1e782a73e693697d4a46ca9820d5d490a3bf6470a07.487964726120446f6f6d202d20326e6420506c6163652054726f706879 + 1 1a22028742629f3cf38b3d1036a088fea59eb30237a675420fb25c11.2331 + 1 6d92350897706b14832c62c5b5644e918f0b6b3b63ffc00a1a463828.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 ad39d849181dc206488fd726240c00b55547153ffdca8c079e1e34d9.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 bfe4ab531fd625ef33ea355fd85953eb944bffa401af767666ff411c.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 c953682b6eb5891c0bda35718c5261587d57e5e408079cbeb8cf881a.2331 + 1 cd6076d9d0098da4c7670c08f230e4efe31d666263c9db5196805d6e.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 d0c91707d75011026193c0fce742443dde66fa790936981ece5d9f8b.2331 + 69918000000 d8906ca5c7ba124a0407a32dab37b2c82b13b3dcd9111e42940dcea4.0014df105553444d + 1 dd7e36888a487f8b27687f65abd93e6825b4eb3ce592ee5f504862df.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 fa10c5203512eeeb92bf79547b09f5cdb2e008689864b0175cca6fee.487964726120446f6f6d202d2034746820506c6163652054726f706879 + TxOutDatumNone
Looks like the code I added reorders the UTxO so in the end you always get one utxo although the faucet had two UTxO which both could be used for publishing scripts.
- I need to print the state of the UTxO before and after as well as txs if I am going to resolve this problem.
- This blockfrost work seems neverending but we'll get there.
- I am at point where I wrote a test in the
DirectChainSpec
to open, close and fanout a head using blockfrost. The test is not entirely using blockfrost since we also rely on a cardano-node on preview, e.g. for querying UTxO.
- The problem is that sometimes the faucet UTxO comes back empty, which is weird since I see at least one UTxO with a large amount of ada but also some NFTs.
- Why
queryUTxO
fails to return the correct UTxO? I feel like this problem is not part of the thing I am trying to solve but I definitely need a green test otherwise I can't know if the chain following logic for blockfrost works. - Maybe I should test this in e2e scenario instead and see how things behave in
there?
Is it possible that cardano-node is not in sync completely before test is ran? - One thing I noticed - I am using blockfrost with faucet key to publish scripts at the beginning but I am not awaiting for the local cardano-node to see these transactions!
- Then when I query the faucet UTxO I either see an empty UTxO or get a BadInputs error, so I think I definitely need to await the script publishing transactions.
- This is far from optimal - perhaps I need to create equivalent functions that work with blockfrost api instead?
- After re-mapping all needed functions to blockfrost versions and not using the local cardano-node for anything, I still get the BadInputs error.. hmm. At least I see the correct UTxO picked up, so I'll work my way from there. This is probably some logic in the UTxO seeding...
- After adding blockfrost equivalent for
awaitForTransaction
I am still at the same place - makes sense. The produced output is not a problem but the actual transaction. - I pretty printed the faucet utxo and the tx and I don't see anything weird.
- Important note - we get a valid tx when building, but Blockfrost returns an error when submitting!
- Decided to pick just a single UTxO to be responsible for seeding from the faucet, just to reduce clutter, but the error is the same:
Faucet UTxO: 815e52d1ee#0 ↦ 54829439 lovelace
62fb023528#1 ↦ 9261435223 lovelace
70e6c21881#1 ↦ 129433058019 lovelace + 1 13d1f7feab83ff4db444bf96b8677949c5bf9c709671f30ff8f33ab3.487964726120446f6f6d202d2033726420506c6163652054726f706879 + 1 19c98d04cdb6e1e782a73e693697d4a46ca9820d5d490a3bf6470a07.487964726120446f6f6d202d20326e6420506c6163652054726f706879 + 1 1a22028742629f3cf38b3d1036a088fea59eb30237a675420fb25c11.2331 + 1 6d92350897706b14832c62c5b5644e918f0b6b3b63ffc00a1a463828.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 ad39d849181dc206488fd726240c00b55547153ffdca8c079e1e34d9.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 bfe4ab531fd625ef33ea355fd85953eb944bffa401af767666ff411c.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 c953682b6eb5891c0bda35718c5261587d57e5e408079cbeb8cf881a.2331 + 1 cd6076d9d0098da4c7670c08f230e4efe31d666263c9db5196805d6e.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 d0c91707d75011026193c0fce742443dde66fa790936981ece5d9f8b.2331 + 69918000000 d8906ca5c7ba124a0407a32dab37b2c82b13b3dcd9111e42940dcea4.0014df105553444d + 1 dd7e36888a487f8b27687f65abd93e6825b4eb3ce592ee5f504862df.487964726120446f6f6d202d2031737420506c6163652054726f706879 + 1 fa10c5203512eeeb92bf79547b09f5cdb2e008689864b0175cca6fee.487964726120446f6f6d202d2034746820506c6163652054726f706879
Found UTxO: 62fb023528#1 ↦ 9261435223 lovelace
"f99907e0b4e3c9d554a68e76c3a72b4090cffb5c12d0cd471e29e1d0fa7184d2"
== INPUTS (1)
- cd62585298998cd809f6fe08a4af3087dab8f73ed67132b8c8fd4162fb023528#1
ShelleyAddress Testnet (KeyHashObj (KeyHash {unKeyHash = "9783be7d3c54f11377966dfabc9284cd6c32fca1cd42ef0a4f1cc45b"})) StakeRefNull
9261435223 lovelace
TxOutDatumNone
ReferenceScriptNone
== COLLATERAL INPUTS (0)
== REFERENCE INPUTS (0)
== OUTPUTS (2)
Total number of assets: 1
- ShelleyAddress Testnet (KeyHashObj (KeyHash {unKeyHash = "f8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d"})) StakeRefNull
100000000 lovelace
TxOutDatumNone
- ShelleyAddress Testnet (KeyHashObj (KeyHash {unKeyHash = "9783be7d3c54f11377966dfabc9284cd6c32fca1cd42ef0a4f1cc45b"})) StakeRefNull
9161262858 lovelace
TxOutDatumNone
== TOTAL COLLATERAL
TxTotalCollateralNone
== RETURN COLLATERAL
TxReturnCollateralNone
== FEE
TxFeeExplicit ShelleyBasedEraConway (Coin 172365)
== VALIDITY
TxValidityNoLowerBound
TxValidityUpperBound ShelleyBasedEraConway Nothing
== MINT/BURN
0 lovelace
== SCRIPTS (0)
Total size (bytes): 0
== DATUMS (0)
== REDEEMERS (0)
== REQUIRED SIGNERS
[]
== METADATA
TxMetadataNone
can open, close & fanout a Head using Blockfrost [✘]
Failures:
test/Test/DirectChainSpec.hs:385:3:
1) Test.DirectChain can open, close & fanout a Head using Blockfrost
uncaught exception: APIBlockfrostError
BlockfrostError "BlockfrostBadRequest \"{\\\"contents\\\":{\\\"contents\\\":{\\\"contents\\\":{\\\"era\\\":\\\"ShelleyBasedEraConway\\\",\\\"error\\\":[\\\"ConwayUtxowFailure (UtxoFailure (ValueNotConservedUTxO (MaryValue (Coin 0) (MultiAsset (fromList []))) (MaryValue (Coin 9261435223) (MultiAsset (fromList [])))))\\\",\\\"ConwayUtxowFailure (UtxoFailure (BadInputsUTxO (fromList [TxIn (TxId {unTxId = SafeHash \\\\\\\"cd62585298998cd809f6fe08a4af3087dab8f73ed67132b8c8fd4162fb023528\\\\\\\"}) (TxIx {unTxIx = 1})])))\\\"],\\\"kind\\\":\\\"ShelleyTxValidationError\\\"},\\\"tag\\\":\\\"TxValidationErrorInCardanoMode\\\"},\\\"tag\\\":\\\"TxCmdTxSubmitValidationError\\\"},\\\"tag\\\":\\\"TxSubmitFail\\\"}\""
- I don't see anything wrong with the tx but blockfrost seems to think the input is invalid for whatever reason. I'll explore the tx endpoints to see if I can find something useful.
- Checked the mapping between blockfrost/cardano when creating UTxO and all looks good.
- Added
awaitForTransaction
- blockfrost variant which didn't help out with the error. - Made all functions in
Blockfrost.Client
work inBlockfrostClientT IO
so I can run them all from outside (I suspected that calling multiple blockfrost connections can cause problems) and this didn't help either. - I think all these changes I did are good to keep but I still don't see why submitting of a seeding tx fails?
- Fixing the behavior spec for deposits in the future, but "too soon" requires us to check the deadline.. but what is the upper bound? Re-use the contestation period or introduce a deposit period (already)?
- Decide to use the contestation period to limit the upper bound now (no need for a new option right now)
- Switching to a realistic contestation period had some other behavior specs fail? The ones with 60 deadlines were easy to fix with a
newDeadlineFarEnoughAway
, but surprisingly the "close with decommit in flight" was failing? - Switching the bespoke test to assert
DecommitFinalized
revealed that this server output is not actually yielded when the head was closed right after?
- When trying to fix assertions on "can close with decommit in flight" (by also removing the wait for contestation), I realized that the
DecommitFinalized
for the correctly approved decommit was never sent because the Head is already inClosed
state when handling theOnDecrementTx
observation - is thes test suite not simulating blocks correctly or can this really happen? - Indeed, the simulated chain of the
BehaviorSpec
is just delaying observation by block time. As the decrement is only posted after approval, but the test client directly submitsClose
after theDecommit
command, it is possible that theOnDecrementTx
comes after theOnCloseTx
.. but this is impossible using a proper chain. - We should look into using
MockChain
also inBehaviorSpec
(we did not have that when originally writing this suite) - The cardano-ledger bailing on generated utxo is really annoying… went down a rabbit hole to have a test of our generators at least.
-
Pivot: Originally thought I want a model-based test as I need rollbacks (using
MockChain
). The property I would have liked to test would be something like: "In an open head, a deposit only results in utxo available on L2 if not rolled back." However, the rollbacks in this property would need to be bounded and related to the deposit deadline. i.e. if the deposit was 150 blocks out from when posted, only rollbacks over 150 blocks should not result in the utxo added. This is not only hard, but maybe not as useful as I thought. -
Realization: we don't need to talk about rollbacks if we make "settlement" first class in any property we want to express. For example, we can say "A deposit is only processed after settled" which implies a notion of settlement. Practically this could be a
--deposit-period
which we can configure per node and we then just need to ensure the node behaves correctly off-chain. Hence, statements like this can be expressed and tested using less involved test suites like theBehaviorSpec
and notably without rollbacks. -
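As a sketch of the check such a `--deposit-period` implies (hypothetical helper; assumes the deposit creation time comes from the observation):

```haskell
import Data.Time (NominalDiffTime, UTCTime, addUTCTime)

-- A deposit observed at 'created' only counts as settled (and may be
-- included in a snapshot) once a full deposit period has passed.
isDepositSettled :: NominalDiffTime -> UTCTime -> UTCTime -> Bool
isDepositSettled depositPeriod created now =
  addUTCTime depositPeriod created <= now
```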
Besides behavior properties like these, we then only need "deposits are correctly observed" in which we need to ensure that our implementation reports deposit time (or slot?) and deadline honestly.
-
This is similar to "only proper head is observed", i.e. we would generate happy deposit txs and ensure through mutations that all necessary things are still in place for it to be
observeDepositTx
correctly. For example: a proper deposit tx would contain an upper validity to record "now". Not having that set should make the deposit not observed. -
Could define those "observation tests" also as part of
hydra-tx
ContractSpec
? Are those "healty" + mutated generated transactions comparable in quality toStateSpec
- which we then could get rid of?
- Want to reproduce a specific model test by copying the
action ...
output. Unfortunately the show instances are not aligning well with available data constructors :( - After fixing some show instances, I could derive a unit test from the model spec using the
action
sequence. - Now I see again a
Deposit
with directly expired deadline not resulting in a utxo added and a followingNewTx
failing. I need to distinguish in thenextState
whether aDeposit
is expected to be adding a utxo or not. Could also do this in a second action to be more flexible in when to check whether a deposit already succeeded? - The incorrectly bound variables in the counter examples was because of incomplete
HasVariables
instances! - Made the model check pass again by not adding the deposited utxo directly to the model and also not wait for
CommitFinalized
inperformDeposit
. What is a goodActions
api to do the assertions we want? - When I want to check whether a txid created by
Deposit
really happened, then I need to useTxIdType Tx
in a symbolicVar
.. which feels a bit odd as other things we simplified to using `Payment`
? Is there a way to switch return value types between model and run model? -
Realized
seems to be exactly this? - Had issus with the
arbitraryQ
ofUTCTime
always being way in the past. Also, theperform
ofDepositIncremented
is annoying as we can easily miss theCommitFinalized
and looking into historicserverOutputs
is messy..
- How to deal with incompatible deposits? We do observe them, but should the head logic track them?
- When introducing a
currentTime
to the open state (in order to determine deadline being good or not) I realize thatTick
contents would be useful to have on theObservation
chain event, which would be easily possible. How is the contestation deadline done?
- Aha! The need for tracking a
currentTime : UTCTime
in theHeadState
can be worked around by tracking all deposits and discard them on tick. - Hm.. but that would move the decision whether to snapshot a pending deposit only to the
Tick
handling. Which means that it may only happen on the next block.. - But this is where the deposit tracking needs to go anyways .. we will never issue a snapshot directly when observing the deposit (that's why we are here) and if we decouple the issuance from the observation, the logic needs to go to the
Tick
handling anyways!
- Aha! The need for tracking a
- When changing
CommitRecorded
I stumble over `newLocalUTxO = localUTxO <> deposited`
.. why would we want to update our local ledger already when recording the deposit!? This was likely a bug too.. - Specification will be quite different than what we need to implement: there are no deposits tracked and only a
wait
for any previous pending deposits. To what level do we need to specify the logic of delaying deposits and checking deadlines? - Why was the increment snapshotting only waiting for an "unresolved Decommit" before requesting a snapshot?
- Why do we need to wait at all (for other decommits or commits) if there is no snapshot in flight and we are the leader.. why not just snapshot what we want?
- After moving the incremental commit snapshot decision to
Tick
handling, the model fails because of aNewTx
can't spend a UTxO added through aDeposit
before -> interesting! - After bringing back a Uα equivalent to the
HeadLogic
the model spec consistently finds an emptyutxoToCommit
which fails to submit anincrementTx
-> good! - Interestingly, the model allows to do
action $ Deposit {headIdVar = var2, utxoToDeposit = [], deadline = 1864-06-16 04:36:38.606749385646 UTC}
which obviously results in an empty utxo to commit.. this can happen in the wild too!
- Unclear where exactly we want to deal with empty deposits.
- Back to where we started with a very old
Deposit
and the node trying to do anincrement
with deadline already passed. This should be easy to fix by just not trying to snapshot it. However, what if a dishonesthydra-node
would do just that? Would we approve that snapshot? Certainly the on-chain script would forbid it, but this could stall the head.
- This is similar to the empty utxo thing. While we can make our honest
hydra-node
do funky stuff, we must ensure that we do not sign snapshots that are funky! - Which tests would best capture this? The
ModelSpec
won't see these issues once our honest implementation stops requesting funky snapshots!
- This is similar to the empty utxo thing. While we can make our honest
- To determine whether a deposit is still (or already) fine, we are back to needing a notion of
UTCTime
when doing that decision? We could do that updating in theTick
handling and keep information about a deposit beingOutdated
or so. Then, the snapshot acknowledgment code can tell whether a deposit is valid and only sign if it is.- Tracking a full
Deposit
type inpendingDeposits
which has aDepositStatus
. - With the new
Deposit
type I can easily mark deposits asExpired
and need to fix several behavior tests to put realistic deadlines. However, the observability in tests is lacking and I definitely need aDepositExpired
server output to fix all tests.
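A rough sketch of what that tracking could look like (type and field names are guesses, not the actual head state code):

```haskell
import Data.Time (UTCTime)

-- Status is updated from 'Tick' handling instead of keeping a current
-- time in the head state; only 'Active' deposits may be snapshotted.
data DepositStatus = Inactive | Active | Expired
  deriving (Eq, Show)

data PendingDeposit utxo = PendingDeposit
  { depositUTxO     :: utxo          -- ^ what the deposit would add on L2
  , depositDeadline :: UTCTime       -- ^ deadline recorded in the deposit
  , depositStatus   :: DepositStatus
  }
```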
- Tracking a full
- The fact that we need to create state changes to have the state updated, but also want to see them applied before determining the next active deposit is maybe a hint to a more monadic way of writing the logic functions?
- After adding the
DepositActivated
andDepositExpired
state changes I was debugging why the "can process transactions while commit pending" is still not passing: This is injecting a deposit observation, then assertsCommitRecorded
and directly submits aNewTx
this will have a snapshot be requested and confirmed. This first snapshot, however, is not yet including theutxoToCommit
of the deposit becaues theDeposit
was not yetActive
(we have not seen time passing).. so the next snapshot will need to settle it. However, the other party (the new snaphshot leader) never saw the deposit because we usedinjectChainEvent
!
- Deposit fixes: How to test this situation? I need a test suite that includes the off-chain logic, but also allows control over rollbacks and spending inputs.
- Model based tests are not including incremental commits :(
- TxTraceSpec contains deposit/increment, but does only exercise the L1 related code
- The behavior tests do cover deposit/increment behavior, but deposit observations are only injected! So rollbacks would not cover them.
- Let's bite the bullet.. at least the model-based `MockChain` could easily be adapted to do deposits in `simulateCommit`?
- Ran into the same issue as we had on CI when shrinking was failing on a partial `!`. Guarding the `shrinkAction` to only include actions if their `party` is still in the seed seems to fix this.. but now shrinking does not terminate?
  - Detour on improving shrinking and counterexamples of that `checkModel` problem .. shifting back to fixing deposits.
- After adding `Deposit` actions, implementing a `simulateDeposit` and adjusting some generators/preconditions, I consistently run into test failures with `deadline <- arbitrary`. This is already interesting! The `hydra-node` seems to still try to increment deposits with deadlines very far in the past (year 1864) -> first bug found and reproducible!
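As a possible fix (a sketch only; `genDeadline` is a hypothetical helper, assuming the model has a notion of "now"), the generator could be constrained to realistic deadlines instead of the unconstrained `arbitrary`:

```haskell
import Data.Time (UTCTime, addUTCTime)
import Test.QuickCheck (Gen, choose)

-- Hypothetical sketch: instead of 'deadline <- arbitrary' (which happily
-- produces dates like 1864), generate deadlines in a window after 'now'.
genDeadline :: UTCTime -> Gen UTCTime
genDeadline now = do
  offset <- choose (60, 3600 :: Integer) -- seconds into the future
  pure $ addUTCTime (fromInteger offset) now
```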
- After using the blockfrost query to get all eras and trying to construct `EraHistory`, I was surprised to discover that using `nonEmptyFromList` fails.
I know for sure that I am not constructing empty list here so this is confusing.
-
Fond the example in the atlas repo https://atlas-app.io/ but those were also failing which is even more surprising.
-
When looking at the blockfrost query results I noticed there are multiple
NetworkEraSummary
that start and end with slot 0 which is surprising:
eras: [ NetworkEraSummary
{ _networkEraStart = NetworkEraBound
{ _boundEpoch = Epoch 0
, _boundSlot = Slot 0
, _boundTime = 0s
}
, _networkEraEnd = NetworkEraBound
{ _boundEpoch = Epoch 0
, _boundSlot = Slot 0
, _boundTime = 0s
}
, _networkEraParameters = NetworkEraParameters
{ _parametersEpochLength = EpochLength 4320
, _parametersSlotLength = 20s
, _parametersSafeZone = 864
}
}
, NetworkEraSummary
{ _networkEraStart = NetworkEraBound
{ _boundEpoch = Epoch 0
, _boundSlot = Slot 0
, _boundTime = 0s
}
, _networkEraEnd = NetworkEraBound
{ _boundEpoch = Epoch 0
, _boundSlot = Slot 0
, _boundTime = 0s
}
, _networkEraParameters = NetworkEraParameters
{ _parametersEpochLength = EpochLength 86400
, _parametersSlotLength = 1s
, _parametersSafeZone = 25920
}
}
-
After removing them I can parse
EraHistory
with success but the question is how to filter out values from blockfrost? Which are valid eras? -
I'll try filtering all eras that start and end with slot 0
-
This worked - I reported what I found to the blockfrost guys
-
Now it is time to move forward and test if the wallet queries actually work
-
I picked one
DirectChainTest
and decided to alter it so it runs on preview usingwithCardanoNodeOnKnownNetwork
but I get
test/Test/DirectChainSpec.hs:124:3:
1) Test.DirectChain can init and abort a 2-parties head after one party has committed
uncaught exception: QueryException
QueryProtocolParamsEncodingFailureOnEra (AnyCardanoEra AlonzoEra) "Error in $: key \"poolVotingThresholds\" not found"
-
It seems like re-mapping the protocol params from blockfrost fails on
poolVotingThresholds
. -
This happens immediately when cardano-node reports
MsgSocketIsReady
cardano-node --version
cardano-node 10.1.4 - linux-x86_64 - ghc-8.10
git rev 1f63dbf2ab39e0b32bf6901dc203866d3e37de08
- I can see that this field exists in the
conway-genesis.json
in the tmp folder of a test run
-
After PR review comments from FT I wanted to add one suggestion and that is to see the Head closed and finalized after initially committing and then decommitting some UTxO.
-
This leads to
H28
error on close and this means we tried to close with initial snapshot but in fact we already got the confirmed snapshot. -
When inspecting the logs I found out that the node, after a restart, does not observe any
SnapshotConfirmed
and therefore tries to close with initial one which fails. -
Question is: why did the restarted node fail to re-observe the confirmed snapshot event?
-
Added some test code to wait and see
SnapshotConfirmed
in the restarted node to confirm it actually sees this event happening and the test fails exactly at this point. -
When both nodes are running I can see that the snapshot confirmed message is there, but after a restart the node fails to see the `SnapshotConfirmed` message again.
In the logs for both node 1 and 2 before restart I see two
SnapshotConfirmed
messages but in the restarted node these events are gone. -
I realized the close works if I close from node that was not restarted but what I want to do is wait for the restarted node to catch up and then close.
-
I removed fiddling with the recover and wanted to get this basic test working but closing with restarted node, even after re-observing the last decommit, fails with
H28
FailedCloseInitial
. -
This means the restarted node tried to close with the initial snapshot but one of the values doesn't match. We expect the version to be 0, snapshot number to be 0 and utxo hash should match the initial one.
-
last-known-revision for both nodes before I shutdown one of them is 11 but the restarted node, after removing the last-known-revision file ends up having value 13. How come it received more messages?
-
When comparing the state files I see discrepancies in eventId and the restarted node has a
DecommitRecorded
as the last event (other than ticks) -
Regular node decommit recorded:
{"eventId":44,"stateChanged":{"decommitTx":{"cborHex":"84a300d9010281825820ad7458781dc19e427fca77c8c7b2db1b56c81c11590e2ae3999f2f13db8c51c200018182581d60f8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d1a004c4b400200a100d9010281825820eb94e8236e2099357fa499bfbc415968691573f25ec77435b7949f5fdfaa5da0584071b6c5956083ff7ac7ad49d5a75c77967b5ad2e7fd756c1de226f71cdf89e5d383bc88975c9ca7deab135f4ea9014666aa0e257f26bdd94dda2df60c922e9306f5f6","description":"","txId":"3095040e42ed9b193f8a66699b1631c17a85f670aee3c4d77fb3cfb195ea6bcb","type":"Tx ConwayEra"},"headId":"654b2b0e5ff3e0a902a12918b63628cdd478364caa4f0c758e6f7490","newLocalUTxO":{},"tag":"DecommitRecorded","utxoToDecommit":{"3095040e42ed9b193f8a66699b1631c17a85f670aee3c4d77fb3cfb195ea6bcb#0":{"address":"addr_test1vru2drx33ev6dt8gfq245r5k0tmy7ngqe79va69de9dxkrg09c7d3","datum":null,"datumhash":null,"inlineDatum":null,"inlineDatumRaw":null,"referenceScript":null,"value":{"lovelace":5000000}}}},"time":"2025-04-10T07:30:58.882632162Z"}
- Restarted node decommit recorded
{"eventId":76,"stateChanged":{"decommitTx":{"cborHex":"84a300d9010281825820ad7458781dc19e427fca77c8c7b2db1b56c81c11590e2ae3999f2f13db8c51c200018182581d60f8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d1a004c4b400200a100d9010281825820eb94e8236e2099357fa499bfbc415968691573f25ec77435b7949f5fdfaa5da0584071b6c5956083ff7ac7ad49d5a75c77967b5ad2e7fd756c1de226f71cdf89e5d383bc88975c9ca7deab135f4ea9014666aa0e257f26bdd94dda2df60c922e9306f5f6","description":"","txId":"3095040e42ed9b193f8a66699b1631c17a85f670aee3c4d77fb3cfb195ea6bcb","type":"Tx ConwayEra"},"headId":"654b2b0e5ff3e0a902a12918b63628cdd478364caa4f0c758e6f7490","newLocalUTxO":{},"tag":"DecommitRecorded","utxoToDecommit":{"3095040e42ed9b193f8a66699b1631c17a85f670aee3c4d77fb3cfb195ea6bcb#0":{"address":"addr_test1vru2drx33ev6dt8gfq245r5k0tmy7ngqe79va69de9dxkrg09c7d3","datum":null,"datumhash":null,"inlineDatum":null,"inlineDatumRaw":null,"referenceScript":null,"value":{"lovelace":5000000}}}},"time":"2025-04-10T07:31:02.301798566Z"}
-
Let's try to see the decommit timeline between two states (I am aware these event's do not need to be in order but I think etcd should deliver in order after restart)
-
So let's track this decommit between the two nodes:

| Event | Running node | Restarted node |
|-------|--------------|----------------|
| `DecommitRecorded` | 2025-04-10T07:30:58.882632162Z | 2025-04-10T07:31:02.301798566Z |
| `DecommitApproved` | 2025-04-10T07:30:58.894604418Z | missing event |
| `DecommitFinalized` | 2025-04-10T07:30:59.007515339Z | 2025-04-10T07:31:02.300503374Z |
- So it seems like the restarted node is a couple of seconds late, but how can it be that in the test we wait to see `DecommitFinalized`, and yet if we try to close afterwards the restarted node still thinks it is at version 0?
-
Trying out
dingo
and whether I could hook it up tohydra-node
-
When synchronizing
preview
withdingo
the memory footprint was growing as sync progressed, but did not increase to same level when restarting the chain sync (althoug it picked up the starting slot etc.) -
The system was swapping a lot of memory too (probably reached max of my 32GB)
-
Querying address of latest hydra head address shows two heads on
preview
, but our explorer only shows one? -
Querying the
dingo
node seems to work, but I get a hydra scripts discovery error?MissingScript {scriptName = "\957Initial", scriptHash = "c8a101a5c8ac4816b0dceb59ce31fc2258e387de828f02961d2f2045", discoveredScripts = fromList ["0e35115a2c7c13c68ecd8d74e4987c04d4539e337643be20bb3274bd"]}
-
Indeed `dingo` behaves slightly differently on the `queryUTxOByTxIn`
local state query: when requesting three txins, it only responds with one utxo[ TxIn "b7b88533de303beefae2d8bb93fe1a1cd5e4fa3c4439c8198c83addfe79ecbdc" ( TxIx 0 ) , TxIn "da1cc0eef366031e96323b6620f57bc166cf743c74ce76b6c3a02c8f634a7d20" ( TxIx 0 ) , TxIn "6665f1dfdf9b9eb72a0dd6bb73e9e15567e188132b011e7cf6914c39907ac484" ( TxIx 0 ) ] returned utxo: 1
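One way around this (a minimal sketch, not the actual hydra-node query helpers) is what the next entry describes as "query three times": query each `TxIn` separately and merge the results.

```haskell
-- 'queryOne' stands in for the real local-state-query helper and 'utxo' for
-- the node's UTxO type; merging relies on its Monoid instance.
queryUTxOOneByOne :: (Monoid utxo, Monad m) => (txin -> m utxo) -> [txin] -> m utxo
queryUTxOOneByOne queryOne txins = mconcat <$> mapM queryOne txins
```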
-
After fixing that to query three times, the next stop gap seems to come from chain sync:
bearer closed: "<socket: 23> closed when reading data, waiting on next header True"
-
Maybe something on the n2c handshake does not work? On dingo side I see:
{"time":"2025-04-05T13:47:05.495636842+02:00","level":"INFO","msg":"listener: accepted connection from unix@629","component":"connmanager"} {"time":"2025-04-05T13:47:05.4957064+02:00","level":"ERROR","msg":"listener: failed to setup connection: could not register protocol with muxer","component":"connmanager"}
-
When debugging how far we get on the handshake protocol I learn how
gouroboros
implements the state transitions of the miniprotocols usingStateMap
. -
I realize that now the query for scripts not even works.. maybe my instrumentation broke something? Also.. all my instrumentation happened on vendored code in
vendor/
of the dingo repo. I wonder how developers do the editing most convenient in this setup? -
The
Chain.Direct
switch toconnectToLocalNodeWithVersions
was problematic, now it fetches the scripts correctly and the chain sync starts -
It's definitely flaky in how "far" we get.. maybe the
dingo
node is only accepting n2c connections while connected upstream on n2n (I have been in a train with flaky connection). -
Once it progressed now onto a
RollForward
where thequeryTimeHandle
would query theEraHistory
and fail time conversion with error:TimeConversionException {slotNo = SlotNo 77202345, reason = "PastHorizon {pastHorizonCallStack = [(\"runQuery\",SrcLoc {srcLocPackage = \"ouroboros-consensus-0.22.0.0-f90d7bc7c4431d706016c293a932800b9c1e28c3b268597acc5b945a9be83125\", srcLocModule = \"Ouroboros.Consensus.HardFork.History.Qry\", srcLocFile = \"src/ouroboros-consensus/Ouroboros/Consensus/HardFork/History/Qry.hs\", srcLocStartLine = 439, srcLocStartCol = 44, srcLocEndLine = 439, srcLocEndCol = 52}),(\"interpretQuery\",SrcLoc {srcLocPackage = \"hydra-node-0.21.0-inplace\", srcLocModule = \"Hydra.Chain.Direct.TimeHandle\", srcLocFile = \"src/Hydra/Chain/Direct/TimeHandle.hs\", srcLocStartLine = 91, srcLocStartCol = 10, srcLocEndLine = 91, srcLocEndCol = 24}),(\"slotToUTCTime\",SrcLoc {srcLocPackage = \"hydra-node-0.21.0-inplace\", srcLocModule = \"Hydra.Chain.Direct.TimeHandle\", srcLocFile = \"src/Hydra/Chain/Direct/TimeHandle.hs\", srcLocStartLine = 86, srcLocStartCol = 7, srcLocEndLine = 86, srcLocEndCol = 20}),(\"mkTimeHandle\",SrcLoc {srcLocPackage = \"hydra-node-0.21.0-inplace\", srcLocModule = \"Hydra.Chain.Direct.TimeHandle\", srcLocFile = \"src/Hydra/Chain/Direct/TimeHandle.hs\", srcLocStartLine = 116, srcLocStartCol = 10, srcLocEndLine = 116, srcLocEndCol = 22})], pastHorizonExpression = Some (EPair (ERelToAbsTime (ERelSlotToTime (EAbsToRelSlot (ELit (SlotNo 77202345))))) (ESlotLength (ELit (SlotNo 77202345)))), pastHorizonSummary = [EraSummary {eraStart = Bound {boundTime = RelativeTime 0s, boundSlot = SlotNo 0, boundEpoch = EpochNo 0}, eraEnd = EraEnd (Bound {boundTime = RelativeTime 0s, boundSlot = SlotNo 0, boundEpoch = EpochNo 0}), eraParams = EraParams {eraEpochSize = EpochSize 4320, eraSlotLength = SlotLength 20s, eraSafeZone = StandardSafeZone 0, eraGenesisWin = GenesisWindow {unGenesisWindow = 0}}},EraSummary {eraStart = Bound {boundTime = RelativeTime 0s, boundSlot = SlotNo 0, boundEpoch = EpochNo 0}, eraEnd = EraEnd (Bound {boundTime = RelativeTime 0s, boundSlot = SlotNo 0, boundEpoch = EpochNo 0}), eraParams = EraParams {eraEpochSize = EpochSize 86400, eraSlotLength = SlotLength 1s, eraSafeZone = StandardSafeZone 0, eraGenesisWin = GenesisWindow {unGenesisWindow = 0}}},EraSummary {eraStart = Bound {boundTime = RelativeTime 0s, boundSlot = SlotNo 0, boundEpoch = EpochNo 0}, eraEnd = EraEnd (Bound {boundTime = RelativeTime 0s, boundSlot = SlotNo 0, boundEpoch = EpochNo 0}), eraParams = EraParams {eraEpochSize = EpochSize 86400, eraSlotLength = SlotLength 1s, eraSafeZone = StandardSafeZone 0, eraGenesisWin = GenesisWindow {unGenesisWindow = 0}}},EraSummary {eraStart = Bound {boundTime = RelativeTime 0s, boundSlot = SlotNo 0, boundEpoch = EpochNo 0}, eraEnd = EraEnd (Bound {boundTime = RelativeTime 0s, boundSlot = SlotNo 0, boundEpoch = EpochNo 0}), eraParams = EraParams {eraEpochSize = EpochSize 86400, eraSlotLength = SlotLength 1s, eraSafeZone = StandardSafeZone 0, eraGenesisWin = GenesisWindow {unGenesisWindow = 0}}},EraSummary {eraStart = Bound {boundTime = RelativeTime 0s, boundSlot = SlotNo 172800, boundEpoch = EpochNo 2}, eraEnd = EraEnd (Bound {boundTime = RelativeTime 259200s, boundSlot = SlotNo 86400, boundEpoch = EpochNo 1}), eraParams = EraParams {eraEpochSize = EpochSize 86400, eraSlotLength = SlotLength 1s, eraSafeZone = StandardSafeZone 0, eraGenesisWin = GenesisWindow {unGenesisWindow = 0}}},EraSummary {eraStart = Bound {boundTime = RelativeTime 259200s, boundSlot = SlotNo 55728000, boundEpoch = EpochNo 645}, eraEnd = EraEnd (Bound {boundTime = RelativeTime 
55814400s, boundSlot = SlotNo 345600, boundEpoch = EpochNo 4}), eraParams = EraParams {eraEpochSize = EpochSize 86400, eraSlotLength = SlotLength 1s, eraSafeZone = StandardSafeZone 0, eraGenesisWin = GenesisWindow {unGenesisWindow = 0}}},EraSummary {eraStart = Bound {boundTime = RelativeTime 55814400s, boundSlot = SlotNo 77155200, boundEpoch = EpochNo 893}, eraEnd = EraEnd (Bound {boundTime = RelativeTime 77241600s, boundSlot = SlotNo 55900800, boundEpoch = EpochNo 647}), eraParams = EraParams {eraEpochSize = EpochSize 86400, eraSlotLength = SlotLength 1s, eraSafeZone = StandardSafeZone 0, eraGenesisWin = GenesisWindow {unGenesisWindow = 0}}}]}"}
-
I saw that same error when using
cardano-cli query tip
.. seems like the era history local state query is not accurately reporting epoch bounds. -
I conclude that
dingo
is easy to use and navigate around, but the N2C API is not complete yet. Maybe my work on the LocalStateQuery API incardano-blueprint
could benefit the project and makinggouroboros
more conformant (at least from a message serialization point of view).
-
Non-profiled Haskell binaries can be inspected using
-s
and-hT
RTS arguments -
Running the
hydra-node
using a 2GB state file as provided by GD the node will load the state and then fail on mismatched keys (as we have not the right ones):151,712,666,608 bytes allocated in the heap 14,411,335,656 bytes copied during GC 973,747,296 bytes maximum residency (53 sample(s)) 24,460,192 bytes maximum slop 2033 MiB total memory in use (0 MiB lost due to fragmentation)
-
The
peekForeverE
in https://github.com/cardano-scaling/hydra/pull/1919 seem not to make any difference:151,712,692,632 bytes allocated in the heap 14,409,258,352 bytes copied during GC 973,732,032 bytes maximum residency (53 sample(s)) 24,545,088 bytes maximum slop 2033 MiB total memory in use (0 MiB lost due to fragmentation)
-
Using
hT
a linear growth of memory can be seen quite easily. -
First idea:
lastEventId
conduit was usingfoldMapC
which might be building thunks viamappend
- Nope, that was not the issue.
-
That was not the issue.. disabling aggregation of
chainStateHistory
and only loadheadState
next.- Still linear growth.. so the culprit most likely is inside the main loading of
headState
(besides other issues?)
- Still linear growth.. so the culprit most likely is inside the main loading of
-
Let's turn on
StrictData
on all ofHeadLogic
as a first stab at seeing more stricture usage ofHeadState
et al. -
This works! Making
HeadLogic{.State, .Outcome}
allStrictData
already pushes the heap usage down ~5MB! -
Possible explanation: With gigabytes of state updates we have almost exclusively
TransactionReceived
et al state changes. In theaggregate
we usually build up thunks likeallTxs = allTxs <> fromList [(txId tx, tx)]
which will leak memory until forced into one concrete list when showing theHeadState
first (which will probably collapse the memory usage again). -
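A simplified illustration of that explanation (toy types, not the actual `HeadState`): under a strict fold the record is only forced to weak head normal form, so a lazy `allTxs`-like field keeps accumulating unevaluated `<>` thunks, while a strict field forces each step as it happens.

```haskell
import Data.List (foldl')
import qualified Data.Map.Strict as Map

-- Toy stand-ins for the head state; only the strictness of the field differs.
newtype LazyState = LazyState {lazyAllTxs :: Map.Map Int String}
newtype StrictState = StrictState {strictAllTxs :: !(Map.Map Int String)}

stepLazy :: LazyState -> (Int, String) -> LazyState
stepLazy s (i, tx) = s{lazyAllTxs = lazyAllTxs s <> Map.fromList [(i, tx)]}

stepStrict :: StrictState -> (Int, String) -> StrictState
stepStrict s (i, tx) = s{strictAllTxs = strictAllTxs s <> Map.fromList [(i, tx)]}

main :: IO ()
main = do
  -- The lazy variant only does the work once 'Map.size' finally demands the
  -- field, after building a long chain of '<>' thunks; the strict field does
  -- the union eagerly on every step (observable with +RTS -hT).
  let events = [(i, show i) | i <- [1 .. 200000 :: Int]]
  print . Map.size . strictAllTxs $ foldl' stepStrict (StrictState mempty) events
  print . Map.size . lazyAllTxs $ foldl' stepLazy (LazyState mempty) events
```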
With
StrictData
we have a maximum residency of 10MB after loading 2GB of state events:152,176,815,256 bytes allocated in the heap 16,702,572,088 bytes copied during GC 9,967,848 bytes maximum residency (2387 sample(s)) 215,600 bytes maximum slop 43 MiB total memory in use (0 MiB lost due to fragmentation)
-
Trying to narrow in exact source of memory leak so I do not need to put bangs everywhere
-
allTxs
andlocalTxs
assignments are not the source of it .. maybe thecoordinatedHeadState
record update? -
No .. also not really. Maybe it's time to recompile with
profiling
enabled and make some coffee (this will take a while). -
When using
profiling: True
using thehaskell.nix
managed dependencies, I ran into this error: -
Setting
enableProfiling = true
in the haskell.nix projectmodules
rebuilds the whole world, but that is expected. -
Hard to spot where exactly we are creating the space leak / thunks. This blog post is helpful still: http://blog.ezyang.com/2011/06/pinpointing-space-leaks-in-big-programs/
-
I am a bit confused why so many of the cost center point to parsing and decoding code .. maybe the transactions themselves (which make up the majority of data) are not forced for long? This would make sense because the
HeadLogic
does not inspect transactions themselves (much). -
Only strictness annotations on a
!tx
did not help, but let's try aStrictData
onStateChanged
-
StrictData
onHeadLogic.Outcome
does not fix it … so it must be something related to theHeadState
. -
The retainer profile actually points quite clearly to
aggregate
. -
The biggest things on the heap are bytes, thunks and types related to a cardano transaction body.
-
-
Going back to zero in on branches of `aggregate` via exclusion:
  - Disabling all `CoordinatedHeadState` modifications makes memory usage minimal again
  - Enabling `SnapshotConfirmed` -> still bounded
  - Enabling `PartySignedSnapshot` -> still bounded
  - Enabling `SnapshotRequested` -> growing!
  - Without the `allTxs` update -> bounded!
  - This line creates thunks!? `allTxs = foldr Map.delete allTxs requestedTxIds`
  - Neither forcing `allTxs` nor `requestedTxIds` helped
  - Is it really only this line? Enabling all other `aggregate` updates to `CoordinatedHeadState`
  - It's both `allTxs` usages
  - If we only make the `allTxs` field strict? -> Bounded! (see the sketch below)
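The shape of the fix, as a sketch (simplified, hypothetical field and type names, not the actual hydra-node record): making the map field strict via `StrictData` or a bang means updates like the `foldr Map.delete` above are forced on every `aggregate` step instead of piling up.

```haskell
{-# LANGUAGE StrictData #-}

import Data.Map (Map)

-- Sketch only: with StrictData (or an explicit bang on 'allTxs') each record
-- update evaluates the new map immediately, keeping residency bounded.
data CoordinatedHeadState txid tx = CoordinatedHeadState
  { allTxs :: Map txid tx
  , localTxs :: [tx]
  }
```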
-
After easy changes to `FanoutTx` to include the observed UTxO instead of using the confirmed snapshot, there are problems in the `DirectChainSpec` and `Model`.
Let's look at
DirectChainSpec
first - I need to come up with a utxo value for this line here:
aliceChain `observesInTime` OnFanoutTx headId mempty
- Failed test looks like this:
test/Test/DirectChainSpec.hs:578:35:
1) Test.DirectChain can open, close & fanout a Head
expected: OnFanoutTx {headId = UnsafeHeadId "eK+\SO_\243\224\169\STX\161)\CAN\182\&6(\205\212x6L\170O\fu\142ot\144", fanoutUTxO = fromList [(TxIn "0762c8de902abe1e292e691066328c932d95e29c9a564d466e8bc791527e359f" (TxIx 0),TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraConway) (ShelleyAddress Testnet (KeyHashObj (KeyHash {unKeyHash = "8163bc1d679f90d073784efdc761288dbc2dc21a352f69238070fc45"})) StakeRefNull)) (TxOutValueShelleyBased ShelleyBasedEraConway (MaryValue (Coin 2000000) (MultiAsset (fromList [])))) TxOutDatumNone ReferenceScriptNone),(TxIn "c9a733c945fdb7819648a58d7d6b9a30af2ac458a27f5bb7e9c41f92da82ba2c" (TxIx 0),TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraConway) (ShelleyAddress Testnet (KeyHashObj (KeyHash {unKeyHash = "8163bc1d679f90d073784efdc761288dbc2dc21a352f69238070fc45"})) StakeRefNull)) (TxOutValueShelleyBased ShelleyBasedEraConway (MaryValue (Coin 2000000) (MultiAsset (fromList [])))) TxOutDatumNone ReferenceScriptNone)]}
but got: OnFanoutTx {headId = UnsafeHeadId "eK+\SO_\243\224\169\STX\161)\CAN\182\&6(\205\212x6L\170O\fu\142ot\144", fanoutUTxO = fromList [(TxIn "880c3d807a48d432788158f879a81a5ddc6c1ad6527fe70922175e621ea08092" (TxIx 0),TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraConway) (ShelleyAddress Testnet (ScriptHashObj (ScriptHash "0e35115a2c7c13c68ecd8d74e4987c04d4539e337643be20bb3274bd")) StakeRefNull)) (TxOutValueShelleyBased ShelleyBasedEraConway (MaryValue (Coin 4879080) (MultiAsset (fromList [(PolicyID {policyID = ScriptHash "654b2b0e5ff3e0a902a12918b63628cdd478364caa4f0c758e6f7490"},fromList [("4879647261486561645631",1),("f8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d",1)])])))) (TxOutDatumInline BabbageEraOnwardsConway (HashableScriptData "\216{\159\216y\159X\FSeK+\SO_\243\224\169\STX\161)\CAN\182\&6(\205\212x6L\170O\fu\142ot\144\159X \213\191J?\204\231\ETB\176\&8\139\204'I\235\193H\173\153i\178?E\238\ESC`_\213\135xWj\196\255\216y\159\EM'\DLE\255\NUL\SOHX \193\211\DC4E\234\252\152\157\239\186\RSmVF\141\208\218\135\141\160{\fYFq\245\SOH\148\nOS\DC1X \227\176\196B\152\252\FS\DC4\154\251\244\200\153o\185$'\174A\228d\155\147L\164\149\153\ESCxR\184UX \227\176\196B\152\252\FS\DC4\154\251\244\200\153o\185$'\174A\228d\155\147L\164\149\153\ESCxR\184U\128\ESC\NUL\NUL\SOH\149\214\218\152\136\255\255" (ScriptDataConstructor 2 [ScriptDataConstructor 0 [ScriptDataBytes "eK+\SO_\243\224\169\STX\161)\CAN\182\&6(\205\212x6L\170O\fu\142ot\144",ScriptDataList [ScriptDataBytes "\213\191J?\204\231\ETB\176\&8\139\204'I\235\193H\173\153i\178?E\238\ESC`_\213\135xWj\196"],ScriptDataConstructor 0 [ScriptDataNumber 10000],ScriptDataNumber 0,ScriptDataNumber 1,ScriptDataBytes "\193\211\DC4E\234\252\152\157\239\186\RSmVF\141\208\218\135\141\160{\fYFq\245\SOH\148\nOS\DC1",ScriptDataBytes "\227\176\196B\152\252\FS\DC4\154\251\244\200\153o\185$'\174A\228d\155\147L\164\149\153\ESCxR\184U",ScriptDataBytes "\227\176\196B\152\252\FS\DC4\154\251\244\200\153o\185$'\174A\228d\155\147L\164\149\153\ESCxR\184U",ScriptDataList [],ScriptDataNumber 1743066405000]]))) ReferenceScriptNone)]}
-
So it seems like there is a script output in the observed UTxO with 4879080 lovelace and some tokens; this looks like the head output, while what we expect are distributed outputs to the hydra-node parties containing the fanout amount.
-
These head assets that I see should have been burned already? We get this utxo in the observation using
let inputUTxO = resolveInputsUTxO utxo tx
-
If I use
(headInput, headOutput) <- findTxOutByScript inputUTxO Head.validatorScript
UTxO.singleton (headInput, headOutput)
then the utxo is the same which is expected.
-
How come the fanout tx does not contain pub key outputs?
-
If I use
utxoFromTx fanoutTx
then I get the expected pub key outputs:
1) Test.DirectChain can open, close & fanout a Head
expected: OnFanoutTx {headId = UnsafeHeadId "eK+\SO_\243\224\169\STX\161)\CAN\182\&6(\205\212x6L\170O\fu\142ot\144", fanoutUTxO = fromList []}
but got: OnFanoutTx {headId = UnsafeHeadId "eK+\SO_\243\224\169\STX\161)\CAN\182\&6(\205\212x6L\170O\fu\142ot\144", fanoutUTxO = fromList [(TxIn "431e45c0048e0aa104deaca1e8aca454c85efd71c52948e418d9119fd8cdf7b3" (TxIx 0),TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraConway) (ShelleyAddress Testnet (KeyHashObj (KeyHash {unKeyHash = "4e932840c5d2d3664237149fd3e9ba09c531581126fbdbab073c31ce"})) StakeRefNull)) (TxOutValueShelleyBased ShelleyBasedEraConway (MaryValue (Coin 2000000) (MultiAsset (fromList [])))) TxOutDatumNone ReferenceScriptNone),(TxIn "431e45c0048e0aa104deaca1e8aca454c85efd71c52948e418d9119fd8cdf7b3" (TxIx 1),TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraConway) (ShelleyAddress Testnet (KeyHashObj (KeyHash {unKeyHash = "f8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d"})) StakeRefNull)) (TxOutValueShelleyBased ShelleyBasedEraConway (MaryValue (Coin 90165992) (MultiAsset (fromList [])))) TxOutDatumNone ReferenceScriptNone)]}
but the overall test is red since we construct artificial TxIns in utxoFromTx
-
I created `findPubKeyOutputs` to match on all pub key outputs and then I see the expected outputs, but they also contain the change output that returns some ada to the hydra-node wallet. Life is not simple.
In the end I changed all tests that match exactly on final utxo to make sure that subset of final utxo is there (disregarding the change output).
-
Changes in fanout observation boiled down to
findPubKeyOutputs $ utxoFromTx tx
-
Midnight people have reported that they still see some memory issues when loading a huge state file from disk.
-
The main problem is making sure the fix works, I still don't have a good idea on how to make sure my changes reduce the memory consumption.
-
Problem lies in this piece of code:
(lastEventId, (headState, chainStateHistory)) <-
runConduitRes $
sourceEvents eventSource
.| getZipSink
( (,)
<$> ZipSink (foldMapC (Last . pure . getEventId))
<*> ZipSink recoverHeadStateC
)
...
recoverHeadStateC =
mapC stateChanged
.| getZipSink
( (,)
<$> ZipSink (foldlC aggregate initialState)
<*> ZipSink (foldlC aggregateChainStateHistory $ initHistory initialChainState)
)
and of course the way we create PersistenceIncremental
which is responsible
for reading the file (sourceEvents eventSource
part).
sourceFileBS fp
.| linesUnboundedAsciiC
.| mapMC
( \bs ->
case Aeson.eitherDecodeStrict' bs of
Left e -> ...
Right decoded -> ...
)
-
Initially I noticed the usage of
foldlC
which is strict and thought perhaps this is the problem but could not find lazy alternative and in general I don't believe this is the real issue. -
I am more keen to investigate this code:
sourceFileBS fp
.| linesUnboundedAsciiC
.| mapMC ...
-
linesUnboundedAsciiC
could be the cause since I believe it is converting the whole stream
Convert a stream of arbitrarily-chunked textual data into a stream of data
where each chunk represents a single line. Note that, if you have
unknown/untrusted input, this function is unsafe, since it would allow an
attacker to form lines of massive length and exhaust memory.
-
I also found an interesting function
peekForeverE
that should Run a consuming conduit repeatedly, only stopping when there is no more data available from upstream. -
Could I use benchmarks to simulate heavy load from disk?
-
I just tried running the benchmarks with and without one line change and it seems like the memory consumption is reduced
- BEFORE ->
Average confirmation time (ms): 57.599974154
P99: 76.48237684999998ms
P95: 67.55752405ms
P50: 56.9354805ms
Invalid txs: 0
### Memory data
| Time | Used | Free |
|------|------|------|
| 2025-03-27 15:39:59.474482067 UTC | 14.2G | 35.1G |
| 2025-03-27 15:40:04.474412824 UTC | 14.4G | 34.9G |
| 2025-03-27 15:40:09.474406479 UTC | 14.4G | 34.9G |
| 2025-03-27 15:40:14.474403701 UTC | 14.4G | 34.8G |
| 2025-03-27 15:40:19.47445777 UTC | 14.4G | 34.8G |
| 2025-03-27 15:40:24.474392458 UTC | 14.4G | 34.8G |
| 2025-03-27 15:40:29.474439923 UTC | 14.4G | 34.8G |
| 2025-03-27 15:40:34.474408859 UTC | 14.5G | 34.7G |
| 2025-03-27 15:40:39.474436556 UTC | 14.4G | 34.7G |
| 2025-03-27 15:40:44.474414945 UTC | 14.5G | 34.7G |
Confirmed txs/Total expected txs: 300/300 (100.00 %)
Average confirmation time (ms): 9.364033643
P99: 19.919154109999997ms
P95: 15.478096ms
P50: 7.7630015ms
Invalid txs: 0
### Memory data
| Time | Used | Free |
|------|------|------|
| 2025-03-27 15:40:55.995225272 UTC | 14.2G | 35.1G |
| 2025-03-27 15:41:00.995294779 UTC | 14.2G | 35.1G |
| 2025-03-27 15:41:05.995309124 UTC | 14.2G | 35.1G |
| 2025-03-27 15:41:10.995299687 UTC | 14.3G | 35.0G |
| 2025-03-27 15:41:15.995284362 UTC | 14.3G | 35.0G |
| 2025-03-27 15:41:20.995281122 UTC | 14.3G | 35.0G |
- AFTER ->
Average confirmation time (ms): 57.095020378
P99: 72.8903286ms
P95: 66.89188805ms
P50: 57.172249ms
Invalid txs: 0
### Memory data
| Time | Used | Free |
|------|------|------|
| 2025-03-27 15:37:47.726878831 UTC | 13.7G | 35.6G |
| 2025-03-27 15:37:52.726824668 UTC | 13.9G | 35.5G |
| 2025-03-27 15:37:57.726768654 UTC | 14.0G | 35.3G |
| 2025-03-27 15:38:02.72675874 UTC | 14.0G | 35.3G |
| 2025-03-27 15:38:07.726756126 UTC | 14.0G | 35.3G |
| 2025-03-27 15:38:12.726795633 UTC | 14.0G | 35.2G |
| 2025-03-27 15:38:17.726793141 UTC | 14.1G | 35.2G |
| 2025-03-27 15:38:22.726757309 UTC | 14.1G | 35.1G |
| 2025-03-27 15:38:27.726764279 UTC | 14.1G | 35.1G |
| 2025-03-27 15:38:32.726781991 UTC | 14.1G | 35.1G |
Confirmed txs/Total expected txs: 300/300 (100.00 %)
Average confirmation time (ms): 9.418157436
P99: 19.8506584ms
P95: 15.841609050000002ms
P50: 7.821248000000001ms
Invalid txs: 0
### Memory data
| Time | Used | Free |
|------|------|------|
| 2025-03-27 15:38:45.195815881 UTC | 13.8G | 35.6G |
| 2025-03-27 15:38:50.195894922 UTC | 14.0G | 35.4G |
| 2025-03-27 15:38:55.19592388 UTC | 13.8G | 35.5G |
| 2025-03-27 15:39:00.195971592 UTC | 14.1G | 35.2G |
| 2025-03-27 15:39:05.195891924 UTC | 14.3G | 35.0G |
| 2025-03-27 15:39:10.195897911 UTC | 14.3G | 35.0G |
-
I think I could try to keep the state file after running the benchmarks and then try to start a hydra-node using this (hopefully huge) state file and then peek into prometheus metrics to observe reduced memory usage.
-
What I find weird is that the same persistence functions are used in the api server but there is no reported leakage there - perhaps it boils down to how we consume this stream?
-
Managed to get a state file with over 300k events so let's see if we can measure reduced usage.
-
This is my invocation of hydra-node so I can copy paste it when needed:
./result/bin/hydra-node \
--node-id 1 --api-host 127.0.0.1 \
--monitoring-port 6000 \
--hydra-signing-key /home/v0d1ch/code/hydra/memory/state-0/me.sk \
--hydra-scripts-tx-id "8f46dbf87bd7eb849c62241335fb83b27e9b618ea4d341ffc1b2ad291c2ad416,25f236fa65036617306a0aaf0572ddc1568cee0bc14aee14238b1196243ecddd,59a236ac22eb1aa273c4bcd7849d43baddd8fcbc5c5052f2eb074cdccbe39ff4" \
--cardano-signing-key /home/v0d1ch/code/hydra/memory/1.sk \
--ledger-protocol-parameters /home/v0d1ch/code/hydra/memory/state-0/protocol-parameters.json \
--testnet-magic 42 \
--contestation-period 10 \
--deposit-deadline 10 \
--node-socket /home/v0d1ch/code/hydra/memory/node.socket \
--persistence-dir /home/v0d1ch/code/hydra/memory/state-0
- Didn't find the time to properly connect to some tool to measure the memory, but by looking at the timestamps between the `LoadingState` and `LoadedState` traces I can see that the new changes give MUCH better performance:
With the current master:
start loading timestamp":"2025-03-27T17:12:08.57862623Z
loaded timestamp":"2025-03-27T17:12:28.991870713Z
With one-liner change:
start loading timestamp":"2025-03-27T16:58:54.055623085Z
loaded timestamp":"2025-03-27T16:59:15.05648201Z
- It looks like it took us 20 seconds to load an approximately 335 MB state file, and the new change reduces this to around a second!
-
It seems like we have a bug in displaying our pending deposits where deposits that are already incremented or recovered still show when doing a request to hydra-node api
/commits
. -
I extended one e2e test we had related to pending deposits and added one check after all others where I spin up again two hydra-nodes and call the endpoint to see if all pending deposits are cleared.
✦ ➜ cabal test hydra-cluster --test-options='--match="can see pending deposits" --seed 278123554'
-
The test seems flaky but in general it almost always fails.
-
From just looking at the code I couldn't see anything weird
-
Found one weird thing: I asserted that in the node-1 state file there are three `CommitRecorded` and three `CommitRecovered`, but in the node-2 state file two `CommitRecovered` are missing.
Is the whole bug related to who does the recovering/recording?
-
The test outcome although red shows correct txids for the non-recovered txs
-
We only assert node-1 sees all
CommitRecover
messages but don't do it for the node-2 since that node is shut down at this point (in order to be able to prevent deposits from kicking in). -
Is this a non-issue after all? I think so since we stop one node and then try to assert that after restart it sees some other commits being recovered but those were never recorded in the node local state. What is weird is that the test was flaky but using constant seed yields always the same results.
-
If a node fails to see
OnIncrementTx
then the deposit is stuck in pending local state forever.
-
Currently I had to sprinkle `threadDelay` here and there in the model tests since otherwise they hang for a long time and eventually (I think) report the shrunk values that fail the test.
This problem is visible mainly in CI where the resources available are not so big, locally the same tests pass.
-
If I remove the threadDelay the memory grows really big and I need to kill the process.
-
This started happening when I had to replace
GetUTxO
which no longer exists withqueryState
-
I looked at this with NS and found out that we were not waiting for all nodes to see a `DecommitFinalized` - we were waiting only on our own node to see it. This seemed to have fixed the model test and it was a bit surprising to be this easy, since I expected a lot of problems in finding out what went wrong.
-
The situation is that we are unable to close because of
H13
MustNotChangeVersion
-
This happens because the version in the input datum (open datum) does not match with the version in the output (close datum).
-
Local state says I am on version 3 and onchain it seems the situation is the same - 3! But this can't be since the onchain check would pass then. This is how the datum looks https://preview.cexplorer.io/datum/8e4bd7ac38838098fbf23e5702653df2624bcfa4cf0c5236498deeede1fdca78
-
Looking at the state it seems like we try to close, the snapshot version contains correct version (3) but
openVersion
is still at 2:
...
"utxoToCommit": null,
"utxoToDecommit": null,
"version": 3
},
"tag": "ConfirmedSnapshot"
},
"headId": "50bb0874ae28515a2cff9c074916ffe05500a3b4eddea4178d1bed0b",
"headParameters": {
"contestationPeriod": 300,
"parties": [
...
"openVersion": 2,
"tag": "CloseTx"
},
"tag": "OnChainEffect"
}
-
Question is how did we get to this place? It must be that my node didn't observe and emit one `CommitFinalized`, which is when we do the version update - upon increment observation.
There are 24 lines with a `CommitFinalize` message - they only go up to version 2 - while there are 36 lines with `CommitRecorded`; it seems like one recorded commit was not finalized for whatever reason.
OnIncrementTx
shows up 8 times in the logs but in reality it is tied to only two increments so the third one was never observed. -
OnDepositTx
shows up 12 times in the logs but they are related to only two deposits. -
Could it be that the decommit failed instead?
-
There is one
DecommitRecorded
and oneDecommitFinalized
so it seems good. -
Seems like we have
CommitRecorded
for:"utxoToCommit":{"4b31dd7db92bde4359868911c1680ea28c0a38287a4e5b9f3c07086eca1ac26a#0"
"utxoToCommit":{"4b31dd7db92bde4359868911c1680ea28c0a38287a4e5b9f3c07086eca1ac26a#1"
"utxoToCommit":{"22cb19c790cd09391adf2a68541eb00638b8011593b3867206d2a12a97f4bf0d#0"
-
We received
CommitFinalized
for: -
"theDeposit":"44fa1bc9b04d2ffee50fd84088517c3f7b530353834e7c678fdd05073881cb40"
- "theDeposit":"5b93f95068148482a1e27979517e8ab467f85e72551cfc9baaa2086a60e7353a"
-
So one commit was never finalized but it is a bit hard to connect recorded and finalized commits.
-
OnDepositTx
was seen for txids: - 44fa1bc9b04d2ffee50fd84088517c3f7b530353834e7c678fdd05073881cb40 - 5b93f95068148482a1e27979517e8ab467f85e72551cfc9baaa2086a60e7353a - 83e7c36a9d4727e00169409f869d0f94737672c7e87850632b9efe1637f8ef8f -
OnIncrementTx
was seen for:- 44fa1bc9b04d2ffee50fd84088517c3f7b530353834e7c678fdd05073881cb40
- 5b93f95068148482a1e27979517e8ab467f85e72551cfc9baaa2086a60e7353a so we missed to observe deposit `83e7c36a9d4727e00169409f869d0f94737672c7e87850632b9efe1637f8ef8f https://preview.cexplorer.io/tx/83e7c36a9d4727e00169409f869d0f94737672c7e87850632b9efe1637f8ef8f#data
-
Question is what to do with this Head? Can it be closed somehow?
-
We should query the deposit address to see what kind of UTxOs are available there.
-
Added an endpoint to GET the latest confirmed snapshot, which is needed to construct the side-load request, but it does not include information about the latest seen snapshot. Waiting on pull#1860 to enhance it.
-
In our scenario, the head got stuck on InitialSnapshot. This means that during side-loading, we must act similarly to clear pending transactions (pull#1840).
-
Wonder if the side-loaded snapshot version should be exactly the same as the current one, given that version bumping requires L1 interaction.
-
Also unclear if we should validate utxoToCommit and utxoToDecommit on the provided snapshot to match the last known state.
-
Concerned that a head can become stuck during a Recover or Decommit client input.
-
SideLoadSnapshot is the first ClientInput that contains a headId and must be verified when received by the node.
-
Uncertain whether WaitOnNotApplicableTx for localTxs not present in the side-loaded confirmed snapshot would trigger automatic re-submission.
-
I think this feature should not be added to TUI since it is not part of the core protocol or user journey.
-
Now that we have a head stuck on the initial snapshot, I want to explore how we can introspect the node state from the client side, as this will be necessary to create the side-load request.
-
Projecting the latest SnapshotConfirmed seems straightforward, but projecting the latest SeenSnapshot introduces code duplication in HeadLogic.aggregate and the ServerOutput projection.
-
These projections currently conflict heavily with pull#1860. For that reason, we are postponing these changes until it is merged.
-
We need to break down withHydraNode into several pieces to allow starting a node with incorrect ledger-protocol-params in its running configuration.
-
In this e2e scenario, we exercise a three party network where two nodes (node-1 and node-2) are healthy, and one (node-3) is misconfigured. In this setup, node-1 attempts to submit a NewTx which is accepted by both healthy members but rejected by node-3. Then, when node-3 goes offline and comes back online using healthy pparams, it is expected to stop cooperating and cause the head to become stuck.
-
It seems that after node-3 comes back online, it only sees a PeerConnected message within 20s. Adding a delay for it to catch up does not help. From its logs, we don’t see messages for WaitOnNotApplicableTx, WaitOnSeenSnapshot, or DroppedFromQueue.
-
If node-3 tries to re-submit the same transaction, it is now accepted by node-3 but rejected by node-1 and node-2 due to ValueNotConservedUTxO (because it was already applied). Since node-3 is not the leader, we don’t see any new SnapshotRequested round being signed.
-
Node-1 and node-2 have already signed and observed each other signing for snapshot number 1, while node-3 has not seen anything. This means node-1 and node-2 are waiting for node-3 to sign in order to proceed. Now the head is stuck and won’t make any progress because node-3 has stopped cooperating.
-
New issue raised for head getting stuck issue#1773, which proposes to forcibly sync the snapshots of the hydra-nodes in order to align local ledger states.
-
Updating the sequence diagram for a head getting stuck using latest findings.
-
Now thinking about how we could "Allow introspection of the current snapshot in a particular node", as we want to be able to notice if the head has become stuck. We want to be able to observe who has not yet signed the current snapshot in flight (which is preventing it from getting confirmed).
-
Noticed that in onOpenNetworkReqTx we keep TransactionReceived even if not applicable, resulting in a list with potentially duplicate elements (in case of resubmission).
-
Given that a head becoming stuck is an L2 issue due to network connectivity, I’m considering whether we could send more information about the local ledger state as part of PeerConnected to trigger auto-sync recovery based on discrepancies. Or perhaps we should broadcast InvalidTx instead?
-
Valid idea to explore after side-load.
-
Trying to reproduce a head becoming stuck in BehaviorSpec when a node starts with an invalid ledger.
-
Having oneMonth in BehaviorSpec's waitUntilMatch makes debugging harder. Reduced it to (6 * 24 * 3), allowing full output visibility.
-
After Bob reconnects using a valid ledger, we expected him to accept the transaction if re-submitted by him, but he rejects it instead.
-
It's uncertain whether Bob is rejecting the resubmission or something else, so I need to wait until all transactions are dropped from the queue.
-
Found that when Bob is resubmitting, he is in Idle state when he is expected to restart in Initial state.
-
This is interesting, as if a party suffers a disk error and loses persistence, side-loading may allow it to resume up to a certain point in time.
-
The idea is valid, but we should not accept a side-load when in Idle state—only when in Open state.
-
It seems this is the first time we attempt to restart a node in BehaviorSpec. Now checking if this is the right place or if I should design the scenario differently.
-
When trying to restart the node from existing sources, we noticed the need to use the
hydrate
function. This suggests we should not force reproducing this scenario in BehaviorSpec. -
NodeSpec does not seem to be the right place either, as we don't have multiple peers connected to each other.
-
Trying to reproduce the scenario at the E2E level, now running on top of an etcd network.
- Continuing where I left off yesterday - to fix a single test that should throw
IncorrectAccessException
but instead I saw yesterday:
uncaught exception: IOException of type ResourceBusy
- When I sprinkle some `spy'` to see the values of the actual thread ids I don't get this exception anymore, the test just fails. So the exception is tightly coupled with how we check for threads in the `PersistenceIncremental` handle.
- I tried labeling the threads and using `throwTo` from `MonadFork` but the result is the same.
- Tried using `withBinaryFile` in both `source` and `append` and using `conduit` to stream from/to the file, but that didn't help.
- Tried using `bracket` with `openBinaryFile` and then sink/source the handle in the callback, but the results are the same.
- What is happening here?
-
There are only two problems left to solve here. The first one is the `IncorrectAccessException` from persistence in the cluster tests; for this one I have a plan on how to solve it (have a way to register a thread that will append). The other problem is that some cluster tests fail since an appropriate message was not observed.
One example test is persistence can load with empty commit.
-
I wanted to verify whether the messages are coming through since the test fails at `waitFor`, and I see the messages propagated (but I don't see `HeadIsOpened` twice!)
Looking at the messages, the `Greetings` message does not contain the correct `HeadStatus` anymore! There was a projection that made sure to update this field in the `Greetings` message, but now we shuffled things around and I don't think this projection works any more.
I see all messages correct (except headStatus in
Greetings
) but only propagated once (and we do restart the node in our test). -
I see api server being spun up twice but second time I don't see message replay for some reason.
-
One funny thing is I see
ChainRollback
- perhaps something around this is broken? -
I see one rebase mistake in
Monitoring
module that I reverted. -
After some debugging I notice that the history loaded from the conduit is always empty list. This is the cause of our problems here!
-
Still digging around code to try and figure out what is happening. I see
HeadOpened
saved in persistence file and can't for the life of me figure out why it is not loaded on restart. I tried even passing in the complete intact event source conduit to make sure I am not consuming the conduit in theServer
leaving it empty for theWSServer
but this is not the problem I am having. -
I remapped all projections to work with
StateChanged
instead ofServerOutput
since it makes no sense to remap toServerOutput
just for that. -
Suspecting that
mapWhileC
is the problem since it would stop each time it can't convert someStateEvent
toServerOutput
from disk! -
This was it - `mapWhileC` stops when it encounters `Nothing`, so it was not processing the complete list of events! So happy to fix this.
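A tiny standalone example of the difference (toy data, not the hydra code): `mapWhileC` stops at the first `Nothing`, while something like `concatMapC` with a `Maybe`-to-list conversion just skips elements it cannot convert and keeps streaming.

```haskell
import Conduit

-- Only even numbers are "convertible" in this toy example.
halveEven :: Int -> Maybe Int
halveEven n = if even n then Just (n `div` 2) else Nothing

main :: IO ()
main = do
  stopsEarly <- runConduit $ yieldMany [2, 4, 5, 6 :: Int] .| mapWhileC halveEven .| sinkList
  keepsGoing <- runConduit $ yieldMany [2, 4, 5, 6 :: Int] .| concatMapC (maybe [] pure . halveEven) .| sinkList
  print stopsEarly -- [1,2]  (stops at the unconvertible 5)
  print keepsGoing -- [1,2,3] (skips the 5 and continues)
```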
Next is to tackle the `IncorrectAccessException` from persistence. I know why this happens (obviously we try to append from a different thread), and sourcing the contents of a persistence file should not be guarded by a correct thread id. In fact, we should allow all possible clients to accept (streamed) persistence contents and make sure to only append from one thread, and that is the one in which the hydra-node process is actually running.
I added another field to `PersistenceIncremental` called `registerThread` and its sole purpose is to register the thread in which we run - so that we are able to append (I also removed the check for the thread id from `source` and moved it to `append`).
Ok, this was not the fix I was looking for. The registerThread is hidden in the persistence handle so if you don't have access to it from the outside how would you register a thread (for example in our tests).
-
I ended up registering a thread id on
append
if it doesn't exist and do a check if it is there but see one failure:
test/Hydra/PersistenceSpec.hs:59:5:
1) Hydra.Persistence.PersistenceIncremental it cannot load from a different thread once having started appending
uncaught exception: IOException of type ResourceBusy
/tmp/hydra-persistence-33802a411f862b7a/data: openBinaryFile: resource busy (file is locked)
(after 1 test)
[]
[String "WT",Null,Null]
I still need to investigate.
-
There is no
CommandFailed
andClientEffect
-
We don't have the `GetUTxO` client input anymore, therefore I had to call the api using a `GET /snapshot/utxo` request to obtain this information (in cluster tests).
For the tests that don't spin up the api server I used `TestHydraClient` and its `queryState` function to obtain the `HeadState`, which in turn contains the head UTxO.
One important thing to note is that I had to add
utxoToCommit
in the snapshot projection in order to get the expected UTxO. This was a bug we had and nobody noticed. -
We return `Greetings` and `InvalidInput` types from the api server without wrapping them into `TimedServerOutput`, which is a bit annoying since now we need to double parse json values in tests. If the decoding fails for `TimedServerOutput` we try to parse just the `ServerOutput`.
Current problems:
-
After adding `/?history=yes` to the hydra-cluster tests api client I started seeing `IncorrectAccessException` from the persistence. This is weird to me since all we do is read from the persistence event sink.
Querying the hydra node state in our Model tests to get the Head UTxO (instead of using GetUTxO client input) hangs sometimes and I don't see why. I suspect this has something to do with threads spawned in the model tests:
This is the diff, it looks benign:
waitForUTxOToSpend ::
forall m.
- (MonadTimer m, MonadDelay m) =>
+ MonadDelay m =>
UTxO ->
CardanoSigningKey ->
Value ->
TestHydraClient Tx m ->
m (Either UTxO (TxIn, TxOut CtxUTxO))
-waitForUTxOToSpend utxo key value node = go 100
+waitForUTxOToSpend utxo key value node = do
+ u <- headUTxO node
+ threadDelay 1
+ if u /= mempty
+ then case find matchPayment (UTxO.pairs u) of
+ Nothing -> pure $ Left utxo
+ Just (txIn, txOut) -> pure $ Right (txIn, txOut)
+ else pure $ Left utxo
where
- go :: Int -> m (Either UTxO (TxIn, TxOut CtxUTxO))
- go = \case
- 0 ->
- pure $ Left utxo
- n -> do
- node `send` Input.GetUTxO
- threadDelay 5
- timeout 10 (waitForNext node) >>= \case
- Just (GetUTxOResponse _ u)
- | u /= mempty ->
- maybe
- (go (n - 1))
- (pure . Right)
- (find matchPayment (UTxO.pairs u))
- _ -> go (n - 1)
-
matchPayment p@(_, txOut) =
isOwned key p && value == txOutValue txOut
Model tests sometimes succeed but this is not good enough and we don't want anymore flaky tests.
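A sketch of an alternative, reusing the names from the diff above (so not a verified patch): keep a bounded retry loop around the new `headUTxO` query instead of querying once, which might remove the flakiness without reintroducing `GetUTxO`.

```haskell
-- Sketch only; types and helpers (headUTxO, UTxO.pairs, isOwned, ...) are the
-- ones already used in the diff above.
waitForUTxOToSpend ::
  MonadDelay m =>
  UTxO ->
  CardanoSigningKey ->
  Value ->
  TestHydraClient Tx m ->
  m (Either UTxO (TxIn, TxOut CtxUTxO))
waitForUTxOToSpend utxo key value node = go (10 :: Int)
 where
  go 0 = pure $ Left utxo
  go n = do
    u <- headUTxO node
    threadDelay 1
    case find matchPayment (UTxO.pairs u) of
      Just (txIn, txOut) -> pure $ Right (txIn, txOut)
      Nothing -> go (n - 1)

  matchPayment p@(_, txOut) =
    isOwned key p && value == txOutValue txOut
```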
- Started by investigating
hydra-cluster
tests failing, for example this one erroring with
4) Test.EndToEnd, End-to-end on Cardano devnet, restarting nodes, close of an initial snapshot from re-initialized node is contested
Process "hydra-node (2)" exited with failure code: 1
Process stderr: RunServerException {ioException = Network.Socket.bind: resource busy (Address already in use), host = 0.0.0.0, port = 4002}
- Seems like the
hydra-node
is not shutting down cleanly and scenarios like this - Isolated test scenarios where we simply expect
withHydraNode
to start/stop and restart within a certain time and not fail - Testing these tests on master it worked fine?! Seems to have something to do with
etcd
? - When debugging
withHydraNode
and trying to port it totyped-process
, I noticed that we don't need thewithHydraNode'
variant really -> merged them - Back to the tests.. why are they failing while the
hydra-node
binary seems to behave just fine interactively? - With several
threadDelay
and prints all over the place I saw that thehydra-node
spawnsetcd
as a sub-process, but whenwithProcess
(any of its variants) results instopProcess
, theetcd
child stays alive! - Issuing a ctrl+c on
ghci
has theetcd
process log a signal detected and it shut downs - We are not sending
SIGINT
to theetcd
process? TriedinterruptProcessGroupOf
in theEtcd
module - My handlers (
finally
orbracket
) are not called!? WTF moment - Found this issue which mentions that
withProcess
sendsSIGTERM
that is not handled by default - So the solution is two-fold:
- First, we need to make sure to send
SIGINT
to theetcd
process whenever we are asked to shut down too (in theEtcd
module) - Also, we should initiate a graceful shutdown when the
hydra-node
receivesSIGTERM
- This is a better approach than making
withHydraNode
send aSIGINT
tohydra-node
- While that would work too, dealing
SIGTERM
inhydra-node
is more generally useful - For example a
docker stop
sendsSIGTERM
to the main proces in a container
- This is a better approach than making
- First, we need to make sure to send
- When starting to use
grapesy
I had a conflict ofouroboros-network
needing an oldernetwork
thangrapesy
. Made me drop the ouroboros modules first. - Turns out we still depend transitively on the
ouroboros-network
packages (viacardano-api
), but cabal resolver errors are even wors. - Adding a
allower-newer: network
still works - Is it fine to just use a newer version of
network
in theouroboros-network
? - The commits that bumped the upper bound does not indicate otherwise
- Explicitly listed all packages in
allow-newer
and moved on with life
- Working on
PeerConnected
(or an equivalent) foretcd
network. - Changing the inbound type to
Either Connectivity msg
does not work well with theAuthentication
layer? - The composition using
components
(ADR7: https://hydra.family/head-protocol/adr/7) is quite complicated and only allows for a all-or-nothing interface out of a component without much support for optional parts. - In particular, an
Etcd
component that deliversEither Connectivity msg
asinbound
messages cannot be composed easily with theAuthenticate
component that verifes signatures of incoming messages (it would need to understand that this is anEither
and only do it forRight msg
). - Instead, I explore expanding
NetworkCallback
to not onlydeliver
, but also provide anonConnectivity
callback. - After designing a more composable
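A rough sketch of what such an expanded handle could look like (type names are assumptions, not the actual hydra-node interface):

```haskell
-- Hypothetical shape: the network component reports connectivity changes
-- through the same callback handle it uses to deliver messages.
data Connectivity = Connected {peer :: String} | Disconnected {peer :: String}

data NetworkCallback msg m = NetworkCallback
  { deliver :: msg -> m ()
  , onConnectivity :: Connectivity -> m ()
  }
```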
onConnectivity
handling, I wondered how theEtcd
component would be determinig connectivity. - The
etcdctl
command line tool offers amember list
which returns a list of members if on a majority cluster, e.g.
{"header":{"cluster<sub>id</sub>":8903038213291328342,"member<sub>id</sub>":1564273230663938083,"raft<sub>term</sub>":2},"members":\[{"ID":1564273230663938083,"name":"127.0.0.1:5001","peerURLs":\["<http://127.0.0.1:5001>"\],"clientURLs":\["<http://127.0.0.1:2379>"\]},{"ID":3728543818779710175,"name":"127.0.0.1:5002","peerURLs":\["<http://127.0.0.1:5002>"\],"clientURLs":\["<http://127.0.0.1:2380>"\]}\]}
- But when invoked on a minority cluster it returns
{"level":"warn","ts":"2025-02-17T22:49:48.211708+0100","logger":"etcd-client","caller":"[email protected]/retry<sub>interceptor</sub>.<go:63>","msg":"retrying
of unary invoker
failed","target":"etcd-endpoints://0xc000026000/127.0.0.1:2379","attempt":0,"error":"rpc
error: code = DeadlineExceeded desc = context deadline exceeded"}
Error: context deadline exceeded
- When it cannot connect to an
etcd
instance it returns
{"level":"warn","ts":"2025-02-17T22:49:32.583103+0100","logger":"etcd-client","caller":"[email protected]/retry<sub>interceptor</sub>.<go:63>","msg":"retrying
of unary invoker
failed","target":"etcd-endpoints://0xc0004b81e0/127.0.0.1:2379","attempt":0,"error":"rpc
error: code = DeadlineExceeded desc = latest balancer error: last
connection error: connection error: desc = \\transport: Error while
dialing: dial tcp 127.0.0.1:2379: connect: connection refused\\"}
Error: context deadline exceeded
- When implementing
pollMembers
, suddenly thewaitMessages
was not blocked anymore? - While a litte crude, polling
member list
works nicely to get a full list of members (if we are connected to the majority cluster). - All this will change when we switch to a proper
grpc
client anyways
-
Current problem we want to solve is instead of passing a conduit to
mkProjection
function and running it inside we would like to stream data to all of the projections we have. -
Seems like this is easier said than done since we also rely on a projection result which is a
Projection
handle that is used to update theTVar
inside. -
I thought it might be a good idea to alter
mkProjection
and make it run inConduitT
so it can receive events and propagate them further and then, in the end return theProjection
handle. -
I made changes to the
mkProjection
that compile
mkProjection ::
- (MonadSTM m, MonadUnliftIO m) =>
+ MonadSTM m =>
model ->
-- | Projection function
(model -> event -> model) ->
- ConduitT () event (ResourceT m) () ->
- m (Projection (STM m) event model)
-mkProjection startingModel project eventSource = do
- tv <- newTVarIO startingModel
- runConduitRes $
- eventSource .| mapM_C (lift . atomically . update tv)
- pure
+ ConduitT event (Projection (STM m) event model) m ()
+mkProjection startingModel project = do
+ tv <- lift $ newTVarIO startingModel
+ meventSource <- await
+ _ <- case meventSource of
+ Nothing -> pure ()
+ Just eventSource ->
+ void $ yield eventSource .| mapM_C (atomically . update tv)
+ yield $
Projection
{ getLatest = readTVar tv
, update = update tv
but the main issue is that I can't get the results of all projections we need in the end that easy.
-- does not compile
headStatusP <- runConduitRes $ yield outputsC .| mkProjection Idle projectHeadStatus
- We need to be able to process streamed data from disk and also output like 5 of these projections that do different things.
- I discovered
sequenceConduits
which allows collection of the conduit result values. - Idea was to collect all projections which have the capability of receiving events as the conduit input.
[headStatusP] <- runConduit $ sequenceConduits [mkProjection Idle projectHeadStatus] >> sinkList
- Oh, just realized
sequenceConduits
need to have exactly the same type so my plan just failed
I think I need to revisit our approach and start from scratch.
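For reference, `ZipSink` (which the node already uses when loading state) can feed one stream into several differently-typed folds, which is exactly what `sequenceConduits` cannot do here; a small standalone example:

```haskell
import Conduit

-- One input stream, two "projections" with different result types.
main :: IO ()
main = do
  (total, count) <-
    runConduit $
      yieldMany [1 .. 10 :: Int]
        .| getZipSink ((,) <$> ZipSink sumC <*> ZipSink lengthC)
  print (total, count :: Int) -- (55,10)
```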
-
So what we want to do is to reduce the memory footprint in hydra-node as the final outcome
-
There are couple of ADRs related to persisting stream of events and having different sinks that can read from the streams
-
Our API needs to become one of these event sinks
-
The first step is to prevent history output by default as history can grow pretty large and it is all kept in memory
-
We need to remove ServerOutput type and map all missing fields to StateChange type since that is what we will use to persist the changes to disk
-
I understand that we will keep existing projections but they will work on the StateChange type and each change will be forwarded to any existing sinks as the state changes over time
- We already have a `PersistenceIncremental` type that appends to disk; can we use a similar handle? Most probably yes, but we need to pick the most performant functions to write/read to/from disk (a rough sketch of such a handle is below).
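- Roughly the shape I have in mind for such an append-only handle (field names are illustrative, not necessarily the actual hydra-node definitions):

```haskell
-- An append-only persistence handle: append one item at a time and load
-- everything back from disk when needed.
data PersistenceIncremental a m = PersistenceIncremental
  { append :: a -> m ()
  , loadAll :: m [a]
  }
```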
- Seems like we currently use `eventPairFromPersistenceIncremental` to set up the event stream/sink. What we do is load all events from disk. We also have a `TVar` holding the event id. Ideally we would like to output every new event in our API server. I should take a look at our projections to see how we output individual messages.
- Ok, yeah, the projections are displaying the last message, but looking at this code I realize how complex everything is. We should strive for simplicity here.
- Another thought: would it help us to use Servant, at least to separate the routing and the handlers? I think it could help, but on the other hand Servant can get crazy complex really fast.
- So after looking at the relevant code and the issue https://github.com/cardano-scaling/hydra/issues/1618, I believe the most complex thing would be this: "Websocket needs to emit this information on new state changes." But even this is not hard, I believe, since we have control over what happens when setting up the event source/sink pair; a hedged sketch follows.
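- A hedged sketch of that idea: an extra event sink that broadcasts every persisted state change over an STM channel which each websocket connection subscribes to. The `EventSink` shape and the names are illustrative, not the actual hydra-node API.

```haskell
import Control.Concurrent.STM
  (atomically, dupTChan, newBroadcastTChanIO, readTChan, writeTChan)

-- Illustrative sink type: 'putEvent' is called for every newly persisted event.
newtype EventSink e = EventSink {putEvent :: e -> IO ()}

-- Returns the sink to hook into the event stream plus a 'subscribe' action;
-- each websocket connection subscribes once and then blocks on the returned
-- action to receive every subsequent event.
mkWebsocketSink :: IO (EventSink e, IO (IO e))
mkWebsocketSink = do
  chan <- newBroadcastTChanIO
  let sink = EventSink{putEvent = atomically . writeTChan chan}
      subscribe = do
        myChan <- atomically $ dupTChan chan
        pure (atomically $ readTChan myChan)
  pure (sink, subscribe)
```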
- Streaming events using `conduit` makes us buy into the `unliftio` and `resourcet` environment. Does this go well with our `MonadThrow` et al. classes?
- When using conduits in `createHydraNode`, the `runConduitRes` requires a `MonadUnliftIO` context. We have an `IOSim` usage of this though, and it's not clear whether there can even be a `MonadUnliftIO (IOSim s)` instance.
- We are not only loading `[StateEvent]` fully into memory, but also `[ServerOutput]`.
- Made `mkProjection` take a conduit, but then we are running it once per projection (3 times). Should do something with `fuseBoth` or a zip-like conduit combination (see the sketch below).
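- A hedged sketch of the zip-like combination using `ZipConduit` from the conduit package: its `Applicative` instance feeds each input element to every component conduit, so the event stream is only traversed once (names are illustrative, this is not the hydra-node code):

```haskell
import Data.Conduit (ConduitT, ZipConduit (..), runConduit, (.|))
import Data.Void (Void)

-- Run two event consumers over a single pass of the source and return both
-- results; more consumers combine the same way via the Applicative instance.
runTwoProjections ::
  Monad m =>
  ConduitT () event m () ->
  ConduitT event Void m a ->
  ConduitT event Void m b ->
  m (a, b)
runTwoProjections source p1 p2 =
  runConduit $
    source .| getZipConduit ((,) <$> ZipConduit p1 <*> ZipConduit p2)
```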
- Started simplifying the `hydra-explorer` and wanted to get rid of all `hydra-node`, `hydra-tx` etc. dependencies because they include most of the cardano ecosystem. However, on the observer API we will need to refer to cardano specifics like `UTxO` and some hydra entities like `Party` or `HeadId`. So a dependency onto `hydra-tx` is most likely needed.
- Shouldn't these hydra-specific types be in an actual `hydra-api` package? The `hydra-tx` or a future `hydra-client` could depend on that then.
- When defining the observer API I was reaching for the `OnChainTx` data type as it has JSON instances and enumerates the things we need to observe. However, this would mean we need to depend on `hydra-node` in the `hydra-explorer`.
- Could use the `HeadObservation` type, but that one is maybe a bit too low level and does not have JSON instances?
- `OnChainTx` is really the level of detail we want (instantiated for cardano transactions, but not corrupted by cardano-internal specifics).
- Logging in the main entry point of `Hydra.Explorer` depends on `hydra-node` anyways. We could be exploring something different to get rid of this? Got https://hackage.haskell.org/package/Blammo recommended to me.
- Got everything to compile (with a cut-off `hydra-chain-observer`). Now I want to have an end-to-end integration test for `hydra-explorer` that does not concern itself with individual observations, but rather checks that the (latest) `hydra-chain-observer` can be used with `hydra-explorer`. That, plus some (golden) testing against the `openapi` schemas, should be enough test coverage.
- Modifying the `hydra` and `hydra-explorer` repositories to integration test the new http-based reporting.
  - Doing so offline from a plane is a bit annoying as both `nix` and `cabal` would be pulling dependencies from the internet.
  - Working around using an alias to the `cabal`-built binary:

```sh
alias hydra-chain-observer=../../hydra/dist-newstyle/build/x86_64-linux/ghc-9.6.6/hydra-chain-observer-0.19.0/x/hydra-chain-observer/build/hydra-chain-observer/hydra-chain-observer
```
- `cabal repl` is not picking up the `alias`, maybe need to add it to `PATH`?
- Adding an `export PATH=<path to binary>:$PATH` to `.envrc` is quite convenient.
- After connecting the two servers via a bounded queue, the test passes, but sub-processes are not gracefully stopped.
- I created a relevant issue to track this new feature request to enable stake certificates on the L2 ledger.
- Didn't plan on working on this right away, but wanted to explore a problem with `PPViewHashesDontMatch` when trying to submit a new tx on L2.
- This happens both when obtaining the protocol-parameters from the hydra-node and when I query them from the cardano-node (the latter is expected to fail on L2 since we reduce the fees to zero).
- I added a line to print the protocol-parameters in our tx printer, and it seems like `changePParams` is not setting the protocol-parameters correctly for whatever reason:
```haskell
changePParams :: PParams (ShelleyLedgerEra Era) -> TxBodyContent BuildTx -> TxBodyContent BuildTx
changePParams pparams tx =
  tx{txProtocolParams = BuildTxWith $ Just $ LedgerProtocolParameters pparams}
```
- There is `setTxProtocolParams` that I should probably use instead; a hedged sketch follows.
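- A minimal sketch of that change, assuming cardano-api's `setTxProtocolParams` setter is in scope; it should be equivalent to the record update above:

```haskell
changePParams :: PParams (ShelleyLedgerEra Era) -> TxBodyContent BuildTx -> TxBodyContent BuildTx
changePParams pparams =
  setTxProtocolParams (BuildTxWith $ Just $ LedgerProtocolParameters pparams)
```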
- I even compared the protocol-parameters loaded into the hydra-node and the ones I get back from hitting the hydra-node api and they are the same as expected
- Running out of ideas
- I want to know why I get mismatch between pparams on L2?
- It is because we start the hydra-node in a separate temp directory from the test driver so I got rid of the problem by querying hydra-node to obtain L2 protocol-parameters
- The weird issue I get is that the budget is overspent and it seems bumping
the
ExecutionUnits
doesn't help at all. - When pretty-printing the L2 tx I noticed that cpu and memory for cert redeemer are both zero so that must be the source of culprit
- Adding separately cert redeemer fixed the issue but I am now back to
PPViewHashesDontMatch
. - Not sure why this happens since I am doing a query to obtain hydra-node protocol parameters and using those to construct the transaction.
- Note that even if I don't change protocol-parameters the error is the same
- This whole chunk of work is to register a script address as a stake certificate and I still need to try to withdraw zero after this is working.
- One thing I wanted to do is to use the dummy script as the provided Data in the Cert Redeemers - is this even possible?
-
- When trying to align the `aiken` version in our repository with what is generated into `plutus.json`, I encountered errors in `hydra-tx` tests even with the same aiken version as claimed.
Error:
Expected the B constructor but got a different one
-
Seems to originate from
plutus-core
when it tries to run the builtinunBData
on data that is not a B (bytestring) -
The full error in
hydra-tx
tests actually includes what it tried tounBData
:Caused by: unBData (Constr 0 [ Constr 0 [ List [ Constr 0 [ Constr 0 [ B #7db6c8edf4227f62e1233880981eb1d4d89c14c3c92b63b2e130ede21c128c61 , I 21 ] , Constr 0 [ Constr 0 [ Constr 0 [ B #b0e9c25d9abdfc5867b9c0879b66aa60abbc7722ed56f833a3e2ad94 ] , Constr 1 [] ] , Map [(B #, Map [(B #, I 231)])] , Constr 0 [] , Constr 1 [] ] ] , Constr 0 ....
This looks a lot like a script context. Maybe something off with validator arguments? -
How can I inspect the uplc of an aiken script?
-
It must be the "compile-time" parameter of the initial script, which expects the commit script hash. If we use that unapplied on the transaction, the script context trips the validator code.
-
- How was the `initialValidatorScript` used on master such that these tests / usages pass?
- Ahh.. someone applied the commit script parameter and stored the resulting script in the `plutus.json`! Most likely using `aiken blueprint apply -v initial` and then passing the output of `aiken blueprint hash -v commit` into that; presumably something like the sketch below.
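- Presumably something along these lines (a guess at the exact invocation, not verified):

```sh
# apply the commit validator's hash as the compile-time parameter of the
# initial validator, updating plutus.json
aiken blueprint apply -v initial "$(aiken blueprint hash -v commit)"
```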
- Realized that the `plutus.json` blueprint would have said that a script has `parameters`; an illustrative example follows.
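- For illustration, a parameterised validator entry in a CIP-57 style `plutus.json` looks roughly like this (shortened, values are placeholders):

```json
{
  "validators": [
    {
      "title": "initial.initial",
      "parameters": [
        {
          "title": "commit_script_hash",
          "schema": { "dataType": "bytes" }
        }
      ],
      "compiledCode": "...",
      "hash": "..."
    }
  ]
}
```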