Checkmate is a tool to perform mutation analysis of your Solidity codebase. It
tests the mutants generated by Gambit.
Checkmate is capable of automatically generating the gambit_config.json
files, generating the mutants with Gambit's mutate
command and testing the
generated mutations. The goal of Checkmate is to provide a hands-off experience
for security researchers and developers to run a single command, go do something
else and come back to see the analysis results.
- You must have Go >=1.23 installed.
- You must have Certora's Gambit installed.
- You must have something to manage your local solidity compiler version. For example solc-select from Trail of Bits.
You can use the go's toolchain to install checkmate
. Install it with:
go install github.com/ChmielewskiKamil/checkmate@latest
Review the content of this repo, clone it and run go build
.
Check your version with:
checkmate --version
Upgrade with:
go install github.com/ChmielewskiKamil/checkmate@latest
Removing checkmate
from your OS is similar to other standalone Go binaries.
List files to confirm checkmate
is in the default location for Go binaries.
ls $HOME/go/bin
rm $HOME/go/bin/checkmate
List files to confirm checkmate.exe
is in the default location for Go
binaries.
dir %USERPROFILE%\go\bin
del %USERPROFILE%\go\bin\checkmate.exe
Navigate to the project that you want to analyze.
The usage of Checkmate is split in two stages. When you run the checkmate
for the first time it will generate you the gambit_config.json
file. This is
what Gambit requires to generate the modified versions of your code (mutants).
When using Gambit alone the config generation is time consuming as you have to
configure the remappings and select the files that you want to analyze.
Checkmate is doing that for you.
Once the config is generated you can run the checkmate
command for the second
time. This time it will see that gambit_config.json
is ready and will attempt
to generate the mutations.
The Qwen2.5-Coder-7B-Instruct
gives superior output and is fast. On an M1
MacBook Pro with 16 GBs of RAM it is able to analyze each missing test case in
~20 seconds.
DeepSeek-R1-Distill-Qwen-7B
gives decent results but is slow. It takes ~40-60 seconds to analyze a missing test case. This is a thinking model so it also prints the thought process output. The output format is unreliable and relatively hard to control while theQwen2.5-Coder-7B-Instruct
one-shots each test case even without examples.