Opportunity to save 1 instruction from mcycle checks #308

edubart · 2025-01-25T16:55:49Z

Context

Currently the interpreter hot loop does:

while (mcycle < mcycle_tick_end) {
    // Fetch, decode, execute
    mcycle++;
}

But it could be simplified to something like:

uint64_t remaining = mcycle_tick_end - mcycle;
mcycle += remaining;
for (; remaining > 0; remaining--) {
    // Fetch, decode, execute
}

This may save one instruction in the interpreter's hot inner loop on both amd64 and arm64, because the decrement (a SUB instruction) can double as the loop test; see https://godbolt.org/z/MvPGYscaP for a PoC. To do this, I will need to stop propagating mcycle on every memory access instruction. I may also need to introduce an mtime CSR that is incremented on every RTC tick, which removes the need to propagate mcycle to client device when using rtc_cycle_to_time(a->read_mcycle()).
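To make the equivalence of the two loop shapes concrete, here is a minimal sketch in C. It is not the emulator's actual code: step() is a hypothetical stand-in for "fetch, decode, execute" that only counts calls, and the function names run_compare and run_countdown are made up for illustration. The early-return guard in the countdown version is my own addition, since the unsigned subtraction would wrap around if mcycle were already past mcycle_tick_end.

```c
#include <stdint.h>

/* Hypothetical stand-in for "fetch, decode, execute"; it only counts calls. */
static uint64_t executed;
static void step(void) { executed++; }

/* Current shape: compare mcycle against mcycle_tick_end on every iteration. */
uint64_t run_compare(uint64_t mcycle, uint64_t mcycle_tick_end) {
    while (mcycle < mcycle_tick_end) {
        step();
        mcycle++;
    }
    return mcycle;
}

/* Proposed shape: bump mcycle once up front, then count a budget down to zero. */
uint64_t run_countdown(uint64_t mcycle, uint64_t mcycle_tick_end) {
    if (mcycle >= mcycle_tick_end) {
        return mcycle; /* assumption: guard against unsigned wraparound */
    }
    uint64_t remaining = mcycle_tick_end - mcycle;
    mcycle += remaining; /* mcycle already holds its end-of-tick value */
    for (; remaining > 0; remaining--) {
        step();
    }
    return mcycle;
}
```

Both functions execute the same number of steps and return the same final mcycle. The difference is that in the countdown version mcycle is not accurate while the loop is running, which is why anything that reads mcycle mid-tick would have to stop depending on it.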

Furthermore, this will free up the register currently reserved for mcycle_tick_end, giving the optimizer one more register to allocate inside the interpreter's hot loop.

When doing this, it's worth experimenting with increasing RTC_FREQ_DIV_DEF from 8192 to 16384, since the interpreter's outer loop will start performing a write to mtime on every tick. Also, the interpreter recently got 2x speedups, to the point where time inside the machine advances too fast during intensive computations. Ideally the RTC frequency should be chosen so that time inside the machine passes at close to the rate it passes on the host.
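As a rough illustration of the mtime idea, the sketch below shows an outer loop that advances tick by tick and writes mtime once per tick boundary. Everything here is an assumption made for illustration: the machine_state struct and run_ticks function are hypothetical, rtc_cycle_to_time is assumed to be a plain division by the RTC divider, and RTC_FREQ_DIV uses the 8192 value mentioned above. The inner interpreter loop is replaced by a direct assignment.

```c
#include <stdint.h>

#define RTC_FREQ_DIV 8192u /* value cited above for RTC_FREQ_DIV_DEF */

/* Assumed shape of the conversion: one time unit per RTC_FREQ_DIV cycles. */
static uint64_t rtc_cycle_to_time(uint64_t mcycle) {
    return mcycle / RTC_FREQ_DIV;
}

/* Hypothetical machine state holding both counters. */
typedef struct {
    uint64_t mcycle;
    uint64_t mtime;
} machine_state;

/* Outer loop sketch: run up to mcycle_end, one RTC tick at a time. */
void run_ticks(machine_state *s, uint64_t mcycle_end) {
    while (s->mcycle < mcycle_end) {
        /* End of the current tick, clamped to the requested end. */
        uint64_t tick_end = (s->mcycle / RTC_FREQ_DIV + 1) * RTC_FREQ_DIV;
        if (tick_end > mcycle_end) {
            tick_end = mcycle_end;
        }
        s->mcycle = tick_end; /* the inner interpreter loop would run here */
        if (s->mcycle % RTC_FREQ_DIV == 0) {
            s->mtime++; /* one mtime write per completed tick */
        }
    }
}
```

If mtime starts out consistent with mcycle, it stays equal to rtc_cycle_to_time(mcycle) after every call, so a device could read mtime directly instead of deriving time from mcycle. Doubling the divider would halve how often the mtime write happens.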

This idea is something I've had for a while, and it has been briefly discussed internally. I am writing it down as an issue so I do not forget to attempt it someday.


edubart · 2025-01-25T17:28:40Z

This is also directly related to #104

edubart added the optimization label Jan 25, 2025

edubart self-assigned this Jan 25, 2025

edubart added this to Machine Emulator SDK Jan 25, 2025

github-project-automation bot moved this to Todo in Machine Emulator SDK Jan 25, 2025

edubart linked a pull request Jan 31, 2025 that will close this issue

Add wall clock time CSR and optimize interpreter #309

Draft

edubart moved this from Todo to PR Available in Machine Emulator SDK Jan 31, 2025
