Description
I have noticed an issue when running MultiPIM in host mode with multiple programs (multiple copies of the same program, for example, mcf x8). An assertion fails in the 'T* insert(Address vpage_no, T& entry)' function in tlb/common_tlb.h, specifically 'assert( !tlb_trie.count(vpage_no));'.
Upon debugging, I found no logic errors in the code itself. Instead, the assertion failure appears to be a race condition: parts of the TLB code seem to run concurrently without synchronization. I came to this conclusion because the failure became much less frequent when I defined and called "debug_print()", which presumably changes the timing of execution.
I resolved the assertion failure by acquiring 'tlb_lock' with 'futex_lock' in the 'shootdown(Address vpn)' function in tlb/common_tlb.h. The two added lines are marked with '// add':
    uint32_t shootdown(Address vpn)
    {
        T* entry = NULL;
        entry = look_up(vpn);
        futex_lock(&tlb_lock); // add
        if(entry)
        {
            entry->set_invalid();
            tlb_trie.erase(vpn);
            tlb_trie_pa.erase(entry->p_page_no);
            free_entry_list.push_back(entry);
        }
        futex_unlock(&tlb_lock); // add
        uint32_t shootdown_lat=0;
        if(enable_timing_mode)
            shootdown_lat = hit_latency;
        return shootdown_lat;
    }
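
To illustrate why the lock helps, here is a minimal, self-contained sketch of the same pattern. It is not MultiPIM code: 'ToyTlb', its two maps, and 'stress_toy_tlb' are hypothetical names, and 'std::mutex' stands in for zsim's futex lock. It also takes the lock before the lookup rather than after it, so that the check and the erase form one critical section. Whether that ordering can be applied to 'shootdown' directly depends on whether 'look_up' acquires 'tlb_lock' itself, which I have not verified.

```cpp
#include <cstdint>
#include <mutex>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the TLB: two maps that must stay in sync,
// similar in spirit to tlb_trie and tlb_trie_pa.
struct ToyTlb {
    std::mutex lock;  // assumption: plays the role of tlb_lock
    std::unordered_map<uint64_t, uint64_t> vpn_to_ppn;
    std::unordered_map<uint64_t, uint64_t> ppn_to_vpn;

    // Returns false instead of asserting when the vpn is already present.
    bool insert(uint64_t vpn, uint64_t ppn) {
        std::lock_guard<std::mutex> guard(lock);
        if (vpn_to_ppn.count(vpn)) return false;
        vpn_to_ppn[vpn] = ppn;
        ppn_to_vpn[ppn] = vpn;
        return true;
    }

    // Lookup and erase happen under the same lock, so no other thread
    // can observe one map updated and the other not.
    bool shootdown(uint64_t vpn) {
        std::lock_guard<std::mutex> guard(lock);
        auto it = vpn_to_ppn.find(vpn);
        if (it == vpn_to_ppn.end()) return false;
        ppn_to_vpn.erase(it->second);
        vpn_to_ppn.erase(it);
        return true;
    }

    bool consistent() {
        std::lock_guard<std::mutex> guard(lock);
        return vpn_to_ppn.size() == ppn_to_vpn.size();
    }
};

// Several threads insert and shoot down the same small set of pages.
// Returns true if both maps still have the same number of entries.
bool stress_toy_tlb() {
    ToyTlb tlb;
    std::vector<std::thread> workers;
    for (int t = 0; t < 4; ++t) {
        workers.emplace_back([&tlb]() {
            for (int i = 0; i < 10000; ++i) {
                uint64_t vpn = static_cast<uint64_t>(i % 16);
                tlb.insert(vpn, vpn + 1000);
                tlb.shootdown(vpn);
            }
        });
    }
    for (auto& w : workers) w.join();
    return tlb.consistent();
}
```

Calling 'stress_toy_tlb()' from a test program exercises the contended path. Removing the 'std::lock_guard' lines turns it into a data race with undefined behavior, which matches the kind of timing-dependent failure described above.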
I would appreciate it if you could consider this fix for your next patch. Thank you for your attention to this matter.