Perf counters used to measure and optimize a workload:
- SMT/HT mode on P8 or x86: front-end stalls while the back end is busy
- Hugepages effect: minor page faults
- Optimal working-set size: cache misses across different iterations of the same execution
- Scaling across cores: instructions per second
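
As a rough illustration of how the raw counters above might be turned into comparable numbers, the sketch below derives one metric per question. It assumes the counts were already collected (for example with `perf stat`) and stored under the perf event names `cycles`, `instructions`, `stalled-cycles-frontend`, and `minor-faults`; the front-end stall event may not be exposed on every CPU. The function names and all sample values are hypothetical, not taken from any real measurement.

```python
from typing import Dict, List


def derived_metrics(counts: Dict[str, float], elapsed_s: float) -> Dict[str, float]:
    """Turn raw counter totals into ratios and rates (assumed event names)."""
    return {
        # SMT/HT question: share of cycles in which the front end was stalled.
        "frontend_stall_fraction": counts["stalled-cycles-frontend"] / counts["cycles"],
        # Hugepages question: minor page faults per second of run time.
        "minor_faults_per_s": counts["minor-faults"] / elapsed_s,
        # Scaling question: instructions retired per second.
        "instructions_per_s": counts["instructions"] / elapsed_s,
    }


def miss_trend(misses_per_iteration: List[float]) -> List[float]:
    """Working-set question: cache misses of each iteration relative to the first."""
    first = misses_per_iteration[0]
    return [m / first for m in misses_per_iteration]


def scaling_efficiency(ips_one_core: float, ips_n_cores: float, n: int) -> float:
    """Fraction of ideal linear speedup reached on n cores (1.0 = perfect scaling)."""
    return ips_n_cores / (n * ips_one_core)


if __name__ == "__main__":
    # Made-up sample counts, for illustration only.
    sample = {
        "cycles": 1000.0,
        "stalled-cycles-frontend": 250.0,
        "minor-faults": 40.0,
        "instructions": 2000.0,
    }
    print(derived_metrics(sample, elapsed_s=2.0))
    print(miss_trend([100.0, 50.0, 25.0]))
    print(scaling_efficiency(1000.0, 3000.0, n=4))
```

Comparing these values between two configurations (SMT on vs. off, hugepages on vs. off, one core vs. many) is usually more informative than reading any single run in isolation.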