
Performance degradation running with half a socket in CPU system #1098

Open
@azhuvath

Description

Given a small context (a paragraph of fewer than 100 words), we try to answer a query. There are 5 such queries, and the overall time taken is recorded. The experiment was run in three configurations on a dual-socket machine: full system, full socket, and half a socket (CPUs only, no accelerators). I am seeing strange behavior: performance degrades considerably when scaling down from a full socket to half a socket.

I ran the same experiment using Intel Extension for PyTorch (IPEX), but there I don't see the performance degradation when moving from a full socket to half a socket. I am attaching graphs to illustrate the behavior.

Note: I could not run the IPEX experiment on the 9480 due to some local issues. All timings are the average of 5 runs.
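For anyone trying to reproduce this, the measurement described above (total time for 5 queries, averaged over 5 runs) can be sketched roughly as follows. This is only an illustration, not the script used for the graphs: `answer` is a hypothetical callable standing in for whatever backend is being measured (Llama CPP or IPEX), and model loading is assumed to happen before timing starts.

```python
import statistics
import time
from typing import Callable, Sequence


def time_queries(answer: Callable[[str, str], str],
                 context: str,
                 queries: Sequence[str]) -> float:
    """Return the wall-clock seconds needed to answer every query once."""
    start = time.perf_counter()
    for query in queries:
        # `answer` is a hypothetical stand-in for the backend under test.
        answer(context, query)
    return time.perf_counter() - start


def average_over_runs(answer: Callable[[str, str], str],
                      context: str,
                      queries: Sequence[str],
                      runs: int = 5) -> float:
    """Repeat the whole query batch `runs` times and return the mean time."""
    timings = [time_queries(answer, context, queries) for _ in range(runs)]
    return statistics.mean(timings)
```

Timing only the query loop keeps model loading out of the number, which is one reason a figure like this can differ from the elapsed time a profiler reports.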

System Details
8380 - Intel® Xeon® Platinum 8380 Processor
8480 - Intel® Xeon® Platinum 8480+ Processor
9480 - Intel® Xeon® CPU Max 9480 Processor
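To make the "half a socket" configuration concrete, here is a minimal sketch of how a process could be restricted to half the cores of one socket on Linux. It is not necessarily how the runs in this issue were pinned. The helper names are made up for illustration, and the code assumes that logical CPU IDs are numbered contiguously within a socket, which should be verified with `lscpu` on the actual machine (hyperthread siblings are often numbered in a separate range). It also does not control which NUMA node memory is allocated from.

```python
import os
from typing import Set


def half_socket_cpus(cores_per_socket: int, socket_index: int = 0) -> Set[int]:
    """Return the CPU IDs for the first half of the given socket.

    Assumes contiguous CPU numbering per socket (an assumption, check lscpu).
    """
    first = socket_index * cores_per_socket
    return set(range(first, first + cores_per_socket // 2))


def pin_current_process(cpus: Set[int]) -> None:
    """Restrict the current process to `cpus` (Linux only)."""
    if not hasattr(os, "sched_setaffinity"):
        raise OSError("os.sched_setaffinity is not available on this platform")
    os.sched_setaffinity(0, cpus)
```

For example, with an assumed 40 cores per socket, `half_socket_cpus(40)` yields CPUs 0 to 19. The thread count given to the inference runtime would normally be set to match the size of the pinned set.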

Performance observed using Llama CPP on three different systems (8380, 8480, and 9480)
image

Performance observed using IPEX on two different systems (8380, 8480)
image

The graphs above show that moving from a full socket to half a socket has a huge impact with Llama CPP, whereas it has very little impact with IPEX. Any ideas why this happens with Llama CPP and not with IPEX?

I captured system details while running the experiments using the VTune Application Performance Snapshot (APS) tool. The elapsed time reported by APS differs from the times in the graphs because elapsed time includes model loading and other activities. I am attaching the APS snapshots.

Full System
image

Full Socket
image

Half Socket
image

I am not sure why the DRAM bandwidth drops so considerably in the half-socket case for Llama CPP. This behavior is not seen with IPEX: the graph clearly shows that the impact of moving from a full socket to half a socket is very gradual with IPEX, but not with Llama CPP.
