Feature request: let applications use accelerators without any code change #4221

Closed
dingwei-2017 opened this issue Dec 20, 2024 · 1 comment
@dingwei-2017

Many CPU vendors now ship hardware accelerators for functions such as compression and decompression. These accelerators offload work from the CPU, leaving it free for other computing tasks, and their purpose-built design can provide higher throughput and lower latency. ZSTD 1.5.4 introduced the external sequence producer mechanism to help make use of such accelerators, which is good. However, applications still have to make minor code changes (such as registering an external sequence producer with libzstd), which makes them less willing to adopt the feature. Would the community consider going one step further, so that upper-layer applications can benefit from an accelerator without modifying any code? OpenSSL does something similar by letting users define their own providers.

Looking forward to your reply. Thanks very much.

Cyan4973 self-assigned this Jan 23, 2025
@Cyan4973
Contributor

Cyan4973 commented Jan 23, 2025

This is a difficult question, and it's not possible to provide a definitive answer, since this is still under active development.

In short, the idea that the mere presence of a hardware accelerator lets an application transparently switch from the software to the hardware is not correct, on multiple levels.

To begin with, a hardware accelerator is not a replacement for the software: such an accelerator tends to be good, if not very good, at certain things, less stellar at others, and completely incompatible with many more. A classic distinction is that hardware can be good at compressing small blocks quickly, but not great at compressing large blocks, and unable to reach high compression ratios.

Therefore, the next best idea that comes to mind is to use the hardware accelerator opportunistically: when the software detects that it is in a "good" scenario where the hardware is beneficial, it uses the hardware; otherwise it stays on the software path.

This is better, but there are also drawbacks: how reliable is this opportunistic usage?
Unfortunately, use cases can move from one category to another due to minute implementation details that are quite difficult for general users to understand or guess. For example, maybe a flag is not supported, or conversely maybe it is always forced on. Maybe the hardware supports dictionaries, or maybe not, or maybe it depends on size constraints. Either way, it can get pretty complex.
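
To make the fragility concrete, here is a minimal sketch of what an opportunistic eligibility check could look like. Every name in it (`AccelCaps`, `CompressRequest`, `choose_path`) is hypothetical and invented for illustration; none of it is libzstd API, and a real accelerator would likely have more constraints than these.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of what an accelerator supports.
 * These names are illustrative only and do not exist in libzstd. */
typedef struct {
    size_t max_block_size;      /* largest input the device accepts */
    int    max_level;           /* highest compression level it can serve */
    bool   supports_dictionary;
    size_t max_dict_size;       /* only meaningful if supports_dictionary */
} AccelCaps;

/* Hypothetical description of one compression request. */
typedef struct {
    size_t src_size;
    int    level;
    size_t dict_size;           /* 0 means no dictionary */
} CompressRequest;

typedef enum { PATH_SOFTWARE, PATH_HARDWARE } CompressPath;

/* Opportunistic choice: hardware only when every constraint holds,
 * software in all other cases. */
static CompressPath choose_path(const AccelCaps *caps, const CompressRequest *req)
{
    if (caps == NULL) return PATH_SOFTWARE;            /* no device present */
    if (req->src_size > caps->max_block_size) return PATH_SOFTWARE;
    if (req->level > caps->max_level) return PATH_SOFTWARE;
    if (req->dict_size > 0) {
        if (!caps->supports_dictionary) return PATH_SOFTWARE;
        if (req->dict_size > caps->max_dict_size) return PATH_SOFTWARE;
    }
    return PATH_HARDWARE;
}
```

In this sketch, raising the compression level by one, or attaching a slightly larger dictionary, silently moves a request from the hardware path to the software path, and nothing in the calling code reveals that it happened.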

Therefore, can a real-world use case reliably depend on it? If the application has to make specific efforts to fit within the supported cases in order to employ the hardware accelerator, wouldn't it be more direct to simply state that it wants to use the accelerator? None of these steps is "modification free". And what about other properties, such as reproducibility between runs or across systems?

In short, there are many questions associated with the goal of switching to hardware compression "transparently", and they can be difficult to answer, let alone solve.

Requiring the software to "make an effort" and say "I want to use the hardware accelerator" is one way to avoid taking on some of this complexity. But it means that the software must be updated (and there is definitely work to be done on making such a modification as easy as possible).
For a normal end user on a laptop, such a requirement is a tall order: the software is what it is, and it can't be changed.
But for large data centers, the situation is very different: they typically have the source code, they can modify and recompile it, and they value clear control over loosely defined potential outcomes. Priorities are different.
Given that these accelerators are currently more common in server CPUs than in client CPUs, the priorities of the data center community are the ones to be served for now.
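
As a rough illustration of what "clear control" could mean in practice, the sketch below has the application state its intent explicitly and get an error, not a silent fallback, when the hardware cannot serve the request. All names (`AccelMode`, `Outcome`, `resolve_path`) are hypothetical and are not part of libzstd; the actual zstd mechanism referred to in this issue is the external sequence producer.

```c
#include <stdbool.h>

/* Hypothetical names throughout; none of this is libzstd API. */

/* What the application asks for, stated explicitly in its own code. */
typedef enum {
    ACCEL_OFF,       /* always use the software path */
    ACCEL_REQUIRED   /* use the hardware path, or report an error */
} AccelMode;

typedef enum {
    OUTCOME_SOFTWARE,
    OUTCOME_HARDWARE,
    OUTCOME_ERROR_UNSUPPORTED
} Outcome;

/* Resolve a request with no silent fallback: the caller always knows
 * which path ran, or that the requested path was unavailable. */
static Outcome resolve_path(AccelMode mode, bool request_supported_by_device)
{
    if (mode == ACCEL_OFF) return OUTCOME_SOFTWARE;
    return request_supported_by_device ? OUTCOME_HARDWARE
                                       : OUTCOME_ERROR_UNSUPPORTED;
}
```

The cost is exactly the one discussed above: the application has to be modified to express the mode. The benefit is that the outcome is predictable, which is what a data center operator would typically want.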
