dynamic_modules: add remote HTTP support for asyncDataSource module binary#43818
dynamic_modules: add remote HTTP support for asyncDataSource module binary#43818kanurag94 wants to merge 6 commits intoenvoyproxy:mainfrom
Conversation
Signed-off-by: Anurag Aggarwal <kanurag94@gmail.com>
Signed-off-by: Anurag Aggarwal <kanurag94@gmail.com>
Signed-off-by: Anurag Aggarwal <kanurag94@gmail.com>
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces the ability to fetch dynamic modules from remote HTTP sources asynchronously. However, critical security vulnerabilities were identified in the caching and temporary file handling mechanisms. These include the use of the world-writable /tmp directory with predictable filenames and a lack of integrity verification for cached files, which could lead to local privilege escalation and arbitrary code execution. The staging file creation is also vulnerable to symlink attacks and race conditions. Furthermore, a critical use-after-free bug was found in the asynchronous callback implementation.
| [weak_state, proto_config_copy, module_config_copy, &context, | ||
| &scope](const std::string& data) { |
There was a problem hiding this comment.
The lambda for the RemoteAsyncDataProvider captures context and scope by reference. These references point to objects that are typically allocated on the stack during the configuration loading process. Since the lambda is executed asynchronously after the remote data fetch completes, the original stack frame will be gone, leading to dangling references and a use-after-free. This is a critical issue that could lead to crashes.
To resolve this, you should avoid capturing stack variables by reference in an asynchronous callback. Instead, you should extract the required long-lived objects (e.g., from context.api(), context.mainThreadDispatcher(), etc.) and capture those. This will likely require refactoring buildFilterFactoryCallback to accept these individual components instead of the entire ServerFactoryContext.
There was a problem hiding this comment.
ServerFactoryContextis long-lived and not stack allocated.stats::scopeis owned by listener and is again long lived.- RemoteAsyncDataProvider is fired during init so both references are guaranteed valid.
This comment seems wrong.
Signed-off-by: Anurag Aggarwal <kanurag94@gmail.com>
Signed-off-by: Anurag Aggarwal <kanurag94@gmail.com>
|
/retest |
1 similar comment
|
/retest |
|
CC @envoyproxy/coverage-shephards: FYI only for changes made to |
test/coverage.yaml
Outdated
| source/extensions/filters/http/cache: 95.9 | ||
| source/extensions/filters/http/dynamic_forward_proxy: 94.8 | ||
| source/extensions/filters/http/dynamic_modules: 95.2 | ||
| source/extensions/filters/http/dynamic_modules: 95.0 |
There was a problem hiding this comment.
Do we really have to decrease the coverage here? is there no way to have an integration test to cover everything?
There was a problem hiding this comment.
The coverage drop comes mainly from defensive paths:
- mkstemp() failure -- don't know how clearly but probably a exhaustion of FDs should only trigger this.
- write() failure -- needs interrupt signals or disk permission issues
- std::filesystem::permissions/rename failure -- needs mocking read only permissions.
- weak_ptr expiry -- needs a race condition to trigger.
They'd need exceptional situations to happen. All of them are very tought for UTs so I lowered the coverage instead.
Writing an integration test is quite hard for this one, if the approach looks good -- I can give it a try.
I have tried to increase coverage back with some possible paths that can be tested.
Signed-off-by: Anurag Aggarwal <kanurag94@gmail.com>
| cb_or_error.status().message()); | ||
| return; | ||
| } | ||
| state->filter_factory_cb = cb_or_error.value(); |
There was a problem hiding this comment.
Because the following lambda will be called at any threads, I think there may result in unexpected contention? here?
[async_state](Http::FilterChainFactoryCallbacks& callbacks) -> void {
if (async_state->filter_factory_cb) {
async_state->filter_factory_cb(callbacks);
}
};
As the first version, to simplify the review, we I think we and use a lock and left a TODO here. Then, we can resolve the lock with another PR. cc @agrawroh WDYT?
There was a problem hiding this comment.
wait, the remote fetching will pause the warming...so, it may be safe
Commit Message: dynamic_modules: add remote HTTP support for asyncDataSource module binary
Additional Description: Extends the asyncDataSource support (added in #43333 for local.filename) to support fetching dynamic module binaries from remote HTTP sources via module.remote in DynamicModuleConfig. The module is downloaded asynchronously during listener initialization with SHA256 verification, written to a temporary file, and loaded via dlopen. If the remote fetch fails, the filter is not installed and requests pass through (fail-open).
Risk Level: Low
Testing: Added unit tests + Manually tested
Docs Changes: API docs already a part of this PR
Release Notes: Added