-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NFFL FastNear indexer: CI test #322
Conversation
📝 WalkthroughWalkthroughThis pull request introduces several changes across multiple files, primarily focusing on the addition of a new GitHub Actions workflow for testing Docker Compose services related to the Fastnear project. It also includes updates to the Makefile for clearer testing commands, the introduction of a new Docker Compose configuration file, and modifications to existing Dockerfiles and integration tests. The changes enhance the testing framework, add new services, and improve logging and configuration clarity throughout the project. Changes
Possibly related PRs
Suggested reviewers
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
🧰 Additional context used📓 Learnings (1).github/workflows/compose-fastnear-test.yml (1)
🪛 actionlint (1.7.4).github/workflows/compose-fastnear-test.yml17-17: the runner of "actions/checkout@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue (action) 109-109: shellcheck reported issue in this script: SC2086:info:11:57: Double quote to prevent globbing and word splitting (shellcheck) 109-109: shellcheck reported issue in this script: SC2086:info:14:58: Double quote to prevent globbing and word splitting (shellcheck) 🔇 Additional comments (5).github/workflows/compose-fastnear-test.yml (5)
The trigger configuration and job setup are properly defined.
Remove redundant setup steps.
Apply this diff: - uses: actions/checkout@v3
+ uses: actions/checkout@v4
- - name: Set up Docker Compose
- run: |
- sudo apt-get update
- sudo apt-get install docker-compose -y 🧰 Tools🪛 actionlint (1.7.4)17-17: the runner of "actions/checkout@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue (action)
Incorrect build-args syntax will cause build failure. The YAML array syntax for build-args is invalid. Apply this diff: - build-args: |
- - TARGET="debug"
- - COMPILATION_MODE=""
+ build-args: |
+ TARGET=debug
+ COMPILATION_MODE=
Health check script requires hardening. The script has reliability issues:
Apply this diff: for service in "${services[@]}"; do
- status=$(docker inspect -f '{{.State.Health.Status}}' $service)
+ status=$(docker inspect -f '{{.State.Health.Status}}' "$service")
if [ "$status" != "healthy" ]; then
echo "Service $service is not healthy"
+ echo "Container logs:"
docker-compose -f ./docker-compose-fastnear.yml logs $service
+ echo "Container inspection:"
+ docker inspect "$service"
exit 1 🧰 Tools🪛 actionlint (1.7.4)109-109: shellcheck reported issue in this script: SC2086:info:11:57: Double quote to prevent globbing and word splitting (shellcheck) 109-109: shellcheck reported issue in this script: SC2086:info:14:58: Double quote to prevent globbing and word splitting (shellcheck)
File must end with newline. Add a newline at the end of the file. Apply this diff: - name: Tear down Docker Compose services
run: docker-compose -f ./docker-compose-fastnear.yml down
+ 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 18
🧹 Outside diff range and nitpick comments (14)
tests/integration/fastnear/fastnear_integration_test.go (3)
71-71
: Remove unnecessary sleepThe use of
time.Sleep(60 * time.Second)
introduces unnecessary delays in the test. Implement proper synchronisation instead of arbitrary waits.
193-195
: Avoid usingt.Fatalf
in cleanup functionsCalling
t.Fatalf
during cleanup can cause misleading test results. Uset.Log
ort.Error
to log cleanup errors without failing the test abruptly.
199-201
: Handle errors without terminating in cleanupIn cleanup, using
t.Fatalf
when terminating containers is inappropriate. Log the error instead of failing the test during the cleanup phase.config-files/relayer1-docker-compose-fastnear.yaml (1)
6-6
: Add a newline at the end of the fileThe file lacks a newline at the end. This is against POSIX standards and can cause issues with some tools.
🧰 Tools
🪛 yamllint (1.35.1)
[error] 6-6: no new line character at the end of file
(new-line-at-end-of-file)
Makefile (1)
146-147
: Consider enhancing test command with timeout and parallel options.While the current implementation is functional, consider adding timeout and parallel execution options for better test control.
test-integration-fastnear: ## runs fastnear integration test - go test ./tests/integration/fastnear/fastnear_integration_test.go -v -count=1 + go test ./tests/integration/fastnear/fastnear_integration_test.go -v -count=1 -timeout 10m -parallel 4docker-compose-fastnear.yml (2)
351-353
: Implement volume access restrictionsThe
near_cli_data
andnear_cli_keys
volumes contain sensitive data and are shared across multiple services without explicit access restrictions.Consider:
- Implementing volume access controls using Docker's volume driver plugins
- Using read-only mounts where possible:
volumes: - near_cli_keys:/root/.near-credentials:ro
- Implementing volume encryption for sensitive data
3-6
: Implement network segmentation for improved securityAll services are currently on a single network, which violates the principle of least privilege and increases the attack surface.
Implement network segmentation:
networks: frontend: driver: bridge backend: driver: bridge monitoring: driver: bridgeThen assign services to specific networks based on their communication requirements:
services: nearcore: networks: - backend prometheus: networks: - monitoringindexer/src/fastnear_indexer.rs (5)
Line range hint
51-63
: Critical: Add error handling and graceful shutdown mechanismThe
run
method spawns a task without any error handling or shutdown mechanism. This could lead to:
- Silent failures in the spawned task
- Resource leaks during application shutdown
Implement the following improvements:
pub fn run(&self) -> Receiver<PublishData> { info!("Starting FastNearIndexer"); let block_receiver = self.stream_latest_blocks(); let (publish_sender, publish_receiver) = mpsc::channel(self.channel_width); let addresses_to_rollup_ids = self.addresses_to_rollup_ids.clone(); + let (shutdown_sender, mut shutdown_receiver) = mpsc::channel(1); tokio::spawn(async move { - Self::process_blocks(block_receiver, publish_sender, addresses_to_rollup_ids).await; + tokio::select! { + _ = Self::process_blocks(block_receiver, publish_sender, addresses_to_rollup_ids) => { + error!("Block processing terminated unexpectedly"); + } + _ = shutdown_receiver.recv() => { + info!("Received shutdown signal"); + } + } }); publish_receiver }
Line range hint
151-164
: Critical: Implement robust error handling for network operationsThe network operations lack essential resilience patterns:
- No retry mechanism for transient failures
- No timeout configuration
- No circuit breaker to prevent cascade failures
Implement proper error handling:
+use tokio::time::timeout; +use std::time::Duration; async fn fetch_latest_block(client: &Client, fastnear_address: &str) -> Result<BlockWithTxHashes, Error> { debug!("Fetching latest block"); - let response = client - .get(fastnear_address) - .send() - .await - .and_then(|r| r.error_for_status()) - .map_err(|e| Error::NetworkError(e.to_string()))?; + let mut retries = 3; + while retries > 0 { + match timeout(Duration::from_secs(5), client.get(fastnear_address).send()).await { + Ok(Ok(response)) => { + if let Ok(response) = response.error_for_status() { + return response + .json::<BlockWithTxHashes>() + .await + .map_err(|e| Error::DeserializeJsonError(e.to_string())); + } + } + _ => { + retries -= 1; + if retries > 0 { + tokio::time::sleep(Duration::from_secs(1)).await; + continue; + } + } + } + } + Err(Error::NetworkError("Max retries exceeded".to_string())) }
Line range hint
200-227
: Security: Implement payload validation and rate limitingThe receipt handling lacks crucial security measures:
- No validation of payload size
- No rate limiting for large payloads
- No sanitisation of input data
Add proper validation:
fn receipt_filter_map(receipt_enum_view: ReceiptEnumView, rollup_id: u32) -> Option<PartialCandidateData> { trace!("Filtering receipt for rollup_id: {}", rollup_id); let payloads = match receipt_enum_view { ReceiptEnumView::Action { actions, .. } => actions .into_iter() .filter_map(Self::extract_args) + .filter(|args| args.len() <= 1024 * 1024) // 1MB limit .collect::<Vec<Vec<u8>>>(), _ => return None, };
Line range hint
231-241
: Critical: Add essential operational metricsThe current metrics implementation lacks crucial operational insights:
- Processing latency metrics
- Error rate metrics
- Payload size distribution
Add comprehensive metrics:
+use lazy_static::lazy_static; +use prometheus::{Histogram, IntCounter, register_histogram, register_int_counter}; +lazy_static! { + static ref PROCESSING_LATENCY: Histogram = register_histogram!( + "fastnear_processing_latency_seconds", + "Block processing latency in seconds" + ).unwrap(); + static ref ERROR_COUNTER: IntCounter = register_int_counter!( + "fastnear_processing_errors_total", + "Total number of processing errors" + ).unwrap(); +}
Line range hint
242-332
: Critical: Improve test coverageThe current test suite lacks comprehensive coverage:
- No error path testing
- No concurrent execution testing
- No performance benchmarks
Add comprehensive tests:
+#[tokio::test] +async fn test_process_blocks_error_handling() { + let (block_sender, block_receiver) = mpsc::channel(1); + let (publish_sender, _) = mpsc::channel(1); + let addresses = HashMap::new(); + + // Send invalid block + block_sender.send(BlockWithTxHashes { + block: Block::default(), + shards: vec![] + }).await.unwrap(); + + FastNearIndexer::process_blocks( + block_receiver, + publish_sender, + addresses + ).await; +} +#[tokio::test] +async fn test_concurrent_processing() { + let (block_sender, block_receiver) = mpsc::channel(10); + let (publish_sender, mut publish_receiver) = mpsc::channel(10); + let addresses = HashMap::new(); + + // Spawn multiple processors + let handles: Vec<_> = (0..3).map(|_| { + let receiver = block_receiver.clone(); + let sender = publish_sender.clone(); + let addrs = addresses.clone(); + tokio::spawn(async move { + FastNearIndexer::process_blocks( + receiver, + sender, + addrs + ).await; + }) + }).collect(); + + // Wait for all processors + for handle in handles { + handle.await.unwrap(); + } +}indexer/Dockerfile (2)
Line range hint
31-31
: TARGET ARG is declared but partially unusedThe
TARGET
argument is declared but its only usage is in the COPY command. This creates confusion about its intended purpose, especially since it's no longer used for compilation mode.Either:
- Remove the unused TARGET ARG and hardcode the path in COPY:
-ARG TARGET="release" -COPY --from=builder /tmp/indexer/target/${TARGET}/indexer . +COPY --from=builder /tmp/indexer/target/release/indexer .
- Or fully utilize TARGET throughout the Dockerfile as previously implemented.
Health check implementation requires adjustment
The current health check implementation has issues:
- The health check uses two different RPC methods ("block" and "status") but lacks proper error handling and timeout configurations
- The 10,000 retries setting is excessive and could mask underlying issues
- The health check interval (20s) and timeout (30s) values are not aligned with other services in the stack
- The implementation doesn't properly validate the node's sync status
Recommended changes:
- Reduce retries to a reasonable number (e.g., 3-5)
- Adjust timeout to be less than the interval (e.g., interval=30s, timeout=10s)
- Add proper error handling for RPC responses
- Validate both block timestamp and sync status before reporting healthy
🔗 Analysis chain
Line range hint
35-44
: Verify health check implementationThe health check implementation requires verification to ensure it aligns with FastNear indexer's specific requirements.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify health check endpoints and methods across the codebase # Check if these endpoints are consistently used across the codebase rg -A 2 '"method":"block"' rg -A 2 '"method":"status"' # Verify the timeout values are consistent with other services fd docker-compose | xargs rg 'HEALTHCHECK.*interval'Length of output: 648
Script:
#!/bin/bash # Let's verify the health check implementation more thoroughly # Check for any health check related configurations or implementations rg -i healthcheck # Look for any JSON-RPC endpoint implementations ast-grep --pattern 'jsonrpc.*2\.0' # Check for port 3030 usage across the codebase rg '3030' # Look for curl health check patterns in Dockerfiles fd Dockerfile | xargs rg 'curl.*health' # Check for any test files related to health checks fd -t f test | xargs rg -i 'health|status'Length of output: 116160
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (14)
.github/workflows/compose-fastnear-test.yml
(1 hunks)Makefile
(1 hunks)config-files/relayer1-docker-compose-fastnear.yaml
(1 hunks)docker-compose-fastnear.yml
(1 hunks)docker-compose.yml
(1 hunks)indexer/Dockerfile
(1 hunks)indexer/FastIndexer.dockerfile
(1 hunks)indexer/src/fastnear_indexer.rs
(13 hunks)indexer/src/main.rs
(2 hunks)indexer/src/rmq_publisher.rs
(1 hunks)tests/e2e/e2e_tests/tests/ctr_availability_test.rs
(1 hunks)tests/integration/fastnear/fastnear_integration_test.go
(1 hunks)tests/integration/integration_test.go
(8 hunks)tests/integration/test_base.go
(1 hunks)
✅ Files skipped from review due to trivial changes (2)
- indexer/src/rmq_publisher.rs
- tests/e2e/e2e_tests/tests/ctr_availability_test.rs
🧰 Additional context used
📓 Learnings (1)
.github/workflows/compose-fastnear-test.yml (1)
Learnt from: Fly-Style
PR: Nuffle-Labs/nffl#321
File: .github/workflows/compose-fastnear-test.yml:16-26
Timestamp: 2024-11-28T13:43:53.472Z
Learning: Updating the checkout action version and modifying the Docker Compose installation in GitHub workflows is unnecessary for this project.
🪛 yamllint (1.35.1)
config-files/relayer1-docker-compose-fastnear.yaml
[error] 6-6: no new line character at the end of file
(new-line-at-end-of-file)
.github/workflows/compose-fastnear-test.yml
[error] 131-131: no new line character at the end of file
(new-line-at-end-of-file)
🪛 actionlint (1.7.4)
.github/workflows/compose-fastnear-test.yml
17-17: the runner of "actions/checkout@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
109-109: shellcheck reported issue in this script: SC2086:info:11:57: Double quote to prevent globbing and word splitting
(shellcheck)
109-109: shellcheck reported issue in this script: SC2086:info:14:58: Double quote to prevent globbing and word splitting
(shellcheck)
🪛 Gitleaks (8.21.2)
docker-compose-fastnear.yml
300-300: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
331-331: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
332-332: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
tests/integration/test_base.go
239-239: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
296-296: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
328-328: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
🪛 golangci-lint (1.62.2)
tests/integration/test_base.go
70-70: Error return value of operator.Start
is not checked
(errcheck)
97-97: Error return value of rpcServer.Start
is not checked
(errcheck)
32-32: SA1019: "io/ioutil" has been deprecated since Go 1.19: As of Go 1.16, the same functionality is now provided by package [io] or package [os], and those implementations should be preferred in new code. See the specific function documentation for details.
(staticcheck)
🔇 Additional comments (4)
indexer/FastIndexer.dockerfile (1)
33-33
: Confirm necessity of package removals
curl
has been removed from the installation. Ensure that no part of the application or its dependencies requires curl
at runtime.
.github/workflows/compose-fastnear-test.yml (1)
1-10
: LGTM: Workflow triggers are properly configured.
The workflow appropriately triggers on pushes to main and PR changes affecting the indexer directory.
indexer/src/main.rs (2)
33-36
: LGTM: Enhanced observability with metrics server logging.
The added logging statements provide clear visibility into the metrics server status.
74-74
: Whitespace change.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Prepare a CI test for FastNear indexer. Integration test is also prepared, but delayed due to priority shift and out of scope of this PR
This change is