porting sqlite operations to APSW along with unittests for the same #1076
base: master
Conversation
@CodeRabbit review

✅ Actions performed: Review triggered.
Summary by CodeRabbit
Walkthrough

Adds optional APSW-backed SQLite support with PRAGMA settings and retry parameters, shifts several call sites to consume raw event/row/service JSON values, updates log-merge to parse embedded json_event strings, adds tests, docs, deps, and a minimal HTML report template. No public API signatures changed.

Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes

Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 11
🔭 Outside diff range comments (1)
nettacker/database/db.py (1)
297-384: Significant code duplication with submit_logs_to_db

This function duplicates most of the logic from submit_logs_to_db. Consider refactoring to extract the common retry logic into a shared helper function. Create a generic helper function:

```python
def _submit_with_retry(session, insert_func, log_data):
    """Generic retry logic for database insertions."""
    if isinstance(session, tuple):
        connection, cursor = session
        try:
            for _ in range(Config.settings.max_retries):
                try:
                    if not connection.in_transaction:
                        connection.execute("BEGIN")
                    insert_func(cursor, log_data)
                    return send_submit_query(session)
                except apsw.BusyError as e:
                    # ... retry logic ...
                    pass
            return False
        finally:
            cursor.close()
    else:
        # SQLAlchemy logic
        ...
```

Then simplify both functions to use this helper.
🧹 Nitpick comments (9)
tests/database/test_db.py (3)
170-181: Consider adding assertion for sleep calls in retry test.

The test mocks time.sleep but doesn't verify it was called. Add an assertion to ensure the retry delay is properly applied:

```diff
- with patch("time.sleep"):
+ with patch("time.sleep") as mock_sleep:
      result = send_submit_query(session)
+
+ # Verify sleep was called during retry
+ mock_sleep.assert_called()
```
403-431: Improve Config mocking consistency.

The test patches individual Config attributes separately. Consider using a more consistent mocking approach:

```diff
- @patch("nettacker.database.db.Config.settings.retry_delay", 0)
- @patch("nettacker.database.db.Config.settings.max_retries", 1)
+ @patch("nettacker.database.db.Config")
  @patch("nettacker.database.db.logging.warn")
  @patch("nettacker.database.db.create_connection")
- def test_apsw_busy_error(self, mock_create_conn, mock_warn):
+ def test_apsw_busy_error(self, mock_create_conn, mock_warn, mock_config):
+     mock_config.settings.retry_delay = 0
+     mock_config.settings.max_retries = 1
```

This approach is cleaner and prevents potential issues with partial Config mocking.
725-745: Remove duplicate test for find_temp_events.

This test duplicates the logic already covered in test_sqlite_successful_lookup (lines 656-668). Consider removing this redundant test or differentiating it with a unique scenario.

nettacker/database/db.py (6)
100-112: Simplify error handling in SQLAlchemy branch

The nested try-except blocks with identical exception handling are redundant.

```diff
  else:
-     try:
-         for _ in range(1, 100):
-             try:
-                 session.commit()
-                 return True
-             except Exception:
-                 time.sleep(0.1)
-         logging.warn(messages("database_connect_fail"))
-         return False
-     except Exception:
-         logging.warn(messages("database_connect_fail"))
-         return False
+     for _ in range(Config.settings.max_retries):
+         try:
+             session.commit()
+             return True
+         except Exception:
+             session.rollback()
+             time.sleep(Config.settings.retry_delay)
+     logging.warn(messages("database_connect_fail"))
      return False
```
148-149: Use consistent error messaging

The error message should use the messages() function for consistency with the rest of the codebase.

```diff
  except Exception:
      cursor.execute("ROLLBACK")
-     logging.warn("Could not insert report...")
+     logging.warn(messages("database_error"))
      return False
```
200-200: Use consistent error messaging

```diff
- logging.warn("Could not remove old logs...")
+ logging.warn(messages("database_error"))
```
625-658: Consider optimizing with fewer queries

The current implementation executes 3 queries per target. Consider using a single query with GROUP BY or window functions for better performance.

Example optimization:

```sql
SELECT target,
       GROUP_CONCAT(DISTINCT module_name) AS module_names,
       MAX(date) AS latest_date,
       GROUP_CONCAT(event) AS events
FROM scan_events
WHERE target IN (
    SELECT DISTINCT target FROM scan_events
    ORDER BY id DESC
    LIMIT 10 OFFSET ?
)
GROUP BY target
```
868-869: Consider moving imports to module level

Internal imports can impact performance when the function is called multiple times. Move these imports to the top of the file:

```diff
+ from nettacker.core.graph import build_graph
+ from nettacker.lib.html_log import log_data

  def logs_to_report_html(target):
      """
      generate HTML report with d3_tree_v2_graph for a host

      Args:
          target: the target

      Returns:
          HTML report
      """
-     from nettacker.core.graph import build_graph
-     from nettacker.lib.html_log import log_data
```
1084-1105: Simplify the capture variable logic

The current implementation with a capture variable and nested loops is hard to follow. Consider using a dictionary for O(1) lookup.

```python
# Use a dictionary for efficient lookup
target_index = {item["target"]: idx for idx, item in enumerate(selected)}

if data.target in target_index:
    idx = target_index[data.target]
    # Update existing entry
else:
    # Create new entry
    tmp = {
        "target": data.target,
        "info": {
            "module_name": [],
            "port": [],
            "date": [],
            "event": [],
            "json_event": [],
        },
    }
    selected.append(tmp)
    target_index[data.target] = len(selected) - 1
    idx = target_index[data.target]
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (9)
- nettacker/config.py (3 hunks)
- nettacker/core/app.py (1 hunks)
- nettacker/core/graph.py (2 hunks)
- nettacker/core/lib/base.py (1 hunks)
- nettacker/core/module.py (1 hunks)
- nettacker/core/utils/common.py (2 hunks)
- nettacker/database/db.py (17 hunks)
- pyproject.toml (1 hunks)
- tests/database/test_db.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
tests/database/test_db.py (3)
- nettacker/api/helpers.py (1): structure (1-12)
- nettacker/database/db.py (17): db_inputs (18-35), create_connection (38-70), send_submit_query (73-112), submit_report_to_db (115-162), remove_old_logs (165-212), submit_logs_to_db (215-294), submit_temp_logs_to_db (297-383), find_temp_events (386-446), find_events (449-492), select_reports (495-557), get_scan_result (560-591), last_host_logs (594-715), get_logs_by_scan_id (718-769), get_options_by_scan_id (772-800), logs_to_report_json (803-855), logs_to_report_html (858-964), search_logs (967-1123)
- nettacker/config.py (1): as_dict (26-27)
nettacker/database/db.py (4)
- nettacker/config.py (1): Config (169-173)
- nettacker/database/models.py (3): Report (7-27), HostsLog (67-100), TempEvents (30-64)
- nettacker/api/helpers.py (1): structure (1-12)
- nettacker/core/graph.py (1): build_graph (26-49)
🔇 Additional comments (14)
pyproject.toml (1)
67-67: APSW dependency version validated

The caret constraint ^3.50.0.0 will include the latest 3.50.4.0 release, and our security check against PyPI/GitHub showed no known advisories for APSW. No changes required.

nettacker/config.py (2)
92-93: Excellent SQLite configuration choices.

The WAL journal mode and NORMAL synchronous mode are optimal defaults for SQLite performance and concurrency. WAL mode allows concurrent readers with writers, and NORMAL synchronous provides good durability without significant performance impact.
139-140: Well-designed retry configuration.

The retry parameters (max_retries = 3 and retry_delay = 0.1) are reasonable defaults for handling SQLite database locks and transient errors without excessive overhead.

nettacker/core/module.py (1)
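A generic shape for how max_retries / retry_delay can drive a retry wrapper; the helper name and signature are illustrative, not Nettacker's actual code:

```python
import time

def with_retries(operation, max_retries=3, retry_delay=0.1, retriable=(Exception,)):
    """Run `operation`, retrying up to max_retries times with a fixed delay.

    Returns the operation's result, or re-raises the last error once the
    retry budget is exhausted.
    """
    last_error = None
    for attempt in range(max_retries):
        try:
            return operation()
        except retriable as error:
            last_error = error
            # Sleep only between attempts, not after the final failure.
            if attempt < max_retries - 1:
                time.sleep(retry_delay)
    raise last_error
```

With the defaults above, a lock that clears within two retries costs at most ~0.2 s of added latency.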
81-81: Event data access pattern improved.

The change from json.loads(service.json_event) to json.loads(service) aligns with the broader event data standardization refactor. This simplifies data access by treating the service directly as a JSON string rather than accessing a nested attribute.

nettacker/core/utils/common.py (2)
6-6: JSON import properly added.

The json import is correctly added to support the new JSON parsing functionality in merge_logs_to_list.
36-39: Excellent backward compatibility handling.

The enhancement to handle json_event keys is well-designed:
- Maintains backward compatibility during the event data refactor transition
- Idempotent operation (only parses if not already a dict)
- Non-intrusive to existing functionality as noted in the comment
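The idempotent behaviour described above can be sketched as follows (a hypothetical helper, not the actual merge_logs_to_list code):

```python
import json

def normalize_json_event(log_entry):
    """Decode an embedded json_event string exactly once.

    Older call sites store json_event as a JSON string; newer ones already
    pass a dict. Parsing only when the value is still a string makes the
    operation idempotent: applying it twice is a no-op.
    """
    value = log_entry.get("json_event")
    if isinstance(value, str):
        log_entry["json_event"] = json.loads(value)
    return log_entry
```

Guarding on `isinstance(value, str)` is what makes the transition safe while both producers coexist.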
nettacker/core/lib/base.py (1)
55-55: Event data access streamlined.

The change from json.loads(event.event) to direct dictionary access event["response"]["conditions_results"] is consistent with the event data handling refactor. This eliminates unnecessary JSON parsing overhead while maintaining the same functionality.

nettacker/core/app.py (1)
155-155: LGTM! Correctly handles the new return format from find_events.

The change from json.loads(row.json_event) to json.loads(row) aligns with the database module refactoring where find_events now returns JSON strings directly instead of ORM objects with a json_event attribute.

nettacker/core/graph.py (2)
87-87: Good refactoring - passes complete event context to merge_logs_to_list.

This change improves the design by passing the entire event dictionary instead of pre-extracting the json_event field. This gives merge_logs_to_list more flexibility to handle different event structures.
160-168: JavaScript JSON parsing supports full event objects

The change to use json.dumps(event) is compatible with the existing JS in:
- nettacker/web/static/report/json_parse.js
- nettacker/web/static/index.html

Both files call JSON.parse(...) on the escaped JSON inserted into elements with IDs like json_event_<index>, so they will correctly deserialize the full event dictionary. No further fixes are needed.
tests/database/test_db.py (2)
1082-1122: Good test coverage for HTML report generation.

The HTML report tests properly mock the template components and verify the integration with the graph builder. The tests cover both SQLite and SQLAlchemy backends effectively.
1-1275: Comprehensive test suite with excellent coverage.

This test file provides thorough coverage of the dual database backend functionality introduced in the PR. The tests effectively verify:
- Both APSW (SQLite) and SQLAlchemy ORM paths
- Retry logic and transaction management
- Error handling and edge cases
- All major database operations
The test structure is clear and maintainable. Great work on ensuring the new APSW implementation is well-tested!
nettacker/database/db.py (2)
233-278: Well-implemented retry logic for database locks

The retry mechanism with specific handling for BusyError and transaction state checking is well implemented. Good use of configuration parameters for max_retries and retry_delay.
509-557: Well-implemented pagination logic

Both APSW and SQLAlchemy branches correctly implement pagination with proper offset calculations and consistent return format.
This looks like a good performance improvement contribution.
Could you address coderabbit's suggestions first @pUrGe12 ?
sure I'll do that
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
nettacker/core/module.py (2)
80-85: Harden JSON loading: handle dict/bytes rows and decode failures.

json.loads(service) assumes service is always a JSON string. If APSW returns bytes or ORM returns a dict, this will raise. Add type checks and guarded parsing.

Apply:

```diff
- service_event = json.loads(service)
- port = service_event["port"]
- protocols = service_event["response"]["conditions_results"].keys()
+ try:
+     raw = service
+     if isinstance(raw, (bytes, bytearray)):
+         raw = raw.decode("utf-8", "replace")
+     service_event = json.loads(raw) if isinstance(raw, str) else raw
+ except (TypeError, json.JSONDecodeError) as e:
+     log.debug(f"Skipping unexpected service row: {service!r} ({e})")
+     continue
+ port = service_event.get("port")
+ protocols = ((service_event.get("response") or {}).get("conditions_results") or {}).keys()
+ if port is None:
+     continue
```
100-107: Use safe dict lookup for service ports

Replace self.discovered_services[payload["library"]] with:

```python
ports = self.discovered_services.get(payload["library"], [])
step = TemplateLoader.parse(step, {"port": ports})
```

and add a test for the branch where payload["library"] isn't in discovered_services.
♻️ Duplicate comments (1)
nettacker/config.py (1)
85-90: Fix doc typos, stray text, and correct the default SQLite path.

There's a typo ("deafault"), stray UI text ("Add commentMore actions"), and the documented default path .data/nettacker.db doesn't match the actual default .nettacker/data/nettacker.db.

Apply:

```diff
- Set the journal_mode (default="WAL") and
- synchronous_mode (deafault="NORMAL"). Rest
- of the fields can be left emptyAdd commentMore actions
- This is the default database:
- str(CWD / ".data/nettacker.db")
+ Set the journal_mode (default="WAL") and
+ synchronous_mode (default="NORMAL"). The rest
+ of the fields can be left empty.
+ Default SQLite database path:
+ str(CWD / ".nettacker/data/nettacker.db")
```
🧹 Nitpick comments (1)
pyproject.toml (1)
47-69: Consider making APSW an optional extra to avoid unnecessary installs.

For users on MySQL/PostgreSQL only, installing APSW can be skipped via an extra.

Add:

```diff
 [tool.poetry.dependencies]
 # ...
 apsw = "^3.50.0.0"

+[tool.poetry.extras]
+sqlite = ["apsw"]
```

Then document poetry install --extras sqlite for SQLite users.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
- nettacker/config.py (3 hunks)
- nettacker/core/app.py (1 hunks)
- nettacker/core/lib/base.py (1 hunks)
- nettacker/core/module.py (1 hunks)
- nettacker/core/utils/common.py (2 hunks)
- pyproject.toml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- nettacker/core/lib/base.py
- nettacker/core/app.py
- nettacker/core/utils/common.py
🧰 Additional context used
🪛 GitHub Actions: CI/CD
pyproject.toml
[error] 1-1: poetry install --with test failed: pyproject.toml changed significantly since poetry.lock was last generated. Run 'poetry lock' to fix the lock file.
🔇 Additional comments (3)
pyproject.toml (1)
67-67: Verify APSW version availability on PyPI

Ensure that the ^3.50.0.0 constraint matches a release published on PyPI and that binary wheels exist for Python 3.9, 3.10, 3.11, and 3.12. If version 3.50.0.0 is not available or missing wheels for any supported interpreter, pin to a published release that does.

nettacker/config.py (2)
111-113: Confirm PRAGMA usage in DB layer.

Ensure journal_mode and synchronous_mode are applied on every SQLite connection (both initial and pooled) via PRAGMA statements in the APSW path.
160-161: LGTM: sensible retry defaults for transient DB errors.

Values look reasonable; please ensure db.py reads these and uses capped backoff.
Force-pushed from c26a9c2 to b1b9851
Actionable comments posted: 4
♻️ Duplicate comments (10)
nettacker/database/db.py (10)
15-15: Avoid shadowing the stdlib logging module.

Rename variable for clarity and to prevent confusion in imports.

```diff
-logging = logger.get_logger()
+log = logger.get_logger()
```

Update usages accordingly.
4-4: Handle optional APSW dependency (import guard).

Avoid import errors on non-SQLite backends.

```diff
-import apsw
+try:
+    import apsw
+except ImportError:
+    apsw = None
```
46-47: Guard SQLite path when APSW is missing.

Fail fast with a clear message.

```diff
 if Config.db.engine.startswith("sqlite"):
+    if apsw is None:
+        raise ImportError("APSW is required for SQLite backend. Install with: pip install apsw")
```
50-50: Fix busy-timeout units (should be milliseconds).

Multiply by 1000, not 100.

```diff
-connection.setbusytimeout(int(config.settings.timeout) * 100)
+# seconds -> milliseconds
+connection.setbusytimeout(int(config.settings.timeout) * 1000)
```
46-57: Wrap APSW connection setup with error handling.

Prevent unhandled exceptions and log context.

```diff
-    DB_PATH = config.db.as_dict()["name"]
-    connection = apsw.Connection(DB_PATH)
-    connection.setbusytimeout(int(config.settings.timeout) * 100)
-    cursor = connection.cursor()
-
-    # Performance enhancing configurations. Put WAL cause that helps with concurrency
-    cursor.execute(f"PRAGMA journal_mode={Config.db.journal_mode}")
-    cursor.execute(f"PRAGMA synchronous={Config.db.synchronous_mode}")
-
-    return connection, cursor
+    try:
+        DB_PATH = config.db.as_dict()["name"]
+        connection = apsw.Connection(DB_PATH)
+        connection.setbusytimeout(int(config.settings.timeout) * 1000)
+        cursor = connection.cursor()
+        # Performance enhancing configurations. WAL helps with concurrency
+        cursor.execute(f"PRAGMA journal_mode={Config.db.journal_mode}")
+        cursor.execute(f"PRAGMA synchronous={Config.db.synchronous_mode}")
+        return connection, cursor
+    except Exception as e:
+        logging.error(f"Failed to create APSW connection: {e}")
+        raise
```
85-98: APSW commit loop closes connection every iteration and before retries.

This causes failures on subsequent attempts and double-closing. Also hardcoded retries/sleeps.

```diff
-    connection, cursor = session
-    for _ in range(100):
-        try:
-            connection.execute("COMMIT")
-            return True
-        except Exception:
-            connection.execute("ROLLBACK")
-            time.sleep(0.1)
-        finally:
-            connection.close()
-    connection.close()
-    logging.warn(messages("database_connect_fail"))
-    return False
+    connection, cursor = session
+    success = False
+    for _ in range(Config.settings.max_retries):
+        try:
+            connection.execute("COMMIT")
+            success = True
+            break
+        except Exception:
+            connection.execute("ROLLBACK")
+            time.sleep(Config.settings.retry_delay)
+    connection.close()
+    if not success:
+        logging.warn(messages("database_connect_fail"))
+    return success
```
399-447: Remove unnecessary retry loop; unify return type.

SELECTs don't need retry loops; also return parsed JSON for both backends.

```diff
-    try:
-        for _ in range(100):
-            try:
-                cursor.execute(
-                    """
-                    SELECT event
-                    FROM temp_events
-                    WHERE target = ? AND module_name = ? AND scan_unique_id = ? AND event_name = ?
-                    LIMIT 1
-                    """,
-                    (target, module_name, scan_id, event_name),
-                )
-
-                row = cursor.fetchone()
-                cursor.close()
-                if row:
-                    return json.loads(row[0])
-                return []
-            except Exception:
-                logging.warn("Database query failed...")
-                return []
-    except Exception:
-        logging.warn(messages("database_connect_fail"))
-        return []
-    return []
+    try:
+        cursor.execute(
+            """
+            SELECT event
+            FROM temp_events
+            WHERE target = ? AND module_name = ? AND scan_unique_id = ? AND event_name = ?
+            LIMIT 1
+            """,
+            (target, module_name, scan_id, event_name),
+        )
+        row = cursor.fetchone()
+        cursor.close()
+        return json.loads(row[0]) if row else []
+    except Exception:
+        logging.warn(messages("database_connect_fail"))
+        return []
@@
-    try:
-        for _ in range(1, 100):
-            try:
-                return (
-                    session.query(TempEvents)
-                    .filter(
-                        TempEvents.target == target,
-                        TempEvents.module_name == module_name,
-                        TempEvents.scan_unique_id == scan_id,
-                        TempEvents.event_name == event_name,
-                    )
-                    .first()
-                )
-            except Exception:
-                time.sleep(0.1)
-    except Exception:
-        logging.warn(messages("database_connect_fail"))
-        return []
-    return []
+    try:
+        result = (
+            session.query(TempEvents)
+            .filter(
+                TempEvents.target == target,
+                TempEvents.module_name == module_name,
+                TempEvents.scan_unique_id == scan_id,
+                TempEvents.event_name == event_name,
+            )
+            .first()
+        )
+        return json.loads(result.event) if result else []
+    except Exception:
+        logging.warn(messages("database_connect_fail"))
+        return []
```
474-492: Avoid redundant JSON load→dump; unify return type.

Return parsed JSON for both backends.

```diff
-    if rows:
-        return [json.dumps((json.loads(row[0]))) for row in rows]
+    if rows:
+        return [json.loads(row[0]) for row in rows]
     return []
@@
     return [
-        row.json_event
+        json.loads(row.json_event)
         for row in session.query(HostsLog)
```
581-592: Add robust handling for missing reports and file I/O errors.

Prevents attribute errors and uncaught IOErrors.

```diff
-    if row:
-        filename = row[0]
-        return filename, open(str(filename), "rb").read()
-    else:
-        return structure(status="error", msg="database error!")
+    if row:
+        filename = row[0]
+        try:
+            with open(str(filename), "rb") as fh:
+                return filename, fh.read()
+        except IOError as e:
+            logging.error(f"Failed to read report file: {e}")
+            return structure(status="error", msg="file read error!")
+    return structure(status="error", msg="database error!")
@@
-    filename = session.query(Report).filter_by(id=id).first().report_path_filename
-
-    return filename, open(str(filename), "rb").read()
+    report = session.query(Report).filter_by(id=id).first()
+    if not report:
+        return structure(status="error", msg="database error!")
+    try:
+        with open(str(report.report_path_filename), "rb") as fh:
+            return report.report_path_filename, fh.read()
+    except IOError as e:
+        logging.error(f"Failed to read report file: {e}")
+        return structure(status="error", msg="file read error!")
```
659-671: Cursor is closed inside the loop but reused later.

Close after the loop.

```diff
-        cursor.close()
         hosts.append(
             {
                 "target": target,
                 "info": {
                     "module_name": module_names,
                     "date": latest_date,
                     "events": events,
                 },
             }
         )
-    return hosts
+    cursor.close()
+    return hosts
```
🧹 Nitpick comments (6)
nettacker/database/db.py (6)
268-271: Use connection for ROLLBACK to keep transaction handling consistent.

Minor consistency fix; safer if cursor lifecycle changes.

```diff
-    if connection.in_transaction:
-        cursor.execute("ROLLBACK")
+    if connection.in_transaction:
+        connection.execute("ROLLBACK")
```
318-319: Standardize transaction control to use the connection.

Keep BEGIN/ROLLBACK consistently on the connection.

```diff
-    cursor.execute("BEGIN")
+    connection.execute("BEGIN")
@@
-    if connection.in_transaction:
-        connection.execute("ROLLBACK")
+    if connection.in_transaction:
+        connection.execute("ROLLBACK")
@@
-    if connection.in_transaction:
-        cursor.execute("ROLLBACK")
+    if connection.in_transaction:
+        connection.execute("ROLLBACK")
```

Also applies to: 351-352, 358-359
611-617: Query plan: DISTINCT with ORDER BY id is ambiguous.

Consider grouping by target and ordering by MAX(id) for "latest per target".

Example:

```sql
SELECT target FROM scan_events GROUP BY target ORDER BY MAX(id) DESC LIMIT 10 OFFSET ?
```
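The deterministic ordering of this query can be checked against the stdlib sqlite3 module; the table shape below is a minimal assumption matching the review's scan_events example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scan_events (id INTEGER PRIMARY KEY, target TEXT)")
conn.executemany(
    "INSERT INTO scan_events (target) VALUES (?)",
    [("a.com",), ("b.com",), ("a.com",), ("c.com",)],
)
# Group per target, then order by each target's newest row id: the most
# recently scanned target comes first, deterministically.
rows = conn.execute(
    "SELECT target FROM scan_events "
    "GROUP BY target ORDER BY MAX(id) DESC LIMIT 10 OFFSET ?",
    (0,),
).fetchall()
targets = [row[0] for row in rows]  # c.com scanned last, then a.com, then b.com
```

Unlike DISTINCT with ORDER BY id, the MAX(id) aggregate gives every target exactly one well-defined sort key.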
791-801: Return parsed options JSON for both backends.

Keeps API consistent.

```diff
-    if rows:
-        return [{"options": row[0]} for row in rows]
+    if rows:
+        return [{"options": json.loads(row[0])} for row in rows]
@@
     return [
-        {"options": log.options}
+        {"options": json.loads(log.options)}
         for log in session.query(Report).filter(Report.scan_unique_id == scan_id).all()
     ]
```
883-897: Consider JSON-decoding fields for graph building and HTML output.

Verify build_graph expectations; if it expects dicts, decode below.

```diff
-    "port": log[4],
-    "event": log[5],
-    "json_event": log[6],
+    "port": json.loads(log[4]),
+    "event": json.loads(log[5]),
+    "json_event": json.loads(log[6]),
```

If build_graph expects strings, skip this change.
735-740: Add supporting DB indexes for frequent filters.

Improve search/query performance at scale.
- scan_events: composite indexes on (scan_unique_id), (target), (target, module_name), (target, module_name, scan_unique_id), and optionally a full-text or JSON index if supported.
- temp_events: (target, module_name, scan_unique_id, event_name).
- reports: (scan_unique_id), (date).
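The suggested indexes could be created with DDL along these lines; the index names and minimal table shapes here are illustrative, so confirm column names against the real schema in nettacker/database/models.py before adopting:

```python
import sqlite3

INDEX_DDL = [
    # scan_events: hot filters used by event lookups and searches
    "CREATE INDEX IF NOT EXISTS ix_scan_events_scan_id"
    " ON scan_events (scan_unique_id)",
    "CREATE INDEX IF NOT EXISTS ix_scan_events_target_module"
    " ON scan_events (target, module_name, scan_unique_id)",
    # temp_events: exact-match lookup in find_temp_events
    "CREATE INDEX IF NOT EXISTS ix_temp_events_lookup"
    " ON temp_events (target, module_name, scan_unique_id, event_name)",
    # reports: scan-id filter in report queries
    "CREATE INDEX IF NOT EXISTS ix_reports_scan_id ON reports (scan_unique_id)",
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scan_events (id INTEGER PRIMARY KEY, target TEXT,"
             " module_name TEXT, scan_unique_id TEXT)")
conn.execute("CREATE TABLE temp_events (id INTEGER PRIMARY KEY, target TEXT,"
             " module_name TEXT, scan_unique_id TEXT, event_name TEXT)")
conn.execute("CREATE TABLE reports (id INTEGER PRIMARY KEY, scan_unique_id TEXT)")
for ddl in INDEX_DDL:
    conn.execute(ddl)
# PRAGMA index_list reports (seq, name, unique, origin, partial) per index.
index_names = [row[1] for row in conn.execute("PRAGMA index_list('scan_events')")]
```

IF NOT EXISTS makes the DDL safe to run on every startup, which suits a tool that creates its database lazily.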
Also applies to: 986-1001, 1010-1014
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (9)
- nettacker/config.py (3 hunks)
- nettacker/core/app.py (1 hunks)
- nettacker/core/graph.py (2 hunks)
- nettacker/core/lib/base.py (1 hunks)
- nettacker/core/module.py (1 hunks)
- nettacker/core/utils/common.py (2 hunks)
- nettacker/database/db.py (17 hunks)
- pyproject.toml (1 hunks)
- tests/database/test_db.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
- pyproject.toml
- nettacker/core/utils/common.py
- tests/database/test_db.py
- nettacker/core/lib/base.py
- nettacker/core/graph.py
- nettacker/core/app.py
- nettacker/core/module.py
- nettacker/config.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-31T02:23:10.347Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.347Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.
Applied to files:
nettacker/database/db.py
🧬 Code graph analysis (1)
nettacker/database/db.py (4)
- nettacker/config.py (2): Config (191-195), as_dict (26-27)
- tests/database/test_models.py (1): session (11-18)
- nettacker/database/models.py (3): Report (7-27), HostsLog (67-100), TempEvents (30-64)
- nettacker/api/helpers.py (1): structure (1-12)
🔇 Additional comments (1)
nettacker/database/db.py (1)
275-276: Ack: returning True after exhausting retries is intentional.

Not flagging as an error per your stated design to preserve scan continuity.
Also applies to: 363-364
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
tests/core/test_exclude_ports.py (4)
100-105: Same issue: return JSON strings in test_sort_loops.

```diff
- mock_event = MagicMock()
- mock_event.json_event = json.dumps(
-     {"port": 80, "response": {"conditions_results": {"http": True}}}
- )
- mock_find_events.return_value = [mock_event]
+ mock_find_events.return_value = [
+     json.dumps({"port": 80, "response": {"conditions_results": {"http": True}}})
+ ]
```
125-130: Same issue: return JSON strings in test_start_unsupported_library.

```diff
- mock_event = MagicMock()
- mock_event.json_event = json.dumps(
-     {"port": 1234, "response": {"conditions_results": {"unsupported_lib": True}}}
- )
- mock_find_events.return_value = [mock_event]
+ mock_find_events.return_value = [
+     json.dumps(
+         {"port": 1234, "response": {"conditions_results": {"unsupported_lib": True}}}
+     )
+ ]
```
185-190: Same issue: return JSON strings in test_sort_loops_behavior.

```diff
- mock_event = MagicMock()
- mock_event.json_event = json.dumps(
-     {"port": 80, "response": {"conditions_results": {"http": True}}}
- )
- mock_find_events.return_value = [mock_event]
+ mock_find_events.return_value = [
+     json.dumps({"port": 80, "response": {"conditions_results": {"http": True}}})
+ ]
```
313-319: Same issue: return JSON strings for test_load_appends_port_to_existing_protocol.

```diff
- mock_find_events.return_value = [
-     MagicMock(
-         json_event=json.dumps({"port": 80, "response": {"conditions_results": {"http": {}}}})
-     ),
-     MagicMock(
-         json_event=json.dumps({"port": 443, "response": {"conditions_results": {"http": {}}}})
-     ),
- ]
+ mock_find_events.return_value = [
+     json.dumps({"port": 80, "response": {"conditions_results": {"http": {}}}}),
+     json.dumps({"port": 443, "response": {"conditions_results": {"http": {}}}}),
+ ]
```
♻️ Duplicate comments (1)
nettacker/database/db.py (1)
19-19: Avoid shadowing imported symbol; use a distinct logger variable.

Rename the local logger instance to avoid shadowing the imported module symbol and improve readability.

```diff
-logger = logger.get_logger()
+log = logger.get_logger()
```

Apply the same rename for all logger usages in this file.
🧹 Nitpick comments (8)
nettacker/database/db.py (6)
244-264: Minor: BEGIN/ROLLBACK use is inconsistent between functions.

submit_logs_to_db uses connection.execute("BEGIN") while submit_temp_logs_to_db uses cursor.execute("BEGIN"). Same for ROLLBACK. Prefer one style for consistency (APSW allows either). Suggest using connection.execute in both places.

```diff
-    cursor.execute("BEGIN")
+    connection.execute("BEGIN")
@@
-    cursor.execute("ROLLBACK")
+    connection.execute("ROLLBACK")
```

Also applies to: 326-347
424-451: Return type inconsistency in find_temp_events (APSW dict vs ORM object).

APSW branch returns a dict (json.loads), ORM branch returns an ORM row (or None). Downstream code can easily break on mixed types. Choose a single contract (e.g., JSON string or parsed dict) and make both branches conform; update tests accordingly.

Example (return JSON string in both branches):

```diff
-    if row:
-        return json.loads(row[0])
+    if row:
+        return row[0]  # raw JSON string
@@
-    return (
-        session.query(TempEvents)
-        .filter(...)
-        .first()
-    )
+    result = session.query(TempEvents).filter(...).first()
+    return result.event if result else []
```

If the intended contract is dicts, invert accordingly in both branches.
589-607: Unify error return shape in get_scan_result.

Docstring says this returns result content or an error in JSON type. On file I/O errors you currently return None; consider returning structure(status="error", msg="file read error!") for consistency across branches.

```diff
-    except IOError as e:
-        logger.error(f"Failed to read report file: {e}")
-        return None
+    except IOError as e:
+        logger.error(f"Failed to read report file: {e}")
+        return structure(status="error", msg="file read error!")
@@
-    except IOError as e:
-        logger.error(f"Failed to read report file: {e}")
-        return None
+    except IOError as e:
+        logger.error(f"Failed to read report file: {e}")
+        return structure(status="error", msg="file read error!")
```
626-635: DISTINCT with ORDER BY a non-selected column is fragile.

SQLite allows ORDER BY columns not in the SELECT list, but it's ambiguous for DISTINCT. To show the latest per target deterministically, order by MAX(id) per target.

```diff
-    cursor.execute(
-        """
-        SELECT DISTINCT target
-        FROM scan_events
-        ORDER BY id DESC
-        LIMIT 10 OFFSET ?
-        """,
-        [(page - 1) * 10],
-    )
+    cursor.execute(
+        """
+        SELECT target
+        FROM (
+            SELECT target, MAX(id) AS max_id
+            FROM scan_events
+            GROUP BY target
+        )
+        ORDER BY max_id DESC
+        LIMIT 10 OFFSET ?
+        """,
+        [(page - 1) * 10],
+    )
```
773-784
: json_event type mismatch between APSW and ORM paths.

The APSW branch returns dicts for json_event; the ORM branch returns raw strings. Normalize the ORM path to return dicts for parity (or switch APSW to strings, but keep it consistent).

-            "json_event": log.json_event,
+            "json_event": json.loads(log.json_event) if log.json_event else {},
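If the dict contract is adopted, a defensive parse helper keeps malformed rows from raising. This is a sketch; `parse_json_event` is an invented name, not an existing function in the PR:

```python
import json


def parse_json_event(raw):
    """Parse a stored json_event column defensively (illustrative sketch).

    Returns a dict for valid JSON objects and {} for None, empty, or
    malformed input, so both backends can expose the same type.
    """
    if not raw:
        return {}
    if isinstance(raw, dict):  # the APSW branch may already have parsed it
        return raw
    try:
        parsed = json.loads(raw)
    except (TypeError, ValueError):
        return {}
    return parsed if isinstance(parsed, dict) else {}


print(parse_json_event('{"port": 80}'))  # {'port': 80}
print(parse_json_event("not json"))      # {}
```

Routing both branches through one helper like this also gives the tests a single place to pin the contract down.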
1000-1059
: GROUP BY without aggregating non-grouped columns can yield nondeterministic values.

In the APSW search, date is selected but not grouped/aggregated. If determinism matters, consider selecting MAX(date) or removing date from the SELECT and appending it via another query.
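The deterministic "latest row per target" pattern these comments describe can be checked with Python's stdlib sqlite3 module (the schema and data here are invented for illustration; the real table has more columns):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scan_events (id INTEGER PRIMARY KEY, target TEXT, date TEXT)"
)
conn.executemany(
    "INSERT INTO scan_events (target, date) VALUES (?, ?)",
    [("a.com", "2024-01-01"), ("b.com", "2024-01-02"), ("a.com", "2024-01-03")],
)

# One row per target, tied to that target's newest id via aggregation,
# instead of relying on DISTINCT + ORDER BY over a non-selected column.
rows = conn.execute(
    """
    SELECT target, MAX(id) AS max_id, MAX(date) AS date
    FROM scan_events
    GROUP BY target
    ORDER BY max_id DESC
    """
).fetchall()
print(rows)  # [('a.com', 3, '2024-01-03'), ('b.com', 2, '2024-01-02')]
```

Because every selected column is either grouped or aggregated, the result is well-defined on any SQL engine, not just SQLite's bare-column extension.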
tests/database/test_db.py (2)
706-714
: Clarify find_temp_events return type in ORM path.

This test asserts the ORM object is returned, while the APSW path returns a dict. Consider unifying the contract (string or dict) across both paths and updating tests to match.
871-894
: select_reports SQLite exception test mocks the wrong method.

You set mock_cursor.query.side_effect, but the implementation uses cursor.execute. To meaningfully exercise the error path, set execute.side_effect to raise.

-        mock_cursor.query.side_effect = Exception("DB Error")
+        mock_cursor.execute.side_effect = Exception("DB Error")
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (5)
- nettacker/config.py (3 hunks)
- nettacker/database/db.py (17 hunks)
- pyproject.toml (2 hunks)
- tests/core/test_exclude_ports.py (1 hunks)
- tests/database/test_db.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- pyproject.toml
- nettacker/config.py
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-08-31T02:23:10.347Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.347Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.
Applied to files:
tests/database/test_db.py
nettacker/database/db.py
📚 Learning: 2025-08-31T14:41:18.138Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.138Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).
Applied to files:
nettacker/database/db.py
📚 Learning: 2025-08-31T14:26:17.008Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.008Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.
Applied to files:
nettacker/database/db.py
📚 Learning: 2025-08-31T14:37:41.306Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.306Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.
Applied to files:
nettacker/database/db.py
🧬 Code graph analysis (2)
tests/database/test_db.py (3)
nettacker/api/helpers.py (1)
  structure (1-12)
nettacker/database/db.py (17)
  db_inputs (22-39), create_connection (42-81), send_submit_query (84-123), submit_report_to_db (126-173), remove_old_logs (176-223), submit_logs_to_db (226-305), submit_temp_logs_to_db (308-394), find_temp_events (397-452), find_events (455-498), select_reports (501-563), get_scan_result (566-607), last_host_logs (610-730), get_logs_by_scan_id (733-784), get_options_by_scan_id (787-815), logs_to_report_json (818-870), logs_to_report_html (873-979), search_logs (982-1138)
nettacker/config.py (1)
  as_dict (26-27)
nettacker/database/db.py (4)
nettacker/config.py (2)
  Config (191-195), as_dict (26-27)
nettacker/database/models.py (3)
  Report (7-27), HostsLog (67-100), TempEvents (30-64)
nettacker/api/helpers.py (1)
  structure (1-12)
nettacker/core/graph.py (1)
  build_graph (26-49)
🪛 GitHub Actions: CI/CD
tests/core/test_exclude_ports.py
[error] 70-70: Test 'test_load_with_service_discovery' failed due to TypeError in json.loads (MagicMock) when loading service in Module.load().
[error] 108-108: Test 'test_sort_loops' failed due to TypeError in json.loads (MagicMock) when loading service in Module.load().
[error] 135-135: Test 'test_start_unsupported_library' failed due to TypeError in json.loads (MagicMock) when loading service in Module.load().
[error] 193-193: Test 'test_sort_loops_behavior' failed due to TypeError in json.loads (MagicMock) when loading service in Module.load().
[error] 324-324: Test 'test_load_appends_port_to_existing_protocol' failed due to TypeError in json.loads (MagicMock) when loading service in Module.load().
🔇 Additional comments (3)
nettacker/database/db.py (2)
55-59
: Busy-timeout unit may be incorrect; confirm seconds→milliseconds conversion.

APSW's setbusytimeout expects milliseconds. If Config.settings.timeout is in seconds, multiply by 1000 (not 100). Current tests assert 3000 for 30 s, which suggests a unit mismatch. Please verify the intended unit and adjust the code and/or tests accordingly.

-connection.setbusytimeout(int(config.settings.timeout) * 100)
+# Convert seconds to milliseconds for APSW busy timeout
+connection.setbusytimeout(int(config.settings.timeout) * 1000)

Also applies to: 61-65
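The unit mismatch under discussion is easy to pin down with a tiny helper (illustrative only; the function name is invented here, and it assumes the config value is in seconds):

```python
def busy_timeout_ms(timeout_seconds):
    """Convert a seconds-based config value to the milliseconds a busy timeout expects."""
    return int(timeout_seconds) * 1000


# 30 s should become 30000 ms; the *100 factor yields only 3000 ms (3 s).
print(busy_timeout_ms(30))  # 30000
print(int(30) * 100)        # 3000 (the undersized value flagged above)
```

Keeping the conversion in one named helper also gives the unit tests a single, unambiguous thing to assert against.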
483-484
: LGTM: normalized JSON string output for find_events (APSW).

Keeping the return type as JSON strings aligns with callers using json.loads on each item.
tests/database/test_db.py (1)
124-131
: Confirm busy-timeout expectation in tests.

The test expects setbusytimeout(3000) for a 30 second timeout. If the timeout is in seconds, APSW needs milliseconds (30000). Please confirm the intended units and align the code or test accordingly.
…nittests for db.py: 37cfb4b to 4cd2bc1 (Compare)
Actionable comments posted: 3
♻️ Duplicate comments (4)
nettacker/database/db.py (4)
742-784
: json_event type differs across backends.

The SQLite branch returns a parsed dict; the ORM branch returns a raw string. Pick one format (prefer dicts here) for callers' sanity.

Apply:

-            "json_event": log.json_event,
+            "json_event": json.loads(log.json_event) if log.json_event else {},
56-66
: Busy-timeout unit conversion is wrong (should be ms).

APSW expects milliseconds; multiplying by 100 undercuts the intended timeout by 10x. Use 1000 and keep the try/except.

Apply:

-        connection.setbusytimeout(int(config.settings.timeout) * 100)
+        # Convert seconds to milliseconds for APSW busy timeout
+        connection.setbusytimeout(int(config.settings.timeout) * 1000)
96-109
: APSW commit retry closes connection every attempt; breaks retries and leaks cursor.

The connection is closed in a finally inside the loop; the cursor is never closed here. Move cleanup to a single outer finally, add rollback between attempts, and use the configured retry params.

Apply:

-    if isinstance(session, tuple):
-        connection, cursor = session
-        for _ in range(100):
-            try:
-                connection.execute("COMMIT")
-                return True
-            except Exception:
-                connection.execute("ROLLBACK")
-                time.sleep(0.1)
-            finally:
-                connection.close()
-        connection.close()
-        logger.warn(messages("database_connect_fail"))
-        return False
+    if isinstance(session, tuple):
+        connection, cursor = session
+        success = False
+        try:
+            for _ in range(Config.settings.max_retries):
+                try:
+                    connection.execute("COMMIT")
+                    success = True
+                    break
+                except Exception:
+                    try:
+                        connection.execute("ROLLBACK")
+                    except Exception:
+                        pass
+                    time.sleep(Config.settings.retry_delay)
+        finally:
+            try:
+                cursor.close()
+            except Exception:
+                pass
+            try:
+                connection.close()
+            except Exception:
+                pass
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success
111-123
: ORM commit retry lacks rollback and uses hardcoded policy.

Add rollback, use Config.settings.{max_retries,retry_delay}, and return success consistently.

Apply:

-    try:
-        for _ in range(1, 100):
-            try:
-                session.commit()
-                return True
-            except Exception:
-                time.sleep(0.1)
-        logger.warn(messages("database_connect_fail"))
-        return False
-    except Exception:
-        logger.warn(messages("database_connect_fail"))
-        return False
+    success = False
+    for _ in range(Config.settings.max_retries):
+        try:
+            session.commit()
+            success = True
+            break
+        except Exception:
+            session.rollback()
+            time.sleep(Config.settings.retry_delay)
+    if not success:
+        logger.warn(messages("database_connect_fail"))
+    return success
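The retry shape these comments keep asking for — rollback between attempts, configurable count and delay — can be isolated as a backend-agnostic sketch. Names here are illustrative, not from the PR:

```python
import time


def commit_with_retry(commit, rollback, max_retries=3, retry_delay=0.0):
    """Attempt commit() up to max_retries times, rolling back between failures."""
    for _ in range(max_retries):
        try:
            commit()
            return True
        except Exception:
            rollback()
            time.sleep(retry_delay)
    return False


# Simulate a commit that fails twice, then succeeds on the third attempt.
attempts = []


def flaky_commit():
    attempts.append("try")
    if len(attempts) < 3:
        raise RuntimeError("database is locked")


print(commit_with_retry(flaky_commit, rollback=lambda: None))  # True
print(len(attempts))                                           # 3
```

Both the APSW and SQLAlchemy branches could then wrap their engine-specific commit/rollback callables in one shared policy, rather than duplicating the loop twice.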
🧹 Nitpick comments (6)
nettacker/core/app.py (1)
159-163
: Harden subdomain_scan event parsing to be backend-agnostic and key-safe.

json.loads(row) assumes row is a JSON string. With dual backends (APSW/ORM), find_events(...) may yield a dict or an ORM row exposing .json_event. Also, direct indexing into ["response"]["conditions_results"]["content"] can raise on missing keys. Parse defensively and tolerate string/dict/object rows.

Apply this diff within the loop:

-        for row in find_events(target, "subdomain_scan", scan_id):
-            for sub_domain in json.loads(row)["response"]["conditions_results"]["content"]:
+        for row in find_events(target, "subdomain_scan", scan_id):
+            event = self._event_to_dict(row)
+            content = (
+                event.get("response", {})
+                .get("conditions_results", {})
+                .get("content", [])
+            )
+            if not isinstance(content, (list, tuple)):
+                continue
+            for sub_domain in content:

Add this helper (inside the class or module) to normalize rows:

@staticmethod
def _event_to_dict(row):
    if isinstance(row, dict):
        return row
    # ORM row with `json_event`
    if hasattr(row, "json_event") and isinstance(row.json_event, (str, bytes)):
        try:
            return json.loads(row.json_event)
        except Exception:
            return {}
    # Raw JSON string from APSW
    if isinstance(row, (str, bytes)):
        try:
            return json.loads(row)
        except Exception:
            return {}
    return {}

Verification suggestion: confirm the find_events return type(s) for the sqlite/mysql/postgres paths to ensure the helper covers all cases, and that subdomain events consistently populate response.conditions_results.content.

nettacker/database/db.py (5)
46-49
: Docstring inaccurate about returns.

For SQLite you return (connection, cursor); for others you return a Session, never False. Update to reflect actual behavior.
799-815
: Consider returning parsed options not raw JSON.

Minor, but most functions parse JSON fields before returning.

Apply:

-    return [{"options": row[0]} for row in rows]
+    return [{"options": json.loads(row[0])} for row in rows]
982-1018
: Potential DISTINCT/ORDER BY mismatch in search_logs targets query.

Same pattern as last_host_logs; switch to a subquery with MAX(id) ordering to avoid undefined behavior.

Apply:

-        cursor.execute(
-            """
-            SELECT DISTINCT target FROM scan_events
-            WHERE target LIKE ? OR date LIKE ? OR module_name LIKE ?
-                OR port LIKE ? OR event LIKE ? OR scan_unique_id LIKE ?
-            ORDER BY id DESC
-            LIMIT 10 OFFSET ?
-            """,
-            (
-                f"%{query}%",
-                f"%{query}%",
-                f"%{query}%",
-                f"%{query}%",
-                f"%{query}%",
-                f"%{query}%",
-                (page * 10) - 10,
-            ),
-        )
+        cursor.execute(
+            """
+            WITH hits AS (
+                SELECT id, target
+                FROM scan_events
+                WHERE target LIKE ? OR date LIKE ? OR module_name LIKE ?
+                    OR port LIKE ? OR event LIKE ? OR scan_unique_id LIKE ?
+            ),
+            grouped AS (
+                SELECT target, MAX(id) AS max_id
+                FROM hits
+                GROUP BY target
+            )
+            SELECT target FROM grouped
+            ORDER BY max_id DESC
+            LIMIT 10 OFFSET ?
+            """,
+            (*(f"%{query}%",)*6, (page * 10) - 10),
+        )
268-276
: Minor: BusyError branch string check is unnecessary.

apsw.BusyError already indicates busy/locked; the message check is redundant. Handle BusyError uniformly and let other exceptions fall through.

Apply (example for one function; mirror in the other):

-            except apsw.BusyError as e:
-                if "database is locked" in str(e).lower():
+            except apsw.BusyError:
                     logger.warn(
                         f"[Retry {_ + 1}/{Config.settings.max_retries}] Database is locked. Retrying..."
                     )
                     if connection.in_transaction:
                         connection.execute("ROLLBACK")
                     time.sleep(Config.settings.retry_delay)
                     continue
-                else:
-                    if connection.in_transaction:
-                        connection.execute("ROLLBACK")
-                    return False

Also applies to: 347-365
688-691
: Prefer logger.warning over warn (deprecated in stdlib).

If your logger mirrors the stdlib, warning() is preferred. If warn() is intentional in your wrapper, ignore.
Also applies to: 1063-1065, 1134-1135
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (10)
- nettacker/config.py (3 hunks)
- nettacker/core/app.py (1 hunks)
- nettacker/core/graph.py (2 hunks)
- nettacker/core/lib/base.py (1 hunks)
- nettacker/core/module.py (1 hunks)
- nettacker/core/utils/common.py (2 hunks)
- nettacker/database/db.py (17 hunks)
- pyproject.toml (2 hunks)
- tests/core/test_exclude_ports.py (5 hunks)
- tests/database/test_db.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
- nettacker/core/module.py
- tests/database/test_db.py
- pyproject.toml
- nettacker/core/utils/common.py
- nettacker/core/lib/base.py
- nettacker/core/graph.py
- nettacker/config.py
- tests/core/test_exclude_ports.py
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-08-31T02:23:10.347Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.347Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.
Applied to files:
nettacker/database/db.py
📚 Learning: 2025-08-31T14:41:18.138Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.138Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).
Applied to files:
nettacker/database/db.py
📚 Learning: 2025-08-31T14:26:17.008Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.008Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.
Applied to files:
nettacker/database/db.py
📚 Learning: 2025-08-31T14:37:41.306Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.306Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.
Applied to files:
nettacker/database/db.py
🧬 Code graph analysis (1)
nettacker/database/db.py (4)
nettacker/config.py (2)
  Config (191-195), as_dict (26-27)
nettacker/database/models.py (3)
  Report (7-27), HostsLog (67-100), TempEvents (30-64)
nettacker/api/helpers.py (1)
  structure (1-12)
nettacker/core/graph.py (1)
  build_graph (26-49)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: CodeQL (python)
- GitHub Check: CodeQL (javascript)
- GitHub Check: Run tests
🔇 Additional comments (4)
nettacker/database/db.py (4)
4-8
: Good: Optional APSW import guard.

Prevents hard failures when non-SQLite engines are used.
471-488
: find_events APSW/ORM return strings consistently.

Good consistency; APSW normalizes with dumps(loads(.)) to match the ORM's string output.
137-173
: Nice: Parameterized SQL and consistent JSON handling.

Inserts/deletes and report selection use parameter binding and parse JSON fields consistently. Good.
Also applies to: 187-224, 513-563
625-634
: Discard DISTINCT/ORDER BY compatibility concern.

SQLite permits ordering by non-projected columns in a simple SELECT with DISTINCT, so SELECT DISTINCT target … ORDER BY id DESC is valid. (www2.sqlite.org, sqlite.org)

Likely an incorrect or invalid review comment.
Actionable comments posted: 5
♻️ Duplicate comments (2)
nettacker/database/db.py (2)
111-122
: SQLAlchemy commit retry lacks rollback and uses hardcoded policy.

Add rollback between attempts and use the configured retry count/delay. This was raised earlier; reapplying here.

-    try:
-        for _ in range(1, 100):
-            try:
-                session.commit()
-                return True
-            except Exception:
-                time.sleep(0.1)
-        logger.warn(messages("database_connect_fail"))
-        return False
-    except Exception:
-        logger.warn(messages("database_connect_fail"))
-        return False
+    success = False
+    for _ in range(Config.settings.max_retries):
+        try:
+            session.commit()
+            success = True
+            break
+        except Exception:
+            session.rollback()
+            time.sleep(Config.settings.retry_delay)
+    if not success:
+        logger.warn(messages("database_connect_fail"))
+    return success
96-109
: Critical: COMMIT retry loop closes the connection on every attempt.

Closing in a finally inside the loop guarantees subsequent retries operate on a closed connection and also leaks the cursor. Move cleanup outside the loop; add rollback and configurable retry/backoff. Keep a finally (per tests).

-    if isinstance(session, tuple):
-        connection, cursor = session
-        for _ in range(100):
-            try:
-                connection.execute("COMMIT")
-                return True
-            except Exception:
-                connection.execute("ROLLBACK")
-                time.sleep(0.1)
-            finally:
-                connection.close()
-        connection.close()
-        logger.warn(messages("database_connect_fail"))
-        return False
+    if isinstance(session, tuple):
+        connection, cursor = session
+        success = False
+        try:
+            for _ in range(Config.settings.max_retries):
+                try:
+                    connection.execute("COMMIT")
+                    success = True
+                    break
+                except Exception:
+                    try:
+                        connection.execute("ROLLBACK")
+                    except Exception:
+                        pass
+                    time.sleep(Config.settings.retry_delay)
+        finally:
+            try:
+                cursor.close()
+            except Exception:
+                pass
+            try:
+                connection.close()
+            except Exception:
+                pass
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success
🧹 Nitpick comments (4)
nettacker/database/db.py (4)
55-63
: Busy-timeout unit mismatch (seconds→milliseconds).

APSW expects milliseconds. Multiplying by 100 likely under-waits. Use 1000.

-        connection.setbusytimeout(int(config.settings.timeout) * 100)
+        # Convert seconds to milliseconds
+        connection.setbusytimeout(int(config.settings.timeout) * 1000)
285-287
: Typo in log message ("retires" → "retries").

Minor but user-facing.

-        # All retires exhausted but we want to continue operation
+        # All retries exhausted but we want to continue operation
329-336
: Use connection.execute('BEGIN') for consistency.

Elsewhere you use connection-level BEGIN/ROLLBACK. Align here too.

-        if not connection.in_transaction:
-            cursor.execute("BEGIN")
+        if not connection.in_transaction:
+            connection.execute("BEGIN")
770-782
: Unify json_event type across backends (get_logs_by_scan_id).

The ORM branch returns a raw JSON string while APSW returns a dict. Prefer consistent dicts here (you already do so in logs_to_report_json).

-            "json_event": log.json_event,
+            "json_event": json.loads(log.json_event) if log.json_event else {},
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
- nettacker/database/db.py (17 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-08-31T02:23:10.347Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.347Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.
Applied to files:
nettacker/database/db.py
📚 Learning: 2025-08-31T14:41:18.138Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.138Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).
Applied to files:
nettacker/database/db.py
📚 Learning: 2025-08-31T14:26:17.008Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.008Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.
Applied to files:
nettacker/database/db.py
📚 Learning: 2025-08-31T14:37:41.306Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.306Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.
Applied to files:
nettacker/database/db.py
🧬 Code graph analysis (1)
nettacker/database/db.py (4)
nettacker/config.py (2)
  Config (191-195), as_dict (26-27)
nettacker/database/models.py (3)
  Report (7-27), HostsLog (67-100), TempEvents (30-64)
nettacker/api/helpers.py (1)
  structure (1-12)
nettacker/core/graph.py (1)
  build_graph (26-49)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: CodeQL (javascript)
- GitHub Check: CodeQL (python)
🔇 Additional comments (2)
nettacker/database/db.py (2)
4-8
: APSW import guard — good change.

Optional dependency handled cleanly; avoids a hard fail on non-SQLite backends.
884-977
: Minor: parsing vs raw in HTML report.

You pass raw text fields (as intended for HTML). No changes needed beyond the connection close handled above.
Actionable comments posted: 2
🧹 Nitpick comments (8)
docs/Usage.md (8)
635-641
: Tighten "Database" section wording and list formatting
- Remove the comma after “Nettacker” and avoid blank lines breaking the list.
- Keep database names consistent with later sections.
-OWASP Nettacker, currently supports three databases:
-
-- SQLite
-- MySQL
-- PostgreSQL
+OWASP Nettacker currently supports three databases:
+
+- SQLite
+- MySQL
+- PostgreSQL
649-658
: Use fenced code block with language; avoid indented block.

Switch to fenced code with a language for better readability and lint compliance (MD046).

-    engine = "sqlite"
-    name = str(CWD / ".nettacker/data/nettacker.db")
-    host = ""
-    port = ""
-    username = ""
-    password = ""
-    ssl_mode = "disable"
-    journal_mode = "WAL"
-    synchronous_mode = "NORMAL"
+```python
+engine = "sqlite"
+name = str(CWD / ".nettacker/data/nettacker.db")
+host = ""
+port = ""
+username = ""
+password = ""
+ssl_mode = "disable"
+journal_mode = "WAL"
+synchronous_mode = "NORMAL"
+```
659-659
: Polish grammar and add minor durability caveat
- Grammar: “according to need” -> “according to your needs”; “use SQLite database” -> “use the SQLite database”.
- Optional note: NORMAL improves performance but reduces durability on sudden power loss vs FULL.
-These are the default and recommended settings. Feel free to play around and change them according to need. To use SQLite database, ensure that the `engine` value is set to `sqlite` and the `name` is the path to your database. The `journal_mode` and `synchronous_mode` are chosen to be optimal for multithreaded I/O operations.
+These are the default and recommended settings. Feel free to adjust them according to your needs. To use the SQLite database, ensure that `engine` is set to `sqlite` and `name` points to your database file. The `journal_mode` and `synchronous_mode` are chosen to be optimal for multithreaded I/O operations (note: `synchronous=NORMAL` trades some durability vs `FULL`).
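The WAL plus synchronous=NORMAL combination discussed in that note can be exercised with Python's stdlib sqlite3 module; the same PRAGMAs apply whether they are issued through APSW or the stdlib driver (the file path here is a throwaway temp file):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path)

# journal_mode=WAL persists in the database file; synchronous is per connection.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
conn.execute("PRAGMA synchronous=NORMAL")
sync = conn.execute("PRAGMA synchronous").fetchone()[0]

print(mode)  # wal
print(sync)  # 1 (NORMAL)
conn.close()
```

WAL requires a real file (an in-memory database ignores it), which is one reason the default config points `name` at a path on disk.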
661-661
: Remove trailing colon from heading.

Complies with MD026.

-## MySQL configuration:
+## MySQL configuration
666-675
: Add language to fenced code; remove SQLite-only fields from MySQL example
- Add “python” on the fence (MD040).
- Drop journal_mode and synchronous_mode to avoid implying they matter for MySQL. Keep ssl_mode only if the code actually honors it for MySQL.

-```
-    engine = "mysql"
-    name = "nettacker"
-    host = "localhost"
-    port = 3306
-    username = "root"
-    password = "some-password"
-    ssl_mode = "disable"
-    journal_mode = "WAL"
-    synchronous_mode = "NORMAL"
-```
+```python
+engine = "mysql"
+name = "nettacker"
+host = "localhost"
+port = 3306
+username = "root"
+password = "some-password"
+# ssl_mode: use if supported by your setup (e.g., "require")
+```

Also update the note just below to reflect actual support for ssl_mode on MySQL in this codebase. Would you like me to scan the repository to confirm which fields are honored for MySQL before we finalize the doc text?
: Use consistent product name in headingAlign with earlier “PostgreSQL”.
-## Postgres Configuration +## PostgreSQL configuration
650-650
: Clarify path creation for default SQLite locationIf
.nettacker/data/
isn’t auto-created, add a note to create it; if it is auto-created by the app, state that explicitly to prevent user confusion.You can confirm by grepping for
mkdir
orPath(...).mkdir
in database initialization code.
697-697
: SSL guidance consistencyYou recommend
ssl_mode="require"
for PostgreSQL. If MySQL also supports SSL in this codebase, mirror the guidance in the MySQL section for consistency.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
docs/Usage.md
(1 hunks)nettacker/config.py
(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- nettacker/config.py
🧰 Additional context used
🪛 LanguageTool
docs/Usage.md
[grammar] ~638-~638: There might be a mistake here.
Context: ...ntly supports three databases: - SQLite - MySQL - PostgreSQL The default database...
(QB_NEW_EN)
[grammar] ~639-~639: There might be a mistake here.
Context: ...ports three databases: - SQLite - MySQL - PostgreSQL The default database is SQLi...
(QB_NEW_EN)
[grammar] ~646-~646: Ensure spelling is correct
Context: ...r liking. ## SQLite configuration The configruations are for a SQLite wrapper called *APSW...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
[style] ~659-~659: Consider using a less common alternative to make your writing sound more unique and professional.
Context: ...e the default and recommended settings. Feel free to play around and change them according t...
(FEEL_FREE_TO_STYLE_ME)
[grammar] ~659-~659: There might be a mistake here.
Context: ...d and change them according to need. To use SQLite database, ensure that the `engin...
(QB_NEW_EN)
🪛 markdownlint-cli2 (0.17.2)
docs/Usage.md
649-649: Code block style
Expected: fenced; Actual: indented
(MD046, code-block-style)
661-661: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
665-665: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
683-683: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test Docker image
Actionable comments posted: 1
♻️ Duplicate comments (1)
docs/Usage.md (1)
681-691
: Add language to the PostgreSQL fenced block (keep engine = "postgres").

Same as earlier feedback: add a language tag; keep the engine identifier as "postgres" to match the codebase/tests.
-```
+```python
 engine = "postgres"
 name = "nettacker"
 host = "localhost"
 port = 5432
 username = "root"
 password = "some-password"
 ssl_mode = "disable"
 journal_mode = "WAL"
 synchronous_mode = "NORMAL"
🧹 Nitpick comments (6)
docs/Usage.md (6)
634-641
: Tighten wording and fix minor grammar (remove comma after “Nettacker”, avoid “db”).
- Remove the comma after “OWASP Nettacker”.
- Prefer “database” over “db” and simplify phrasing.
-OWASP Nettacker, currently supports three databases:
+OWASP Nettacker currently supports three databases:
...
-The default database is SQLite. You can, however, configure the db to your liking.
+The default database is SQLite. You can configure the database to your needs.
642-656
: Use a fenced code block with a language tag (MD046/MD040).

The SQLite snippet is an indented block; convert to a fenced block with a language for better rendering and lint compliance.

-    engine = "sqlite"
-    name = str(CWD / ".nettacker/data/nettacker.db")
-    host = ""
-    port = ""
-    username = ""
-    password = ""
-    ssl_mode = "disable"
-    journal_mode = "WAL"
-    synchronous_mode = "NORMAL"
+```python
+engine = "sqlite"
+name = str(CWD / ".nettacker/data/nettacker.db")
+host = ""
+port = ""
+username = ""
+password = ""
+ssl_mode = "disable"
+journal_mode = "WAL"
+synchronous_mode = "NORMAL"
+```
659-659
: **Remove trailing colon in heading (MD026).** Trailing punctuation in headings is flagged by markdownlint; drop the colon.

```diff
-## MySQL configuration:
+## MySQL configuration
```
663-673
: **Add language to the fenced MySQL block (MD040).** Specify the language to improve syntax highlighting and satisfy linters.

````diff
-```
+```python
 engine = "mysql"
 name = "nettacker"
 host = "localhost"
 port = 3306
 username = "root"
 password = "some-password"
 ssl_mode = "disable"
 journal_mode = "WAL"
 synchronous_mode = "NORMAL"
````

`693-695`: **Minor grammar nits (comma after introductory phrase; streamline note).** Add a comma after "In this case," and simplify the note sentence.

```diff
-In this case the irrelevant fields are `journal_mode` and `synchronous_mode`. You don't have to update/change/remove them.
+In this case, the irrelevant fields are `journal_mode` and `synchronous_mode`. You don't have to update, change, or remove them.
-**Note**: If you want encryption, then set `ssl_mode` to `require`.
+**Note**: For encryption, set `ssl_mode` to `require`.
```
: Apply markdownlint fixes for MD026/MD040/MD046 in docs/Usage.md
- Remove the trailing colon from the “## MySQL configuration:” header (line 659) to satisfy MD026.
- Add explicit language identifiers (e.g.
ini,
python, orbash) to all
fences in this file to satisfy MD040.- Convert any remaining indented (4-space) code examples to fenced code blocks to satisfy MD046.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
docs/Usage.md
(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.031Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).
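The two learnings above pin down a cross-backend contract: both code paths must hand callers a raw JSON *string*, never a parsed dict. A minimal sketch of that contract, using made-up stand-ins for the APSW row tuple and the SQLAlchemy result object (not Nettacker's actual classes):

```python
import json


def apsw_branch(row):
    """APSW path: row[0] is the raw JSON string stored in the database."""
    return row[0] if row else []


def orm_branch(result):
    """SQLAlchemy path: result.event is the same raw JSON string."""
    return result.event if result else []


class FakeOrmResult:
    """Hypothetical stand-in for a SQLAlchemy TempEvents row."""

    def __init__(self, event):
        self.event = event


raw = '{"response": {"conditions_results": [1, 2]}}'
a = apsw_branch((raw,))
b = orm_branch(FakeOrmResult(raw))

# Both branches return the identical JSON string, so callers can json.loads it.
assert a == b == raw
assert json.loads(a)["response"]["conditions_results"] == [1, 2]
```

Callers such as `nettacker/core/lib/base.py` then do the single `json.loads` themselves, which is why returning a dict from one branch would break them.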
🪛 LanguageTool
docs/Usage.md
[grammar] ~636-~636: There might be a mistake here.
Context: ...ntly supports three databases: - SQLite - MySQL - PostgreSQL The default database...
(QB_NEW_EN)
[grammar] ~637-~637: There might be a mistake here.
Context: ...ports three databases: - SQLite - MySQL - PostgreSQL The default database is SQLi...
(QB_NEW_EN)
[style] ~657-~657: Consider using a less common alternative to make your writing sound more unique and professional.
Context: ...e the default and recommended settings. Feel free to play around and change them according t...
(FEEL_FREE_TO_STYLE_ME)
[grammar] ~657-~657: There might be a mistake here.
Context: ...d and change them according to need. To use SQLite database, ensure that the `engin...
(QB_NEW_EN)
🪛 markdownlint-cli2 (0.17.2)
docs/Usage.md
647-647: Code block style
Expected: fenced; Actual: indented
(MD046, code-block-style)
659-659: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
663-663: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
681-681: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Test Docker 27.5.0-1~ubuntu.24.04~noble image build
- GitHub Check: Test Docker image
…s for backward compatibility
Actionable comments posted: 1
♻️ Duplicate comments (10)
docs/Usage.md (2)
683-693
: **Add language to fenced code block (MD040).** Specify a language for the Postgres config example.

````diff
-```
+```ini
 engine = "postgres"
 name = "nettacker"
 host = "localhost"
 port = 5432
 username = "root"
 password = "some-password"
 ssl_mode = "disable"
 journal_mode = "WAL"
 synchronous_mode = "NORMAL"
````

`657-660`: **Tighten tone and grammar; "SQLite" capitalization; minor wording.** Reword for concision and formality. Also "SQLite" and "lightweight".

```diff
-These are the default and recommended settings. Feel free to play around and change them according to need. To use SQLite database, ensure that the `engine` value is set to `sqlite` and the `name` is the path to your database. The `journal_mode` and `synchronous_mode` are chosen to be optimal for multithreaded I/O operations.
+These are the default and recommended settings. Adjust as needed. To use the SQLite database, set `engine` to `sqlite` and `name` to your database path. `journal_mode` and `synchronous_mode` are tuned for multithreaded I/O.
```

Optional:

```diff
-> Note: You can choose to use a lite wrapper for Sqlite called APSW by setting the `use_apsw_for_sqlite` parameter inside config to True for performance enhancements.
+> Note: You can enable the lightweight SQLite wrapper APSW by setting `use_apsw_for_sqlite` to `True` in the config for performance gains.
```
56-61
: **APSW busy-timeout uses 100 ms per second; should be 1000.** `setbusytimeout` expects milliseconds. Convert seconds → ms.

```diff
- connection.setbusytimeout(int(config.settings.timeout) * 100)
+ # Convert seconds to milliseconds
+ connection.setbusytimeout(int(config.settings.timeout) * 1000)
```
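The same seconds-to-milliseconds conversion can be checked with the standard-library `sqlite3` module, whose `busy_timeout` PRAGMA also takes milliseconds (a sketch; the `timeout_seconds` value is a made-up stand-in for `config.settings.timeout`):

```python
import sqlite3

timeout_seconds = 5  # stand-in for the configured timeout, in seconds

conn = sqlite3.connect(":memory:")
# busy_timeout is specified in milliseconds, so multiply by 1000, not 100.
conn.execute(f"PRAGMA busy_timeout = {timeout_seconds * 1000}")

(applied_ms,) = conn.execute("PRAGMA busy_timeout").fetchone()
assert applied_ms == 5000  # 5 s; multiplying by 100 would yield only 0.5 s
conn.close()
```

APSW's `setbusytimeout` wraps the same underlying SQLite setting, so the `* 1000` conversion carries over unchanged.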
100-113
: **Critical: connection closed inside retry loop; retries operate on a closed handle.** Move the close(s) out of the loop, add a rollback between attempts, and use the configurable retry params.

```diff
-    if isinstance(session, tuple):
-        connection, cursor = session
-        for _ in range(100):
-            try:
-                connection.execute("COMMIT")
-                return True
-            except Exception:
-                connection.execute("ROLLBACK")
-                time.sleep(0.1)
-            finally:
-                connection.close()
-        connection.close()
-        logger.warn(messages("database_connect_fail"))
-        return False
+    if isinstance(session, tuple):
+        connection, cursor = session
+        success = False
+        try:
+            for _ in range(Config.settings.max_retries):
+                try:
+                    connection.execute("COMMIT")
+                    success = True
+                    break
+                except Exception:
+                    try:
+                        connection.execute("ROLLBACK")
+                    except Exception:
+                        pass
+                    time.sleep(Config.settings.retry_delay)
+        finally:
+            try:
+                cursor.close()
+            except Exception:
+                pass
+            try:
+                connection.close()
+            except Exception:
+                pass
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success
@@
-    else:
-        try:
-            for _ in range(1, 100):
-                try:
-                    session.commit()
-                    return True
-                except Exception:
-                    time.sleep(0.1)
-            logger.warn(messages("database_connect_fail"))
-            return False
-        except Exception:
-            logger.warn(messages("database_connect_fail"))
-            return False
+    else:
+        success = False
+        for _ in range(Config.settings.max_retries):
+            try:
+                session.commit()
+                success = True
+                break
+            except Exception:
+                session.rollback()
+                time.sleep(Config.settings.retry_delay)
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success
```

Also applies to: 115-126
147-168
: **Close the APSW connection as well in finally (avoid a leak on both success and error paths).** Currently only the cursor is closed here.

```diff
-    finally:
-        cursor.close()
+    finally:
+        try:
+            cursor.close()
+        finally:
+            connection.close()
```
195-219
: **Close the APSW connection in the remove_old_logs finally block.** Mirror the fix from submit_report_to_db.

```diff
-    finally:
-        cursor.close()
+    finally:
+        try:
+            cursor.close()
+        finally:
+            connection.close()
```
419-419
: **Contract regression: find_temp_events should return a JSON string; also close the connection; fix the ORM branch.** Align with the callers and with find_events; update the docstring; close the APSW connection.

```diff
-    an array with JSON events or an empty array
+    a JSON string or an empty array
@@
-    row = cursor.fetchone()
-    cursor.close()
-    if row:
-        return json.loads(row[0])
+    row = cursor.fetchone()
+    cursor.close()
+    connection.close()
+    if row:
+        return row[0]
     return []
@@
-    return (
-        session.query(TempEvents)
-        .filter(
-            TempEvents.target == target,
-            TempEvents.module_name == module_name,
-            TempEvents.scan_unique_id == scan_id,
-            TempEvents.event_name == event_name,
-        )
-        .first()
-    )
+    result = (
+        session.query(TempEvents)
+        .filter(
+            TempEvents.target == target,
+            TempEvents.module_name == module_name,
+            TempEvents.scan_unique_id == scan_id,
+            TempEvents.event_name == event_name,
+        )
+        .first()
+    )
+    return result.event if result else []
```

Also applies to: 435-439, 445-454
482-486
: **Close the APSW connection after reads (find_events).** Prevent leaked connections on the read-path success and error branches.

```diff
-    rows = cursor.fetchall()
-    cursor.close()
+    rows = cursor.fetchall()
+    cursor.close()
+    connection.close()
```
638-640
: **Close the APSW connection in last_host_logs on both the early-return and success paths.** Avoid leaking the connection.

```diff
-    if not targets:
-        return structure(status="finished", msg="No more search results")
+    if not targets:
+        cursor.close()
+        connection.close()
+        return structure(status="finished", msg="No more search results")
@@
-    cursor.close()
+    cursor.close()
+    connection.close()
     return hosts
```

Also applies to: 687-689
534-545
: **Close the APSW connection after the select_reports fetch.** Add connection.close() after cursor.close().

```diff
-    cursor.close()
+    cursor.close()
+    connection.close()
     for row in rows:
```
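One way to make this whole family of leaks impossible is a small context manager that pairs cursor and connection cleanup. Sketched here with the standard-library `sqlite3` (APSW connections expose the same `close()` methods; the helper name is hypothetical):

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def db_cursor(path=":memory:"):
    """Yield (connection, cursor); close both even if the body raises."""
    connection = sqlite3.connect(path)
    cursor = connection.cursor()
    try:
        yield connection, cursor
    finally:
        cursor.close()
        connection.close()


with db_cursor() as (conn, cur):
    cur.execute("CREATE TABLE reports (id INTEGER, date TEXT)")
    cur.execute("INSERT INTO reports VALUES (1, '2025-01-01')")
    rows = cur.execute("SELECT * FROM reports").fetchall()

assert rows == [(1, "2025-01-01")]

# After the block, the connection is unusable -- it was really closed.
try:
    conn.execute("SELECT 1")
    closed = False
except sqlite3.ProgrammingError:
    closed = True
assert closed
```

With such a wrapper, early returns and exceptions inside the `with` body all funnel through the same `finally`, so none of the per-function close fixes above can be forgotten.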
🧹 Nitpick comments (4)
docs/Usage.md (3)
647-656
: **Fix MD046: switch the indented block to a fenced code block with a language.** Use a fenced block for the SQLite snippet to satisfy markdownlint and improve readability.

````diff
-    engine = "sqlite"
-    name = str(CWD / ".nettacker/data/nettacker.db")
-    host = ""
-    port = ""
-    username = ""
-    password = ""
-    ssl_mode = "disable"
-    journal_mode = "WAL"
-    synchronous_mode = "NORMAL"
+```ini
+engine = "sqlite"
+name = ".nettacker/data/nettacker.db"
+host = ""
+port = ""
+username = ""
+password = ""
+ssl_mode = "disable"
+journal_mode = "WAL"
+synchronous_mode = "NORMAL"
+```
````
661-661
: **Remove the trailing colon from the heading (MD026).** Drop the colon at the end of the MySQL heading.

```diff
-## MySQL configuration:
+## MySQL configuration
```
665-675
: **Add language to fenced code block (MD040).** Specify a language for the MySQL config example.

````diff
-```
+```ini
 engine = "mysql"
 name = "nettacker"
 host = "localhost"
 port = 3306
 username = "root"
 password = "some-password"
 ssl_mode = "disable"
 journal_mode = "WAL"
 synchronous_mode = "NORMAL"
````

nettacker/database/db.py (1)
`589-599`: **Use logger.exception for failures; also close the connection in the APSW branch.** Prefer exception-level logging to capture tracebacks; also close the APSW connection after cursor.close().

```diff
-    cursor.close()
+    cursor.close()
+    connection.close()
@@
-    except IOError as e:
-        logger.error(f"Failed to read report file: {e}")
+    except IOError:
+        logger.exception("Failed to read report file")
     return None
@@
-    except IOError as e:
-        logger.error(f"Failed to read report file: {e}")
+    except IOError:
+        logger.exception("Failed to read report file")
     return None
```

And earlier:

```diff
-    logger.error(f"Failed to create APSW connection: {e}")
+    logger.exception("Failed to create APSW connection")
```

Also applies to: 605-610, 68-69
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- docs/Usage.md (1 hunks)
- nettacker/config.py (4 hunks)
- nettacker/database/db.py (17 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- nettacker/config.py
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-08-31T02:23:10.377Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.377Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.
Applied to files:
nettacker/database/db.py
📚 Learning: 2025-08-31T14:41:18.191Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).
Applied to files:
nettacker/database/db.py
📚 Learning: 2025-08-31T14:26:17.031Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.031Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.
Applied to files:
nettacker/database/db.py
📚 Learning: 2025-08-31T14:37:41.349Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.349Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.
Applied to files:
nettacker/database/db.py
🧬 Code graph analysis (1)
nettacker/database/db.py (4)
- nettacker/config.py (2): Config (194-198), as_dict (26-27)
- nettacker/database/models.py (3): Report (7-27), HostsLog (67-100), TempEvents (30-64)
- nettacker/api/helpers.py (1): structure (1-12)
- nettacker/core/graph.py (1): build_graph (26-49)
🪛 markdownlint-cli2 (0.17.2)
docs/Usage.md
647-647: Code block style
Expected: fenced; Actual: indented
(MD046, code-block-style)
661-661: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
665-665: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
683-683: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🪛 Ruff (0.12.2)
nettacker/database/db.py
- 53: Avoid specifying long messages outside the exception class (TRY003)
- 66: Consider moving this statement to an `else` block (TRY300)
- 68: Use `logging.exception` instead of `logging.error` (TRY400)
- 105: Consider moving this statement to an `else` block (TRY300)
- 106: Do not catch blind exception: `Exception` (BLE001)
- 119: Consider moving this statement to an `else` block (TRY300)
- 120: Do not catch blind exception: `Exception` (BLE001)
- 123: Consider moving this statement to an `else` block (TRY300)
- 124: Do not catch blind exception: `Exception` (BLE001)
- 162: Do not catch blind exception: `Exception` (BLE001)
- 213: Do not catch blind exception: `Exception` (BLE001)
- 282: Do not catch blind exception: `Exception` (BLE001)
- 286-287: `try`-`except`-`pass` detected, consider logging the exception (S110)
- 286: Do not catch blind exception: `Exception` (BLE001)
- 363-364: `try`-`except`-`pass` detected, consider logging the exception (S110)
- 363: Do not catch blind exception: `Exception` (BLE001)
- 371-372: `try`-`except`-`pass` detected, consider logging the exception (S110)
- 371: Do not catch blind exception: `Exception` (BLE001)
- 374: Do not catch blind exception: `Exception` (BLE001)
- 378-379: `try`-`except`-`pass` detected, consider logging the exception (S110)
- 378: Do not catch blind exception: `Exception` (BLE001)
- 439: Consider moving this statement to an `else` block (TRY300)
- 440: Do not catch blind exception: `Exception` (BLE001)
- 486: Consider moving this statement to an `else` block (TRY300)
- 487: Do not catch blind exception: `Exception` (BLE001)
- 544: Consider moving this statement to an `else` block (TRY300)
- 546: Do not catch blind exception: `Exception` (BLE001)
- 563: Do not catch blind exception: `Exception` (BLE001)
- 596: Use `logging.exception` instead of `logging.error` (TRY400)
- 608: Use `logging.exception` instead of `logging.error` (TRY400)
- 688: Consider moving this statement to an `else` block (TRY300)
- 690: Do not catch blind exception: `Exception` (BLE001)
- 1065: Do not catch blind exception: `Exception` (BLE001)
- 1136: Do not catch blind exception: `Exception` (BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Test Docker 26.0.0-1~ubuntu.24.04~noble image build
- GitHub Check: Test Docker 26.1.4-1~ubuntu.24.04~noble image build
- GitHub Check: Test Docker image
- GitHub Check: Build package
Actionable comments posted: 1
♻️ Duplicate comments (1)
docs/Usage.md (1)
673-673
: **Make the tone concise and add the missing article.** Previous feedback still applies; rephrase to a formal tone.

```diff
-These are the default and recommended settings. Feel free to play around and change them according to need. To use SQLite database, ensure that the `engine` value is set to `sqlite` and the `name` is the path to your database. The `journal_mode` and `synchronous_mode` are chosen to be optimal for multithreaded I/O operations.
+These are the default and recommended settings. To use the SQLite database, set `engine` to `sqlite` and `name` to your database path. `journal_mode` and `synchronous_mode` are selected for multithreaded I/O.
```
🧹 Nitpick comments (6)
docs/Usage.md (6)
650-657
: **Tighten grammar and tone in the DB intro.** Remove the comma after "Nettacker" and avoid "db" slang; keep it concise.

```diff
-OWASP Nettacker, currently supports three databases:
+OWASP Nettacker currently supports three databases:
@@
-The default database is SQLite. You can, however, configure the db to your liking.
+The default database is SQLite. You can configure the database to your needs.
```
660-672
: **Fix MD046/MD040: use a fenced code block with a language for the SQLite sample.** Convert the indented block to a fenced block and annotate it with a language (use `ini` for consistency with the other samples).

````diff
-The configurations below are for a SQLite wrapper called **APSW** (Another Python SQLite Wrapper). The configurations can be found inside `nettacker/config.py` file under the `DBConfig` class.
-
-    engine = "sqlite"
-    name = str(CWD / ".nettacker/data/nettacker.db")
-    host = ""
-    port = ""
-    username = ""
-    password = ""
-    ssl_mode = "disable"
-    journal_mode = "WAL"
-    synchronous_mode = "NORMAL"
+The configurations below are for a SQLite wrapper called **APSW** (Another Python SQLite Wrapper). The configurations can be found in `nettacker/config.py` under the `DBConfig` class.
+
+```ini
+engine = "sqlite"
+name = ".nettacker/data/nettacker.db"
+host = ""
+port = ""
+username = ""
+password = ""
+ssl_mode = "disable"
+journal_mode = "WAL"
+synchronous_mode = "NORMAL"
+```
````
678-678
: **Remove trailing colon in heading (MD026).**

```diff
-## MySQL configuration:
+## MySQL configuration
```
681-691
: **Add language to the fenced block (MD040) and drop irrelevant fields to avoid confusion.** Keep the sample minimal; you already note that the irrelevant fields aren't required.

````diff
-```
+```ini
 engine = "mysql"
 name = "nettacker"
 host = "localhost"
 port = 3306
 username = "root"
 password = "some-password"
-ssl_mode = "disable"
-journal_mode = "WAL"
-synchronous_mode = "NORMAL"
````

`699-709`: **Add language to the fenced block (MD040) and retain the engine string "postgres" per the codebase; trim irrelevant fields.** Align with the earlier guidance to keep engine "postgres" and remove SQLite-only knobs from the sample.

````diff
-```
+```ini
 engine = "postgres"
 name = "nettacker"
 host = "localhost"
 port = 5432
 username = "root"
 password = "some-password"
-ssl_mode = "disable"
-journal_mode = "WAL"
-synchronous_mode = "NORMAL"
````

`711-714`: **Clarify which fields are ignored and tighten the SSL note; optionally align the heading to "PostgreSQL".** Be explicit and consistent.

```diff
-In this case the irrelevant fields are `journal_mode` and `synchronous_mode`. You don't have to update/change/remove them.
-
-**Note**: If you want encryption, then set `ssl_mode` to `require`.
+In this backend, `journal_mode` and `synchronous_mode` are ignored.
+
+Note: To require TLS, set `ssl_mode = "require"` (if supported by your driver/environment).
```
Optional: Rename the section heading at Line 695 to “## PostgreSQL configuration” for consistency with common naming.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- docs/Usage.md (1 hunks)
- nettacker/core/graph.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- nettacker/core/graph.py
🧰 Additional context used
📓 Path-based instructions (1)
docs/**
📄 CodeRabbit inference engine (AGENTS.md)
Place documentation under docs/
Files:
docs/Usage.md
🪛 markdownlint-cli2 (0.17.2)
docs/Usage.md
663-663: Code block style
Expected: fenced; Actual: indented
(MD046, code-block-style)
677-677: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
681-681: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
699-699: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Test Docker 27.5.0-1~ubuntu.24.04~noble image build
- GitHub Check: Test Docker 26.0.0-1~ubuntu.24.04~noble image build
- GitHub Check: Test Docker 26.1.4-1~ubuntu.24.04~noble image build
- GitHub Check: Build package
- GitHub Check: Test Docker image
🔇 Additional comments (1)
docs/Usage.md (1)
675-675
: **Capitalize SQLite and tighten the APSW note (flag verified).** Replace the docs line with:

> Note: To use APSW for SQLite, set `use_apsw_for_sqlite = True` in the config for potential performance gains.

Flag defined: nettacker/config.py:151-153. Flag checked: nettacker/database/db.py:51-53.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
nettacker/core/lib/base.py (1)
49-58
: **Avoid an infinite wait; add safe JSON handling.** Polling without a timeout risks hanging if the event never arrives; JSON parsing can also throw. Add a max-wait (or attempt cap) and handle decode/KeyError gracefully before appending.

```diff
-    while True:
-        event = find_temp_events(target, module_name, scan_id, event_name)
-        if event:
-            events.append(json.loads(event)["response"]["conditions_results"])
-            break
-        time.sleep(0.1)
+    max_attempts = 300  # ~30s at 0.1s intervals
+    attempts = 0
+    while attempts < max_attempts:
+        event = find_temp_events(target, module_name, scan_id, event_name)
+        if event:
+            try:
+                payload = json.loads(event)
+                events.append(payload.get("response", {}).get("conditions_results", []))
+                break
+            except (ValueError, TypeError, KeyError):
+                # malformed/partial event; keep waiting briefly
+                pass
+        time.sleep(0.1)
+        attempts += 1
```
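The bounded-wait pattern suggested above can be demonstrated without the database at all; the fake event source below is a hypothetical stand-in for `find_temp_events`:

```python
import json


def make_fake_source(ready_after):
    """Return a poller that yields an event only after N calls."""
    calls = {"n": 0}

    def find_temp_events():
        calls["n"] += 1
        if calls["n"] >= ready_after:
            return json.dumps({"response": {"conditions_results": ["open"]}})
        return None

    return find_temp_events


def wait_for_event(poll, max_attempts=300):
    """Poll up to max_attempts; skip malformed payloads instead of crashing."""
    for _ in range(max_attempts):
        event = poll()
        if event:
            try:
                payload = json.loads(event)
                return payload.get("response", {}).get("conditions_results", [])
            except (ValueError, TypeError):
                pass  # malformed/partial event; keep polling
    return None  # gave up instead of hanging forever


assert wait_for_event(make_fake_source(3)) == ["open"]
assert wait_for_event(lambda: None, max_attempts=5) is None
```

The second assertion is the important one: an event that never arrives now yields `None` after a bounded number of attempts rather than an infinite loop.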
♻️ Duplicate comments (5)
nettacker/database/db.py (5)
56-60
: **Use milliseconds for the APSW busy timeout.** APSW's `setbusytimeout` expects milliseconds; multiplying by 100 under-waits by 10x for second-based configs.

```diff
- connection.setbusytimeout(int(config.settings.timeout) * 100)
+ # seconds -> milliseconds
+ connection.setbusytimeout(int(config.settings.timeout) * 1000)
```
: APSW commit retry closes connection inside the loop (breaks retries) and can leak cursorClosing per-iteration invalidates subsequent attempts. Close once after the loop; keep the finally for tests but outside the retry. Also use configured retry/delay.
- if isinstance(session, tuple): - connection, cursor = session - for _ in range(100): - try: - connection.execute("COMMIT") - return True - except Exception: - connection.execute("ROLLBACK") - time.sleep(0.1) - finally: - connection.close() - connection.close() - logger.warn(messages("database_connect_fail")) - return False + if isinstance(session, tuple): + connection, cursor = session + success = False + try: + for _ in range(Config.settings.max_retries): + try: + connection.execute("COMMIT") + success = True + break + except Exception: + try: + connection.execute("ROLLBACK") + except Exception: + pass + time.sleep(Config.settings.retry_delay) + finally: + try: + cursor.close() + except Exception: + pass + try: + connection.close() + except Exception: + pass + if not success: + logger.warn(messages("database_connect_fail")) + return success
642-644
: **Close the APSW cursor/connection on the early return.** This path leaks resources when no targets are found.

```diff
-    if not targets:
-        return structure(status="finished", msg="No more search results")
+    if not targets:
+        cursor.close()
+        connection.close()
+        return structure(status="finished", msg="No more search results")
```
115-126
: **The ORM commit retry lacks a rollback and uses a hardcoded policy.** Add a rollback between attempts and read the retry config.

```diff
-    try:
-        for _ in range(1, 100):
-            try:
-                session.commit()
-                return True
-            except Exception:
-                time.sleep(0.1)
-        logger.warn(messages("database_connect_fail"))
-        return False
-    except Exception:
-        logger.warn(messages("database_connect_fail"))
-        return False
+    for _ in range(Config.settings.max_retries):
+        try:
+            session.commit()
+            return True
+        except Exception:
+            try:
+                session.rollback()
+            except Exception:
+                pass
+            time.sleep(Config.settings.retry_delay)
+    logger.warn(messages("database_connect_fail"))
+    return False
```
780-792
: **Unify the json_event type across backends (parse to dict in the ORM path).** The APSW branch returns parsed dicts; the ORM path returns a raw string. Parse for consistency.

```diff
-    "json_event": log.json_event,
+    "json_event": json.loads(log.json_event) if log.json_event else {},
```
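A defensive normalizer makes the two backends agree regardless of which one produced the row. The helper name below is illustrative, not part of Nettacker's API:

```python
import json


def normalize_json_event(value):
    """Return json_event as a dict whether the backend gave a str or a dict."""
    if not value:
        return {}
    if isinstance(value, str):
        return json.loads(value)  # ORM path: raw JSON string from the column
    return value  # APSW path: already parsed


# ORM branch: raw string from the column.
assert normalize_json_event('{"port": 80}') == {"port": 80}
# APSW branch: already a dict.
assert normalize_json_event({"port": 80}) == {"port": 80}
# Empty/NULL column.
assert normalize_json_event(None) == {}
```

Routing both branches through one such function keeps downstream consumers (report builders, API serializers) from having to guess the type.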
🧹 Nitpick comments (3)
report.html (1)
1-1
: **Make the report template valid HTML and easier to populate.** This one-line blob is invalid HTML and hard to maintain. Replace it with a minimal, valid skeleton with placeholders the renderer fills.

```diff
-<table><graph_html>/*css*/</table>datetargetmodule_nameportlogsjson_event<tr>nowx</tr></table><div id="json_length">1</div><p class="footer">Software Details: OWASP Nettacker version 1.0 [beta] in now ScanID: scan-id</p><script>/*js*/</script>
+<!doctype html>
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <style>/*css*/</style>
+    <title>Nettacker Report</title>
+  </head>
+  <body>
+    <div id="graph"><graph_html></graph_html></div>
+    <table id="events">
+      <thead><tr><th>date</th><th>target</th><th>module_name</th><th>port</th><th>logs</th><th>json_event</th></tr></thead>
+      <tbody><!-- rows injected here --></tbody>
+    </table>
+    <div id="json_length">1</div>
+    <p class="footer">Software Details: OWASP Nettacker version 1.0 [beta] in now ScanID: scan-id</p>
+    <script>/*js*/</script>
+  </body>
+</html>
```
706-722
: **Align the expected return type for the SQLAlchemy path of find_temp_events (JSON string, not dict).** The implementation returns a JSON string (`result.event`). Update the mock and assertion accordingly.

```diff
-    fake_result = MagicMock()
-    fake_result.event = {"foo": "bar"}
+    fake_result = MagicMock()
+    fake_result.event = '{"foo":"bar"}'
@@
-    assert result == {"foo": "bar"}
+    assert result == '{"foo":"bar"}'
```
860-869
: **The exception is set on an unused attribute.** `select_reports` uses `cursor.execute(...)`, not `cursor.query`. Patch `execute.side_effect` so the test actually exercises the error path.

```diff
-    mock_cursor.query.side_effect = Exception("DB Error")
+    mock_cursor.execute.side_effect = Exception("DB Error")
```
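The distinction matters because MagicMock silently accepts attributes the code under test never reads. A minimal standalone reproduction with `unittest.mock` (names mirror the test, but this is a sketch, not the test file itself):

```python
from unittest.mock import MagicMock

mock_cursor = MagicMock()

# Setting side_effect on an attribute the code never calls changes nothing:
mock_cursor.query.side_effect = Exception("DB Error")
mock_cursor.execute("SELECT 1")  # still succeeds -- wrong attribute patched

# Patch the method the code under test actually invokes:
mock_cursor.execute.side_effect = Exception("DB Error")
try:
    mock_cursor.execute("SELECT 1")
    raised = False
except Exception:
    raised = True
assert raised
```

Without the fix, the test's "error path" assertion passes vacuously because no exception is ever raised.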
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- nettacker/core/lib/base.py (1 hunks)
- nettacker/database/db.py (17 hunks)
- report.html (1 hunks)
- tests/database/test_db.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py
: Use 4-space indents in Python code
Limit lines to 99 characters (ruff/ruff-format/isort profile=black)
Module and file names should use lower_snake_case
Function and variable names should use lower_snake_case
Class names should use PascalCase
Constants should use UPPER_SNAKE_CASE
Keep functions small and add type hints where practical
Files:
nettacker/core/lib/base.py
nettacker/database/db.py
tests/database/test_db.py
nettacker/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Add docstrings for public APIs in the nettacker package
Files:
nettacker/core/lib/base.py
nettacker/database/db.py
nettacker/core/**
📄 CodeRabbit inference engine (AGENTS.md)
Place core libraries under nettacker/core/
Files:
nettacker/core/lib/base.py
tests/**/test_*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/test_*.py
: Place tests under tests/ and name files as test_*.py
Use pytest (with pytest-asyncio, xdist) for tests
Files:
tests/database/test_db.py
tests/**
📄 CodeRabbit inference engine (AGENTS.md)
Mirror package layout in tests (e.g., tests/core/, tests/lib/, etc.)
Files:
tests/database/test_db.py
🧠 Learnings (4)
📚 Learning: 2025-08-31T14:41:18.191Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).
Applied to files:
nettacker/core/lib/base.py
nettacker/database/db.py
📚 Learning: 2025-08-31T14:26:17.031Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.031Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.
Applied to files:
nettacker/core/lib/base.py
nettacker/database/db.py
📚 Learning: 2025-08-31T02:23:10.377Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.377Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.
Applied to files:
nettacker/database/db.py
tests/database/test_db.py
📚 Learning: 2025-08-31T14:37:41.349Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.349Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.
Applied to files:
nettacker/database/db.py
🧬 Code graph analysis (2)
nettacker/database/db.py (4)
- nettacker/config.py (1): Config (194-198)
- nettacker/database/models.py (3): Report (7-27), HostsLog (67-100), TempEvents (30-64)
- nettacker/api/helpers.py (1): structure (1-12)
- nettacker/core/graph.py (1): build_graph (28-51)
tests/database/test_db.py (4)
- nettacker/api/helpers.py (1): structure (1-12)
- nettacker/database/db.py (17): db_inputs (22-40), create_connection (43-85), send_submit_query (88-127), submit_report_to_db (130-178), remove_old_logs (181-229), submit_logs_to_db (232-313), submit_temp_logs_to_db (316-403), find_temp_events (406-455), find_events (458-502), select_reports (505-568), get_scan_result (571-613), last_host_logs (616-737), get_logs_by_scan_id (740-792), get_options_by_scan_id (795-824), logs_to_report_json (827-880), logs_to_report_html (883-990), search_logs (993-1150)
- tests/database/test_sqlite.py (1): mock_config (12-14)
- nettacker/config.py (1): as_dict (26-27)
🪛 Ruff (0.12.2)
nettacker/database/db.py
- TRY003 (line 53): Avoid specifying long messages outside the exception class
- TRY300 (lines 66, 105, 119, 123, 438, 488, 547, 693): Consider moving this statement to an `else` block
- TRY400 (lines 68, 600, 612): Use `logging.exception` instead of `logging.error`
- BLE001 (lines 106, 120, 124, 162, 214, 284, 288, 363, 371, 374, 378, 439, 489, 549, 566, 695, 1075, 1146): Do not catch blind exception: `Exception`
- S110 (lines 288-289, 363-364, 371-372, 378-379): `try`-`except`-`pass` detected, consider logging the exception
tests/database/test_db.py
- S108 (lines 34, 248: "/tmp/test_report.json"; 116, 126: "/tmp/test.db"; 832, 852, 879, 892, 917, 933: "/tmp/report.json"; 943, 948, 951: "/tmp/mock_report.json"): Probable insecure usage of temporary file or directory
- ARG002 (lines 185, 214, 376, 548, 693, 1150: `mock_messages`; 383, 476, 495: `mock_submit`; 598: `mock_temp`): Unused method argument
- TRY002 (line 188): Create your own exception
- TRY003 (line 188): Avoid specifying long messages outside the exception class
Actionable comments posted: 2
♻️ Duplicate comments (4)
nettacker/database/db.py (4)
56-66: APSW busy-timeout uses 100×; APSW expects milliseconds.

If `config.settings.timeout` is seconds, multiply by 1000, not 100. The current value shortens the wait and can inflate BusyError retries.

```diff
-            connection.setbusytimeout(int(config.settings.timeout) * 100)
+            connection.setbusytimeout(int(config.settings.timeout) * 1000)
```

Also applies to: 59-59
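The milliseconds/seconds mixup above is easy to reproduce with the standard-library `sqlite3` module as a stand-in for APSW (`PRAGMA busy_timeout`, like APSW's `setbusytimeout`, takes milliseconds); the helper name below is hypothetical:

```python
import sqlite3

def set_busy_timeout(connection, timeout_seconds):
    # SQLite's busy timeout is expressed in milliseconds, so a seconds-based
    # config value must be scaled by 1000; scaling by 100 would make SQLite
    # wait only a tenth of the intended time before raising "database is locked".
    timeout_ms = int(timeout_seconds) * 1000
    connection.execute(f"PRAGMA busy_timeout = {timeout_ms}")
    return timeout_ms

conn = sqlite3.connect(":memory:")
set_busy_timeout(conn, 3)
# Read back the value SQLite actually stored.
print(conn.execute("PRAGMA busy_timeout").fetchone()[0])  # 3000
```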
19-19: Rename local logger to avoid shadowing the imported module.

`logger = logger.get_logger()` overwrites the imported `logger` module symbol. Use a distinct name (e.g., `log`) and update references in this file.

```diff
-logger = logger.get_logger()
+log = logger.get_logger()
```
100-114: Retry loop closes the connection each iteration; breaks retries.

`finally: connection.close()` runs on every loop pass, so subsequent retries operate on a closed connection. Close once after the loop; use configured retry params.

```diff
-    if isinstance(session, tuple):
-        connection, cursor = session
-        for _ in range(100):
-            try:
-                connection.execute("COMMIT")
-                return True
-            except Exception:
-                connection.execute("ROLLBACK")
-                time.sleep(0.1)
-            finally:
-                connection.close()
-        connection.close()
-        logger.warn(messages("database_connect_fail"))
-        return False
+    if isinstance(session, tuple):
+        connection, cursor = session
+        success = False
+        try:
+            for _ in range(Config.settings.max_retries):
+                try:
+                    connection.execute("COMMIT")
+                    success = True
+                    break
+                except Exception:
+                    try:
+                        connection.execute("ROLLBACK")
+                    except Exception:
+                        pass
+                    time.sleep(Config.settings.retry_delay)
+        finally:
+            try:
+                cursor.close()
+            finally:
+                connection.close()
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success
```
115-126: SQLAlchemy commit loop lacks rollback and uses hardcoded policy.

Add `rollback()` and use `Config.settings.{max_retries,retry_delay}`.

```diff
-    else:
-        try:
-            for _ in range(1, 100):
-                try:
-                    session.commit()
-                    return True
-                except Exception:
-                    time.sleep(0.1)
-            logger.warn(messages("database_connect_fail"))
-            return False
-        except Exception:
-            logger.warn(messages("database_connect_fail"))
-            return False
+    else:
+        success = False
+        for _ in range(Config.settings.max_retries):
+            try:
+                session.commit()
+                success = True
+                break
+            except Exception:
+                session.rollback()
+                time.sleep(Config.settings.retry_delay)
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success
```
🧹 Nitpick comments (6)
nettacker/database/db.py (6)
291-293: Typo in log message ("retires").

Fix the spelling to "retries" for clarity.

```diff
-    # All retires exhausted but we want to continue operation
+    # All retries exhausted but we want to continue operation
```
337-338: Use connection for BEGIN to match other APSW paths.

Be consistent with `submit_logs_to_db`, which uses `connection.execute("BEGIN")`.

```diff
-            if not connection.in_transaction:
-                cursor.execute("BEGIN")
+            if not connection.in_transaction:
+                connection.execute("BEGIN")
```
406-418: Docstring return type is inaccurate for find_temp_events.

The function returns a single JSON string (or []) due to `LIMIT 1`, not "an array".

```diff
-    Returns:
-        an array with JSON events or an empty array
+    Returns:
+        a JSON string (event) or [] if not found
```
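The JSON-string contract described in the learnings and the comment above can be sketched like this (toy stand-in functions, not the actual Nettacker implementations):

```python
import json

def find_temp_event(rows):
    # APSW-style branch: each row is a tuple and row[0] is the raw JSON
    # string stored in the event column; it is returned unparsed.
    for row in rows:
        return row[0]
    return []

def caller(event):
    # Calling code expects a JSON *string* and does the parsing itself,
    # so returning an already-parsed dict here would break it.
    return json.loads(event) if event else None

stored = json.dumps({"port": 80, "open": True})
event = find_temp_event([(stored,)])
print(type(event).__name__)  # str
print(caller(event))  # {'port': 80, 'open': True}
```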
780-792: Inconsistent json_event type between APSW and ORM branches.

APSW returns a dict via `json.loads`; the ORM branch returns a raw string. Normalize to dict.

```diff
-                    "json_event": log.json_event,
+                    "json_event": json.loads(log.json_event) if log.json_event else {},
```
68-68: Prefer logger.exception() to capture stack traces.

Improves diagnosability over plain error logging.

```diff
-        logger.error(f"Failed to create APSW connection: {e}")
+        logger.exception(f"Failed to create APSW connection: {e}")
@@
-        logger.error(f"Failed to read report file: {e}")
+        logger.exception(f"Failed to read report file: {e}")
@@
-        logger.error(f"Failed to read report file: {e}")
+        logger.exception(f"Failed to read report file: {e}")
```

Also applies to: 600-600, 612-612
1033-1040: GROUP BY with a non-aggregated date is undefined in SQLite; select MAX(date).

Prevents arbitrary date values per group.

```diff
-            SELECT date, module_name, port, event, json_event FROM scan_events
+            SELECT MAX(date) AS date, module_name, port, event, json_event FROM scan_events
             WHERE target = ?
             GROUP BY module_name, port, scan_unique_id, event
```
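A quick standard-library `sqlite3` demonstration of why the aggregate matters, with the schema reduced to the relevant columns:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE scan_events (target TEXT, module_name TEXT, port INTEGER, date TEXT);
    INSERT INTO scan_events VALUES ('host', 'ping', 80, '2024-01-01');
    INSERT INTO scan_events VALUES ('host', 'ping', 80, '2024-06-01');
    """
)
# A bare (non-aggregated) "date" would come from an arbitrary row of the
# group; MAX(date) pins the result to the latest date deterministically.
row = conn.execute(
    "SELECT MAX(date) AS date, module_name, port FROM scan_events "
    "WHERE target = ? GROUP BY module_name, port",
    ("host",),
).fetchone()
print(row)  # ('2024-06-01', 'ping', 80)
```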
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
nettacker/database/db.py (17 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Use 4-space indents in Python code
Limit lines to 99 characters (ruff/ruff-format/isort profile=black)
Module and file names should use lower_snake_case
Function and variable names should use lower_snake_case
Class names should use PascalCase
Constants should use UPPER_SNAKE_CASE
Keep functions small and add type hints where practical
Files:
nettacker/database/db.py
nettacker/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
Add docstrings for public APIs in the nettacker package
Files:
nettacker/database/db.py
🧠 Learnings (4)
📚 Learning: 2025-08-31T02:23:10.377Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.377Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.
Applied to files:
nettacker/database/db.py
📚 Learning: 2025-08-31T14:41:18.191Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).
Applied to files:
nettacker/database/db.py
📚 Learning: 2025-08-31T14:26:17.031Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.031Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.
Applied to files:
nettacker/database/db.py
📚 Learning: 2025-08-31T14:37:41.349Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.349Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.
Applied to files:
nettacker/database/db.py
🧬 Code graph analysis (1)
nettacker/database/db.py (4)
- nettacker/config.py (2): Config (194-198), as_dict (26-27)
- nettacker/database/models.py (3): Report (7-27), HostsLog (67-100), TempEvents (30-64)
- nettacker/api/helpers.py (1): structure (1-12)
- nettacker/core/graph.py (1): build_graph (28-51)
🪛 Ruff (0.12.2)
nettacker/database/db.py
- TRY003 (line 53): Avoid specifying long messages outside the exception class
- TRY300 (lines 66, 105, 119, 123, 438, 488, 547, 693): Consider moving this statement to an `else` block
- TRY400 (lines 68, 600, 612): Use `logging.exception` instead of `logging.error`
- BLE001 (lines 106, 120, 124, 162, 214, 284, 288, 363, 371, 374, 378, 439, 489, 549, 566, 695, 1075, 1079, 1151): Do not catch blind exception: `Exception`
- S110 (lines 288-289, 363-364, 371-372, 378-379, 1079-1080): `try`-`except`-`pass` detected, consider logging the exception
🔇 Additional comments (4)
nettacker/database/db.py (4)
141-179: Report insertion path looks solid. Transaction boundaries, parameter binding, and cleanup are correct.

192-230: Old logs removal path looks good. Uses a parameterized DELETE and the shared commit helper.

470-503: Consistent JSON-string returns across backends. `find_events` aligns APSW and ORM to return JSON strings, matching calling-code expectations.

803-825: Keep returning raw JSON strings; no change required. Call sites (nettacker/core/graph.py:340-344) and tests (tests/database/test_db.py:1036-1040) call `json.loads(item["options"])`, so the current implementation returning raw JSON strings is correct.
```python
        except Exception:
            try:
                if connection.in_transaction:
                    connection.execute("ROLLBACK")
            except Exception:
                pass
            return False
    # All retires exhausted but we want to continue operation
```
🛠️ Refactor suggestion
Broad excepts hide root causes and can skip cleanup.
Narrow exceptions where possible (e.g., `apsw.BusyError`, `apsw.ConstraintError`, SQLAlchemy `SQLAlchemyError`) and ensure cleanup via `finally`. At minimum, log with `logger.exception(...)`.
Also applies to: 375-380, 439-441, 489-491, 695-697, 1075-1081
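The shape this refactor suggests can be sketched with the standard-library `sqlite3` standing in for APSW (table name and function are hypothetical): specific driver errors are caught, the stack trace is logged, and `finally` guarantees no transaction is left open.

```python
import logging
import sqlite3

log = logging.getLogger(__name__)

def insert_row(connection, value):
    """Narrow exception handling with guaranteed rollback."""
    try:
        connection.execute("BEGIN")
        connection.execute("INSERT INTO t (x) VALUES (?)", (value,))
        connection.execute("COMMIT")
        return True
    except (sqlite3.OperationalError, sqlite3.IntegrityError):
        # Narrow catch: programming errors still surface; log.exception
        # records the stack trace for the ones we expect.
        log.exception("insert failed")
        return False
    finally:
        if connection.in_transaction:
            connection.execute("ROLLBACK")

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (x INTEGER UNIQUE)")
print(insert_row(conn, 1))  # True
print(insert_row(conn, 1))  # False: UNIQUE violation caught, logged, rolled back
```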
🧰 Tools
🪛 Ruff (0.12.2)
- BLE001 (line 284): Do not catch blind exception: `Exception`
- S110 (lines 288-289): `try`-`except`-`pass` detected, consider logging the exception
- BLE001 (line 288): Do not catch blind exception: `Exception`
🤖 Prompt for AI Agents
In nettacker/database/db.py around lines 284-291 (and similarly at 375-380,
439-441, 489-491, 695-697, 1075-1081), the code uses broad bare excepts which
hide root causes and can skip cleanup; replace those bare except Exception
blocks with narrow exception catches (e.g., apsw.BusyError,
apsw.ConstraintError, sqlalchemy.SQLAlchemyError or the specific DB driver
exceptions you expect), add logger.exception(...) inside the except to record
the stack trace, and move transaction cleanup (connection.execute("ROLLBACK"))
into a finally block (or ensure it runs in an except/finally sequence) so
rollback always executes when needed; if multiple exception types require
different handling, catch them separately and re-raise or return appropriately
while still logging.
```python
        if not targets:
            return structure(status="finished", msg="No more search results")
```
|
Close APSW resources on early return.
Leak: cursor/connection not closed when no targets.
```diff
-        if not targets:
-            return structure(status="finished", msg="No more search results")
+        if not targets:
+            cursor.close()
+            connection.close()
+            return structure(status="finished", msg="No more search results")
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
        if not targets:
            cursor.close()
            connection.close()
            return structure(status="finished", msg="No more search results")
```
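An alternative to closing resources by hand before each return is to scope them with `contextlib.closing`, so the early "no results" return above cannot leak. A sketch with the standard-library `sqlite3` and a simplified schema (not the actual `search_logs` implementation):

```python
import sqlite3
from contextlib import closing

def search_targets(connection, query):
    # The cursor is closed on every exit path, including the early return.
    with closing(connection.cursor()) as cursor:
        cursor.execute(
            "SELECT target FROM scan_events WHERE target LIKE ?",
            (f"%{query}%",),
        )
        targets = [row[0] for row in cursor.fetchall()]
        if not targets:
            return {"status": "finished", "msg": "No more search results"}
        return {"status": "ok", "targets": targets}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scan_events (target TEXT)")
conn.execute("INSERT INTO scan_events VALUES ('example.com')")
print(search_targets(conn, "nope")["status"])      # finished
print(search_targets(conn, "example")["targets"])  # ['example.com']
```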
🤖 Prompt for AI Agents
In nettacker/database/db.py around lines 642 to 644 the function returns early
when no targets without closing APSW resources; ensure the cursor and connection
are closed before the return to avoid leaks — either call cursor.close() and
connection.close() just before returning or refactor the function to use a
try/finally (or context manager) that always closes the cursor and connection,
then return the structure(status="finished", msg="No more search results") after
resources are cleaned up.
Proposed change
This PR moves SQLite database operations from SQLAlchemy's ORM to raw APSW SQL queries to reduce computational overhead. It includes the necessary supporting changes in other files and unit tests for db.py.
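To illustrate the kind of change involved, here is a sketch using the standard-library `sqlite3` and a simplified schema, not the actual ported code:

```python
import sqlite3

# ORM style being replaced (SQLAlchemy, roughly):
#     session.query(HostsLog).filter(HostsLog.target == target).all()
# Raw-SQL style after the port: parameter binding is kept, ORM overhead is not.
def find_logs(connection, target):
    cursor = connection.cursor()
    try:
        cursor.execute(
            "SELECT target, port, event FROM scan_events WHERE target = ?",
            (target,),
        )
        return cursor.fetchall()
    finally:
        cursor.close()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scan_events (target TEXT, port INTEGER, event TEXT)")
conn.execute("INSERT INTO scan_events VALUES ('127.0.0.1', 22, 'open')")
print(find_logs(conn, "127.0.0.1"))  # [('127.0.0.1', 22, 'open')]
```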
Type of change
Checklist
- `make pre-commit`: it didn't generate any changes
- `make test`: all tests passed locally