
Conversation

pUrGe12
Contributor

@pUrGe12 pUrGe12 commented Jun 4, 2025

Proposed change

This PR moves SQLite database operations from SQLAlchemy's ORM to raw APSW SQL queries to reduce computational overhead. It includes the necessary supporting changes in other files and unit tests for db.py.

Type of change

  • New core framework functionality
  • Bugfix (non-breaking change which fixes an issue)
  • Code refactoring without any functionality changes
  • New or existing module/payload change
  • Localization improvement
  • Dependency upgrade
  • Documentation improvement

Checklist

  • I've followed the contributing guidelines
  • I've run make pre-commit, it didn't generate any changes
  • I've run make test, all tests passed locally

@securestep9 securestep9 added the gsoc (GSoC work) label on Jun 25, 2025
@securestep9
Collaborator

@CodeRabbit review

Contributor

coderabbitai bot commented Aug 1, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

coderabbitai bot commented Aug 1, 2025

Summary by CodeRabbit

  • New Features
    • Optional high-performance SQLite mode via APSW with configurable journal and sync settings.
    • Added minimal HTML report template for log rendering.
  • Improvements
    • More robust log merging, event parsing, and HTML escaping for reports.
    • Configurable retry behavior for database operations.
  • Documentation
    • Updated database configuration guide (SQLite, MySQL, PostgreSQL) with APSW option and defaults.
  • Tests
    • Extensive database tests covering SQLite (APSW/ORM), MySQL, and PostgreSQL paths.
  • Chores
    • Added dependencies: apsw and pytest-asyncio.

Walkthrough

Adds optional APSW-backed SQLite support with PRAGMA settings and retry parameters, shifts several call sites to consume raw event/row/service JSON values, updates log-merge to parse embedded json_event strings, adds tests, docs, deps, and a minimal HTML report template. No public API signatures changed.

Changes

Cohort / File(s) Summary
Configuration updates
nettacker/config.py
Added DbConfig.journal_mode and DbConfig.synchronous_mode; added DefaultSettings.use_apsw_for_sqlite, DefaultSettings.max_retries, and DefaultSettings.retry_delay; updated SQLite docstrings and default DB path guidance.
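A minimal sketch of the settings this summary describes (attribute names are taken from the summary itself; the real DbConfig/DefaultSettings classes in nettacker/config.py carry many more fields):

```python
class DbConfig:
    # SQLite PRAGMAs applied on each APSW connection
    journal_mode = "WAL"          # write-ahead log: concurrent readers + one writer
    synchronous_mode = "NORMAL"   # good durability without fsync on every commit


class DefaultSettings:
    use_apsw_for_sqlite = True    # toggle between APSW and the SQLAlchemy ORM path
    max_retries = 3               # attempts before giving up on a locked database
    retry_delay = 0.1             # seconds slept between attempts
```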
Core JSON source adjustments
nettacker/core/app.py, nettacker/core/graph.py, nettacker/core/module.py, nettacker/core/lib/base.py
Parsing changed to use raw row/event/service values (JSON strings or dicts) instead of .json_event attributes; graph.py now passes full event dicts to merge_logs_to_list and escapes json.dumps(event) for HTML.
Log merging utility enhancement
nettacker/core/utils/common.py
Added import json; merge_logs_to_list now pre-parses result["json_event"] when present and not already a dict before merging logs.
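The pre-parse step described here can be sketched as a standalone helper (the helper name is illustrative; the actual logic is inlined in merge_logs_to_list):

```python
import json


def normalize_json_event(result):
    # Parse result["json_event"] only when it is still a JSON string, so the
    # operation is idempotent for rows that already carry parsed dicts and a
    # no-op for rows without the key.
    if "json_event" in result and not isinstance(result["json_event"], dict):
        result["json_event"] = json.loads(result["json_event"])
    return result
```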
Database dual-backend (APSW + ORM)
nettacker/database/db.py
Introduced optional APSW import guard and APSW-backed SQLite path; create_connection may return (conn, cursor) for APSW or ORM Session otherwise; DB operations branch to raw SQL + explicit transactions/retries for APSW, preserving ORM flows for other backends.
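The dual-backend branching can be sketched roughly as follows, with the stdlib sqlite3 module standing in for APSW so the example is runnable anywhere; the real create_connection also sets a busy timeout and reads the PRAGMA values from Config.db, and returns a SQLAlchemy Session for other engines:

```python
import sqlite3


def create_connection(engine, db_path=":memory:"):
    # SQLite path: return a (connection, cursor) tuple; callers branch on
    # isinstance(session, tuple) to pick raw-SQL vs ORM code paths.
    if engine.startswith("sqlite"):
        conn = sqlite3.connect(db_path)
        cur = conn.cursor()
        cur.execute("PRAGMA journal_mode=WAL")
        cur.execute("PRAGMA synchronous=NORMAL")
        return conn, cur
    # Non-SQLite engines use a SQLAlchemy Session in the real code.
    raise NotImplementedError("non-SQLite engines return an ORM Session")
```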
Dependencies and test tooling
pyproject.toml
Added runtime dependency apsw = "^3.50.0.0" and dev dependency pytest-asyncio = "^1.1.0"; configured pytest asyncio marker.
Database tests (new)
tests/database/test_db.py
New comprehensive tests covering APSW and ORM paths: connections, transactions, inserts, deletes, selects, pagination, reporting, retry on locks, and error handling (heavy mocking).
Core tests alignment
tests/core/test_exclude_ports.py
Updated mocks to return JSON strings (json.dumps(...)) instead of objects with json_event attributes to match new code expectations.
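The shape of that mock change, sketched (both values decode to the same payload; only the access pattern differs):

```python
import json
from unittest.mock import MagicMock

# Old style: the mocked row object exposed a .json_event attribute.
old_row = MagicMock()
old_row.json_event = json.dumps({"port": 22})

# New style: the mocked row *is* the JSON string, matching call sites that
# now do json.loads(row) directly.
new_row = json.dumps({"port": 22})
```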
Documentation
docs/Usage.md
Reworked DB configuration docs to reference DbConfig in nettacker/config.py; documented APSW toggle, SQLite PRAGMAs (journal/synchronous), default DB path, and examples for SQLite/MySQL/Postgres.
Report template
report.html
Added a minimal HTML report skeleton with placeholders, metadata/footer, and script/css placeholders.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The title accurately and concisely summarizes the primary change—porting SQLite operations to APSW and adding unit tests—and directly reflects the core modifications in the changeset. It is specific, readable, and relevant to the files changed (database module and tests).
Description Check ✅ Passed The pull request description clearly states the intended change (moving SQLite from SQLAlchemy to APSW raw SQL), notes supporting changes and included unit tests, and includes a brief checklist; it is directly related to the changeset and not off-topic. The level of detail is sufficient for this lenient check.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 11

🔭 Outside diff range comments (1)
nettacker/database/db.py (1)

297-384: Significant code duplication with submit_logs_to_db

This function duplicates most of the logic from submit_logs_to_db. Consider refactoring to extract the common retry logic into a shared helper function.

Create a generic helper function:

def _submit_with_retry(session, insert_func, log_data):
    """Generic retry logic for database insertions."""
    if isinstance(session, tuple):
        connection, cursor = session
        try:
            for _ in range(Config.settings.max_retries):
                try:
                    if not connection.in_transaction:
                        connection.execute("BEGIN")
                    insert_func(cursor, log_data)
                    return send_submit_query(session)
                except apsw.BusyError as e:
                    # ... retry logic ...
            return False
        finally:
            cursor.close()
    else:
        # SQLAlchemy logic

Then simplify both functions to use this helper.

🧹 Nitpick comments (9)
tests/database/test_db.py (3)

170-181: Consider adding assertion for sleep calls in retry test.

The test mocks time.sleep but doesn't verify it was called. Add an assertion to ensure the retry delay is properly applied:

-        with patch("time.sleep"):
+        with patch("time.sleep") as mock_sleep:
             result = send_submit_query(session)
+        
+        # Verify sleep was called during retry
+        mock_sleep.assert_called()

403-431: Improve Config mocking consistency.

The test patches individual Config attributes separately. Consider using a more consistent mocking approach:

-    @patch("nettacker.database.db.Config.settings.retry_delay", 0)
-    @patch("nettacker.database.db.Config.settings.max_retries", 1)
+    @patch("nettacker.database.db.Config")
     @patch("nettacker.database.db.logging.warn")
     @patch("nettacker.database.db.create_connection")
-    def test_apsw_busy_error(self, mock_create_conn, mock_warn):
+    def test_apsw_busy_error(self, mock_create_conn, mock_warn, mock_config):
+        mock_config.settings.retry_delay = 0
+        mock_config.settings.max_retries = 1

This approach is cleaner and prevents potential issues with partial Config mocking.


725-745: Remove duplicate test for find_temp_events.

This test duplicates the logic already covered in test_sqlite_successful_lookup (lines 656-668). Consider removing this redundant test or differentiating it with a unique scenario.

nettacker/database/db.py (6)

100-112: Simplify error handling in SQLAlchemy branch

The nested try-except blocks with identical exception handling are redundant.

 else:
-    try:
-        for _ in range(1, 100):
-            try:
-                session.commit()
-                return True
-            except Exception:
-                time.sleep(0.1)
-        logging.warn(messages("database_connect_fail"))
-        return False
-    except Exception:
-        logging.warn(messages("database_connect_fail"))
-        return False
+    for _ in range(Config.settings.max_retries):
+        try:
+            session.commit()
+            return True
+        except Exception:
+            session.rollback()
+            time.sleep(Config.settings.retry_delay)
+    logging.warn(messages("database_connect_fail"))
     return False

148-149: Use consistent error messaging

The error message should use the messages() function for consistency with the rest of the codebase.

 except Exception:
     cursor.execute("ROLLBACK")
-    logging.warn("Could not insert report...")
+    logging.warn(messages("database_error"))
     return False

200-200: Use consistent error messaging

-logging.warn("Could not remove old logs...")
+logging.warn(messages("database_error"))

625-658: Consider optimizing with fewer queries

The current implementation executes 3 queries per target. Consider using a single query with GROUP BY or window functions for better performance.

Example optimization:

SELECT 
    target,
    GROUP_CONCAT(DISTINCT module_name) as module_names,
    MAX(date) as latest_date,
    GROUP_CONCAT(event) as events
FROM scan_events
WHERE target IN (
    SELECT DISTINCT target 
    FROM scan_events 
    ORDER BY id DESC 
    LIMIT 10 OFFSET ?
)
GROUP BY target

868-869: Consider moving imports to module level

Internal imports can impact performance when the function is called multiple times.

Move these imports to the top of the file:

+from nettacker.core.graph import build_graph
+from nettacker.lib.html_log import log_data

 def logs_to_report_html(target):
     """
     generate HTML report with d3_tree_v2_graph for a host
     
     Args:
         target: the target
     
     Returns:
         HTML report
     """
-    from nettacker.core.graph import build_graph
-    from nettacker.lib.html_log import log_data

1084-1105: Simplify the capture variable logic

The current implementation with capture variable and nested loops is hard to follow. Consider using a dictionary for O(1) lookup.

# Use a dictionary for efficient lookup
target_index = {item["target"]: idx for idx, item in enumerate(selected)}

if data.target in target_index:
    idx = target_index[data.target]
    # Update existing entry
else:
    # Create new entry
    tmp = {
        "target": data.target,
        "info": {
            "module_name": [],
            "port": [],
            "date": [],
            "event": [],
            "json_event": [],
        },
    }
    selected.append(tmp)
    target_index[data.target] = len(selected) - 1
    idx = target_index[data.target]
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 958e1bc and 8fca377.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (9)
  • nettacker/config.py (3 hunks)
  • nettacker/core/app.py (1 hunks)
  • nettacker/core/graph.py (2 hunks)
  • nettacker/core/lib/base.py (1 hunks)
  • nettacker/core/module.py (1 hunks)
  • nettacker/core/utils/common.py (2 hunks)
  • nettacker/database/db.py (17 hunks)
  • pyproject.toml (1 hunks)
  • tests/database/test_db.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
tests/database/test_db.py (3)
nettacker/api/helpers.py (1)
  • structure (1-12)
nettacker/database/db.py (17)
  • db_inputs (18-35)
  • create_connection (38-70)
  • send_submit_query (73-112)
  • submit_report_to_db (115-162)
  • remove_old_logs (165-212)
  • submit_logs_to_db (215-294)
  • submit_temp_logs_to_db (297-383)
  • find_temp_events (386-446)
  • find_events (449-492)
  • select_reports (495-557)
  • get_scan_result (560-591)
  • last_host_logs (594-715)
  • get_logs_by_scan_id (718-769)
  • get_options_by_scan_id (772-800)
  • logs_to_report_json (803-855)
  • logs_to_report_html (858-964)
  • search_logs (967-1123)
nettacker/config.py (1)
  • as_dict (26-27)
nettacker/database/db.py (4)
nettacker/config.py (1)
  • Config (169-173)
nettacker/database/models.py (3)
  • Report (7-27)
  • HostsLog (67-100)
  • TempEvents (30-64)
nettacker/api/helpers.py (1)
  • structure (1-12)
nettacker/core/graph.py (1)
  • build_graph (26-49)
🔇 Additional comments (14)
pyproject.toml (1)

67-67: APSW dependency version validated

The caret constraint ^3.50.0.0 will include the latest 3.50.4.0 release, and our security check against PyPI/GitHub showed no known advisories for APSW. No changes required.

nettacker/config.py (2)

92-93: Excellent SQLite configuration choices.

The WAL journal mode and NORMAL synchronous mode are optimal defaults for SQLite performance and concurrency. WAL mode allows concurrent readers with writers, and NORMAL synchronous provides good durability without significant performance impact.


139-140: Well-designed retry configuration.

The retry parameters (max_retries = 3 and retry_delay = 0.1) are reasonable defaults for handling SQLite database locks and transient errors without excessive overhead.

nettacker/core/module.py (1)

81-81: Event data access pattern improved.

The change from json.loads(service.json_event) to json.loads(service) aligns with the broader event data standardization refactor. This simplifies data access by treating the service directly as a JSON string rather than accessing a nested attribute.

nettacker/core/utils/common.py (2)

6-6: JSON import properly added.

The json import is correctly added to support the new JSON parsing functionality in merge_logs_to_list.


36-39: Excellent backward compatibility handling.

The enhancement to handle json_event keys is well-designed:

  • Maintains backward compatibility during the event data refactor transition
  • Idempotent operation (only parses if not already a dict)
  • Non-intrusive to existing functionality as noted in the comment
nettacker/core/lib/base.py (1)

55-55: Event data access streamlined.

The change from json.loads(event.event) to direct dictionary access event["response"]["conditions_results"] is consistent with the event data handling refactor. This eliminates unnecessary JSON parsing overhead while maintaining the same functionality.

nettacker/core/app.py (1)

155-155: LGTM! Correctly handles the new return format from find_events.

The change from json.loads(row.json_event) to json.loads(row) aligns with the database module refactoring where find_events now returns JSON strings directly instead of ORM objects with a json_event attribute.

nettacker/core/graph.py (2)

87-87: Good refactoring - passes complete event context to merge_logs_to_list.

This change improves the design by passing the entire event dictionary instead of pre-extracting the json_event field. This gives merge_logs_to_list more flexibility to handle different event structures.


160-168: JavaScript JSON parsing supports full event objects

The change to use json.dumps(event) is compatible with the existing JS in:

  • nettacker/web/static/report/json_parse.js
  • nettacker/web/static/index.html

Both files call JSON.parse(...) on the escaped JSON inserted into elements with IDs like json_event_<index>, so they will correctly deserialize the full event dictionary.

No further fixes are needed.

tests/database/test_db.py (2)

1082-1122: Good test coverage for HTML report generation.

The HTML report tests properly mock the template components and verify the integration with the graph builder. The tests cover both SQLite and SQLAlchemy backends effectively.


1-1275: Comprehensive test suite with excellent coverage.

This test file provides thorough coverage of the dual database backend functionality introduced in the PR. The tests effectively verify:

  • Both APSW (SQLite) and SQLAlchemy ORM paths
  • Retry logic and transaction management
  • Error handling and edge cases
  • All major database operations

The test structure is clear and maintainable. Great work on ensuring the new APSW implementation is well-tested!

nettacker/database/db.py (2)

233-278: Well-implemented retry logic for database locks

The retry mechanism with specific handling for BusyError and transaction state checking is well implemented. Good use of configuration parameters for max_retries and retry_delay.


509-557: Well-implemented pagination logic

Both APSW and SQLAlchemy branches correctly implement pagination with proper offset calculations and consistent return format.

Collaborator

@arkid15r arkid15r left a comment


This looks like a good performance improvement contribution.
Could you address coderabbit's suggestions first @pUrGe12 ?

@pUrGe12
Contributor Author

pUrGe12 commented Aug 30, 2025

sure I'll do that

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
nettacker/core/module.py (2)

80-85: Harden JSON loading: handle dict/bytes rows and decode failures.

json.loads(service) assumes service is always a JSON string. If APSW returns bytes or ORM returns a dict, this will raise. Add type checks and guarded parsing.

Apply:

-                service_event = json.loads(service)
-                port = service_event["port"]
-                protocols = service_event["response"]["conditions_results"].keys()
+                try:
+                    raw = service
+                    if isinstance(raw, (bytes, bytearray)):
+                        raw = raw.decode("utf-8", "replace")
+                    service_event = json.loads(raw) if isinstance(raw, str) else raw
+                except (TypeError, json.JSONDecodeError) as e:
+                    log.debug(f"Skipping unexpected service row: {service!r} ({e})")
+                    continue
+                port = service_event.get("port")
+                protocols = ((service_event.get("response") or {}).get("conditions_results") or {}).keys()
+                if port is None:
+                    continue

100-107: Use safe dict lookup for service ports
Replace self.discovered_services[payload["library"]] with:

ports = self.discovered_services.get(payload["library"], [])
step = TemplateLoader.parse(step, {"port": ports})

and add a test for the branch where payload["library"] isn’t in discovered_services.
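A hedged sketch of that suggestion together with the missing-library branch it asks to test (ports_for is a hypothetical stand-in for the inline lookup in module.py):

```python
def ports_for(discovered_services, payload):
    # Safe lookup from the suggestion above: fall back to an empty port list
    # when the payload's library was never discovered, instead of raising
    # KeyError.
    return discovered_services.get(payload["library"], [])
```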

♻️ Duplicate comments (1)
nettacker/config.py (1)

85-90: Fix doc typos, stray text, and correct the default SQLite path.

There’s a typo (“deafault”), stray UI text (“Add commentMore actions”), and the documented default path .data/nettacker.db doesn’t match the actual default .nettacker/data/nettacker.db.

Apply:

-        Set the journal_mode (default="WAL") and
-        synchronous_mode (deafault="NORMAL"). Rest
-        of the fields can be left emptyAdd commentMore actions
-        This is the default database:
-        str(CWD / ".data/nettacker.db")
+        Set the journal_mode (default="WAL") and
+        synchronous_mode (default="NORMAL"). The rest
+        of the fields can be left empty.
+        Default SQLite database path:
+        str(CWD / ".nettacker/data/nettacker.db")
🧹 Nitpick comments (1)
pyproject.toml (1)

47-69: Consider making APSW an optional extra to avoid unnecessary installs.

For users on MySQL/PostgreSQL only, installing APSW can be skipped via an extra.

Add:

 [tool.poetry.dependencies]
 # ...
 apsw = "^3.50.0.0"

+[tool.poetry.extras]
+sqlite = ["apsw"]

Then document poetry install --extras sqlite for SQLite users.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 8fca377 and c26a9c2.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • nettacker/config.py (3 hunks)
  • nettacker/core/app.py (1 hunks)
  • nettacker/core/lib/base.py (1 hunks)
  • nettacker/core/module.py (1 hunks)
  • nettacker/core/utils/common.py (2 hunks)
  • pyproject.toml (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • nettacker/core/lib/base.py
  • nettacker/core/app.py
  • nettacker/core/utils/common.py
🧰 Additional context used
🪛 GitHub Actions: CI/CD
pyproject.toml

[error] 1-1: poetry install --with test failed: pyproject.toml changed significantly since poetry.lock was last generated. Run 'poetry lock' to fix the lock file.

🔇 Additional comments (3)
pyproject.toml (1)

67-67: Verify APSW version availability on PyPI
Ensure that the ^3.50.0.0 constraint matches a release published on PyPI and that binary wheels exist for Python 3.9, 3.10, 3.11, and 3.12. If version 3.50.0.0 is not available or missing wheels for any supported interpreter, pin to a published release that does.

nettacker/config.py (2)

111-113: Confirm PRAGMA usage in DB layer.

Ensure journal_mode and synchronous_mode are applied on every SQLite connection (both initial and pooled) via PRAGMA statements in the APSW path.


160-161: LGTM: sensible retry defaults for transient DB errors.

Values look reasonable; please ensure db.py reads these and uses capped backoff.
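A capped-backoff wrapper in the spirit of this suggestion might look like the following; the doubling and the cap value are assumptions here, since the PR itself uses a fixed retry_delay:

```python
import time


def with_retry(operation, max_retries=3, retry_delay=0.1, cap=2.0):
    # Retry a callable with exponential backoff, doubling the delay each
    # attempt but never sleeping longer than `cap` seconds. Re-raises the
    # last exception once max_retries is exhausted.
    delay = retry_delay
    for attempt in range(max_retries):
        try:
            return operation()
        except Exception:
            if attempt == max_retries - 1:
                raise
            time.sleep(delay)
            delay = min(delay * 2, cap)
```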

@pUrGe12 pUrGe12 force-pushed the apsw-with-sqlalchemy branch from c26a9c2 to b1b9851 on August 31, 2025 at 02:52
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

♻️ Duplicate comments (10)
nettacker/database/db.py (10)

15-15: Avoid shadowing the stdlib logging module.

Rename variable for clarity and to prevent confusion in imports.

-logging = logger.get_logger()
+log = logger.get_logger()

Update usages accordingly.


4-4: Handle optional APSW dependency (import guard).

Avoid import errors on non-SQLite backends.

-import apsw
+try:
+    import apsw
+except ImportError:
+    apsw = None

46-47: Guard SQLite path when APSW is missing.

Fail fast with a clear message.

 if Config.db.engine.startswith("sqlite"):
+    if apsw is None:
+        raise ImportError("APSW is required for SQLite backend. Install with: pip install apsw")

50-50: Fix busy-timeout units (should be milliseconds).

Multiply by 1000, not 100.

-connection.setbusytimeout(int(config.settings.timeout) * 100)
+# seconds -> milliseconds
+connection.setbusytimeout(int(config.settings.timeout) * 1000)

46-57: Wrap APSW connection setup with error handling.

Prevent unhandled exceptions and log context.

-        DB_PATH = config.db.as_dict()["name"]
-        connection = apsw.Connection(DB_PATH)
-        connection.setbusytimeout(int(config.settings.timeout) * 100)
-        cursor = connection.cursor()
-
-        # Performance enhancing configurations. Put WAL cause that helps with concurrency
-        cursor.execute(f"PRAGMA journal_mode={Config.db.journal_mode}")
-        cursor.execute(f"PRAGMA synchronous={Config.db.synchronous_mode}")
-
-        return connection, cursor
+        try:
+            DB_PATH = config.db.as_dict()["name"]
+            connection = apsw.Connection(DB_PATH)
+            connection.setbusytimeout(int(config.settings.timeout) * 1000)
+            cursor = connection.cursor()
+            # Performance enhancing configurations. WAL helps with concurrency
+            cursor.execute(f"PRAGMA journal_mode={Config.db.journal_mode}")
+            cursor.execute(f"PRAGMA synchronous={Config.db.synchronous_mode}")
+            return connection, cursor
+        except Exception as e:
+            logging.error(f"Failed to create APSW connection: {e}")
+            raise

85-98: APSW commit loop closes connection every iteration and before retries.

This causes failures on subsequent attempts and double-closing. Also hardcoded retries/sleeps.

-        connection, cursor = session
-        for _ in range(100):
-            try:
-                connection.execute("COMMIT")
-                return True
-            except Exception:
-                connection.execute("ROLLBACK")
-                time.sleep(0.1)
-            finally:
-                connection.close()
-        connection.close()
-        logging.warn(messages("database_connect_fail"))
-        return False
+        connection, cursor = session
+        success = False
+        for _ in range(Config.settings.max_retries):
+            try:
+                connection.execute("COMMIT")
+                success = True
+                break
+            except Exception:
+                connection.execute("ROLLBACK")
+                time.sleep(Config.settings.retry_delay)
+        connection.close()
+        if not success:
+            logging.warn(messages("database_connect_fail"))
+        return success

399-447: Remove unnecessary retry loop; unify return type.

SELECTs don’t need retry loops; also return parsed JSON for both backends.

-        try:
-            for _ in range(100):
-                try:
-                    cursor.execute(
-                        """
-                        SELECT event
-                        FROM temp_events
-                        WHERE target = ? AND module_name = ? AND scan_unique_id = ? AND event_name = ?
-                        LIMIT 1
-                    """,
-                        (target, module_name, scan_id, event_name),
-                    )
-
-                    row = cursor.fetchone()
-                    cursor.close()
-                    if row:
-                        return json.loads(row[0])
-                    return []
-                except Exception:
-                    logging.warn("Database query failed...")
-                    return []
-        except Exception:
-            logging.warn(messages("database_connect_fail"))
-            return []
-        return []
+        try:
+            cursor.execute(
+                """
+                SELECT event
+                FROM temp_events
+                WHERE target = ? AND module_name = ? AND scan_unique_id = ? AND event_name = ?
+                LIMIT 1
+            """,
+                (target, module_name, scan_id, event_name),
+            )
+            row = cursor.fetchone()
+            cursor.close()
+            return json.loads(row[0]) if row else []
+        except Exception:
+            logging.warn(messages("database_connect_fail"))
+            return []
@@
-        try:
-            for _ in range(1, 100):
-                try:
-                    return (
-                        session.query(TempEvents)
-                        .filter(
-                            TempEvents.target == target,
-                            TempEvents.module_name == module_name,
-                            TempEvents.scan_unique_id == scan_id,
-                            TempEvents.event_name == event_name,
-                        )
-                        .first()
-                    )
-                except Exception:
-                    time.sleep(0.1)
-        except Exception:
-            logging.warn(messages("database_connect_fail"))
-            return []
-        return []
+        try:
+            result = (
+                session.query(TempEvents)
+                .filter(
+                    TempEvents.target == target,
+                    TempEvents.module_name == module_name,
+                    TempEvents.scan_unique_id == scan_id,
+                    TempEvents.event_name == event_name,
+                )
+                .first()
+            )
+            return json.loads(result.event) if result else []
+        except Exception:
+            logging.warn(messages("database_connect_fail"))
+            return []

474-492: Avoid redundant JSON load→dump; unify return type.

Return parsed JSON for both backends.

-            if rows:
-                return [json.dumps((json.loads(row[0]))) for row in rows]
+            if rows:
+                return [json.loads(row[0]) for row in rows]
             return []
@@
-        return [
-            row.json_event
+        return [
+            json.loads(row.json_event)
             for row in session.query(HostsLog)

581-592: Add robust handling for missing reports and file I/O errors.

Prevents attribute errors and uncaught IOErrors.

-        if row:
-            filename = row[0]
-            return filename, open(str(filename), "rb").read()
-        else:
-            return structure(status="error", msg="database error!")
+        if row:
+            filename = row[0]
+            try:
+                with open(str(filename), "rb") as fh:
+                    return filename, fh.read()
+            except IOError as e:
+                logging.error(f"Failed to read report file: {e}")
+                return structure(status="error", msg="file read error!")
+        return structure(status="error", msg="database error!")
@@
-        filename = session.query(Report).filter_by(id=id).first().report_path_filename
-
-        return filename, open(str(filename), "rb").read()
+        report = session.query(Report).filter_by(id=id).first()
+        if not report:
+            return structure(status="error", msg="database error!")
+        try:
+            with open(str(report.report_path_filename), "rb") as fh:
+                return report.report_path_filename, fh.read()
+        except IOError as e:
+            logging.error(f"Failed to read report file: {e}")
+            return structure(status="error", msg="file read error!")

659-671: Cursor is closed inside the loop but reused later.

Close after the loop.

-                cursor.close()
                 hosts.append(
                     {
                         "target": target,
                         "info": {
                             "module_name": module_names,
                             "date": latest_date,
                             "events": events,
                         },
                     }
                 )
 
-            return hosts
+            cursor.close()
+            return hosts
🧹 Nitpick comments (6)
nettacker/database/db.py (6)

268-271: Use connection for ROLLBACK to keep transaction handling consistent.

Minor consistency fix; safer if cursor lifecycle changes.

-                            if connection.in_transaction:
-                                cursor.execute("ROLLBACK")
+                            if connection.in_transaction:
+                                connection.execute("ROLLBACK")

318-319: Standardize transaction control to use the connection.

Keep BEGIN/ROLLBACK consistently on the connection.

-                            cursor.execute("BEGIN")
+                            connection.execute("BEGIN")
@@
-                                if connection.in_transaction:
-                                    cursor.execute("ROLLBACK")
+                                if connection.in_transaction:
+                                    connection.execute("ROLLBACK")
@@
-                            if connection.in_transaction:
-                                cursor.execute("ROLLBACK")
+                            if connection.in_transaction:
+                                connection.execute("ROLLBACK")

Also applies to: 351-352, 358-359


611-617: Query plan: DISTINCT with ORDER BY id is ambiguous.

Consider grouping by target and ordering by MAX(id) for “latest per target”.

Example:

SELECT target
FROM scan_events
GROUP BY target
ORDER BY MAX(id) DESC
LIMIT 10 OFFSET ?
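A minimal standalone check of this "latest per target" grouping, using stdlib sqlite3 as a stand-in for APSW and a reduced scan_events schema (column names assumed from the review, not verified against models.py):

```python
import sqlite3

# Reduced stand-in schema: only id and target are needed for the grouping.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scan_events (id INTEGER PRIMARY KEY, target TEXT)")
conn.executemany(
    "INSERT INTO scan_events (target) VALUES (?)",
    [("a.com",), ("b.com",), ("a.com",), ("c.com",)],  # a.com reappears with a higher id
)
rows = conn.execute(
    """
    SELECT target
    FROM scan_events
    GROUP BY target
    ORDER BY MAX(id) DESC
    LIMIT 10 OFFSET ?
    """,
    (0,),
).fetchall()
# Each target appears once, ordered by its most recent event id.
targets = [row[0] for row in rows]  # → ["c.com", "a.com", "b.com"]
```

Unlike DISTINCT + ORDER BY id, the MAX(id) aggregate makes the "latest occurrence wins" ordering explicit and deterministic.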

791-801: Return parsed options JSON for both backends.

Keeps API consistent.

-        if rows:
-            return [{"options": row[0]} for row in rows]
+        if rows:
+            return [{"options": json.loads(row[0])} for row in rows]
@@
-        return [
-            {"options": log.options}
+        return [
+            {"options": json.loads(log.options)}
             for log in session.query(Report).filter(Report.scan_unique_id == scan_id).all()
         ]

883-897: Consider JSON-decoding fields for graph building and HTML output.

Verify build_graph expectations; if it expects dicts, decode below.

-                "port": log[4],
-                "event": log[5],
-                "json_event": log[6],
+                "port": json.loads(log[4]),
+                "event": json.loads(log[5]),
+                "json_event": json.loads(log[6]),

If build_graph expects strings, skip this change.


735-740: Add supporting DB indexes for frequent filters.

Improve search/query performance at scale.

  • scan_events: composite indexes on (scan_unique_id), (target), (target, module_name), (target, module_name, scan_unique_id), and optionally a full-text or JSON index if supported.
  • temp_events: (target, module_name, scan_unique_id, event_name).
  • reports: (scan_unique_id), (date).

Also applies to: 986-1001, 1010-1014
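A sketch of the corresponding DDL for two of the suggested indexes, runnable with stdlib sqlite3; the table and column names are assumptions carried over from the review, and the index names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE scan_events (
        id INTEGER PRIMARY KEY, target TEXT, module_name TEXT,
        scan_unique_id TEXT, port TEXT, event TEXT, json_event TEXT, date TEXT
    );
    -- Single-column index for scan_unique_id lookups.
    CREATE INDEX idx_scan_events_scan_id ON scan_events (scan_unique_id);
    -- Composite index covering the common target/module/scan filter.
    CREATE INDEX idx_scan_events_target_module_scan
        ON scan_events (target, module_name, scan_unique_id);
    """
)
# Confirm the indexes exist by querying sqlite_master.
names = {
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND name LIKE 'idx_%'"
    )
}
```

The composite index also serves prefix queries on (target) and (target, module_name), so separate indexes for those prefixes may be redundant.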

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between c26a9c2 and b1b9851.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (9)
  • nettacker/config.py (3 hunks)
  • nettacker/core/app.py (1 hunks)
  • nettacker/core/graph.py (2 hunks)
  • nettacker/core/lib/base.py (1 hunks)
  • nettacker/core/module.py (1 hunks)
  • nettacker/core/utils/common.py (2 hunks)
  • nettacker/database/db.py (17 hunks)
  • pyproject.toml (1 hunks)
  • tests/database/test_db.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
  • pyproject.toml
  • nettacker/core/utils/common.py
  • tests/database/test_db.py
  • nettacker/core/lib/base.py
  • nettacker/core/graph.py
  • nettacker/core/app.py
  • nettacker/core/module.py
  • nettacker/config.py
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-31T02:23:10.347Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.347Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.

Applied to files:

  • nettacker/database/db.py
🧬 Code graph analysis (1)
nettacker/database/db.py (4)
nettacker/config.py (2)
  • Config (191-195)
  • as_dict (26-27)
tests/database/test_models.py (1)
  • session (11-18)
nettacker/database/models.py (3)
  • Report (7-27)
  • HostsLog (67-100)
  • TempEvents (30-64)
nettacker/api/helpers.py (1)
  • structure (1-12)
🔇 Additional comments (1)
nettacker/database/db.py (1)

275-276: Ack: returning True after exhausting retries is intentional.

Not flagging as an error per your stated design to preserve scan continuity.

Also applies to: 363-364


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
tests/core/test_exclude_ports.py (4)

100-105: Same issue: return JSON strings in test_sort_loops.

-    mock_event = MagicMock()
-    mock_event.json_event = json.dumps(
-        {"port": 80, "response": {"conditions_results": {"http": True}}}
-    )
-    mock_find_events.return_value = [mock_event]
+    mock_find_events.return_value = [
+        json.dumps({"port": 80, "response": {"conditions_results": {"http": True}}})
+    ]

125-130: Same issue: return JSON strings in test_start_unsupported_library.

-    mock_event = MagicMock()
-    mock_event.json_event = json.dumps(
-        {"port": 1234, "response": {"conditions_results": {"unsupported_lib": True}}}
-    )
-    mock_find_events.return_value = [mock_event]
+    mock_find_events.return_value = [
+        json.dumps(
+            {"port": 1234, "response": {"conditions_results": {"unsupported_lib": True}}}
+        )
+    ]

185-190: Same issue: return JSON strings in test_sort_loops_behavior.

-    mock_event = MagicMock()
-    mock_event.json_event = json.dumps(
-        {"port": 80, "response": {"conditions_results": {"http": True}}}
-    )
-    mock_find_events.return_value = [mock_event]
+    mock_find_events.return_value = [
+        json.dumps({"port": 80, "response": {"conditions_results": {"http": True}}})
+    ]

313-319: Same issue: return JSON strings for test_load_appends_port_to_existing_protocol.

-    mock_find_events.return_value = [
-        MagicMock(
-            json_event=json.dumps({"port": 80, "response": {"conditions_results": {"http": {}}}})
-        ),
-        MagicMock(
-            json_event=json.dumps({"port": 443, "response": {"conditions_results": {"http": {}}}})
-        ),
-    ]
+    mock_find_events.return_value = [
+        json.dumps({"port": 80, "response": {"conditions_results": {"http": {}}}}),
+        json.dumps({"port": 443, "response": {"conditions_results": {"http": {}}}}),
+    ]
♻️ Duplicate comments (1)
nettacker/database/db.py (1)

19-19: Avoid shadowing imported symbol; use a distinct logger variable.

Rename the local logger instance to avoid shadowing the imported module symbol and improve readability.

-logger = logger.get_logger()
+log = logger.get_logger()

Apply the same rename for all logger usages in this file.

🧹 Nitpick comments (8)
nettacker/database/db.py (6)

244-264: Minor: BEGIN/ROLLBACK use is inconsistent between functions.

submit_logs_to_db uses connection.execute("BEGIN") while submit_temp_logs_to_db uses cursor.execute("BEGIN"). Same for ROLLBACK. Prefer one style for consistency (APSW allows either). Suggest using connection.execute in both places.

-                            cursor.execute("BEGIN")
+                            connection.execute("BEGIN")
@@
-                                cursor.execute("ROLLBACK")
+                                connection.execute("ROLLBACK")

Also applies to: 326-347


424-451: Return type inconsistency in find_temp_events (APSW dict vs ORM object).

APSW branch returns a dict (json.loads), ORM branch returns an ORM row (or None). Downstream code can easily break on mixed types. Choose a single contract (e.g., JSON string or parsed dict) and make both branches conform; update tests accordingly.

Example (return JSON string in both branches):

-            if row:
-                return json.loads(row[0])
+            if row:
+                return row[0]  # raw JSON string
@@
-                    return (
-                        session.query(TempEvents)
-                        .filter(...)
-                        .first()
-                    )
+                    result = session.query(TempEvents).filter(...).first()
+                    return result.event if result else []

If the intended contract is dicts, invert accordingly in both branches.


589-607: Unify error return shape in get_scan_result.

Docstring says this returns result content or an error in JSON type. On file I/O errors you currently return None; consider returning structure(status="error", msg="file read error!") for consistency across branches.

-            except IOError as e:
-                logger.error(f"Failed to read report file: {e}")
-                return None
+            except IOError as e:
+                logger.error(f"Failed to read report file: {e}")
+                return structure(status="error", msg="file read error!")
@@
-        except IOError as e:
-            logger.error(f"Failed to read report file: {e}")
-            return None
+        except IOError as e:
+            logger.error(f"Failed to read report file: {e}")
+            return structure(status="error", msg="file read error!")

626-635: DISTINCT with ORDER BY a non-selected column is fragile.

SQLite allows ORDER BY columns not in the SELECT list, but it’s ambiguous for DISTINCT. To show the latest per target deterministically, order by MAX(id) per target.

-            cursor.execute(
-                """
-                SELECT DISTINCT target 
-                FROM scan_events
-                ORDER BY id DESC 
-                LIMIT 10 OFFSET ?
-                """,
-                [(page - 1) * 10],
-            )
+            cursor.execute(
+                """
+                SELECT target
+                FROM (
+                    SELECT target, MAX(id) AS max_id
+                    FROM scan_events
+                    GROUP BY target
+                )
+                ORDER BY max_id DESC
+                LIMIT 10 OFFSET ?
+                """,
+                [(page - 1) * 10],
+            )

773-784: json_event type mismatch between APSW and ORM paths.

APSW branch returns dicts for json_event; ORM branch returns raw strings. Normalize ORM path to return dicts for parity (or switch APSW to strings, but keep it consistent).

-                "json_event": log.json_event,
+                "json_event": json.loads(log.json_event) if log.json_event else {},

1000-1059: GROUP BY without aggregating non-grouped columns can yield nondeterministic values.

In the APSW search, date is selected but not grouped/aggregated. If determinism matters, consider selecting MAX(date) or removing date from SELECT and appending via another query.

tests/database/test_db.py (2)

706-714: Clarify find_temp_events return type in ORM path.

This test asserts the ORM object is returned. APSW path returns a dict. Consider unifying the contract (string or dict) across both paths and updating tests to match.


871-894: select_reports SQLite exception test mocks the wrong method.

You set mock_cursor.query.side_effect, but the implementation uses cursor.execute. To meaningfully exercise the error path, set execute.side_effect to raise.

-        mock_cursor.query.side_effect = Exception("DB Error")
+        mock_cursor.execute.side_effect = Exception("DB Error")
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between b1b9851 and 6a9c6cb.

📒 Files selected for processing (5)
  • nettacker/config.py (3 hunks)
  • nettacker/database/db.py (17 hunks)
  • pyproject.toml (2 hunks)
  • tests/core/test_exclude_ports.py (1 hunks)
  • tests/database/test_db.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • pyproject.toml
  • nettacker/config.py
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-08-31T02:23:10.347Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.347Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.

Applied to files:

  • tests/database/test_db.py
  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:41:18.138Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.138Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:26:17.008Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.008Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:37:41.306Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.306Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.

Applied to files:

  • nettacker/database/db.py
🧬 Code graph analysis (2)
tests/database/test_db.py (3)
nettacker/api/helpers.py (1)
  • structure (1-12)
nettacker/database/db.py (17)
  • db_inputs (22-39)
  • create_connection (42-81)
  • send_submit_query (84-123)
  • submit_report_to_db (126-173)
  • remove_old_logs (176-223)
  • submit_logs_to_db (226-305)
  • submit_temp_logs_to_db (308-394)
  • find_temp_events (397-452)
  • find_events (455-498)
  • select_reports (501-563)
  • get_scan_result (566-607)
  • last_host_logs (610-730)
  • get_logs_by_scan_id (733-784)
  • get_options_by_scan_id (787-815)
  • logs_to_report_json (818-870)
  • logs_to_report_html (873-979)
  • search_logs (982-1138)
nettacker/config.py (1)
  • as_dict (26-27)
nettacker/database/db.py (4)
nettacker/config.py (2)
  • Config (191-195)
  • as_dict (26-27)
nettacker/database/models.py (3)
  • Report (7-27)
  • HostsLog (67-100)
  • TempEvents (30-64)
nettacker/api/helpers.py (1)
  • structure (1-12)
nettacker/core/graph.py (1)
  • build_graph (26-49)
🪛 GitHub Actions: CI/CD
tests/core/test_exclude_ports.py

[error] 70-70: Test 'test_load_with_service_discovery' failed due to TypeError in json.loads (MagicMock) when loading service in Module.load().


[error] 108-108: Test 'test_sort_loops' failed due to TypeError in json.loads (MagicMock) when loading service in Module.load().


[error] 135-135: Test 'test_start_unsupported_library' failed due to TypeError in json.loads (MagicMock) when loading service in Module.load().


[error] 193-193: Test 'test_sort_loops_behavior' failed due to TypeError in json.loads (MagicMock) when loading service in Module.load().


[error] 324-324: Test 'test_load_appends_port_to_existing_protocol' failed due to TypeError in json.loads (MagicMock) when loading service in Module.load().
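The failures above reproduce in isolation: json.loads requires str/bytes input, so passing a bare MagicMock raises TypeError, while a json.dumps string round-trips cleanly — which is why the fixtures must return JSON strings rather than mock objects:

```python
import json
from unittest.mock import MagicMock

# What the fixed fixtures return: a plain JSON string.
payload = json.dumps({"port": 80, "response": {"conditions_results": {"http": True}}})
event = json.loads(payload)  # parses fine

# What the failing tests effectively did: feed a MagicMock to json.loads.
try:
    json.loads(MagicMock())
    mock_failed = False
except TypeError:
    mock_failed = True
```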

🔇 Additional comments (3)
nettacker/database/db.py (2)

55-59: Busy-timeout unit may be incorrect; confirm seconds→milliseconds conversion.

APSW’s setbusytimeout expects milliseconds. If Config.settings.timeout is in seconds, multiply by 1000 (not 100). Current tests assert 3000 for 30s, which suggests a unit mismatch. Please verify the intended unit and adjust code and/or tests accordingly.

-connection.setbusytimeout(int(config.settings.timeout) * 100)
+# Convert seconds to milliseconds for APSW busy timeout
+connection.setbusytimeout(int(config.settings.timeout) * 1000)

Also applies to: 61-65
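The unit conversion in question can be isolated into a helper; `busy_timeout_ms` is a hypothetical name for illustration, not a function in the codebase:

```python
def busy_timeout_ms(timeout_seconds: int) -> int:
    """Convert a timeout in seconds to the milliseconds that APSW's
    Connection.setbusytimeout() expects (hypothetical helper)."""
    return int(timeout_seconds) * 1000
```

With this conversion, a 30-second configured timeout yields 30000 ms, not the 3000 ms the current `* 100` produces.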


483-484: LGTM: normalized JSON string output for find_events (APSW).

Keeping the return type as JSON strings aligns with callers using json.loads on each item.

tests/database/test_db.py (1)

124-131: Confirm busy-timeout expectation in tests.

The test expects setbusytimeout(3000) for a 30 second timeout. If timeout is seconds, APSW needs milliseconds (30000). Please confirm intended units and align code or test accordingly.

@pUrGe12 pUrGe12 requested a review from arkid15r September 2, 2025 13:04
@pUrGe12 pUrGe12 force-pushed the apsw-with-sqlalchemy branch from 37cfb4b to 4cd2bc1 Compare September 3, 2025 04:14

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (4)
nettacker/database/db.py (4)

742-784: json_event type differs across backends.

SQLite branch returns parsed dict; ORM branch returns raw string. Pick one format (prefer dicts here) for callers’ sanity.

Apply:

-                "json_event": log.json_event,
+                "json_event": json.loads(log.json_event) if log.json_event else {},

56-66: Busy-timeout unit conversion is wrong (should be ms).

APSW expects milliseconds; multiplying by 100 undercuts the intended timeout by 10x. Use 1000 and keep the try/except.

Apply:

-            connection.setbusytimeout(int(config.settings.timeout) * 100)
+            # Convert seconds to milliseconds for APSW busy timeout
+            connection.setbusytimeout(int(config.settings.timeout) * 1000)

96-109: APSW commit retry closes connection every attempt; breaks retries and leaks cursor.

Connection is closed in finally inside the loop; cursor is never closed here. Move cleanup to a single outer finally, add rollback between attempts, and use configured retry params.

Apply:

-    if isinstance(session, tuple):
-        connection, cursor = session
-        for _ in range(100):
-            try:
-                connection.execute("COMMIT")
-                return True
-            except Exception:
-                connection.execute("ROLLBACK")
-                time.sleep(0.1)
-            finally:
-                connection.close()
-        connection.close()
-        logger.warn(messages("database_connect_fail"))
-        return False
+    if isinstance(session, tuple):
+        connection, cursor = session
+        success = False
+        try:
+            for _ in range(Config.settings.max_retries):
+                try:
+                    connection.execute("COMMIT")
+                    success = True
+                    break
+                except Exception:
+                    try:
+                        connection.execute("ROLLBACK")
+                    except Exception:
+                        pass
+                    time.sleep(Config.settings.retry_delay)
+        finally:
+            try:
+                cursor.close()
+            except Exception:
+                pass
+            try:
+                connection.close()
+            except Exception:
+                pass
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success

111-123: ORM commit retry lacks rollback and uses hardcoded policy.

Add rollback, use Config.settings.{max_retries,retry_delay}, and return success consistently.

Apply:

-        try:
-            for _ in range(1, 100):
-                try:
-                    session.commit()
-                    return True
-                except Exception:
-                    time.sleep(0.1)
-            logger.warn(messages("database_connect_fail"))
-            return False
-        except Exception:
-            logger.warn(messages("database_connect_fail"))
-            return False
+        success = False
+        for _ in range(Config.settings.max_retries):
+            try:
+                session.commit()
+                success = True
+                break
+            except Exception:
+                session.rollback()
+                time.sleep(Config.settings.retry_delay)
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success
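A self-contained sketch of the retry-with-rollback pattern proposed above; `FakeSession` stands in for a SQLAlchemy session, and the `max_retries`/`retry_delay` parameters mirror the proposed `Config.settings` names (assumed, not verified):

```python
import time


class FakeSession:
    """Stand-in for a SQLAlchemy session that fails the first N commits."""

    def __init__(self, failures):
        self.failures = failures
        self.rollbacks = 0

    def commit(self):
        if self.failures > 0:
            self.failures -= 1
            raise RuntimeError("database is locked")

    def rollback(self):
        self.rollbacks += 1


def commit_with_retry(session, max_retries=5, retry_delay=0.0):
    # Roll back between attempts so the session is clean before each retry.
    for _ in range(max_retries):
        try:
            session.commit()
            return True
        except Exception:
            session.rollback()
            time.sleep(retry_delay)
    return False


session = FakeSession(failures=2)
ok = commit_with_retry(session)  # third attempt succeeds after two rollbacks
```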
🧹 Nitpick comments (6)
nettacker/core/app.py (1)

159-163: Harden subdomain_scan event parsing to be backend-agnostic and key-safe

json.loads(row) assumes row is a JSON string. With dual backends (APSW/ORM), find_events(...) may yield a dict or an ORM row exposing .json_event. Also, direct indexing into ["response"]["conditions_results"]["content"] can raise on missing keys. Parse defensively and tolerate string/dict/object rows.

Apply this diff within the loop:

-                for row in find_events(target, "subdomain_scan", scan_id):
-                    for sub_domain in json.loads(row)["response"]["conditions_results"]["content"]:
+                for row in find_events(target, "subdomain_scan", scan_id):
+                    event = self._event_to_dict(row)
+                    content = (
+                        event.get("response", {})
+                        .get("conditions_results", {})
+                        .get("content", [])
+                    )
+                    if not isinstance(content, (list, tuple)):
+                        continue
+                    for sub_domain in content:

Add this helper (inside the class or module) to normalize rows:

@staticmethod
def _event_to_dict(row):
    if isinstance(row, dict):
        return row
    # ORM row with `json_event`
    if hasattr(row, "json_event") and isinstance(row.json_event, (str, bytes)):
        try:
            return json.loads(row.json_event)
        except Exception:
            return {}
    # Raw JSON string from APSW
    if isinstance(row, (str, bytes)):
        try:
            return json.loads(row)
        except Exception:
            return {}
    return {}

Verification suggestion:

  • Confirm find_events return type(s) for sqlite/mysql/postgres paths to ensure the helper covers all cases, and that subdomain events consistently populate response.conditions_results.content.
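A normalizer of this kind can be sanity-checked against the three row shapes the dual backends can produce; the helper is re-declared here (as `event_to_dict`) so the snippet is self-contained, with `SimpleNamespace` imitating an ORM row:

```python
import json
from types import SimpleNamespace


def event_to_dict(row):
    """Normalize a find_events row (dict, JSON string, or ORM-like object)."""
    if isinstance(row, dict):
        return row
    if hasattr(row, "json_event") and isinstance(row.json_event, (str, bytes)):
        try:
            return json.loads(row.json_event)
        except Exception:
            return {}
    if isinstance(row, (str, bytes)):
        try:
            return json.loads(row)
        except Exception:
            return {}
    return {}


raw = '{"response": {"conditions_results": {"content": ["x.example.com"]}}}'
as_dict = event_to_dict(json.loads(raw))                 # APSW branch returning dicts
as_str = event_to_dict(raw)                              # APSW branch returning JSON strings
as_orm = event_to_dict(SimpleNamespace(json_event=raw))  # ORM row with .json_event
```

All three shapes normalize to the same dict, so the subdomain loop never has to care which backend produced the row.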
nettacker/database/db.py (5)

46-49: Docstring inaccurate about returns.

For SQLite you return (connection, cursor); for others you return a Session, never False. Update to reflect actual behavior.


799-815: Consider returning parsed options not raw JSON.

Minor, but most functions parse JSON fields before returning.

Apply:

-            return [{"options": row[0]} for row in rows]
+            return [{"options": json.loads(row[0])} for row in rows]

982-1018: Potential DISTINCT/ORDER BY mismatch in search_logs targets query.

Same pattern as last_host_logs; switch to subquery with MAX(id) ordering to avoid undefined behavior.

Apply:

-            cursor.execute(
-                """
-                SELECT DISTINCT target FROM scan_events
-                WHERE target LIKE ? OR date LIKE ? OR module_name LIKE ?
-                OR port LIKE ? OR event LIKE ? OR scan_unique_id LIKE ?
-                ORDER BY id DESC
-                LIMIT 10 OFFSET ?
-                """,
-                (
-                    f"%{query}%",
-                    f"%{query}%",
-                    f"%{query}%",
-                    f"%{query}%",
-                    f"%{query}%",
-                    f"%{query}%",
-                    (page * 10) - 10,
-                ),
-            )
+            cursor.execute(
+                """
+                WITH hits AS (
+                  SELECT id, target
+                  FROM scan_events
+                  WHERE target LIKE ? OR date LIKE ? OR module_name LIKE ?
+                        OR port LIKE ? OR event LIKE ? OR scan_unique_id LIKE ?
+                ),
+                grouped AS (
+                  SELECT target, MAX(id) AS max_id
+                  FROM hits
+                  GROUP BY target
+                )
+                SELECT target FROM grouped
+                ORDER BY max_id DESC
+                LIMIT 10 OFFSET ?
+                """,
+                (*(f"%{query}%",)*6, (page * 10) - 10),
+            )

268-276: Minor: BusyError branch string check is unnecessary.

apsw.BusyError already indicates busy/locked; the message check is redundant. Handle BusyError uniformly and let other exceptions fall through.

Apply (example for one function; mirror in the other):

-                    except apsw.BusyError as e:
-                        if "database is locked" in str(e).lower():
+                    except apsw.BusyError:
                             logger.warn(
                                 f"[Retry {_ + 1}/{Config.settings.max_retries}] Database is locked. Retrying..."
                             )
                             if connection.in_transaction:
                                 connection.execute("ROLLBACK")
                             time.sleep(Config.settings.retry_delay)
                             continue
-                        else:
-                            if connection.in_transaction:
-                                connection.execute("ROLLBACK")
-                            return False

Also applies to: 347-365
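The locked-database condition this branch handles can be reproduced with stdlib sqlite3 (apsw.BusyError is the APSW analogue of sqlite3.OperationalError here); a writer holding a write transaction blocks a second writer, which fails immediately when no busy timeout is set:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "lock_demo.db")
# isolation_level=None gives full manual transaction control.
writer = sqlite3.connect(path, timeout=0, isolation_level=None)
writer.execute("CREATE TABLE t (x INTEGER)")
writer.execute("BEGIN IMMEDIATE")  # take the write lock and hold it

other = sqlite3.connect(path, timeout=0, isolation_level=None)
try:
    other.execute("BEGIN IMMEDIATE")  # no busy timeout: fails at once
    locked = False
except sqlite3.OperationalError as exc:
    locked = "locked" in str(exc).lower()

writer.execute("ROLLBACK")
other.execute("BEGIN IMMEDIATE")  # succeeds once the lock is released
other.execute("ROLLBACK")
```

This is the scenario a busy-timeout (or the retry loop) papers over; with APSW, catching BusyError alone is sufficient without inspecting the message text.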


688-691: Prefer logger.warning over warn (deprecated in stdlib).

If your logger mirrors stdlib, warning() is preferred. If warn() is intentional in your wrapper, ignore.

Also applies to: 1063-1065, 1134-1135

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 37cfb4b and 4cd2bc1.

⛔ Files ignored due to path filters (1)
  • poetry.lock is excluded by !**/*.lock
📒 Files selected for processing (10)
  • nettacker/config.py (3 hunks)
  • nettacker/core/app.py (1 hunks)
  • nettacker/core/graph.py (2 hunks)
  • nettacker/core/lib/base.py (1 hunks)
  • nettacker/core/module.py (1 hunks)
  • nettacker/core/utils/common.py (2 hunks)
  • nettacker/database/db.py (17 hunks)
  • pyproject.toml (2 hunks)
  • tests/core/test_exclude_ports.py (5 hunks)
  • tests/database/test_db.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (8)
  • nettacker/core/module.py
  • tests/database/test_db.py
  • pyproject.toml
  • nettacker/core/utils/common.py
  • nettacker/core/lib/base.py
  • nettacker/core/graph.py
  • nettacker/config.py
  • tests/core/test_exclude_ports.py
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-08-31T02:23:10.347Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.347Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:41:18.138Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.138Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:26:17.008Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.008Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:37:41.306Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.306Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.

Applied to files:

  • nettacker/database/db.py
🧬 Code graph analysis (1)
nettacker/database/db.py (4)
nettacker/config.py (2)
  • Config (191-195)
  • as_dict (26-27)
nettacker/database/models.py (3)
  • Report (7-27)
  • HostsLog (67-100)
  • TempEvents (30-64)
nettacker/api/helpers.py (1)
  • structure (1-12)
nettacker/core/graph.py (1)
  • build_graph (26-49)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: CodeQL (python)
  • GitHub Check: CodeQL (javascript)
  • GitHub Check: Run tests
🔇 Additional comments (4)
nettacker/database/db.py (4)

4-8: Good: Optional APSW import guard.

Prevents hard failures when non-SQLite engines are used.


471-488: find_events APSW/ORM return strings consistently.

Good consistency; APSW normalizes with dumps(loads(.)) to match ORM string output.


137-173: Nice: Parameterized SQL and consistent JSON handling.

Inserts/Deletes and report selection use parameter binding and parse JSON fields consistently. Good.

Also applies to: 187-224, 513-563


625-634: Discard DISTINCT/ORDER BY compatibility concern
SQLite permits ordering by non-projected columns in a simple SELECT with DISTINCT, so SELECT DISTINCT target … ORDER BY id DESC is valid. (www2.sqlite.org, sqlite.org)

Likely an incorrect or invalid review comment.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

♻️ Duplicate comments (2)
nettacker/database/db.py (2)

111-122: SQLAlchemy commit retry lacks rollback and uses hardcoded policy

Add rollback between attempts and use configured retry count/delay. This was raised earlier; reapplying here.

-        try:
-            for _ in range(1, 100):
-                try:
-                    session.commit()
-                    return True
-                except Exception:
-                    time.sleep(0.1)
-            logger.warn(messages("database_connect_fail"))
-            return False
-        except Exception:
-            logger.warn(messages("database_connect_fail"))
-            return False
+        success = False
+        for _ in range(Config.settings.max_retries):
+            try:
+                session.commit()
+                success = True
+                break
+            except Exception:
+                session.rollback()
+                time.sleep(Config.settings.retry_delay)
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success
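The rollback-then-retry shape suggested above, sketched in isolation with the stdlib sqlite3 module standing in for APSW (MAX_RETRIES and RETRY_DELAY are illustrative stand-ins for the Config.settings values):

```python
import sqlite3
import time

MAX_RETRIES = 3    # illustrative stand-in for Config.settings.max_retries
RETRY_DELAY = 0.1  # illustrative stand-in for Config.settings.retry_delay


def commit_with_retry(connection):
    """Commit; on failure roll back, wait, and retry; report success."""
    for _ in range(MAX_RETRIES):
        try:
            connection.commit()
            return True
        except sqlite3.Error:
            connection.rollback()
            time.sleep(RETRY_DELAY)
    return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scan_events (event TEXT)")
conn.execute("INSERT INTO scan_events VALUES ('port_scan')")
print(commit_with_retry(conn))
conn.close()
```

A healthy connection commits on the first pass through the loop; the rollback keeps a failed attempt from poisoning the next one.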

96-109: Critical: COMMIT retry loop closes the connection on every attempt

Closing in finally inside the loop guarantees subsequent retries operate on a closed connection and also leaks the cursor. Move cleanup outside the loop; add rollback and configurable retry/backoff. Keep a finally (per tests).

-    if isinstance(session, tuple):
-        connection, cursor = session
-        for _ in range(100):
-            try:
-                connection.execute("COMMIT")
-                return True
-            except Exception:
-                connection.execute("ROLLBACK")
-                time.sleep(0.1)
-            finally:
-                connection.close()
-        connection.close()
-        logger.warn(messages("database_connect_fail"))
-        return False
+    if isinstance(session, tuple):
+        connection, cursor = session
+        success = False
+        try:
+            for _ in range(Config.settings.max_retries):
+                try:
+                    connection.execute("COMMIT")
+                    success = True
+                    break
+                except Exception:
+                    try:
+                        connection.execute("ROLLBACK")
+                    except Exception:
+                        pass
+                    time.sleep(Config.settings.retry_delay)
+        finally:
+            try:
+                cursor.close()
+            except Exception:
+                pass
+            try:
+                connection.close()
+            except Exception:
+                pass
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success
🧹 Nitpick comments (4)
nettacker/database/db.py (4)

55-63: Busy-timeout unit mismatch (seconds→milliseconds)

APSW expects milliseconds. Multiplying by 100 likely under-waits. Use 1000.

-            connection.setbusytimeout(int(config.settings.timeout) * 100)
+            # Convert seconds to milliseconds
+            connection.setbusytimeout(int(config.settings.timeout) * 1000)

285-287: Typo in log message ("retires" → "retries")

Minor but user-facing.

-                # All retires exhausted but we want to continue operation
+                # All retries exhausted but we want to continue operation

329-336: Use connection.execute('BEGIN') for consistency

Elsewhere you use connection-level BEGIN/ROLLBACK. Align here too.

-                        if not connection.in_transaction:
-                            cursor.execute("BEGIN")
+                        if not connection.in_transaction:
+                            connection.execute("BEGIN")

770-782: Unify json_event type across backends (get_logs_by_scan_id)

ORM branch returns a raw JSON string while APSW returns a dict. Prefer consistent dicts here (you already do so in logs_to_report_json).

-                "json_event": log.json_event,
+                "json_event": json.loads(log.json_event) if log.json_event else {},
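The backend-independent shape can also be enforced with a small normalizer; a sketch (the json_event field name follows this PR's models, the helper itself is hypothetical):

```python
import json


def as_event_dict(json_event):
    """Normalize json_event to a dict whether the backend returned a raw
    JSON string (ORM branch) or an already-parsed dict (APSW branch)."""
    if not json_event:
        return {}
    if isinstance(json_event, str):
        return json.loads(json_event)
    return json_event


print(as_event_dict('{"port": 80}'))
print(as_event_dict({"port": 80}))
print(as_event_dict(None))
```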
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 4cd2bc1 and 776d6e7.

📒 Files selected for processing (1)
  • nettacker/database/db.py (17 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-08-31T02:23:10.347Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.347Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:41:18.138Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.138Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:26:17.008Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.008Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:37:41.306Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.306Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.

Applied to files:

  • nettacker/database/db.py
🧬 Code graph analysis (1)
nettacker/database/db.py (4)
nettacker/config.py (2)
  • Config (191-195)
  • as_dict (26-27)
nettacker/database/models.py (3)
  • Report (7-27)
  • HostsLog (67-100)
  • TempEvents (30-64)
nettacker/api/helpers.py (1)
  • structure (1-12)
nettacker/core/graph.py (1)
  • build_graph (26-49)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: CodeQL (javascript)
  • GitHub Check: CodeQL (python)
🔇 Additional comments (2)
nettacker/database/db.py (2)

4-8: APSW import guard — good change

Optional dependency handled cleanly; avoids hard-fail on non-SQLite backends.


884-977: Minor: parsing vs raw in HTML report

You pass raw text fields (as intended for HTML). No changes needed beyond connection close handled above.
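Related: the PR summary mentions HTML escaping for reports. Where a raw field does need escaping before landing in HTML, stdlib html.escape is the usual guard (the field content below is invented):

```python
import html

# Any untrusted scan output embedded in an HTML report should pass
# through html.escape first; it escapes <, >, &, and quotes.
event_description = '<script>alert("xss")</script> open port on 10.0.0.1'
safe = html.escape(event_description)
print(safe)
```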

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (8)
docs/Usage.md (8)

635-641: Tighten "Database" section wording and list formatting

  • Remove the comma after “Nettacker” and avoid blank lines breaking the list.
  • Keep database names consistent with later sections.
-OWASP Nettacker, currently supports three databases:
-
-- SQLite
-- MySQL
-- PostgreSQL
+OWASP Nettacker currently supports three databases:
+
+- SQLite
+- MySQL
+- PostgreSQL

649-658: Use fenced code block with language; avoid indented block

Switch to fenced code with language for better readability and lint compliance (MD046).

-    engine = "sqlite"
-    name = str(CWD / ".nettacker/data/nettacker.db")
-    host = ""
-    port = ""
-    username = ""
-    password = ""
-    ssl_mode = "disable"
-    journal_mode = "WAL"
-    synchronous_mode = "NORMAL"
+```python
+engine = "sqlite"
+name = str(CWD / ".nettacker/data/nettacker.db")
+host = ""
+port = ""
+username = ""
+password = ""
+ssl_mode = "disable"
+journal_mode = "WAL"
+synchronous_mode = "NORMAL"
+```

659-659: Polish grammar and add minor durability caveat

  • Grammar: “according to need” -> “according to your needs”; “use SQLite database” -> “use the SQLite database”.
  • Optional note: NORMAL improves performance but reduces durability on sudden power loss vs FULL.
-These are the default and recommended settings. Feel free to play around and change them according to need. To use SQLite database, ensure that the `engine` value is set to `sqlite` and the `name` is the path to your database. The `journal_mode` and `synchronous_mode` are chosen to be optimal for multithreaded I/O operations.
+These are the default and recommended settings. Feel free to adjust them according to your needs. To use the SQLite database, ensure that `engine` is set to `sqlite` and `name` points to your database file. The `journal_mode` and `synchronous_mode` are chosen to be optimal for multithreaded I/O operations (note: `synchronous=NORMAL` trades some durability vs `FULL`).
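The two PRAGMAs from the snippet can be exercised directly with the stdlib sqlite3 module (the file path is a throwaway temp file; WAL needs a file-backed database, since an in-memory one reports 'memory'):

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "example.db")
conn = sqlite3.connect(db_path)
# journal_mode returns the mode actually set; synchronous reads back as an
# integer (0=OFF, 1=NORMAL, 2=FULL, 3=EXTRA).
(journal_mode,) = conn.execute("PRAGMA journal_mode = WAL").fetchone()
conn.execute("PRAGMA synchronous = NORMAL")
(synchronous,) = conn.execute("PRAGMA synchronous").fetchone()
print(journal_mode, synchronous)  # wal 1
conn.close()
```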

661-661: Remove trailing colon from heading

Complies with MD026.

-## MySQL configuration:
+## MySQL configuration

666-675: Add language to fenced code; remove SQLite-only fields from MySQL example

  • Add “python” on the fence (MD040).
  • Drop journal_mode and synchronous_mode to avoid implying they matter for MySQL. Keep ssl_mode only if the code actually honors it for MySQL.
-```
-    engine = "mysql"
-    name = "nettacker"
-    host = "localhost"
-    port = 3306
-    username = "root"
-    password = "some-password"
-    ssl_mode = "disable"
-    journal_mode = "WAL"
-    synchronous_mode = "NORMAL"
-```
+```python
+engine = "mysql"
+name = "nettacker"
+host = "localhost"
+port = 3306
+username = "root"
+password = "some-password"
+# ssl_mode: use if supported by your setup (e.g., "require")
+```

Also update the note just below to reflect actual support for ssl_mode on MySQL in this codebase.

Would you like me to scan the repository to confirm which fields are honored for MySQL before we finalize the doc text?

Also applies to: 677-677


681-681: Use consistent product name in heading

Align with earlier “PostgreSQL”.

-## Postgres Configuration
+## PostgreSQL configuration

650-650: Clarify path creation for default SQLite location

If .nettacker/data/ isn’t auto-created, add a note to create it; if it is auto-created by the app, state that explicitly to prevent user confusion.

You can confirm by grepping for mkdir or Path(...).mkdir in database initialization code.


697-697: SSL guidance consistency

You recommend ssl_mode="require" for PostgreSQL. If MySQL also supports SSL in this codebase, mirror the guidance in the MySQL section for consistency.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 776d6e7 and a92c6fd.

📒 Files selected for processing (2)
  • docs/Usage.md (1 hunks)
  • nettacker/config.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • nettacker/config.py
🧰 Additional context used
🪛 LanguageTool
docs/Usage.md

[grammar] ~638-~638: There might be a mistake here.
Context: ...ntly supports three databases: - SQLite - MySQL - PostgreSQL The default database...

(QB_NEW_EN)


[grammar] ~639-~639: There might be a mistake here.
Context: ...ports three databases: - SQLite - MySQL - PostgreSQL The default database is SQLi...

(QB_NEW_EN)


[grammar] ~646-~646: Ensure spelling is correct
Context: ...r liking. ## SQLite configuration The configruations are for a SQLite wrapper called *APSW...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[style] ~659-~659: Consider using a less common alternative to make your writing sound more unique and professional.
Context: ...e the default and recommended settings. Feel free to play around and change them according t...

(FEEL_FREE_TO_STYLE_ME)


[grammar] ~659-~659: There might be a mistake here.
Context: ...d and change them according to need. To use SQLite database, ensure that the `engin...

(QB_NEW_EN)

🪛 markdownlint-cli2 (0.17.2)
docs/Usage.md

649-649: Code block style
Expected: fenced; Actual: indented

(MD046, code-block-style)


661-661: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)


665-665: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


683-683: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Test Docker image

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
docs/Usage.md (1)

681-691: Add language to the PostgreSQL fenced block (keep engine = "postgres").

Same as earlier feedback: add a language tag; keep engine identifier as "postgres" to match the codebase/tests.

-```
+```python
     engine = "postgres"
     name = "nettacker"
     host = "localhost"
     port = 5432
     username = "root"
     password = "some-password"
     ssl_mode = "disable"
     journal_mode = "WAL"
     synchronous_mode = "NORMAL"

</blockquote></details>

</blockquote></details>

<details>
<summary>🧹 Nitpick comments (6)</summary><blockquote>

<details>
<summary>docs/Usage.md (6)</summary><blockquote>

`634-641`: **Tighten wording and fix minor grammar (remove comma after “Nettacker”, avoid “db”).**

- Remove the comma after “OWASP Nettacker”.
- Prefer “database” over “db” and simplify phrasing.


-OWASP Nettacker, currently supports three databases:
+OWASP Nettacker currently supports three databases:
...
-The default database is SQLite. You can, however, configure the db to your liking.
+The default database is SQLite. You can configure the database to your needs.

642-656: Use a fenced code block with a language tag (MD046/MD040).

The SQLite snippet is an indented block; convert to a fenced block with a language for better rendering and lint compliance.

-    engine = "sqlite"
-    name = str(CWD / ".nettacker/data/nettacker.db")
-    host = ""
-    port = ""
-    username = ""
-    password = ""
-    ssl_mode = "disable"
-    journal_mode = "WAL"
-    synchronous_mode = "NORMAL"
+```python
+engine = "sqlite"
+name = str(CWD / ".nettacker/data/nettacker.db")
+host = ""
+port = ""
+username = ""
+password = ""
+ssl_mode = "disable"
+journal_mode = "WAL"
+synchronous_mode = "NORMAL"
+```

659-659: Remove trailing colon in heading (MD026).

Trailing punctuation in headings is flagged by markdownlint; drop the colon.

-## MySQL configuration:
+## MySQL configuration

663-673: Add language to the fenced MySQL block (MD040).

Specify the language to improve syntax highlighting and satisfy linters.

-```
+```python
     engine = "mysql"
     name = "nettacker"
     host = "localhost"
     port = 3306
     username = "root"
     password = "some-password"
     ssl_mode = "disable"
     journal_mode = "WAL"
     synchronous_mode = "NORMAL"

---

693-695: Minor grammar nits (comma after introductory phrase; streamline note).

Add a comma after “In this case,” and simplify the note sentence.


-In this case the irrelevant fields are `journal_mode` and `synchronous_mode`. You don't have to update/change/remove them.
+In this case, the irrelevant fields are `journal_mode` and `synchronous_mode`. You don't have to update, change, or remove them.
-**Note**: If you want encryption, then set `ssl_mode` to `require`.
+**Note**: For encryption, set `ssl_mode` to `require`.

642-695: Apply markdownlint fixes for MD026/MD040/MD046 in docs/Usage.md

  • Remove the trailing colon from the “## MySQL configuration:” header (line 659) to satisfy MD026.
  • Add explicit language identifiers (e.g. ini, python, or bash) to all fences in this file to satisfy MD040.
  • Convert any remaining indented (4-space) code examples to fenced code blocks to satisfy MD046.
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a92c6fd and cb79437.

📒 Files selected for processing (1)
  • docs/Usage.md (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.031Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).
🪛 LanguageTool
docs/Usage.md

[grammar] ~636-~636: There might be a mistake here.
Context: ...ntly supports three databases: - SQLite - MySQL - PostgreSQL The default database...

(QB_NEW_EN)


[grammar] ~637-~637: There might be a mistake here.
Context: ...ports three databases: - SQLite - MySQL - PostgreSQL The default database is SQLi...

(QB_NEW_EN)


[style] ~657-~657: Consider using a less common alternative to make your writing sound more unique and professional.
Context: ...e the default and recommended settings. Feel free to play around and change them according t...

(FEEL_FREE_TO_STYLE_ME)


[grammar] ~657-~657: There might be a mistake here.
Context: ...d and change them according to need. To use SQLite database, ensure that the `engin...

(QB_NEW_EN)

🪛 markdownlint-cli2 (0.17.2)
docs/Usage.md

647-647: Code block style
Expected: fenced; Actual: indented

(MD046, code-block-style)


659-659: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)


663-663: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


681-681: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Test Docker 27.5.0-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker image

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (10)
docs/Usage.md (2)

683-693: Add language to fenced code block (MD040)

Specify a language for the Postgres config example.

-```
+```ini
     engine = "postgres"
     name = "nettacker"
     host = "localhost"
     port = 5432
     username = "root"
     password = "some-password"
     ssl_mode = "disable"
     journal_mode = "WAL"
     synchronous_mode = "NORMAL"

---

`657-660`: **Tighten tone and grammar; “SQLite” capitalization; minor wording**

Reword for concision and formality. Also “SQLite” and “lightweight”.


```diff
-These are the default and recommended settings. Feel free to play around and change them according to need. To use SQLite database, ensure that the `engine` value is set to `sqlite` and the `name` is the path to your database. The `journal_mode` and `synchronous_mode` are chosen to be optimal for multithreaded I/O operations.
+These are the default and recommended settings. Adjust as needed. To use the SQLite database, set `engine` to `sqlite` and `name` to your database path. `journal_mode` and `synchronous_mode` are tuned for multithreaded I/O.

Optional:

-> Note: You can choose to use a lite wrapper for Sqlite called APSW by setting the `use_apsw_for_sqlite` parameter inside config to True for performance enhancements.
+> Note: You can enable the lightweight SQLite wrapper APSW by setting `use_apsw_for_sqlite` to `True` in the config for performance gains.
nettacker/database/db.py (8)

56-61: APSW busy-timeout uses 100 ms-per-second; should be 1000

setbusytimeout expects milliseconds. Convert seconds → ms.

-            connection.setbusytimeout(int(config.settings.timeout) * 100)
+            # Convert seconds to milliseconds
+            connection.setbusytimeout(int(config.settings.timeout) * 1000)

100-113: Critical: connection closed inside retry loop; retries operate on a closed handle

Move close(s) out of the loop, add rollback between attempts, and use configurable retry params.

-    if isinstance(session, tuple):
-        connection, cursor = session
-        for _ in range(100):
-            try:
-                connection.execute("COMMIT")
-                return True
-            except Exception:
-                connection.execute("ROLLBACK")
-                time.sleep(0.1)
-            finally:
-                connection.close()
-        connection.close()
-        logger.warn(messages("database_connect_fail"))
-        return False
+    if isinstance(session, tuple):
+        connection, cursor = session
+        success = False
+        try:
+            for _ in range(Config.settings.max_retries):
+                try:
+                    connection.execute("COMMIT")
+                    success = True
+                    break
+                except Exception:
+                    try:
+                        connection.execute("ROLLBACK")
+                    except Exception:
+                        pass
+                    time.sleep(Config.settings.retry_delay)
+        finally:
+            try:
+                cursor.close()
+            except Exception:
+                pass
+            try:
+                connection.close()
+            except Exception:
+                pass
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success
@@
-    else:
-        try:
-            for _ in range(1, 100):
-                try:
-                    session.commit()
-                    return True
-                except Exception:
-                    time.sleep(0.1)
-            logger.warn(messages("database_connect_fail"))
-            return False
-        except Exception:
-            logger.warn(messages("database_connect_fail"))
-            return False
+    else:
+        success = False
+        for _ in range(Config.settings.max_retries):
+            try:
+                session.commit()
+                success = True
+                break
+            except Exception:
+                session.rollback()
+                time.sleep(Config.settings.retry_delay)
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success

Also applies to: 115-126


147-168: Close APSW connection as well in finally (avoid leak on both success/error paths)

Currently only the cursor is closed here.

-        finally:
-            cursor.close()
+        finally:
+            try:
+                cursor.close()
+            finally:
+                connection.close()

195-219: Close APSW connection in remove_old_logs finally block

Mirror the fix from submit_report_to_db.

-        finally:
-            cursor.close()
+        finally:
+            try:
+                cursor.close()
+            finally:
+                connection.close()

419-419: Contract regression: find_temp_events should return a JSON string; also close connection; fix ORM branch

Align with callers and with find_events; update docstring; close APSW connection.

-        an array with JSON events or an empty array
+        a JSON string or an empty array
@@
-            row = cursor.fetchone()
-            cursor.close()
-            if row:
-                return json.loads(row[0])
+            row = cursor.fetchone()
+            cursor.close()
+            connection.close()
+            if row:
+                return row[0]
             return []
@@
-        return (
-            session.query(TempEvents)
-            .filter(
-                TempEvents.target == target,
-                TempEvents.module_name == module_name,
-                TempEvents.scan_unique_id == scan_id,
-                TempEvents.event_name == event_name,
-            )
-            .first()
-        )
+        result = (
+            session.query(TempEvents)
+            .filter(
+                TempEvents.target == target,
+                TempEvents.module_name == module_name,
+                TempEvents.scan_unique_id == scan_id,
+                TempEvents.event_name == event_name,
+            )
+            .first()
+        )
+        return result.event if result else []

Also applies to: 435-439, 445-454


482-486: Close APSW connection after reads (find_events)

Prevent leaked connections on read-path success and error.

-            rows = cursor.fetchall()
-            cursor.close()
+            rows = cursor.fetchall()
+            cursor.close()
+            connection.close()

638-640: Close APSW connection in last_host_logs on both early-return and success paths

Avoid leaking the connection.

-            if not targets:
-                return structure(status="finished", msg="No more search results")
+            if not targets:
+                cursor.close()
+                connection.close()
+                return structure(status="finished", msg="No more search results")
@@
-            cursor.close()
+            cursor.close()
+            connection.close()
             return hosts

Also applies to: 687-689


534-545: Close APSW connection after select_reports fetch

Add connection.close() after cursor.close().

-            cursor.close()
+            cursor.close()
+            connection.close()
             for row in rows:
🧹 Nitpick comments (4)
docs/Usage.md (3)

647-656: Fix MD046: switch indented block to fenced code block with a language

Use a fenced block for the SQLite snippet to satisfy markdownlint and improve readability.

-    engine = "sqlite"
-    name = str(CWD / ".nettacker/data/nettacker.db")
-    host = ""
-    port = ""
-    username = ""
-    password = ""
-    ssl_mode = "disable"
-    journal_mode = "WAL"
-    synchronous_mode = "NORMAL"
+```ini
+engine = "sqlite"
+name = ".nettacker/data/nettacker.db"
+host = ""
+port = ""
+username = ""
+password = ""
+ssl_mode = "disable"
+journal_mode = "WAL"
+synchronous_mode = "NORMAL"
+```

661-661: Remove trailing colon from heading (MD026)

Drop the colon at the end of the MySQL heading.

-## MySQL configuration:
+## MySQL configuration

665-675: Add language to fenced code block (MD040)

Specify a language for the MySQL config example.

-```
+```ini
     engine = "mysql"
     name = "nettacker"
     host = "localhost"
     port = 3306
     username = "root"
     password = "some-password"
     ssl_mode = "disable"
     journal_mode = "WAL"
     synchronous_mode = "NORMAL"

nettacker/database/db.py (1)

589-599: Use logger.exception for failures; also close connection in APSW branch

Prefer exception-level logging to capture tracebacks; also close APSW connection after cursor.close().


-        cursor.close()
+        cursor.close()
+        connection.close()
@@
-            except IOError as e:
-                logger.error(f"Failed to read report file: {e}")
+            except IOError:
+                logger.exception("Failed to read report file")
                 return None
@@
-        except IOError as e:
-            logger.error(f"Failed to read report file: {e}")
+        except IOError:
+            logger.exception("Failed to read report file")
             return None

And earlier:

-            logger.error(f"Failed to create APSW connection: {e}")
+            logger.exception("Failed to create APSW connection")

Also applies to: 605-610, 68-69
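The practical difference is that logger.exception records the active traceback, which logger.error without exc_info drops; a minimal demonstration (the logger name is illustrative):

```python
import io
import logging

stream = io.StringIO()
logger = logging.getLogger("nettacker.example")  # illustrative name
logger.addHandler(logging.StreamHandler(stream))
logger.setLevel(logging.ERROR)

try:
    raise IOError("report file missing")
except IOError:
    # Logs at ERROR level and appends the current traceback.
    logger.exception("Failed to read report file")

output = stream.getvalue()
print("Traceback" in output)
```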

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6e03522 and 92e0404.

📒 Files selected for processing (3)
  • docs/Usage.md (1 hunks)
  • nettacker/config.py (4 hunks)
  • nettacker/database/db.py (17 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • nettacker/config.py
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-08-31T02:23:10.377Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.377Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:41:18.191Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:26:17.031Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.031Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:37:41.349Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.349Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.

Applied to files:

  • nettacker/database/db.py
🧬 Code graph analysis (1)
nettacker/database/db.py (4)
nettacker/config.py (2)
  • Config (194-198)
  • as_dict (26-27)
nettacker/database/models.py (3)
  • Report (7-27)
  • HostsLog (67-100)
  • TempEvents (30-64)
nettacker/api/helpers.py (1)
  • structure (1-12)
nettacker/core/graph.py (1)
  • build_graph (26-49)
🪛 markdownlint-cli2 (0.17.2)
docs/Usage.md

647-647: Code block style
Expected: fenced; Actual: indented

(MD046, code-block-style)


661-661: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)


665-665: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


683-683: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🪛 Ruff (0.12.2)
nettacker/database/db.py

53-53: Avoid specifying long messages outside the exception class

(TRY003)


66-66: Consider moving this statement to an else block

(TRY300)


68-68: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


105-105: Consider moving this statement to an else block

(TRY300)


106-106: Do not catch blind exception: Exception

(BLE001)


119-119: Consider moving this statement to an else block

(TRY300)


120-120: Do not catch blind exception: Exception

(BLE001)


123-123: Consider moving this statement to an else block

(TRY300)


124-124: Do not catch blind exception: Exception

(BLE001)


162-162: Do not catch blind exception: Exception

(BLE001)


213-213: Do not catch blind exception: Exception

(BLE001)


282-282: Do not catch blind exception: Exception

(BLE001)


286-287: try-except-pass detected, consider logging the exception

(S110)


286-286: Do not catch blind exception: Exception

(BLE001)


363-364: try-except-pass detected, consider logging the exception

(S110)


363-363: Do not catch blind exception: Exception

(BLE001)


371-372: try-except-pass detected, consider logging the exception

(S110)


371-371: Do not catch blind exception: Exception

(BLE001)


374-374: Do not catch blind exception: Exception

(BLE001)


378-379: try-except-pass detected, consider logging the exception

(S110)


378-378: Do not catch blind exception: Exception

(BLE001)


439-439: Consider moving this statement to an else block

(TRY300)


440-440: Do not catch blind exception: Exception

(BLE001)


486-486: Consider moving this statement to an else block

(TRY300)


487-487: Do not catch blind exception: Exception

(BLE001)


544-544: Consider moving this statement to an else block

(TRY300)


546-546: Do not catch blind exception: Exception

(BLE001)


563-563: Do not catch blind exception: Exception

(BLE001)


596-596: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


608-608: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


688-688: Consider moving this statement to an else block

(TRY300)


690-690: Do not catch blind exception: Exception

(BLE001)


1065-1065: Do not catch blind exception: Exception

(BLE001)


1136-1136: Do not catch blind exception: Exception

(BLE001)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Test Docker 26.0.0-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker 26.1.4-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker image
  • GitHub Check: Build package

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
docs/Usage.md (1)

673-673: Make tone concise and add the missing article.

Previous feedback still applies; rephrase to a formal tone.

-These are the default and recommended settings. Feel free to play around and change them according to need. To use SQLite database, ensure that the `engine` value is set to `sqlite` and the `name` is the path to your database. The `journal_mode` and `synchronous_mode` are chosen to be optimal for multithreaded I/O operations.
+These are the default and recommended settings. To use the SQLite database, set `engine` to `sqlite` and `name` to your database path. `journal_mode` and `synchronous_mode` are selected for multithreaded I/O.
🧹 Nitpick comments (6)
docs/Usage.md (6)

650-657: Tighten grammar and tone in DB intro.

Remove the comma after “Nettacker” and avoid “db” slang; keep it concise.

-OWASP Nettacker, currently supports three databases:
+OWASP Nettacker currently supports three databases:
@@
-The default database is SQLite. You can, however, configure the db to your liking.
+The default database is SQLite. You can configure the database to your needs.

660-672: Fix MD046/MD040: use fenced code block with language for SQLite sample.

Convert the indented block to a fenced block and annotate with a language (use ini for consistency with other samples).

-The configurations below are for a SQLite wrapper called **APSW** (Another Python SQLite Wrapper). The configurations can be found inside `nettacker/config.py` file under the `DBConfig` class. 
-
-
-    engine = "sqlite"
-    name = str(CWD / ".nettacker/data/nettacker.db")
-    host = ""
-    port = ""
-    username = ""
-    password = ""
-    ssl_mode = "disable"
-    journal_mode = "WAL"
-    synchronous_mode = "NORMAL"
+The configurations below are for a SQLite wrapper called **APSW** (Another Python SQLite Wrapper). The configurations can be found in `nettacker/config.py` under the `DBConfig` class.
+
+```ini
+engine = "sqlite"
+name = ".nettacker/data/nettacker.db"
+host = ""
+port = ""
+username = ""
+password = ""
+ssl_mode = "disable"
+journal_mode = "WAL"
+synchronous_mode = "NORMAL"
+```

678-678: Remove trailing colon in heading (MD026).

-## MySQL configuration:
+## MySQL configuration

681-691: Add language to fenced block (MD040) and drop irrelevant fields to avoid confusion.

Keep the sample minimal; you already note irrelevant fields aren’t required.

-```
+```ini
     engine = "mysql"
     name = "nettacker"
     host = "localhost"
     port = 3306
     username = "root"
     password = "some-password"
-    ssl_mode = "disable"
-    journal_mode = "WAL"
-    synchronous_mode = "NORMAL"

699-709: Add language to fenced block (MD040) and retain engine string "postgres" per codebase; trim irrelevant fields.

Align with earlier guidance to keep engine “postgres” and remove SQLite-only knobs from the sample.

-```
+```ini
     engine = "postgres"
     name = "nettacker"
     host = "localhost"
     port = 5432
     username = "root"
     password = "some-password"
-    ssl_mode = "disable"
-    journal_mode = "WAL"
-    synchronous_mode = "NORMAL"

711-714: Clarify which fields are ignored and tighten SSL note; optionally align heading to "PostgreSQL".

Be explicit and consistent.

-In this case the irrelevant fields are `journal_mode` and `synchronous_mode`. You don't have to update/change/remove them.
-
-**Note**: If you want encryption, then set `ssl_mode` to `require`.
+In this backend, `journal_mode` and `synchronous_mode` are ignored.
+
+Note: To require TLS, set `ssl_mode = "require"` (if supported by your driver/environment).

Optional: Rename the section heading at Line 695 to “## PostgreSQL configuration” for consistency with common naming.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 92e0404 and 6d91f9d.

📒 Files selected for processing (2)
  • docs/Usage.md (1 hunks)
  • nettacker/core/graph.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • nettacker/core/graph.py
🧰 Additional context used
📓 Path-based instructions (1)
docs/**

📄 CodeRabbit inference engine (AGENTS.md)

Place documentation under docs/

Files:

  • docs/Usage.md
🪛 markdownlint-cli2 (0.17.2)
docs/Usage.md

663-663: Code block style
Expected: fenced; Actual: indented

(MD046, code-block-style)


677-677: Trailing punctuation in heading
Punctuation: ':'

(MD026, no-trailing-punctuation)


681-681: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


699-699: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Test Docker 27.5.0-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker 26.0.0-1ubuntu.24.04noble image build
  • GitHub Check: Test Docker 26.1.4-1ubuntu.24.04noble image build
  • GitHub Check: Build package
  • GitHub Check: Test Docker image
🔇 Additional comments (1)
docs/Usage.md (1)

675-675: Capitalize SQLite and tighten APSW note — flag verified.

Replace the docs line with:

Note: To use APSW for SQLite, set use_apsw_for_sqlite = True in the config for potential performance gains.

Flag defined: nettacker/config.py:151-153. Flag checked: nettacker/database/db.py:51-53.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
nettacker/core/lib/base.py (1)

49-58: Avoid infinite wait; add safe JSON handling

Polling without a timeout risks hanging if the event never arrives; also JSON parsing can throw. Add a max-wait (or attempt cap) and handle decode/KeyError gracefully before appending.

-            while True:
-                event = find_temp_events(target, module_name, scan_id, event_name)
-                if event:
-                    events.append(json.loads(event)["response"]["conditions_results"])
-                    break
-                time.sleep(0.1)
+            max_attempts = 300  # ~30s at 0.1s intervals
+            attempts = 0
+            while attempts < max_attempts:
+                event = find_temp_events(target, module_name, scan_id, event_name)
+                if event:
+                    try:
+                        payload = json.loads(event)
+                        events.append(payload.get("response", {}).get("conditions_results", []))
+                        break
+                    except (ValueError, TypeError, KeyError):
+                        # malformed/partial event; keep waiting briefly
+                        pass
+                time.sleep(0.1)
+                attempts += 1
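As a standalone illustration of the bounded-polling pattern suggested above (the names `poll_until` and `fake_fetch` are hypothetical, not part of Nettacker):

```python
import time


def poll_until(fetch, max_attempts=300, interval=0.1):
    """Poll `fetch` until it returns a truthy value or attempts run out."""
    for _ in range(max_attempts):
        result = fetch()
        if result:
            return result
        time.sleep(interval)
    return None  # caller decides how to handle a timed-out poll


# Simulate an event source that only produces a result on the third poll.
calls = {"n": 0}


def fake_fetch():
    calls["n"] += 1
    return '{"ok": true}' if calls["n"] >= 3 else None


result = poll_until(fake_fetch, interval=0)
print(result)  # '{"ok": true}'
```

The key difference from the original `while True` loop is that a persistently missing event exits with `None` after `max_attempts` instead of hanging the worker thread forever.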
♻️ Duplicate comments (5)
nettacker/database/db.py (5)

56-60: Use milliseconds for APSW busy timeout

APSW’s setbusytimeout expects milliseconds; multiplying by 100 under-waits by 10x for second-based configs.

-            connection.setbusytimeout(int(config.settings.timeout) * 100)
+            # seconds -> milliseconds
+            connection.setbusytimeout(int(config.settings.timeout) * 1000)
APSW's setbusytimeout wraps SQLite's sqlite3_busy_timeout, which takes milliseconds.
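A quick way to sanity-check the unit is the equivalent `busy_timeout` PRAGMA, which the stdlib `sqlite3` module exposes; like the `sqlite3_busy_timeout` call underlying APSW's `setbusytimeout`, it is specified in milliseconds:

```python
import sqlite3

TIMEOUT_SECONDS = 5

conn = sqlite3.connect(":memory:")
# SQLite's busy timeout is specified in milliseconds, so convert from seconds.
conn.execute(f"PRAGMA busy_timeout = {TIMEOUT_SECONDS * 1000}")

# Reading the PRAGMA back returns the configured value in milliseconds.
ms = conn.execute("PRAGMA busy_timeout").fetchone()[0]
print(ms)  # 5000
```

With the `* 100` version, a 5-second config would wait only 500 ms before raising BusyError.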

100-114: APSW commit retry closes connection inside the loop (breaks retries) and can leak cursor

Closing per-iteration invalidates subsequent attempts. Close once after the loop; keep the finally for tests but outside the retry. Also use configured retry/delay.

-    if isinstance(session, tuple):
-        connection, cursor = session
-        for _ in range(100):
-            try:
-                connection.execute("COMMIT")
-                return True
-            except Exception:
-                connection.execute("ROLLBACK")
-                time.sleep(0.1)
-            finally:
-                connection.close()
-        connection.close()
-        logger.warn(messages("database_connect_fail"))
-        return False
+    if isinstance(session, tuple):
+        connection, cursor = session
+        success = False
+        try:
+            for _ in range(Config.settings.max_retries):
+                try:
+                    connection.execute("COMMIT")
+                    success = True
+                    break
+                except Exception:
+                    try:
+                        connection.execute("ROLLBACK")
+                    except Exception:
+                        pass
+                    time.sleep(Config.settings.retry_delay)
+        finally:
+            try:
+                cursor.close()
+            except Exception:
+                pass
+            try:
+                connection.close()
+            except Exception:
+                pass
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success

642-644: Close APSW cursor/connection on early return

This path leaks resources when no targets are found.

-            if not targets:
-                return structure(status="finished", msg="No more search results")
+            if not targets:
+                cursor.close()
+                connection.close()
+                return structure(status="finished", msg="No more search results")

115-126: ORM commit retry lacks rollback and uses hardcoded policy

Add rollback between attempts and read retry config.

-        try:
-            for _ in range(1, 100):
-                try:
-                    session.commit()
-                    return True
-                except Exception:
-                    time.sleep(0.1)
-            logger.warn(messages("database_connect_fail"))
-            return False
-        except Exception:
-            logger.warn(messages("database_connect_fail"))
-            return False
+        for _ in range(Config.settings.max_retries):
+            try:
+                session.commit()
+                return True
+            except Exception:
+                try:
+                    session.rollback()
+                except Exception:
+                    pass
+                time.sleep(Config.settings.retry_delay)
+        logger.warn(messages("database_connect_fail"))
+        return False
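A self-contained sketch of the rollback-between-attempts pattern (the `FlakySession` stub is hypothetical and only simulates a transiently failing commit):

```python
import time


def commit_with_retry(session, max_retries=3, retry_delay=0.0):
    """Attempt to commit; roll back the failed transaction before each retry."""
    for _ in range(max_retries):
        try:
            session.commit()
            return True
        except Exception:
            session.rollback()  # reset session state so the next commit can proceed
            time.sleep(retry_delay)
    return False


class FlakySession:
    """Fake session whose commit fails a fixed number of times, then succeeds."""

    def __init__(self, failures):
        self.failures = failures
        self.rollbacks = 0

    def commit(self):
        if self.failures > 0:
            self.failures -= 1
            raise RuntimeError("database is locked")

    def rollback(self):
        self.rollbacks += 1


s = FlakySession(failures=2)
print(commit_with_retry(s))  # True
print(s.rollbacks)  # 2
```

Without the `rollback()` call, SQLAlchemy sessions stay in a failed-transaction state and every subsequent `commit()` attempt in the loop fails for the same reason.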

780-792: Unify json_event type across backends (parse to dict in ORM path)

APSW branch returns parsed dicts; ORM path returns raw string. Parse for consistency.

-                "json_event": log.json_event,
+                "json_event": json.loads(log.json_event) if log.json_event else {},
🧹 Nitpick comments (3)
report.html (1)

1-1: Make the report template valid HTML and easier to populate

This one-line blob is invalid HTML and hard to maintain. Replace with a minimal, valid skeleton with placeholders the renderer fills.

-<table><graph_html>/*css*/</table>datetargetmodule_nameportlogsjson_event<tr>nowx</tr></table><div id="json_length">1</div><p class="footer">Software Details: OWASP Nettacker version 1.0 [beta] in now ScanID: scan-id</p><script>/*js*/</script>
+<!doctype html>
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1">
+    <style>/*css*/</style>
+    <title>Nettacker Report</title>
+  </head>
+  <body>
+    <div id="graph"><graph_html></graph_html></div>
+    <table id="events">
+      <thead><tr><th>date</th><th>target</th><th>module_name</th><th>port</th><th>logs</th><th>json_event</th></tr></thead>
+      <tbody><!-- rows injected here --></tbody>
+    </table>
+    <div id="json_length">1</div>
+    <p class="footer">Software Details: OWASP Nettacker version 1.0 [beta] in now ScanID: scan-id</p>
+    <script>/*js*/</script>
+  </body>
+</html>
tests/database/test_db.py (2)

706-722: Align expected return type for SQLAlchemy path of find_temp_events (JSON string, not dict)

The implementation returns a JSON string (result.event). Update the mock and assertion accordingly.

-        fake_result = MagicMock()
-        fake_result.event = {"foo": "bar"}
+        fake_result = MagicMock()
+        fake_result.event = '{"foo":"bar"}'
@@
-        assert result == {"foo": "bar"}
+        assert result == '{"foo":"bar"}'

860-869: The exception is set on an unused attribute

select_reports uses cursor.execute(...), not cursor.query. Patch execute.side_effect so the test actually exercises the error path.

-        mock_cursor.query.side_effect = Exception("DB Error")
+        mock_cursor.execute.side_effect = Exception("DB Error")
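To illustrate why the attribute matters: `side_effect` only fires on the method the code under test actually invokes, so setting it on an unused `query` attribute leaves `execute` succeeding silently.

```python
from unittest.mock import MagicMock

cursor = MagicMock()
cursor.execute.side_effect = Exception("DB Error")

failed = False
try:
    cursor.execute("SELECT 1")  # raises, because execute carries the side_effect
except Exception as exc:
    failed = str(exc) == "DB Error"

# Misconfigured mock: query is never called by the code under test,
# so execute() returns a MagicMock and never raises.
wrong = MagicMock()
wrong.query.side_effect = Exception("DB Error")
wrong.execute("SELECT 1")  # does not raise

print(failed)  # True
```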
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6d91f9d and 5a02e5f.

📒 Files selected for processing (4)
  • nettacker/core/lib/base.py (1 hunks)
  • nettacker/database/db.py (17 hunks)
  • report.html (1 hunks)
  • tests/database/test_db.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (5)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use 4-space indents in Python code
Limit lines to 99 characters (ruff/ruff-format/isort profile=black)
Module and file names should use lower_snake_case
Function and variable names should use lower_snake_case
Class names should use PascalCase
Constants should use UPPER_SNAKE_CASE
Keep functions small and add type hints where practical

Files:

  • nettacker/core/lib/base.py
  • nettacker/database/db.py
  • tests/database/test_db.py
nettacker/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Add docstrings for public APIs in the nettacker package

Files:

  • nettacker/core/lib/base.py
  • nettacker/database/db.py
nettacker/core/**

📄 CodeRabbit inference engine (AGENTS.md)

Place core libraries under nettacker/core/

Files:

  • nettacker/core/lib/base.py
tests/**/test_*.py

📄 CodeRabbit inference engine (AGENTS.md)

tests/**/test_*.py: Place tests under tests/ and name files as test_*.py
Use pytest (with pytest-asyncio, xdist) for tests

Files:

  • tests/database/test_db.py
tests/**

📄 CodeRabbit inference engine (AGENTS.md)

Mirror package layout in tests (e.g., tests/core/, tests/lib/, etc.)

Files:

  • tests/database/test_db.py
🧠 Learnings (4)
📚 Learning: 2025-08-31T14:41:18.191Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).

Applied to files:

  • nettacker/core/lib/base.py
  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:26:17.031Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.031Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.

Applied to files:

  • nettacker/core/lib/base.py
  • nettacker/database/db.py
📚 Learning: 2025-08-31T02:23:10.377Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.377Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.

Applied to files:

  • nettacker/database/db.py
  • tests/database/test_db.py
📚 Learning: 2025-08-31T14:37:41.349Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.349Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.

Applied to files:

  • nettacker/database/db.py
🧬 Code graph analysis (2)
nettacker/database/db.py (4)
nettacker/config.py (1)
  • Config (194-198)
nettacker/database/models.py (3)
  • Report (7-27)
  • HostsLog (67-100)
  • TempEvents (30-64)
nettacker/api/helpers.py (1)
  • structure (1-12)
nettacker/core/graph.py (1)
  • build_graph (28-51)
tests/database/test_db.py (4)
nettacker/api/helpers.py (1)
  • structure (1-12)
nettacker/database/db.py (17)
  • db_inputs (22-40)
  • create_connection (43-85)
  • send_submit_query (88-127)
  • submit_report_to_db (130-178)
  • remove_old_logs (181-229)
  • submit_logs_to_db (232-313)
  • submit_temp_logs_to_db (316-403)
  • find_temp_events (406-455)
  • find_events (458-502)
  • select_reports (505-568)
  • get_scan_result (571-613)
  • last_host_logs (616-737)
  • get_logs_by_scan_id (740-792)
  • get_options_by_scan_id (795-824)
  • logs_to_report_json (827-880)
  • logs_to_report_html (883-990)
  • search_logs (993-1150)
tests/database/test_sqlite.py (1)
  • mock_config (12-14)
nettacker/config.py (1)
  • as_dict (26-27)
🪛 Ruff (0.12.2)
nettacker/database/db.py

53-53: Avoid specifying long messages outside the exception class

(TRY003)


66-66: Consider moving this statement to an else block

(TRY300)


68-68: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


105-105: Consider moving this statement to an else block

(TRY300)


106-106: Do not catch blind exception: Exception

(BLE001)


119-119: Consider moving this statement to an else block

(TRY300)


120-120: Do not catch blind exception: Exception

(BLE001)


123-123: Consider moving this statement to an else block

(TRY300)


124-124: Do not catch blind exception: Exception

(BLE001)


162-162: Do not catch blind exception: Exception

(BLE001)


214-214: Do not catch blind exception: Exception

(BLE001)


284-284: Do not catch blind exception: Exception

(BLE001)


288-289: try-except-pass detected, consider logging the exception

(S110)


288-288: Do not catch blind exception: Exception

(BLE001)


363-364: try-except-pass detected, consider logging the exception

(S110)


363-363: Do not catch blind exception: Exception

(BLE001)


371-372: try-except-pass detected, consider logging the exception

(S110)


371-371: Do not catch blind exception: Exception

(BLE001)


374-374: Do not catch blind exception: Exception

(BLE001)


378-379: try-except-pass detected, consider logging the exception

(S110)


378-378: Do not catch blind exception: Exception

(BLE001)


438-438: Consider moving this statement to an else block

(TRY300)


439-439: Do not catch blind exception: Exception

(BLE001)


488-488: Consider moving this statement to an else block

(TRY300)


489-489: Do not catch blind exception: Exception

(BLE001)


547-547: Consider moving this statement to an else block

(TRY300)


549-549: Do not catch blind exception: Exception

(BLE001)


566-566: Do not catch blind exception: Exception

(BLE001)


600-600: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


612-612: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


693-693: Consider moving this statement to an else block

(TRY300)


695-695: Do not catch blind exception: Exception

(BLE001)


1075-1075: Do not catch blind exception: Exception

(BLE001)


1146-1146: Do not catch blind exception: Exception

(BLE001)

tests/database/test_db.py

34-34: Probable insecure usage of temporary file or directory: "/tmp/test_report.json"

(S108)


116-116: Probable insecure usage of temporary file or directory: "/tmp/test.db"

(S108)


126-126: Probable insecure usage of temporary file or directory: "/tmp/test.db"

(S108)


185-185: Unused method argument: mock_messages

(ARG002)


188-188: Create your own exception

(TRY002)


188-188: Avoid specifying long messages outside the exception class

(TRY003)


214-214: Unused method argument: mock_messages

(ARG002)


248-248: Probable insecure usage of temporary file or directory: "/tmp/test_report.json"

(S108)


376-376: Unused method argument: mock_messages

(ARG002)


383-383: Unused method argument: mock_submit

(ARG002)


476-476: Unused method argument: mock_submit

(ARG002)


495-495: Unused method argument: mock_submit

(ARG002)


548-548: Unused method argument: mock_messages

(ARG002)


598-598: Unused method argument: mock_temp

(ARG002)


693-693: Unused method argument: mock_messages

(ARG002)


832-832: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


852-852: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


879-879: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


892-892: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


917-917: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


933-933: Probable insecure usage of temporary file or directory: "/tmp/report.json"

(S108)


943-943: Probable insecure usage of temporary file or directory: "/tmp/mock_report.json"

(S108)


948-948: Probable insecure usage of temporary file or directory: "/tmp/mock_report.json"

(S108)


951-951: Probable insecure usage of temporary file or directory: "/tmp/mock_report.json"

(S108)


1150-1150: Unused method argument: mock_messages

(ARG002)

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (4)
nettacker/database/db.py (4)

56-66: APSW busy-timeout uses 100×; APSW expects milliseconds.

If config.settings.timeout is seconds, multiply by 1000, not 100. Current value shortens the wait and can inflate BusyError retries.

-            connection.setbusytimeout(int(config.settings.timeout) * 100)
+            connection.setbusytimeout(int(config.settings.timeout) * 1000)

Also applies to: 59-59


19-19: Rename local logger to avoid shadowing the imported module.

logger = logger.get_logger() overwrites the imported logger module symbol. Use a distinct name (e.g., log) and update references in this file.

-logger = logger.get_logger()
+log = logger.get_logger()

100-114: Retry loop closes the connection each iteration; breaks retries.

finally: connection.close() runs on every loop pass, so subsequent retries operate on a closed connection. Close once after the loop; use configured retry params.

-    if isinstance(session, tuple):
-        connection, cursor = session
-        for _ in range(100):
-            try:
-                connection.execute("COMMIT")
-                return True
-            except Exception:
-                connection.execute("ROLLBACK")
-                time.sleep(0.1)
-            finally:
-                connection.close()
-        connection.close()
-        logger.warn(messages("database_connect_fail"))
-        return False
+    if isinstance(session, tuple):
+        connection, cursor = session
+        success = False
+        try:
+            for _ in range(Config.settings.max_retries):
+                try:
+                    connection.execute("COMMIT")
+                    success = True
+                    break
+                except Exception:
+                    try:
+                        connection.execute("ROLLBACK")
+                    except Exception:
+                        pass
+                    time.sleep(Config.settings.retry_delay)
+        finally:
+            try:
+                cursor.close()
+            finally:
+                connection.close()
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success

115-126: SQLAlchemy commit loop lacks rollback and uses hardcoded policy.

Add rollback() and use Config.settings.{max_retries,retry_delay}.

-    else:
-        try:
-            for _ in range(1, 100):
-                try:
-                    session.commit()
-                    return True
-                except Exception:
-                    time.sleep(0.1)
-            logger.warn(messages("database_connect_fail"))
-            return False
-        except Exception:
-            logger.warn(messages("database_connect_fail"))
-            return False
+    else:
+        success = False
+        for _ in range(Config.settings.max_retries):
+            try:
+                session.commit()
+                success = True
+                break
+            except Exception:
+                session.rollback()
+                time.sleep(Config.settings.retry_delay)
+        if not success:
+            logger.warn(messages("database_connect_fail"))
+        return success
🧹 Nitpick comments (6)
nettacker/database/db.py (6)

291-293: Typo in log message (“retires”).

Fix the spelling to “retries” for clarity.

-                # All retires exhausted but we want to continue operation
+                # All retries exhausted but we want to continue operation

337-338: Use connection for BEGIN to match other APSW paths.

Be consistent with submit_logs_to_db which uses connection.execute("BEGIN").

-                        if not connection.in_transaction:
-                            cursor.execute("BEGIN")
+                        if not connection.in_transaction:
+                            connection.execute("BEGIN")

406-418: Docstring return type is inaccurate for find_temp_events.

Function returns a single JSON string (or []) due to LIMIT 1, not “an array”.

-    Returns:
-        an array with JSON events or an empty array
+    Returns:
+        a JSON string (event) or [] if not found

780-792: Inconsistent json_event type between APSW and ORM branches.

APSW returns dict via json.loads, ORM returns raw string. Normalize to dict.

-                "json_event": log.json_event,
+                "json_event": json.loads(log.json_event) if log.json_event else {},

68-68: Prefer logger.exception() to capture stack traces.

Improves diagnosability over plain error logging.

-            logger.error(f"Failed to create APSW connection: {e}")
+            logger.exception(f"Failed to create APSW connection: {e}")
@@
-                logger.error(f"Failed to read report file: {e}")
+                logger.exception(f"Failed to read report file: {e}")
@@
-            logger.error(f"Failed to read report file: {e}")
+            logger.exception(f"Failed to read report file: {e}")

Also applies to: 600-600, 612-612


1033-1040: GROUP BY with non-aggregated date is undefined in SQLite; select MAX(date).

Prevents arbitrary date values per group.

-                    SELECT date, module_name, port, event, json_event FROM scan_events
+                    SELECT MAX(date) AS date, module_name, port, event, json_event FROM scan_events
                     WHERE target = ?
-                    GROUP BY module_name, port, scan_unique_id, event
+                    GROUP BY module_name, port, scan_unique_id, event
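A minimal `sqlite3` demo of why the aggregate makes the per-group date deterministic (the table and values here are made up, not Nettacker's real schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scan_events (date TEXT, module_name TEXT, port INTEGER)")
conn.executemany(
    "INSERT INTO scan_events VALUES (?, ?, ?)",
    [
        ("2025-01-01", "port_scan", 80),
        ("2025-01-03", "port_scan", 80),
        ("2025-01-02", "dir_scan", 443),
    ],
)

# Without MAX(), SQLite may return an arbitrary date from each group;
# aggregating pins it to the latest one.
rows = conn.execute(
    "SELECT MAX(date), module_name, port FROM scan_events GROUP BY module_name, port"
).fetchall()
print(sorted(rows))
# [('2025-01-02', 'dir_scan', 443), ('2025-01-03', 'port_scan', 80)]
```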
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5a02e5f and 5c7c1fb.

📒 Files selected for processing (1)
  • nettacker/database/db.py (17 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Use 4-space indents in Python code
Limit lines to 99 characters (ruff/ruff-format/isort profile=black)
Module and file names should use lower_snake_case
Function and variable names should use lower_snake_case
Class names should use PascalCase
Constants should use UPPER_SNAKE_CASE
Keep functions small and add type hints where practical

Files:

  • nettacker/database/db.py
nettacker/**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

Add docstrings for public APIs in the nettacker package

Files:

  • nettacker/database/db.py
🧠 Learnings (4)
📚 Learning: 2025-08-31T02:23:10.377Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:275-276
Timestamp: 2025-08-31T02:23:10.377Z
Learning: In nettacker/database/db.py, the submit_logs_to_db and submit_temp_logs_to_db functions intentionally return True after exhausting all database retry attempts. This is a design decision to ensure operational continuity - allowing scans to continue even if individual log entries fail to be inserted into the database. Returning False would break the scanning operation.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:41:18.191Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:400-427
Timestamp: 2025-08-31T14:41:18.191Z
Learning: In the find_temp_events function in nettacker/database/db.py, both APSW and SQLAlchemy branches must return JSON strings for consistency. The calling code expects JSON strings, not parsed dictionaries. The APSW branch should return row[0] directly (raw JSON string from database) and SQLAlchemy should return result.event (JSON string from ORM).

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:26:17.031Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:477-477
Timestamp: 2025-08-31T14:26:17.031Z
Learning: In the find_events function in nettacker/database/db.py, the calling code expects JSON strings, not dictionaries. Both APSW and SQLAlchemy branches must return strings for consistency. The pattern json.dumps(json.loads()) in APSW is required to match the SQLAlchemy branch which returns raw JSON strings from the database.

Applied to files:

  • nettacker/database/db.py
📚 Learning: 2025-08-31T14:37:41.349Z
Learnt from: pUrGe12
PR: OWASP/Nettacker#1076
File: nettacker/database/db.py:85-98
Timestamp: 2025-08-31T14:37:41.349Z
Learning: In nettacker/database/db.py, the send_submit_query function requires a finally block for proper testcase execution. The finally block should not be removed as it's specifically needed for testing scenarios to ensure proper resource cleanup.

Applied to files:

  • nettacker/database/db.py
🧬 Code graph analysis (1)
nettacker/database/db.py (4)
nettacker/config.py (2)
  • Config (194-198)
  • as_dict (26-27)
nettacker/database/models.py (3)
  • Report (7-27)
  • HostsLog (67-100)
  • TempEvents (30-64)
nettacker/api/helpers.py (1)
  • structure (1-12)
nettacker/core/graph.py (1)
  • build_graph (28-51)
🪛 Ruff (0.12.2)
nettacker/database/db.py

53-53: Avoid specifying long messages outside the exception class

(TRY003)


66-66: Consider moving this statement to an else block

(TRY300)


68-68: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


105-105: Consider moving this statement to an else block

(TRY300)


106-106: Do not catch blind exception: Exception

(BLE001)


119-119: Consider moving this statement to an else block

(TRY300)


120-120: Do not catch blind exception: Exception

(BLE001)


123-123: Consider moving this statement to an else block

(TRY300)


124-124: Do not catch blind exception: Exception

(BLE001)


162-162: Do not catch blind exception: Exception

(BLE001)


214-214: Do not catch blind exception: Exception

(BLE001)


284-284: Do not catch blind exception: Exception

(BLE001)


288-289: try-except-pass detected, consider logging the exception

(S110)


288-288: Do not catch blind exception: Exception

(BLE001)


363-364: try-except-pass detected, consider logging the exception

(S110)


363-363: Do not catch blind exception: Exception

(BLE001)


371-372: try-except-pass detected, consider logging the exception

(S110)


371-371: Do not catch blind exception: Exception

(BLE001)


374-374: Do not catch blind exception: Exception

(BLE001)


378-379: try-except-pass detected, consider logging the exception

(S110)


378-378: Do not catch blind exception: Exception

(BLE001)


438-438: Consider moving this statement to an else block

(TRY300)


439-439: Do not catch blind exception: Exception

(BLE001)


488-488: Consider moving this statement to an else block

(TRY300)


489-489: Do not catch blind exception: Exception

(BLE001)


547-547: Consider moving this statement to an else block

(TRY300)


549-549: Do not catch blind exception: Exception

(BLE001)


566-566: Do not catch blind exception: Exception

(BLE001)


600-600: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


612-612: Use logging.exception instead of logging.error

Replace with exception

(TRY400)


693-693: Consider moving this statement to an else block

(TRY300)


695-695: Do not catch blind exception: Exception

(BLE001)


1075-1075: Do not catch blind exception: Exception

(BLE001)


1079-1080: try-except-pass detected, consider logging the exception

(S110)


1079-1079: Do not catch blind exception: Exception

(BLE001)


1151-1151: Do not catch blind exception: Exception

(BLE001)

🔇 Additional comments (4)
nettacker/database/db.py (4)

141-179: Report insertion path looks solid.

Transaction boundaries, parameter binding, and cleanup are correct.


192-230: Old logs removal path looks good.

Uses parameterized DELETE and shared commit helper.


470-503: Consistent JSON-string returns across backends.

find_events aligns APSW and ORM to return JSON strings; matches calling code expectations.


803-825: Keep returning raw JSON strings — no change required.

Call sites (nettacker/core/graph.py:340–344) and tests (tests/database/test_db.py:1036–1040) call json.loads(item["options"]), so the current implementation returning raw JSON strings is correct.

Comment on lines +284 to +291
except Exception:
try:
if connection.in_transaction:
connection.execute("ROLLBACK")
except Exception:
pass
return False
# All retries exhausted but we want to continue operation

🛠️ Refactor suggestion

Broad excepts hide root causes and can skip cleanup.

Narrow exceptions where possible (e.g., apsw.BusyError, apsw.ConstraintError, SQLAlchemy SQLAlchemyError) and ensure cleanup via finally. At minimum, log with logger.exception(...).

Also applies to: 375-380, 439-441, 489-491, 695-697, 1075-1081
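One possible shape of the pattern, with sqlite3 standing in for apsw (sqlite3's OperationalError/IntegrityError play the role of apsw.BusyError and apsw.ConstraintError; the function name mirrors db.py but the body is illustrative only):

```python
import logging
import sqlite3  # stand-in for apsw, which may not be installed

logger = logging.getLogger("nettacker.db.demo")

def send_submit_query(connection, sql, params):
    """Narrowed excepts plus a finally that always cleans up."""
    try:
        connection.execute("BEGIN")
        connection.execute(sql, params)
        connection.execute("COMMIT")
        return True
    except (sqlite3.OperationalError, sqlite3.IntegrityError):
        # Record the stack trace instead of swallowing the root cause.
        logger.exception("submit query failed")
        return False
    finally:
        # Runs on every path, even if an unexpected exception
        # propagates past the narrow except clauses.
        if connection.in_transaction:
            connection.execute("ROLLBACK")

# Usage: the duplicate insert trips the ConstraintError analogue,
# gets logged with a traceback, and the transaction is rolled back.
conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit, so explicit BEGIN works
conn.execute("CREATE TABLE hosts_log (id INTEGER PRIMARY KEY)")
assert send_submit_query(conn, "INSERT INTO hosts_log VALUES (?)", (1,)) is True
assert send_submit_query(conn, "INSERT INTO hosts_log VALUES (?)", (1,)) is False
```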

🧰 Tools
🪛 Ruff (0.12.2)

284-284: Do not catch blind exception: Exception

(BLE001)


288-289: try-except-pass detected, consider logging the exception

(S110)


288-288: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
In nettacker/database/db.py around lines 284-291 (and similarly at 375-380,
439-441, 489-491, 695-697, 1075-1081), the code uses broad bare excepts which
hide root causes and can skip cleanup; replace those bare except Exception
blocks with narrow exception catches (e.g., apsw.BusyError,
apsw.ConstraintError, sqlalchemy.SQLAlchemyError or the specific DB driver
exceptions you expect), add logger.exception(...) inside the except to record
the stack trace, and move transaction cleanup (connection.execute("ROLLBACK"))
into a finally block (or ensure it runs in an except/finally sequence) so
rollback always executes when needed; if multiple exception types require
different handling, catch them separately and re-raise or return appropriately
while still logging.

Comment on lines +642 to +644
if not targets:
return structure(status="finished", msg="No more search results")


⚠️ Potential issue

Close APSW resources on early return.

Resource leak: the cursor and connection are never closed when no targets are found.

-            if not targets:
-                return structure(status="finished", msg="No more search results")
+            if not targets:
+                cursor.close()
+                connection.close()
+                return structure(status="finished", msg="No more search results")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-            if not targets:
-                return structure(status="finished", msg="No more search results")
+            if not targets:
+                cursor.close()
+                connection.close()
+                return structure(status="finished", msg="No more search results")
🤖 Prompt for AI Agents
In nettacker/database/db.py around lines 642 to 644 the function returns early
when no targets without closing APSW resources; ensure the cursor and connection
are closed before the return to avoid leaks — either call cursor.close() and
connection.close() just before returning or refactor the function to use a
try/finally (or context manager) that always closes the cursor and connection,
then return the structure(status="finished", msg="No more search results") after
resources are cleaned up.
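The context-manager variant of that refactor can be sketched as below (sqlite3 stands in for APSW, and the function name and return shape are illustrative, modeled on the `structure(...)` helper):

```python
import sqlite3
from contextlib import closing

def search_logs(db_path=":memory:"):
    """Both closing() blocks release the cursor and connection on
    every return path, including the early "no targets" return."""
    with closing(sqlite3.connect(db_path)) as connection:
        with closing(connection.cursor()) as cursor:
            cursor.execute("SELECT 1 WHERE 0")  # simulate an empty result set
            targets = cursor.fetchall()
            if not targets:
                # Early return is safe: cleanup still runs.
                return {"status": "finished", "msg": "No more search results"}
            return {"status": "ok", "targets": targets}
```

This keeps the happy path and the empty path symmetric without duplicating close() calls before each return.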

Labels
gsoc GSoC work
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants