diff --git a/pages/advanced-algorithms/available-algorithms/migrate.mdx b/pages/advanced-algorithms/available-algorithms/migrate.mdx index 4609c2c63..884b57e6d 100644 --- a/pages/advanced-algorithms/available-algorithms/migrate.mdx +++ b/pages/advanced-algorithms/available-algorithms/migrate.mdx @@ -32,6 +32,19 @@ filter, and convert relational data into a graph format. | **Implementation** | Python | | **Parallelism** | sequential | + +When running multiple migrations against the same source, avoid repeating the `config` map in every call. +Use [server-side parameters](/database-management/server-side-parameters) to store the connection config once +and reference it as `$pg_config` across all your queries: + +```cypher +SET GLOBAL PARAMETER pg_config = {user: 'memgraph', password: 'password', host: 'localhost', database: 'demo_db'}; + +CALL migrate.postgresql('users', $pg_config) YIELD row CREATE (u:User {id: row.id}); +CALL migrate.postgresql('orders', $pg_config) YIELD row CREATE (o:Order {id: row.id}); +``` + + --- ## Procedures diff --git a/pages/ai-ecosystem/graph-rag/atomic-pipelines.mdx b/pages/ai-ecosystem/graph-rag/atomic-pipelines.mdx index 5955144d8..bdd1f81fa 100644 --- a/pages/ai-ecosystem/graph-rag/atomic-pipelines.mdx +++ b/pages/ai-ecosystem/graph-rag/atomic-pipelines.mdx @@ -57,9 +57,9 @@ Offer the ability to compute embeddings during the preprocessing or retrieval stages. 11. [`llm.complete`](/advanced-algorithms/available-algorithms/llm) function: Allows you to call any LLM under any given Cypher query. -12. Server-side parameters: Could be used for many different things, but in this -context, the parameters help you with managing configuration under any given -query or pipeline. +12. [Server-side parameters](/database-management/server-side-parameters): Could +be used for many different things, but in this context, the parameters help you +with managing configuration under any given query or pipeline. 
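To make the configuration-management point concrete, a minimal parameter lifecycle might look as follows. This is an illustrative sketch: the syntax mirrors the `SET [GLOBAL] PARAMETER`, `SHOW PARAMETERS`, and `UNSET [GLOBAL] PARAMETER` forms listed in the privilege table added in this change, and `pipeline_config` is a hypothetical parameter name:

```cypher
// Store the pipeline configuration once on the server.
SET GLOBAL PARAMETER pipeline_config = {model: 'gpt-4o', temperature: 0.0};

// Inspect which parameters are currently defined on the instance.
SHOW PARAMETERS;

// Reference the value as $pipeline_config in queries; remove it when done.
UNSET GLOBAL PARAMETER pipeline_config;
```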
## Question and Pipeline Types @@ -107,4 +107,4 @@ exists, it synthesizes what is missing or not well-covered. Example use cases: - **Publishing**: *"I want to write a book, which important topics in this domain are not yet well covered in the existing literature?"* -![atomic_graphrag_pipelines](/pages/ai-ecosystem/graph-rag/atomic-pipelines/atomic-graphrag-pipelines.png) \ No newline at end of file +![atomic_graphrag_pipelines](/pages/ai-ecosystem/graph-rag/atomic-pipelines/atomic-graphrag-pipelines.png) diff --git a/pages/clustering/high-availability/best-practices.mdx b/pages/clustering/high-availability/best-practices.mdx index 6ea252f75..cbfd1264d 100644 --- a/pages/clustering/high-availability/best-practices.mdx +++ b/pages/clustering/high-availability/best-practices.mdx @@ -279,6 +279,19 @@ The configuration value can be controlled using the query: SET COORDINATOR SETTING 'max_replica_read_lag' TO '10'; ``` +### `deltas_batch_progress_size` + +Users can control how often replicas report back to the main that they're still processing the data (transactions, WALs, snapshots) +the main has sent to them. The default value is 100,000, which should be enough for most transactions. +However, if processing 100,000 deltas takes more than 30 seconds (because you're dealing with large deltas or older CPUs), +you can set the configuration value `deltas_batch_progress_size` to a smaller value. This avoids timeouts on replicas, so +you won't see the query exception "At least one SYNC replica has not committed", at the cost of slightly lower throughput, since replicas +will send in-progress messages to the main more often. + +``` +SET COORDINATOR SETTING 'deltas_batch_progress_size' TO '50000'; +``` + ## Observability Monitoring cluster health is essential. 
Key metrics include: diff --git a/pages/clustering/high-availability/ha-commands-reference.mdx b/pages/clustering/high-availability/ha-commands-reference.mdx index 84fe3e3c2..b5adec944 100644 --- a/pages/clustering/high-availability/ha-commands-reference.mdx +++ b/pages/clustering/high-availability/ha-commands-reference.mdx @@ -14,12 +14,18 @@ Memgraph High Availability (HA) cluster. ## Cluster registration commands -**Important:** **All registration commands (adding coordinators and registering -data instances) must be executed on the *same* coordinator.** You may choose any +**Important:** You may choose any coordinator for the initial setup; it automatically becomes the leader. After setup, the choice no longer matters. + +All queries can be run on any coordinator. If the coordinator you are connected to is not +currently the leader, the query will be automatically forwarded to the current leader and +executed there. This is because the Raft protocol specifies that only the +leader should accept changes in the cluster. + + ### `ADD COORDINATOR` Adds a coordinator to the cluster. @@ -78,7 +84,6 @@ REMOVE COORDINATOR coordinatorId; {

Behavior & implications

} -- Must be executed on the **leader** coordinator. - Leader coordinator **cannot** remove itself. To remove the leader, first trigger a leadership change. @@ -106,7 +111,6 @@ of each instance. {

Behavior & implications

} -- Must be executed on the **leader** coordinator. - Only bolt server can be updated. {

Example

} @@ -294,8 +298,7 @@ SHOW INSTANCE; ### `SHOW REPLICATION LAG` -Shows replication lag (in committed transactions) for all instances. Must be run -on the **leader**. +Shows replication lag (in committed transactions) for all instances. ```cypher SHOW REPLICATION LAG; @@ -326,7 +329,5 @@ FORCE RESET CLUSTER STATE; {

Implications

} -- Must be executed on the **leader** coordinator. - diff --git a/pages/clustering/high-availability/how-high-availability-works.mdx b/pages/clustering/high-availability/how-high-availability-works.mdx index 0e31c61b6..1db882d1e 100644 --- a/pages/clustering/high-availability/how-high-availability-works.mdx +++ b/pages/clustering/high-availability/how-high-availability-works.mdx @@ -133,9 +133,20 @@ Below is a cleaned-up categorization. #### Coordinator → Coordinator RPCs -| RPC | Purpose | Description | -| -------------------- | ------------------------------------------- | -------------------------------------------------------------------------------------------------------------------- | -| `ShowInstancesRpc` | Follower requests cluster state from leader | Sent by a follower coordinator to the leader coordinator when a user executes `SHOW INSTANCES` through the follower. | +| RPC | Purpose | Description | +| ------------------------ | -------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------- | +| `ShowInstancesRpc` | Follower requests cluster state from leader. | Sent by a follower coordinator to the leader coordinator when a user executes `SHOW INSTANCES` through the follower. | +| `AddCoordinatorRpc` | Follower requests adding coordinator. | Sent by a follower coordinator to the leader coordinator when a user executes `ADD COORDINATOR` through the follower. | +| `RemoveCoordinatorRpc` | Follower requests removing coordinator. | Sent by a follower coordinator to the leader coordinator when a user executes `REMOVE COORDINATOR` through the follower. | +| `RegisterInstanceRpc` | Follower requests registering an instance. | Sent by a follower coordinator to the leader coordinator when a user executes `REGISTER INSTANCE` through the follower. | +| `UnregisterInstanceRpc` | Follower requests unregistering an instance. 
| Sent by a follower coordinator to the leader coordinator when a user executes `UNREGISTER INSTANCE` through the follower. | +| `SetInstanceToMainRpc` | Follower requests updating main instance. | Sent by a follower coordinator to the leader coordinator when a user executes `SET INSTANCE TO MAIN` through the follower. | +| `DemoteInstanceRpc` | Follower requests demoting an instance. | Sent by a follower coordinator to the leader coordinator when a user executes `DEMOTE INSTANCE` through the follower. | +| `UpdateConfigRpc` | Follower requests updating config. | Sent by a follower coordinator to the leader coordinator when a user executes `UPDATE CONFIG` through the follower. | +| `ForceResetRpc` | Follower requests resetting cluster state. | Sent by a follower coordinator to the leader coordinator when a user executes `FORCE RESET` through the follower. | +| `GetRoutingTableRpc` | Follower requests a routing table. | Sent by a follower coordinator to the leader coordinator when a user connects using `bolt+routing` through the follower. | +| `CoordReplicationLagRpc` | Follower requests replication lag info. | Sent by a follower coordinator to the leader coordinator when a user executes `SHOW REPLICATION LAG` through the follower. 
| #### Coordinator → Data Instance RPCs diff --git a/pages/clustering/high-availability/setup-ha-cluster-k8s.mdx b/pages/clustering/high-availability/setup-ha-cluster-k8s.mdx index 64c7e186d..4de49f007 100644 --- a/pages/clustering/high-availability/setup-ha-cluster-k8s.mdx +++ b/pages/clustering/high-availability/setup-ha-cluster-k8s.mdx @@ -586,6 +586,41 @@ Memgraph HA uses standard Kubernetes **startup**, **readiness**, and - **Coordinators**: probed on the **NuRaft server** - **Data instances**: probed on the **Bolt server** +## Debugging + +There are several ways to debug Memgraph's HA cluster in production. If you notice an issue, one option is to send us logs from all instances. That's why we +advise setting the log level to `TRACE` if possible. Note, however, that the `TRACE` log level has a performance cost, especially when logging to stderr in addition +to files. If performance is a concern, first try setting `--also-log-to-stderr=false`, since logging only to files is cheaper. If the logging overhead is still too high, +use `--log-level=DEBUG` (a higher log level such as `INFO` or `CRITICAL` also works) and `--also-log-to-stderr=true`. + +If Memgraph is crashing, you can collect core dumps by setting `storage.data.createCoreDumpsClaim` and `storage.coordinators.createCoreDumpsClaim` +to `true`. This triggers the creation of an init container that runs in privileged mode as the root user and sets up everything needed on your nodes to +collect core dumps. You can then create a debug pod and attach the PVC containing the core dumps to it, in order to extract them off the K8s nodes. 
The example +of such a debug pod is the following YAML file: +```yaml +apiVersion: v1 +kind: Pod +metadata: + name: debug-coredump +spec: + containers: + - name: debug + image: ubuntu:22.04 + command: ["sleep", "infinity"] + volumeMounts: + - name: coredumps + mountPath: /var/core/memgraph + volumes: + - name: coredumps + persistentVolumeClaim: + claimName: memgraph-data-0-core-dumps-storage-memgraph-data-0-0 + restartPolicy: Never +``` +There is also a possibility of automatically uploading core dumps to S3. To do that, set `coreDumpUploader.enabled` to `true` and configure the S3 bucket, +AWS region, and credentials secret in the `coreDumpUploader` section. Note that the `createCoreDumpsClaim` flag for the relevant role (data/coordinators) +must also be set to `true`, as the uploader sidecar mounts the same PVC used for core dump storage. Core dumps are uploaded to +`s3://///`. + ## Monitoring @@ -748,7 +783,7 @@ and their default values. | `prometheus.memgraphExporter.pullFrequencySeconds` | How often will Memgraph's Prometheus exporter pull data from Memgraph instances. | `5` | | `prometheus.memgraphExporter.repository` | The repository where Memgraph's Prometheus exporter image is available. | `memgraph/prometheus-exporter` | | `prometheus.memgraphExporter.tag` | The tag of Memgraph's Prometheus exporter image. | `0.2.1` | -| `prometheus.serviceMonitor.enabled` | If enabled, a `ServiceMonitor` object will be deployed. | `true` | +| `prometheus.serviceMonitor.enabled` | If enabled, a `ServiceMonitor` object will be deployed. | `true` | | `prometheus.serviceMonitor.kubePrometheusStackReleaseName` | The release name under which `kube-prometheus-stack` chart is installed. | `kube-prometheus-stack` | | `prometheus.serviceMonitor.interval` | How often will Prometheus pull data from Memgraph's Prometheus exporter. | `15s` | | `labels.coordinators.podLabels` | Enables you to set labels on a pod level. | `{}` | @@ -759,6 +794,17 @@ and their default values. 
| `extraEnv.coordinators` | Env variables that users can define and are applied to coordinators | `[]` | | `initContainers.data` | Init containers that users can define that will be applied to data instances. | `[]` | | `initContainers.coordinators` | Init containers that users can define that will be applied to coordinators. | `[]` | +| `coreDumpUploader.enabled` | Enable the core dump S3 uploader sidecar. Requires `storage..createCoreDumpsClaim` to be `true`. | `false` | +| `coreDumpUploader.image.repository` | Docker image repository for the uploader sidecar | `amazon/aws-cli` | +| `coreDumpUploader.image.tag` | Docker image tag for the uploader sidecar | `2.33.28` | +| `coreDumpUploader.image.pullPolicy` | Image pull policy for the uploader sidecar | `IfNotPresent` | +| `coreDumpUploader.s3BucketName` | S3 bucket name where core dumps will be uploaded | `""` | +| `coreDumpUploader.s3Prefix` | S3 key prefix (folder) for uploaded core dumps | `core-dumps` | +| `coreDumpUploader.awsRegion` | AWS region of the S3 bucket | `us-east-1` | +| `coreDumpUploader.pollIntervalSeconds` | How often (in seconds) the sidecar checks for new core dump files | `30` | +| `coreDumpUploader.secretName` | Name of the K8s Secret containing AWS credentials | `aws-s3-credentials` | +| `coreDumpUploader.accessKeySecretKey` | Key in the K8s Secret for `AWS_ACCESS_KEY_ID` | `AWS_ACCESS_KEY_ID` | +| `coreDumpUploader.secretAccessKeySecretKey` | Key in the K8s Secret for `AWS_SECRET_ACCESS_KEY` | `AWS_SECRET_ACCESS_KEY` | For the `data` and `coordinators` sections, each item in the list has the diff --git a/pages/database-management/_meta.ts b/pages/database-management/_meta.ts index cb0c5a2e0..801f27ca3 100644 --- a/pages/database-management/_meta.ts +++ b/pages/database-management/_meta.ts @@ -9,6 +9,7 @@ export default { "monitoring": "Monitoring", "multi-tenancy": "Multi-tenancy", "query-metadata": "Query metadata", + "server-side-parameters": "Server-side parameters", "server-stats": 
"Server stats", "ssl-encryption": "SSL encryption", "system-configuration": "System configuration" diff --git a/pages/database-management/authentication-and-authorization/query-privileges.mdx b/pages/database-management/authentication-and-authorization/query-privileges.mdx index d24575f88..ba0e6fdc3 100644 --- a/pages/database-management/authentication-and-authorization/query-privileges.mdx +++ b/pages/database-management/authentication-and-authorization/query-privileges.mdx @@ -124,6 +124,9 @@ Memgraph's privilege system controls access to various database operations throu | `DATA DIRECTORY LOCK STATUS` | `DURABILITY` | `DATA DIRECTORY LOCK STATUS` | | `FREE MEMORY` | `FREE_MEMORY` | `FREE MEMORY` | | `SHOW CONFIG` | `CONFIG` | `SHOW CONFIG` | +| `SET [GLOBAL] PARAMETER` | `SERVER_SIDE_PARAMETERS` | `SET GLOBAL PARAMETER x = 'value'` | +| `UNSET [GLOBAL] PARAMETER` | `SERVER_SIDE_PARAMETERS` | `UNSET PARAMETER x` | +| `SHOW PARAMETERS` | `SERVER_SIDE_PARAMETERS` | `SHOW PARAMETERS` | | `CREATE TRIGGER` | `TRIGGER` | `CREATE TRIGGER ...` | | `DROP TRIGGER` | `TRIGGER` | `DROP TRIGGER ...` | | `SHOW TRIGGERS` | `TRIGGER` | `SHOW TRIGGERS` | diff --git a/pages/database-management/configuration.mdx b/pages/database-management/configuration.mdx index 6fe01b68d..a7acf916d 100644 --- a/pages/database-management/configuration.mdx +++ b/pages/database-management/configuration.mdx @@ -319,6 +319,7 @@ fallback to the value of the command-line argument. | timezone | IANA timezone identifier string setting the instance's timezone. | yes | | storage.snapshot.interval | Define periodic snapshot schedule via 6-field cron expression (seconds, minute, hour, day of month, month, day of week—an [Enterprise feature](/database-management/enabling-memgraph-enterprise)) or as a period in seconds. Set to empty string to disable. 
| no | | storage-gc-aggressive | Enables aggressive garbage collection, which performs full cleanup on GC call where deltas, vertices, edges or indices and constraints skip lists are being cleaned up. This setting requires taking a unique lock that will temporarily block the system during garbage collection. | yes | +| storage.access_timeout_sec | Storage access timeout in seconds. Guards against queries waiting indefinitely for storage access. Valid range: [1, 1000000]. Corresponds to `--storage-access-timeout-sec`. | no | | aws.region | AWS region in which your S3 service is located. | yes | | aws.access_key | Access key used to READ the file from S3. | yes | | aws.secret_key | Secret key used to READ the file from S3. | yes | @@ -342,6 +343,10 @@ If you want to change a value for a specific setting, following query should be SET DATABASE SETTING "setting.name" TO "some-value"; ``` +For reusable query values accessed as `$name`, see +[Server-side parameters](/database-management/server-side-parameters). Unlike +database settings, server-side parameters are resolved during query execution. + ### Multitenancy and configuration If you are using a multi-tenant architecture, all isolated databases share @@ -472,7 +477,8 @@ in Memgraph. | `--storage-automatic-edge-type-index-creation-enabled=false` | Enables automatic creation of indices on edge types. Only usable in IN_MEMORY_TRANSACTIONAL mode. | `[bool]` | | `--storage-property-store-compression-enabled=false` | Controls whether the properties should be compressed in the storage. | `[bool]` | | `--storage-property-store-compression-level=mid` | Controls property store compression level. Allowed values: low, mid, high | `[string]` | -| `--storage-access-timeout-sec=1` | Storage access timeout in seconds. Used to fine-tune the responsiveness and guard against queries indefinitely waiting. | `[uint64]` | +| `--storage-floating-point-resolution-bits=64` | Max bits for floating-point property storage. 
Allowed values: 16, 32, 64. Lower values save memory but reduce precision. | `[uint64]` | +| `--storage-access-timeout-sec=1` | Storage access timeout in seconds. Used to fine-tune the responsiveness and guard against queries indefinitely waiting. Can also be changed at runtime via `SET DATABASE SETTING 'storage.access_timeout_sec' TO 'value'`. Valid range: [1, 1000000]. | `[uint64]` | | `--storage-enable-backup-dir=true` | Controls whether `.old` directory will be used to store backup. | `[bool]` | ### Streams @@ -522,6 +528,8 @@ This section contains the list of all other relevant flags used within Memgraph. | `--isolation-level=SNAPSHOT_ISOLATION` | Isolation level used for the transactions. Allowed values: SNAPSHOT_ISOLATION, READ_COMMITTED, READ_UNCOMMITTED. | `[string]` | | `--log-file=/var/log/memgraph/memgraph.log` | Path to where the log should be stored. If set to an empty string (`--log-file=`), no logs will be saved. | `[string]` | | `--log-level=WARNING` | Minimum log level. Allowed values: TRACE, DEBUG, INFO, WARNING, ERROR, CRITICAL. | `[string]` | +| `--logger-type=sync` | Type of logger used by Memgraph. Allowed values: `sync`, `async`. When set to `async`, log messages are buffered and written in a background thread, reducing the performance impact of logging on query execution. | `[string]` | +| `--log-retention-days=35` | Controls for how many days daily log files will be preserved. Allowed values: 1–1000000. | `[uint64]` | | `--memory-limit=0` | Total memory limit in MiB. Set to 0 to use the default values which are 100% of the physical memory if the swap is enabled and 90% of the physical memory otherwise. | `[uint64]` | | `--metrics-address` | Host for HTTP server for exposing metrics. | `[string]` | | `--metrics-port` | Port for HTTP server for exposing metrics. 
| `[uint64]` | diff --git a/pages/database-management/debugging.mdx b/pages/database-management/debugging.mdx index 4bd1b9498..84cff2401 100644 --- a/pages/database-management/debugging.mdx +++ b/pages/database-management/debugging.mdx @@ -470,100 +470,124 @@ kubectl get pvc -l app= # Get the PersistentVolumeClaims for t ### Debugging Memgraph pods -To use `gdb` inside a Kubernetes pod, the container must run in **privileged -mode**. To run any given container in the privileged mode, the k8s cluster -itself needs to have an appropriate configuration. +You can attach GDB to a running Memgraph pod using [ephemeral debug +containers](https://kubernetes.io/docs/tasks/debug/debug-application/debug-running-pod/#ephemeral-container). +This approach injects a debug container into an existing pod — no need to +redeploy or create a separate privileged pod. -Below is an example on how to start the privileged `kind` cluster. +**Requirements:** kubectl 1.32+, Kubernetes 1.25+ (ephemeral containers must be +enabled). -{

Create a privileged kind cluster

} +{

Attach GDB using the debug script

} -First, create new config `debug-cluster.yaml` file with allow-privileged -enabled. +The +[`debug-memgraph.sh`](https://github.com/memgraph/helm-charts/tree/main/scripts/debug-memgraph.sh) +script automates the entire workflow: it creates an ephemeral container with +root privileges and `SYS_PTRACE`, installs GDB, finds the Memgraph process, and +attaches to it. -```yaml -kind: Cluster -apiVersion: kind.x-k8s.io/v1alpha4 -nodes: - - role: control-plane - image: kindest/node:v1.31.0 - extraPortMappings: - - containerPort: 80 - hostPort: 8080 - protocol: TCP - kubeadmConfigPatches: - - | - kind: ClusterConfiguration - kubeletConfiguration: - extraArgs: - allow-privileged: "true" -# To inspect the cluster run `kubectl get pods -n kube-system`. -# If some of the pods is in the CrashLoopBackOff status, try running `kubectl -# logs -n kube-system` to get the error message. +```bash +./scripts/debug-memgraph.sh memgraph-data-0-0 ``` -To start the cluster, execute the following command: -``` -kind create cluster --name --config debug-cluster.yaml +The script auto-detects the target container name from the pod name (`data` → +`memgraph-data`, `coordinator` → `memgraph-coordinator`). You can override this +and other options: + +```bash +# Specify container, namespace, or image explicitly +./scripts/debug-memgraph.sh memgraph-data-0-0 -c memgraph-data -n my-namespace + +# Use a custom debug image +DEBUG_IMAGE=ubuntu:24.04 ./scripts/debug-memgraph.sh memgraph-data-0-0 ``` -{

Deploy a debug pod

} +Once attached, GDB will continue the process and stop on any crash or signal. +Use `bt` (backtrace) to inspect the call stack when it stops. + +{

Manual approach (alternative)

} + +If you can't use ephemeral containers (older Kubernetes versions, or cluster +policies that block `kubectl debug`), you can deploy a privileged debug pod on +the same node and attach GDB from there. + +First, identify which node your target pod is running on: + +```bash +kubectl get pods -o wide +``` -Once cluster is up and running, create a new `debug-pod.yaml` file with the -following content: +Edit +[`perf_pod.yaml`](https://github.com/memgraph/helm-charts/tree/main/scripts/perf_pod.yaml) +and set `nodeName` to match the target pod's node: ```yaml apiVersion: v1 kind: Pod metadata: - name: debug-pod + name: debug spec: containers: - - name: my-container - image: memgraph/memgraph:3.2.0-relwithdebinfo # Use the latest, but make sure it's the relwithdebinfo one! + - args: + - "3600" + command: + - sleep + image: ubuntu:22.04 + name: debug + imagePullPolicy: Always securityContext: - runAsUser: 0 # Runs the container as root. privileged: true - capabilities: - add: ["SYS_PTRACE"] - allowPrivilegeEscalation: true - command: ["sleep"] - args: ["infinity"] - stdin: true - tty: true + hostPID: true + nodeName: # must match target pod's node + restartPolicy: Never ``` -To get the pod up and running and open a shell inside it run: +```bash +kubectl apply -f scripts/perf_pod.yaml ``` -kubectl apply -f debug-pod.yaml -kubectl exec -it debug-pod -- bash + +The `hostPID: true` setting gives the debug pod visibility into all processes on +the node. 
Since multiple Memgraph processes may be running on the same node, use +[`find-memgraph-pid.sh`](https://github.com/memgraph/helm-charts/tree/main/scripts/find-memgraph-pid.sh) +to find the correct PID by matching the pod UID against `/proc//cgroup`: + +```bash +./scripts/find-memgraph-pid.sh memgraph-data-0-0 ``` -Once you are in the pod execute: +Then exec into the debug pod, install GDB, and attach to the Memgraph process: + +```bash +kubectl exec -it debug -- bash +apt-get update && apt-get install -y gdb procps +gdb -p ``` -apt-get update && apt-get install -y gdb -su memgraph -gdb --args ./memgraph -run + +Once GDB stops on a crash or signal, use the [GDB +commands](/database-management/debugging#list-of-useful-commands-when-in-gdb) to +investigate. Clean up when done: + +```bash +kubectl delete pod debug ``` -Once you have memgraph up and running under `gdb`, run your workload (insert -data, write or queries…). When you manage to recreate the issue, use the [gdb -commands](/database-management/debugging#list-of-useful-commands-when-in-gdb) -to pin point the exact issue. +{

How the debug script works

} -{

Delete the debug pod

} +Memgraph Helm charts run pods as non-root (uid 101, gid 103) with a restrictive +security context. The `debug-memgraph.sh` script works around this by: + +1. Using `kubectl debug` with `--profile=sysadmin` to grant `SYS_PTRACE` +2. Applying a custom security profile that overrides `runAsUser` to 0 (root) for + the ephemeral container only — the Memgraph container is unaffected +3. Targeting the Memgraph container with `--target` to share its process + namespace, making the Memgraph PID visible inside the debug container -To delete the debug pod run: -``` -kubectl delete pod debug-pod -```
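For readers who want to reproduce the script's behavior by hand, the steps above can be approximated roughly as follows. This is an illustrative sketch only: the exact flags and the profile file contents are assumptions, not the script's verbatim invocation, and `--custom` requires a recent kubectl:

```
# Illustrative only: approximate what debug-memgraph.sh automates.
# The custom profile overrides runAsUser for the ephemeral container only.
cat > /tmp/debug-profile.json <<'EOF'
{"securityContext": {"runAsUser": 0, "runAsGroup": 0}}
EOF

kubectl debug memgraph-data-0-0 -it \
  --image=ubuntu:22.04 \
  --target=memgraph-data \
  --profile=sysadmin \
  --custom=/tmp/debug-profile.json \
  -- bash
# Inside the ephemeral container, install GDB and attach to the target PID:
#   apt-get update && apt-get install -y gdb procps
#   gdb -p <memgraph pid>
```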
-k8s official documentation on how to [debug running +Kubernetes official documentation on how to [debug running pods](https://kubernetes.io/docs/tasks/debug/debug-application/debug-running-pod/) -is quite detailed. +covers additional techniques including node-level debugging. ### Handling core dumps @@ -578,6 +602,8 @@ To enable core dumps, create a `values.yaml` file with at least the following se createCoreDumpsClaim: true ``` +If you're running the Memgraph high availability chart, you can automatically upload [core dumps to S3](/clustering/high-availability/setup-ha-cluster-k8s). + Setting this value to true will also enable the use of GDB inside Memgraph containers when using our provided [charts](https://github.com/memgraph/helm-charts). @@ -612,6 +638,132 @@ If you have k8s cluster under any major cloud provider + you want to store the dumps under S3, probably the best repo to check out is the [core-dump-handler](https://github.com/IBM/core-dump-handler). +## Profiling Memgraph in Kubernetes + +Profile a Memgraph process running inside a Kubernetes pod using `perf` and generate flame graphs. + +### Prerequisites + +- `kubectl` configured with access to your cluster +- A running Memgraph deployment (standalone or HA) + +### Step 1: Identify the target pod + +```bash +kubectl get pods -o wide +``` + +| NAME | READY | STATUS | RESTARTS | AGE | IP | NODE | +|---|---|---|---|---|---|---| +| memgraph-coordinator-1-0 | 1/1 | Running | 0 | 23h | 10.244.3.227 | aks-nodepool1-...000002 | +| memgraph-coordinator-2-0 | 1/1 | Running | 0 | 23h | 10.244.0.173 | aks-nodepool1-...000000 | +| memgraph-coordinator-3-0 | 1/1 | Running | 0 | 23h | 10.244.4.250 | aks-nodepool1-...000003 | +| memgraph-data-0-0 | 1/1 | Running | 1 (22h ago) | 23h | 10.244.2.152 | aks-nodepool1-...000001 | +| memgraph-data-1-0 | 1/1 | Running | 0 | 22m | 10.244.1.199 | aks-nodepool1-...000004 | + +In this example, we want to profile `memgraph-data-1-0`, which is currently the MAIN instance. 
Note the **NODE** it is running on — the debug pod must be scheduled on the same node. + +### Step 2: Deploy the debug pod + +Edit `perf_pod.yaml` and set `nodeName` to match the target pod's node: + +```yaml +apiVersion: v1 +kind: Pod +metadata: + name: debug +spec: + containers: + - args: + - "3600" + command: + - sleep + image: ubuntu:22.04 + name: debug + imagePullPolicy: Always + securityContext: + privileged: true + hostPID: true + nodeName: aks-nodepool1-38123842-vmss000004 # <-- must match target pod's node + restartPolicy: Never +``` + +```bash +kubectl apply -f scripts/perf_pod.yaml +``` + +The debug pod needs `privileged: true` and `hostPID: true` so it can see host processes and access `/proc//cgroup` to match processes to pods. + +### Step 3: Find the Memgraph PID + +Since multiple Memgraph processes may be visible from the host PID namespace (due to Kubernetes multi-tenancy), we need to match the correct one to our target pod. The [`find-memgraph-pid.sh`](https://github.com/memgraph/helm-charts/tree/main/scripts/find-memgraph-pid.sh) script does this automatically — it resolves the pod's UID, lists all `memgraph` processes inside the debug pod, and matches via `/proc//cgroup`: + +```bash +./scripts/find-memgraph-pid.sh memgraph-data-1-0 +``` + +Output: +``` +Pod: memgraph-data-1-0 +UID: c8707c88-631b-467c-af9f-26e9dac8e780 +UID fragment: 26e9dac8e780 +Debug pod: debug + +Found memgraph PIDs: 1335771 1396816 +cgroup match: /proc/1396816/cgroup:0::/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod...26e9dac8e780... + +Matched memgraph PID: 1396816 +``` + +Use `-d` to specify a different debug pod name, or `-n` for a non-default namespace: +```bash +./scripts/find-memgraph-pid.sh memgraph-data-1-0 -d my-debug-pod -n memgraph 
# Default is debug for pod name and default for namespace +``` + +### Step 4: Install perf in the debug pod + +```bash +kubectl exec -it debug -- bash +``` + +Inside the debug pod: +```bash +apt-get update && apt-get install -y linux-tools-common linux-tools-generic +``` + +> **Note (AKS / cloud kernels):** `apt-get install linux-tools-$(uname -r)` will fail if the host kernel is a cloud-specific variant (e.g., `5.15.0-1102-azure`) because the matching package isn't in standard Ubuntu repos. Use `linux-tools-generic` instead — the generic `perf` binary works in most cases. If it complains about a version mismatch, invoke it directly: +> ```bash +> /usr/lib/linux-tools/*/perf record ... +> ``` + +### Step 5: Record a perf profile + +```bash +perf record -p --call-graph dwarf sleep 30 +``` + +Replace `` with the PID from Step 3. Adjust the duration (`sleep 30`) as needed — run your workload during this window. + +### Step 6: Generate a flame graph + +```bash +apt-get install -y git +git clone https://github.com/brendangregg/FlameGraph +perf script | ./FlameGraph/stackcollapse-perf.pl > out.perf-folded +./FlameGraph/flamegraph.pl out.perf-folded > perf.svg +``` + +### Step 7: Copy results and clean up + +From your local machine: +```bash +kubectl cp debug:perf.svg perf.svg +kubectl cp debug:perf.data perf.data # optional: raw perf data for later analysis +kubectl delete pod debug +``` + +Open `perf.svg` in a browser to explore the interactive flame graph. 
+ ### Specific cloud provider instructions * [AWS](https://github.com/memgraph/helm-charts/tree/main/charts/memgraph-high-availability/aws) diff --git a/pages/database-management/enabling-memgraph-enterprise.mdx b/pages/database-management/enabling-memgraph-enterprise.mdx index 6de5c5f63..3aecb48f0 100644 --- a/pages/database-management/enabling-memgraph-enterprise.mdx +++ b/pages/database-management/enabling-memgraph-enterprise.mdx @@ -24,8 +24,31 @@ Whether you bought Memgraph Enterprise or requested a trial, you will receive a file with the values you need to set the `organization.name` and the `enterprise.license` configuration values to. +## License sources and selection + +Memgraph accepts a license key from three sources and automatically selects the +best one on startup or whenever the license settings change: + +| Source | How to provide | Priority | +|--------|---------------|---------| +| **CLI flags** | `--license-key` and `--organization-name` at startup | Highest (3) | +| **Environment variables** | `MEMGRAPH_ENTERPRISE_LICENSE` and `MEMGRAPH_ORGANIZATION_NAME` | Medium (2) | +| **Database settings** | `SET DATABASE SETTING` queries (persisted to disk) | Lowest (1) | + +**Winner selection:** Among all sources that provide a valid, non-expired key +with a matching organization name, Memgraph picks the license with the furthest +expiry date. A license with no expiry date (`valid_until = 0`, "forever") always +beats any finite expiry. If two candidates expire at the same time, the +higher-priority source wins. + +**Persistence:** The winning license is automatically written back to persistent +storage so it remains active across restarts even if the CLI flags or environment +variables are no longer passed. 
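The winner-selection rule above is essentially a comparator over (priority, expiry) pairs. The following sketch is an illustrative model of the documented behavior, not Memgraph's actual implementation; the tuple encoding is an assumption:

```python
def pick_license(candidates):
    """Pick the winning license among valid, org-matching candidates.

    Each candidate is (priority, valid_until), where priority is
    3 = CLI flags, 2 = environment variables, 1 = database settings,
    and valid_until is a timestamp (0 means "forever").
    """
    # A license that never expires beats any finite expiry; among finite
    # expiries, the furthest wins; ties go to the higher-priority source.
    return max(candidates, key=lambda c: (c[1] == 0, c[1], c[0]))

# A CLI key expiring sooner loses to an env key expiring later.
assert pick_license([(3, 1000), (2, 2000)]) == (2, 2000)
# A "forever" key (valid_until = 0) beats any finite expiry.
assert pick_license([(1, 0), (3, 10**12)]) == (1, 0)
# Same expiry: the higher-priority source wins.
assert pick_license([(1, 500), (3, 500)]) == (3, 500)
```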
+ +## Providing the license + If you want to enable the Enterprise Edition on startup, [set the configuration -flags](/configuration/configuration-settings#changing-configuration) or +flags](/configuration/configuration-settings#changing-configuration) or [environment variables](/database-management/configuration#environment-variables) to the correct values. @@ -35,11 +58,18 @@ can also be adjusted [during runtime](/configuration/configuration-settings#during-runtime), or you can run the following queries to set the values: -``` +```cypher SET DATABASE SETTING 'organization.name' TO 'Organization'; SET DATABASE SETTING 'enterprise.license' TO 'License'; ``` +Setting a license key via `SET DATABASE SETTING` is validated immediately: +- A **malformed or undecodable** key is rejected with an error — the write does not go through. +- An **already-expired** key is rejected with an error — the write does not go through. +- An **organization name mismatch** is not checked at write time; it is caught + during the next revalidation and Memgraph will log a warning and fall back to + community mode if no other valid source is available. + To check the set values run: ```opencypher @@ -69,7 +99,10 @@ That means it is possible to analyze the existing data but new data can no longer be added until you upgrade or free storage by deleting some of the data. Upon upgrading the license by entering a new license key the `write` queries -will be enabled. +will be enabled. When multiple valid license keys are present (for example a CLI +key and a key set via `SET DATABASE SETTING`), Memgraph automatically picks the +one with the furthest expiry, so providing a longer-lived key from any source +takes effect immediately. To check the used storage, run `SHOW STORAGE INFO;`. @@ -80,6 +113,13 @@ data stored in the database will remain intact. You will still be able to add more data, but any enterprise features that require specific actions will no longer function. 
For example, you will not be able to create any new databases. +## Switching between Community and Enterprise editions + +Enterprise user and role details are persisted in the database across editions, +including license expiry. However, modifying users while running the Community +build will result in the loss of enterprise-specific attributes such as multiple +role assignments or impersonation privileges. + ## Security features ### Role-based access control diff --git a/pages/database-management/logs.mdx b/pages/database-management/logs.mdx index 1a1efed3b..c49887bee 100644 --- a/pages/database-management/logs.mdx +++ b/pages/database-management/logs.mdx @@ -11,7 +11,6 @@ The default location of logs is inside the `/var/log/memgraph/` directory. That location can be configured using the `--log-file` [configuration flag](/database-management/configuration#other). - Memgraph tracks logs at various levels: TRACE, DEBUG, INFO, WARNING, ERROR, CRITICAL. By default, it is using the WARNING level, but you can change the level using the [`--log-level`](/database-management/configuration#other) @@ -19,6 +18,17 @@ configuration flag or [during runtime](/database-management/configuration#change-configuration-settings-during-runtime) using the `SET DATABASE SETTING "log.level" TO "TRACE";` +By default, Memgraph uses synchronous logging. To reduce the performance impact +of logging on query execution, you can enable asynchronous logging by setting the +[`--logger-type`](/database-management/configuration#other) configuration flag +to `async`. With asynchronous logging, log messages are buffered and written in a +background thread instead of blocking the executing thread. + +Memgraph creates daily log files and retains them for 35 days by default. You +can change the retention period using the +[`--log-retention-days`](/database-management/configuration#other) configuration +flag. + The configuration set during runtime will be applied only for that session. 
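Putting the two logging flags above together, a startup configuration that opts in to asynchronous logging and keeps one week of log files might look like this (the values are illustrative):

```
--logger-type=async
--log-retention-days=7
```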
You can check the log level by running `SHOW DATABASE SETTING "log.level";` query. diff --git a/pages/database-management/server-side-parameters.mdx b/pages/database-management/server-side-parameters.mdx new file mode 100644 index 000000000..361f683e2 --- /dev/null +++ b/pages/database-management/server-side-parameters.mdx @@ -0,0 +1,142 @@ +--- +title: Server-side parameters +description: Configure and reuse server-side parameters in Memgraph with database or global scope. +--- + +# Server-side parameters + +Server-side parameters are named values stored by Memgraph and reusable in +queries through `$parameterName`, just like user-provided parameters. + +They are useful when you want reusable defaults, while still allowing clients to override values when needed. + +## Parameter scopes + +Server-side parameters support two scopes: + +- **Database scope**: set with `SET PARAMETER ...`; visible only in the current + database. +- **Global scope**: set with `SET GLOBAL PARAMETER ...`; visible across + databases. + +When you run `SHOW PARAMETERS`, Memgraph returns: +- all global parameters +- parameters from the current database + +## Resolution order + +When a query uses `$name`, values are resolved in this order: + +1. user-provided query parameter (from the client) +2. database-scoped server-side parameter +3. global server-side parameter + +This means: +- user-provided parameters are preferred over server-side parameters +- database-scoped server-side parameters are preferred over global ones + +## Set a parameter + +Use `SET PARAMETER` for database scope, or `SET GLOBAL PARAMETER` for global +scope. 
+ +```opencypher +SET PARAMETER x = 'db_value'; +SET GLOBAL PARAMETER x = 'global_value'; +``` + +You can set the value from: +- a literal +- a user-provided parameter (for example `$config`) +- a map value + +```opencypher +SET GLOBAL PARAMETER app_config = $config; +SET GLOBAL PARAMETER limits = {timeout: 120, mode: 'safe'}; +SET GLOBAL PARAMETER ids = [10, 20, 30]; +``` + +## Unset a parameter + +Remove a parameter by name in either database or global scope. + +```opencypher +UNSET PARAMETER x; +UNSET GLOBAL PARAMETER x; +``` + +## Show parameters + +List current server-side parameters: + +```opencypher +SHOW PARAMETERS; +``` + +Output columns: +- `name`: parameter name +- `value`: stored value (JSON-encoded string form) +- `scope`: `database` or `global` + +## Privileges + +Server-side parameter queries require the `SERVER_SIDE_PARAMETERS` privilege. +See the [Query privileges reference](/database-management/authentication-and-authorization/query-privileges). + +## Use cases + +### Use in queries with `$` + +Once set, server-side parameters are available as `$name` in queries: + +```opencypher +SET GLOBAL PARAMETER tenant = 'acme'; +CREATE (:Account {tenant: $tenant}); +``` + +If the client sends a user parameter with the same name, that user value is +used first: + +```opencypher +SET GLOBAL PARAMETER x = 'server_value'; +CREATE (:Node {property: $x}); +``` + +If the query is run with user parameter `x = 'client_value'`, the node property +will be `'client_value'`. + +### Reusing connection config in data migrations + +The [`migrate` module](/advanced-algorithms/available-algorithms/migrate) procedures accept a `config` map with +connection parameters like `host`, `port`, `user`, and `password`. When running multiple migration queries +against the same source, repeating this map in every call is verbose and error-prone. 
+ +Store the config once as a server-side parameter and reference it with `$config` across all your migration queries: + +```opencypher +SET GLOBAL PARAMETER mysql_config = { + user: 'memgraph', + password: 'password', + host: 'localhost', + database: 'demo_db' +}; +``` + +Then use `$mysql_config` in every migration call instead of repeating the full map: + +```opencypher +CALL migrate.mysql('users', $mysql_config) +YIELD row +CREATE (u:User {id: row.id, name: row.name}); + +CALL migrate.mysql('orders', $mysql_config) +YIELD row +CREATE (o:Order {id: row.id, total: row.total}); + +CALL migrate.mysql('SELECT user_id, order_id FROM user_orders', $mysql_config) +YIELD row +MATCH (u:User {id: row.user_id}), (o:Order {id: row.order_id}) +CREATE (u)-[:PLACED]->(o); +``` + +This works for all `migrate` procedures — `migrate.postgresql()`, `migrate.mysql()`, `migrate.neo4j()`, `migrate.sql_server()`, and others. diff --git a/pages/database-management/server-stats.md b/pages/database-management/server-stats.md index d97c306c6..cdb4e4bf3 100644 --- a/pages/database-management/server-stats.md +++ b/pages/database-management/server-stats.md @@ -40,6 +40,8 @@ The result will contain the following fields: | unreleased_delta_objects | The current number of still allocated objects with the information about the changes that write transactions have made, called Delta objects. Refer to allocation and deallocation of Delta objects [on this page](/fundamentals/storage-memory-usage#in-memory-transactional-storage-mode-default). | | disk_usage | The amount of disk space used by the data directory (in B, KiB, MiB, GiB or TiB). | | memory_tracked | The amount of RAM allocated in the system and tracked by Memgraph (in B, KiB, MiB, GiB or TiB).
For more info, check out [memory control](/fundamentals/storage-memory-usage). | +| graph_memory_tracked | The portion of `memory_tracked` used by graph structures (vertices, edges, properties). | +| vector_index_memory_tracked | The portion of `memory_tracked` used by vector index embeddings. | | allocation_limit | The current allocation limit set for this instance (in B, KiB, MiB, GiB or TiB).
For more info, check out the [memory control](/fundamentals/storage-memory-usage#control-memory-usage). | | global_isolation_level | The current `global` isolation level.
For more info, check out [isolation levels](/fundamentals/transactions#isolation-levels). | | session_isolation_level | The current `session` isolation level. | @@ -49,22 +51,25 @@ The result will contain the following fields: ## License information -Running the following query will return certain information about the Memgraph -Enterprise License that was injected into the system. +Running the following query will return information about the active Memgraph +Enterprise license (the winning license selected from all configured sources). ```cypher SHOW LICENSE INFO; ``` -| Field | Description | -|-------------------|-----------------------------------------------------| -| organization_name | Organization name for the enterprise license. | -| license_key | Encoded license key. | -| is_valid | Brief flag whether the license is currently valid. | -| license_type | Enterprise / OEM | -| valid_until | Date when the license expires. | -| memory_limit | Memory limit (in GiB). | -| status | Descriptive status of the license validity. | +| Field | Description | +|-------------------|------------------------------------------------------------------------------------------------------| +| organization_name | Organization name for the enterprise license. | +| license_key | Encoded license key. | +| is_valid | Whether the license is currently valid. Uses the same validation logic as enterprise feature checks. | +| license_type | Enterprise / OEM | +| valid_until | Date when the license expires, or `FOREVER` for non-expiring licenses. | +| memory_limit | Memory limit (in GiB). | +| status | Descriptive status of the license validity. | + +If no license has been provided, `is_valid` is `false` and `status` reads +`"You have not provided any license!"`. 
diff --git a/pages/fundamentals/indexes.mdx b/pages/fundamentals/indexes.mdx index 5802e394f..72079059b 100644 --- a/pages/fundamentals/indexes.mdx +++ b/pages/fundamentals/indexes.mdx @@ -923,8 +923,13 @@ Once registered: - Existing data is indexed in the background If you encounter write transaction failures during registration, you can -increase the wait time using the `--storage-access-timeout-sec` flag, which -extends how long write queries will retry before failing. +increase the wait time using the `--storage-access-timeout-sec` flag or at runtime via: + +```cypher +SET DATABASE SETTING 'storage.access_timeout_sec' TO 'value'; +``` + +This extends how long write queries will retry before failing. The system maintains **full MVCC consistency** throughout the process, ensuring transactional integrity. Long-running index operations can be safely cancelled diff --git a/pages/fundamentals/storage-memory-usage.mdx b/pages/fundamentals/storage-memory-usage.mdx index da11d7d97..a46a18d05 100644 --- a/pages/fundamentals/storage-memory-usage.mdx +++ b/pages/fundamentals/storage-memory-usage.mdx @@ -445,7 +445,7 @@ accurately. 
If you want to **estimate** memory usage in `IN_MEMORY_TRANSACTIONAL` storage
mode, use the following formula:

-$\texttt{StorageRAMUsage} = \texttt{NumberOfVertices} \times 212\text{B} + \texttt{NumberOfEdges} \times 162\text{B}$
+$\texttt{StorageRAMUsage} = \texttt{NumberOfVertices} \times 204\text{B} + \texttt{NumberOfEdges} \times 154\text{B}$



@@ -463,7 +463,7 @@ According to the formula, storage memory usage should be:

$
\begin{aligned}
-\texttt{StorageRAMUsage} &= 21,723 \times 260\text{B} + 682,943 \times 180\text{B} \\ &= 5,647,980\text{B} + 122,929,740\text{B}\\ &= 128,577,720\text{B} \approx 125\text{MB}
+\texttt{StorageRAMUsage} &= 21,723 \times 252\text{B} + 682,943 \times 172\text{B} \\ &= 5,474,196\text{B} + 117,466,196\text{B}\\ &= 122,940,392\text{B} \approx 117\text{MB}
\end{aligned}
$

@@ -504,8 +504,8 @@ Each `Delta` object has at least **56B**.

{

Vertex memory layout

} -Each `Vertex` object has at least **88B** + **56B** for the `Delta` object, in -total, a minimum of **144B**. +Each `Vertex` object has at least **80B** + **56B** for the `Delta` object, in +total, a minimum of **136B**. Each additional label takes **4B**. @@ -515,8 +515,8 @@ memory allocation. {

Edge memory layout

} -Each `Edge` object has at least **40B** + **56B** for the `Delta` object, in -total, a minimum of **96B**. +Each `Edge` object has at least **32B** + **56B** for the `Delta` object, in +total, a minimum of **88B**. {

SkipList memory layout

} @@ -565,7 +565,7 @@ $\texttt{propertySize} = \texttt{basicMetadata} + \texttt{propertyID} + [\texttt |`NULL` | 0B | 0B | Memgraph treats null values same as if they're not present. Therefore, `NULL` values are not stored in the property store. | |`BOOL` | 1B + 1B | 2B | The value is written in the first byte of the basic metadata. | |`INT` | 1B + 1B + 1B, 2B, 4B or 8B | 3B - 10B | Basic metadata, property ID and the value depending on the size of the integer. | -|`DOUBLE` | 1B + 1B + 8B | 10B | Basic metadata, property ID and the value | +|`DOUBLE` | 1B + 1B + 2B, 4B or 8B | 4B, 6B or 10B | Basic metadata, property ID and the value. Value size depends on `--storage-floating-point-resolution-bits`: 16-bit (half), 32-bit (float), 64-bit (double). | |`STRING` | 1B + 1B + 1B + min 1B | at least 4B | Basic metadata, property ID, additional metadata and lastly the value depending on the size of the string, where 1 ASCII character in the string takes up 1B. | |`LIST` | 1B + 1B + 1B + min 1B | at least 4B (empty) | Basic metadata, property ID, additional metadata and the total size depends on the number and size of the values in the list. | |`MAP` | 1B + 1B + 1B + min 1B | at least 4B (empty) | Basic metadata, property ID, additional metadata and the total size depends on the number and size of the values in the map. | @@ -607,10 +607,10 @@ and properties occupy, we are going to use the following formula: $\texttt{NumberOfVertices} \times (\texttt{Vertex} + \texttt{properties} + \texttt{SkipListNode} + \texttt{next\_pointers} + \texttt{Delta}).$ Let's assume the name on average has $3\text{B}+10\text{B} = 13\text{B}$ (each -name is on average 10 characters long). One the average values are included, the -calculation is: +name is on average 10 characters long). 
Once the average values are included, +the calculation is: -$19,148 \times (88\text{B} + 13\text{B} + 16\text{B} + 16\text{B} + 56\text{B}) = 19,148 \times 189\text{B} = 3,618,972\text{B}.$ +$19,148 \times (80\text{B} + 13\text{B} + 16\text{B} + 16\text{B} + 56\text{B}) = 19,148 \times 181\text{B} = 3,465,788\text{B}.$ The remaining 2,584 vertices are the `ComicSeries` vertices with the `title` and `publishYear` properties. Let's assume that the `title` property is @@ -622,9 +622,9 @@ list occupies $3 \times 2\text{B} \times 2\text{B} = 12\text{B}$. Using the same formula as above, but being careful to include both `title` and `publishYear` properties, the calculation is: -$2584 \times (88\text{B} + 13\text{B} + 12\text{B} + 16\text{B} + 16\text{B} + 56\text{B}) = 2584 \times 201\text{B} = 519,384\text{B}.$ +$2584 \times (80\text{B} + 13\text{B} + 12\text{B} + 16\text{B} + 16\text{B} + 56\text{B}) = 2584 \times 193\text{B} = 498,712\text{B}.$ -In total, $4,138,356\text{B}$ to store vertices. +In total, $3,964,500\text{B}$ to store vertices. The edges don't have any properties on them, so the formula is as follows: @@ -632,7 +632,7 @@ $\texttt{NumberOfEdges} \times (\texttt{Edge} + \texttt{SkipListNode} + \texttt{ There are 682,943 edges in the Marvel dataset. Hence, we have: -$682,943 \times (40\text{B}+16\text{B}+16\text{B}+56\text{B}) = 682,943 \times 128\text{B} = 87,416,704\text{B}.$ +$682,943 \times (32\text{B}+16\text{B}+16\text{B}+56\text{B}) = 682,943 \times 120\text{B} = 81,953,160\text{B}.$ Next, `Hero`, `Comic` and `ComicSeries` labels have label indexes. 
To calculate how much space they take up, use the following formula: @@ -771,6 +771,8 @@ SHOW STORAGE INFO; | "unreleased_delta_objects" | 0 | | "disk_usage" | "104.46KiB" | | "memory_tracked" | "8.52MiB" | +| "graph_memory_tracked" | "6.14MiB" | +| "vector_index_memory_tracked" | "2.38MiB" | | "allocation_limit" | "58.55GiB" | | "global_isolation_level" | "SNAPSHOT_ISOLATION" | | "session_isolation_level" | "" | @@ -816,6 +818,44 @@ Keep in mind that even though it can reduce memory usage, compression can impact
+
+#### Floating-point resolution
+
+For workloads with many `DOUBLE` properties, you can reduce their storage size by lowering the floating-point precision via the `--storage-floating-point-resolution-bits` flag. The allowed values are `64` (default), `32`, and `16`, corresponding to double, float, and half precision respectively.
+
+To enable 32-bit float precision, start Memgraph with:
+
+```
+--storage-floating-point-resolution-bits=32
+```
+
+With this flag set, a value stored as `3.14159265358979` is read back as approximately `3.1415927` (float precision). Example:
+
+```cypher
+CREATE (:Node {value: 3.14159265358979});
+MATCH (n:Node) RETURN n.value;
+// Returns: 3.1415927 (with --storage-floating-point-resolution-bits=32)
+// Returns: 3.14159265358979 (with --storage-floating-point-resolution-bits=64)
+```
+
+| Value | Precision | Bytes per value | Savings vs default |
+|-------|-----------|-----------------|--------------------|
+| `64` | double (full) | 8B | — |
+| `32` | float | 4B | 4B per double |
+| `16` | half | 2B | 6B per double |
+
+
+
+WAL files and snapshots always serialize `DOUBLE` values as full 64-bit IEEE 754, so the durability format itself is resolution-independent.
+
+However, precision is baked in at write time: WAL and snapshot entries store the value that `GetProperty()` returns, which is already truncated to the active resolution. This has two consequences when the flag changes across a restart:
+
+- **Lowering the resolution** (e.g. 64 → 32): doubles recovered from WAL/snapshot are re-encoded into PropertyStore at the new lower precision, causing additional truncation.
+- **Raising the resolution** (e.g. 32 → 64): no further truncation occurs, but precision already lost in previous sessions cannot be recovered from WAL/snapshot alone.
+
+If exact round-trip fidelity is required, set the resolution before loading data and do not change it afterwards.
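The truncation behavior described above can be reproduced outside Memgraph by round-tripping a double through the smaller IEEE 754 encodings (an illustrative sketch of the numeric effect only; Memgraph's internal property encoding may differ):

```python
import struct

def truncate(value: float, bits: int) -> float:
    # Round-trip a double through a 32-bit ("f") or 16-bit ("e") IEEE 754
    # encoding, mimicking the effect of a lower storage resolution.
    if bits == 64:
        return value
    fmt = {32: "f", 16: "e"}[bits]
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

pi = 3.14159265358979
print(truncate(pi, 32))  # ≈ 3.1415927 (float precision)
print(truncate(pi, 16))  # ≈ 3.140625 (half precision)
```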
+ + + ### Deallocating memory Memgraph has a garbage collector that deallocates unused objects, thus freeing diff --git a/pages/fundamentals/transactions.mdx b/pages/fundamentals/transactions.mdx index 330ed3064..4d3a959d4 100644 --- a/pages/fundamentals/transactions.mdx +++ b/pages/fundamentals/transactions.mdx @@ -83,25 +83,96 @@ constraints upon the execution of the final query in the transaction. Memgraph can return information about running transactions and allow you to terminate them. -### Show running transactions +### Show transactions -To get information about running transaction execute the following query: +To get information about all active transactions execute: ```cypher SHOW TRANSACTIONS; ``` + +Each row in the result represents one transaction (or one in-progress snapshot +creation) and contains five columns: + +| Column | Type | Description | +|---|---|---| +| `username` | `String` | The user who started the transaction, or `""` if authentication is disabled. | +| `transaction_id` | `String` | Unique numeric identifier of the transaction. Use this value with `TERMINATE TRANSACTIONS`. | +| `query` | `List[String]` | Queries executed within the transaction so far. | +| `status` | `String` | Lifecycle phase of the transaction: `running`, `committing`, or `aborting`. Snapshot rows always show `running`. | +| `metadata` | `Map` | Metadata supplied by the client when the transaction was opened. For in-progress snapshots this contains progress details (see below). 
| + ```copy=false +memgraph> SHOW TRANSACTIONS; ++----------+------------------------+-----------------------------------------------+--------------+----------+ +| username | transaction_id | query | status | metadata | ++----------+------------------------+-----------------------------------------------+--------------+----------+ +| "" | "9223372036854794885" | ["UNWIND range(1,100) AS i CREATE(:L{p:i});"] | "committing" | {} | +| "" | "9223372036854794896" | ["SHOW TRANSACTIONS"] | "running" | {} | ++----------+------------------------+-----------------------------------------------+--------------+----------+ +``` + +#### Filter by status + +You can limit the output to transactions in a specific lifecycle phase by +naming one or more statuses before the `TRANSACTIONS` keyword: + +```cypher +SHOW RUNNING TRANSACTIONS; +SHOW COMMITTING TRANSACTIONS; +SHOW ABORTING TRANSACTIONS; +SHOW RUNNING, COMMITTING TRANSACTIONS; +``` + +When multiple statuses are listed (comma-separated) the result is their union — +rows matching any of the requested statuses are returned. +Omitting the status list is equivalent to requesting all three statuses. +#### Snapshot progress rows + +While a snapshot is being created (triggered periodically, on exit, or +manually with `CREATE SNAPSHOT`) a synthetic row is included in the result +with `transaction_id` set to `"snapshot"`. The `metadata` map for these +rows contains: + +| Key | Description | +|---|---| +| `phase` | Current phase of snapshot creation: `EDGES`, `VERTICES`, `INDICES`, `CONSTRAINTS`, or `FINALIZING`. | +| `items_done` | Number of objects serialized in the current phase so far. | +| `items_total` | Total number of objects expected in the current phase. | +| `elapsed_ms` | Milliseconds elapsed since the snapshot started. | +| `db_name` | Name of the database whose snapshot is being created. 
| + +```copy=false memgraph> SHOW TRANSACTIONS; -+---------------+-----------------------------+-------------------------------------------+----------------+ -| username | transaction_id | query | metadata | -+---------------+-----------------------------+-------------------------------------------+----------------+ -| "" | "9223372036854794885" | ["CALL infinite.get() YIELD * RETURN *;"] | {} | -| "" | "9223372036854794896" | ["SHOW TRANSACTIONS"] | {} | -+---------------+-----------------------------+-------------------------------------------+----------------+ ++----------+----------------+-----------------------------+-----------+------------------------------------------------------------------+ +| username | transaction_id | query | status | metadata | ++----------+----------------+-----------------------------+-----------+------------------------------------------------------------------+ +| "" | "snapshot" | ["CREATE SNAPSHOT"] | "running" | {phase: "VERTICES", items_done: 142000, items_total: 500000, ... | ++----------+----------------+-----------------------------+-----------+------------------------------------------------------------------+ ``` -By default, the users can see and terminate only the transactions they have + +Snapshot progress values are read from independent atomic counters and are not +captured as a single consistent snapshot. `items_done`, `items_total`, and +`phase` may reflect slightly different points in time, so treat them as +best-effort estimates rather than exact figures. In particular, `items_done` +may briefly read as `0` when the phase transitions, and `elapsed_ms` may be +absent if the snapshot started between the phase check and the time read. + + + +Snapshot rows cannot be terminated. Passing `"snapshot"` to `TERMINATE +TRANSACTIONS` will have no effect — background snapshot creation runs outside +the normal transaction lifecycle and cannot be interrupted via Cypher. 
+ + +Because snapshot rows always have `status` `"running"`, they are suppressed +when you use `SHOW COMMITTING TRANSACTIONS` or `SHOW ABORTING TRANSACTIONS`. + +#### Permissions + +By default, users can see and terminate only the transactions they have started. For all other transactions, the user must have the [**TRANSACTION_MANAGEMENT** privilege](/database-management/authentication-and-authorization/role-based-access-control) which the admin assigns with the following query: @@ -117,11 +188,11 @@ using the following query: REVOKE TRANSACTION_MANAGEMENT FROM user; ``` - + When Memgraph is first started there is only one explicit super-admin user that has all the privileges, including the **TRANSACTION_MANAGEMENT** privilege. The super-admin user is able to see all -transactions. +transactions. If you are connecting to Memgraph using a client, you can pass additional @@ -213,18 +284,17 @@ Managing transactions is done by establishing a new connection to the database. **Show and terminate transactions** -The output of the `SHOW TRANSACTIONS` command shows that a query is +The output of the `SHOW TRANSACTIONS` command shows that a query is currently being run as part of the transaction ID "9223372036854794885". 
```copy=false - memgraph> SHOW TRANSACTIONS; -+---------------+-----------------------------+-------------------------------------------+----------------+ -| username | transaction_id | query | metadata | -+---------------+-----------------------------+-------------------------------------------+----------------+ -| "" | "9223372036854794885" | ["CALL infinite.get() YIELD * RETURN *;"] | {} | -| "" | "9223372036854794896" | ["SHOW TRANSACTIONS"] | {} | -+---------------+-----------------------------+-------------------------------------------+----------------+ ++----------+------------------------+-------------------------------------------+-----------+----------+ +| username | transaction_id | query | status | metadata | ++----------+------------------------+-------------------------------------------+-----------+----------+ +| "" | "9223372036854794885" | ["CALL infinite.get() YIELD * RETURN *;"] | "running" | {} | +| "" | "9223372036854794896" | ["SHOW TRANSACTIONS"] | "running" | {} | ++----------+------------------------+-------------------------------------------+-----------+----------+ ``` To terminate the transaction, run the following query: diff --git a/pages/help-center/errors/transactions.mdx b/pages/help-center/errors/transactions.mdx index 0354485b9..dbe4f008a 100644 --- a/pages/help-center/errors/transactions.mdx +++ b/pages/help-center/errors/transactions.mdx @@ -141,9 +141,9 @@ Here are the [instructions](/configuration/configuration-settings#using-flags-an Here are the storage access error messages you might encounter: -1. **Cannot get shared access storage. Try stopping other queries that are running in parallel.** -2. **Cannot get unique access to the storage. Try stopping other queries that are running in parallel.** -3. **Cannot get read only access to the storage. Try stopping other queries that are running in parallel.** +1. **Cannot get shared access to the storage. Try stopping other parallel queries.** +2. 
**Cannot get unique access to the storage. Try stopping other parallel queries.** +3. **Cannot get read-only access to the storage. Try stopping other parallel queries.** ### Understanding storage access timeout @@ -155,16 +155,32 @@ Storage access timeouts occur during query preparation when the query execution These timeouts prevent worker starvation and database blocking that could occur if queries were to wait indefinitely for storage access. -Users can fine-tune the timeout by setting the flag `--storage-access-timeout-sec`. -Longer timeouts will result in fewer access timeouts, but can lead to worse responsiveness from the database. This is due to workers waiting longer for access before failing. +Users can fine-tune the timeout via the runtime setting or the startup flag. + +To check the current value: + +```cypher +SHOW DATABASE SETTING 'storage.access_timeout_sec'; +``` + +To increase the timeout to 10 seconds at runtime (valid range: 1–1000000): + +```cypher +SET DATABASE SETTING 'storage.access_timeout_sec' TO '10'; +``` + +The change takes effect immediately and does not require a restart. The setting falls back to the `--storage-access-timeout-sec` startup flag value on restart (default: `1`). You can also inspect all current flag values with `SHOW CONFIG`. + +Longer timeouts will result in fewer access timeouts, but can lead to worse responsiveness from the database, as workers wait longer before failing. ### Handling storage access timeout When you encounter a storage access timeout: 1. Check for long-running queries that might be blocking storage access. -2. Consider breaking down complex queries that require unique access into smaller operations. -3. Retry the query after other queries have completed. -4. If possible, schedule queries requiring unique access during periods of lower database activity. +2. Increase the timeout if your workload legitimately needs more time: `SET DATABASE SETTING 'storage.access_timeout_sec' TO '10';` +3. 
Consider breaking down complex queries that require unique access into smaller operations. +4. Retry the query after other queries have completed. +5. If possible, schedule queries requiring unique access during periods of lower database activity. \ No newline at end of file diff --git a/pages/querying/vector-search.mdx b/pages/querying/vector-search.mdx index 197c1812f..e7ce04be7 100644 --- a/pages/querying/vector-search.mdx +++ b/pages/querying/vector-search.mdx @@ -73,6 +73,38 @@ The following options apply to both single-store vector indexes (nodes) and vect If resizing fails due to memory limitations, an exception will be thrown. Default value is `2`. - `scalar_kind: string (default=f32)` ➡ The [scalar kind](#scalar-kind) used to store each vector component. Smaller types reduce memory usage but may decrease precision. +### Using a function for configuration + +Instead of a static map literal, you can pass a **query module function** that returns the configuration map. This lets you centralize index configurations and reuse them across queries. + +Any `@mgp.function` that returns a map containing at least the mandatory `dimension` and `capacity` fields can be used. + +For example, given a query module `vector_index_config` defined as: + +```python +import mgp + +@mgp.function +def default_config() -> mgp.Map: + return {"dimension": 128, "capacity": 1000} + +@mgp.function +def config(dimension: int, capacity: int, metric: str = "l2sq", scalar_kind: str = "f32") -> mgp.Map: + return {"dimension": dimension, "capacity": capacity, "metric": metric, "scalar_kind": scalar_kind} +``` + +You can use these functions when creating vector indexes: + +```cypher +CREATE VECTOR INDEX idx ON :Label(embedding) WITH CONFIG vector_index_config.default_config(); +``` + +```cypher +CREATE VECTOR INDEX idx ON :Label(embedding) WITH CONFIG vector_index_config.config(128, 1000, "cos"); +``` + +The function is evaluated at index-creation time. 
Both the node and edge index variants support this syntax. + ## Run vector search To run vector search, call the `vector_search` query module: use `vector_search.search()` for a vector index on nodes and `vector_search.search_edges()` for a vector index on edges. @@ -203,6 +235,25 @@ Alternative options, such as `f16` for lower memory usage, allow you to fine-tun | `i16` | 16-bit signed integer. | | `i8` | 8-bit signed integer. | +## Monitor vector index memory + +Memgraph tracks vector index memory separately from the rest of the graph data. You can inspect both from `SHOW STORAGE INFO`: + +```cypher +SHOW STORAGE INFO; +``` + +Two fields are relevant: + +- `graph_memory_tracked` — memory used by graph structures (vertices, edges, properties). +- `vector_index_memory_tracked` — memory used by vector index embeddings stored in the index backend. + +Together, these two fields sum to `memory_tracked` (the total tracked allocation). The instance-level `--memory-limit` applies to the combined total: if inserting a vector would exceed the limit, Memgraph throws a `Memory limit exceeded` error. + + +Deleting vertices or removing a vector property from nodes does **not** free `vector_index_memory_tracked`. However, that memory is reused when new vectors are inserted into the same index, so the reserved capacity is not wasted. Memory is fully released only when the entire index is dropped with `DROP VECTOR INDEX`. + + ## Drop vector index Vector indices are dropped with the `DROP VECTOR INDEX` command. You need to give the name of the index to be deleted. diff --git a/pages/release-notes.mdx b/pages/release-notes.mdx index d4c70403e..2f0dbd45d 100644 --- a/pages/release-notes.mdx +++ b/pages/release-notes.mdx @@ -42,21 +42,205 @@ troubleshoot in production. ## 🚀 Latest release +### Memgraph v3.9.0 + +{

✨ New features

} + +- Vector indexes can now be created with a function in the `WITH CONFIG` + clause: `CREATE VECTOR INDEX ... WITH CONFIG fun();` so you can generate + index configuration dynamically instead of using a static map literal. + [#3801](https://github.com/memgraph/memgraph/pull/3801) +- Added `--storage-floating-point-resolution-bits` to choose double (64), float + (32), or half (16) precision for floating-point property values system-wide, + so you can reduce memory use for graphs with many float properties when lower + precision is acceptable. + [#3817](https://github.com/memgraph/memgraph/pull/3817) +- Added coordinator setting `deltas_batch_progress_size` to control how many + deltas a replica processes before sending an in-progress heartbeat during + replication. HA clusters can tune it to avoid RPC timeouts on large + transactions (lower value) or reduce overhead on fast replicas (higher + value). Set via `SET COORDINATOR SETTING`; default is 100000 (unchanged + behavior for existing clusters). + [#3797](https://github.com/memgraph/memgraph/pull/3797) +- `SHOW TRANSACTIONS` now lists transactions in committing and aborting phases + (in addition to running), and shows any snapshot creation in progress with + phase and progress in metadata, so you can see what is running on the + database at a glance. [#3833](https://github.com/memgraph/memgraph/pull/3833) +- Added `--logger-type` flag (`sync` or `async`, default `sync`). When set to + `async`, log messages are written by a background thread instead of inline, + reducing latency on hot paths under high throughput. Default behavior is + unchanged; use `--logger-type=async` to opt in. + [#3844](https://github.com/memgraph/memgraph/pull/3844) +- Added `--log-retention-days` to control how many days daily log files are + kept before automatic removal. Default is 35 (unchanged from previous + behavior); set the flag to tune retention for storage or compliance. 
+ [#3862](https://github.com/memgraph/memgraph/pull/3862) +- `--storage-access-timeout-sec` can now be updated at runtime via the + `storage.access_timeout_sec` database setting, so you can increase the + timeout when you hit storage timeout errors without restarting the instance. + [#3882](https://github.com/memgraph/memgraph/pull/3882) +- Added server-side global parameters: define named values that persist across + sessions and are resolved in queries. Use `SET GLOBAL PARAMETER`, `UNSET + GLOBAL PARAMETER`, and `SHOW PARAMETERS` to manage them, and reference values + in Cypher with `$param` (client-provided parameters take precedence). Global + parameters are replicated and recovered on restart, so you can use them as a + durable, cluster-wide store for frequently used query values. + [#3717](https://github.com/memgraph/memgraph/pull/3717) + +{

🛠️ Improvements

} + +- `SHOW STORAGE INFO` now splits memory into `graph_memory_tracked` and + `vector_index_memory_tracked`, so you can see how much memory vector indices + use versus graph data and better plan capacity or troubleshoot memory usage. + [#3847](https://github.com/memgraph/memgraph/pull/3847) +- Renamed the `SHOW STORAGE INFO` key from `embeddings_memory_tracked` to + `vector_index_memory_tracked` so the name reflects that it accounts for all + vector index memory, not only embeddings data. + [#3863](https://github.com/memgraph/memgraph/pull/3863) +- ARM platforms now install the correct Go and C# binaries for driver tests, so + driver tests can run on ARM in addition to x86. + [#3776](https://github.com/memgraph/memgraph/pull/3776) +- Error messages on storage access timeout are now more descriptive so you can + identify and troubleshoot lock or contention issues more easily. + [#3778](https://github.com/memgraph/memgraph/pull/3778) +- Reduced memory use for vertices and edges by packing boolean values in the + delta pointer layout (8 bytes saved per vertex and per edge) with no + performance impact. [#3814](https://github.com/memgraph/memgraph/pull/3814) +- When `--storage-snapshot-on-exit` is false, any in-progress periodic snapshot + is aborted on shutdown so the instance stops faster instead of waiting for + the snapshot to finish. The snapshot digest is now updated only after a + successful write, so a failed or aborted snapshot no longer causes the next + periodic snapshot to be skipped. + [#3823](https://github.com/memgraph/memgraph/pull/3823) +- License selection now considers the database, environment variables, and CLI + together, automatically choosing the license with the furthest expiry (with + source priority to break ties), so you no longer need to clear old database + settings for a newer CLI or env license to apply. Malformed or expired keys + are rejected immediately when set, and reported license info now matches + actual feature availability. 
+ [#3859](https://github.com/memgraph/memgraph/pull/3859) +- The EdgeSetProperty WAL delta has been expanded, which speeds up edge + recovery and replication. No action is required; the change is automatic and + backward-compatible. + [#3799](https://github.com/memgraph/memgraph/pull/3799) + +{

🐞 Bug fixes

} + +- Improved query planning for BFS with index lookup: a cost model now chooses + between bidirectional and single-source BFS so you get better execution plans + for shortest-path queries when both source and destination use an index. + [#3751](https://github.com/memgraph/memgraph/pull/3751) +- Fixed a bug where `schema.assert` could not be used as a property name or + value, which blocked backup and restore in some `DUMP DATABASE` scenarios. + [#3772](https://github.com/memgraph/memgraph/pull/3772) +- When TLS/SSL certificate or key loading fails, Memgraph now logs an error + message so you can see why startup or connection failed and fix path or + permission issues. [#3795](https://github.com/memgraph/memgraph/pull/3795) +- With `--data-recovery-on-startup` enabled, the system now recovers both users + and databases as system metadata on startup, so all system metadata is + restored consistently; user data is still not recovered. + [#3807](https://github.com/memgraph/memgraph/pull/3807) +- Plan caching is now enabled again for queries that use BFS, fixing a + performance regression where a new plan was created for every such query. + [#3821](https://github.com/memgraph/memgraph/pull/3821) +- User and role details that apply only to enterprise licenses are now + persisted when switching between enterprise and community, so they are not + lost and apply again after upgrading back to an enterprise license. + [#3867](https://github.com/memgraph/memgraph/pull/3867) +- `SHOW SCHEMA INFO` now correctly respects fine-grained label-based access + control (LBAC) READ permissions, so restricted users see only the schema + elements they are allowed to access. + [#3870](https://github.com/memgraph/memgraph/pull/3870) +- Edge removals in the text index are now fully tracked when using `DETACH + DELETE`, so `SHOW INDEX INFO` reports accurate edge counts instead of + outdated metadata (search results were already correct). 
+ [#3878](https://github.com/memgraph/memgraph/pull/3878) +- Concurrent operations on vector indexes (nodes and edges) are now less + contention-prone: internal locking allows multiple writers in parallel and + reserves exclusive access only for resize and removal. Garbage collection of + deleted entries is targeted instead of scanning the full index, improving GC + performance and fixing a potential crash when vertices or edges were deleted + in analytical mode before the vector index was cleaned up. + [#3856](https://github.com/memgraph/memgraph/pull/3856) +- Fixed a bug where the parallel version of `DISTINCT` could invalidate other + cursors in the query and cause failures. PARALLEL EXECUTION with `DISTINCT` + now runs correctly and faster. + [#3876](https://github.com/memgraph/memgraph/pull/3876) +- Fixed inconsistent ordering in the parallel `DISTINCT` cursor, which could + make PARALLEL EXECUTION output unstable. `DISTINCT` used in PARALLEL + EXECUTION now produces correct, stable results. + [#3815](https://github.com/memgraph/memgraph/pull/3815) +- Fixed a bug in HA replication where concurrent edge deltas could be processed + twice during durability traversal, causing replica inconsistency. + [#3836](https://github.com/memgraph/memgraph/pull/3836) +- Fixed a crash that could occur when queries using Python user-defined + functions were cleaned up on internal worker threads. Memgraph now acquires + the Python GIL before releasing Python object references during query + teardown, preventing segfaults when using Python query modules with custom + functions. [#3837](https://github.com/memgraph/memgraph/pull/3837) +- Fixed a rare race in durability when a transaction that created concurrent + edges aborted while the delta chain was being processed, which could cause + duplicate or missed edge-delta encoding and durability integrity issues. 
+ [#3839](https://github.com/memgraph/memgraph/pull/3839) +- Fixed a segfault that could occur when closing Python query modules during + Memgraph shutdown. A missing Python interpreter state check is now performed + before cleanup so shutdown completes cleanly when Python modules are loaded. + [#3842](https://github.com/memgraph/memgraph/pull/3842) +- Fixed a segfault during vector index garbage collection when vertices or + edges were being inserted concurrently, which could occur with bulk + create/delete/create workloads. + [#3846](https://github.com/memgraph/memgraph/pull/3846) +- Fixed a data race in the high-availability coordinator's cluster + configuration handling. Concurrent config loads and saves are now properly + synchronized, preventing inconsistent state and coordinator instability + during cluster membership changes. + [#3850](https://github.com/memgraph/memgraph/pull/3850) +- Shutdown on SIGTERM and SIGINT no longer runs inside a signal handler, where + logging and mutex use were unsafe. Signals are now consumed on the main + thread so the full shutdown sequence runs in normal context, making shutdown + more robust against rare crashes or hangs. + [#3851](https://github.com/memgraph/memgraph/pull/3851) +- Fixed an assertion in the Python bindings caused by unhandled import errors. + When the system encounters errors (for example when over the memory limit), + Python will fail to load modules. This was not gracefully handled and would + cause termination. The Python bindings are now more robust. + [#3864](https://github.com/memgraph/memgraph/pull/3864) +- Fixed a bug where a periodic commit failure caused a double abort (which in + turn triggered an assert). A periodic commit failure is a special case and is + now handled correctly by the interpreter, so users can run periodic commit + without worrying about whether the query will succeed.
+ [#3868](https://github.com/memgraph/memgraph/pull/3868) +- Fixed a bug where invalid queries combining CALL and UNION silently + produced wrong results instead of throwing an exception. + [#3869](https://github.com/memgraph/memgraph/pull/3869) +- Fixed an occasional "Operation not permitted" error when vertices with + concurrently imported edges are indexed. + [#3871](https://github.com/memgraph/memgraph/pull/3871) +- Communication data parsing is now more robust when values do not match the + expected type, so you get fewer unexpected communication errors between + clients and the database. + [#3873](https://github.com/memgraph/memgraph/pull/3873) +- The schema query module now works with ZonedDateTime properties, so you can + use schema assertions and schema-based workflows (such as backup and restore) + on graphs that store temporal data with time zones. + [#3874](https://github.com/memgraph/memgraph/pull/3874) + +### Lab v3.9.0 + +## Previous releases + ### Memgraph v3.8.1 - February 17th, 2026 {

🐞 Bug fixes

} -- Fixed using `LOAD CSV` via SSL, by adding missing `ca-certificates` package to - the Memgraph Docker image +- Fixed using `LOAD CSV` via SSL, by adding missing `ca-certificates` package + to the Memgraph Docker image [#3793](https://github.com/memgraph/memgraph/pull/3793) - -- A data instance could've get into a deadlock state when some specific timings - are triggered. This is now fixed so users should be able to always start a +- A data instance could get into a deadlock state when some specific timings + are triggered. This is now fixed so users should be able to always start a data instance without any deadlocks occurring. [#3787](https://github.com/memgraph/memgraph/pull/3787) - -- Fix so that concurrent edge imports will no longer delay garbage collection. - This resulted in temporary memory spikes after import, until the periodic +- Fix so that concurrent edge imports will no longer delay garbage collection. + This resulted in temporary memory spikes after import, until the periodic garbage collector could free up memory. [#3784](https://github.com/memgraph/memgraph/pull/3784) @@ -382,8 +566,6 @@ correctly for auditing and role-based logic. -## Previous releases - ### Memgraph v3.7.2 - December 23rd, 2025 {

🐞 Bug fixes

} diff --git a/skills/check-before-release/SKILL.md b/skills/check-before-release/SKILL.md new file mode 100644 index 000000000..8e94516aa --- /dev/null +++ b/skills/check-before-release/SKILL.md @@ -0,0 +1,60 @@ +--- +name: check_before_release +description: Run before every release to ensure all memgraph PRs have changelog entries and docs pages where required. Use when preparing a release branch, before merging into the main branch, or when asked to "check before release". +--- + +# Check before release + +Run this check before every release to find memgraph PRs that are missing from the changelog or that have no documentation page despite being labeled "Docs needed". + +## When to use + +- Before merging the main release documentation PR (e.g. `memgraph/documentation#1530` for 3.9) +- When preparing or validating a release branch +- When asked to verify release readiness for docs/changelog + +## Assumptions + +- Memgraph PRs that need docs are labeled **"Docs needed"** or **"Docs - changelog only"**. +- The release documentation PR (in `memgraph/documentation`) contains two lists: + - **Memgraph PRs Docs Needed** – memgraph PR numbers with their corresponding doc PRs (e.g. `memgraph#3801 → #1555`). + - **Release Notes Required** – memgraph PR numbers that must appear in the changelog. + +## Steps + +1. **Identify versions** + - Current release (e.g. `3.9`) and previous one (e.g. `3.8.0`). + - Memgraph milestone for the release (e.g. `mg-v3.9.0`, milestone 43). + - The open documentation release PR (e.g. `memgraph/documentation#1530`). + +2. **Get the authoritative lists from the docs PR** + - From the PR description, extract: + - Every **Memgraph PRs Docs Needed** line: memgraph PR # and linked doc PR #. + - Every **Release Notes Required** line: memgraph PR # (and title if present). + +3. **Changelog check** + - Open `pages/release-notes.mdx` and locate the section for the new release (e.g. `### Memgraph v3.9.0`). 
+ - For each PR in **Release Notes Required**, confirm it appears in that section (e.g. as `[#NNNN](https://github.com/memgraph/memgraph/pull/NNNN)` or equivalent). + - List any **missing from changelog**: PRs in Release Notes Required with no link in the release section. + +4. **Docs page check** + - For each memgraph PR listed under **Memgraph PRs Docs Needed**, confirm the description links to a documentation PR (e.g. `→ #1555`). + - Optionally fetch the milestone or PR list from `github.com/memgraph/memgraph/milestone/` and find any PR with label "Docs needed" that is **not** mentioned in the docs PR’s "Memgraph PRs Docs Needed" list. + - List any **docs page missing**: PRs labeled "Docs needed" (or that clearly need a dedicated docs page) with no corresponding doc PR in the release docs PR. + +5. **Report** + - **Not in changelog:** list of memgraph PR numbers (and titles if helpful) that are in Release Notes Required but not in `release-notes.mdx` for this release. + - **Docs page missing:** list of memgraph PR/issue numbers with "Docs needed" and no doc PR linked in the release docs PR; briefly note what’s missing (e.g. “TLS .pem-only behavior”). + - If both lists are empty, state that the release is clear for changelog and docs. + +## References + +- Memgraph commits (since last release): `https://github.com/memgraph/memgraph/commits/master/` +- Memgraph milestone (e.g. 3.9): `https://github.com/memgraph/memgraph/milestone/43?closed=1` +- Release notes file: `pages/release-notes.mdx` +- Documentation release PR: linked from the milestone or repo (e.g. `memgraph/documentation` open PR for the release). + +## Notes + +- PRs labeled **"Docs unnecessary"** (e.g. CI, tests, internal tooling) are excluded; no changelog or docs page required. +- If a memgraph PR was closed and its work continued in another PR (e.g. 
#3788 → #3795), use the PR that actually merged; if the closed PR was labeled "Docs needed" and the follow-up is "Docs unnecessary", still report the gap if the behavior is not documented.
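The changelog check in step 3 can be sketched as a small script. Everything below is an assumption about layout, not part of the skill itself: it only needs the text of `pages/release-notes.mdx`, the `###` release heading, and the required PR numbers extracted in step 2 (the function name and signature are illustrative):

```python
import re

def missing_from_changelog(release_notes: str, section: str,
                           required_prs: list[int]) -> list[int]:
    """Return the PR numbers from `required_prs` that do not appear as
    memgraph pull-request links inside the given release section."""
    # Isolate the release section: from its `###` heading to the next one.
    start = release_notes.find(section)
    if start == -1:
        return sorted(required_prs)
    end = release_notes.find("\n### ", start + len(section))
    body = release_notes[start:end if end != -1 else len(release_notes)]
    # Collect every linked memgraph PR number in the section.
    linked = {int(n) for n in
              re.findall(r"github\.com/memgraph/memgraph/pull/(\d+)", body)}
    return sorted(set(required_prs) - linked)
```

Run it against the file contents and the **Release Notes Required** list; a non-empty result is the "missing from changelog" list for the report in step 5.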