-
Notifications
You must be signed in to change notification settings - Fork 6
docs: add query exporter monitoring documentation #112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
Miggets7
wants to merge
8
commits into
main
Choose a base branch
from
query-exporter
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
8 commits
Select commit
Hold shift + click to select a range
ed9ba28
docs: add query exporter monitoring documentation
Miggets7 0bc4c52
Merge branch 'main' into query-exporter
richturner 68ba56f
Improved documentation based on feedback
Miggets7 178205d
Merge branch 'main' into query-exporter
wborn 6b03cfe
Merge branch 'main' into query-exporter
Miggets7 5c29e07
Apply suggestions from code review
Miggets7 a55b1b2
Update query exporter documentation with Docker networking details
Miggets7 ed2fe11
Merge branch 'main' into query-exporter
Miggets7 File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,174 @@ | ||
| --- | ||
| sidebar_position: 18 | ||
| --- | ||
|
|
||
| # Query Exporter | ||
|
|
||
| The query-exporter service monitors the OpenRemote PostgreSQL database and exposes metrics on port 9560 for Prometheus scraping. It uses [query-exporter](https://github.com/albertodonato/query-exporter) to collect database health metrics. | ||
|
|
||
| ## Available Metrics | ||
|
|
||
| ### Table and Index Bloat | ||
| - `pg_table_bloat_count` - Number of tables/indexes with bloat exceeding thresholds | ||
| - `pg_table_bloat_ratio` - Bloat ratio per table/index (1.0 = no bloat, 2.0 = 100% bloat) | ||
| - `pg_table_bloat_bytes` - Estimated bloat size in bytes per table/index | ||
| - `pg_table_bloat_wasted_mb` - Estimated wasted space in megabytes per table/index | ||
|
|
||
| ### Autovacuum Workers | ||
| - `pg_autovacuum_workers_active` - Number of currently active autovacuum workers | ||
| - `pg_autovacuum_workers_max` - Maximum number of autovacuum workers configured | ||
| - `pg_autovacuum_running` - Running autovacuum processes (labels: database, table_schema, table_name, phase) | ||
|
|
||
| ### Datapoint Query Performance | ||
| - `pg_datapoint_query_duration_seconds` - Histogram of execution times for the attribute that is automatically identified as having the highest datapoint count | ||
| - `pg_datapoint_count` - Total number of datapoints for the attribute that is automatically identified as having the highest datapoint count | ||
|
|
||
| ### Database Health | ||
| - `pg_database_size_megabytes` - Total database size in megabytes | ||
| - `pg_connections_active` - Number of active connections | ||
| - `pg_connections_idle` - Number of idle connections | ||
| - `pg_locks_count` - Number of locks by type | ||
|
|
||
| ## Configuration | ||
|
|
||
| ### Environment Variables | ||
| The service uses the following environment variables (automatically configured in `profile/deploy.yml`): | ||
|
|
||
| **Database Connection:** | ||
| - `POSTGRES_HOST` - Database host (default: `postgresql`) | ||
| - `POSTGRES_PORT` - Database port (default: `5432`) | ||
| - `POSTGRES_DB` - Database name (default: `openremote`) | ||
| - `POSTGRES_USER` - Database user (default: `postgres`) | ||
| - `POSTGRES_PASSWORD` - Database password (default: `postgres`) | ||
|
|
||
| **Bloat Thresholds:** | ||
| - `TABLE_BLOAT_THRESHOLD` - Table bloat ratio threshold (default: `1.2` = 20% bloat) | ||
| - `INDEX_BLOAT_THRESHOLD` - Index bloat ratio threshold (default: `1.5` = 50% bloat) | ||
|
|
||
| :::note | ||
|
|
||
| Indexes typically bloat faster than tables, so the default index threshold is higher. | ||
|
|
||
| ::: | ||
|
|
||
| ### Customize Thresholds | ||
| Set environment variables before starting services: | ||
| ```bash | ||
| export TABLE_BLOAT_THRESHOLD=1.3 # 30% table bloat | ||
| export INDEX_BLOAT_THRESHOLD=2.0 # 100% index bloat | ||
| ``` | ||
|
|
||
| ### Query Intervals | ||
| - Table bloat queries: Every 5 minutes | ||
| - Autovacuum queries: Every 30 seconds | ||
| - Datapoint performance: Every 60 seconds (samples 100 most recent datapoints) | ||
| - Database size: Every 5 minutes | ||
| - Connection/lock stats: Every 30 seconds | ||
|
|
||
| ## Accessing Metrics | ||
|
|
||
| ### View Metrics Endpoint | ||
| ```bash | ||
| curl http://localhost:9560/metrics | ||
| ``` | ||
|
|
||
| ### Expose on Private Network | ||
| To expose on a private network in production, uncomment this line in `profile/deploy.yml`: | ||
| ```yaml | ||
| - "${PRIVATE_IP:-127.0.0.1}:9560:9560" | ||
| ``` | ||
|
|
||
| ## Prometheus Integration | ||
|
|
||
| Add this scrape configuration to your Prometheus config: | ||
|
|
||
| ```yaml | ||
| scrape_configs: | ||
| - job_name: 'openremote-postgres' | ||
| static_configs: | ||
| - targets: ['query-exporter:9560'] # Use service name in Docker network | ||
| # - targets: ['localhost:9560'] # Use localhost if Prometheus runs on host | ||
| scrape_interval: 30s | ||
| ``` | ||
|
|
||
| :::note | ||
|
|
||
| When Prometheus runs in the same Docker network as OpenRemote, use the service name `query-exporter:9560`. Only use `localhost:9560` if Prometheus is running directly on the host machine. | ||
|
|
||
| ::: | ||
|
|
||
| ## Customizing Queries | ||
|
|
||
| To modify queries or add new metrics: | ||
|
|
||
| 1. Edit the `config.yaml` file in the `query-exporter` configuration directory. | ||
| - Default host path: `/deployment/query-exporter/config.yaml` | ||
| - Container path (Docker volume mount): `/config/config.yaml` | ||
| - For the exact host path in your environment, see the `query-exporter` volume mapping in `profile/deploy.yml`. | ||
| 2. Restart the service: | ||
| ```bash | ||
| docker-compose -f profile/deploy.yml restart query-exporter | ||
| ``` | ||
|
|
||
| ## Troubleshooting | ||
|
|
||
| ### Check Service Logs | ||
| ```bash | ||
| docker-compose -f profile/deploy.yml logs -f query-exporter | ||
| ``` | ||
|
|
||
| ### Test Database Connectivity | ||
| ```bash | ||
| docker-compose -f profile/deploy.yml exec query-exporter sh | ||
| apk add postgresql-client | ||
| psql -h $POSTGRES_HOST -U $POSTGRES_USER -d $POSTGRES_DB | ||
| ``` | ||
|
|
||
| ### Verify Metrics Endpoint | ||
| ```bash | ||
| curl http://localhost:9560/metrics | ||
| ``` | ||
|
|
||
| ## Performance Tuning | ||
|
|
||
| If bloat detection queries impact database performance: | ||
|
|
||
| - **Increase query interval** - Change from 300s to 600s or higher in `config.yaml` | ||
| - **Limit to specific schemas** - Modify queries to target specific schemas only | ||
| - **Schedule off-peak runs** - Use `schedule` option instead of `interval` | ||
| - **Reduce sample size** - Lower the datapoint query sample size (default: 100) | ||
|
|
||
| ### Query Complexity | ||
| - Bloat detection scans `pg_stats` and `pg_class` catalogs (limited to top 50 results) | ||
| - Datapoint performance uses a sample size of 100 recent datapoints from the largest attribute (configurable) | ||
| - All queries exclude PostgreSQL system schemas (`pg_%` and `information_schema`) | ||
|
|
||
| ## Understanding Bloat | ||
|
|
||
| ### Bloat Ratio Values | ||
| - `1.0` - No bloat (optimal size) | ||
| - `1.2` - 20% bloat (default table threshold) | ||
| - `1.5` - 50% bloat (default index threshold) | ||
| - `2.0` - 100% bloat (object is twice the optimal size) | ||
|
|
||
| ### Maintenance Actions | ||
| - **Tables > 1.2** - Run `VACUUM FULL` during maintenance window | ||
| - **Indexes > 1.5** - Run `REINDEX` on affected indexes | ||
| - **Critical bloat (> 2.0)** - Immediate maintenance recommended | ||
|
|
||
| ### PostgreSQL Constants | ||
| The bloat detection queries use these PostgreSQL internal constants: | ||
| - `1048576` - Bytes per megabyte (1024 × 1024) | ||
| - `8` - Bits per byte (for null bitmap calculation) | ||
| - `20` - Page header size in bytes | ||
| - `12` - Index header overhead in bytes | ||
| - `4` - Item pointer size in bytes | ||
| - `23` - Typical tuple header size for PostgreSQL 14+ on Linux (this value may differ for other PostgreSQL versions or operating systems; verify for your deployment) | ||
| - `4` - Memory alignment for Linux containers | ||
|
|
||
| ## References | ||
|
|
||
| - [Query Exporter Documentation](https://github.com/albertodonato/query-exporter) | ||
| - [Configuration Format](https://github.com/albertodonato/query-exporter/blob/main/docs/configuration.rst) | ||
| - [PostgreSQL Statistics Views](https://www.postgresql.org/docs/current/monitoring-stats.html) | ||
| - [PostgreSQL Bloat Detection](https://wiki.postgresql.org/wiki/Show_database_bloat) | ||
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sample size value mentioned here (100) should be documented in the "Query Intervals" section above for consistency. Currently, the query intervals section (lines 61-66) documents time intervals but doesn't mention sample sizes. Consider adding a note about the sample size to that section or creating a "Query Parameters" subsection.