Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

steampipe postgres size management #4260

Closed
deivK opened this issue Apr 29, 2024 · 7 comments
Closed

steampipe postgres size management #4260

deivK opened this issue Apr 29, 2024 · 7 comments
Labels
stale No recent activity has been detected on this issue/PR and it will be closed

Comments

@deivK
Copy link

deivK commented Apr 29, 2024

Hi,
we are using the aws plugin to collect some data from AWS. Every day we run some batch queries for each account and store data in another postgres db we use for analysis.
The issue is that the steampipe postgres, the one under .steampipe/db/14.2.0, is increasing out of control: the /.steampipe/db/14.2.0/data folder has reached 94 GB.

How can we reduce the size of this db and how can we keep it under control?

Many thanks
Davide

@pskrbasu
Copy link
Contributor

pskrbasu commented Apr 30, 2024

Hi @deivK this seems like something abnormal, 94GB is huge 🤯

A few questions:

  1. What steampipe version are you on?
  2. You mentioned having AWS accounts setup, how many accounts do you have/are you querying?
  3. Are you using any other steampipe plugins?
  4. What are the sizes of data and postgres directories inside .steampipe/db/14.2.0?
  5. "and store data in another postgres db we use for analysis" - I want to understand how you do this. Do you use a script?

@deivK
Copy link
Author

deivK commented Apr 30, 2024

Hi,
thank you for your reply.

  1. We are using Stemapipe v0.21.6
  2. We are querying 35 accounts
  3. we just use the aws plugin
  4. data is 94G, postgres is 127M
  5. we have a bunch of python scripts that, for each account, run batch queries and store the results in another database. Most of those queries are plain select from an aws table (like aws_account, aws_ec2_ami, aws_ec2_instance, ...). This happens every day, once a day.

Thanks again

@e-gineer
Copy link
Contributor

Have you tried analyzing the steampipe postgres database directory to see the main driver of the space usage?

Perhaps follow some steps in this StackOverflow article?

https://dba.stackexchange.com/questions/314079/postgres-how-to-find-where-database-size-growth-is-coming-from

Will be interesting to learn what you see?

@deivK
Copy link
Author

deivK commented May 6, 2024

Hi all,
I started steampipe as a service so that I could connect to the postgres database. I manually ran some VACUUM and the database size dropped to 700M.
Maybe, because the service was not running, the autovacuum were not executed?
I'm leaving the service running, even if we are running queries as batch queries, and monitoring the db size,

@e-gineer
Copy link
Contributor

e-gineer commented May 6, 2024

Interesting that autovacuum fixed it. How were you running Steampipe that it grew to that size? Keen to learn from your setup / experience...

Copy link

github-actions bot commented Jul 6, 2024

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.

@github-actions github-actions bot added the stale No recent activity has been detected on this issue/PR and it will be closed label Jul 6, 2024
Copy link

github-actions bot commented Aug 6, 2024

This issue was closed because it has been stalled for 90 days with no activity.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Aug 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale No recent activity has been detected on this issue/PR and it will be closed
Projects
None yet
Development

No branches or pull requests

3 participants