Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add postgresql schemata options #2028

Closed
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
71 changes: 60 additions & 11 deletions visidata/loaders/postgres.py
Original file line number Diff line number Diff line change
Expand Up @@ -54,7 +54,11 @@ def openurl_postgres(vd, url, filetype=None):
port=url.port,
password=url.password)

return PgTablesSheet(dbname+"_tables", sql=SQL(conn))
schema = options.postgres_schema
if schema in ("*", "**"):
return PgSchemataSheet(dbname, sql=SQL(conn))
else:
return PgTablesSheet(dbname+"."+schema, sql=SQL(conn))


VisiData.openurl_postgresql=VisiData.openurl_postgres
Expand Down Expand Up @@ -84,18 +88,61 @@ def postgresGetColumns(vd, cur):
yield ColumnItem(coldesc.name, i, type=codeToType(coldesc.type_code, coldesc.name))


class PgSchemataSheet(Sheet):
rowtype = 'schemas'

def reload(self):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally we're standardizing on defining iterload(), which is almost entirely like reload, but you don't have to clear the rows, and instead of calling addRow() directly, just yield the row.

if options.postgres_schema == "*":
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use self.options (here at least, also other places it's used) so that the option can be overridden on a per-sheet basis.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And let's make it so that an empty string means all-but-internal schema, instead of *. Maybe * can mean "including internal schemas" like you have for ** below. Do we think it's important to support internal schemas though?

qstr = """
SELECT schema_name
FROM information_schema.schemata
WHERE
(schema_name <> 'information_schema')
AND (schema_name NOT LIKE 'pg_%');
"""
elif options.postgres_schema == "**":
qstr = """SELECT schema_name FROM information_schema.schemata;"""

with self.sql.cur(qstr) as cur:
self.nrowsPerTable = {}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used anywhere?


self.rows = []
# try to get first row to make cur.description available
r = cur.fetchone()
if r:
self.addRow(r)
self.columns = []
for c in vd.postgresGetColumns(cur):
self.addColumn(c)
self.setKeys(self.columns[0:1]) # schema_name is the key

for r in cur:
self.addRow(r)

def openRow(self, row):
return PgTablesSheet(self.name+"."+row[0], source=row[0], sql=self.sql)


# rowdef: (table_name, ncols)
class PgTablesSheet(Sheet):
rowtype = 'tables'

def reload(self):
schema = options.postgres_schema
schema = self.source or options.postgres_schema
qstr = f'''
SELECT relname table_name, column_count.ncols, reltuples::bigint est_nrows
FROM pg_class, pg_namespace, (
SELECT table_name, COUNT(column_name) AS ncols FROM information_schema.COLUMNS WHERE table_schema = '{schema}' GROUP BY table_name
SELECT relname table_name,
column_count.ncols,
reltuples::bigint est_nrows
FROM pg_class,
pg_namespace,
( SELECT table_name, COUNT(column_name) AS ncols
FROM information_schema.COLUMNS
WHERE table_schema = '{schema}'
GROUP BY table_name
) AS column_count
WHERE pg_class.relnamespace = pg_namespace.oid AND pg_namespace.nspname = '{schema}' AND column_count.table_name = relname;
WHERE (pg_class.relnamespace = pg_namespace.oid)
AND (pg_namespace.nspname = '{schema}')
AND (column_count.table_name = relname);
'''

with self.sql.cur(qstr) as cur:
Expand All @@ -115,17 +162,19 @@ def reload(self):
self.addRow(r)

def openRow(self, row):
return PgTable(self.name+"."+row[0], source=row[0], sql=self.sql)
return PgTable(
self.name+"."+row[0],
source=(self.source or options.postgres_schema)+"."+row[0],
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the given name and the source should be the same here. Let's use source, since I think it's reasonable to require source to be the schema for this class, and only use options.postgres_schema in the outer openurl_postgres function.

sql=self.sql
)


# rowdef: tuple of values as returned by fetchone()
class PgTable(Sheet):
@asyncthread
def reload(self):
if self.options.postgres_schema:
source = f"{self.options.postgres_schema}.{self.source}"
else:
source = self.source
source = self.source
Comment on lines -125 to +176
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will conflict with #2129, where the table name and schema were quoted to support both being something other than lowercase.

i.e.:

"public"."TableName" rather than public.TableName or "public.TableName"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for checking! As part of this PR, should we then move the #2129 fix into openurl_postgres?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will have to check that the loading of lowercase table names is preserved before merging.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we then move the #2129 fix into openurl_postgres?

I don't see how that could be done. I think my tweak belongs in these places, where PgTable.source is set:

certainly here:
https://github.com/jrjsmrtn/visidata/blob/621dd88a75c6a133ce1a79cbd795c51fccd1d743/visidata/loaders/postgres.py#L167

maybe here?:
https://github.com/jrjsmrtn/visidata/blob/621dd88a75c6a133ce1a79cbd795c51fccd1d743/visidata/loaders/postgres.py#L123

I don't understand what some of the variables in this code refers to, such as row[0], columns[0:1], etc, so I'm on shaky ground here.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

row[0] is the table_name as per the rowdef comment at the top of the class. columns[0:1] creates a list containing the first column (and an empty list if there are no columns).

I don't think the quoting should be done in the schema.table string given to the sheet; the SQL should do the quoting. So this means that the schema and the table name should be passed separately to the postgres sheet.


with self.sql.cur(f"SELECT * FROM {source}") as cur:
self.rows = []
r = cur.fetchone()
Expand Down