Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query Search is broken #6381

Closed
denisov-vlad opened this issue Aug 21, 2023 · 8 comments · Fixed by #6491
Closed

Query Search is broken #6381

denisov-vlad opened this issue Aug 21, 2023 · 8 comments · Fixed by #6491

Comments

@denisov-vlad
Copy link
Member

Issue Summary

After upgrading to Redash with Python 3.8, search cannot be used.

Steps to Reproduce

  1. Open /queries page
  2. Try to search something
  3. See "Internal Server Error"

Technical details:

  • Redash Version: 4a5c9c2
  • Browser/OS: Safari / Mac Os
  • How did you install Redash: k8s

Here are the logs from webserver pod:

[2023-08-21 09:11:29,965][PID:1784][ERROR][redash.app] Exception on /api/queries [GET]
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.UndefinedFunction: function tsq_parse(unknown, unknown) does not exist
LINE 6: ...up_id IN (2, 1, 8)) AND (queries.search_vector @@ tsq_parse(...
                                                             ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1484, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1469, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/usr/local/lib/python3.8/site-packages/flask_restful/__init__.py", line 489, in wrapper
    resp = resource(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/flask_login/utils.py", line 277, in decorated_view
    return current_app.ensure_sync(func)(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/flask/views.py", line 109, in view
    return current_app.ensure_sync(self.dispatch_request)(**kwargs)
  File "/app/redash/handlers/base.py", line 31, in dispatch_request
    return super(BaseResource, self).dispatch_request(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/flask_restful/__init__.py", line 604, in dispatch_request
    resp = meth(*args, **kwargs)
  File "/app/redash/permissions.py", line 71, in decorated
    return fn(*args, **kwargs)
  File "/app/redash/handlers/queries.py", line 148, in get
    response = paginate(
  File "/app/redash/handlers/base.py", line 81, in paginate
    count = query_set.count()
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3803, in count
    return self.from_self(col).scalar()
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3523, in scalar
    ret = self.one()
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3490, in one
    ret = self.one_or_none()
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3459, in one_or_none
    ret = list(self)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3535, in __iter__
    return self._execute_and_instances(context)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3560, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
    return meth(self, multiparams, params)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1124, in _execute_clauseelement
    ret = self._execute_context(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1316, in _execute_context
    self._handle_dbapi_exception(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1510, in _handle_dbapi_exception
    util.raise_(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedFunction) function tsq_parse(unknown, unknown) does not exist
LINE 6: ...up_id IN (2, 1, 8)) AND (queries.search_vector @@ tsq_parse(...
                                                             ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

[SQL: SELECT count(*) AS count_1
FROM (SELECT queries.query AS queries_query, queries.updated_at AS queries_updated_at, queries.created_at AS queries_created_at, queries.id AS queries_id, queries.version AS queries_version, queries.org_id AS queries_org_id, queries.data_source_id AS queries_data_source_id, queries.latest_query_data_id AS queries_latest_query_data_id, queries.name AS queries_name, queries.description AS queries_description, queries.query_hash AS queries_query_hash, queries.api_key AS queries_api_key, queries.user_id AS queries_user_id, queries.last_modified_by_id AS queries_last_modified_by_id, queries.is_archived AS queries_is_archived, queries.is_draft AS queries_is_draft, queries.schedule AS queries_schedule, queries.schedule_failures AS queries_schedule_failures, queries.options AS queries_options, queries.search_vector AS queries_search_vector, queries.tags AS queries_tags
FROM queries LEFT OUTER JOIN users ON users.id = queries.user_id LEFT OUTER JOIN query_results ON query_results.id = queries.latest_query_data_id
WHERE queries.id IN (SELECT DISTINCT queries.id AS anon_2
FROM queries JOIN data_source_groups ON queries.data_source_id = data_source_groups.data_source_id
WHERE queries.is_archived IS false AND data_source_groups.group_id IN (%(group_id_1)s, %(group_id_2)s, %(group_id_3)s)) AND (queries.search_vector @@ tsq_parse(%(tsq_parse_1)s, %(tsq_parse_2)s)) ORDER BY ts_rank_cd(queries.search_vector, tsq_parse(%(tsq_parse_3)s)) DESC) AS anon_1]
[parameters: {'group_id_1': 2, 'group_id_2': 1, 'group_id_3': 8, 'tsq_parse_1': 'pg_catalog.simple', 'tsq_parse_2': 'qweq', 'tsq_parse_3': 'qweq'}]
(Background on this error at: http://sqlalche.me/e/13/f405)
[2023-08-21 09:11:29,969][PID:1784][INFO][metrics] method=GET path=/api/queries endpoint=queries status=500 content_type=application/json content_length=36 duration=16.16 query_count=3 query_duration=4.49
@justinclift
Copy link
Member

Thanks, that's pretty clearly a bug. 😄

psycopg2.errors.UndefinedFunction: function tsq_parse(unknown, unknown) does not exist

I'm not seeing any mentions of tsq_parse() in our code base.

@gaecoli @wlach Is this something either of you would be interested in investigating? 😄

@gaecoli
Copy link
Member

gaecoli commented Aug 21, 2023

Thanks, that's pretty clearly a bug. 😄

psycopg2.errors.UndefinedFunction: function tsq_parse(unknown, unknown) does not exist

I'm not seeing any mentions of tsq_parse() in our code base.

@gaecoli @wlach Is this something either of you would be interested in investigating? 😄

OK, i will check it!

@justinclift
Copy link
Member

Thanks @gaecoli. 😄

@gaecoli
Copy link
Member

gaecoli commented Aug 22, 2023

Thanks @gaecoli. 😄

I will try to fix it!

@Krikooo
Copy link

Krikooo commented Sep 11, 2023

Hello there o/
Any news about a potential fix here ? :)

@azundo
Copy link
Contributor

azundo commented Sep 28, 2023

I believe the underlying cause is upgrading sqlalchemy-searchable without adding a migration to add the new required custom sql functions. I've just run this manually and things seem to be fixed. I'm not sure the exact right way to handle this for new installs vs upgrades however:

DROP TYPE IF EXISTS tsq_state CASCADE;

CREATE TYPE tsq_state AS (
    search_query text,
    parentheses_stack int,
    skip_for int,
    current_token text,
    current_index int,
    current_char text,
    previous_char text,
    tokens text[]
);

CREATE OR REPLACE FUNCTION tsq_append_current_token(state tsq_state)
RETURNS tsq_state AS $$
BEGIN
    IF state.current_token != '' THEN
        state.tokens := array_append(state.tokens, state.current_token);
        state.current_token := '';
    END IF;
    RETURN state;
END;
$$ LANGUAGE plpgsql IMMUTABLE;


CREATE OR REPLACE FUNCTION tsq_tokenize_character(state tsq_state)
RETURNS tsq_state AS $$
BEGIN
    IF state.current_char = '(' THEN
        state.tokens := array_append(state.tokens, '(');
        state.parentheses_stack := state.parentheses_stack + 1;
        state := tsq_append_current_token(state);
    ELSIF state.current_char = ')' THEN
        IF (state.parentheses_stack > 0 AND state.current_token != '') THEN
            state := tsq_append_current_token(state);
            state.tokens := array_append(state.tokens, ')');
            state.parentheses_stack := state.parentheses_stack - 1;
        END IF;
    ELSIF state.current_char = '"' THEN
        state.skip_for := position('"' IN substring(
            state.search_query FROM state.current_index + 1
        ));

        IF state.skip_for > 1 THEN
            state.tokens = array_append(
                state.tokens,
                substring(
                    state.search_query
                    FROM state.current_index FOR state.skip_for + 1
                )
            );
        ELSIF state.skip_for = 0 THEN
            state.current_token := state.current_token || state.current_char;
        END IF;
    ELSIF (
        state.current_char = '-' AND
        (state.current_index = 1 OR state.previous_char = ' ')
    ) THEN
        state.tokens := array_append(state.tokens, '-');
    ELSIF state.current_char = ' ' THEN
        state := tsq_append_current_token(state);
    ELSE
        state.current_token = state.current_token || state.current_char;
    END IF;
    RETURN state;
END;
$$ LANGUAGE plpgsql IMMUTABLE;


CREATE OR REPLACE FUNCTION tsq_tokenize(search_query text) RETURNS text[] AS $$
DECLARE
    state tsq_state;
BEGIN
    SELECT
        search_query::text AS search_query,
        0::int AS parentheses_stack,
        0 AS skip_for,
        ''::text AS current_token,
        0 AS current_index,
        ''::text AS current_char,
        ''::text AS previous_char,
        '{}'::text[] AS tokens
    INTO state;

    state.search_query := lower(trim(
        regexp_replace(search_query, '""+', '""', 'g')
    ));

    FOR state.current_index IN (
        SELECT generate_series(1, length(state.search_query))
    ) LOOP
        state.current_char := substring(
            search_query FROM state.current_index FOR 1
        );

        IF state.skip_for > 0 THEN
            state.skip_for := state.skip_for - 1;
            CONTINUE;
        END IF;

        state := tsq_tokenize_character(state);
        state.previous_char := state.current_char;
    END LOOP;
    state := tsq_append_current_token(state);

    state.tokens := array_nremove(state.tokens, '(', -state.parentheses_stack);

    RETURN state.tokens;
END;
$$ LANGUAGE plpgsql IMMUTABLE;


-- Processes an array of text search tokens and returns a tsquery
CREATE OR REPLACE FUNCTION tsq_process_tokens(config regconfig, tokens text[])
RETURNS tsquery AS $$
DECLARE
    result_query text;
    previous_value text;
    value text;
BEGIN
    result_query := '';

    FOREACH value IN ARRAY tokens LOOP
        IF value = '"' THEN
            CONTINUE;
        END IF;

        IF value = 'or' THEN
            value := ' | ';
        END IF;

        IF left(value, 1) = '"' AND right(value, 1) = '"' THEN
            value := phraseto_tsquery(config, value);
        ELSIF value NOT IN ('(', ' | ', ')', '-') THEN
            value := quote_literal(value) || ':*';
        END IF;

        IF previous_value = '-' THEN
            IF value = '(' THEN
                value := '!' || value;
            ELSIF value = ' | ' THEN
                CONTINUE;
            ELSE
                value := '!(' || value || ')';
            END IF;
        END IF;

        SELECT
            CASE
                WHEN result_query = '' THEN value
                WHEN previous_value = ' | ' AND value = ' | ' THEN result_query
                WHEN previous_value = ' | ' THEN result_query || ' | ' || value
                WHEN previous_value IN ('!(', '(') OR value = ')' THEN result_query || value
                WHEN value != ' | ' THEN result_query || ' & ' || value
                ELSE result_query
            END
        INTO result_query;

        IF result_query = ' | ' THEN
            result_query := '';
        END IF;

        previous_value := value;
    END LOOP;

    RETURN to_tsquery(config, result_query);
END;
$$ LANGUAGE plpgsql IMMUTABLE;


CREATE OR REPLACE FUNCTION tsq_process_tokens(tokens text[])
RETURNS tsquery AS $$
    SELECT tsq_process_tokens(get_current_ts_config(), tokens);
$$ LANGUAGE SQL IMMUTABLE;


CREATE OR REPLACE FUNCTION tsq_parse(config regconfig, search_query text)
RETURNS tsquery AS $$
    SELECT tsq_process_tokens(config, tsq_tokenize(search_query));
$$ LANGUAGE SQL IMMUTABLE;


CREATE OR REPLACE FUNCTION tsq_parse(config text, search_query text)
RETURNS tsquery AS $$
    SELECT tsq_parse(config::regconfig, search_query);
$$ LANGUAGE SQL IMMUTABLE;


CREATE OR REPLACE FUNCTION tsq_parse(search_query text) RETURNS tsquery AS $$
    SELECT tsq_parse(get_current_ts_config(), search_query);
$$ LANGUAGE SQL IMMUTABLE;


-- remove first N elements equal to the given value from the array (array
-- must be one-dimensional)
--
-- If negative value is given as the third argument the removal of elements
-- starts from the last array element.
CREATE OR REPLACE FUNCTION array_nremove(anyarray, anyelement, int)
RETURNS ANYARRAY AS $$
    WITH replaced_positions AS (
        SELECT UNNEST(
            CASE
            WHEN $2 IS NULL THEN
                '{}'::int[]
            WHEN $3 > 0 THEN
                (array_positions($1, $2))[1:$3]
            WHEN $3 < 0 THEN
                (array_positions($1, $2))[
                    (cardinality(array_positions($1, $2)) + $3 + 1):
                ]
            ELSE
                '{}'::int[]
            END
        ) AS position
    )
    SELECT COALESCE((
        SELECT array_agg(value)
        FROM unnest($1) WITH ORDINALITY AS t(value, index)
        WHERE index NOT IN (SELECT position FROM replaced_positions)
    ), $1[1:0]);
$$ LANGUAGE SQL IMMUTABLE;

Figured this out based on this thread: falcony-io/sqlalchemy-searchable#67 and then looking at the expressions.sql file for the pinned version of sqlalchemy-searchable: https://github.com/falcony-io/sqlalchemy-searchable/blob/1.2.0/sqlalchemy_searchable/expressions.sql

@justinclift
Copy link
Member

Thanks @azundo that's AWESOME!

We had to upgrade SQLAlchemy-searchable (no real choice) in order to fix some important security issues (part of a chain of dependencies), and to get that working in short order we just "made it work" in a temporary way with a plan to look at it again properly later.

... which we've not gotten to yet. 😇

We should try turning that code into a proper migration (via PR), and test it out. Sounds like a fairly good starting point anyway.

Actually, are you any good with creating PR's and similar? If you want to throw that into a suitable PR that'd be welcome too. If not, we can do it. 😄

@azundo
Copy link
Contributor

azundo commented Sep 30, 2023

I haven't been able to test this - struggling to get a local dev environment set up fully on my Mac - but I think this branch should do it. If you can test it and make sure I've generated the migration right and it runs then I can submit a PR or feel free to just steal it and merge it.

https://github.com/azundo/redash/tree/fix-sqlalchemy-searchable-upgrade

PR is here: #6491

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants