Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix pgconn cancel query through the PGBouncer #1765

Merged
merged 3 commits into from
Oct 14, 2023

Conversation

levakin
Copy link
Contributor

@levakin levakin commented Oct 10, 2023

Closes #1764

@levakin levakin changed the title Fix pgconn cancel query Fix pgconn cancel query through the PGBouncer Oct 10, 2023
@jackc
Copy link
Owner

jackc commented Oct 13, 2023

As far as I can tell, the current behavior is still within acceptable behavior for PostgreSQL. The change this reverts fails on Windows. I'm not sure exactly why. @drakkan worked on this change. Perhaps they could elaborate.

But it is surprising that not reading could be a problem -- there are no bytes to read. The only reason the code originally read was to wait for the server to close the connection.

@levakin
Copy link
Contributor Author

levakin commented Oct 13, 2023

pgconn_test.go:2253: 
        	Error Trace:	D:\a\pgx\pgx\pgconn\pgconn_test.go:2253
        	Error:      	Received unexpected error:
        	            	read tcp [::1]:56732->[::1]:5432: wsarecv: An existing connection was forcibly closed by the remote host.
        	Test:       	TestConnCancelRequest

@drakkan network seems to work differently on windows. Any ideas why?

@drakkan
Copy link
Contributor

drakkan commented Oct 13, 2023

Hi,

This change was made to fix test cases on Windows. I thought the error was related to the raw system calls used at the time, but I see this happening even now that we no longer have platform-specific code

@levakin
Copy link
Contributor Author

levakin commented Oct 13, 2023

hi, @jackc I've added handling of of this specific error. What do you think of it? Can you please run workflow again?

@levakin
Copy link
Contributor Author

levakin commented Oct 13, 2023

@jackc

The only reason the code originally read was to wait for the server to close the connection.

That's exactly the reason I want to revert this commit.
My current understanding is that due to the usage of PGbouncer, the connection is not fast enough to deliver the instruction and defer cancelConn.Close() closes the connection earlier than the instruction is delivered. And the query keeps running as a result.

Here's the pgbouncer log:

kpis-pgbouncer-shard1-1  | 2023-10-09 14:42:51.238 UTC [1] LOG S-0x5631ef4891d0: shard1/[email protected]:5432 new connection to server (from 172.22.0.6:33088)
kpis-pgbouncer-shard1-1  | 2023-10-09 14:42:54.005 UTC [1] LOG C-0x5631ef481e80: shard1/[email protected]:57294 closing because: client close request (age=345s)
kpis-pgbouncer-shard1-1  | 2023-10-09 14:42:54.005 UTC [1] LOG S-0x5631ef4891d0: shard1/[email protected]:5432 closing because: client disconnect while server was not ready (age=2s)
kpis-pgbouncer-shard1-1  | 2023-10-09 14:42:54.005 UTC [1] LOG C-0x5631ef481e80: (nodb)/(nouser)@172.22.0.1:38600 closing because: failed cancel request (age=0s)
kpis-pgbouncer-shard1-1  | 2023-10-09 14:42:54.506 UTC [1] LOG C-0x5631ef481e80: shard1/[email protected]:38606 login attempt: db=shard1 user=postgres tls=no
kpis-pgbouncer-shard1-1  | 2023-10-09 14:42:54.508 UTC [1] LOG S-0x5631ef4891d0: shard1/[email protected]:5432 new connection to server (from 172.22.0.6:33094)

@drakkan
Copy link
Contributor

drakkan commented Oct 13, 2023

@jackc

The only reason the code originally read was to wait for the server to close the connection.

That's exactly the reason I want to revert this commit. My current understanding is that due to the usage of PGbouncer, the connection is not fast enough to deliver the instruction and defer cancelConn.Close() closes the connection earlier than the instruction is delivered. And the query keeps running as a result.

Here's the pgbouncer log:

kpis-pgbouncer-shard1-1  | 2023-10-09 14:42:51.238 UTC [1] LOG S-0x5631ef4891d0: shard1/[email protected]:5432 new connection to server (from 172.22.0.6:33088)
kpis-pgbouncer-shard1-1  | 2023-10-09 14:42:54.005 UTC [1] LOG C-0x5631ef481e80: shard1/[email protected]:57294 closing because: client close request (age=345s)
kpis-pgbouncer-shard1-1  | 2023-10-09 14:42:54.005 UTC [1] LOG S-0x5631ef4891d0: shard1/[email protected]:5432 closing because: client disconnect while server was not ready (age=2s)
kpis-pgbouncer-shard1-1  | 2023-10-09 14:42:54.005 UTC [1] LOG C-0x5631ef481e80: (nodb)/(nouser)@172.22.0.1:38600 closing because: failed cancel request (age=0s)
kpis-pgbouncer-shard1-1  | 2023-10-09 14:42:54.506 UTC [1] LOG C-0x5631ef481e80: shard1/[email protected]:38606 login attempt: db=shard1 user=postgres tls=no
kpis-pgbouncer-shard1-1  | 2023-10-09 14:42:54.508 UTC [1] LOG S-0x5631ef4891d0: shard1/[email protected]:5432 new connection to server (from 172.22.0.6:33094)

in this case I think we can just read and ignore the error but let's see what @jackc thinks

I mean something like this

cancelConn.Read(buf)

and always return a nil error.
A comment explaining the reason for this would prevent others from doing my same mistake 😄

@za-arthur
Copy link

FYI here is how Postgres libpq sends cancel request:
https://github.com/postgres/postgres/blob/REL_16_0/src/interfaces/libpq/fe-connect.c#L4946-L4960

Looks like it waits for a response from a server and ignores all errors. AFAIK EINTR can be ignored by pgx since go runtime handles it.

@levakin levakin force-pushed the fix-pgconn-cancel-query branch from 4fc9e7d to 9eebd44 Compare October 13, 2023 17:31
@jackc
Copy link
Owner

jackc commented Oct 14, 2023

LGTM.

@jackc jackc merged commit 304697d into jackc:master Oct 14, 2023
11 of 16 checks passed
@levakin levakin deleted the fix-pgconn-cancel-query branch November 19, 2023 17:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pgx doesn't cancel query correctly through pgbouncer
4 participants