Expected behavior
When executing a long-running query (e.g., ALTER TABLE ... EXECUTE optimize), the Python client should wait for Trino to finish with low resource consumption — ideally using long-polling (maxWait parameter on next_uri) or a configurable backoff interval between polls. The JDBC driver already supports this.
Actual behavior
The Python client enters a tight polling loop with zero delay between successive fetch() calls, saturating a full CPU core for the entire duration of the query. This happens even though the client has no real work to do — it is simply waiting for Trino to finish.
Steps To Reproduce
- Start a Trino server with an Iceberg table that has many partitions.
- Run a long-running DDL via the Python client:
    from trino.dbapi import connect
    conn = connect(host="...", port=8080, user="...", catalog="iceberg", schema="...")
    cur = conn.cursor()
    cur.execute("ALTER TABLE \"iceberg\".\"db\".\"large_table\" EXECUTE optimize(file_size_threshold => '256MB')")
    cur.fetchall()
- Observe the Python process CPU usage during execution — it stays at ~100% of a single core until the query completes.
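To put a number on the CPU usage, one option is to compare process CPU time against wall-clock time around the blocking call. The helper below is a minimal sketch using only the standard library; `cpu_utilization` is a hypothetical name and not part of the Trino client. A ratio near 1.0 means roughly one core was busy for the whole call, while a ratio near 0 means the process was mostly waiting.

```python
import time


def cpu_utilization(fn, *args, **kwargs):
    """Run fn and return (result, cpu_seconds / wall_seconds) for this process."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = fn(*args, **kwargs)
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    # Guard against a zero-length interval on very fast calls.
    ratio = cpu_used / wall_used if wall_used > 0 else 0.0
    return result, ratio
```

For the reproduction above, the call to wrap would be `cpu_utilization(cur.execute, sql)`, where `sql` holds the ALTER TABLE statement.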
Root Cause
There are two tight polling loops in trino/client.py that call fetch() with no delay between iterations:
TrinoQuery.execute(): blocks until at least one row arrives or the query finishes
    while not self.finished and not self.cancelled and len(self._result.rows) == 0:
        self._result.rows += self.fetch()  # no sleep, no backoff
TrinoResult.__iter__(): iterates until the query finishes and all rows are consumed
    while not self._query.finished or self._rows is not None:
        next_rows = self._query.fetch() if not self._query.finished else None
        ...
        self._rows = next_rows  # no sleep between fetches
Each fetch() issues an HTTP GET to next_uri. For a long-running query, Trino responds almost instantly with a new next_uri pointing to the same in-progress state. The Python client immediately issues another request — no backoff, no minimum interval — burning CPU in a sub-millisecond HTTP request loop for the entire duration of the query.
DDL statements like ALTER TABLE EXECUTE optimize return zero data rows until completion. The execute() loop condition len(self._result.rows) == 0 stays true for minutes, so the tight loop runs uninterrupted.
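One way to remove the busy-wait would be to sleep between polls that return no rows, growing the delay up to a cap. The following is a minimal sketch of that idea, not the client's actual code: `poll_until_rows`, `fetch`, `is_finished`, and the delay parameters are hypothetical names standing in for the `execute()` loop and its state.

```python
import time


def poll_until_rows(fetch, is_finished, initial_delay=0.01, max_delay=1.0,
                    factor=2.0, sleep=time.sleep):
    """Call fetch() until it yields rows or is_finished() becomes true.

    Sleeps between empty polls, multiplying the delay by `factor` up to
    `max_delay`. `fetch` and `is_finished` are stand-ins (assumptions) for
    TrinoQuery.fetch() and the finished/cancelled checks.
    """
    rows = []
    delay = initial_delay
    while not is_finished() and not rows:
        rows += fetch()
        if rows or is_finished():
            break
        sleep(delay)  # yield the CPU instead of re-polling immediately
        delay = min(delay * factor, max_delay)
    return rows
```

Starting with a small delay keeps short queries responsive, while the cap bounds how long the client can lag behind a query that has just finished. The `sleep` parameter exists only so the loop can be exercised without real waiting.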
The underlying issue is that the Trino protocol's next_uri serves a dual purpose: (a) an ACK that advances query processing, and (b) status polling. The client treats both identically, with no delay. The protocol itself supports maxWait on next_uri to enable long-polling, but the Python client does not use it.
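For the long-polling route, the client would need to add a `maxWait` query parameter to each `next_uri` before requesting it, so that the server can hold the request until there is new state or the wait expires. The sketch below only shows the URL manipulation with the standard library; `with_max_wait` is a hypothetical helper, and the `"1s"` duration value is an assumption for illustration rather than confirmed client behavior.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_max_wait(next_uri, max_wait="1s"):
    """Return next_uri with maxWait set, preserving other query parameters."""
    parts = urlsplit(next_uri)
    # Drop any existing maxWait so the parameter is not duplicated.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "maxWait"]
    query.append(("maxWait", max_wait))  # assumed duration format, e.g. "1s"
    return urlunsplit(parts._replace(query=urlencode(query)))
```

With long-polling, most of the waiting happens on the server side, so a client-side sleep between polls would only matter as a fallback.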
Log output
No response
Operating System
PRETTY_NAME="Ubuntu 24.04 LTS"
Trino Python client version
0.337.0
Trino Server version
479
Python version
3.13
Are you willing to submit PR?