Add profile events support #455
base: master
@@ -142,3 +143,14 @@ def store_progress(self, progress):

    def store_elapsed(self, elapsed):
        self.elapsed = elapsed

    def store_profile_events(self, packet):
        data = QueryResult([packet]).get_result()
Currently we have 'static' attributes that store statistics (see https://clickhouse-driver.readthedocs.io/en/latest/features.html#query-execution-statistics): client.last_query.progress.total_rows, client.last_query.progress.total_bytes, etc.
I'd prefer to store these statistics the same way if possible: client.last_query.stats.select_query, client.last_query.stats.selected_rows.
I use the contents of this data to analyze queries, and they can differ a lot. The metrics vary with the query type and even with the engine of the queried table. I guess the server version may have an effect too (I only use v23 at the moment). So there is no stable list of metrics, and the number of options is large (possibly more than 100 in the end).
I found that about 20 of them are the most common across queries and the most interesting. I use a pydantic model to get them:
class ClickhouseStats(pydantic.BaseModel):
    elapsed: int = pydantic.Field(alias="elapsed")
    is_insert: int | None = pydantic.Field(alias="InsertQuery", default=None)
    read_bytes: int | None = pydantic.Field(alias="ReadCompressedBytes", default=None)
    write_bytes: int | None = pydantic.Field(alias="WriteBufferFromFileDescriptorWriteBytes", default=None)
    network_recv_bytes: int | None = pydantic.Field(alias="NetworkReceiveBytes", default=None)
    network_recv_time: int | None = pydantic.Field(alias="NetworkReceiveElapsedMicroseconds", default=None)
    network_send_bytes: int | None = pydantic.Field(alias="NetworkSendBytes", default=None)
    network_send_time: int | None = pydantic.Field(alias="NetworkSendElapsedMicroseconds", default=None)
    memory_usage: int | None = pydantic.Field(alias="MemoryTrackerUsage", default=None)
    memory_peak: int | None = pydantic.Field(alias="MemoryTrackerPeakUsage", default=None)
    file_open: int | None = pydantic.Field(alias="FileOpen", default=None)
    function_execute: int | None = pydantic.Field(alias="FunctionExecute", default=None)
    write_time: int | None = pydantic.Field(alias="DiskWriteElapsedMicroseconds", default=None)
    insert_rows: int | None = pydantic.Field(alias="InsertedRows", default=None)
    insert_bytes: int | None = pydantic.Field(alias="InsertedBytes", default=None)
    select_rows: int | None = pydantic.Field(alias="SelectedRows", default=None)
    select_bytes: int | None = pydantic.Field(alias="SelectedBytes", default=None)
    insert_parts: int | None = pydantic.Field(alias="InsertedCompactParts", default=None)
    real_time: int | None = pydantic.Field(alias="RealTimeMicroseconds", default=None)
    system_time: int | None = pydantic.Field(alias="SystemTimeMicroseconds", default=None)

    def __init__(self, result: CursorResult | None = None, query_info: QueryInfo | None = None):
        if query_info is None:
            query_info: QueryInfo = result.context.query_info
        super().__init__(elapsed=int(query_info.elapsed * 1000), **(query_info.stats or {}))  # TODO: 1000?
I can add them here (without pydantic), but wouldn't that be too much?
Yep. It's too much. Dict will be fine.
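For anyone following the thread, here is a minimal sketch of how a caller might pull the common metrics out of a plain dict without pydantic. It assumes the stats arrive as a dict keyed by event name, as in the model above. `COMMON_EVENTS` and `pick_common_stats` are hypothetical names for illustration and are not part of clickhouse-driver.

```python
# Hypothetical helper for illustration; not part of clickhouse-driver.
# Maps a friendly field name to the profile event name used in the thread.
COMMON_EVENTS = {
    "read_bytes": "ReadCompressedBytes",
    "network_recv_bytes": "NetworkReceiveBytes",
    "network_send_bytes": "NetworkSendBytes",
    "memory_peak": "MemoryTrackerPeakUsage",
    "insert_rows": "InsertedRows",
    "select_rows": "SelectedRows",
    "select_bytes": "SelectedBytes",
    "real_time": "RealTimeMicroseconds",
}


def pick_common_stats(stats):
    """Return the common metrics from a profile-events dict.

    Events that the server did not report map to None, because the
    set of events varies with the query type and table engine.
    """
    stats = stats or {}
    return {field: stats.get(event) for field, event in COMMON_EVENTS.items()}


# Assumed shape of the stored dict: {event_name: value}.
example = {"SelectedRows": 10, "SelectedBytes": 80, "FileOpen": 3}
common = pick_common_stats(example)
print(common["select_rows"], common["insert_rows"])
```

Keeping the raw dict in the driver and leaving this kind of selection to the caller avoids hard-coding a metric list that may change between server versions.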
Adds support for profile events over the native protocol. They carry many different metrics, such as network timings, locks, and memory usage, which can be very helpful for debugging and monitoring queries.
Not sure whether the docs need to be updated.
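As a rough illustration of the idea, the sketch below folds profile-event rows into a single dict. It is not the code in this PR. The row layout (`host_name`, `current_time`, `thread_id`, `type`, `name`, `value`) and the `"increment"`/`"gauge"` labels are assumptions that should be checked against the server in use, and `ProfileEventRow` and `merge_profile_events` are hypothetical names.

```python
from collections import namedtuple

# Assumed row layout of a profile-events block; verify against your server.
ProfileEventRow = namedtuple(
    "ProfileEventRow",
    ["host_name", "current_time", "thread_id", "type", "name", "value"],
)

# Assumed type labels; the driver may decode the type column differently.
INCREMENT = "increment"  # counter-style event, values add up
GAUGE = "gauge"          # point-in-time value, latest one wins


def merge_profile_events(stats, rows):
    """Fold rows into stats (a dict of event name -> value) and return it."""
    for row in rows:
        if row.type == GAUGE:
            # Gauges such as memory usage are overwritten by the newest row.
            stats[row.name] = row.value
        else:
            # Everything else is treated as an increment and accumulated.
            stats[row.name] = stats.get(row.name, 0) + row.value
    return stats


rows = [
    ProfileEventRow("host", 0, 1, INCREMENT, "SelectedRows", 5),
    ProfileEventRow("host", 0, 2, INCREMENT, "SelectedRows", 7),
    ProfileEventRow("host", 0, 1, GAUGE, "MemoryTrackerUsage", 1024),
    ProfileEventRow("host", 1, 1, GAUGE, "MemoryTrackerUsage", 2048),
]
print(merge_profile_events({}, rows))
```

Whether increments should be summed across threads and packets depends on how the server reports them, so treat the accumulation rule here as one possible choice.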
Checklist:
- Run flake8 and fix issues.
- Run pytest; no tests failed. See https://clickhouse-driver.readthedocs.io/en/latest/development.html.