[FEATURE] Add mean to metrics API #10961

billdirks · 2025-02-21T22:38:16Z

Description of PR changes above includes a link to an existing GitHub issue
PR title is prefixed with one of: [BUGFIX], [FEATURE], [DOCS], [MAINTENANCE], [CONTRIB]
Code is linted - run invoke lint (uses ruff format + ruff check)
Appropriate tests and docs have been updated

For more information about contributing, visit our community resources.

After you submit your PR, keep the page open and monitor the statuses of the various checks made by our continuous integration process at the bottom of the page. Please fix any issues that come up and reach out on Slack if you need help. Thanks for contributing!

netlify · 2025-02-21T22:38:32Z

✅ Deploy Preview for niobium-lead-7998 canceled.

Name	Link
🔨 Latest commit	`3be1518`
🔍 Latest deploy log	https://app.netlify.com/sites/niobium-lead-7998/deploys/67bd0730df21f9000809169e

for more information, see https://pre-commit.ci

codecov · 2025-02-21T22:40:58Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.45%. Comparing base (d4dc22d) to head (355b027).

✅ All tests successful. No failed tests found.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop   #10961      +/-   ##
===========================================
- Coverage    80.84%   78.45%   -2.39%     
===========================================
  Files          471      472       +1     
  Lines        40790    40797       +7     
===========================================
- Hits         32976    32008     -968     
- Misses        7814     8789     +975
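
The headline percentages can be reproduced from the hit and line counts in the diff above. A minimal Python sketch, assuming coverage is hits divided by lines and that the report truncates (rather than rounds) to two decimals. The truncation is inferred from the displayed figures, not stated in the report, and the function names are hypothetical:

```python
def coverage_basis_points(hits: int, lines: int) -> int:
    """Return coverage in hundredths of a percent, truncated (an assumption)."""
    if lines <= 0:
        raise ValueError("lines must be positive")
    # Integer arithmetic avoids floating-point rounding surprises.
    return hits * 10_000 // lines


def format_percent(basis_points: int) -> str:
    """Render hundredths of a percent as a signed percentage string."""
    sign = "-" if basis_points < 0 else ""
    whole, frac = divmod(abs(basis_points), 100)
    return f"{sign}{whole}.{frac:02d}%"


base = coverage_basis_points(hits=32976, lines=40790)  # develop
head = coverage_basis_points(hits=32008, lines=40797)  # this PR

print(format_percent(base))         # 80.84%
print(format_percent(head))         # 78.45%
print(format_percent(head - base))  # -2.39%
```

Only 7 lines were added while hits fell by 968, so the new lines alone cannot account for the drop.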

Flag	Coverage Δ
3.10	`70.25% <100.00%> (+0.02%)`	⬆️
3.10 athena or openpyxl or pyarrow or project or sqlite or aws_creds	`56.56% <100.00%> (?)`
3.10 aws_deps	`46.52% <100.00%> (?)`
3.10 big	`54.97% <100.00%> (?)`
3.10 clickhouse	`43.42% <100.00%> (?)`
3.10 filesystem	`63.01% <100.00%> (?)`
3.10 mssql	`51.52% <100.00%> (?)`
3.10 mysql	`51.90% <100.00%> (?)`
3.10 spark_connect	`46.84% <100.00%> (?)`
3.11	`70.25% <100.00%> (+0.02%)`	⬆️
3.11 athena or openpyxl or pyarrow or project or sqlite or aws_creds	`56.56% <100.00%> (?)`
3.11 aws_deps	`46.52% <100.00%> (?)`
3.11 big	`54.97% <100.00%> (?)`
3.11 clickhouse	`43.42% <100.00%> (?)`
3.11 filesystem	`63.01% <100.00%> (?)`
3.11 mssql	`51.52% <100.00%> (?)`
3.11 mysql	`51.90% <100.00%> (?)`
3.11 postgresql	`54.64% <100.00%> (?)`
3.11 spark_connect	`46.84% <100.00%> (?)`
3.12	`70.25% <100.00%> (+0.01%)`	⬆️
3.12 athena or openpyxl or pyarrow or project or sqlite or aws_creds	`56.56% <100.00%> (+<0.01%)`	⬆️
3.12 aws_deps	`46.52% <100.00%> (+<0.01%)`	⬆️
3.12 big	`54.96% <100.00%> (+<0.01%)`	⬆️
3.12 bigquery	`?`
3.12 databricks	`?`
3.12 filesystem	`63.01% <100.00%> (+<0.01%)`	⬆️
3.12 mssql	`51.52% <100.00%> (+<0.01%)`	⬆️
3.12 mysql	`51.90% <100.00%> (+<0.01%)`	⬆️
3.12 postgresql	`54.65% <100.00%> (+0.03%)`	⬆️
3.12 snowflake	`?`
3.12 spark	`?`
3.12 spark_connect	`?`
3.12 trino	`?`
3.9	`70.28% <100.00%> (+0.02%)`	⬆️
3.9 athena or openpyxl or pyarrow or project or sqlite or aws_creds	`56.56% <100.00%> (+<0.01%)`	⬆️
3.9 aws_deps	`46.54% <100.00%> (+<0.01%)`	⬆️
3.9 big	`54.98% <100.00%> (+<0.01%)`	⬆️
3.9 bigquery	`?`
3.9 clickhouse	`43.44% <100.00%> (+<0.01%)`	⬆️
3.9 databricks	`?`
3.9 filesystem	`63.02% <100.00%> (+<0.01%)`	⬆️
3.9 mssql	`51.51% <100.00%> (+<0.01%)`	⬆️
3.9 mysql	`51.88% <100.00%> (+<0.01%)`	⬆️
3.9 postgresql	`54.63% <100.00%> (+0.03%)`	⬆️
3.9 snowflake	`?`
3.9 spark	`?`
3.9 spark_connect	`46.85% <100.00%> (+<0.01%)`	⬆️
3.9 trino	`?`
cloud	`?`
docs-basic	`?`
docs-creds-needed	`?`
docs-spark	`52.47% <100.00%> (+<0.01%)`	⬆️

Flags with carried forward coverage won't be shown.

billdirks · 2025-02-21T22:59:19Z

tests/metrics/column_aggregate/test_column_aggregate.py

+    batch_setup = request.getfixturevalue(setup_datasource)
+    breakpoint()
+    with batch_setup.batch_test_context() as batch:
+        metric = ColumnValuesMean(batch_id=batch.id, column="number")


I will file a ticket to remove the batch_id argument when instantiating a metric.

https://greatexpectations.atlassian.net/browse/GX-441
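
To illustrate why the batch_id argument is a candidate for removal, here is a minimal, self-contained sketch. It does not use the Great Expectations API: `MeanMetric`, `MeanResult`, `InMemoryBatch`, and `compute` are hypothetical names invented for this example. The point is only that when a batch computes the metric, the batch already knows its own id, so the caller does not need to repeat it.

```python
from dataclasses import dataclass
from statistics import fmean
from typing import Sequence


@dataclass(frozen=True)
class MeanMetric:
    """Hypothetical metric request: names a column, carries no batch_id."""
    column: str


@dataclass(frozen=True)
class MeanResult:
    """Hypothetical result that records which batch produced the value."""
    batch_id: str
    column: str
    value: float


class InMemoryBatch:
    """Hypothetical stand-in for a batch; not a Great Expectations class."""

    def __init__(self, batch_id: str, rows: Sequence[dict]) -> None:
        self.id = batch_id
        self._rows = list(rows)

    def compute(self, metric: MeanMetric) -> MeanResult:
        # Skip missing values, as SQL AVG and pandas mean() do by default.
        values = [
            row[metric.column]
            for row in self._rows
            if row.get(metric.column) is not None
        ]
        if not values:
            raise ValueError(f"no values in column {metric.column!r}")
        # The batch supplies its own id; the caller never passes it.
        return MeanResult(batch_id=self.id, column=metric.column, value=fmean(values))


batch = InMemoryBatch(
    "batch-1",
    [{"number": 1}, {"number": 2}, {"number": None}, {"number": 6}],
)
result = batch.compute(MeanMetric(column="number"))
print(result.value)  # 3.0
```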

NathanFarmer · 2025-02-24T15:59:52Z

tests/metrics/column_aggregate/test_column_aggregate.py

+@pytest.fixture
+def setup_spark(dataframe: pandas.DataFrame, tmp_path: Path) -> SparkFilesystemCsvBatchTestSetup:
+    return SparkFilesystemCsvBatchTestSetup(
+        config=SparkFilesystemCsvDatasourceTestConfig(),
+        data=dataframe,
+        base_dir=tmp_path,
+    )


You may need to pass column_types to SparkFilesystemCsvDatasourceTestConfig to get the Spark error test passing. Maybe it is coercing the values somehow; I'm not sure why it doesn't fail.

It looks like a bug: we use a different metric name when the mean fails on Spark. See the inline comment on the failure test below.

billdirks · 2025-02-25T00:01:29Z

tests/metrics/column_aggregate/test_column_aggregate.py

+
+
+@pytest.fixture
+def setup_snowflake(dataframe: pandas.DataFrame) -> SnowflakeBatchTestSetup:


I've added Snowflake tests since Snowflake is already supported in ExpectAI.

Add mean metric

1fd7f73

[pre-commit.ci] auto fixes from pre-commit.com hooks

67e18b0

for more information, see https://pre-commit.ci

billdirks commented Feb 21, 2025

Remove stray breakpoint

88da9df

NathanFarmer requested changes Feb 24, 2025

billdirks and others added 3 commits February 24, 2025 15:46

Ignore spark failure test.

355b027

Add snowflake tests.

48e3573

[pre-commit.ci] auto fixes from pre-commit.com hooks

3be1518

for more information, see https://pre-commit.ci

billdirks commented Feb 25, 2025
