[FEATURE] Add mean to metrics API #10961

Open
wants to merge 6 commits into
base: develop

Conversation

billdirks
Contributor

  • Description of PR changes above includes a link to an existing GitHub issue
  • PR title is prefixed with one of: [BUGFIX], [FEATURE], [DOCS], [MAINTENANCE], [CONTRIB]
  • Code is linted - run invoke lint (uses ruff format + ruff check)
  • Appropriate tests and docs have been updated

For more information about contributing, visit our community resources.

After you submit your PR, keep the page open and monitor the statuses of the various checks made by our continuous integration process at the bottom of the page. Please fix any issues that come up and reach out on Slack if you need help. Thanks for contributing!

netlify bot commented Feb 21, 2025

Deploy Preview for niobium-lead-7998 canceled.

Name Link
🔨 Latest commit 3be1518
🔍 Latest deploy log https://app.netlify.com/sites/niobium-lead-7998/deploys/67bd0730df21f9000809169e

codecov bot commented Feb 21, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 78.45%. Comparing base (d4dc22d) to head (355b027).

✅ All tests successful. No failed tests found.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop   #10961      +/-   ##
===========================================
- Coverage    80.84%   78.45%   -2.39%     
===========================================
  Files          471      472       +1     
  Lines        40790    40797       +7     
===========================================
- Hits         32976    32008     -968     
- Misses        7814     8789     +975     
Flag Coverage Δ
3.10 70.25% <100.00%> (+0.02%) ⬆️
3.10 athena or openpyxl or pyarrow or project or sqlite or aws_creds 56.56% <100.00%> (?)
3.10 aws_deps 46.52% <100.00%> (?)
3.10 big 54.97% <100.00%> (?)
3.10 clickhouse 43.42% <100.00%> (?)
3.10 filesystem 63.01% <100.00%> (?)
3.10 mssql 51.52% <100.00%> (?)
3.10 mysql 51.90% <100.00%> (?)
3.10 spark_connect 46.84% <100.00%> (?)
3.11 70.25% <100.00%> (+0.02%) ⬆️
3.11 athena or openpyxl or pyarrow or project or sqlite or aws_creds 56.56% <100.00%> (?)
3.11 aws_deps 46.52% <100.00%> (?)
3.11 big 54.97% <100.00%> (?)
3.11 clickhouse 43.42% <100.00%> (?)
3.11 filesystem 63.01% <100.00%> (?)
3.11 mssql 51.52% <100.00%> (?)
3.11 mysql 51.90% <100.00%> (?)
3.11 postgresql 54.64% <100.00%> (?)
3.11 spark_connect 46.84% <100.00%> (?)
3.12 70.25% <100.00%> (+0.01%) ⬆️
3.12 athena or openpyxl or pyarrow or project or sqlite or aws_creds 56.56% <100.00%> (+<0.01%) ⬆️
3.12 aws_deps 46.52% <100.00%> (+<0.01%) ⬆️
3.12 big 54.96% <100.00%> (+<0.01%) ⬆️
3.12 bigquery ?
3.12 databricks ?
3.12 filesystem 63.01% <100.00%> (+<0.01%) ⬆️
3.12 mssql 51.52% <100.00%> (+<0.01%) ⬆️
3.12 mysql 51.90% <100.00%> (+<0.01%) ⬆️
3.12 postgresql 54.65% <100.00%> (+0.03%) ⬆️
3.12 snowflake ?
3.12 spark ?
3.12 spark_connect ?
3.12 trino ?
3.9 70.28% <100.00%> (+0.02%) ⬆️
3.9 athena or openpyxl or pyarrow or project or sqlite or aws_creds 56.56% <100.00%> (+<0.01%) ⬆️
3.9 aws_deps 46.54% <100.00%> (+<0.01%) ⬆️
3.9 big 54.98% <100.00%> (+<0.01%) ⬆️
3.9 bigquery ?
3.9 clickhouse 43.44% <100.00%> (+<0.01%) ⬆️
3.9 databricks ?
3.9 filesystem 63.02% <100.00%> (+<0.01%) ⬆️
3.9 mssql 51.51% <100.00%> (+<0.01%) ⬆️
3.9 mysql 51.88% <100.00%> (+<0.01%) ⬆️
3.9 postgresql 54.63% <100.00%> (+0.03%) ⬆️
3.9 snowflake ?
3.9 spark ?
3.9 spark_connect 46.85% <100.00%> (+<0.01%) ⬆️
3.9 trino ?
cloud ?
docs-basic ?
docs-creds-needed ?
docs-spark 52.47% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.


batch_setup = request.getfixturevalue(setup_datasource)
breakpoint()
with batch_setup.batch_test_context() as batch:
    metric = ColumnValuesMean(batch_id=batch.id, column="number")
Contributor Author

I will file a ticket to remove the batch_id argument when instantiating a metric.

Comment on lines +45 to +51
@pytest.fixture
def setup_spark(dataframe: pandas.DataFrame, tmp_path: Path) -> SparkFilesystemCsvBatchTestSetup:
    return SparkFilesystemCsvBatchTestSetup(
        config=SparkFilesystemCsvDatasourceTestConfig(),
        data=dataframe,
        base_dir=tmp_path,
    )
Contributor

You may need to pass column_types to SparkFilesystemCsvDatasourceTestConfig to get the Spark error test passing. Maybe the values are being coerced somehow; I'm not sure why it doesn't fail.
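
To illustrate the kind of coercion question raised here, the following is a minimal standard-library sketch, not Spark or Great Expectations code. It shows that values read back from a CSV file are plain strings unless a type is declared for the column. The helper `read_column` and the shape of its `column_types` mapping are hypothetical and only loosely mirror the `column_types` option mentioned above.

```python
import csv
import io
from statistics import mean

# Hypothetical sample data standing in for the test dataframe written to CSV.
CSV_TEXT = "number,name\n1,a\n2,b\n3,c\n"


def read_column(csv_text, column, column_types=None):
    """Read one column from CSV text.

    Values are cast with the type declared in column_types (a hypothetical
    mapping of column name to a callable); without a declaration they stay
    as the strings the CSV reader produces.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    cast = (column_types or {}).get(column, str)
    return [cast(row[column]) for row in reader]


# Without declared types, every value is a string.
untyped = read_column(CSV_TEXT, "number")

# With a declared type, the column is numeric and a mean is well defined.
typed = read_column(CSV_TEXT, "number", {"number": int})
typed_mean = mean(typed)
```

Whether a mean over an undeclared column fails or silently succeeds depends on how the backend infers or coerces types, which is why declaring types explicitly in a CSV-backed test can make a failure case deterministic.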

Contributor Author

This looks like a bug: we use a different metric name when the mean fails on Spark. See the inline comment on the failure test below.



@pytest.fixture
def setup_snowflake(dataframe: pandas.DataFrame) -> SnowflakeBatchTestSetup:
Contributor Author

I've added Snowflake tests since Snowflake is already supported in ExpectAI.

2 participants