Releases: dagster-io/dagster
1.7.7 (core) / 0.23.7 (libraries)
New
- [ui] Command clicking on nodes in the asset lineage tab will now open them in a separate tab. Same with external asset links in the asset graph.
- Added support for setting a custom job namespace in user code deployments. (thanks @tmatthews0020!)
- Removed warnings due to use of `datetime.utcfromtimestamp` (thanks @dbrtly!)
- Custom smtp user can now be used for e-mail alerts (thanks @edsoncezar16!)
- [dagster-dbt] Added support for `dbt-core==1.8.*`.
- [dagster-embedded-elt] Failed dlt pipelines are now accurately reflected on the asset materialization (thanks @edsoncezar16!)
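For context on the `datetime.utcfromtimestamp` note above: that function is deprecated as of Python 3.12. The sketch below shows the standard-library replacement a codebase would typically switch to. It is only an illustration; the release note does not say how Dagster itself made the change.

```python
from datetime import datetime, timezone

ts = 1_700_000_000  # example value: seconds since the Unix epoch

# Timezone-aware replacement for the deprecated datetime.utcfromtimestamp(ts).
aware = datetime.fromtimestamp(ts, tz=timezone.utc)

# If a naive UTC datetime is needed (what utcfromtimestamp returned), drop tzinfo.
naive = aware.replace(tzinfo=None)
```

The aware form is usually preferable, since a naive datetime is easily mistaken for local time.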
Bugfixes
- Fixed spurious errors in logs due to module shadowing.
- Fixed an issue in the Backfill Daemon where if the assets to be materialized had different `BackfillPolicy`s, each asset would get materialized in its own run, rather than grouping assets together into a single run.
- Fixed an issue that could cause the Asset Daemon to lose information in its cursor about an asset if that asset’s code location was temporarily unavailable.
- [dagster-dbt] Mitigated issues with cli length limits by only listing specific dbt tests as needed when the tests aren’t included via indirect selection, rather than listing all tests.
Documentation
- Markdoc tags can now be used in place of MDX components (thanks @nikomancy)
1.7.6 (core) / 0.23.6 (libraries)
New
- The backfill daemon now has additional logging to document the progression through each tick and why assets are and are not materialized during each evaluation of a backfill.
- Made performance improvements in both calculating and storing data version for assets, especially for assets with a large fan-in.
- Standardized table row count metadata output by various integrations to `dagster/row_count`.
- [dagster-aws][community-contribution] Additional parameters can now be passed to the following resources: `CloudwatchLogsHandler`, `ECRPublicClient`, `SecretsManagerResource`, `SSMResource`, thanks @jacob-white-simplisafe!
Bugfixes
- Fixed issue that could cause runs to fail if they targeted any assets which had a metadata value of type `TableMetadataValue`, `TableSchemaMetadataValue`, or `TableColumnLineageMetadataValue` defined.
- Fixed an issue which could cause evaluations produced via the Auto-materialize system to not render the “skip”-type rules.
- Backfills of asset jobs now correctly use the `BackfillPolicy` of the underlying assets in the job.
- [dagster-databricks][community-contribution] `databricks-sdk` version bumped to `0.17.0`, thanks @lamalex!
- [helm][community-contribution] resolved incorrect comments about `dagster code-server start`, thanks @SanjaySiddharth!
Documentation
- Added section headings to Pipes API references, along with explanatory copy and links to relevant pages
- Added a guide for subsetting asset checks
- Add more detailed steps to transition from serverless to hybrid
- [community-contribution] asset selection syntax corrected, thanks @JonathanLai2004!
Dagster Plus
- Fixed an issue where Dagster Cloud agents would wait longer than necessary when multiple code locations were timing out during a deployment.
1.7.5 (core) / 0.23.5 (libraries)
New
- The Asset > Checks tab now allows you to view plots of numeric metadata emitted by your checks.
- The Asset > Events tab now supports infinite-scrolling, making it possible to view all historical materialization and observation events.
- When constructing a `MaterializeResult`, `ObserveResult`, or `Output`, you can now include tags that will be attached to the corresponding `AssetMaterialization` or `AssetObservation` event. These tags will be rendered on these events in the UI.
Bugfixes
- Fixed an issue where backfills would sometimes fail if a partition definition was changed in the middle of the backfill.
- Fixed an issue where if the code server became unavailable during the first tick of a backfill, the backfill would stall and be unable to submit runs once the code server became available.
- Fixed an issue where the status of an external asset would not get updated correctly.
- Fixed an issue where run status sensors would sometimes fall behind in deployments with large numbers of runs.
- The descriptions and metadata on the experimental `build_last_update_freshness_checks` and `build_time_partition_freshness_checks` APIs have been updated to be clearer.
- The headers of tables no longer become misaligned when a scrollbar is present in some scenarios.
- The sensor type, instigation type, and backfill status filters on their respective pages are now saved to the URL, so sharing the view or reloading the page preserves your filters.
- Typing a `%` into the asset graph’s query selector no longer crashes the UI.
- “Materializing” states on the asset graph animate properly in both light and dark themes.
- Thanks to @lautaro79 for fixing a helm chart issue.
Breaking Changes
- Subclasses of `MetadataValue` have been changed from `NamedTuple`s to Pydantic models. `NamedTuple` functionality on these classes was not part of Dagster’s stable public API, but usages relying on their tuple-ness may break. For example: calling `json.dumps` on collections that include them.
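To illustrate why tuple-dependent code can break, here is a minimal standard-library sketch. `TupleValue` and `ModelValue` are hypothetical stand-ins, not Dagster classes: the first behaves like a `NamedTuple`-based value, the second like any non-tuple object.

```python
import json
from typing import NamedTuple


class TupleValue(NamedTuple):
    """Hypothetical stand-in for a metadata value implemented as a NamedTuple."""

    text: str


class ModelValue:
    """Hypothetical stand-in for a metadata value that is no longer a tuple."""

    def __init__(self, text: str) -> None:
        self.text = text


# A NamedTuple is a tuple, so json.dumps serializes it as a JSON array.
tuple_json = json.dumps({"note": TupleValue("hello")})

# A non-tuple object has no default JSON encoding, so json.dumps raises TypeError.
try:
    json.dumps({"note": ModelValue("hello")})
    model_failed = False
except TypeError:
    model_failed = True

# One workaround: convert the object to plain data explicitly before dumping.
fixed_json = json.dumps({"note": {"text": ModelValue("hello").text}})
```

Code that serialized metadata values implicitly would need an explicit conversion step like the last line.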
Deprecations
- [dagster-dbt] Support for `dbt-core==1.5.*` has been removed, as it reached end of life in April 2024.
Dagster Plus
- Fixed an issue in the `dagster-cloud` CLI where the `--deployment` argument was ignored when the `DAGSTER_CLOUD_URL` environment variable was set.
- Fixed an issue where the `dagster-cloud-cli` package wouldn’t work unless the `dagster-cloud` package was installed as well.
- A new “budget alerts” feature has launched for users on self-serve plans. This feature will alert you when you hit your credit limit.
- The experimental asset health overview now allows you to group assets by compute kind, tag, and tag value.
- The concurrency and locations pages in settings correctly show Dagster Plus-specific options when experimental navigation is enabled.
1.7.4 (core) / 0.23.4 (libraries)
New
- `TimeWindowPartitionMapping` now supports the `start_offset` and `end_offset` parameters even when the upstream `PartitionsDefinition` is different than the downstream `PartitionsDefinition`. The offset is expressed in units of downstream partitions, so `TimeWindowPartitionMapping(start_offset=-1)` between an hourly upstream and a daily downstream would map each downstream partition to 48 upstream partitions – those for the same and preceding day.
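The arithmetic behind the 48-partition figure can be sketched with the standard library alone. This is not Dagster's implementation; `upstream_hours_for_day` is a hypothetical helper, and it ignores time zones and daylight saving transitions.

```python
from datetime import datetime, timedelta


def upstream_hours_for_day(
    day: datetime, start_offset: int = 0, end_offset: int = 0
) -> list[datetime]:
    """Return start times of the hourly upstream partitions for one daily partition.

    Offsets are counted in downstream (daily) partitions, as the note describes.
    """
    window_start = day + timedelta(days=start_offset)
    window_end = day + timedelta(days=end_offset + 1)  # exclusive end of the window
    hour_count = int((window_end - window_start) / timedelta(hours=1))
    return [window_start + timedelta(hours=i) for i in range(hour_count)]


# Daily partition 2024-05-02 with start_offset=-1 covers 2024-05-01 and 2024-05-02.
hours = upstream_hours_for_day(datetime(2024, 5, 2), start_offset=-1)
```

With no offsets the same helper yields the 24 hourly partitions of the single day.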
Bugfixes
- Fixed an issue where certain exceptions in the Dagster daemon would immediately retry instead of waiting for a fixed interval before retrying.
- Fixed a bug with asset checks in complex asset graphs that include cycles in the underlying nodes.
- Fixed an issue that would cause unnecessary failures on FIPS-enabled systems due to the use of md5 hashes in non-security-related contexts (thanks @jlloyd-widen!)
- Removed `path` metadata from `UPathIOManager` inputs. This eliminates the creation of `ASSET_OBSERVATION` events for every input on every step for the default I/O manager.
- Added support for defining `owners` on `@graph_asset`.
- Fixed an issue where having multiple partitions definitions in a location with the same start date but differing end dates could lead to `DagsterInvalidSubsetError` when trying to launch runs.
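Regarding the FIPS fix above: one common way to keep using md5 as a plain fingerprint on FIPS-enabled systems is the `usedforsecurity=False` flag available in Python 3.9+. Whether Dagster's fix uses this flag is an assumption; the note does not say. `cache_key` is a hypothetical helper.

```python
import hashlib


def cache_key(payload: bytes) -> str:
    """Fingerprint some bytes for non-security purposes (hypothetical helper)."""
    # usedforsecurity=False (Python 3.9+) declares that this md5 call is not
    # protecting anything, which lets FIPS-restricted builds permit it.
    return hashlib.md5(payload, usedforsecurity=False).hexdigest()


key = cache_key(b"asset_key/partition")
```

Switching to a FIPS-approved hash such as SHA-256 is the other usual option when the digest format is free to change.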
Documentation
- Fixed a few issues with broken pages as a result of the Dagster+ rename.
- Renamed a few instances of Dagster Cloud to Dagster+.
- Added a note about external asset + alert incompatibility to the Dagster+ alerting docs.
- Fixed references to outdated APIs in freshness checks docs.
Dagster Plus
- When creating a Branch Deployment via GraphQL or the `dagster-cloud branch-deployment` CLI, you can now specify the base deployment. The base deployment will be used for comparing assets for Change Tracking. For example, to set the base deployment to a deployment named `staging`: `dagster-cloud branch-deployment create-or-update --base-deployment-name staging ...`. Note that once a Branch Deployment is created, the base deployment cannot be changed.
- Fixed an issue where agents serving many branch deployments simultaneously would sometimes raise a `413: Request Entity Too Large` error when uploading a heartbeat to the Dagster Plus servers.
1.7.3 (core) / 0.23.3 (libraries)
New
- `@graph_asset` now accepts a `tags` argument
- [ui] For users whose light/dark mode theme setting is set to match their system setting, the theme will update automatically when the system changes modes (e.g. based on time of day), with no page reload required.
- [ui] We have introduced the typefaces Geist and Geist Mono as our new default fonts throughout the Dagster app, with the goal of improving legibility, consistency, and maintainability.
- [ui] [experimental] We have begun experimenting with a new navigation structure for the Dagster UI. The change can be enabled via User Settings.
- [ui] [experimental] Made performance improvements to the Concurrency settings page.
- [dagster-azure] [community-contribution] ADLS2 IOManager supports custom timeout. Thanks @tomas-gajarsky!
- [dagster-fivetran] [community-contribution] It’s now possible to specify destination ids in `load_asset_defs_from_fivetran_instance`. Thanks @lamalex!
Bugfixes
- Fixed an issue where pressing the “Reset sensor status” button in the UI would also reset the sensor’s cursor.
- Fixed a bug that caused input loading time not to be included in the reported step duration.
- Pydantic warnings are no longer raised when importing Dagster with Pydantic 2.0+.
- Fixed an issue which would cause incorrect behavior when auto-materializing partitioned assets based on updates to a parent asset in a different code location.
- Fixed an issue which would cause every tick of the auto-materialize sensor to produce an evaluation for each asset, even if nothing had changed from the previous tick.
- [dagster-dbt] Fixed a bug that could raise `Duplicate check specs` errors with singular tests ingested as asset checks.
- [embedded-elt] Resolved an issue where a subset of resources was not recognized when using `source.with_resources(...)`
- [ui] Fixed an issue where a sensor that targeted an invalid set of asset keys could cause the asset catalog to fail to load.
- [ui] Fixed an issue in which runs in the Timeline that should have been considered overlapping were not correctly grouped together, leading to visual bugs.
- [ui] On the asset overview page, job tags no longer render poorly when an asset appears in several jobs.
- [ui] On the asset overview page, hovering over the timestamp tags in the metadata table explains where each entry originated.
- [ui] Right clicking the background of the asset graph now consistently shows a context menu, and the lineage view supports vertical as well as horizontal layout.
Documentation
- Sidebar navigation now appropriately handles command-click and middle-click to open links in a new tab.
- Added a section for asset checks to the Testing guide.
- Added a guide about Column-level lineage for assets.
- Lots of updates to examples to reflect the new opt-in approach to I/O managers.
Dagster+
- [ui] [experimental] A new Overview > Asset Health page provides visibility into failed and missing materializations, check warnings and check errors.
- [ui] You can now share feedback with the Dagster team directly from the app. Open the Help menu in the top nav, then “Share feedback”. Bugs and feature requests are submitted directly to the Dagster team.
- [ui] When editing a team, the list of team members is now virtualized, allowing for the UI to scale better for very large team sizes.
- [ui] Fixed dark mode for billing components.
1.7.2 (core) / 0.23.2 (libraries)
New
- Performance improvements when loading large asset graphs in the Dagster UI.
- `@asset_check` functions can now be invoked directly for unit testing.
- `dagster-embedded-elt` dlt resource `DagsterDltResource` can now be used from `@op` definitions in addition to assets.
- `UPathIOManager.load_partitions` has been added to help `UPathIOManager` subclasses deal with serialization formats which support partitioning. Thanks @danielgafni!
- [dagster-polars] now supports other data types rather than only string for the partitioning columns. Also `PolarsDeltaIOManager` now supports `MultiPartitionsDefinition` with `DeltaLake` native partitioning. Metadata value `"partition_by": {"dim_1": "col_1", "dim_2": "col_2"}` should be specified to enable this feature. Thanks @danielgafni!
Bugfixes
- [dagster-airbyte] Auto materialization policies passed to `load_assets_from_airbyte_instance` and `load_assets_from_airbyte_project` will now be properly propagated to the created assets.
- Fixed an issue where deleting a run that was intended to materialize a partitioned asset would sometimes leave the status of that asset as “Materializing” in the Dagster UI.
- Fixed an issue with `build_time_partition_freshness_checks` where it would incorrectly intuit that an asset was not fresh in certain cases.
- [dagster-k8s] Fix an error on transient ‘none’ responses for pod waiting reasons. Thanks @piotrmarczydlo!
- [dagster-dbt] Failing to build column schema metadata will now result in a warning rather than an error.
- Fixed an issue where incorrect asset keys would cause a backfill to fail loudly.
- Fixed an issue where syncing unmaterialized assets could include source assets.
Breaking Changes
- [dagster-polars] `PolarsDeltaIOManager` no longer supports loading natively partitioned DeltaLake tables as dictionaries. They should be loaded as a single `pl.DataFrame`/`pl.LazyFrame` instead.
Documentation
- Renamed `Dagster Cloud` to `Dagster+` all over the docs.
- Added a page about Change Tracking in Dagster+ branch deployments.
- Added a section about user-defined metrics to the Dagster+ Insights docs.
- Added a section about Asset owners to the asset metadata docs.
Dagster Cloud
- Branch deployments now have Change Tracking. Assets in each branch deployment will be compared to the main deployment. New assets and changes to code version, dependencies, partitions definitions, tags, and metadata will be marked in the UI of the branch deployment.
- Pagerduty alerting is now supported with Pro plans. See the documentation for more info.
- Asset metadata is now included in the insights metrics for jobs materializing those assets.
- Per-run Insights are now available on individual assets.
- Previously, the `before_storage_id` / `after_storage_id` values in the `AssetRecordsFilter` class were ignored. This has been fixed.
- Updated the output of `dagster-cloud deployment alert-policies list` to match the format of `sync`.
- Fixed an issue where Dagster Cloud agents with many code locations would sometimes leave code servers running after the agent shut down.
1.7.1 (core) / 0.23.1 (libraries)
New
- [dagster-dbt][experimental] A new cli command `dagster-dbt project prepare-for-deployment` has been added in conjunction with `DbtProject` for managing the behavior of rebuilding the manifest during development and preparing a pre-built one for production.
Bugfixes
- Fixed an issue with duplicate asset check keys when loading checks from a package.
- A bug with the new `build_last_update_freshness_checks` and `build_time_partition_freshness_checks` has been fixed where multi_asset checks passed in would not be executable.
- [dagster-dbt] Fixed some issues with building column lineage for incremental models, models with implicit column aliases, and models with columns that have multiple dependencies on the same upstream column.
Breaking Changes
- [dagster-dbt] The experimental `DbtArtifacts` class has been replaced by `DbtProject`.
Documentation
- Added a dedicated concept page for all things metadata and tags
- Moved asset metadata content to a dedicated concept page: Asset metadata
- Added section headings to the Software-defined Assets API reference, which groups APIs by asset type or use
- Added a guide about user settings in the Dagster UI
- Added `AssetObservation` to the Software-defined Assets API reference
- Renamed Dagster Cloud GitHub workflow files to the new, consolidated `dagster-cloud-deploy.yml`
- Miscellaneous formatting and copy updates
- [community-contribution] [dagster-embedded-elt] Fixed `get_asset_key` API documentation (thanks @aksestok!)
- [community-contribution] Updated Python version in contributing documentation (thanks @piotrmarczydlo!)
- [community-contribution] Typo fix in README (thanks @MiConnell!)
Dagster Cloud
- Fixed a bug where an incorrect value was being emitted for BigQuery bytes billed in Insights.
1.7.0 (core) / 0.23.0 (libraries)
Major Changes since 1.6.0 (core) / 0.22.0 (libraries)
- Asset definitions can now have tags, via the `tags` argument on `@asset`, `AssetSpec`, and `AssetOut`. Tags are meant to be used for organizing, filtering, and searching for assets.
- The Asset Details page has been revamped to include an “Overview” tab that centralizes the most important information about the asset – such as current status, description, and columns – in a single place.
- Assets can now be assigned owners.
- Asset checks are now considered generally available and will no longer raise experimental warnings when used.
- Asset checks can now be marked `blocking`, which causes downstream assets in the same run to be skipped if the check fails with ERROR-level severity.
- The new `@multi_asset_check` decorator enables defining a single op that executes multiple asset checks.
- The new `build_last_update_freshness_checks` and `build_time_partition_freshness_checks` APIs allow defining asset checks that error or warn when an asset is overdue for an update. Refer to the Freshness checks guide for more info.
- The new `build_column_schema_change_checks` API allows defining asset checks that warn when an asset’s columns have changed since its latest materialization.
- In the asset graph UI, the “Upstream data”, “Code version changed”, and “Upstream code version” statuses have been collapsed into a single “Unsynced” status. Clicking on “Unsynced” displays more detailed information.
- I/O managers are now optional. This enhances flexibility for scenarios where they are not necessary. For guidance, see When to use I/O managers.
  - Assets with `None` or `MaterializeResult` return type annotations won't use I/O managers; dependencies for these assets can be set using the `deps` parameter in the `@asset` decorator.
- [dagster-dbt] Dagster’s dbt integration can now be configured to automatically collect metadata about column schema and column lineage.
- [dagster-dbt] dbt tests are now pulled in as Dagster asset checks by default.
- [dagster-dbt] dbt resource tags are now automatically pulled in as Dagster asset tags.
- [dagster-dbt] dbt owners from dbt groups are now automatically pulled in as Dagster owners.
- [dagster-snowflake] [dagster-gcp] The dagster-snowflake and dagster-gcp packages now both expose a `fetch_last_updated_timestamps` API, which makes it straightforward to collect data freshness information in source asset observation functions.
Changes since 1.6.14 (core) / 0.22.14 (libraries)
New
- Metadata attached during asset or op execution can now be accessed in the I/O manager using `OutputContext.output_metadata`.
- [experimental] Single-run backfills now support batched inserts of asset materialization events. This is a major performance improvement for large single-run backfills that have database writes as a bottleneck. The feature is off by default and can be enabled by setting the `DAGSTER_EVENT_BATCH_SIZE` environment variable in a code server to an integer (25 recommended, 50 max). It is only currently supported in Dagster Cloud and OSS deployments with a postgres backend.
- [ui] The new Asset Details page is now enabled for new users by default. To turn this feature off, you can toggle the feature in the User Settings.
- [ui] Queued runs now display a link to view all the potential reasons why a run might remain queued.
- [ui] Starting a run status sensor with a stale cursor will now warn you in the UI that it will resume from the point that it was paused.
- [asset-checks] Asset checks now support asset names that include `.`, which can occur when checks are ingested from dbt tests.
- [dagster-dbt] The env var `DBT_INDIRECT_SELECTION` will no longer be set to `empty` when executing dbt tests as asset checks, unless specific asset checks are excluded. `dagster-dbt` will no longer explicitly select all dbt tests with the dbt cli, which had caused argument length issues.
- [dagster-dbt] Singular tests with a single dependency are now ingested as asset checks.
- [dagster-dbt] Singular tests with multiple dependencies must have the primary dependency specified using dbt meta.
    {{
        config(
            meta={
                'dagster': {
                    'ref': {
                        'name': <ref_name>,
                        'package': ... # Optional, if included in the ref.
                        'version': ... # Optional, if included in the ref.
                    },
                }
            }
        )
    }}

    ...
- [dagster-dbt] Column lineage metadata can now be emitted when invoking dbt. See the documentation for details.
- [experimental][dagster-embedded-elt] Add the data load tool (dlt) integration for easily building and integrating dlt ingestion pipelines with Dagster.
- [dagster-dbt][community-contribution] You can now specify a custom schedule name for schedules created with `build_schedule_from_dbt_selection`. Thanks @dragos-pop!
- [helm][community-contribution] You can now specify a custom job namespace for your user code deployments. Thanks @tmatthews0020!
- [dagster-polars][community-contribution] Column schema metadata is now integrated using the dagster-specific metadata key in `dagster_polars`. Thanks @danielgafni!
- [dagster-datadog][community-contribution] Added `datadog.api` module to the `DatadogClient` resource, enabling direct access to API methods. Thanks @shivgupta!
Bugfixes
- Fixed a bug where run status sensors configured to monitor a specific job would trigger for jobs with the same name in other code locations.
- Fixed a bug where multi-line asset check result descriptions were collapsed into a single line.
- Fixed a bug that caused a value to show up under “Target materialization” in the asset check UI even when an asset had had observations but never been materialized.
- Changed typehint of `metadata` argument on `multi_asset` and `AssetSpec` to `Mapping[str, Any]`.
- [dagster-snowflake-pandas] Fixed a bug introduced in 0.22.4 where column names were not using quote identifiers correctly. Column names will now be quoted.
- [dagster-aws] Fixed a race condition where simultaneously materializing the same asset more than once would sometimes raise an Exception when using the `s3_io_manager`.
- [ui] Fixed a bug where resizable panels could inadvertently be hidden and never recovered, for instance the right panel on the global asset graph.
- [ui] Fixed a bug where opening a run with an op selection in the Launchpad could lose the op selection setting for the subsequently launched run. The op selection is now correctly preserved.
- [community-contribution] Fixed `dagster-polars` tests by excluding `Decimal` types. Thanks @ion-elgreco!
- [community-contribution] Fixed a bug where auto-materialize rule evaluation would error on FIPS-compliant machines. Thanks @jlloyd-widen!
- [community-contribution] Fixed an issue where an excessive DeprecationWarning was being issued for a `ScheduleDefinition` passed into the `Definitions` object. Thanks @2Ryan09!
Breaking Changes
- Creating a run with a custom non-UUID `run_id` was previously private and only used for testing. It will now raise an exception.
- [community-contribution] Previously, calling `get_partition_keys_in_range` on a `MultiPartitionsDefinition` would erroneously return partition keys that were within the one-dimensional range of alphabetically-sorted partition keys for the definition. Now, this method returns the cartesian product of partition keys within each dimension’s range. Thanks, @mst!
- Added `AssetCheckExecutionContext` to replace `AssetExecutionContext` as the type of the `context` param passed in to `@asset_check` functions. `@asset_check` was an experimental decorator.
- [experimental] `@classmethod` decorators have been removed from `dagster-embedded-elt.sling` `DagsterSlingTranslator`
- [dagster-dbt] `@classmethod` decorators have been removed from `DagsterDbtTranslator`.
- [dagster-k8s] The default merge behavior when raw kubernetes config is supplied at multiple scopes (for example, at the instance level and for a particular job) has been changed to be more consistent. Previously, configuration was merged shallowly by default, with fields replacing other fields instead of appending or merging. Now, it is merged deeply by default, with lists appended to each other and dictionaries merged, in order to be more consistent with how kubernetes configuration is combined in all other places. See the docs for more information, including how to restore the previous default merge behavior.
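The difference between the old shallow merge and the new deep merge in the [dagster-k8s] item can be sketched with plain dictionaries. This is an illustration of the described semantics only, not Dagster's code; `shallow_merge`, `deep_merge`, and the config keys are hypothetical examples.

```python
from copy import deepcopy


def shallow_merge(base: dict, override: dict) -> dict:
    """Top-level fields in override replace those in base."""
    return {**base, **override}


def deep_merge(base: dict, override: dict) -> dict:
    """Merge dictionaries recursively and append lists to each other."""
    merged = deepcopy(base)
    for key, value in override.items():
        current = merged.get(key)
        if isinstance(current, dict) and isinstance(value, dict):
            merged[key] = deep_merge(current, value)
        elif isinstance(current, list) and isinstance(value, list):
            merged[key] = current + deepcopy(value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Illustrative config supplied at two scopes (instance level and job level).
instance_cfg = {
    "container_config": {
        "env": [{"name": "A", "value": "1"}],
        "resources": {"limits": {"cpu": "1"}},
    }
}
job_cfg = {"container_config": {"env": [{"name": "B", "value": "2"}]}}

shallow = shallow_merge(instance_cfg, job_cfg)
deep = deep_merge(instance_cfg, job_cfg)
```

Under the shallow merge the job-level `container_config` replaces the instance-level one wholesale, dropping `resources`; under the deep merge both `env` entries and the resource limits survive.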
Deprecations
- `AssetSelection.keys()` has been deprecated. Instead, you can now supply asset key arguments to `AssetSelection.assets()`.
- Run tag keys with long lengths and certain characters are now deprecated. For consistency with asset tags, run tag keys are expected to only contain alpha-numeric characters, dashes, underscores, and periods. Run tag keys can also contain a prefix section, separated with a slash. The main section and prefix section of a run tag are limited to 63 characters.
- `AssetExecuti...
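The run tag key rules in the deprecation note above can be expressed as a small validator. This sketch encodes only the rules stated there (allowed characters, an optional prefix separated by a slash, 63 characters per section); `is_valid_run_tag_key` is a hypothetical helper, and Dagster's own validation may differ in detail.

```python
import re

# Each section: 1-63 characters from letters, digits, dashes, underscores, periods.
_SECTION = r"[A-Za-z0-9_.\-]{1,63}"
# Optional "prefix/" followed by the main section.
_TAG_KEY = re.compile(rf"(?:{_SECTION}/)?{_SECTION}")


def is_valid_run_tag_key(key: str) -> bool:
    """Return True if key follows the rules described in the deprecation note."""
    return _TAG_KEY.fullmatch(key) is not None
```

For example, `is_valid_run_tag_key("team/owner")` passes, while a key containing a space or a section longer than 63 characters does not.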
1.6.14 (core) / 0.22.14 (libraries)
Bugfixes
- [dagster-dbt] Fixed some issues with building column lineage metadata.
1.6.13 (core) / 0.22.13 (libraries)
Bugfixes
- Fixed a bug where an asset with a dependency on a subset of the keys of a parent multi-asset could sometimes crash asset job construction.
- Fixed a bug where a Definitions object containing assets having integrated asset checks and multiple partitions definitions could not be loaded.