Proper way to model an ML testing pipeline #27589

mielkec-gene · 2025-02-05T17:34:19Z

mielkec-gene
Feb 5, 2025

Our use case : we daily train new models that are archived on a different platform, and we wish to download and test each model on dozens of downstream tasks. Some of these tasks are simple, but many others require horizontal scaling across a dozen or so machines.

I'm relatively new to dagster, so the right way of modeling this has been somewhat unclear. I did eventually figure out that a Dynamic partition was likely the best way to represent new models as they arrive, instead of defining them as new assets and checking in a new code version....

I needed to have a job that could programmatically create a new partition, and create the model asset associated with it. After dabbling for a day, I finally got the following pattern working :

model_partition = dg.DynamicPartitionsDefinition(name="trained_models")

@asset(partitions_def=model_partition)
def model_asset(context: AssetExecutionContext) -> MaterializeResult:
  model_id = context.partition_key
  # model gets downloaded here to the system dagster is running on
  return dg.MaterializeResult(metadata=metadata)     # various metadata gets saved to UI


class testModelConfig(Config):
   model_id: str = Field('<model id>', description="ID of trained model hosted on another platform")

@op
def materialize_model(context: OpExecutionContext, config: testModelConfig):
   context.instance.add_dynamic_partitions(model_partition.name, [config.model_id])
   result: ExecuteInProcessResult = materialize([model_asset], instance=context.instance, partition_key=config.model_id)
   return config.model_id

@graph
def test_model():
   model = materialize_model()
   benchmarkA(model)
   benchmarkB(model)

@job
def test_model_job():
   test_model()

There must be a simpler way then this, right? I went about asset factories for a time, but it seems that those are meant to be executed only when the code loads, and not during pipeline execution. Im using a @graph here because it appears to allow me to use the launchpad to launch a new validation job from the UI with the model_id passed as the entire pipeline input, even though its really only needed for the model asset itself.

I'm assuming Ive created some antipatterns here, but I just can't find a simpler way to accomplish dynamic asset creation.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proper way to model an ML testing pipeline #27589

{{title}}

{{editor}}'s edit

{{editor}}'s edit

Replies: 0 comments

Select a reply

Proper way to model an ML testing pipeline #27589

mielkec-gene Feb 5, 2025

Replies: 0 comments

mielkec-gene
Feb 5, 2025