Skip to content

feat(workflow-operator): add Python UDF UI parameter injection model#5141

Open
carloea2 wants to merge 7 commits into
apache:mainfrom
carloea2:ui-parameter-backend-scala
Open

feat(workflow-operator): add Python UDF UI parameter injection model#5141
carloea2 wants to merge 7 commits into
apache:mainfrom
carloea2:ui-parameter-backend-scala

Conversation

@carloea2
Copy link
Copy Markdown
Contributor

What changes were proposed in this PR?

This PR adds the Scala backend foundation for Python UDF UI parameters.

It introduces:

Area Change
UI parameter model Adds UiUDFParameter, containing backend-compatible attribute metadata and an editable value.
Python UDF injector Adds PythonUdfUiParameterInjector, which validates UI parameters and injects a reserved hook method into supported Python UDF classes.
Safe string encoding Marks Attribute.getName() as encodable so UI parameter names are safely rendered through the Python template builder.
Test coverage Adds Scala tests for hook injection, validation, unsupported types, reserved method conflicts, and unchanged behavior when no UI parameters exist.

This PR is stacked after the merged frontend foundation PR #5043. It does not yet wire the injector into operator execution; that wiring is handled by later PRs in the stack.

Existing Python UDF workflow execution remains unchanged in this PR because PythonUDFOpDescV2, PythonUDFSourceOpDescV2, and DualInputPortsPythonUDFOpDescV2 are not modified here.

Any related issues, documentation, discussions?

Part of the Python UDF UI parameter feature split from feat/ui-parameter.

Related tracking issue / stack: #5044

Stack order:

  1. Frontend UI parameter building blocks: feat(frontend): add Python UDF UI parameter form support #5043
  2. Scala backend injection model: this PR
  3. Python runtime support
  4. End-to-end integration

How was this PR tested?

Commands run:

sbt "WorkflowOperator / Test / testOnly org.apache.texera.amber.operator.udf.python.PythonUdfUiParameterInjectorSpec"
sbt scalafmtAll
sbt scalafmtCheckAll "scalafixAll --check"

Results:

  • PythonUdfUiParameterInjectorSpec: 10 tests passed.
  • scalafmtAll: no file changes.
  • scalafmtCheckAll and scalafixAll --check: passed.

Was this PR authored or co-authored using generative AI tooling?

No

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 21, 2026

Codecov Report

❌ Patch coverage is 90.36145% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 46.66%. Comparing base (d8c254c) to head (75fe048).

Files with missing lines Patch % Lines
...ator/udf/python/PythonUdfUiParameterInjector.scala 90.00% 0 Missing and 8 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #5141      +/-   ##
============================================
- Coverage     48.95%   46.66%   -2.30%     
+ Complexity     2377     2376       -1     
============================================
  Files          1048     1048              
  Lines         40270    40098     -172     
  Branches       4272     4267       -5     
============================================
- Hits          19714    18711    -1003     
- Misses        19402    20256     +854     
+ Partials       1154     1131      -23     
Flag Coverage Δ *Carryforward flag
access-control-service 39.53% <ø> (ø)
agent-service 33.74% <ø> (-0.03%) ⬇️ Carriedforward from 0a42fcd
amber 51.74% <90.36%> (+0.17%) ⬆️
computing-unit-managing-service 0.00% <ø> (ø)
config-service 0.00% <ø> (ø)
file-service 37.99% <ø> (ø)
frontend 35.21% <ø> (-5.44%) ⬇️ Carriedforward from 0a42fcd
python 90.50% <ø> (-0.30%) ⬇️ Carriedforward from 0a42fcd
workflow-compiling-service 56.81% <ø> (ø)

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@carloea2
Copy link
Copy Markdown
Contributor Author

@mengw15 can you review this PR? Thanks

@mengw15 mengw15 requested a review from Xiao-zhen-Liu May 26, 2026 03:36
Copy link
Copy Markdown
Contributor

@Xiao-zhen-Liu Xiao-zhen-Liu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few correctness notes from me — I've left them as inline comments on the relevant lines below. They're mostly about the generated Python surviving the handoff to the runtime PR; the happy path itself looks right.


private val InjectedUiParametersHookMethodName = "_texera_injected_ui_parameters"
private val InjectedUiParametersHookMethodHeader =
s"def $InjectedUiParametersHookMethodName(self) -> Dict[str, Any]:"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dict and Any aren't actually exported by from pytexera import * — its __all__ only re-exports Iterator, Optional, and Union from typing. So in the user's module these names are undefined, and since we run Python 3.10–3.13 (where return annotations are evaluated at definition time), the class raises NameError: name 'Dict' is not defined the moment it loads. The test that asserts the output shouldn't contain import typing quietly locks this assumption in, so it'll stay green while generating code that can't run. We could add Dict/Any to pytexera's __all__, inject a from typing import Dict, Any, or just drop the return annotation — whichever we pick, it'd be good to settle it before the runtime PR starts depending on it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is covered in next PR actually but I did not include it here since this is for scala changes. Next PR will be about Python side.

private def buildInjectedHookMethod(uiParameters: List[UiUDFParameter]): String = {
val injectedParametersMap = buildInjectedParametersMap(uiParameters)

(pyb"""|@overrides
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no _texera_injected_ui_parameters in any base class yet, so @overrides will fail at class-definition time — it checks that the method really overrides a parent, and its default signature check calls get_type_hints, which also runs into the Dict/Any issue above. That's a perfectly reasonable thing to leave for the runtime PR; I'd just call it out in the description (or in a comment right here) so it's clear the generated code isn't runnable on its own yet.

if (lineEnd < 0) text.length else lineEnd
}

private def detectClassBlockEnd(code: String, classHeaderStart: Int, classIndent: String): Int = {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This decides the class body ends at the first non-blank line indented at or below the class header. That's right for ordinary code, but it doesn't account for string literals, so a triple-quoted block inside a method whose contents start at column 0 looks like the end of the class — and we'd splice the hook into the middle of the string:

class ProcessTupleOperator(UDFOperatorV2):
    def process_tuple(self, tuple_, port):
        sql = """
SELECT * FROM t
"""
        yield tuple_

pybuilder already has PythonLexerUtils/BoundaryValidator for exactly this kind of lexical case, so it might be worth reusing them instead of scanning by indentation. If that's more than you want to take on here, a test that documents the limitation would at least make it a known boundary.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can take a look about this one, thanks

private def injectHookIntoUserClass(encodedUserCode: String, hookMethod: String): String = {
val classHeaderMatch =
SupportedPythonUdfClassHeaderRegex.findFirstMatchIn(encodedUserCode).getOrElse {
return encodedUserCode
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the code doesn't contain one of the four supported class names, we hand it back unchanged even if there are parameters to inject. So if a user renames their class, their configured parameters just disappear at runtime with no signal. Since validate already raises for bad input, it might be friendlier to raise here too when uiParameters is non-empty but nothing matched — otherwise it's a confusing "my parameters did nothing" situation.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, I will update it

val parameters = Option(uiParameters).getOrElse(List.empty)
validate(parameters)

val encodedUserCode = pyb"$code".encode
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: running the whole program through pyb"$code".encode also sends it through stripMargin (the builder's render calls it), so any user line that's whitespace followed by | would get rewritten — even on the no-parameter passthrough path. It's an unlikely shape in a real UDF, but since this is the first place arbitrary user source flows through pyb, it's worth being aware of.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll take care of it.

@carloea2
Copy link
Copy Markdown
Contributor Author

Thanks @Xiao-zhen-Liu, I will ping you when it is ready for the next review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants