Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GOBBLIN-2145] fix bug in getting flow status #4040

Merged
merged 2 commits into from
Aug 29, 2024

Conversation

arjun4084346
Copy link
Contributor

@arjun4084346 arjun4084346 commented Aug 29, 2024

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

JIRA

Description

  • Here are some details about my PR, including screenshots (if applicable):
    method getAllFlowStatusesForFlowExecutionsOrdered expects flowGroup, flowName in this order, while it was being passed the parameters in opposite order.

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:
    updated tests

Commits

  • My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Copy link
Contributor

@phet phet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm referring to these in JobStatusRetriever:

  public abstract Iterator<JobStatus> getJobStatusesForFlowExecution(String flowName, String flowGroup,
      long flowExecutionId);
  public abstract Iterator<JobStatus> getJobStatusesForFlowExecution(String flowName, String flowGroup,
      long flowExecutionId, String jobName, String jobGroup);

@@ -77,12 +77,12 @@ public void skipFlowConcurrentCheckSameFlowExecutionId() {
JobStatus jobStatus = JobStatus.builder().flowGroup(flowGroup).flowName(flowName).flowExecutionId(flowExecutionId)
.jobName(JobStatusRetriever.NA_KEY).jobGroup(JobStatusRetriever.NA_KEY).eventName(ExecutionStatus.COMPILED.name()).build();
Iterator<JobStatus> jobStatusIterator = Lists.newArrayList(jobStatus).iterator();
FlowStatus flowStatus = new FlowStatus(flowName,flowGroup,flowExecutionId,jobStatusIterator,ExecutionStatus.COMPILED);
when(jobStatusRetriever.getAllFlowStatusesForFlowExecutionsOrdered(flowName, flowGroup)).thenReturn(
FlowStatus flowStatus = new FlowStatus(flowName, flowGroup, flowExecutionId, jobStatusIterator,ExecutionStatus.COMPILED);
Copy link
Contributor

@phet phet Aug 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's confusing that org.apache.gobblin.service.monitoring.FlowStatus uses the (name, group) ordering. personally, I SO MUCH prefer (group, name), but the fact that some of these other long-standing classes don't follow it, makes me question whether we ought to just retreat and accept the (name, group) ordering as a fact of life.

I'm NOT against the change, just recognizing it would take more work and wanting to ensure we have a consistent end-state. so if we want to double down on (group, name) then let's change it EVERYWHERE, even for FlowStatus (and the other methods of JobStatusRetriever

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree many old methods have flowName first, but a lot of methods also have flowGroup first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, i am not comfortable with this state either. lets change it in some other PR, this one is quite urgent

@@ -97,31 +97,31 @@ public void testGetFlowStatusFromJobStatuses() throws Exception {

addJobStatusToStateStore(flowExecutionId, JobStatusRetriever.NA_KEY, ExecutionStatus.COMPILED.name());
Assert.assertEquals(ExecutionStatus.COMPILED,
jobStatusRetriever.getFlowStatusFromJobStatuses(jobStatusRetriever.getJobStatusesForFlowExecution(FLOW_NAME, FLOW_GROUP, flowExecutionId)));
JobStatusRetriever.getFlowStatusFromJobStatuses(jobStatusRetriever.getJobStatusesForFlowExecution(FLOW_NAME, FLOW_GROUP, flowExecutionId)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, let's reorder only if we go ALL-IN w/ this getJobStatusesForFlowExecution too

@arjun4084346 arjun4084346 changed the title fix bug in getting flow status [GOBBLIN-2145] fix bug in getting flow status Aug 29, 2024
@arjun4084346 arjun4084346 merged commit f6ba473 into apache:master Aug 29, 2024
6 checks passed
Copy link
Contributor

@phet phet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you did to fix the mis-ordering introduced here - https://github.com/apache/gobblin/pull/3989/files#diff-237d0e6a4a38c77a7a26a49075a7d8dfa5b95002adff25a18586f7495dca9820R157

but really that new method JobStatusRetriever::getAllFlowStatusesForFlowExecutionsOrdered is the only one of its class w/ the (flowGroup, flowName) param ordering.

I understand it turned into too much effort to reorder all the rest to (flowGroup, flowName). the converse of reordering only that one new method to use (flowName, flowGroup), as the other legacy ones do, might arguably leave the codebase more consistent.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants