Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GOBBLIN-2136] Byebye dag manager #4031

Merged
merged 12 commits into from
Sep 7, 2024

Conversation

arjun4084346
Copy link
Contributor

@arjun4084346 arjun4084346 commented Aug 16, 2024

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

JIRA

Description

  • Here are some details about my PR, including screenshots (if applicable):

Now when we have developed the new GaaS architecture based on DagProcessingEngine, we do not need DagManager and related classes. This also enabled GaaS to work in a multi leader mode, so there is no need for the code that was there for standby hosts.

a)
Removed the code path for GOBBLIN_SERVICE_FLOW_CATALOG_LOCAL_COMMIT = false
Removed the code path for FORCE_LEADER = true
Removed the code path for MULTI_ACTIVE_EXECUTION_ENABLED = false
Removed the code path for MULTI_ACTIVE_SCHEDULER_ENABLED = false
Removed the code path for GOBBLIN_SERVICE_WARM_STANDBY_ENABLED_KEY = false
Removed the code path for DAG_PROC_ENGINE_ENABLED = false

b)
Removed DagManager

c)
Removed Optional from DagManagement, DagTaskStream, DagProcFactory, DagProcessingEngine, DagManagementStateStore, FlowLaunchHandler, DagActionReminderScheduler

d)
Codestyle changes suggested by IntelliJ

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:
    no code logic change, only updated the existing tests

Commits

  • My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

@arjun4084346 arjun4084346 force-pushed the byebyeDagManager branch 9 times, most recently from 1e9b843 to 4f71190 Compare August 17, 2024 06:14
@codecov-commenter
Copy link

codecov-commenter commented Aug 18, 2024

Codecov Report

Attention: Patch coverage is 68.04124% with 93 lines in your changes missing coverage. Please review.

Project coverage is 48.25%. Comparing base (444f266) to head (349f2b9).
Report is 4 commits behind head on master.

Files with missing lines Patch % Lines
.../modules/scheduler/GobblinServiceJobScheduler.java 55.81% 19 Missing ⚠️
...les/orchestration/DagManagementTaskStreamImpl.java 10.52% 17 Missing ⚠️
...e/modules/orchestration/MysqlUserQuotaManager.java 0.00% 14 Missing ⚠️
...rvice/modules/orchestration/DagManagerMetrics.java 44.44% 5 Missing ⚠️
...in/service/modules/core/GobblinServiceManager.java 71.42% 4 Missing ⚠️
...odules/orchestration/InMemoryUserQuotaManager.java 78.94% 2 Missing and 2 partials ⚠️
...in/service/modules/orchestration/Orchestrator.java 69.23% 2 Missing and 2 partials ⚠️
...rvice/monitoring/KafkaJobStatusMonitorFactory.java 0.00% 4 Missing ⚠️
...ache/gobblin/runtime/spec_catalog/FlowCatalog.java 62.50% 2 Missing and 1 partial ⚠️
...obblin/service/modules/orchestration/DagUtils.java 62.50% 3 Missing ⚠️
... and 11 more
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #4031      +/-   ##
============================================
+ Coverage     38.79%   48.25%   +9.45%     
- Complexity     1599     9111    +7512     
============================================
  Files           388     1778    +1390     
  Lines         15998    68211   +52213     
  Branches       1585     7315    +5730     
============================================
+ Hits           6207    32913   +26706     
- Misses         9293    32513   +23220     
- Partials        498     2785    +2287     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@arjun4084346 arjun4084346 force-pushed the byebyeDagManager branch 6 times, most recently from f1c070f to 30d9638 Compare August 20, 2024 00:18
@arjun4084346 arjun4084346 changed the title [draft] Byebye dag manager [GOBBLIN-2136] Byebye dag manager Aug 20, 2024
@arjun4084346 arjun4084346 force-pushed the byebyeDagManager branch 6 times, most recently from 4462811 to bc27263 Compare August 26, 2024 23:05
Copy link
Contributor

@phet phet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this PR is a monster! (hard to believe I'm not even 3/4 done yet)

maybe github will give you a badge for the largest PR. I sure hope I get one for reviewing it ;p

that aside, this looks like great work, which I'm eager to incorporate into HEAD. I'll return to finish my first pass ASAP.

my biggest concern is you've clubbed together removing the DagMgr w/ reworking the REST APIs. if we somehow needed to revert one or the other change, they suddenly become inseparable.

changing public APIs like the REST one often requires care, so I'd definitely isolate that one. it will help maintainers as well as end-users to grasp when that change landed, which is harder to do if the commit message is this other one about removing the DM

Comment on lines 169 to 189
public static final String JOB_START_SLA_TIME = GOBBLIN_SERVICE_DAG_PROCESSING_ENGINE_PREFIX + ConfigurationKeys.GOBBLIN_JOB_START_SLA_TIME;
public static final String JOB_START_SLA_UNITS = GOBBLIN_SERVICE_DAG_PROCESSING_ENGINE_PREFIX + ConfigurationKeys.GOBBLIN_JOB_START_SLA_TIME_UNIT;
public static final long DEFAULT_FLOW_SLA_MILLIS = TimeUnit.HOURS.toMillis(24);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

three thoughts here:

  1. previously the property included the .dagManager segment:
  // Default job start SLA time if configured, measured in minutes. Default is 10 minutes
  // todo - rename "sla" -> "deadline", and move them to DagProcUtils
  public static final String JOB_START_SLA_TIME = DAG_MANAGER_PREFIX + ConfigurationKeys.GOBBLIN_JOB_START_SLA_TIME;

should we continue to accept that legacy prop name so existing flow defs remain b/w compat?

  1. as we introduce a new prop name, let's take the opportunity to align naming and semantics by calling it start.deadline, rather than SLA

  2. given that .dagProcessingEngine is as much impl. detail as .dagManager, it may not belong in the customer-facing name.

@@ -74,7 +74,7 @@ public abstract class BaseFlowToJobSpecCompiler implements SpecCompiler {
// to these data structures.
@Getter
@Setter
protected final Map<URI, TopologySpec> topologySpecMap;
protected Map<URI, TopologySpec> topologySpecMap;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't notice why this wouldn't still be final. is a derived class assigning to it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea it should be.... i thought if there is a @Setter it should not be final. removed the @Setter instead

Comment on lines 41 to 43
public static final String PER_USER_QUOTA = ServiceConfigKeys.GOBBLIN_SERVICE_DAG_PROCESSING_ENGINE_PREFIX + "perUserQuota";
public static final String PER_FLOWGROUP_QUOTA = ServiceConfigKeys.GOBBLIN_SERVICE_DAG_PROCESSING_ENGINE_PREFIX + "perFlowGroupQuota";
public static final String USER_JOB_QUOTA_KEY = ServiceConfigKeys.GOBBLIN_SERVICE_DAG_PROCESSING_ENGINE_PREFIX + "defaultJobQuota";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, should we honor the old prop names for b/w compat?

also, when we devise a new name to deprecate the old one, let's avoid embedding impl-specific identifiers in what should otherwise be a user-level config option w/ clear and steady semantics

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at least in linkedin no one is using them, so i would prefer to remove them. and if someone outside them is using them, we would not know we can make the switch

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this the right class? isn't this more of a "JobSpecTest"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved to DagUtilsTest

@arjun4084346 arjun4084346 force-pushed the byebyeDagManager branch 5 times, most recently from 5194b07 to a8c4203 Compare August 31, 2024 03:49
Copy link
Contributor

@phet phet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great work so far! this looks very close

Copy link
Contributor

@phet phet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

incredible work here on this enormous cleanup, arjun!

@arjun4084346 arjun4084346 merged commit 45ad13e into apache:master Sep 7, 2024
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants