-
Notifications
You must be signed in to change notification settings - Fork 9.1k
YARN-11836. Fixed AM log fetching with YARN CLI #7813
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: trunk
Are you sure you want to change the base?
Conversation
Change-Id: I04678a04503e01e67f1075f3fc543887cf80ac47
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @p-szucs for identify the problem and try to fix it.
I think the basic idea is good, i just had some NIT code style comment
int haIndex = 0; | ||
int activeRMIndex = 0; | ||
int rmCount = 1; | ||
|
||
if (HAUtil.isHAEnabled(conf)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we try some early return pattern?
if (!HAUtil.isHAEnabled(conf)) {
// If i understand right here we have just 1 RM so we can just try with that
String rmAddress = getRMWebAppURLWithScheme(conf, 0);
return func.apply(rmAddress, arg);
}
// try RMHAUtils.findActiveRMHAId(conf)
// if null do the iteration
// finally apply and return
Maybe will be a bit easyer to read cause now we have the
int activeRMIndex = 0;
int rmCount = 1;
code what related to non HA, than HA code, than some mix.
if (HAUtil.isHAEnabled(conf)) { | ||
ArrayList<String> rmIds = (ArrayList<String>) HAUtil.getRMHAIds(conf); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we cast to simple List interface here? Maybe will be a bit more robust.
String rmAddress = getRMWebAppURLWithScheme(conf, i); | ||
return func.apply(rmAddress, arg); | ||
} catch (Exception e) { | ||
// Ignore and try next RM if there are any |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe some trace level LOG can be use full
Change-Id: I04678a04503e01e67f1075f3fc543887cf80ac47
Description of PR
YARN-10767 introduced a bug, where YARN Logs CLI is unable to fetch the AM logs using "-am" option if the user is not in the Admin ACLs.
This commit changed the logic for requesting the AM logs and it fetches the "id" of the active RM from the HA service, and requesting the logs from there.
Reproduction:
The issue can be reproduced by calling "yarn logs -applicationId ‹appId› -am 1" command with a user who has not got admin access.
In the RM logs of the test cluster I can see the following error, which states that the user doesn't have permission to call 'getServiceState':
Currently in WebAppUtils's execOnActiveRM method we throw an exception when RMHAUtils.findActiveRMHAId returns null here, stating that "No active RM is available". However that method will return null if the permissions are missing to check the service states. I think at this point we could fall back to the original code here, and try to find the active RM by iterating through them.
The issue only happens in HA mode, and only if we use "-am" option, without this option the AM logs can be retrieved together with the aggregated logs.
How was this patch tested?
Tested manually on a cluster by fetching application logs
The aggregated logs and AM logs could be fetched in every cases.
When I stopped the first RM, and the user could not check the active one, we found it when the connection timed out after 30 retries. With admin user it could find the active one for the first time
For code changes:
LICENSE
,LICENSE-binary
,NOTICE-binary
files?