Support for Spark 2.3/2.4 in Dr.Elephant #683
Comments
@ShubhamGupta29 Just FYI: I have a question: is it because these metrics are not available in SHS 2.3, or is some work needed to see them? Anyway, glad to see progress on Spark 2.3+. |
@mareksimunek I will surely go through your changes.
For your heuristics-related issues, I need to check how you are retrieving and transforming the data. |
I think I tried rebasing onto current master, and with higher versions there were more failing tests, so I stuck with 2.1 and skipped fewer tests :).
|
@mareksimunek the issue you mentioned |
@ShubhamGupta29 Every Spark job; I suspect it's because of this. The Spark History Server version is 2.3.0.2.6.5.0-292 (it's 2.3 with some HDP patches). |
@mareksimunek Hello, I hit the same issue as you: mem/executor/storage info cannot be fetched from SHS once the job ends. Has your problem been solved? Thanks. |
@mareksimunek @xglv1985 I need your help debugging the issue: can you confirm if the value of |
@ShubhamGupta29 yep, it's zero.
Correction: it's even reporting zero. From MR, it's getting memory stats from this setting. Am I right? |
@ShubhamGupta29 yes, I confirmed it too. The mem field is 0 in the response JSON. |
@mareksimunek @xglv1985 thanks for the prompt response. I am able to support Spark 2.3 and will make the changes public soon. I am debugging this |
sure: "memoryMetrics" : { |
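For anyone who wants to reproduce the check discussed above, here is a minimal Python sketch. It assumes the executor list has already been fetched from the Spark History Server REST API (the `/api/v1/applications/[app-id]/executors` endpoint) and parsed from JSON. The field names `memoryUsed`, `memoryMetrics`, `usedOnHeapStorageMemory` and `usedOffHeapStorageMemory` follow Spark's executor summary as I understand it; which field exactly was zero in this thread is not recoverable from the comments, and the sample payload below is invented for illustration.

```python
import json


def zero_memory_executors(executors):
    """Return the ids of executors whose memory fields are all zero or missing.

    `executors` is the parsed JSON list returned by the executors endpoint.
    """
    flagged = []
    for ex in executors:
        metrics = ex.get("memoryMetrics") or {}
        values = [
            ex.get("memoryUsed", 0),
            metrics.get("usedOnHeapStorageMemory", 0),
            metrics.get("usedOffHeapStorageMemory", 0),
        ]
        if all(v == 0 for v in values):
            flagged.append(ex.get("id"))
    return flagged


# Invented sample payload, not taken from a real history server.
sample = json.loads("""
[
  {"id": "driver", "memoryUsed": 0,
   "memoryMetrics": {"usedOnHeapStorageMemory": 0, "usedOffHeapStorageMemory": 0}},
  {"id": "1", "memoryUsed": 1024,
   "memoryMetrics": {"usedOnHeapStorageMemory": 1024, "usedOffHeapStorageMemory": 0}}
]
""")

print(zero_memory_executors(sample))  # prints ['driver']
```

If every executor of a finished job ends up in the returned list, that matches the symptom reported here: the history server answers, but the memory values it reports are zero.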
By the way, @ShubhamGupta29 Spark Configuration
|
@xglv1985 no, this is not normal. Can you tell me which branch or source code you are using? |
@ShubhamGupta29 |
Can you provide the link as |
@ShubhamGupta29 I forked my own dr-elephant from linkedin/dr-elephant master. I only put "SparkFetcher" in my conf xml file, with <use_rest_for_eventlogs>true</use_rest_for_eventlogs> |
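To double-check which parameters the Spark fetcher is actually configured with, something like the following Python sketch can help. The XML layout (`fetchers` / `fetcher` / `applicationtype` / `classname` / `params`) and the class name are assumptions based on how Dr.Elephant's fetcher configuration is usually laid out, not a copy of the file used in this thread; only `use_rest_for_eventlogs` comes from the comment above.

```python
import xml.etree.ElementTree as ET

# Assumed shape of the fetcher configuration; adjust to your own file.
FETCHER_CONF = """
<fetchers>
  <fetcher>
    <applicationtype>spark</applicationtype>
    <classname>com.linkedin.drelephant.spark.fetchers.SparkFetcher</classname>
    <params>
      <use_rest_for_eventlogs>true</use_rest_for_eventlogs>
    </params>
  </fetcher>
</fetchers>
"""


def spark_fetcher_params(xml_text):
    """Return the params of the first Spark fetcher as a dict, or None if absent."""
    root = ET.fromstring(xml_text)
    for fetcher in root.iter("fetcher"):
        app_type = (fetcher.findtext("applicationtype") or "").strip().lower()
        if app_type == "spark":
            params = fetcher.find("params")
            if params is None:
                return {}
            return {p.tag: (p.text or "").strip() for p in params}
    return None


print(spark_fetcher_params(FETCHER_CONF))  # prints {'use_rest_for_eventlogs': 'true'}
```

Reading the file from disk instead of the inline string only needs `open(path).read()` passed to the same function.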
@xglv1985 if you are using current master, you can't see any metrics from Spark 2.3+. That's why there is ongoing work from @ShubhamGupta29 to support this version.
Thanks for the update, @ShubhamGupta29. They are already included in my post. |
@mareksimunek Thanks very much. I see the same problem as mine in the link you gave. |
@mareksimunek @xglv1985, I have made the changes for Spark 2.3 (these are the foundation changes; I will fix the tests and do other cleanups in some time). If possible, can you try this personal branch? It has the changes for Spark 2.3. |
@ShubhamGupta29 nice, the |
@mareksimunek I am working on it. After going through Spark's code, I have some idea of why this metric is not getting populated. For now I am testing the changes and will soon add them to the branch; I am also trying to support Spark 2.4. |
OK, I saw it yesterday and I will fill in the survey today.
On 04/30/2020 10:08, Shubham Gupta wrote:
@mareksimunek working on the same, after going through Spark's code got some idea of why this metric is not getting populated. For now, testing the changes and soon add those to the branch and also trying to support Spark 2.4 too.
@mareksimunek and @xglv1985, can you fill in the survey in #685? It would help us make Dr.Elephant more friendly to the open-source community.
@xglv1985 did you get a chance to use the changes done for Spark 2.3? Feedback on the changes will make it easier to start merging them into the master branch for users' convenience. |
Sure, I will try your personal branch and will get back to you with feedback between May 1st and 5th.
At 2020-04-30 14:37:06, "Shubham Gupta" <[email protected]> wrote:
@xglv1985 did you get a chance to use the changes done for Spark2.3? Feedback for the changes will make it easy to start the effort for merging the changes to the master branch for users' ease.
@mareksimunek and @xglv1985, I have made some more changes for Spark 2.3 support; kindly try this branch whenever you have time. Also, for memory heuristics, there is a change needed in the Spark conf: add |
@ShubhamGupta29
|
@mareksimunek thanks for the reply and testing out the provided version.
Also, let me know of any other issues you are facing or any suggestions you have for Dr.Elephant. I hope Dr.Elephant is proving useful for you and your team. |
So far it seems to be working like a charm. I am trying to push it through in our team (it's now running on a small testing cluster), and with working Spark metrics it will be much easier to get approval to work on it. Thanks for the progress. Question: are you using one Dr.Elephant installation per cluster, or do you have one Dr.Elephant analyzing several clusters? |
Currently, Dr.Elephant allows the analysis of jobs from only a single RM (single cluster). |
@ShubhamGupta29 First, sorry for the late response. Thanks for your branch feature_spark2.3; I now have it up and running. This is my screen capture: |
I have found the reason why the details disappeared. The "feature_2.3" branch uses "org.avaje.ebeanorm.avaje-ebeanorm-3.2.4.jar" and "org.avaje.ebeanorm.avaje-ebeanorm-agent-3.2.2.jar", which leads to the details going missing. I replaced these two jars with |
@xglv1985 did you replace this in the ClassPath? |
Yes, I did. |
@ShubhamGupta29 Hi, did you have time to look into PeakExecutionMemory? |
@mareksimunek I didn't get a chance to look into it, but I will certainly do so over the weekend. |
@ShubhamGupta29 I know you are probably busy, but I hope you will find a way to look at it :). |
Hi @mareksimunek, sorry, I was caught up in some other tasks. I am working on an approximate value for PeakExecutionMemory because, after a lot of digging, I found that there is no way of getting this value without making changes to the Spark source code. I will possibly push the changes in the coming week. Also, let me know if Dr.Elephant's support for Spark 2.3 is working fine. |
@ShubhamGupta29 Support for Spark 2.3 works fine :)). |
@ShubhamGupta29
I have a few questions :
Thanks a lot in advance for your time ! |
Hi @RaphaelDucay ,
|
@ShubhamGupta29 Thanks a lot for the feedback!
I will keep you updated ! |
@ShubhamGupta29 OK, so we made it for Spark 1.6.3.
[warn] /dr-elephant-sources/app/org/apache/spark/deploy/history/SparkDataCollection.scala:124: abstract type pattern T is unchecked since it is eliminated by erasure
Do you have an idea of how to fix this? Thanks in advance! |
hi @ShubhamGupta29 , @xglv1985 |
hi @ShubhamGupta29 , @xglv1985 , |
Hi @shagneet330, the fix is provided in the above comment. But I would suggest using the latest Ember UI, which will be available if your compilation went well. You can access the new UI by appending new# to the Dr.Elephant endpoint, e.g. http://hostname:8080/new#. If you can see a UI like the one below, then there should be no issue. |
@ShubhamGupta29 Tried with http://hostname:8080/new# but this doesn't seem to load. Are these changes available in "feature_2.3" branch? |
It is available in ` "npm installation found, we'll compile with the new user interface" |
@ShubhamGupta29 could you help add the Spark application name to the new UI? |
@ShubhamGupta29, below is the error.
elephant_spark23/dr-elephant/app/org/apache/spark/status/CustomAppStatusListener.scala:628: value getPartitions is not a member of org.apache.spark.status.LiveRDD |
Currently, Dr.Elephant supports Spark only up to 2.2.3. We need to support newer versions of Spark (at least 2.3 and 2.4). This needs several changes; I will update the issue as the work proceeds.