How and where to change the spark history server port number in dr-elephant #686

selvamraman · 2020-04-29T11:38:38Z

Spark 2.4.4 version
emr release 5.29

Due to a constraint, I cannot use port 18080 for the Spark history server. The history server listens on a different port (for example, 18480).

I am getting the error message below in the dr-elephant log.

04-29-2020 05:31:54 ERROR [ForkJoinPool-1-worker-29] com.linkedin.drelephant.spark.fetchers.SparkRestClient : error reading applicationInfo http://master-dns-name:18080/api/v1/applications/application_1588136450759_0001. Exception Message = java.net.SocketTimeoutException: connect timed out

I followed this guide to install Dr. Elephant on EMR: https://aws.amazon.com/blogs/big-data/tune-hadoop-and-spark-performance-with-dr-elephant-and-sparklens-on-amazon-emr/

ShubhamGupta29 · 2020-04-29T11:47:13Z

Hi @selvamraman,
You can change the port in the config named spark.yarn.historyServer.address, which is generally set in the spark-defaults.conf file in Spark's conf folder.
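As a rough illustration of what that setting looks like, the sketch below parses an assumed spark-defaults.conf snippet with java.util.Properties, which accepts both whitespace and = as separators. The class name, the sample content, and the host and port values are placeholders for this example, not taken from Dr. Elephant or Spark.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Illustrative only: this class is hypothetical and not part of Dr. Elephant or Spark.
final class HistoryServerAddressExample {
    // Assumed file content; the host name and port are placeholders.
    static final String SAMPLE_CONF =
        "spark.eventLog.enabled            true\n"
      + "spark.yarn.historyServer.address  master-dns-name:18480\n";

    /** Returns the configured history server address, or null if the key is absent. */
    static String historyServerAddress(String confText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(confText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String value = props.getProperty("spark.yarn.historyServer.address");
        return value == null ? null : value.trim();
    }

    public static void main(String[] args) {
        // Prints the address that a client would connect to.
        System.out.println(historyServerAddress(SAMPLE_CONF));
    }
}
```

If the printed address still shows port 18080, the edit probably went into a different spark-defaults.conf than the one on the configured Spark conf path.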

@selvamraman if possible, can you provide some insight into your usage of Dr. Elephant, such as:

Which fetcher are you using for Spark applications?
Which versions of Spark and Hadoop are you using?

This information would help us resolve your issues.

selvamraman · 2020-05-01T18:19:51Z

@ShubhamGupta29

AWS EMR cluster details:
Release label: emr-5.29.0
Hadoop distribution: Amazon 2.8.5
Applications: Spark 2.4.4, Hive 2.3.6

FetcherConf.xml:

<fetcher>
  <applicationtype>spark</applicationtype>
  <classname>com.linkedin.drelephant.spark.fetchers.SparkFetcher</classname>
  <params>
    <use_rest_for_eventlogs>true</use_rest_for_eventlogs>
    <should_process_logs_locally>true</should_process_logs_locally>
  </params>
</fetcher>

ShubhamGupta29 · 2020-05-05T06:00:23Z

@selvamraman I hope the reply resolved your issue. If so, can I close this issue?

selvamraman · 2020-05-05T19:51:31Z

@ShubhamGupta29
Thanks for the help. I have changed the port number in the conf, and Dr. Elephant now tries to fetch the information from port 18480. However, I am still facing a problem:

05-05-2020 19:30:53 ERROR [ForkJoinPool-1-worker-5] com.linkedin.drelephant.spark.fetchers.SparkRestClient : error reading applicationInfo http://dns_name:18480/api/v1/applications/application_1588700272512_0001. Exception Message = java.net.SocketException: Unexpected end of file from server

The Spark history server is not listening on HTTP; it uses HTTPS:
https://dns_name:18480/api/v1/applications/application_1588700272512_0001.

How do I specify HTTPS for the history server in the config?
Is it something I can change in spark-defaults.conf, like this?
spark.yarn.historyServer.address=https://dns-name:18480

Also, which environment variables do I need to set up before starting on EMR?

spark conf
spark home
etc

ShubhamGupta29 · 2020-05-06T05:39:01Z

Hi @selvamraman,
For HTTPS, if your spark.yarn.historyServer.address uses http, change it to https. You also need to make some changes to Dr. Elephant's SparkRestClient.scala class: there is an else condition in the computation of an attribute named historyServerUri.

Environment variables needed before setup:
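The kind of change involved might look like the following minimal sketch, written in Java for illustration. It is not Dr. Elephant's actual code: the class and method names are hypothetical, and it simply assumes the desired behavior is to keep an explicit http:// or https:// prefix and fall back to http when the address has no scheme. The /api/v1 path matches the REST endpoint shown in the error logs.

```java
import java.net.URI;

// Hypothetical helper for illustration; SparkRestClient.scala may be structured differently.
final class HistoryServerUri {
    private HistoryServerUri() {}

    /** Builds the REST API root from a spark.yarn.historyServer.address value. */
    static URI apiRoot(String address) {
        String trimmed = address.trim();
        // Keep an explicit scheme if one is present; assume plain http otherwise.
        String withScheme = trimmed.contains("://") ? trimmed : "http://" + trimmed;
        URI base = URI.create(withScheme);
        if (base.getHost() == null || base.getPort() == -1) {
            throw new IllegalArgumentException("expected [scheme://]host:port but got: " + address);
        }
        return URI.create(base.getScheme() + "://" + base.getHost() + ":" + base.getPort() + "/api/v1");
    }

    public static void main(String[] args) {
        // Host names and ports below are placeholders.
        System.out.println(apiRoot("dns-name:18480"));
        System.out.println(apiRoot("https://dns-name:18480"));
    }
}
```

Deriving the scheme from the configured value avoids hard-coding http in the client, which is the behavior the HTTPS setup needs.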

SPARK_HOME
HADOOP_HOME
HADOOP_CONF_DIR

selvamraman · 2020-05-13T08:37:25Z

I will let you know the final result. Thank you so much, @ShubhamGupta29, for your prompt response.

ProbShin · 2020-06-30T21:46:10Z

Normally, the Hadoop HTTPS address and its port number are predefined in the Hadoop config. So for Dr. Elephant, it should be enough to modify the fetcher's source code to read the correct config name and to use the https scheme when SSL is enabled.

I am having a similar issue. My cluster only accepts SSL connections, and my Hadoop config sets jobhistory.http.policy to HTTPS_ONLY, so I had to modify the source code manually. For example, in my Tez fetcher, TezFetcher.java, I changed TIMELINE_SERVER_URL to use "yarn.timeline-service.webapp.https.address" instead of "yarn.timeline-service.webapp.address", and I modified URLFactory(String hserverAddr) to use the "https" scheme.
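A minimal sketch of that selection logic follows, assuming the Hadoop settings are available as a plain map and the HTTP policy value is passed in as a string. The class and method names are hypothetical, the host and ports are placeholders, and this is not the actual TezFetcher code.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: not the actual TezFetcher or URLFactory code.
final class TimelineServerUrl {
    static final String HTTP_ADDRESS_KEY = "yarn.timeline-service.webapp.address";
    static final String HTTPS_ADDRESS_KEY = "yarn.timeline-service.webapp.https.address";

    private TimelineServerUrl() {}

    /** Picks the address key and URL scheme from the policy value (assumed "HTTPS_ONLY" or "HTTP_ONLY"). */
    static String baseUrl(Map<String, String> conf, String httpPolicy) {
        boolean https = "HTTPS_ONLY".equals(httpPolicy);
        String key = https ? HTTPS_ADDRESS_KEY : HTTP_ADDRESS_KEY;
        String address = conf.get(key);
        if (address == null || address.isEmpty()) {
            throw new IllegalStateException("Missing Hadoop config: " + key);
        }
        return (https ? "https://" : "http://") + address;
    }

    public static void main(String[] args) {
        // Placeholder values standing in for the cluster's Hadoop config.
        Map<String, String> conf = new HashMap<>();
        conf.put(HTTP_ADDRESS_KEY, "timeline-host:8188");
        conf.put(HTTPS_ADDRESS_KEY, "timeline-host:8190");
        System.out.println(baseUrl(conf, "HTTPS_ONLY"));
    }
}
```

Reading the policy from config rather than hard-coding https keeps the same build usable on clusters that still serve plain HTTP.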

P.S.

However, in rare cases you may get an error like this at startup:

javax.net.ssl.SSLHandshakeException: java.security.cert.CertificateException: No name matching xxx.xxx.xxx.xxx found.

This is most likely caused by an incorrect authentication setting on the cluster. Fixing the incorrect setting is recommended, but for simple test purposes you can also try adding

static {
    // WORKAROUND for testing only. TO BE REMOVED.
    // Skips hostname verification for one known host.
    javax.net.ssl.HttpsURLConnection.setDefaultHostnameVerifier(
        new javax.net.ssl.HostnameVerifier() {
            @Override
            public boolean verify(String hostname, javax.net.ssl.SSLSession sslSession) {
                return hostname.equals("mytargethostname");
            }
        });
}

to temporarily work around the issue. Check here for details.

ShubhamGupta29 self-assigned this Apr 29, 2020
