[BUG] Profiling logs against eventlogs on Databricks CPU cluster FAILED on --- NullPointerException: com.nvidia.spark.rapids.tool.profiling.CollectInformation....
#552
Closed
NvTimLiu opened this issue
Sep 9, 2023
· 1 comment
[BUG] Profiling NDS event logs collected on a Databricks CPU cluster fails with the error below.
NOTE: The failure occurs only with the full CPU logs (dbfs:/cicd/azure2-cpu/eventlog, dbfs:/cicd/aws-cpu/eventlog), collected on either AWS or Azure Databricks CPU clusters.
1. Profiling passes with a partial CPU event log, e.g. dbfs:/cicd/cpu/eventlog-2023-09-07--11-00.gz
2. Profiling passes with GPU event logs, e.g. dbfs:/cicd/eventlog
3. The event logs are available on Azure DBFS:/ against the host: 763784504165494.14
export DATABRICKS_HOST=YOUR_HOST
export DATABRICKS_TOKEN=YOUR_TOKEN
export SPARK_HOME=${PWD}/spark-3.2.0-bin-hadoop3.2
RAPIDS_TOOLS_JAR=$PWD/rapids-4-spark-tools_2.12-23.08.1-SNAPSHOT.jar
CLASS=com.nvidia.spark.rapids.tool.profiling.ProfileMain
OUTPUT_DIR=$PWD/output/cpu
EVENT_LOGS=dbfs:/cicd/azure2-cpu/eventlog
java -Xmx20g -cp "${RAPIDS_TOOLS_JAR}:${SPARK_HOME}/jars/*" ${CLASS} \
--csv \
--output-directory file://${OUTPUT_DIR} \
${EVENT_LOGS}
+ java -Xmx20g -cp 'rapids-4-spark-tools_2.12-23.08.1-SNAPSHOT.jar:/databricks/jars/*' com.nvidia.spark.rapids.tool.profiling.ProfileMain --csv --output-directory file:///tmp/cpu /dbfs/cicd/cpu/aws/eventlog
log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
23/09/09 12:47:36 INFO Profiler: Threadpool size is 1
23/09/09 12:47:36 INFO ApplicationInfo: Parsing Event Log: file:/dbfs/cicd/cpu/aws/eventlog
23/09/09 12:47:37 WARN ToolUtils: ClassNotFoundException while parsing an event: DBCEventLoggingListenerMetadata
Profile Tool Progress 0% [> ] (0 succeeded + 0 failed + 0 N/A) / 1
23/09/09 12:47:38 WARN Utils: Your hostname,xxxxx resolves to a loopback address: 127.0.1.1; using xxx.xxx.xxx.xxx instead (on interface eth0)
23/09/09 12:47:38 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
23/09/09 12:48:27 INFO ApplicationInfo: Total number of events parsed: 208205 for file:/dbfs/cicd/cpu/aws/eventlog
23/09/09 12:48:30 INFO EventLogPathProcessor: ============== (index=1) ==============
23/09/09 12:48:30 INFO Profiler: Took 54072ms to process file:/dbfs/cicd/cpu/aws/eventlog
23/09/09 12:48:30 WARN Profiler: Exception occurred processing file: eventlog
java.lang.NullPointerException
at com.nvidia.spark.rapids.tool.profiling.CollectInformation.$anonfun$getAppInfo$1(CollectInformation.scala:39)
at scala.collection.immutable.List.map(List.scala:293)
at com.nvidia.spark.rapids.tool.profiling.CollectInformation.getAppInfo(CollectInformation.scala:37)
at com.nvidia.spark.rapids.tool.profiling.Profiler.com$nvidia$spark$rapids$tool$profiling$Profiler$$processApps(Profiler.scala:315)
at com.nvidia.spark.rapids.tool.profiling.Profiler$ProfileProcessThread$1.run(Profiler.scala:249)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
23/09/09 12:48:30 INFO ToolTextFileWriter: Profile summary output location: file:/tmp/cpu/rapids_4_spark_profile/profile.log
Profile Tool Progress 100% [======================================================] (0 succeeded + 1 failed + 0 N/A) / 1
Profile Tool execution time: 54122ms
process.success.count = 0
process.failure.count = 1
process.NA.count = 0
execution.total.count = 1
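The summary counters printed at the end of the run can be checked by a script, which may help when reproducing this in CI. The following is only a sketch: `failure_count` is a hypothetical helper name, and it assumes the tool's console output has been captured and that the counters keep the `name = value` form shown above.

```shell
# Hypothetical helper: read captured profiler output on stdin and print
# the value of process.failure.count (prints 0 if the counter is absent).
failure_count() {
  awk -F' = ' '
    $1 == "process.failure.count" { print $2 + 0; found = 1 }
    END { if (!found) print 0 }
  '
}
```

For example, after redirecting the tool output to a file (the name `profile.out` is an assumption), `[ "$(failure_count < profile.out)" -eq 0 ] || exit 1` would fail the job whenever any event log fails to profile.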