
Merge dev into main
Signed-off-by: spark-rapids automation <[email protected]>
nvauto committed Nov 4, 2024
2 parents 1e2e7c1 + ca6cac6 commit 9742601
Showing 150 changed files with 4,892 additions and 1,352 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/configuration.json
@@ -6,7 +6,7 @@
},
{
"title": "### Core",
-"labels": ["tools"]
+"labels": ["core_tools"]
},
{
"title": "### Miscellaneous",
4 changes: 2 additions & 2 deletions .pre-commit-config.yaml
@@ -31,8 +31,8 @@ repos:
rev: v4.0.1
hooks:
- id: check-added-large-files
-name: Check for file over 2.0MiB
-args: ['--maxkb=2000', '--enforce-all']
+name: Check for file over 4.0MiB
+args: ['--maxkb=4000', '--enforce-all']
- id: trailing-whitespace
name: trim trailing white spaces preserving md files
args: ['--markdown-linebreak-ext=md']
2 changes: 2 additions & 0 deletions .pylintrc
@@ -56,6 +56,8 @@ disable=
# R0913: Too many arguments
too-many-arguments,
+# R0917: Too many positional arguments
+too-many-positional-arguments,
# R0914: Too many local variables
too-many-locals,
# R0801: Similar lines in 2 files
duplicate-code,
11 changes: 11 additions & 0 deletions core/README.md
@@ -32,6 +32,17 @@ mvn -Dbuildver=351 clean package

Run `mvn help:all-profiles` to list supported Spark versions.

### Running tests

Unit tests run by default during the build unless they are explicitly skipped with `-DskipTests`.

To run an individual test suite, specify the `-Dsuites` option:

```bash
mvn test -Dsuites=com.nvidia.spark.rapids.tool.qualification.QualificationSuite
```


### Setting up an Integrated Development Environment

Before proceeding with importing spark-rapids-tools into IDEA or switching to a different Spark release
2 changes: 1 addition & 1 deletion core/pom.xml
@@ -23,7 +23,7 @@
<artifactId>rapids-4-spark-tools_2.12</artifactId>
<name>RAPIDS Accelerator for Apache Spark tools</name>
<description>RAPIDS Accelerator for Apache Spark tools</description>
-<version>24.08.2</version>
+<version>24.08.3-SNAPSHOT</version>
<packaging>jar</packaging>
<url>http://github.com/NVIDIA/spark-rapids-tools</url>

@@ -304,3 +304,4 @@ MapFromArrays,1.5
DecimalSum,1.5
MaxBy,1.5
MinBy,1.5
ArrayJoin,1.5
@@ -304,3 +304,4 @@ MapFromArrays,1.5
DecimalSum,1.5
MaxBy,1.5
MinBy,1.5
ArrayJoin,1.5
@@ -292,3 +292,4 @@ MapFromArrays,1.5
DecimalSum,1.5
MaxBy,1.5
MinBy,1.5
ArrayJoin,1.5
1 change: 1 addition & 0 deletions core/src/main/resources/operatorsScore-dataproc-gke-l4.csv
@@ -286,3 +286,4 @@ MapFromArrays,1.5
DecimalSum,1.5
MaxBy,1.5
MinBy,1.5
ArrayJoin,1.5
1 change: 1 addition & 0 deletions core/src/main/resources/operatorsScore-dataproc-gke-t4.csv
@@ -286,3 +286,4 @@ MapFromArrays,1.5
DecimalSum,1.5
MaxBy,1.5
MinBy,1.5
ArrayJoin,1.5
1 change: 1 addition & 0 deletions core/src/main/resources/operatorsScore-dataproc-l4.csv
@@ -292,3 +292,4 @@ MapFromArrays,1.5
DecimalSum,1.5
MaxBy,1.5
MinBy,1.5
ArrayJoin,1.5
@@ -286,3 +286,4 @@ MapFromArrays,1.5
DecimalSum,1.5
MaxBy,1.5
MinBy,1.5
ArrayJoin,1.5
1 change: 1 addition & 0 deletions core/src/main/resources/operatorsScore-dataproc-t4.csv
@@ -292,3 +292,4 @@ MapFromArrays,1.5
DecimalSum,1.5
MaxBy,1.5
MinBy,1.5
ArrayJoin,1.5
1 change: 1 addition & 0 deletions core/src/main/resources/operatorsScore-emr-a10.csv
@@ -292,3 +292,4 @@ MapFromArrays,1.5
DecimalSum,1.5
MaxBy,1.5
MinBy,1.5
ArrayJoin,1.5
1 change: 1 addition & 0 deletions core/src/main/resources/operatorsScore-emr-a10G.csv
@@ -292,3 +292,4 @@ MapFromArrays,1.5
DecimalSum,1.5
MaxBy,1.5
MinBy,1.5
ArrayJoin,1.5
1 change: 1 addition & 0 deletions core/src/main/resources/operatorsScore-emr-t4.csv
@@ -292,3 +292,4 @@ MapFromArrays,1.5
DecimalSum,1.5
MaxBy,1.5
MinBy,1.5
ArrayJoin,1.5
1 change: 1 addition & 0 deletions core/src/main/resources/operatorsScore-onprem-a100.csv
@@ -304,3 +304,4 @@ MapFromArrays,1.5
DecimalSum,1.5
MaxBy,1.5
MinBy,1.5
ArrayJoin,1.5
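The `operatorsScore-*.csv` files above each gain an `ArrayJoin,1.5` row, assigning the new operator a per-platform speedup score. As a rough sketch of how such a file could be consumed (the header name `CPUOperator,Score` and the loader function are assumptions for illustration, not the tool's actual API):

```python
import csv
import io

# Hypothetical sample mirroring the tail of the operatorsScore-*.csv files above;
# the real files are assumed to start with a "CPUOperator,Score" header row.
SAMPLE = """CPUOperator,Score
DecimalSum,1.5
MaxBy,1.5
MinBy,1.5
ArrayJoin,1.5
"""

def load_operator_scores(text):
    """Parse an operator-score CSV into a dict of operator name -> float score."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["CPUOperator"]: float(row["Score"]) for row in reader}

scores = load_operator_scores(SAMPLE)
print(scores["ArrayJoin"])  # 1.5
```

A score of 1.5 means the operator is estimated to run 1.5x faster on GPU than on CPU for that platform, which is why the same row is added once per platform file.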
106 changes: 106 additions & 0 deletions core/src/main/resources/photonOperatorMappings/databricks-13_3.json
@@ -0,0 +1,106 @@
{
"//comments": [
"This file contains the mapping between Photon operators and Spark operators generated from Databricks Runtime 13.3",
"Some entries have one-to-many mappings. For example, 'PhotonAgg' can map to either 'HashAggregate', 'SortAggregate', or 'ObjectHashAggregate'.",
"Currently, only the first mapping in the list is used.",
"This limitation exists because we cannot differentiate between these operators in the SparkPlan.",
"TODO: Create separate mapping file for different Photon/Databricks versions"
],
"PhotonAdapter": [
"Scan"
],
"PhotonScan": [
"Scan"
],
"PhotonSubqueryBroadcast": [
"SubqueryBroadcast"
],
"PhotonResultStage": [
"WholeStageCodegen"
],
"PhotonUnionShuffleExchangeSink": [
"Union"
],
"PhotonUnionShuffleMapStage": [
"WholeStageCodegen"
],
"PhotonTopK": [
"TakeOrderedAndProject"
],
"PhotonAgg": [
"HashAggregate",
"SortAggregate",
"ObjectHashAggregate"
],
"PhotonBroadcastExchange": [
"BroadcastExchange"
],
"PhotonBroadcastHashJoin": [
"BroadcastHashJoin"
],
"PhotonBroadcastNestedLoopJoin": [
"BroadcastNestedLoopJoin"
],
"PhotonExpand": [
"Expand"
],
"PhotonFilter": [
"Filter"
],
"PhotonGenerate": [
"Generate"
],
"PhotonGlobalLimit": [
"GlobalLimit"
],
"PhotonGroupingAgg": [
"HashAggregate",
"SortAggregate",
"ObjectHashAggregate"
],
"PhotonGroupingAggWithRollup": [
"HashAggregate"
],
"PhotonHashJoin": [
"BroadcastHashJoin"
],
"PhotonLocalLimit": [
"LocalLimit"
],
"PhotonProject": [
"Project"
],
"PhotonRowToColumnar": [
"RowToColumnar"
],
"PhotonShuffledHashJoin": [
"SortMergeJoin"
],
"PhotonShuffleExchangeSink": [
"Exchange",
"StageBoundary",
"BroadcastExchange"
],
"PhotonShuffleExchangeSource": [
"Exchange",
"StageBoundary",
"AQEShuffleRead",
"ShuffleQueryStage"
],
"PhotonShuffleHashJoin": [
"ShuffledHashJoin"
],
"PhotonShuffleMapStage": [
"WholeStageCodegen"
],
"PhotonSort": [
"Sort"
],
"PhotonUnion": [
"Union"
],
"PhotonWindow": [
"Window",
"RunningWindowFunction"
]
}
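Per the file's own comment, when a Photon operator maps to several Spark operators only the first entry is used. A minimal sketch of that resolution rule, using a hand-copied excerpt of the mapping above (the function name is hypothetical, not the tool's real API):

```python
import json

# Excerpt of core/src/main/resources/photonOperatorMappings/databricks-13_3.json
MAPPING_JSON = """
{
  "PhotonAgg": ["HashAggregate", "SortAggregate", "ObjectHashAggregate"],
  "PhotonShuffledHashJoin": ["SortMergeJoin"],
  "PhotonFilter": ["Filter"]
}
"""

def to_spark_operator(photon_op, mapping):
    """Resolve a Photon operator name to its Spark equivalent.

    One-to-many entries keep only the first candidate, since the
    operators cannot be told apart in the SparkPlan.
    """
    candidates = mapping.get(photon_op)
    return candidates[0] if candidates else None

mapping = json.loads(MAPPING_JSON)
print(to_spark_operator("PhotonAgg", mapping))        # HashAggregate
print(to_spark_operator("PhotonUnknownOp", mapping))  # None
```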
4 changes: 4 additions & 0 deletions core/src/main/resources/supportedExprs.csv
@@ -42,6 +42,10 @@ ArrayFilter,S,`filter`,None,project,result,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,N
ArrayIntersect,S,`array_intersect`,None,project,array1,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA
ArrayIntersect,S,`array_intersect`,None,project,array2,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA
ArrayIntersect,S,`array_intersect`,None,project,result,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA
ArrayJoin,S,`array_join`,None,project,array,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,S,NA,NA,NA,NA,NA
ArrayJoin,S,`array_join`,None,project,delimiter,NA,NA,NA,NA,NA,NA,NA,NA,NA,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA
ArrayJoin,S,`array_join`,None,project,nullReplacement,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA
ArrayJoin,S,`array_join`,None,project,result,NA,NA,NA,NA,NA,NA,NA,NA,NA,S,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA
ArrayMax,S,`array_max`,None,project,input,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA
ArrayMax,S,`array_max`,None,project,result,S,S,S,S,S,S,S,S,PS,S,S,S,NS,NS,NS,NA,NS,NS,NA,NA
ArrayMin,S,`array_min`,None,project,input,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,PS,NA,NA,NA,NA,NA
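Each `supportedExprs.csv` row lists an expression, its SQL function, an execution context, a parameter slot, and one support flag per Spark data type. A small sketch of reading one of the new `ArrayJoin` rows, assuming the usual spark-rapids flag convention (S = supported, PS = partially supported, NS = not supported, NA = not applicable) and that the type flags begin at the seventh column:

```python
# One of the ArrayJoin rows added above (array parameter of `array_join`).
ROW = ("ArrayJoin,S,`array_join`,None,project,array,"
       "NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,S,NA,NA,NA,NA,NA")

fields = ROW.split(",")
expr_name, expr_support, sql_func = fields[0], fields[1], fields[2]
type_flags = fields[6:]  # assumed: one flag per Spark data type, in file order

print(expr_name, expr_support, sql_func)
# Positions of fully supported type slots in this row:
print([i for i, flag in enumerate(type_flags) if flag == "S"])
```

For the `delimiter` and `result` slots the string column carries `S`, while `nullReplacement` is only `PS`, so a plan using that optional argument may be flagged as partially supported.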
@@ -17,6 +17,7 @@
package com.nvidia.spark.rapids

import org.apache.spark.scheduler.SparkListenerEvent
+import org.apache.spark.sql.rapids.tool.annotation.ToolsReflection


/**
@@ -30,4 +31,9 @@ case class SparkRapidsBuildInfoEvent(
sparkRapidsJniBuildInfo: Map[String, String],
cudfBuildInfo: Map[String, String],
sparkRapidsPrivateBuildInfo: Map[String, String]
-) extends SparkListenerEvent
+) extends SparkListenerEvent {
+  @ToolsReflection("BD-3.2.1", "Ignore")
+  override val eventTime: Long = 0
+  @ToolsReflection("BD-3.2.1", "Ignore")
+  override val eventType: String = ""
+}
