
[BUG] databricks-aws/databricks-azure COMMAND does not work with latest Databricks v0.2.x #609

Closed
NvTimLiu opened this issue Oct 6, 2023 · 2 comments · Fixed by #614
Labels: bug (Something isn't working)

Comments

NvTimLiu (Collaborator) commented on Oct 6, 2023

Describe the bug
Following the README (https://github.com/NVIDIA/spark-rapids-tools/blob/main/user_tools/docs/user-tools-databricks-aws.md), run the rapids tools command:

spark_rapids_user_tools databricks-aws profiling --eventlogs /tmp/eventlogs --gpu_cluster 'test-aws-12.2' --tools_jar ./rapids-4-spark-tools_2.12-23.08.2-SNAPSHOT.jar

The command fails with the following error:

ERROR root: Profiling. Raised an error in phase [Process-Arguments]
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/spark_rapids_pytools/rapids/rapids_tool.py", line 108, in wrapper
    func_cb(self, *args, **kwargs)  # pylint: disable=not-callable
  File "/usr/local/lib/python3.8/dist-packages/spark_rapids_pytools/rapids/rapids_tool.py", line 152, in _process_arguments
    self._process_custom_args()
  File "/usr/local/lib/python3.8/dist-packages/spark_rapids_pytools/rapids/profiling.py", line 62, in _process_custom_args
    self._process_offline_cluster_args()
  File "/usr/local/lib/python3.8/dist-packages/spark_rapids_pytools/rapids/profiling.py", line 69, in _process_offline_cluster_args
    if self._process_gpu_cluster_args(offline_cluster_opts):
  File "/usr/local/lib/python3.8/dist-packages/spark_rapids_pytools/rapids/profiling.py", line 81, in _process_gpu_cluster_args
    gpu_cluster_obj = self._create_migration_cluster('GPU', gpu_cluster_arg)
  File "/usr/local/lib/python3.8/dist-packages/spark_rapids_pytools/rapids/rapids_tool.py", line 629, in _create_migration_cluster
    cluster_obj = self.ctxt.platform.connect_cluster_by_name(cluster_arg)
  File "/usr/local/lib/python3.8/dist-packages/spark_rapids_pytools/cloud_api/sp_types.py", line 794, in connect_cluster_by_name
    cluster_props = self.cli.pull_cluster_props_by_args(args={'cluster': cluster})
  File "/usr/local/lib/python3.8/dist-packages/spark_rapids_pytools/cloud_api/databricks_aws.py", line 119, in pull_cluster_props_by_args
    cluster_described = self.run_sys_cmd(get_cluster_cmd)
  File "/usr/local/lib/python3.8/dist-packages/spark_rapids_pytools/cloud_api/sp_types.py", line 464, in run_sys_cmd
    return sys_cmd.exec()
  File "/usr/local/lib/python3.8/dist-packages/spark_rapids_pytools/common/utilities.py", line 363, in exec
    raise RuntimeError(f'{cmd_err_msg}')
RuntimeError: Error invoking CMD <databricks clusters get --profile DEFAULT --cluster-name tim-user-tools-profiling-aws-
	| Error: unknown flag: --cluster-name
	| 
	| Usage:
	|   databricks clusters get CLUSTER_ID [flags]
	| 
	| Flags:
	|   -h, --help               help for get
	|       --no-wait            do not wait to reach RUNNING state
	|       --timeout duration   maximum amount of time to reach RUNNING state (default 20m0s)
	| 
	| Global Flags:
	|       --log-file file            file to write logs to (default stderr)
	|       --log-format type          log output format (text or json) (default text)
	|       --log-level format         log level (default disabled)
	|   -o, --output type              output type: text or json (default text)
	|   -p, --profile string           ~/.databrickscfg profile
	|       --progress-format format   format for progress logs (append, inplace, json) (default default)
	|   -t, --target string            bundle target to use (if applicable)
	| 

Processing Completed!
script returned exit code 1

The latest Databricks CLI (v0.2.x) is not backward compatible with v0.1.x: v0.2.x accepts only a positional argument (databricks clusters get CLUSTER_ID), whereas v0.1.x accepted --cluster-id/--cluster-name flags (databricks clusters get --cluster-id/--cluster-name CLUSTER_ID/CLUSTER_NAME).

---------------------------------------- Databricks CLI v0.1.x --------------------------------------------------
root@b21879072023:/# databricks --version
Version 0.17.6

root@b21879072023:/# databricks clusters get -h
Usage: databricks clusters get [OPTIONS]

  Retrieves metadata about a cluster.

Options:
  --cluster-id CLUSTER_ID    Can be found in the URL at https://*.cloud.databr
                             icks.com/#/setting/clusters/$CLUSTER_ID/configura
                             tion.
  --cluster-name CLUSTER_ID  Can be found in the URL at https://*.cloud.databr
                             icks.com/#/setting/clusters/$CLUSTER_ID/configura
                             tion.
  --debug                    Debug Mode. Shows full stack trace on error.
  --profile TEXT             CLI connection profile to use. The default
                             profile is "DEFAULT".
  -h, --help                 Show this message and exit.
root@b21879072023:/# 


------------------------------------------------- Databricks CLI v0.2.x -----------------------------------------
~$ databricks --version
Databricks CLI v0.207.0
~$ databricks clusters get -h
Get cluster info.
  
  Retrieves the information for a cluster given its identifier. Clusters can be
  described while they are running, or up to 60 days after they are terminated.

Usage:
  databricks clusters get CLUSTER_ID [flags]

Flags:
  -h, --help               help for get
      --no-wait            do not wait to reach RUNNING state
      --timeout duration   maximum amount of time to reach RUNNING state (default 20m0s)

Global Flags:
      --log-file file            file to write logs to (default stderr)
      --log-format type          log output format (text or json) (default text)
      --log-level format         log level (default disabled)
  -o, --output type              output type: text or json (default text)
  -p, --profile string           ~/.databrickscfg profile
      --progress-format format   format for progress logs (append, inplace, json) (default default)
  -t, --target string            bundle target to use (if applicable)

https://github.com/NVIDIA/spark-rapids-tools/blob/dev/user_tools/src/spark_rapids_pytools/cloud_api/databricks_aws.py#L116-L120
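One possible direction for a fix (a hypothetical sketch only; `build_get_cluster_cmd` is not an existing function in the repo, and the actual fix landed in #614) is to inspect the `databricks --version` output and choose the command syntax accordingly:

```python
import re

def build_get_cluster_cmd(version_output: str, cluster_name: str,
                          profile: str = "DEFAULT") -> list:
    """Build a cluster-lookup command compatible with the detected CLI.

    Parses version strings such as "Version 0.17.6" (legacy v0.1.x) or
    "Databricks CLI v0.207.0" (new v0.2.x).
    """
    match = re.search(r"v?(\d+)\.(\d+)", version_output)
    major, minor = (int(match.group(1)), int(match.group(2))) if match else (0, 0)
    if (major, minor) >= (0, 200):
        # v0.2.x: 'clusters get' takes only a positional CLUSTER_ID, so a
        # cluster *name* must first be resolved to an ID, e.g. by listing
        # clusters in JSON and matching on cluster_name.
        return ["databricks", "clusters", "list",
                "-o", "json", "--profile", profile]
    # v0.1.x: the legacy flag-based syntax accepts the name directly.
    return ["databricks", "clusters", "get",
            "--cluster-name", cluster_name, "--profile", profile]
```

With the legacy CLI this yields the old flag-based invocation; with v0.2.x it falls back to a list-and-resolve step before calling `clusters get CLUSTER_ID`.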

Steps/Code to reproduce bug
Following the README (https://github.com/NVIDIA/spark-rapids-tools/blob/main/user_tools/docs/user-tools-databricks-aws.md), run the rapids tools command:

spark_rapids_user_tools databricks-aws profiling --eventlogs /tmp/eventlogs --gpu_cluster 'test-aws-12.2' --tools_jar ./rapids-4-spark-tools_2.12-23.08.2-SNAPSHOT.jar

Expected behavior
spark_rapids_user_tools databricks-aws should work with the latest Databricks CLI (v0.2.x).

NvTimLiu added labels "bug (Something isn't working)" and "? - Needs Triage" on Oct 6, 2023
NvTimLiu (Collaborator, Author) commented

@mattahrens Can you help to take a look? Thanks!

mattahrens (Collaborator) commented

@cindyyuanjiang have you taken a look at this issue related to the Databricks CLI?
