@dongjoon-hyun
Member

@dongjoon-hyun dongjoon-hyun commented Nov 3, 2025

What changes were proposed in this pull request?

This PR aims to fix a few pip install commands so that version specifiers are properly quoted, like the following.

-RUN python3.13 -m pip install --ignore-installed blinker>=1.6.2 # mlflow needs this
+RUN python3.13 -m pip install --ignore-installed 'blinker>=1.6.2' # mlflow needs this
-RUN python3.13 -m pip install numpy>=2.1 pyarrow>=18.0.0 six==1.16.0 pandas==2.3.3 scipy coverage matplotlib openpyxl grpcio==1.67.0 grpcio-status==1.67.0 lxml jinja2 && \
+RUN python3.13 -m pip install 'numpy>=2.1' 'pyarrow>=18.0.0' 'six==1.16.0' 'pandas==2.3.3' scipy coverage matplotlib openpyxl 'grpcio==1.67.0' 'grpcio-status==1.67.0' lxml jinja2 && \

Why are the changes needed?

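The shell interprets the unquoted > in >= as an output redirection before the pip install command receives the argument. As a result, blinker>=1.6.2 installs blinker with no version constraint and redirects pip's output to a file named =1.6.2.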
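The parsing difference can be reproduced outside Docker. The following is a minimal sketch, assuming a POSIX sh or bash; `show_args` is a hypothetical stand-in for `pip install` that only prints the arguments it receives, and the scratch directory exists only to make the stray file easy to find.

```shell
#!/bin/sh
# Hypothetical stand-in for `pip install`: prints each argument it receives.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Work in a scratch directory so the stray redirect target is easy to spot.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Unquoted: the shell parses `>=1.6.2` as "redirect stdout to the file =1.6.2",
# so the command only sees `install` and `blinker`.
show_args install blinker>=1.6.2
unquoted=$(cat ./=1.6.2)

# Quoted: the whole specifier reaches the command as a single argument.
quoted=$(show_args install 'blinker>=1.6.2')

printf 'unquoted args:\n%s\n' "$unquoted"
printf 'quoted args:\n%s\n' "$quoted"
```

In the unquoted case the version constraint never reaches the command, and a file named `=1.6.2` is left behind in the working directory.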

Does this PR introduce any user-facing change?

No behavior change because this is only changing infra.

How was this patch tested?

Pass the CIs.

Was this patch authored or co-authored using generative AI tooling?

No.

@bjornjorgensen
Contributor

OK, but I also think we should quote the specifiers inside the ARGs too, like:

ARG BASIC_PIP_PKGS="numpy 'pyarrow>=22.0.0' 'six==1.16.0' 'pandas==2.3.3' scipy 'plotly<6.0.0' coverage matplotlib openpyxl 'memory-profiler>=0.61.0' 'scikit-learn>=1.3.2'"
# Python deps for Spark Connect
ARG CONNECT_PIP_PKGS="'grpcio==1.75.1' 'grpcio-status==1.71.2' 'protobuf==5.29.5' 'googleapis-common-protos==1.65.0' 'graphviz==0.20.3'" 
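Whether embedded quotes help depends on how the ARG value is later expanded. The following is a rough sketch, assuming the value is expanded unquoted (as in `pip install $BASIC_PIP_PKGS`) by a POSIX sh or bash; `show_args`, `PKGS`, and `PLAIN_PKGS` are hypothetical names, and a plain shell variable stands in for the Docker ARG. It is not taken from the Dockerfiles in this PR.

```shell
#!/bin/sh
# Hypothetical stand-in for `pip install`: prints each argument it receives.
show_args() {
    for arg in "$@"; do
        printf '[%s]\n' "$arg"
    done
}

# Stand-in for an ARG value with embedded single quotes.
PKGS="numpy 'pyarrow>=22.0.0'"

# Stand-in for an ARG value without embedded quotes.
PLAIN_PKGS="numpy pyarrow>=22.0.0"

# Unquoted expansion: the result is word-split, but the shell does not
# re-parse it for redirections or remove quotes that came from the value.
with_quotes=$(show_args install $PKGS)
without_quotes=$(show_args install $PLAIN_PKGS)

printf 'embedded quotes:\n%s\n' "$with_quotes"
printf 'no embedded quotes:\n%s\n' "$without_quotes"
```

Under these assumptions the embedded single quotes reach the command as literal characters, while the unquoted specifier inside the variable is not treated as a redirection, so this is worth checking against the actual expansion sites when the ARGs are revisited.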

@dongjoon-hyun
Member Author

Thank you, @HyukjinKwon and @bjornjorgensen. Let me resolve the conflict and merge this as-is first. We can revisit the ARGs later.

@dongjoon-hyun dongjoon-hyun deleted the SPARK-54159 branch November 6, 2025 02:24
dongjoon-hyun added a commit that referenced this pull request Nov 6, 2025
…on marks

### What changes were proposed in this pull request?

This PR aims to fix a few `pip install` commands so that version specifiers are properly quoted, like the following.

```
-RUN python3.13 -m pip install --ignore-installed blinker>=1.6.2 # mlflow needs this
+RUN python3.13 -m pip install --ignore-installed 'blinker>=1.6.2' # mlflow needs this
```

```
-RUN python3.13 -m pip install numpy>=2.1 pyarrow>=18.0.0 six==1.16.0 pandas==2.3.3 scipy coverage matplotlib openpyxl grpcio==1.67.0 grpcio-status==1.67.0 lxml jinja2 && \
+RUN python3.13 -m pip install 'numpy>=2.1' 'pyarrow>=18.0.0' 'six==1.16.0' 'pandas==2.3.3' scipy coverage matplotlib openpyxl 'grpcio==1.67.0' 'grpcio-status==1.67.0' lxml jinja2 && \
```

### Why are the changes needed?

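The shell interprets the unquoted `>` in `>=` as an output redirection before the `pip install` command receives the argument. As a result, `blinker>=1.6.2` installs `blinker` with no version constraint and redirects pip's output to a file named `=1.6.2`.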

### Does this PR introduce _any_ user-facing change?

No behavior change because this is only changing infra.

### How was this patch tested?

Pass the CIs.

### Was this patch authored or co-authored using generative AI tooling?

No.

This patch had conflicts when merged, resolved by
Committer: Dongjoon Hyun <[email protected]>

Closes #52857 from dongjoon-hyun/SPARK-54159.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit 37689bf)
Signed-off-by: Dongjoon Hyun <[email protected]>