Skip to content

Conversation

thomasrebele
Copy link
Contributor

What changes were proposed in this pull request?

Make the Python scripts compatible with Python 3 by changing print .... to print(...), with some minor fixes where necessary.

Why are the changes needed?

Some q files use python scripts. However, the python scripts use syntax that is not compatible with Python 3. Python 2 is out-of-date (last release in 2020), and some operating systems do not provide packages anymore (e.g., Ubuntu LTS 24.04). The goal of this ticket is to make the scripts compatible with Python 3.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Running mvn test -pl itests/qtest -Pitests -Dtest=TestMiniLlapCliDriver -Dqfile=newline.q locally.

@Aggarwal-Raghav
Copy link
Contributor

Added 1 comment, rest of the changes LGTM.

@Aggarwal-Raghav
Copy link
Contributor

Aggarwal-Raghav commented Sep 25, 2025

my python version is: Python 3.12.11
Running python testutils/gen-report.py

  File "/Users/raghav/Desktop/apache/hive/master/testutils/gen-report.py", line 168
    except urllib2.HTTPError, e:
           ^^^^^^^^^^^^^^^^^^^^
SyntaxError: multiple exception types must be parenthesized

Parenthesis are required and indentation is not correct which is causing failure

@Aggarwal-Raghav
Copy link
Contributor

Aggarwal-Raghav commented Sep 25, 2025

Outside scope of the PR, hcatalog/bin/hcatcfg.py the import statment for reduce method is missing i believe

@thomasrebele
Copy link
Contributor Author

@Aggarwal-Raghav, thank you for the review! I've fixed the reduce for hcatalog/bin/hcatcfg.py as well. The script testutils/gen-report.py needs a big refactoring to make it python 3 compatible, so I consider it out-of-scope for this ticket (which aims to fix the scripts used by the itests). If you have some time, could you have another look, please?

Copy link

@Aggarwal-Raghav
Copy link
Contributor

Changes LGTM +1 (non-binding).

I understand that you are using try/except to make in compatible with python2.

Copy link
Member

@zabetak zabetak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to do a proper review and ensure that scripts work I will probably have to run and test each file that is modified. However, before running them I would like to know if we need/use all these python scripts? If we have useless stuff that can be dropped then we should do this before trying to make them Python 3 compatible.

@thomasrebele
Copy link
Contributor Author

@zabetak, most of them are used:

$ rg --glob '*.q' 'data/scripts/.*py' | sed 's#data/#\ndata/#g; s/\.py/\.py\n/g;' | grep '^data' | sort -u            
data/scripts/cat.py
data/scripts/doubleescapedtab.py
data/scripts/dumpdata_script.py
data/scripts/escapedcarriagereturn.py
data/scripts/escapednewline.py
data/scripts/escapedtab.py
data/scripts/input20_script.py
data/scripts/newline.py
$ rg --glob '*.xml' '<include>.*py</include>'
packaging/src/main/assembly/bin.xml
224:        <include>**/*.py</include>
347:        <include>hcatcfg.py</include>
348:        <include>hcat.py</include>
391:        <include>hcatcfg.py</include>
392:        <include>hcat_server.py</include>

The main goal of the PR was to fix errors when executing qfile tests that use these python files. If you think that certain changes are too risky, let me know, I'm happy to reduce the scope of the PR.

Copy link
Member

@zabetak zabetak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went over the changes and overall LGTM.

I run manually the modified scripts under data/scripts using python3 and they seem to run fine. These scripts seem to be in use by various .q files so I assume that since tests are green the changes are OK. It still makes a bit wonder though since the tests were also running fine before the changes so it means somehow somewhere we are using and relying on an old Python version.

The changes in the scripts under hcatalog are completely based on my eyes. Not sure how and if these scripts are used nowadays.

I tried running fb303_simple_mgmt.py but there are still many compatibility issues in dependent modules. Not sure if anyone is using these scripts these days.

I tried running gen-report.py but it fails due to various other compatibility issues. This is also a script that I am not sure if it is relevant anymore.

@thomasrebele
Copy link
Contributor Author

Thank you for the review, @zabetak! I've postponed fixing fb303_simple_mgmt.py and gen-report.py completely to a later ticket. Making these two python3 compatible would be much effort and my goal was to just get the qfile tests working on my laptop.

I can't see the merge button here, maybe because I don't have write permissions to the Hive repository. Could you merge it, please?

@zabetak zabetak merged commit 481d274 into apache:master Oct 2, 2025
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants