Running TFX transform fails to pip install #6955

Open
kolaente opened this issue Nov 14, 2024 · 4 comments


@kolaente

System information

  • Have I specified the code to reproduce the issue (Yes, No): Yes
  • Environment in which the code is executed: Linux, venv Jupyter Notebook
  • TensorFlow version: 2.15.1
  • TFX Version: 1.15.1
  • Python version: 3.10
  • Python dependencies (from pip freeze output):
absl-py==1.4.0
annotated-types==0.7.0
anyio==4.6.2.post1
apache-beam==2.60.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
astunparse==1.6.3
async-lru==2.0.4
async-timeout==5.0.0
attrs==23.2.0
babel==2.16.0
backcall==0.2.0
backports.tarfile==1.2.0
beautifulsoup4==4.12.3
bleach==6.2.0
cachetools==5.5.0
certifi==2024.8.30
cffi==1.17.1
charset-normalizer==3.4.0
click==8.1.7
cloudpickle==2.2.1
colorama==0.4.6
comm==0.2.2
crcmod==1.7
cryptography==43.0.3
debugpy==1.8.7
decorator==5.1.1
defusedxml==0.7.1
Deprecated==1.2.14
dill==0.3.1.1
dnspython==2.7.0
docker==4.4.4
docopt==0.6.2
docstring_parser==0.16
exceptiongroup==1.2.2
fastavro==1.9.7
fasteners==0.19
fastjsonschema==2.20.0
flatbuffers==24.3.25
fqdn==1.5.1
gast==0.6.0
google-api-core==2.22.0
google-api-python-client==1.12.11
google-apitools==0.5.31
google-auth==2.35.0
google-auth-httplib2==0.2.0
google-auth-oauthlib==1.2.1
google-cloud-aiplatform==1.71.1
google-cloud-bigquery==3.26.0
google-cloud-bigquery-storage==2.27.0
google-cloud-bigtable==2.26.0
google-cloud-core==2.4.1
google-cloud-datastore==2.20.1
google-cloud-dlp==3.25.0
google-cloud-language==2.15.0
google-cloud-pubsub==2.26.1
google-cloud-pubsublite==1.11.1
google-cloud-recommendations-ai==0.10.13
google-cloud-resource-manager==1.13.0
google-cloud-spanner==3.49.1
google-cloud-storage==2.18.2
google-cloud-videointelligence==2.14.0
google-cloud-vision==3.8.0
google-crc32c==1.6.0
google-pasta==0.2.0
google-resumable-media==2.7.2
googleapis-common-protos==1.65.0
grpc-google-iam-v1==0.13.1
grpc-interceptor==0.15.4
grpcio==1.65.5
grpcio-status==1.48.2
h11==0.14.0
h5py==3.12.1
hdfs==2.7.3
httpcore==1.0.6
httplib2==0.22.0
httpx==0.27.2
idna==3.10
importlib_metadata==8.4.0
ipykernel==6.29.5
ipython==7.34.0
ipython-genutils==0.2.0
ipywidgets==7.8.5
isoduration==20.11.0
jaraco.classes==3.4.0
jaraco.context==6.0.1
jaraco.functools==4.1.0
jedi==0.19.1
jeepney==0.8.0
Jinja2==3.1.4
joblib==1.4.2
json5==0.9.25
jsonpickle==3.3.0
jsonpointer==3.0.0
jsonschema==4.23.0
jsonschema-specifications==2024.10.1
jupyter-events==0.10.0
jupyter-lsp==2.2.5
jupyter_client==8.6.3
jupyter_core==5.7.2
jupyter_server==2.14.2
jupyter_server_terminals==0.5.3
jupyterlab==4.2.5
jupyterlab_pygments==0.3.0
jupyterlab_server==2.27.3
jupyterlab_widgets==1.1.11
keras==2.15.0
keras-tuner==1.4.7
keyring==25.5.0
keyrings.google-artifactregistry-auth==1.1.2
kt-legacy==1.0.5
kubernetes==12.0.1
libclang==18.1.1
lxml==5.3.0
Markdown==3.7
MarkupSafe==3.0.2
matplotlib-inline==0.1.7
mistune==3.0.2
ml-dtypes==0.3.2
ml-metadata==1.15.0
ml-pipelines-sdk==1.15.1
more-itertools==10.5.0
nbclient==0.10.0
nbconvert==7.16.4
nbformat==5.10.4
nest-asyncio==1.6.0
nltk==3.9.1
notebook==7.2.2
notebook_shim==0.2.4
numpy==1.26.4
nvidia-cublas-cu12==12.2.5.6
nvidia-cuda-cupti-cu12==12.2.142
nvidia-cuda-nvcc-cu12==12.2.140
nvidia-cuda-nvrtc-cu12==12.2.140
nvidia-cuda-runtime-cu12==12.2.140
nvidia-cudnn-cu12==8.9.4.25
nvidia-cufft-cu12==11.0.8.103
nvidia-curand-cu12==10.3.3.141
nvidia-cusolver-cu12==11.5.2.141
nvidia-cusparse-cu12==12.1.2.141
nvidia-nccl-cu12==2.16.5
nvidia-nvjitlink-cu12==12.2.140
oauth2client==4.1.3
oauthlib==3.2.2
objsize==0.7.0
opentelemetry-api==1.27.0
opentelemetry-sdk==1.27.0
opentelemetry-semantic-conventions==0.48b0
opt_einsum==3.4.0
orjson==3.10.11
overrides==7.7.0
packaging==24.1
pandas==1.5.3
pandocfilters==1.5.1
parso==0.8.4
pexpect==4.9.0
pickleshare==0.7.5
pillow==11.0.0
platformdirs==4.3.6
pluggy==1.5.0
portalocker==2.10.1
portpicker==1.6.0
prometheus_client==0.21.0
prompt_toolkit==3.0.48
proto-plus==1.25.0
protobuf==3.20.3
psutil==6.1.0
ptyprocess==0.7.0
pyarrow==10.0.1
pyarrow-hotfix==0.6
pyasn1==0.6.1
pyasn1_modules==0.4.1
pycparser==2.22
pydantic==2.9.2
pydantic_core==2.23.4
pydot==1.4.2
pyfarmhash==0.3.2
Pygments==2.18.0
pymongo==4.10.1
pyparsing==3.2.0
python-dateutil==2.9.0.post0
python-json-logger==2.0.7
pytz==2024.2
PyYAML==6.0.2
pyzmq==26.2.0
redis==5.2.0
referencing==0.35.1
regex==2024.9.11
requests==2.32.3
requests-oauthlib==2.0.0
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rouge_score==0.1.2
rpds-py==0.20.1
rsa==4.9
sacrebleu==2.4.3
scipy==1.12.0
SecretStorage==3.3.3
Send2Trash==1.8.3
shapely==2.0.6
six==1.16.0
sniffio==1.3.1
soupsieve==2.6
sqlparse==0.5.1
tabulate==0.9.0
tensorboard==2.15.2
tensorboard-data-server==0.7.2
tensorflow==2.15.1
tensorflow-data-validation==1.15.1
tensorflow-estimator==2.15.0
tensorflow-hub==0.15.0
tensorflow-io-gcs-filesystem==0.37.1
tensorflow-metadata==1.15.0
tensorflow-serving-api==2.15.1
tensorflow-transform==1.15.0
tensorflow_model_analysis==0.46.0
termcolor==2.5.0
terminado==0.18.1
tfx==1.15.1
tfx-bsl==1.15.1
tinycss2==1.4.0
tomli==2.0.2
tornado==6.4.1
tqdm==4.66.6
traitlets==5.14.3
types-python-dateutil==2.9.0.20241003
typing_extensions==4.12.2
uri-template==1.3.0
uritemplate==3.0.1
urllib3==2.2.3
wcwidth==0.2.13
webcolors==24.8.0
webencodings==0.5.1
websocket-client==1.8.0
Werkzeug==3.1.1
widgetsnbextension==3.6.10
wrapt==1.14.1
zipp==3.20.2
zstandard==0.23.0

Describe the current behavior

Following this TFX guide, I ran this code in a Jupyter notebook cell:

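import os

from tfx.components import Transform

# Transform consumes the upstream examples and schema, plus the path to
# the user module file.
transform = Transform(
    examples=prepare_data_component.outputs['examples'],
    schema=schema_gen.outputs['schema'],
    module_file=os.path.abspath('../components/module.py'))

# context is the interactive TFX context created earlier in the notebook.
context.run(transform)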

(The examples and schema were generated by components that ran earlier in the notebook.)

results in this error:

WARNING: There was an error checking the latest version of pip.
ERROR: Exception:
Traceback (most recent call last):
  File "PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_internal/cli/base_command.py", line 105, in _run_wrapper
    status = _inner_run()
  File "/PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_internal/cli/base_command.py", line 96, in _inner_run
    return self.run(options, args)
  File "/PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_internal/cli/req_command.py", line 67, in wrapper
    return func(self, options, args)
  File "/PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_internal/commands/install.py", line 325, in run
    session = self.get_default_session(options)
  File "/PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_internal/cli/index_command.py", line 76, in get_default_session
    self._session = self.enter_context(self._build_session(options))
  File "/PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_internal/cli/index_command.py", line 99, in _build_session
    session = PipSession(
  File "/PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_internal/network/session.py", line 344, in __init__
    self.headers["User-Agent"] = user_agent()
  File "/PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_internal/network/session.py", line 142, in user_agent
    linux_distribution = distro.name(), distro.version(), distro.codename()
  File "/PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_vendor/distro/distro.py", line 371, in version
    return _distro.version(pretty, best)
  File "/PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_vendor/distro/distro.py", line 900, in version
    self.uname_attr("release"),
  File "/PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_vendor/distro/distro.py", line 1088, in uname_attr
    return self._uname_info.get(attribute, "")
  File "/nix/store/si7kfwma2v8ypxnl9iyl0x2sw47fq4pc-python3-3.10.14-env/lib/python3.10/functools.py", line 981, in __get__
    val = self.func(instance)
  File "/PROJECT_DIR/.devenv/state/venv/lib/python3.10/site-packages/pip/_vendor/distro/distro.py", line 1202, in _uname_info
    stdout = subprocess.check_output(cmd, stderr=subprocess.DEVNULL)
  File "/nix/store/si7kfwma2v8ypxnl9iyl0x2sw47fq4pc-python3-3.10.14-env/lib/python3.10/subprocess.py", line 421, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/nix/store/si7kfwma2v8ypxnl9iyl0x2sw47fq4pc-python3-3.10.14-env/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '('uname', '-rs')' died with <Signals.SIGSEGV: 11>.

CalledProcessError: Command '['PROJECT_DIR/.devenv/state/venv/bin/python', '-m', 'pip', 'install', '--target', '/tmp/tmphlk7ovq8', '/tmp/tfx-interactive-2024-11-14T12_22_15.829911-sndfipr4/_wheels/tfx_user_code_Transform-0.0+6a115ee6c2805a1c5f73bc0f06dee31e25bab22f2807f05ce91a6ad75f2068aa-py3-none-any.whl']' returned non-zero exit status 2.

It looks like the pip invocation failed here. Why does the component call pip in the first place? I can run pip in the venv without issues.
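The traceback above ends in pip's vendored distro module, where `subprocess.check_output` on `uname -rs` dies with SIGSEGV. A minimal sketch for checking whether that same call fails from the notebook kernel (as opposed to a plain shell in the venv) could look like this. The helper name `probe` is hypothetical and not part of pip or TFX; it only mirrors the `check_output` call shown in the traceback.

```python
import subprocess
import sys


def probe(cmd):
    """Run cmd like the check_output call in the traceback and report the outcome.

    Returns (ok, detail). `probe` is a hypothetical helper for this sketch.
    """
    try:
        out = subprocess.check_output(cmd, stderr=subprocess.DEVNULL)
    except OSError as err:
        # The executable is missing or cannot be started at all.
        return False, f"could not start: {err}"
    except subprocess.CalledProcessError as err:
        # A negative return code means the child was killed by a signal,
        # e.g. -11 for SIGSEGV on Linux.
        return False, f"exit status {err.returncode}"
    return True, out.decode(errors="replace").strip()


# The command that crashes in the traceback.
print("uname -rs:", probe(("uname", "-rs")))
# A control command that should work in any healthy Python environment.
print("python --version:", probe((sys.executable, "--version")))
```

If the first line reports a failure inside the notebook but succeeds in a terminal, the problem is likely in how the kernel's environment is set up rather than in TFX or pip.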

Describe the expected behavior

The Transform component runs without crashing.

Standalone code to reproduce the issue

see above

@janasangeetha janasangeetha self-assigned this Nov 18, 2024
@janasangeetha
Contributor

Hi @kolaente
Thank you for reporting. I'll investigate and provide an update here.

@janasangeetha
Contributor

Hi @kolaente
I was unable to reproduce the error. Please provide more detailed steps to reproduce the issue. I can run the tutorial that includes the Transform component (gist) without problems, so please feel free to explore it.

@kolaente
Author

This seems to be an issue with my environment. It works when I run it in a Jupyter notebook server.

Still, why does running the component try to install something via pip?

@janasangeetha
Contributor

Hi @kolaente
As far as I understand, when the component runs in a local environment, a wheel file is created for the component's user code and that package is then installed.
@lego0901 Could you please share your thoughts?
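
For illustration, the install step can be read off the error message in the original report: the interpreter is invoked as `python -m pip install --target <temp dir> <wheel>`. The sketch below only rebuilds that argument list and does not run it. `build_install_command` and the wheel path are hypothetical, and this is not taken from the TFX source.

```python
import sys
import tempfile


def build_install_command(wheel_path, target_dir):
    """Build a pip command of the same shape as the one in the error message.

    Hypothetical helper: it mirrors the reported arguments, not TFX's code.
    """
    return [
        sys.executable, "-m", "pip", "install",
        "--target", target_dir,  # install into this directory, not site-packages
        wheel_path,
    ]


# Placeholder wheel path; the real one is generated per run.
wheel = "/tmp/example/_wheels/user_code-0.0-py3-none-any.whl"
target = tempfile.mkdtemp()
cmd = build_install_command(wheel, target)
print(cmd)
```

Because `--target` points at a temporary directory, the user code package ends up outside the venv's site-packages, which would explain why a pip call happens even though nothing new appears in `pip freeze`.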

5 participants
@nikelite @kolaente @lego0901 @janasangeetha and others