vdsql: Don't know how to connect to .ddb (DuckDB) #2259

Open
fleimgruber opened this issue Jan 17, 2024 · 14 comments

@fleimgruber

fleimgruber commented Jan 17, 2024

Small description
vdsql is not able to open a .ddb file

Expected result
vdsql is able to open a .ddb file.

Actual result with screenshot
image

File "C:\Users\LeimgruberF\lg\.venv\lib\site-packages\visidata\threads.py", line 220, in _toplevelTryFunc
t.status = func(*args, **kwargs)
File "C:\Users\LeimgruberF\lg\.venv\lib\site-packages\visidata\sheets.py", line 260, in reload
self.loader()
File "C:\Users\LeimgruberF\lg\.venv\lib\site-packages\visidata\sheets.py", line 285, in loader
for r in self.iterload():
File "C:\Users\LeimgruberF\lg\.venv\lib\site-packages\visidata\apps\vdsql\_ibis.py", line 123, in iterload
with self.con as con:
File "C:\Users\LeimgruberF\.pyenv\pyenv-win\versions\3.10.11\lib\contextlib.py", line 135, in __enter__
return next(self.gen)
File "C:\Users\LeimgruberF\lg\.venv\lib\site-packages\visidata\apps\vdsql\_ibis.py", line 93, in get_conn
r = ibis.connect(str(self.source))
File "C:\Users\LeimgruberF\lg\.venv\lib\site-packages\ibis\backends\base\__init__.py", line 1353, in connect
raise ValueError(f"Don't know how to connect to {resource!r}")
ValueError: Don't know how to connect to 'test.ddb'

Steps to reproduce with sample data and a .vd
Sample data:

import duckdb

con = duckdb.connect("test.ddb")
con.sql("CREATE TABLE test (i INTEGER)")
con.sql("INSERT INTO test VALUES (42)")
con.table("test").show()
con.close()

command:

visidata -f vdsql test.ddb

commandlog:

#!vd -p
{"sheet": "global", "col": null, "row": "filetype", "longname": "set-option", "input": "vdsql", "keystrokes": "", "comment": null}
{"longname": "open-file", "input": "test.ddb", "keystrokes": "o"}

Additional context
Python 3.10.11
VisiData 96d2702

@midichef
Contributor

It appears that the ibis library expects a DuckDB file to have the extension .duckdb. I do not think there's any way to get it to read a file ending in .ddb, without submitting a change to ibis-project.

After renaming the .ddb file, vd -f vdsql test.duckdb worked for me.
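If the original file has to keep its name, the same workaround can be scripted by giving the database a copy that ends in .duckdb. A minimal sketch, assuming the file is not open in another process; `duckdb_alias` is a hypothetical helper name, not part of VisiData or ibis:

```python
import shutil
from pathlib import Path


def duckdb_alias(path):
    """Copy a DuckDB file to a sibling that ends in .duckdb and return it.

    Hypothetical helper: it only changes the file name that ibis sees;
    it does not inspect or convert the file contents.
    """
    src = Path(path)
    if src.suffix == ".duckdb":
        return src  # already has the extension ibis recognizes
    dst = src.with_suffix(".duckdb")
    shutil.copy2(src, dst)  # copy rather than rename, so the original stays put
    return dst
```

The returned path can then be passed to vd -f vdsql. Because this is a copy, any changes made through it do not reach the original .ddb file.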

@saulpw
Owner

saulpw commented Jan 18, 2024

I thought this commit enabled this: a94cdff. I wonder what happened.

@midichef
Contributor

midichef commented Jan 18, 2024

I'm not sure what changed. But ibis will take a duckdb:// URL that takes a relative or absolute path. So with a loader something like this:

from visidata import VisiData

@VisiData.api
def openurl_duckdb(vd, p, filetype=None):
    # Route duckdb:// URLs to the vdsql loader
    return vd.open_vdsql(p, filetype)

the .ddb file can be loaded with vd -f vdsql duckdb://test.ddb. Would that be enough?
(EDIT: vd duckdb://test.ddb works and is shorter.)

@fleimgruber
Author

vd -f vdsql test.duckdb

Works for me as well, thanks.

vd -f vdsql duckdb://test.ddb

I am not using explicit loaders myself, but that could be nice to have. I think with a94cdff working it would be a better user experience in any case.

@reagle
Contributor

reagle commented Aug 29, 2024

Neither works for me.

❯ vd -f vdsql duckdb://test.ddb
saul.pw/VisiData v3.0.2
no loader for url scheme: duckdb

@midichef
Contributor

@reagle
Opening a duckdb:// URL can only work if you add the openurl_duckdb() code above to your .visidatarc. It's not clear from context whether you had that code in your .visidatarc when you ran it. Does the failure happen with that code added?

The other method is to open a file ending in .duckdb, by adding -f vdsql:
vd -f vdsql test.duckdb
However, this method won't work for files ending in .ddb, only .duckdb. If your file had the right extension, and still couldn't be loaded, please let me know what output visidata gave when you ran it with -f vdsql.
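The two methods described here can be summarized as a small chooser. This is only a sketch: `vd_command` is a hypothetical helper, and the duckdb:// branch assumes the openurl_duckdb() loader from earlier in the thread is present in .visidatarc.

```python
from pathlib import Path


def vd_command(path):
    """Return the vd argument list suggested in this thread for a DuckDB file."""
    if Path(path).suffix == ".duckdb":
        # ibis recognizes this extension, so the vdsql filetype is enough
        return ["vd", "-f", "vdsql", str(path)]
    # A DuckDB file with another extension (such as .ddb) goes through the
    # duckdb:// URL, which requires the openurl_duckdb() loader in .visidatarc
    return ["vd", f"duckdb://{path}"]
```

For example, `vd_command("test.ddb")` yields the arguments for vd duckdb://test.ddb, while `vd_command("test.duckdb")` yields those for vd -f vdsql test.duckdb.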

@reagle
Contributor

reagle commented Sep 12, 2024

@midichef
Contributor

midichef commented Sep 13, 2024

I see you got the message unknown "vdsql" filetype, and it doesn't work. I get that too when I install vdsql with pip install vdsql.

But the .duckdb file will open if I follow the alternate vdsql install instructions from the vdsql README.md:

git clone git@github.com:saulpw/visidata.git
cd visidata/visidata/apps/vdsql
pip3 install .

Does it work for you if you do that, and then run vd -f vdsql test.duckdb?

@reagle
Contributor

reagle commented Feb 8, 2025

That never worked; there seem to be dependency issues, Python version issues, etc.

@anjakefala
Collaborator

@reagle Can you paste the errors you have?

@reagle
Contributor

reagle commented Feb 10, 2025

Hi @anjakefala. First, I'm not sure why I'd use vdsql if the functionality has moved to vd now. But opening a DuckDB file with vd v3.1.1 (from Homebrew) fails with "unknown 'duckdb' filetype", and the file opens as text.

Second, I'm not able to install a working vdsql from git as @midichef suggested above. I'm using uv to try to keep things cleaner, and am probably making a mistake. Using Python 3.10 yields:

❯ cd tmp/visidata/visidata/apps/vdsql
❯ uv venv --python 3.10 .venv
source .venv/bin/activate
❯ uv pip install -e .
Resolved 26 packages in 3ms
Installed 26 packages in 256ms
 + atpublic==4.1.0
 + bidict==0.23.1
 + ibis-framework==9.0.0
 + ibis-substrait==4.0.1
 + markdown-it-py==3.0.0
 + mdurl==0.1.2
 + numpy==1.26.4
 + packaging==24.2
 + pandas==1.5.3
 + parsy==2.1
 + protobuf==5.29.3
 + pyarrow==16.1.0
 + pyarrow-hotfix==0.6
 + pygments==2.19.1
 + python-dateutil==2.9.0.post0
 + pytz==2025.1
 + pyyaml==6.0.2
 + regex==2024.11.6
 + rich==13.9.4
 + six==1.17.0
 + sqlglot==23.12.2
 + sqlparse==0.5.3
 + substrait==0.23.0
 + toolz==0.12.1
 + typing-extensions==4.12.2
 + vdsql==0.3.dev0 (from file:///Users/reagle/Downloads/visidata/visidata/apps/vdsql)
❯ vdsql
zsh: /Users/reagle/Downloads/visidata/visidata/apps/vdsql/.venv/bin/vdsql: bad interpreter: /Users/reagle/.cache/uv/builds-v0/.tmp4ZXt7g/bin/python: no such file or directory
Traceback (most recent call last):
  File "/Users/reagle/Downloads/visidata/visidata/apps/vdsql/vdsql", line 4, in <module>
    from visidata.apps.vdsql.__main__ import main
ModuleNotFoundError: No module named 'visidata'
 ~/tmp/visidata/visidata/apps/vdsql  develop

Using a recent Python (3.13.1) fails when trying to build pyarrow:

❯ git clone git@github.com:saulpw/visidata.git
Cloning into 'visidata'...
remote: Enumerating objects: 42243, done.
remote: Counting objects: 100% (801/801), done.
remote: Compressing objects: 100% (260/260), done.
remote: Total 42243 (delta 655), reused 541 (delta 541), pack-reused 41442 (from 4)
Receiving objects: 100% (42243/42243), 53.83 MiB | 19.50 MiB/s, done.
Resolving deltas: 100% (30052/30052), done.
❯ cd visidata/visidata/apps/vdsql
❯ uv venv .venv
Using CPython 3.13.1 interpreter at: /Users/reagle/.pyenv/versions/3.13.1/bin/python
Creating virtual environment at: .venv
Activate with: source .venv/bin/activate
❯ source .venv/bin/activate
❯ uv pip install -e .
Resolved 26 packages in 432ms
      Built vdsql @ file:///Users/reagle/Downloads/visidata/visidata/apps/vdsql
  × Failed to build `pyarrow==16.1.0`
  ├─▶ The build backend returned an error
  ╰─▶ Call to `setuptools.build_meta:__legacy__.build_wheel` failed (exit status: 1)

      [stdout]
      running bdist_wheel
      running build
      running build_py
      copying pyarrow/orc.py -> build/lib.macosx-15.2-arm64-cpython-313/pyarrow
      copying pyarrow/conftest.py -> build/lib.macosx-15.2-arm64-cpython-313/pyarrow
      copying pyarrow/_generated_version.py -> build/lib.macosx-15.2-arm64-cpython-313/pyarrow
      copying pyarrow/benchmark.py -> build/lib.macosx-15.2-arm64-cpython-313/pyarrow
      copying pyarrow/_compute_docstrings.py -> build/lib.macosx-15.2-arm64-cpython-313/pyarrow
      ...
      build/lib.macosx-15.2-arm64-cpython-313/pyarrow/src/arrow/python
      copying pyarrow/src/arrow/python/visibility.h ->
      build/lib.macosx-15.2-arm64-cpython-313/pyarrow/src/arrow/python
      running build_ext
      -- Running cmake for PyArrow
      cmake
      -DCMAKE_INSTALL_PREFIX=/Users/reagle/.cache/uv/sdists-v7/pypi/pyarrow/16.1.0/vJlc26cdB809K9_3Nw1e8/src/build/lib.macosx-15.2-arm64-cpython-313/pyarrow
      -DPYTHON_EXECUTABLE=/Users/reagle/.cache/uv/builds-v0/.tmpP8S1JR/bin/python
      -DPython3_EXECUTABLE=/Users/reagle/.cache/uv/builds-v0/.tmpP8S1JR/bin/python
      -DPYARROW_CXXFLAGS= -DPYARROW_BUILD_CUDA=off -DPYARROW_BUILD_SUBSTRAIT=off
      -DPYARROW_BUILD_FLIGHT=off -DPYARROW_BUILD_GANDIVA=off -DPYARROW_BUILD_ACERO=off
      -DPYARROW_BUILD_DATASET=off -DPYARROW_BUILD_ORC=off -DPYARROW_BUILD_PARQUET=off
      -DPYARROW_BUILD_PARQUET_ENCRYPTION=off -DPYARROW_BUILD_AZURE=off -DPYARROW_BUILD_GCS=off
      -DPYARROW_BUILD_S3=off -DPYARROW_BUILD_HDFS=off -DPYARROW_BUNDLE_ARROW_CPP=off
      -DPYARROW_BUNDLE_CYTHON_CPP=off -DPYARROW_GENERATE_COVERAGE=off -DCMAKE_BUILD_TYPE=release
      /Users/reagle/.cache/uv/sdists-v7/pypi/pyarrow/16.1.0/vJlc26cdB809K9_3Nw1e8/src
      -- System processor: arm64
      -- Arrow build warning level: PRODUCTION
      -- Build Type: RELEASE
      -- CMAKE_C_FLAGS:  -Wall -Wno-unknown-warning-option -Wno-pass-failed -march=armv8-a
      -Qunused-arguments -fcolor-diagnostics  -fno-omit-frame-pointer -Wno-unused-variable
      ...
      /opt/homebrew/include/arrow/visit_type_inline.h:54:34: note: in
      instantiation of function template specialization 'arrow::py::(anonymous
      namespace)::ObjectWriterVisitor::Visit<arrow::Decimal64Type>' requested here
         54 |     ARROW_GENERATE_FOR_ALL_TYPES(TYPE_VISIT_INLINE);
            |                                  ^
      /Users/reagle/.cache/uv/sdists-v7/pypi/pyarrow/16.1.0/vJlc26cdB809K9_3Nw1e8/src/pyarrow/src/arrow/python/arrow_to_pandas.cc:1410:12:
      note: in instantiation of function template specialization
      'arrow::VisitTypeInline<arrow::py::(anonymous namespace)::ObjectWriterVisitor>' requested
      here
       1410 |     return VisitTypeInline(*data->type(), &visitor);
            |            ^
      2 errors generated.
      make[2]: *** [CMakeFiles/arrow_python.dir/pyarrow/src/arrow/python/arrow_to_pandas.cc.o]
      Error 1
      make[1]: *** [CMakeFiles/arrow_python.dir/all] Error 2
      make: *** [all] Error 2
      error: command '/opt/homebrew/bin/cmake' failed with exit code 2

      hint: This usually indicates a problem with the package or the build environment.
  help: `pyarrow` (v16.1.0) was included because `vdsql` (v0.3.dev0) depends on `ibis-framework`
        (v9.0.0) which depends on `pyarrow`
 ~/tmp/visidata/visidata/apps/vdsql  develop

@anjakefala
Collaborator

Couple of issues here.

ModuleNotFoundError: No module named 'visidata'

It's assumed the user already has VisiData installed, which maybe shouldn't be assumed.

Failed to build pyarrow==16.1.0

This is an ibis dependency. PyArrow 16.1.0 doesn't have a wheel built for Python 3.13. I recommend using a lower version of Python while we explore bumping the ibis version!
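Both problems can be checked before installing. A minimal sketch, assuming the Python 3.13 cutoff reported in this thread for the pinned pyarrow 16.1.0; `vdsql_preflight` is a hypothetical helper, not something vdsql ships:

```python
import importlib.util
import sys


def vdsql_preflight(version_info=sys.version_info, required=("visidata",)):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    # Upper bound taken from this thread: pyarrow 16.1.0 failed to build on 3.13
    if tuple(version_info[:2]) >= (3, 13):
        problems.append("Python >= 3.13: pinned pyarrow is unlikely to install")
    for name in required:
        # find_spec returns None when a top-level module cannot be found
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing module: {name}")
    return problems
```

Running `vdsql_preflight()` inside the target virtual environment reports whether visidata is importable there and whether the interpreter is new enough to hit the pyarrow build failure.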

@reagle
Contributor

reagle commented Feb 10, 2025

This worked:

git clone git@github.com:saulpw/visidata.git
cd visidata
uv venv --python 3.10 .venv
source .venv/bin/activate
uv pip install -e .
cd visidata/apps/vdsql
uv pip install -e .
uv pip install 'ibis-framework[duckdb]'
vdsql ~/tmp/duckdb-demo.duckdb

Should the vdsql page be changed so that it's clear vdsql does not work in the vd mono repo and that Python 3.10 is needed?

@anjakefala
Collaborator

Yeah, that sounds reasonable. I'll give vdsql some attention soon!
