get: fails to clone with ssh #8681

Closed
dekromp opened this issue Dec 12, 2022 · 7 comments
Labels: A: data-sync (Related to dvc get/fetch/import/pull/push), research, upstream (Issues which need to be resolved in an upstream dependency)

Comments

@dekromp

dekromp commented Dec 12, 2022

Bug Report

Issue name

get: "Failed to clone repo" when attempting to clone via ssh.

Description

dvc get fails to clone a git repository via ssh. git clone of the same repository works.

Reproduce

Reproduction requires a git repository and a configured ssh key for git. In the example, the .dvc files are located in a data branch.

% dvc get --rev v0.0.1 -o data/my-dvc-proj [email protected]:XXXXX/my-dvc-proj.git data
ERROR: failed to get 'data' from '[email protected]:XXXXX/my-dvc-proj.git' - Failed to clone repo '[email protected]:XXXXX/my-dvc-proj.git' to '/var/folders/r7/6vs0hrvd2gl7z35_lm1vr5g00000gn/T/tmpkvxn3v7vdvc-clone'

git clone works as expected

% git clone -b data [email protected]:XXXXX/my-dvc-proj.git
Cloning into 'my-dvc-proj'...
remote: Enumerating objects: 138, done.
remote: Counting objects: 100% (138/138), done.
remote: Compressing objects: 100% (86/86), done.
remote: Total 138 (delta 60), reused 112 (delta 36), pack-reused 0
Receiving objects: 100% (138/138), 585.58 KiB | 5.86 MiB/s, done.
Resolving deltas: 100% (60/60), done.

Expected

Cloning should work.

Environment information

Output of dvc doctor:

DVC version: 2.37.0 (pip)
---------------------------------
Platform: Python 3.8.12 on macOS-12.6.1-x86_64-i386-64bit
Subprojects:
	dvc_data = 0.28.4
	dvc_objects = 0.14.0
	dvc_render = 0.0.15
	dvc_task = 0.1.6
	dvclive = 1.0.1
	scmrepo = 0.1.4
Supports:
	http (aiohttp = 3.8.3, aiohttp-retry = 2.8.3),
	https (aiohttp = 3.8.3, aiohttp-retry = 2.8.3)

Additional Information (if any):

Opened separate issue as proposed in #7670.

@dtrifiro
Contributor

dtrifiro commented Dec 12, 2022

Thanks for creating a new issue. Could you provide the output of the same operation with the -v flag? (dvc get -v)

Edit: If you are on macOS and use encrypted credentials files, this might be a known issue with an available workaround: iterative/dvc-ssh#20

@dekromp
Author

dekromp commented Dec 12, 2022

Hi @dtrifiro,

% dvc get -v  --rev v0.0.1 -o data/my-dvc-proj XXXprivate-gitlab.com:XXXXX/my-dvc-proj.git data
2022-12-12 19:11:24,959 DEBUG: Creating external repo XXXprivate-gitlab.com:XXXXX/my-dvc-proj.git@v0.0.1
2022-12-12 19:11:24,960 DEBUG: erepo: git clone 'XXXprivate-gitlab.com:XXXXX/my-dvc-proj.git' to a temporary dir
2022-12-12 19:11:25,536 DEBUG: Removing '/var/folders/r7/6vs0hrvd2gl7z35_lm1vr5g00000gn/T/tmp7fk69hrtdvc-clone/.git'
2022-12-12 19:11:25,554 DEBUG: Removing '/XXX/data/.RahpvZ4xCo9RY3mq6aDMed'
2022-12-12 19:11:25,555 ERROR: failed to get 'data' from 'XXXprivate-gitlab.com:XXXXX/my-dvc-proj.git' - Failed to clone repo 'XXXprivate-gitlab.com:XXXXX/my-dvc-proj.git' to '/var/folders/r7/6vs0hrvd2gl7z35_lm1vr5g00000gn/T/tmp7fk69hrtdvc-clone'
------------------------------------------------------------
Traceback (most recent call last):
  File "/XXX/.venv/lib/python3.8/site-packages/scmrepo/git/backend/dulwich/__init__.py", line 200, in clone
    repo = clone_from()
  File "/XXX/.venv/lib/python3.8/site-packages/dulwich/porcelain.py", line 551, in clone
    return client.clone(
  File "/XXX/.venv/lib/python3.8/site-packages/dulwich/client.py", line 760, in clone
    result = self.fetch(path, target, progress=progress, depth=depth)
  File "/XXX/.venv/lib/python3.8/site-packages/dulwich/client.py", line 837, in fetch
    result = self.fetch_pack(
  File "/XXX/.venv/lib/python3.8/site-packages/dulwich/client.py", line 1146, in fetch_pack
    proto, can_read, stderr = self._connect(b"upload-pack", path)
  File "/XXX/.venv/lib/python3.8/site-packages/dulwich/client.py", line 1792, in _connect
    con = self.ssh_vendor.run_command(
  File "/XXX/.venv/lib/python3.8/site-packages/fsspec/asyn.py", line 113, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "/XXX/.venv/lib/python3.8/site-packages/fsspec/asyn.py", line 98, in sync
    raise return_result
  File "/XXX/.venv/lib/python3.8/site-packages/fsspec/asyn.py", line 53, in _runner
    result[0] = await coro
  File "/XXX/.venv/lib/python3.8/site-packages/scmrepo/git/backend/dulwich/asyncssh_vendor.py", line 163, in _run_command
    conn = await asyncssh.connect(
  File "/XXX/.venv/lib/python3.8/site-packages/asyncssh/connection.py", line 7834, in connect
    return await asyncio.wait_for(
  File "/XXX/.pyenv/versions/3.8.12/lib/python3.8/asyncio/tasks.py", line 455, in wait_for
    return await fut
  File "/XXX/.venv/lib/python3.8/site-packages/asyncssh/connection.py", line 433, in _connect
    conn = await _open_proxy(loop, proxy_command, conn_factory)
  File "/XXX/.venv/lib/python3.8/site-packages/asyncssh/connection.py", line 352, in _open_proxy
    _, tunnel = await loop.subprocess_exec(_ProxyCommandTunnel, *command)
  File "/XXX/.pyenv/versions/3.8.12/lib/python3.8/asyncio/base_events.py", line 1630, in subprocess_exec
    transport = await self._make_subprocess_transport(
  File "/XXX/.pyenv/versions/3.8.12/lib/python3.8/asyncio/unix_events.py", line 197, in _make_subprocess_transport
    transp = _UnixSubprocessTransport(self, protocol, args, shell,
  File "/XXX/.pyenv/versions/3.8.12/lib/python3.8/asyncio/base_subprocess.py", line 36, in __init__
    self._start(args=args, shell=shell, stdin=stdin, stdout=stdout,
  File "/XXX/.pyenv/versions/3.8.12/lib/python3.8/asyncio/unix_events.py", line 789, in _start
    self._proc = subprocess.Popen(
  File "/XXX/.pyenv/versions/3.8.12/lib/python3.8/subprocess.py", line 858, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/XXX/.pyenv/versions/3.8.12/lib/python3.8/subprocess.py", line 1704, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'none'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/XXX/.venv/lib/python3.8/site-packages/dvc/scm.py", line 145, in clone
    git = Git.clone(url, to_path, progress=pbar.update_git, **kwargs)
  File "/XXX/.venv/lib/python3.8/site-packages/scmrepo/git/__init__.py", line 143, in clone
    backend.clone(url, to_path, **kwargs)
  File "/XXX/.venv/lib/python3.8/site-packages/scmrepo/git/backend/dulwich/__init__.py", line 203, in clone
    raise CloneError(url, to_path) from exc
scmrepo.exceptions.CloneError: Failed to clone repo 'XXXprivate-gitlab.com:XXXXX/my-dvc-proj.git' to '/var/folders/r7/6vs0hrvd2gl7z35_lm1vr5g00000gn/T/tmp7fk69hrtdvc-clone'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/XXX/.venv/lib/python3.8/site-packages/dvc/commands/get.py", line 39, in _get_file_from_repo
    Repo.get(
  File "/XXX/.venv/lib/python3.8/site-packages/dvc/repo/get.py", line 50, in get
    with external_repo(
  File "/XXX/.pyenv/versions/3.8.12/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/XXX/.venv/lib/python3.8/site-packages/dvc/external_repo.py", line 39, in external_repo
    path = _cached_clone(url, rev, for_write=for_write)
  File "/XXX/.venv/lib/python3.8/site-packages/dvc/external_repo.py", line 169, in _cached_clone
    clone_path, shallow = _clone_default_branch(url, rev, for_write=for_write)
  File "/XXX/.venv/lib/python3.8/site-packages/funcy/decorators.py", line 45, in wrapper
    return deco(call, *dargs, **dkwargs)
  File "/XXX/.venv/lib/python3.8/site-packages/funcy/flow.py", line 274, in wrap_with
    return call()
  File "/XXX/.venv/lib/python3.8/site-packages/funcy/decorators.py", line 66, in __call__
    return self._func(*self._args, **self._kwargs)
  File "/XXX/.venv/lib/python3.8/site-packages/dvc/external_repo.py", line 239, in _clone_default_branch
    git = clone(url, clone_path)
  File "/XXX/.venv/lib/python3.8/site-packages/dvc/scm.py", line 150, in clone
    raise CloneError(str(exc))
dvc.scm.CloneError: Failed to clone repo 'XXXprivate-gitlab.com:XXXXX/my-dvc-proj.git' to '/var/folders/r7/6vs0hrvd2gl7z35_lm1vr5g00000gn/T/tmp7fk69hrtdvc-clone'
------------------------------------------------------------
2022-12-12 19:11:25,588 DEBUG: Analytics is enabled.
2022-12-12 19:11:25,663 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/var/folders/r7/6vs0hrvd2gl7z35_lm1vr5g00000gn/T/tmpnffxtfbh']'
2022-12-12 19:11:25,665 DEBUG: Spawned '['daemon', '-q', 'analytics', '/var/folders/r7/6vs0hrvd2gl7z35_lm1vr5g00000gn/T/tmpnffxtfbh']'

@dtrifiro dtrifiro added the A: data-sync Related to dvc get/fetch/import/pull/push label Dec 13, 2022
@dtrifiro
Contributor

dtrifiro commented Dec 13, 2022

Thanks! Do you have a ProxyCommand entry (or some other custom configuration) in your .ssh/config? Knowing that might help us reproduce the issue.

@dtrifiro dtrifiro self-assigned this Dec 13, 2022
@efiop efiop added the awaiting response we are waiting for your reply, please respond! :) label Dec 13, 2022
@dekromp
Author

dekromp commented Dec 13, 2022

Yes, I do. Since XXXprivate-gitlab.com is inside the private network, ProxyCommand is set to none for it:

Host *XXXprivate-gitlab.com
   ProxyCommand none

For all other hosts I use corkscrew to tunnel through the proxy.

@shcheklein shcheklein removed the awaiting response we are waiting for your reply, please respond! :) label Dec 13, 2022
@dtrifiro
Contributor

dtrifiro commented Dec 14, 2022

It seems our ssh backend (asyncssh) tries to execute none as a proxy command instead of interpreting it as a directive to disable ProxyCommand. Marking this as an upstream issue (ronf/asyncssh#528).
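
To illustrate the difference, here is a minimal sketch of the two interpretations. Both functions are hypothetical helpers written for this thread, not asyncssh's actual code, and the proxy host and port in the usage example are made up. The special meaning of none comes from ssh_config(5); matching it case-insensitively is an assumption of the sketch.

```python
import shlex
from typing import List, Optional


def naive_proxy_command(value: str) -> List[str]:
    """Behavior described above: every value is treated as a program to run."""
    return shlex.split(value)


def resolve_proxy_command(value: Optional[str]) -> Optional[List[str]]:
    """Return the argv for the proxy command, or None when no proxy applies.

    Hypothetical helper for illustration only.
    """
    if value is None:
        return None
    stripped = value.strip()
    # ssh_config(5): setting ProxyCommand to "none" disables the option.
    # The case-insensitive comparison is an assumption of this sketch.
    if not stripped or stripped.lower() == "none":
        return None
    return shlex.split(stripped)
```

With the naive version, `ProxyCommand none` yields the argv `["none"]`, and handing that to a subprocess call fails with the `FileNotFoundError: [Errno 2] No such file or directory: 'none'` seen in the traceback. With the second version, `resolve_proxy_command("none")` returns `None`, so the connection would be opened directly, while something like `resolve_proxy_command("corkscrew proxy.example 8080 %h %p")` still produces a command to run.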

@dtrifiro dtrifiro added the upstream Issues which need to be resolved in an upstream dependency label Dec 14, 2022
@dtrifiro
Contributor

This has been solved upstream. Waiting for an asyncssh release (see ronf/asyncssh#528)

@dtrifiro
Contributor

Released in asyncssh 2.13.0
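
For anyone who wants to check whether their environment already has the fix, a rough sketch follows. It assumes asyncssh is installed in the same environment as DVC and that comparing the first three numeric version components is enough; pre-release suffixes are ignored. The helper names are made up for this example.

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Optional, Tuple

FIXED_IN = (2, 13, 0)  # asyncssh release named in this thread


def parse_release(text: str) -> Tuple[int, int, int]:
    """Parse up to three leading numeric components, padding with zeros."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"not a version string: {text!r}")
    major, minor, patch = (int(group or 0) for group in match.groups())
    return (major, minor, patch)


def has_proxycommand_fix(installed: str) -> bool:
    """True if the given asyncssh version is at least 2.13.0."""
    return parse_release(installed) >= FIXED_IN


def installed_asyncssh() -> Optional[str]:
    """Return the installed asyncssh version, or None if it is not installed."""
    try:
        return version("asyncssh")
    except PackageNotFoundError:
        return None
```

If `installed_asyncssh()` returns a version for which `has_proxycommand_fix` is false, upgrading with `pip install -U "asyncssh>=2.13.0"` in the environment where DVC is installed should pick up the release mentioned above.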
