DataLoaders(..., parallel=true) hanging #132

Open
ablaom opened this issue Dec 7, 2022 · 1 comment
ablaom commented Dec 7, 2022

In the following MWE I successfully create an out-of-memory data source of 20 MNIST images using FileDataset. I can then wrap the source as an MLUtils.DataLoader with the default parallel=false option and collect the result. However, if I specify parallel=true, then collect hangs.

using Pkg
Pkg.activate("data", shared=true)
import MLDatasets: MNIST
using MLDatasets
using ScientificTypes
using MLUtils
using FileIO

ENV["DATADEPS_ALWAYS_ACCEPT"] = true
images, labels = MNIST(split=:train)[:];

N = 20
images = coerce(images, GrayImage)[1:N];

# save some MNIST images as tiff files:
const dir = mktempdir()  # tempname() only returns a path; the directory must exist before saving
for i in eachindex(images)
    filename = joinpath(dir, "$i.tiff")
    FileIO.save(filename, images[i])
end

# create out-of-memory image source:
X = MLDatasets.FileDataset(dir)

sequential = DataLoader(X, batchsize=2, collate=true)
collect(sequential) # executes as expected

parallel = DataLoader(X, batchsize=2, collate=true, parallel=true);
collect(parallel); # hangs

Here's my setup:

(@data) pkg> status
Status `~/.julia/environments/data/Project.toml`
  [5789e2e9] FileIO v1.16.0
  [82e4d734] ImageIO v0.6.6
  [eb30cadb] MLDatasets v0.7.6
  [f1d291b0] MLUtils v0.3.1
  [321657f4] ScientificTypes v3.0.2

julia> versioninfo()
Julia Version 1.8.3
Commit 0434deb161e (2022-11-14 20:14 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin21.4.0)
  CPU: 12 × Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, skylake)
  Threads: 5 on 12 virtual cores
Environment:
  JULIA_LTS_PATH = /Applications/Julia-1.6.app/Contents/Resources/julia/bin/julia
  JULIA_PATH = /Applications/Julia-1.8.app/Contents/Resources/julia/bin/julia
  JULIA_EGLOT_PATH = /Applications/Julia-1.6.app/Contents/Resources/julia/bin/julia
  JULIA_NUM_THREADS = 5
  JULIA_NIGHTLY_PATH = /Applications/Julia-1.8.app/Contents/Resources/julia/bin/julia

RomeoV commented Mar 9, 2023

Try overriding the default executor:

import FastAI.Flux.MLUtils._default_executor
_default_executor() = ThreadedEx()

See also #142.
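
For the MWE above, which loads MLUtils directly rather than through FastAI, the same override might look like the sketch below. It assumes the installed MLUtils version defines the internal function `_default_executor` (as the import above suggests) and that Transducers, which provides `ThreadedEx`, is available in the active environment. Since this redefines a non-public function, it may stop working in other releases.

```julia
# Sketch only: overrides an internal MLUtils function, as suggested above.
# Assumption: Transducers is installed in the active environment.
using MLUtils
using Transducers: ThreadedEx

# Replace the executor that MLUtils picks when none is given explicitly.
MLUtils._default_executor() = ThreadedEx()

# Confirm the override took effect and that more than one thread is available.
@show MLUtils._default_executor()
@show Threads.nthreads()
```

With this in place, constructing the `parallel=true` DataLoader from the MWE again and calling `collect` on it shows whether the hang is tied to the original executor.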
