Reproductivity problem with multi-threading #32

Open
terasakisatoshi opened this issue Mar 18, 2022 · 4 comments

Comments

@terasakisatoshi

When I used DataLoaders.jl in my project, especially for deep learning, I encountered a reproducibility problem when multi-threading is enabled. Below is an MWE that demonstrates the issue. Here, MyDataset simply returns idx, the second argument of the getobs method.

# example.jl
module My

import DataLoaders.LearnBase: getobs, nobs
using Random

struct MyDataset
    ndata::Int
end

Base.getindex(dset::MyDataset, idx) = idx
getobs(dset::MyDataset, idx) = dset[idx]
nobs(dset::MyDataset) = dset.ndata

end # My

using DataLoaders
using Random

using .My

MyDataset = My.MyDataset

ntrial = 3

for t in 1:ntrial
    dset = MyDataset(10000) # create an instance of MyDataset
    loader = DataLoader(dset, 100) # setup loader
    for batch in loader
        @show batch # <------
        println()
        break
    end
end

As I understand it, for each t in 1:ntrial, @show batch should display the array from 1 to 100, namely:

batch = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100]

However, running the example.jl script above actually prints something like:

$ julia --threads=12 example.jl # num thread = 12
batch = [301, 302, 303, 304, 305, 306, 307, 308, 309, 310, 311, 312, 313, 314, 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325, 326, 327, 328, 329, 330, 331, 332, 333, 334, 335, 336, 337, 338, 339, 340, 341, 342, 343, 344, 345, 346, 347, 348, 349, 350, 351, 352, 353, 354, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378, 379, 380, 381, 382, 383, 384, 385, 386, 387, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 398, 399, 400]

batch = [401, 402, 403, 404, 405, 406, 407, 408, 409, 410, 411, 412, 413, 414, 415, 416, 417, 418, 419, 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440, 441, 442, 443, 444, 445, 446, 447, 448, 449, 450, 451, 452, 453, 454, 455, 456, 457, 458, 459, 460, 461, 462, 463, 464, 465, 466, 467, 468, 469, 470, 471, 472, 473, 474, 475, 476, 477, 478, 479, 480, 481, 482, 483, 484, 485, 486, 487, 488, 489, 490, 491, 492, 493, 494, 495, 496, 497, 498, 499, 500]

batch = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100]

This phenomenon occurs whenever the number of threads is greater than 1.

@terasakisatoshi
Author

terasakisatoshi commented Mar 18, 2022

Below is my output of versioninfo()

                _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.7.2 (2022-02-06)
 _/ |\__'_|_|_|\__'_|  |  Official https://julialang.org/ release
|__/                   |

julia>  versioninfo()
Julia Version 1.7.2
Commit bf53498635 (2022-02-06 15:21 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin19.5.0)
  CPU: Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, skylake)

(EDIT): I tested this with DataLoaders v0.1.3:

(@v1.7) pkg> st DataLoaders
      Status `~/.julia/environments/v1.7/Project.toml`
  [2e981812] DataLoaders v0.1.3

@lorenzoh
Owner

lorenzoh commented Mar 18, 2022

DataLoader with multiple threads uses eachobsparallel, which does not guarantee a deterministic ordering.
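
To illustrate why a threaded loader can hand out batches in a different order on every run, here is a minimal sketch in plain Julia. It does not use DataLoaders.jl or eachobsparallel; `unordered_batches` is a hypothetical stand-in that spawns one task per batch and collects the batches in whatever order the tasks finish.

```julia
# Hypothetical stand-in for a parallel loader (not the DataLoaders.jl implementation):
# every batch is built by its own task and pushed onto a shared channel when ready.
function unordered_batches(nbatches::Int, batchsize::Int)
    ch = Channel{Vector{Int}}(nbatches)  # buffered, so put! never blocks
    @sync for b in 1:nbatches
        Threads.@spawn begin
            first_idx = (b - 1) * batchsize + 1
            put!(ch, collect(first_idx:(first_idx + batchsize - 1)))
        end
    end
    close(ch)
    # Arrival order depends on task scheduling, not on the batch index.
    return collect(ch)
end

batches = unordered_batches(8, 100)
@show first.(batches)  # with several threads, not necessarily 1, 101, 201, ...
```

Every batch is still produced exactly once; only the order in which they arrive can vary between runs when Julia is started with more than one thread.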

DataLoaders.jl functionality is currently being added to MLUtils.jl (see JuliaML/MLUtils.jl#33) and I am thinking to add an optional wrapper that reorders the batches, at the cost of some performance likely.

I won't add this here, though, since MLUtils.jl will supersede DataLoaders.jl. I'll leave this open and update once the functionality exists there 👍
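
A reordering wrapper of the kind mentioned above could look roughly like the following sketch. This is only an illustration under stated assumptions, not the code planned for MLUtils.jl: it assumes each batch arrives tagged with its index as an `(index, batch)` tuple, and `inorder` is a hypothetical helper name. Batches that arrive early are held back until all earlier ones have been emitted, which is where the performance cost comes from.

```julia
# Hypothetical helper (not part of DataLoaders.jl or MLUtils.jl): consumes
# (index, batch) pairs in arbitrary order and yields batches in index order.
function inorder(tagged::Channel{Tuple{Int,T}}) where {T}
    return Channel{T}() do out
        pending = Dict{Int,T}()  # batches that arrived before their turn
        next = 1                 # index of the batch to emit next
        for (i, batch) in tagged
            pending[i] = batch
            # Flush every batch that is now contiguous with what was emitted.
            while haskey(pending, next)
                put!(out, pop!(pending, next))
                next += 1
            end
        end
    end
end

# Usage: feed four tagged batches out of order and read them back in order.
tagged = Channel{Tuple{Int,Vector{Int}}}(4)
for i in (3, 1, 4, 2)
    put!(tagged, (i, collect(((i - 1) * 100 + 1):(i * 100))))
end
close(tagged)
ordered = collect(inorder(tagged))
@show first.(ordered)
```

In the worst case the buffer holds every batch except the first, so a real implementation would probably want to bound how far ahead the workers may run.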

@terasakisatoshi
Author

Thank you for your quick reply!

DataLoader with multiple threads uses eachobsparallel, which does not guarantee a deterministic ordering.

OK. For me, reproducibility of experiments is important when evaluating performance in terms of precision, accuracy, and so on.

I will also check out MLUtils.jl.

I am thinking to add an optional wrapper that reorders the batches, at the cost of some performance likely.

Great! Let me know when you are done.

@lorenzoh
Owner

I made an issue that you can subscribe to :) JuliaML/MLUtils.jl#68
