Failure to create the BPE dataset for custom dataset #22

Open
guanqun-yang opened this issue Aug 18, 2021 · 6 comments

@guanqun-yang

guanqun-yang commented Aug 18, 2021

Hi,

I am trying to train a style transfer model for a style pair (profane vs. civil) that is not covered in the paper. However, when I run the first step as instructed in the repository,

python datasets/dataset2bpe.py --dataset datasets/golbeck

where datasets/golbeck is a dataset of toxic comments with the required directory structure:

golbeck/
├── dev.label
├── dev.txt
├── test.label
├── test.txt
├── train.label
└── train.txt

a series of dependency-related errors is reported:

[...]lib/python3.6/site-packages/torch/include/ATen/core/TensorMethods.h:1417:15: note:   no known conversion for argument 1 from ‘<brace-enclosed initializer list>’ to ‘at::TensorList {aka c10::ArrayRef<at::Tensor>}’
error: command 'gcc' failed with exit status 1

It seems to me that this is related to the installation. Here are the full installation commands I used:

conda create --name strap python==3.6.3
conda activate strap

conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch

pip install -r requirements.txt
pip install -e .

cd fairseq
pip install -e .

# this line is required; otherwise running dataset2bpe.py reports missing dependencies
conda install -c conda-forge hydra-core omegaconf

I am wondering how I can resolve this issue. To reproduce the error, the data is provided here.
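
As a side note, here is a quick sanity check on the dataset itself (a minimal sketch, assuming one example per line with each *.txt file parallel to its *.label file, which is how I prepared the data):

# verify each split's .txt and .label files have the same number of lines
import os

for split in ["train", "dev", "test"]:
    txt_path = os.path.join("datasets/golbeck", split + ".txt")
    label_path = os.path.join("datasets/golbeck", split + ".label")
    with open(txt_path) as f_txt, open(label_path) as f_label:
        num_txt = sum(1 for _ in f_txt)
        num_label = sum(1 for _ in f_label)
    assert num_txt == num_label, split + ": sentence/label count mismatch"
    print(split, num_txt, "examples")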

@martiansideofthemoon
Owner

Hi @guanqun-yang, thanks for reporting this issue. Could you provide a full stack trace pointing to the Python line where the error arises? Also, with your installation setup, are you able to run the scripts for tasks like Shakespeare transfer?

@guanqun-yang
Author

@martiansideofthemoon Thank you for your prompt reply!

I reconfigured the whole environment using virtualenv (rather than conda, as in my original post). The Shakespeare style transfer appears to be runnable (it is still running), which suggests the installation is correct. But I keep getting errors similar to the ones I got yesterday.

Here is the full error trace:

Using cache found in /home/yang/.cache/torch/hub/pytorch_fairseq_master
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
running build_ext
cythoning fairseq/data/data_utils_fast.pyx to fairseq/data/data_utils_fast.cpp
cythoning fairseq/data/token_block_utils_fast.pyx to fairseq/data/token_block_utils_fast.cpp
building 'fairseq.libbleu' extension
creating /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8
creating /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq
creating /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/clib
creating /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/clib/libbleu
Emitting ninja build file /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/2] c++ -MMD -MF /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/clib/libbleu/module.o.d -pthread -B /home/yang/Essential/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/yang/Essential/anaconda3/include/python3.8 -c -c /home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/clib/libbleu/module.cpp -o /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/clib/libbleu/module.o -std=c++11 -O3 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=libbleu -D_GLIBCXX_USE_CXX11_ABI=0
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
[2/2] c++ -MMD -MF /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/clib/libbleu/libbleu.o.d -pthread -B /home/yang/Essential/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/yang/Essential/anaconda3/include/python3.8 -c -c /home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/clib/libbleu/libbleu.cpp -o /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/clib/libbleu/libbleu.o -std=c++11 -O3 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=libbleu -D_GLIBCXX_USE_CXX11_ABI=0
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
creating build/lib.linux-x86_64-3.8
creating build/lib.linux-x86_64-3.8/fairseq
g++ -pthread -shared -B /home/yang/Essential/anaconda3/compiler_compat -L/home/yang/Essential/anaconda3/lib -Wl,-rpath=/home/yang/Essential/anaconda3/lib -Wl,--no-as-needed -Wl,--sysroot=/ /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/clib/libbleu/libbleu.o /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/clib/libbleu/module.o -o build/lib.linux-x86_64-3.8/fairseq/libbleu.cpython-38-x86_64-linux-gnu.so
building 'fairseq.data.data_utils_fast' extension
creating /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/data
Emitting ninja build file /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] c++ -MMD -MF /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/data/data_utils_fast.o.d -pthread -B /home/yang/Essential/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/yang/Essential/anaconda3/lib/python3.8/site-packages/numpy/core/include -I/home/yang/Essential/anaconda3/lib/python3.8/site-packages/numpy/core/include -I/home/yang/Essential/anaconda3/include/python3.8 -c -c /home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/data/data_utils_fast.cpp -o /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/data/data_utils_fast.o -std=c++11 -O3 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=data_utils_fast -D_GLIBCXX_USE_CXX11_ABI=0
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from /home/yang/Essential/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/ndarraytypes.h:1822:0,
                 from /home/yang/Essential/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
                 from /home/yang/Essential/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/arrayobject.h:4,
                 from /home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/data/data_utils_fast.cpp:624:
/home/yang/Essential/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
 #warning "Using deprecated NumPy API, disable it with " \
  ^~~~~~~
creating build/lib.linux-x86_64-3.8/fairseq/data
g++ -pthread -shared -B /home/yang/Essential/anaconda3/compiler_compat -L/home/yang/Essential/anaconda3/lib -Wl,-rpath=/home/yang/Essential/anaconda3/lib -Wl,--no-as-needed -Wl,--sysroot=/ /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/data/data_utils_fast.o -o build/lib.linux-x86_64-3.8/fairseq/data/data_utils_fast.cpython-38-x86_64-linux-gnu.so
building 'fairseq.data.token_block_utils_fast' extension
Emitting ninja build file /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] c++ -MMD -MF /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/data/token_block_utils_fast.o.d -pthread -B /home/yang/Essential/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/yang/Essential/anaconda3/lib/python3.8/site-packages/numpy/core/include -I/home/yang/Essential/anaconda3/lib/python3.8/site-packages/numpy/core/include -I/home/yang/Essential/anaconda3/include/python3.8 -c -c /home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/data/token_block_utils_fast.cpp -o /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/data/token_block_utils_fast.o -std=c++11 -O3 -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=token_block_utils_fast -D_GLIBCXX_USE_CXX11_ABI=0
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from /home/yang/Essential/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/ndarraytypes.h:1822:0,
                 from /home/yang/Essential/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
                 from /home/yang/Essential/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/arrayobject.h:4,
                 from /home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/data/token_block_utils_fast.cpp:625:
/home/yang/Essential/anaconda3/lib/python3.8/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
 #warning "Using deprecated NumPy API, disable it with " \
  ^~~~~~~
/home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/data/token_block_utils_fast.cpp: In function ‘PyArrayObject* __pyx_f_7fairseq_4data_22token_block_utils_fast__get_slice_indices_fast(PyArrayObject*, PyObject*, int, int, int)’:
/home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/data/token_block_utils_fast.cpp:3290:36: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
       __pyx_t_4 = ((__pyx_v_sz_idx < __pyx_t_10) != 0);
                     ~~~~~~~~~~~~~~~^~~~~~~~~~~~
/home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/data/token_block_utils_fast.cpp:3485:36: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
       __pyx_t_3 = ((__pyx_v_sz_idx < __pyx_t_10) != 0);
                     ~~~~~~~~~~~~~~~^~~~~~~~~~~~
g++ -pthread -shared -B /home/yang/Essential/anaconda3/compiler_compat -L/home/yang/Essential/anaconda3/lib -Wl,-rpath=/home/yang/Essential/anaconda3/lib -Wl,--no-as-needed -Wl,--sysroot=/ /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/data/token_block_utils_fast.o -o build/lib.linux-x86_64-3.8/fairseq/data/token_block_utils_fast.cpython-38-x86_64-linux-gnu.so
building 'fairseq.libbase' extension
creating /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/clib/libbase
Emitting ninja build file /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] c++ -MMD -MF /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/clib/libbase/balanced_assignment.o.d -pthread -B /home/yang/Essential/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include -I/home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/TH -I/home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/THC -I/home/yang/Essential/anaconda3/include/python3.8 -c -c /home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/clib/libbase/balanced_assignment.cpp -o /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/clib/libbase/balanced_assignment.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=libbase -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from /home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/ATen/Parallel.h:149:0,
                 from /home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/utils.h:3,
                 from /home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/nn/cloneable.h:5,
                 from /home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/nn.h:3,
                 from /home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/all.h:12,
                 from /home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/torch/extension.h:4,
                 from /home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/clib/libbase/balanced_assignment.cpp:16:
/home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/ATen/ParallelOpenMP.h:84:0: warning: ignoring #pragma omp parallel [-Wunknown-pragmas]
 #pragma omp parallel for if ((end - begin) >= grain_size)
 
g++ -pthread -shared -B /home/yang/Essential/anaconda3/compiler_compat -L/home/yang/Essential/anaconda3/lib -Wl,-rpath=/home/yang/Essential/anaconda3/lib -Wl,--no-as-needed -Wl,--sysroot=/ /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/clib/libbase/balanced_assignment.o -L/home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/lib -lc10 -ltorch -ltorch_cpu -ltorch_python -o build/lib.linux-x86_64-3.8/fairseq/libbase.cpython-38-x86_64-linux-gnu.so
building 'fairseq.libnat' extension
creating /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/clib/libnat
Emitting ninja build file /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] c++ -MMD -MF /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/clib/libnat/edit_dist.o.d -pthread -B /home/yang/Essential/anaconda3/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include -I/home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include -I/home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/TH -I/home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/THC -I/home/yang/Essential/anaconda3/include/python3.8 -c -c /home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/clib/libnat/edit_dist.cpp -o /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/clib/libnat/edit_dist.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=libnat -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++14
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
In file included from /home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/ATen/Parallel.h:149:0,
                 from /home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/utils.h:3,
                 from /home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/nn/cloneable.h:5,
                 from /home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/nn.h:3,
                 from /home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/all.h:12,
                 from /home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/torch/csrc/api/include/torch/torch.h:3,
                 from /home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/clib/libnat/edit_dist.cpp:11:
/home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/ATen/ParallelOpenMP.h:84:0: warning: ignoring #pragma omp parallel [-Wunknown-pragmas]
 #pragma omp parallel for if ((end - begin) >= grain_size)
 
In file included from /home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/clib/libnat/edit_dist.cpp:9:0:
/home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/pybind11/detail/common.h: In function ‘void pybind11::pybind11_fail(const string&)’:
/home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/pybind11/detail/common.h:680:83: warning: inline declaration of ‘void pybind11::pybind11_fail(const string&)’ follows declaration with attribute noinline [-Wattributes]
 [[noreturn]] PYBIND11_NOINLINE inline void pybind11_fail(const std::string &reason) { throw std::runtime_error(reason); }
                                                                                   ^
/home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/pybind11/detail/common.h:679:44: note: previous definition of ‘void pybind11::pybind11_fail(const char*)’ was here
 [[noreturn]] PYBIND11_NOINLINE inline void pybind11_fail(const char *reason) { throw std::runtime_error(reason); }
                                            ^~~~~~~~~~~~~
/home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/pybind11/detail/common.h:680:83: warning: inline declaration of ‘void pybind11::pybind11_fail(const string&)’ follows declaration with attribute noinline [-Wattributes]
 [[noreturn]] PYBIND11_NOINLINE inline void pybind11_fail(const std::string &reason) { throw std::runtime_error(reason); }
                                                                                   ^
/home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/include/pybind11/detail/common.h:679:44: note: previous definition of ‘void pybind11::pybind11_fail(const char*)’ was here
 [[noreturn]] PYBIND11_NOINLINE inline void pybind11_fail(const char *reason) { throw std::runtime_error(reason); }
                                            ^~~~~~~~~~~~~
g++ -pthread -shared -B /home/yang/Essential/anaconda3/compiler_compat -L/home/yang/Essential/anaconda3/lib -Wl,-rpath=/home/yang/Essential/anaconda3/lib -Wl,--no-as-needed -Wl,--sysroot=/ /home/yang/.cache/torch/hub/pytorch_fairseq_master/build/temp.linux-x86_64-3.8/fairseq/clib/libnat/edit_dist.o -L/home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/lib -lc10 -ltorch -ltorch_cpu -ltorch_python -o build/lib.linux-x86_64-3.8/fairseq/libnat.cpython-38-x86_64-linux-gnu.so
copying build/lib.linux-x86_64-3.8/fairseq/libbleu.cpython-38-x86_64-linux-gnu.so -> fairseq
copying build/lib.linux-x86_64-3.8/fairseq/data/data_utils_fast.cpython-38-x86_64-linux-gnu.so -> fairseq/data
copying build/lib.linux-x86_64-3.8/fairseq/data/token_block_utils_fast.cpython-38-x86_64-linux-gnu.so -> fairseq/data
copying build/lib.linux-x86_64-3.8/fairseq/libbase.cpython-38-x86_64-linux-gnu.so -> fairseq
copying build/lib.linux-x86_64-3.8/fairseq/libnat.cpython-38-x86_64-linux-gnu.so -> fairseq
/home/yang/Essential/anaconda3/lib/python3.8/site-packages/hydra/experimental/initialize.py:35: UserWarning: hydra.experimental.initialize() is no longer experimental. Use hydra.initialize()
  warnings.warn(
Error when composing. Overrides: ['common.no_progress_bar=False', 'common.log_interval=25', "common.log_format='json'", 'common.log_file=null', 'common.tensorboard_logdir=null', 'common.wandb_project=null', 'common.azureml_logging=False', 'common.seed=1', 'common.cpu=False', 'common.tpu=False', 'common.bf16=False', 'common.memory_efficient_bf16=False', 'common.fp16=True', 'common.memory_efficient_fp16=True', 'common.fp16_no_flatten_grads=False', 'common.fp16_init_scale=4', 'common.fp16_scale_window=128', 'common.fp16_scale_tolerance=0.0', 'common.on_cpu_convert_precision=False', 'common.min_loss_scale=0.0001', 'common.threshold_loss_scale=1.0', 'common.amp=False', 'common.amp_batch_retries=2', 'common.amp_init_scale=128', 'common.amp_scale_window=null', 'common.user_dir=null', 'common.empty_cache_freq=0', 'common.all_gather_list_size=16384', 'common.model_parallel_size=1', 'common.quantization_config_path=null', 'common.profile=False', 'common.reset_logging=False', 'common.suppress_crashes=False', 'common.use_plasma_view=False', "common.plasma_path='/tmp/plasma'", 'common_eval.path=null', 'common_eval.post_process=null', 'common_eval.quiet=False', "common_eval.model_overrides='{}'", 'common_eval.results_path=null', 'distributed_training.distributed_world_size=512', 'distributed_training.distributed_num_procs=1', 'distributed_training.distributed_rank=0', "distributed_training.distributed_backend='nccl'", 'distributed_training.distributed_init_method=null', 'distributed_training.distributed_port=19812', 'distributed_training.device_id=0', 'distributed_training.distributed_no_spawn=False', "distributed_training.ddp_backend='c10d'", "distributed_training.ddp_comm_hook='none'", 'distributed_training.bucket_cap_mb=200', 'distributed_training.fix_batches_to_gpus=False', 'distributed_training.find_unused_parameters=True', 'distributed_training.fast_stat_sync=False', 'distributed_training.heartbeat_timeout=-1', 'distributed_training.broadcast_buffers=False', 'distributed_training.slowmo_momentum=null', "distributed_training.slowmo_algorithm='LocalSGD'", 'distributed_training.localsgd_frequency=3', 'distributed_training.nprocs_per_node=1', 'distributed_training.pipeline_model_parallel=False', 'distributed_training.pipeline_balance=null', 'distributed_training.pipeline_devices=null', 'distributed_training.pipeline_chunks=0', 'distributed_training.pipeline_encoder_balance=null', 'distributed_training.pipeline_encoder_devices=null', 'distributed_training.pipeline_decoder_balance=null', 'distributed_training.pipeline_decoder_devices=null', "distributed_training.pipeline_checkpoint='never'", "distributed_training.zero_sharding='none'", 'distributed_training.fp16=True', 'distributed_training.memory_efficient_fp16=True', 'distributed_training.tpu=True', 'distributed_training.no_reshard_after_forward=False', 'distributed_training.fp32_reduce_scatter=False', 'distributed_training.cpu_offload=False', 'distributed_training.use_sharded_state=False', 'dataset.num_workers=2', 'dataset.skip_invalid_size_inputs_valid_test=True', 'dataset.max_tokens=999999', 'dataset.batch_size=null', 'dataset.required_batch_size_multiple=1', 'dataset.required_seq_len_multiple=1', "dataset.dataset_impl='mmap'", 'dataset.data_buffer_size=10', "dataset.train_subset='train'", "dataset.valid_subset='valid'", 'dataset.combine_valid_subsets=null', 'dataset.ignore_unused_valid_subsets=False', 'dataset.validate_interval=1', 'dataset.validate_interval_updates=0', 'dataset.validate_after_updates=0', 'dataset.fixed_validation_seed=null', 
'dataset.disable_validation=False', "dataset.max_tokens_valid='${dataset.max_tokens}'", "dataset.batch_size_valid='${dataset.batch_size}'", 'dataset.max_valid_steps=null', 'dataset.curriculum=0', "dataset.gen_subset='test'", 'dataset.num_shards=1', 'dataset.shard_id=0', 'optimization.max_epoch=0', 'optimization.max_update=500000', 'optimization.stop_time_hours=0.0', 'optimization.clip_norm=0.0', 'optimization.sentence_avg=False', 'optimization.update_freq=[1]', 'optimization.lr=[0.0006]', 'optimization.stop_min_lr=-1.0', 'optimization.use_bmuf=False', "checkpoint.save_dir='checkpoints'", "checkpoint.restore_file='checkpoint_last.pt'", 'checkpoint.finetune_from_model=null', 'checkpoint.reset_dataloader=True', 'checkpoint.reset_lr_scheduler=False', 'checkpoint.reset_meters=False', 'checkpoint.reset_optimizer=False', "checkpoint.optimizer_overrides='{}'", 'checkpoint.save_interval=1', 'checkpoint.save_interval_updates=2000', 'checkpoint.keep_interval_updates=-1', 'checkpoint.keep_interval_updates_pattern=-1', 'checkpoint.keep_last_epochs=-1', 'checkpoint.keep_best_checkpoints=-1', 'checkpoint.no_save=False', 'checkpoint.no_epoch_checkpoints=True', 'checkpoint.no_last_checkpoints=False', 'checkpoint.no_save_optimizer_state=False', "checkpoint.best_checkpoint_metric='loss'", 'checkpoint.maximize_best_checkpoint_metric=False', 'checkpoint.patience=-1', "checkpoint.checkpoint_suffix=''", 'checkpoint.checkpoint_shard_count=1', 'checkpoint.load_checkpoint_on_all_dp_ranks=False', 'checkpoint.write_checkpoints_asynchronously=False', "checkpoint.model_parallel_size='${common.model_parallel_size}'", 'bmuf.block_lr=1.0', 'bmuf.block_momentum=0.875', 'bmuf.global_sync_iter=10', 'bmuf.warmup_iterations=500', 'bmuf.use_nbm=False', 'bmuf.average_sync=False', 'bmuf.distributed_world_size=512', 'generation.beam=5', 'generation.nbest=1', 'generation.max_len_a=0.0', 'generation.max_len_b=200', 'generation.min_len=1', 'generation.match_source_len=False', 'generation.unnormalized=False', 'generation.no_early_stop=False', 'generation.no_beamable_mm=False', 'generation.lenpen=1.0', 'generation.unkpen=0.0', 'generation.replace_unk=null', 'generation.sacrebleu=False', 'generation.score_reference=False', 'generation.prefix_size=0', 'generation.no_repeat_ngram_size=0', 'generation.sampling=False', 'generation.sampling_topk=-1', 'generation.sampling_topp=-1.0', 'generation.constraints=null', 'generation.temperature=1.0', 'generation.diverse_beam_groups=-1', 'generation.diverse_beam_strength=0.5', 'generation.diversity_rate=-1.0', 'generation.print_alignment=null', 'generation.print_step=False', 'generation.lm_path=null', 'generation.lm_weight=0.0', 'generation.iter_decode_eos_penalty=0.0', 'generation.iter_decode_max_iter=10', 'generation.iter_decode_force_max_iter=False', 'generation.iter_decode_with_beam=1', 'generation.iter_decode_with_external_reranker=False', 'generation.retain_iter_history=False', 'generation.retain_dropout=False', 'generation.retain_dropout_modules=null', 'generation.decoding_format=null', 'generation.no_seed_provided=False', 'eval_lm.output_word_probs=False', 'eval_lm.output_word_stats=False', 'eval_lm.context_window=0', 'eval_lm.softmax_batch=9223372036854775807', 'interactive.buffer_size=0', "interactive.input='-'", 'task=masked_lm', 'task._name=masked_lm', "task.data='/home/yang/.cache/torch/pytorch_fairseq/37d2bc14cf6332d61ed5abeb579948e6054e46cc724c7d23426382d11a31b2d6.ae5852b4abc6bf762e0b6b30f19e741aa05562471e9eb8f4a6ae261f04f9b350'", "task.sample_break_mode='complete'", 
'task.tokens_per_sample=512', 'task.mask_prob=0.15', 'task.leave_unmasked_prob=0.1', 'task.random_token_prob=0.1', 'task.freq_weighted_replacement=False', 'task.mask_whole_words=False', 'task.mask_multiple_length=1', 'task.mask_stdev=0.0', "task.shorten_method='none'", "task.shorten_data_split_list=''", 'task.seed=1', 'criterion=masked_lm', 'criterion._name=masked_lm', 'criterion.tpu=True', 'bpe=gpt2', 'bpe._name=gpt2', "bpe.gpt2_encoder_json='https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/encoder.json'", "bpe.gpt2_vocab_bpe='https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/vocab.bpe'", 'optimizer=adam', 'optimizer._name=adam', "optimizer.adam_betas='(0.9, 0.98)'", 'optimizer.adam_eps=1e-06', 'optimizer.weight_decay=0.01', 'optimizer.use_old_adam=False', 'optimizer.fp16_adam_stats=False', 'optimizer.tpu=True', 'optimizer.lr=[0.0006]', 'lr_scheduler=polynomial_decay', 'lr_scheduler._name=polynomial_decay', 'lr_scheduler.warmup_updates=24000', 'lr_scheduler.force_anneal=null', 'lr_scheduler.end_learning_rate=0.0', 'lr_scheduler.power=1.0', 'lr_scheduler.total_num_update=500000.0', 'lr_scheduler.lr=[0.0006]']
Traceback (most recent call last):
  File "dataset2bpe.py", line 10, in <module>
    roberta = torch.hub.load('pytorch/fairseq', 'roberta.base')
  File "/home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/hub.py", line 370, in load
    model = _load_local(repo_or_dir, model, *args, **kwargs)
  File "/home/yang/Essential/anaconda3/lib/python3.8/site-packages/torch/hub.py", line 399, in _load_local
    model = entry(*args, **kwargs)
  File "/home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/models/roberta/model.py", line 277, in from_pretrained
    x = hub_utils.from_pretrained(
  File "/home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/hub_utils.py", line 73, in from_pretrained
    models, args, task = checkpoint_utils.load_model_ensemble_and_task(
  File "/home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/checkpoint_utils.py", line 421, in load_model_ensemble_and_task
    state = load_checkpoint_to_cpu(filename, arg_overrides)
  File "/home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/checkpoint_utils.py", line 339, in load_checkpoint_to_cpu
    state = _upgrade_state_dict(state)
  File "/home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/checkpoint_utils.py", line 643, in _upgrade_state_dict
    state["cfg"] = convert_namespace_to_omegaconf(state["args"])
  File "/home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/dataclass/utils.py", line 389, in convert_namespace_to_omegaconf
    composed_cfg = compose("config", overrides=overrides, strict=False)
TypeError: compose() got an unexpected keyword argument 'strict'

@martiansideofthemoon
Owner

martiansideofthemoon commented Aug 19, 2021

Hi @guanqun-yang, this is almost certainly a fairseq issue. I think the fairseq you are using is not the local implementation provided in the repository (I can see paths like /home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/dataclass/utils.py in the stack trace rather than local paths). Could you try uninstalling fairseq and installing it again from the local fairseq folder?
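
Concretely, something like this (a sketch, assuming the shell is at the repository root; printing fairseq.__file__ afterwards shows which copy actually gets imported):

pip uninstall -y fairseq
cd fairseq
pip install --editable ./

# check which fairseq installation Python picks up
python -c "import fairseq; print(fairseq.__file__)"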

@guanqun-yang
Author

@martiansideofthemoon Thanks for your reply!

I removed all globally installed copies of fairseq, started afresh with a newly cloned repo, and configured the environment as below. But it seems a copy of fairseq is downloaded to /home/yang/.cache anyway, whether or not there is a global installation:

virtualenv -p python3 style-venv
source style-venv/bin/activate

pip install torch==1.5.0+cu101 torchvision==0.6.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html

pip install -r requirements.txt
pip install --editable ./

cd fairseq
pip install --editable ./

The following is the full stack trace after executing datasets/dataset2bpe.py. It seems that the problem comes from this line.

Using cache found in /home/yang/.cache/torch/hub/pytorch_fairseq_master
/home/yang/style-transfer-paraphrase/style-venv/lib/python3.6/site-packages/hydra/experimental/initialize.py:37: UserWarning: hydra.experimental.initialize() is no longer experimental. Use hydra.initialize()
  message="hydra.experimental.initialize() is no longer experimental."
Error when composing. Overrides: ['common.no_progress_bar=False', 'common.log_interval=25', "common.log_format='json'", 'common.log_file=null', 'common.tensorboard_logdir=null', 'common.wandb_project=null', 'common.azureml_logging=False', 'common.seed=1', 'common.cpu=False', 'common.tpu=False', 'common.bf16=False', 'common.memory_efficient_bf16=False', 'common.fp16=True', 'common.memory_efficient_fp16=True', 'common.fp16_no_flatten_grads=False', 'common.fp16_init_scale=4', 'common.fp16_scale_window=128', 'common.fp16_scale_tolerance=0.0', 'common.on_cpu_convert_precision=False', 'common.min_loss_scale=0.0001', 'common.threshold_loss_scale=1.0', 'common.amp=False', 'common.amp_batch_retries=2', 'common.amp_init_scale=128', 'common.amp_scale_window=null', 'common.user_dir=null', 'common.empty_cache_freq=0', 'common.all_gather_list_size=16384', 'common.model_parallel_size=1', 'common.quantization_config_path=null', 'common.profile=False', 'common.reset_logging=False', 'common.suppress_crashes=False', 'common.use_plasma_view=False', "common.plasma_path='/tmp/plasma'", 'common_eval.path=null', 'common_eval.post_process=null', 'common_eval.quiet=False', "common_eval.model_overrides='{}'", 'common_eval.results_path=null', 'distributed_training.distributed_world_size=512', 'distributed_training.distributed_num_procs=1', 'distributed_training.distributed_rank=0', "distributed_training.distributed_backend='nccl'", 'distributed_training.distributed_init_method=null', 'distributed_training.distributed_port=19812', 'distributed_training.device_id=0', 'distributed_training.distributed_no_spawn=False', "distributed_training.ddp_backend='c10d'", "distributed_training.ddp_comm_hook='none'", 'distributed_training.bucket_cap_mb=200', 'distributed_training.fix_batches_to_gpus=False', 'distributed_training.find_unused_parameters=True', 'distributed_training.fast_stat_sync=False', 'distributed_training.heartbeat_timeout=-1', 'distributed_training.broadcast_buffers=False', 'distributed_training.slowmo_momentum=null', "distributed_training.slowmo_algorithm='LocalSGD'", 'distributed_training.localsgd_frequency=3', 'distributed_training.nprocs_per_node=1', 'distributed_training.pipeline_model_parallel=False', 'distributed_training.pipeline_balance=null', 'distributed_training.pipeline_devices=null', 'distributed_training.pipeline_chunks=0', 'distributed_training.pipeline_encoder_balance=null', 'distributed_training.pipeline_encoder_devices=null', 'distributed_training.pipeline_decoder_balance=null', 'distributed_training.pipeline_decoder_devices=null', "distributed_training.pipeline_checkpoint='never'", "distributed_training.zero_sharding='none'", 'distributed_training.fp16=True', 'distributed_training.memory_efficient_fp16=True', 'distributed_training.tpu=True', 'distributed_training.no_reshard_after_forward=False', 'distributed_training.fp32_reduce_scatter=False', 'distributed_training.cpu_offload=False', 'distributed_training.use_sharded_state=False', 'dataset.num_workers=2', 'dataset.skip_invalid_size_inputs_valid_test=True', 'dataset.max_tokens=999999', 'dataset.batch_size=null', 'dataset.required_batch_size_multiple=1', 'dataset.required_seq_len_multiple=1', "dataset.dataset_impl='mmap'", 'dataset.data_buffer_size=10', "dataset.train_subset='train'", "dataset.valid_subset='valid'", 'dataset.combine_valid_subsets=null', 'dataset.ignore_unused_valid_subsets=False', 'dataset.validate_interval=1', 'dataset.validate_interval_updates=0', 'dataset.validate_after_updates=0', 'dataset.fixed_validation_seed=null', 
'dataset.disable_validation=False', "dataset.max_tokens_valid='${dataset.max_tokens}'", "dataset.batch_size_valid='${dataset.batch_size}'", 'dataset.max_valid_steps=null', 'dataset.curriculum=0', "dataset.gen_subset='test'", 'dataset.num_shards=1', 'dataset.shard_id=0', 'optimization.max_epoch=0', 'optimization.max_update=500000', 'optimization.stop_time_hours=0.0', 'optimization.clip_norm=0.0', 'optimization.sentence_avg=False', 'optimization.update_freq=[1]', 'optimization.lr=[0.0006]', 'optimization.stop_min_lr=-1.0', 'optimization.use_bmuf=False', "checkpoint.save_dir='checkpoints'", "checkpoint.restore_file='checkpoint_last.pt'", 'checkpoint.finetune_from_model=null', 'checkpoint.reset_dataloader=True', 'checkpoint.reset_lr_scheduler=False', 'checkpoint.reset_meters=False', 'checkpoint.reset_optimizer=False', "checkpoint.optimizer_overrides='{}'", 'checkpoint.save_interval=1', 'checkpoint.save_interval_updates=2000', 'checkpoint.keep_interval_updates=-1', 'checkpoint.keep_interval_updates_pattern=-1', 'checkpoint.keep_last_epochs=-1', 'checkpoint.keep_best_checkpoints=-1', 'checkpoint.no_save=False', 'checkpoint.no_epoch_checkpoints=True', 'checkpoint.no_last_checkpoints=False', 'checkpoint.no_save_optimizer_state=False', "checkpoint.best_checkpoint_metric='loss'", 'checkpoint.maximize_best_checkpoint_metric=False', 'checkpoint.patience=-1', "checkpoint.checkpoint_suffix=''", 'checkpoint.checkpoint_shard_count=1', 'checkpoint.load_checkpoint_on_all_dp_ranks=False', 'checkpoint.write_checkpoints_asynchronously=False', "checkpoint.model_parallel_size='${common.model_parallel_size}'", 'bmuf.block_lr=1.0', 'bmuf.block_momentum=0.875', 'bmuf.global_sync_iter=10', 'bmuf.warmup_iterations=500', 'bmuf.use_nbm=False', 'bmuf.average_sync=False', 'bmuf.distributed_world_size=512', 'generation.beam=5', 'generation.nbest=1', 'generation.max_len_a=0.0', 'generation.max_len_b=200', 'generation.min_len=1', 'generation.match_source_len=False', 'generation.unnormalized=False', 'generation.no_early_stop=False', 'generation.no_beamable_mm=False', 'generation.lenpen=1.0', 'generation.unkpen=0.0', 'generation.replace_unk=null', 'generation.sacrebleu=False', 'generation.score_reference=False', 'generation.prefix_size=0', 'generation.no_repeat_ngram_size=0', 'generation.sampling=False', 'generation.sampling_topk=-1', 'generation.sampling_topp=-1.0', 'generation.constraints=null', 'generation.temperature=1.0', 'generation.diverse_beam_groups=-1', 'generation.diverse_beam_strength=0.5', 'generation.diversity_rate=-1.0', 'generation.print_alignment=null', 'generation.print_step=False', 'generation.lm_path=null', 'generation.lm_weight=0.0', 'generation.iter_decode_eos_penalty=0.0', 'generation.iter_decode_max_iter=10', 'generation.iter_decode_force_max_iter=False', 'generation.iter_decode_with_beam=1', 'generation.iter_decode_with_external_reranker=False', 'generation.retain_iter_history=False', 'generation.retain_dropout=False', 'generation.retain_dropout_modules=null', 'generation.decoding_format=null', 'generation.no_seed_provided=False', 'eval_lm.output_word_probs=False', 'eval_lm.output_word_stats=False', 'eval_lm.context_window=0', 'eval_lm.softmax_batch=9223372036854775807', 'interactive.buffer_size=0', "interactive.input='-'", 'task=masked_lm', 'task._name=masked_lm', "task.data='/home/yang/.cache/torch/hub/pytorch_fairseq/37d2bc14cf6332d61ed5abeb579948e6054e46cc724c7d23426382d11a31b2d6.ae5852b4abc6bf762e0b6b30f19e741aa05562471e9eb8f4a6ae261f04f9b350'", "task.sample_break_mode='complete'", 
'task.tokens_per_sample=512', 'task.mask_prob=0.15', 'task.leave_unmasked_prob=0.1', 'task.random_token_prob=0.1', 'task.freq_weighted_replacement=False', 'task.mask_whole_words=False', 'task.mask_multiple_length=1', 'task.mask_stdev=0.0', "task.shorten_method='none'", "task.shorten_data_split_list=''", 'task.seed=1', 'criterion=masked_lm', 'criterion._name=masked_lm', 'criterion.tpu=True', 'bpe=gpt2', 'bpe._name=gpt2', "bpe.gpt2_encoder_json='https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/encoder.json'", "bpe.gpt2_vocab_bpe='https://dl.fbaipublicfiles.com/fairseq/gpt2_bpe/vocab.bpe'", 'optimizer=adam', 'optimizer._name=adam', "optimizer.adam_betas='(0.9, 0.98)'", 'optimizer.adam_eps=1e-06', 'optimizer.weight_decay=0.01', 'optimizer.use_old_adam=False', 'optimizer.fp16_adam_stats=False', 'optimizer.tpu=True', 'optimizer.lr=[0.0006]', 'lr_scheduler=polynomial_decay', 'lr_scheduler._name=polynomial_decay', 'lr_scheduler.warmup_updates=24000', 'lr_scheduler.force_anneal=null', 'lr_scheduler.end_learning_rate=0.0', 'lr_scheduler.power=1.0', 'lr_scheduler.total_num_update=500000.0', 'lr_scheduler.lr=[0.0006]']
Traceback (most recent call last):
  File "dataset2bpe.py", line 10, in <module>
    roberta = torch.hub.load('pytorch/fairseq', 'roberta.base')
  File "/home/yang/style-transfer-paraphrase/style-venv/lib/python3.6/site-packages/torch/hub.py", line 369, in load
    model = entry(*args, **kwargs)
  File "/home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/models/roberta/model.py", line 284, in from_pretrained
    **kwargs,
  File "/home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/hub_utils.py", line 75, in from_pretrained
    arg_overrides=kwargs,
  File "/home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/checkpoint_utils.py", line 421, in load_model_ensemble_and_task
    state = load_checkpoint_to_cpu(filename, arg_overrides)
  File "/home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/checkpoint_utils.py", line 339, in load_checkpoint_to_cpu
    state = _upgrade_state_dict(state)
  File "/home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/checkpoint_utils.py", line 643, in _upgrade_state_dict
    state["cfg"] = convert_namespace_to_omegaconf(state["args"])
  File "/home/yang/.cache/torch/hub/pytorch_fairseq_master/fairseq/dataclass/utils.py", line 389, in convert_namespace_to_omegaconf
    composed_cfg = compose("config", overrides=overrides, strict=False)
TypeError: compose() got an unexpected keyword argument 'strict'
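
If I am reading the trace correctly, the hub copy of fairseq calls compose("config", overrides=overrides, strict=False), but the installed hydra-core no longer accepts a strict argument (it existed as a deprecated parameter in the hydra 1.0 API and was removed in later releases). Downgrading hydra and omegaconf might therefore sidestep the error; this is an assumption based on the hydra 1.0 signature, not something I have verified yet:

pip install hydra-core==1.0.7 omegaconf==2.0.6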

@guanqun-yang
Author

@martiansideofthemoon I managed to find a workaround after some attempts. I will post my solution here after my experiments finish.
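
One direction that avoids the cached hub copy of fairseq entirely (a sketch of a possible workaround, not necessarily the solution I will post; it assumes roberta.base has been downloaded and extracted into the working directory, with model.pt inside):

# load RoBERTa through the locally installed fairseq rather than torch.hub,
# so the repo-pinned fairseq code handles the checkpoint loading
from fairseq.models.roberta import RobertaModel

roberta = RobertaModel.from_pretrained("roberta.base", checkpoint_file="model.pt")
roberta.eval()

# GPT-2 BPE encoding of one sentence, which is what dataset2bpe.py needs
print(roberta.bpe.encode("this is an example sentence"))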

@martiansideofthemoon
Owner

Great, good to know! Do post your solution here whenever you get a chance.
