A Python TensorFlow error when testing inference inside Docker with a CPU #14
I've just tried running the Docker example in https://github.com/KhalilMrini/LAL-Parser#inference on my side, and everything works fine. Did you run the same commands, or did you change anything? Also, why do you say "I am using benepar[cpu] and tensorflow==2.0.0b1 since tensorflow==2.0.0 could not be found"? Did you get some error messages other than the FYI:
@Franck-Dernoncourt thanks for your quick reply! Some more info: I am testing on a machine without a decent GPU, so I have to use the CPU only. That's why I've installed
Can you try:
?
Yeah, without any changes to the requirements.sh script, I get the GPU versions of benepar and tensorflow, which don't work here. I need the CPU ones.
Note that I've successfully used other BERT-based models this way in the past.
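For reference, forcing the CPU variants mentioned in this thread can be sketched as below (the package pins are the ones quoted above; this is a sketch, not the repo's official install path):

```shell
# Install the CPU-only packages discussed in this thread.
# Quoting 'benepar[cpu]' keeps some shells (e.g. zsh) from
# glob-expanding the square brackets.
pip install 'benepar[cpu]' tensorflow==2.0.0b1
```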
Same here: when I use Docker, everything runs on the CPU. Can you try to run the following single command:
and see whether it works? It works for me:
@Franck-Dernoncourt It again died with
Now I will try updating Docker to the latest version first, since the one I have is slightly old at
After some reading, it could also be CPU-specific. This is the output that I get from inside the Docker instance:
A slightly different, but basically the same, error with Docker v19.03.11:
You can save the Docker image before running
The machine has more than 20 GB of RAM. It doesn't even hit swap when I run the parse.sh script.
Very interesting. So much for Docker being reproducible :-/ At this point it looks like you'll have to look at which line in
Otherwise, you could try running the code in a virtual Python environment or an old-fashioned virtual machine. Sorry about that! Please keep us posted on how it goes; I'm curious. Thanks for your patience!
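The virtual-environment fallback suggested above can be sketched as follows; the environment name lal-env is hypothetical, and requirements.sh is the repo's setup script:

```shell
# Create and activate an isolated Python environment,
# then install the project's dependencies into it.
python3 -m venv lal-env
. lal-env/bin/activate
sh requirements.sh   # or pin the CPU packages manually
```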
Sure! Thanks for trying to help! Much appreciated.
I am at a loss. I compiled TF from source (it took ages) and I'm still getting:
Could it be a broken download of the big model? What's your md5 checksum?
Additionally, @Franck-Dernoncourt, could you upload somewhere the contents of
Same hash here:
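Comparing checkpoint hashes, as done above, can be sketched like this (best_parser.pt is the model file from the repo's download instructions, assumed to be in the current directory):

```shell
# Print the MD5 hash of the downloaded checkpoint so it can be
# compared against the hash another user reports.
md5sum best_parser.pt        # GNU coreutils
# md5 best_parser.pt         # macOS equivalent
```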
Looks like you might have to narrow down which TensorFlow command is causing the issue and report it to the TensorFlow maintainers. Or you could try a virtual machine, in case it masks whatever TensorFlow doesn't like in your hardware configuration.
@KhalilMrini thanks for your great work!
Could the following error be due to a recently changed dependency?
Note: I am using benepar[cpu] and tensorflow==2.0.0b1 since tensorflow==2.0.0 could not be found.
$ source parse_quick.sh
/home/user/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/user/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/user/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/user/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/user/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/user/.local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
Not using CUDA!
Loading model from best_parser.pt...
/home/user/.local/lib/python3.6/site-packages/torch/nn/_reduction.py:43: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
warnings.warn(warning.format(ret))
Parsing sentences...
Parsing sentences: 0%| | 0/1 [00:00<?, ?it/s]/pytorch/aten/src/ATen/native/LegacyDefinitions.cpp:29: UserWarning: masked_select received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead.
/pytorch/aten/src/ATen/native/TensorAdvancedIndexing.cpp:543: UserWarning: masked_fill received a mask with dtype torch.uint8, this behavior is now deprecated,please use a mask with dtype torch.bool instead.
/pytorch/aten/src/ATen/native/IndexingUtils.h:20: UserWarning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead.
Parsing sentences: 0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
File "src_joint/main.py", line 788, in <module>
main()
File "src_joint/main.py", line 784, in main
args.callback(args)
File "src_joint/main.py", line 711, in run_parse
syntree, _ = parser.parse_batch(tagged_sentences)
File "/home/user/LAL-Parser/src_joint/KM_parser.py", line 1771, in parse_batch
annotations, self.current_attns = self.encoder(emb_idxs, pre_words_idxs, batch_idxs, extra_content_annotations=extra_content_annotations)
File "/home/user/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/user/LAL-Parser/src_joint/KM_parser.py", line 1170, in forward
res, current_attns = attn(res, batch_idxs)
File "/home/user/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
result = self.forward(*input, **kwargs)
File "/home/user/LAL-Parser/src_joint/KM_parser.py", line 404, in forward
return self.layer_norm(outputs + residual), attns_padded
RuntimeError: The size of tensor a (39) must match the size of tensor b (35) at non-singleton dimension 0
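For what it's worth, that final RuntimeError is PyTorch's generic elementwise shape-mismatch error: the residual add in `layer_norm(outputs + residual)` requires both tensors to have the same size in every non-singleton dimension. A minimal reproduction (the sizes 39 and 35 are taken from the traceback; the tensors themselves are made up):

```python
import torch

# Elementwise addition requires matching sizes on non-singleton dims;
# 39 vs 35 on dim 0 reproduces the error class from the traceback above.
outputs = torch.zeros(39, 8)
residual = torch.zeros(35, 8)
try:
    _ = outputs + residual
except RuntimeError as err:
    print(err)
```

One common cause of this class of error is that the padded sequence length of one tensor no longer matches the other, e.g. when two tensors were built from differently padded batches.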