Error when installing maskrcnn_benchmark #209

Open
draym28 opened this issue Apr 3, 2024 · 18 comments

Comments

draym28 commented Apr 3, 2024

Some common problems & solutions when installing maskrcnn_benchmark.

1. THC.h: No such file or directory / THCCeilDiv undefined
see this

2. identifier "THCudaCheck" is undefined
see this

3. torch.utils.cpp_extension.load stuck
see this
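
The links behind the three items above did not survive, so here is a rough sketch for at least locating the affected code. It scans C++/CUDA source text for the legacy THC names from items 1 and 2. The suggested replacements (`at::ceil_div`, `AT_CUDA_CHECK`) are the ones commonly reported for recent PyTorch releases and are assumptions to check against your installed version; `find_legacy_thc` and `LEGACY_THC` are hypothetical names, not part of this repo.

```python
import re

# Legacy THC names from the list above, paired with commonly reported
# replacements. The replacements are assumptions, not taken from this thread.
LEGACY_THC = {
    r"#include\s*<THC/THC\.h>": "drop the include (THC headers are gone in recent PyTorch)",
    r"\bTHCCeilDiv\b": "at::ceil_div from <ATen/ceil_div.h>",
    r"\bTHCudaCheck\b": "AT_CUDA_CHECK or C10_CUDA_CHECK",
}


def find_legacy_thc(source: str):
    """Return (line_number, matched_text, suggestion) for each legacy THC use."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for pattern, suggestion in LEGACY_THC.items():
            match = re.search(pattern, line)
            if match:
                hits.append((lineno, match.group(0), suggestion))
    return hits
```

To use it, read each `.cu`, `.cpp` and `.h` file under the csrc folder and pass its text to `find_legacy_thc`; an empty list means none of the three names appear in that file.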

Maelic commented Apr 3, 2024

Hi,

The version of the code in this repo is very outdated and is not up to date with current CUDA standards. I fixed all of those issues in my implementation, so you can probably copy the csrc folder into your local path and compile without any issues (I tested it with CUDA 11+):
https://github.com/Maelic/SGG-Benchmark/tree/main/sgg_benchmark/csrc

Best

draym28 commented Apr 4, 2024

Thanks for your help!
But after switching to your csrc, when I run SGDet on custom images following the instructions in README.md, another error comes up:

D:\App\Anaconda3\envs\sgg\lib\site-packages\torch\utils\cpp_extension.py:358: UserWarning: Error checking compiler version for cl: 'cp1' codec can't decode bytes in position 0--1: No mapping for the Unicode character exists in the target code page.
  warnings.warn(f'Error checking compiler version for {compiler}: {error}')
D:\App\Anaconda3\envs\sgg\lib\site-packages\apex\__init__.py:68: DeprecatedFeatureWarning: apex.amp is deprecated and will be removed by the end of February 2023. Use [PyTorch AMP](https://pytorch.org/docs/stable/amp.html)
  warnings.warn(msg, DeprecatedFeatureWarning)
Traceback (most recent call last):
  File "tools/relation_test_net.py", line 11, in <module>
    from maskrcnn_benchmark.data import make_data_loader
  File "d:\code\new_proj\v2t\sgg\scenegraphbenchmark\maskrcnn_benchmark\data\__init__.py", line 2, in <module>
    from .build import make_data_loader, get_dataset_statistics
  File "d:\code\new_proj\v2t\sgg\scenegraphbenchmark\maskrcnn_benchmark\data\build.py", line 14, in <module>
    from . import datasets as D
  File "d:\code\new_proj\v2t\sgg\scenegraphbenchmark\maskrcnn_benchmark\data\datasets\__init__.py", line 2, in <module>
    from .coco import COCODataset
  File "d:\code\new_proj\v2t\sgg\scenegraphbenchmark\maskrcnn_benchmark\data\datasets\coco.py", line 39, in <module>
    class COCODataset(torchvision.datasets.coco.CocoDetection):
AttributeError: module 'torchvision' has no attribute 'datasets'

I am still stuck on this step. It is driving me crazy.

Maelic commented Apr 4, 2024

Which version of torchvision are you using?

Maelic commented Apr 4, 2024

It works for me with torchvision 0.17 for cuda 12.1

[image attachment]

draym28 commented Apr 4, 2024

I am using pytorch=1.13 and torchvision=0.14.
I can import torchvision.datasets as you did, but when I run the script for SGDet on custom images, the error comes up.
It is confusing.

Maelic commented Apr 4, 2024

Then you may be running your code in another conda env or something like that. You can also try to clean and re-build the package with something like rm -rf ./build/ && python setup.py build develop
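
One way to test the "another conda env" theory is to print, from the same entry point that fails, which interpreter is running and where `torchvision` would be imported from. The sketch below uses only the standard library; `locate_module` is a hypothetical helper name. An `origin` of `None` or a path outside the expected env's site-packages (for example a stray local `torchvision` directory) would be worth investigating, though that is only one possible cause of the error above.

```python
import importlib.util
import sys


def locate_module(name: str):
    """Report where `name` would be imported from, without importing it."""
    spec = importlib.util.find_spec(name)
    return {
        "interpreter": sys.executable,  # which python is actually running
        "prefix": sys.prefix,           # root of the active environment
        "found": spec is not None,
        "origin": getattr(spec, "origin", None),
        "search_locations": list(spec.submodule_search_locations or []) if spec else [],
    }
```

For example, adding `print(locate_module("torchvision"))` as the first line of tools/relation_test_net.py and comparing the result with the same call in an interactive session shows whether both resolve to the same installation.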

draym28 commented Apr 4, 2024

I have cleaned up and created a new env many times.
But the error still comes up.
And I also ran python setup.py build develop every time.
Many people also have this problem, see this.

Maelic commented Apr 4, 2024

Can you post the outputs of pip freeze | grep torchvision and conda list | grep torchvision ? You may have different versions of torchvision installed at the same time.
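
Since the paths in the traceback suggest a Windows machine, where `grep` is not available by default, the same check can be sketched in Python with `importlib.metadata` (standard library since Python 3.8). It lists every installed copy of a package visible to the running interpreter. `iter_installed` and `versions_of` are hypothetical helper names, and this only sees what is on the current interpreter's path, not other environments.

```python
from importlib import metadata


def iter_installed():
    """Yield (name, version, location) for each distribution on sys.path."""
    for dist in metadata.distributions():
        name = dist.metadata.get("Name") if dist.metadata else None
        if name:
            yield name, dist.version, str(dist.locate_file(""))


def versions_of(package: str, records):
    """Return sorted, de-duplicated (version, location) pairs for `package`."""
    wanted = package.lower().replace("_", "-")
    found = set()
    for name, version, location in records:
        if name.lower().replace("_", "-") == wanted:
            found.add((version, location))
    return sorted(found)
```

Calling `versions_of("torchvision", iter_installed())` and getting more than one entry back would indicate that two copies are installed side by side.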

draym28 commented Apr 4, 2024

Output of pip freeze | grep torchvision:
torchvision==0.14.1
Output of conda list | grep torchvision:
torchvision 0.14.1 py38_cu117 pytorch

Maelic commented Apr 4, 2024

Hmm, I don't know. From your output I assume that you installed torchvision with conda; maybe try removing it and installing it with pip. On my machine, I installed it with the following command (for CUDA 12.1):
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121

draym28 commented Apr 4, 2024

It still doesn't work.
This time I created a new env and used pip install torch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 --index-url https://download.pytorch.org/whl/cu117.
But the error still comes up.

Maelic commented Apr 4, 2024

I'm afraid I can't help you more here, sorry. I don't recall ever having this error, even when I was working with previous versions of pytorch for this codebase.

draym28 commented Apr 5, 2024

It is OK, thanks for your help. I will keep looking for a solution.

@Ali-Hatami

Hi @Maelic, thank you for sharing your implementation. I'm encountering an issue with installing Apex due to CUDA compatibility. I was wondering if you could provide guidance on how to resolve this. Thanks!

Maelic commented Apr 15, 2024

You don't need APEX anymore: it is deprecated, and mixed precision is built into recent versions of torch. Please consider removing all references to apex, including this line:

with amp.scale_loss(losses, optimizer) as scaled_losses:

And add this a little above:

with torch.autocast(device_type='cuda', dtype=torch.float16, enabled=use_amp):
    loss_dict = model(images, targets)
    losses = sum(loss for loss in loss_dict.values())

And it should work, see:

https://github.com/Maelic/SGG-Benchmark/blob/cecf1bbe46f3d862704d9cf0ffccf2282fb00cfe/tools/relation_train_net.py#L51
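
For context, a full training step with native AMP might look roughly like the sketch below. The autocast block is the one from the comment above. Using a `GradScaler` in place of `amp.scale_loss` is the usual PyTorch pattern but is an assumption here, not something confirmed from the linked file, and `train_step` is a hypothetical helper name.

```python
import torch


def train_step(model, images, targets, optimizer, scaler, use_amp=True):
    """One optimisation step using native PyTorch AMP instead of apex.amp."""
    optimizer.zero_grad()
    # Forward pass under autocast, as in the snippet above.
    with torch.autocast(device_type="cuda", dtype=torch.float16, enabled=use_amp):
        loss_dict = model(images, targets)
        losses = sum(loss for loss in loss_dict.values())
    # The scaler takes over the role of apex's amp.scale_loss(...):
    # scale the loss, backpropagate, then unscale and step the optimizer.
    scaler.scale(losses).backward()
    scaler.step(optimizer)
    scaler.update()
    return losses.item()
```

The scaler would be created once before the training loop, for example with `torch.cuda.amp.GradScaler(enabled=use_amp)` (newer releases also offer `torch.amp.GradScaler`). With `use_amp=False` both autocast and the scaler are no-ops, so the same function covers full-precision training.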

@Ali-Hatami

Thank you for the prompt response. While following the step-by-step installation (https://github.com/Maelic/SGG-Benchmark/blob/main/INSTALL.md), I get an error. My CUDA version is 11.5, but 11.5 is not available in the nvidia channels. How can I solve this issue?

RuntimeError:
The detected CUDA version (11.5) mismatches the version that was used to compile
PyTorch (12.1). Please make sure to use the same CUDA versions.

Maelic commented Apr 16, 2024

Try upgrading your CUDA version or building torch from source.
By the way, this issue is not directly related to this work; you will probably have more success asking on the dedicated PyTorch forum.
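
Before rebuilding, it can help to compare the two versions named in the RuntimeError directly. The sketch below parses the text printed by `nvcc --version` and compares it with a PyTorch CUDA version string such as the value of `torch.version.cuda`. Both function names are hypothetical. As far as I know, the extension build fails on a major-version mismatch and only warns on a minor one, but treat that as an assumption.

```python
import re


def parse_cuda_release(nvcc_output: str):
    """Extract (major, minor) from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def compare_cuda(toolkit, torch_cuda: str):
    """Classify a toolkit (major, minor) against a string like "12.1"."""
    torch_major, torch_minor = (int(part) for part in torch_cuda.split(".")[:2])
    if toolkit[0] != torch_major:
        return "major mismatch"
    if toolkit[1] != torch_minor:
        return "minor mismatch"
    return "match"
```

The inputs can be obtained with `subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout` and `torch.version.cuda`. For the case reported above, `compare_cuda((11, 5), "12.1")` gives "major mismatch", which is consistent with the advice to upgrade the toolkit or install a PyTorch build made for CUDA 11.x.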

jzzzzh commented Nov 9, 2024

mark
