
Loading and using open_vision model #747

Open

FightingKai01 opened this issue Dec 17, 2024 · 14 comments
Labels
Clarified (Tag for issues that are clearly agreed upon) · question (Further information is requested)

Comments

@FightingKai01

Search before asking

  • I have searched the X-AnyLabeling Docs and issues and found no similar questions.

Question

I have loaded the open_vision model correctly, but I ran into a problem during actual inference.

Given a visual prompt, the model cannot detect other similar objects, which is very different from the video demonstration you provided.

After reading the relevant documents, I think I may see the problem, although my expertise is limited. Looking at open_vision.yaml, it appears that the CountGD model is not being loaded.

Could you provide a detailed reference document to help me (or others) complete the Text-Visual Prompting Grounding setup?

Looking forward to your guidance.

Additional

No response

@FightingKai01 FightingKai01 added the question Further information is requested label Dec 17, 2024
@CVHub520
Owner

Please paste the output log of the terminal here.

@FightingKai01
Author

FightingKai01 commented Dec 17, 2024

Here is the terminal output log:

(x_anylabeling) PS F:\Datas\Desktop\A_Code\Python\X-AnyLabeling-2.5.0> python anylabeling\app.py
2024-12-17 16:35:49,630 | INFO    | app:main:159 - 🚀 X-AnyLabeling v2.4.4 launched!
2024-12-17 16:35:49,630 | INFO    | app:main:162 - ⭐ If you like it, give us a star: https://github.com/CVHub520/X-AnyLabeling
2024-12-17 16:35:49,651 | INFO    | config:get_config:83 - 🔧️ Initializing config from local file: C:\Users\28172\.xanylabelingrc
2024-12-17 16:36:16.4927167 [E:onnxruntime:Default, provider_bridge_ort.cc:1480 onnxruntime::TryGetProviderInfo_CUDA] D:\a\_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1193 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126 "" when trying to load "F:\MySoftware\Anaconda\Anaconda_Software\envs\x_anylabeling\lib\site-packages\onnxruntime\capi\onnxruntime_providers_cuda.dll"

2024-12-17 16:36:16.5020399 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:743 onnxruntime::python::CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Please reference https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements to ensure all dependencies are met. 

2024-12-17 16:36:18,070 | INFO    | model_manager:_load_model:1071 - ✅ Model loaded successfully: open_vision
2024-12-17 16:37:00,011 | WARNING | open_vision:predict_shapes:393 - Could not inference model
2024-12-17 16:37:00,014 | WARNING | open_vision:predict_shapes:394 - name '_C' is not defined
Traceback (most recent call last):
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x_anylabeling\lib\site-packages\anylabeling\services\auto_labeling\open_vision.py", line 362, in predict_shapes
    boxes = self.get_boxes(cv_image, text_prompt)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x_anylabeling\lib\site-packages\anylabeling\services\auto_labeling\open_vision.py", line 309, in get_boxes
    model_output = self.net(
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x_anylabeling\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x_anylabeling\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x_anylabeling\lib\site-packages\anylabeling\services\auto_labeling\visualgd\model\groundingdino.py", line 598, in forward
    hs, reference, hs_enc, ref_enc, init_box_proposal = self.transformer(
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x_anylabeling\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x_anylabeling\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x_anylabeling\lib\site-packages\anylabeling\services\auto_labeling\visualgd\model\transformer.py", line 276, in forward
    memory, memory_text = self.encoder(
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x_anylabeling\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x_anylabeling\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x_anylabeling\lib\site-packages\anylabeling\services\auto_labeling\visualgd\model\transformer.py", line 616, in forward
    output = layer(
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x_anylabeling\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x_anylabeling\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x_anylabeling\lib\site-packages\anylabeling\services\auto_labeling\visualgd\model\transformer.py", line 824, in forward
    src2 = self.self_attn(
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x_anylabeling\lib\site-packages\torch\nn\modules\module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x_anylabeling\lib\site-packages\torch\nn\modules\module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x_anylabeling\lib\site-packages\anylabeling\services\auto_labeling\visualgd\model\ms_deform_attn.py", line 339, in forward
    output = MultiScaleDeformableAttnFunction.apply(
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x_anylabeling\lib\site-packages\torch\autograd\function.py", line 575, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x_anylabeling\lib\site-packages\anylabeling\services\auto_labeling\visualgd\model\ms_deform_attn.py", line 56, in forward
    output = _C.ms_deform_attn_forward(
NameError: name '_C' is not defined

Is this the problem? I have been trying to solve it for a long time without success; perhaps my background knowledge is not solid enough yet.

Before this, I looked through other open_vision-related issues, which helped me a lot. Could it be that my environment is still not configured correctly? Please give me more advice.

@CVHub520
Owner

Q1: 2024-12-17 16:35:49,630 | INFO | app:main:159 - 🚀 X-AnyLabeling v2.4.4 launched!

A1: Please update the source code to v2.5.0 or higher.

Q2: 2024-12-17 16:36:16.4927167 [E:onnxruntime:Default, provider_bridge_ort.cc:1480 onnxruntime::TryGetProviderInfo_CUDA] D:\a_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1193 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : LoadLibrary failed with error 126 "" when trying to load "F:\MySoftware\Anaconda\Anaconda_Software\envs\x_anylabeling\lib\site-packages\onnxruntime\capi\onnxruntime_providers_cuda.dll"

A2: Please follow the instructions to install the correct onnxruntime version, ensuring it is compatible with your local CUDA version.
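A quick way to verify whether the GPU build of onnxruntime is actually usable is to list its available execution providers. This is a minimal sketch, guarded so it also runs in an environment where onnxruntime is absent:

```python
# Sketch: list onnxruntime execution providers to verify the CUDA build loads.
# If CUDAExecutionProvider is absent, the onnxruntime / CUDA / cuDNN versions
# are likely mismatched (which matches the LoadLibrary failure in the log).
import importlib.util

if importlib.util.find_spec("onnxruntime"):
    import onnxruntime as ort
    providers = ort.get_available_providers()
else:
    providers = []  # onnxruntime not installed in this environment
print(providers)
```

A working GPU build lists `CUDAExecutionProvider` ahead of `CPUExecutionProvider`; a CPU-only or broken build lists only the latter.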

Q3: name '_C' is not defined

A3: Please refer to the original repository to install the necessary packages.

Remember to compile the operator:

cd models/GroundingDINO/ops
python setup.py build install
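After compiling, you can check that the extension actually built and is importable. The `_C` in the NameError is the compiled `MultiScaleDeformableAttention` module (the name appears in the `-DTORCH_EXTENSION_NAME` flag of the build log). A minimal check, as a sketch:

```python
# Sketch: check whether the compiled deformable-attention op is importable.
# The `NameError: name '_C' is not defined` above means this import failed
# at module load time inside ms_deform_attn.py.
try:
    import MultiScaleDeformableAttention as _C  # extension built by setup.py
    status = "op compiled and importable"
except ImportError:
    status = "op missing: re-run `python setup.py build install` in models/GroundingDINO/ops"
print(status)
```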

@FightingKai01
Author

FightingKai01 commented Dec 17, 2024

To Q1:
I did download the source code from release v2.5.0, but when I run it, it reports v2.4.4, and I don't know why.

To Q2:
I am working on it.
My CUDA version is 12.3, and I don't think the onnxruntime version is installed incorrectly. Is it because I don't have CUDA installed system-wide? I installed cudatoolkit in the conda environment; is that not enough to support it?

Please forgive me, this is a beginner's reply. Thank you for your patience!

To Q3:
The original repository does mention the relevant packages, and I installed them as required. It also mentions:

cd models/GroundingDINO/ops

python setup.py build install

I tried it and it works.
Do you mean I should also execute this command in this project (X-AnyLabeling), not just in CountGD?

In this project, I have previously executed:

pip install .

Will this have a different effect than python setup.py build install?

@CVHub520
Owner

All right. You shouldn't execute pip install . in the X-AnyLabeling repository. Don't worry; let's set that aside and reinstall the environment step by step by following these instructions:

  1. Create and activate the environment:

    conda create -n countgd python=3.9.19 -y
    conda activate countgd
  2. Downgrade your CUDA toolkit to version 12.1 (reinstalling the NVIDIA driver if necessary), then install the matching PyTorch wheels:

    pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121

    Alternatively, you can follow this method to manage multiple CUDA environments.

  3. Clone the CountGD repository and navigate to it:

    git clone https://github.com/niki-amini-naieni/CountGD.git
    cd CountGD
  4. Install dependencies and set up GroundingDINO:

    pip install -r requirements.txt
    export CC=/usr/bin/gcc-11  # Ensure GCC 11 is used for compilation
    cd models/GroundingDINO/ops
    python setup.py build install
    python test.py
  5. Clone the X-AnyLabeling repository and navigate to the project directory:

    cd /path/to/x-anylabeling/project
    git clone https://github.com/CVHub520/X-AnyLabeling
    cd X-AnyLabeling
  6. Follow the instructions to install the required packages, ensuring compatibility with your local CUDA version.

  7. Run the application:

    python anylabeling/app.py --logger-level debug

Happy labeling! 🚀
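Before moving past step 2, it may help to confirm the packages actually landed in the active environment. A small sanity-check sketch (package names only; a deeper check would also compare `torch.version.cuda` against 12.1):

```python
# Sketch: verify the packages installed in step 2 are importable in this env.
import importlib.util

results = {
    pkg: importlib.util.find_spec(pkg) is not None
    for pkg in ("torch", "torchvision", "torchaudio")
}
print(results)  # every value should be True after step 2
```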

@FightingKai01
Author

FightingKai01 commented Dec 17, 2024

Thanks, I will continue to try.

I strongly recommend creating a more complete help document for deploying open_vision, to help more people (especially beginners).

Thank you very much and wish you a happy life!

@FightingKai01
Author

Sorry to bother you.
I'm having trouble reconfiguring my environment.
I have switched to CUDA 12.1.

**My approach:** I tried updating my Visual Studio version (currently 2022), but it did not help.

Terminal Output:

(x-anylabeling) PS F:\Datas\Desktop\A_Code\Python\CountGD\models\GroundingDINO\ops> python setup.py build install
running build
running build_py
running build_ext
F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\torch\utils\cpp_extension.py:381: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified.
  warnings.warn(f'Error checking compiler version for {compiler}: {error}')
building 'MultiScaleDeformableAttention' extension
Emitting ninja build file F:\Datas\Desktop\A_Code\Python\CountGD\models\GroundingDINO\ops\build\temp.win-amd64-cpython-39\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] F:\MySoftware\CUDA\NVIDIA_GPU_ComputingToolkit_CUDA12.1\bin\nvcc --generate-dependencies-with-compile --dependency-output F:\Datas\Desktop\A_Code\Python\CountGD\models\GroundingDINO\ops\build\temp.win-amd64-cpython-39\Release\Datas\Desktop\A_Code\Python\CountGD\models\GroundingDINO\ops\src\cuda\ms_deform_attn_cuda.obj.d -std=c++17 --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -Xcompiler /wd4190 -Xcompiler /wd4624 -Xcompiler /wd4067 -Xcompiler /wd4068 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -DWITH_CUDA -IF:\Datas\Desktop\A_Code\Python\CountGD\models\GroundingDINO\ops\src -IF:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\torch\include -IF:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\torch\include\torch\csrc\api\include -IF:\M
ySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\torch\include\TH -IF:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\sit
e-packages\torch\include\THC -IF:\MySoftware\CUDA\NVIDIA_GPU_ComputingToolkit_CUDA12.1\include -IF:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\inc
lude -IF:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\Include -IF:\MySoftware\VisualStudio\VisualStudio_Software\VC\Tools\MSVC\14.42.34433\include 
-IF:\MySoftware\VisualStudio\VisualStudio_Software\VC\Tools\MSVC\14.42.34433\ATLMFC\include -IF:\MySoftware\VisualStudio\VisualStudio_Software\VC\Auxiliary\VS\i
nclude "-IF:\Windows Kits\10\include\10.0.22621.0\ucrt" "-IF:\Windows Kits\10\\include\10.0.22621.0\\um" "-IF:\Windows Kits\10\\include\10.0.22621.0\\shared" "-
IF:\Windows Kits\10\\include\10.0.22621.0\\winrt" "-IF:\Windows Kits\10\\include\10.0.22621.0\\cppwinrt" -c F:\Datas\Desktop\A_Code\Python\CountGD\models\Ground
ingDINO\ops\src\cuda\ms_deform_attn_cuda.cu -o F:\Datas\Desktop\A_Code\Python\CountGD\models\GroundingDINO\ops\build\temp.win-amd64-cpython-39\Release\Datas\Des
ktop\A_Code\Python\CountGD\models\GroundingDINO\ops\src\cuda\ms_deform_attn_cuda.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFL
OAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUD
A_NO_HALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86
FAILED: F:/Datas/Desktop/A_Code/Python/CountGD/models/GroundingDINO/ops/build/temp.win-amd64-cpython-39/Release/Datas/Desktop/A_Code/Python/CountGD/models/GroundingDINO/ops/src/cuda/ms_deform_attn_cuda.obj
F:\MySoftware\CUDA\NVIDIA_GPU_ComputingToolkit_CUDA12.1\bin\nvcc --generate-dependencies-with-compile --dependency-output F:\Datas\Desktop\A_Code\Python\CountGD
\models\GroundingDINO\ops\build\temp.win-amd64-cpython-39\Release\Datas\Desktop\A_Code\Python\CountGD\models\GroundingDINO\ops\src\cuda\ms_deform_attn_cuda.obj.
d -std=c++17 --use-local-env -Xcompiler /MD -Xcompiler /wd4819 -Xcompiler /wd4251 -Xcompiler /wd4244 -Xcompiler /wd4267 -Xcompiler /wd4275 -Xcompiler /wd4018 -X
compiler /wd4190 -Xcompiler /wd4624 -Xcompiler /wd4067 -Xcompiler /wd4068 -Xcompiler /EHsc -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcud
afe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=dll_interface_conflict_dll
export_assumed -DWITH_CUDA -IF:\Datas\Desktop\A_Code\Python\CountGD\models\GroundingDINO\ops\src -IF:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\l
ib\site-packages\torch\include -IF:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\torch\include\torch\csrc\api\include -IF:\MySoftw
are\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\torch\include\TH -IF:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-pack
ages\torch\include\THC -IF:\MySoftware\CUDA\NVIDIA_GPU_ComputingToolkit_CUDA12.1\include -IF:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\include -
IF:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\Include -IF:\MySoftware\VisualStudio\VisualStudio_Software\VC\Tools\MSVC\14.42.34433\include -IF:\M
ySoftware\VisualStudio\VisualStudio_Software\VC\Tools\MSVC\14.42.34433\ATLMFC\include -IF:\MySoftware\VisualStudio\VisualStudio_Software\VC\Auxiliary\VS\include
 "-IF:\Windows Kits\10\include\10.0.22621.0\ucrt" "-IF:\Windows Kits\10\\include\10.0.22621.0\\um" "-IF:\Windows Kits\10\\include\10.0.22621.0\\shared" "-IF:\Wi
ndows Kits\10\\include\10.0.22621.0\\winrt" "-IF:\Windows Kits\10\\include\10.0.22621.0\\cppwinrt" -c F:\Datas\Desktop\A_Code\Python\CountGD\models\GroundingDIN
O\ops\src\cuda\ms_deform_attn_cuda.cu -o F:\Datas\Desktop\A_Code\Python\CountGD\models\GroundingDINO\ops\build\temp.win-amd64-cpython-39\Release\Datas\Desktop\A
_Code\Python\CountGD\models\GroundingDINO\ops\src\cuda\ms_deform_attn_cuda.obj -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_
CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -DCUDA_HAS_FP16=1 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_H
ALF2_OPERATORS__ -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=MultiScaleDeformableAttention -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86
F:\MySoftware\CUDA\NVIDIA_GPU_ComputingToolkit_CUDA12.1\include\crt/host_config.h(153): fatal error C1189: #error:  -- unsupported Microsoft Visual Studio versi
on! Only the versions between 2017 and 2022 (inclusive) are supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk.
ms_deform_attn_cuda.cu
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\torch\utils\cpp_extension.py", line 2096, in _run_ninja_build
    subprocess.run(
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\subprocess.py", line 528, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "F:\Datas\Desktop\A_Code\Python\CountGD\models\GroundingDINO\ops\setup.py", line 64, in <module>
    setup(
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\setuptools\__init__.py", line 117, in setup
    return distutils.core.setup(**attrs)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\setuptools\_distutils\core.py", line 183, in setup
    return run_commands(dist)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\setuptools\_distutils\core.py", line 199, in run_commands
    dist.run_commands()
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\setuptools\_distutils\dist.py", line 954, in run_commands
    self.run_command(cmd)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\setuptools\dist.py", line 950, in run_command
    super().run_command(command)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\setuptools\_distutils\dist.py", line 973, in run_command
    cmd_obj.run()
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\setuptools\_distutils\command\build.py", line 135, in run
    self.run_command(cmd_name)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\setuptools\_distutils\cmd.py", line 316, in run_command
    self.distribution.run_command(command)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\setuptools\dist.py", line 950, in run_command
    super().run_command(command)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\setuptools\_distutils\dist.py", line 973, in run_command
    cmd_obj.run()
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\setuptools\command\build_ext.py", line 98, in run
    _build_ext.run(self)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 359, in run
    self.build_extensions()
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\torch\utils\cpp_extension.py", line 871, in build_extensions
    build_ext.build_extensions(self)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 476, in build_extensions
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 502, in _build_extensions_serial
    self.build_extension(ext)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\setuptools\command\build_ext.py", line 263, in build_extension
    _build_ext.build_extension(self, ext)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\Cython\Distutils\build_ext.py", line 135, in build_extension
    super(build_ext, self).build_extension(ext)
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\setuptools\_distutils\command\build_ext.py", line 557, in build_extension 
    objects = self.compiler.compile(
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\torch\utils\cpp_extension.py", line 843, in win_wrap_ninja_compile        
    _write_ninja_file_and_compile_objects(
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\torch\utils\cpp_extension.py", line 1774, in _write_ninja_file_and_compile_objects
    _run_ninja_build(
  File "F:\MySoftware\Anaconda\Anaconda_Software\envs\x-anylabeling\lib\site-packages\torch\utils\cpp_extension.py", line 2112, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension
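The fatal error above comes from nvcc's host-compiler version check against MSVC 14.42. One workaround that the nvcc message itself suggests (explicitly "at your own risk", so treat this as an assumption to test rather than an official fix) is to pass `-allow-unsupported-compiler`; nvcc from CUDA 11.5 onward also reads it from the `NVCC_APPEND_FLAGS` environment variable. A sketch:

```python
# Sketch: re-run the build with nvcc's host-compiler check relaxed.
# NVCC_APPEND_FLAGS is honored by nvcc in CUDA 11.5+; the subprocess call is
# commented out so this snippet is inert when pasted.
import os
import subprocess

env = dict(os.environ, NVCC_APPEND_FLAGS="-allow-unsupported-compiler")
# subprocess.run(["python", "setup.py", "build", "install"], env=env, check=True)
print(env["NVCC_APPEND_FLAGS"])
```

Equivalently, set the variable in the shell (`$env:NVCC_APPEND_FLAGS="-allow-unsupported-compiler"` in PowerShell) before running `python setup.py build install`.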

@CVHub520
Owner

Hey @FightingKai01,

You'll need to handle the environment setup on your own. I suggest trying WSL2; it's generally easier for beginners. You can also research how to compile on Windows, and this might give you some guidance.

@FightingKai01
Author

I'm glad I didn't give up: I got it working in WSL.

Thanks to your guidance, I have been exposed to a lot of new knowledge and have grown a lot.

Thanks, thanks, thanks! 🚀🚀🚀

🚀 My suggestion
Do you know anything about T-Rex2? ---> https://trexlabel.com/?source=dds It is an automatic annotation tool based on visual or language prompts.
1. **More flexible interactive functions.** Users can give positive or negative prompts and run multiple rounds of refinement with multiple prompts. You can also switch categories and start a new round of automatic labeling to gradually annotate the entire scene.
2. More importantly, could this be achieved: while the user works, the model is continually updated with the new data the user provides and keeps adapting to new scenarios? ------> This may be online learning (my theoretical knowledge is not yet sufficient).

Hahahaha, communicating with you is really rewarding.

@CVHub520
Owner

Woo... 😃 I'm so glad to hear about your success and growth! It's wonderful to see how you've been able to learn and develop new skills. Your enthusiasm is truly infectious! 🚀

Regarding T-Rex2 - yes, I know about it. It is indeed an impressive model with powerful capabilities for automatic annotation using visual and language prompts. Unfortunately 😞, it is currently only available through commercial API services rather than being open source. I understand your interest in its flexible interactive features and the potential for online learning adaptation.

That said, I share your hope for the open-source community to develop even more powerful foundation models. The collaborative nature of open-source development has already given us many remarkable tools, and I'm confident we'll continue to see innovative contributions in this space.

Keep exploring and learning - your curiosity and engagement with these technologies is truly inspiring! Looking forward to hearing more about your discoveries and experiences. 🙏

@CVHub520 CVHub520 pinned this issue Dec 19, 2024
@CVHub520 CVHub520 added the Clarified Tag for issues that are clearly agreed upon label Dec 19, 2024
@FightingKai01
Author

FightingKai01 commented Dec 20, 2024

I read the relevant documents again, but I am not yet capable of studying the underlying code of open_vision.

I have some questions that I think will help me plan my future learning and research route.

Of course, you can answer selectively; I am a beginner and not sure whether these are appropriate questions to ask.

  • 1. Can you give me a guide for fine-tuning the open_vision model (or training it from scratch)? I want to see whether that can enhance its zero-shot and general detection capabilities.

  • 2. What is the relationship between the open_vision model and the three models CountGD, GroundingDINO, and SAM?

  • 3. Under the three prompting methods above, do the three models (CountGD, GroundingDINO, SAM) complete the corresponding prompting tasks separately? Or is each task completed by the three models calling one another and collaborating?

  • 4. Does open_vision have an open-source repository? (It seems your team's results build on existing models (CountGD, GroundingDINO, SAM, etc.); please forgive my ignorance.)

@CVHub520
Owner

Hey there! @FightingKai01: Thank you for waiting! Let me answer your questions one by one. 😊

Q1: Can you provide a guide to fine-tune the open_vision model (or train it from scratch)? I’d like to see if it can improve its zero-shot detection and general detection capabilities.

A1: You can refer to this guide for fine-tuning CountGD. Once completed, simply update the corresponding weights in the open_vision configuration file. 🎯

Q2: What is the relationship between the open_vision model and the three models: CountGD, GroundingDINO, and SAM?

A2: You can think of open_vision as an integrated workflow combining multiple models. 🔗

Q3: In the three prompting methods, do the required models (CountGD, GroundingDINO, SAM) complete their respective tasks separately? Or do they work together by calling and collaborating with each other?

A3: Essentially, CountGD extends the text-based prompting capabilities of GroundingDINO with visual prompting features. For more specifics, I recommend checking the related technical reports. 📄
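In other words, the answers so far suggest a pipeline in which CountGD (GroundingDINO plus visual exemplars) turns a text or visual prompt into candidate boxes, and SAM turns boxes into masks. The sketch below is purely illustrative; every function name is hypothetical and not the project's real API:

```python
# Illustrative sketch of the open_vision workflow described above.
# detect_boxes stands in for CountGD (GroundingDINO + visual exemplars);
# segment stands in for SAM. Neither is a real API of this project.

def detect_boxes(image, text_prompt=None, visual_exemplars=None):
    """Return candidate boxes for the prompt (dummy box here)."""
    return [(10, 10, 50, 50)]

def segment(image, box):
    """Return a mask for one box prompt (placeholder mask here)."""
    return {"box": box, "mask": None}

image = object()  # placeholder image
boxes = detect_boxes(image, text_prompt="person")
shapes = [segment(image, b) for b in boxes]
print(len(shapes))  # one shape per detected box
```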

Q4: Does open_vision have an open-source repository? (It seems your team's work builds upon existing models like CountGD, GroundingDINO, and SAM—please excuse my lack of knowledge.)

A4: No, open_vision is not an open-source repository. As mentioned in A2, it’s more of a workflow integrating multiple models. 🚀

Hope this helps! If you need more clarification, feel free to ask! 🚀

@FightingKai01
Author

About A3:
Can you give me more clues, such as the name of the technique or some keywords, to make it easier for me to search?

I'm in the process of moving from applications to research code and hope to keep communicating.

@CVHub520
Owner

About A3: Can you give me more clues, such as the name of the technology, keywords, etc., to facilitate my search?

I'm in the process of moving from application to research code and hope to continue communicating.

https://arxiv.org/abs/2407.04619
