The filenames of the downloaded Youku-AliceMind files differ from the names in the caption file; how do I match them? #27

Open
Science2AI-TaoXu opened this issue May 16, 2024 · 4 comments

Comments

@Science2AI-TaoXu

[Two attached screenshots: 1715869599182, 1715869616587]

@dai-yutong

I'm facing the same issue. How are the filenames mapped?

@dai-yutong

I have solved my problem.

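from datasets.utils.file_utils import hash_url_to_filename

# filename_in_csv: a video filename exactly as it appears in the caption CSV
url = f"public-unzip-dataset/modelscope/Youku-AliceMind/master/{filename_in_csv}"
# In my case the downloaded file was named after the hash of this key
new_filename = hash_url_to_filename(url)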
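
As a rough sketch of how the snippet above could be applied to a whole caption file, the helper below hashes each CSV filename and checks whether the result exists in a local directory. `match_files`, `OSS_PREFIX`, and the `cache_dir` argument are hypothetical names used for illustration only. The hash function is passed in by the caller so that `hash_url_to_filename` can be supplied, and the key prefix is simply copied from the snippet, so it may not hold for every download.

```python
import os
from typing import Callable, Dict, Iterable, List, Tuple

# Key prefix copied from the snippet above. That it applies to every
# file of the dataset is an assumption, not something confirmed here.
OSS_PREFIX = "public-unzip-dataset/modelscope/Youku-AliceMind/master/"


def match_files(
    csv_names: Iterable[str],
    cache_dir: str,
    hash_fn: Callable[[str], str],
    prefix: str = OSS_PREFIX,
) -> Tuple[Dict[str, str], List[str]]:
    """Map caption-CSV filenames to hashed files found in cache_dir.

    Returns (found, missing): found maps each CSV name to the path of
    the matching file on disk, missing lists names with no match.
    """
    on_disk = set(os.listdir(cache_dir))
    found: Dict[str, str] = {}
    missing: List[str] = []
    for name in csv_names:
        hashed = hash_fn(prefix + name)
        if hashed in on_disk:
            found[name] = os.path.join(cache_dir, hashed)
        else:
            missing.append(name)
    return found, missing


# Possible usage (the directory path is a placeholder):
# from datasets.utils.file_utils import hash_url_to_filename
# found, missing = match_files(names_from_csv, "/path/to/data_files",
#                              hash_url_to_filename)
```

If many names end up in `missing`, the key prefix is probably different for those files, and it is worth checking the actual key used during download.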

@Sun-light-W

I used this method, but the resulting new_filename was not found in my data_files directory.

@dai-yutong

Maybe you can go to the directory of the installed modelscope package, open the file "msdatasets/utils/oss_utils.py", find the line filename = hash_url_to_filename(file_oss_key, etag=None), and print file_oss_key and filename when you run MsDataset.load. That shows the exact key that is hashed for each file.
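
To find that file without searching by hand, a small sketch like the following may help. `locate_oss_utils` is a hypothetical helper, and the relative path is the one quoted above, so it may differ between modelscope versions.

```python
import importlib.util
import os
from typing import Optional


def locate_oss_utils(package: str = "modelscope") -> Optional[str]:
    """Return the path of msdatasets/utils/oss_utils.py inside the
    installed package, or None if the package or file is not found."""
    # For a top-level name, find_spec looks the package up without importing it.
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    root = list(spec.submodule_search_locations)[0]
    # Relative path as quoted in the comment above (may vary by version).
    path = os.path.join(root, "msdatasets", "utils", "oss_utils.py")
    return path if os.path.isfile(path) else None


if __name__ == "__main__":
    print(locate_oss_utils())
```

Once the file is located, a temporary print(file_oss_key, filename) right after the hash_url_to_filename call should show which key maps to which local name.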
