First off, this dataset is amazing! Thanks for putting it together.
I was trying to use this dataset to train a GAN to generate anime faces, but I ran into a problem with my dataloader. I downloaded `data.tgz` from your Google Drive link and extracted the files. As you mention in the README, there are some bad images; specifically, my dataloader failed once it encountered some corrupted JPG files with a file size of 0 bytes.
I used the code below to find and delete the bad files. Could you remove them from `data.tgz` to make the dataset easier for other people to use?
```python
import os

# There are some 0-byte JPG files that cause issues.
# I extracted 'data.tgz' into a folder called anime_faces,
# but you can put whatever folder your data is in instead.
# This walks all subfolders recursively.
for root, dirs, files in os.walk("anime_faces"):
    for file in files:
        path = os.path.join(root, file)
        if os.stat(path).st_size == 0:
            print("Remove 0B size file:", path)
            os.remove(path)
```
Here is the output, listing the files that are corrupted and should be removed:
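One caveat: zero-byte files aren't the only way a JPEG can be corrupted; a file can also be truncated mid-download. In case anyone else hits that, here is a rough stdlib-only sketch (a hypothetical helper, not part of the dataset tooling) that checks for the JPEG start-of-image (`FF D8`) and end-of-image (`FF D9`) markers:

```python
import os

def looks_like_complete_jpeg(path):
    """Rough sanity check: a well-formed JPEG starts with the SOI
    marker (FF D8) and ends with the EOI marker (FF D9). This catches
    empty and truncated files, though not every form of corruption."""
    if os.stat(path).st_size < 4:
        return False
    with open(path, "rb") as f:
        start = f.read(2)          # should be the SOI marker
        f.seek(-2, os.SEEK_END)
        end = f.read(2)            # should be the EOI marker
    return start == b"\xff\xd8" and end == b"\xff\xd9"
```

Running this over the extracted folder and deleting anything that fails would cover truncated files as well as the 0-byte ones.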