First off, this dataset is amazing! Thanks for putting it together.
I was trying to use this dataset to train a GAN to generate anime faces, but I ran into a problem with my dataloader. I downloaded `data.tgz` from your Google Drive link and extracted the files. As you mention in the README, there are some bad images; specifically, my dataloader failed once it encountered some corrupted JPG files with a file size of 0 bytes.
I used the code below to find and delete the bad files. Could you remove them from `data.tgz` to make the dataset easier for other people to use?
```python
import os

# There are some 0-byte JPG files that cause issues.
# I extracted 'data.tgz' into a folder called anime_faces,
# but you can put whatever folder your data is in instead.
# This walks all subfolders recursively.
for root, dirs, files in os.walk("anime_faces"):
    for file in files:
        path = os.path.join(root, file)
        if os.stat(path).st_size == 0:
            print("Remove 0B size file:", path)
            os.remove(path)
```
Here is the output, listing the files that are corrupted and should be removed:
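One caveat: zero-byte files aren't the only way a JPEG can be corrupted; a file can also be truncated mid-download. In case anyone else hits that, here is a rough stdlib-only sketch (a hypothetical helper, not part of the dataset tooling) that checks for the JPEG start-of-image (`FF D8`) and end-of-image (`FF D9`) markers:

```python
import os

def looks_like_complete_jpeg(path):
    """Rough sanity check: a well-formed JPEG starts with the SOI
    marker (FF D8) and ends with the EOI marker (FF D9). This catches
    empty and truncated files, though not every form of corruption."""
    if os.stat(path).st_size < 4:
        return False
    with open(path, "rb") as f:
        start = f.read(2)          # should be the SOI marker
        f.seek(-2, os.SEEK_END)
        end = f.read(2)            # should be the EOI marker
    return start == b"\xff\xd8" and end == b"\xff\xd9"
```

Running this over the extracted folder and deleting anything that fails would cover truncated files as well as the 0-byte ones.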