
site_dump does not handle root level files correctly #56

Open
jimdinunzio opened this issue Nov 15, 2019 · 1 comment

@jimdinunzio

Hi,
I noticed on two different site dumps that the files at the root level are all copies of a single file. The files in folders below root appear to be fine.

Name                          Size     Date Modified
Intro to AVR programming/              11/14/19, 6:38:58 PM
PID presentation/                      11/14/19, 6:38:58 PM
robot builder issues/                  11/14/19, 6:38:58 PM
sample line follow videos/             11/14/19, 6:38:58 PM
file1                         37.1 kB  11/14/19, 6:38:58 PM
file2                         37.1 kB  11/14/19, 6:38:58 PM
file3                         37.1 kB  11/14/19, 6:38:58 PM
file4                         37.1 kB  11/14/19, 6:38:58 PM
file5                         37.1 kB  11/14/19, 6:38:58 PM
file6                         37.1 kB  11/14/19, 6:38:58 PM
file7                         37.1 kB  11/14/19, 6:38:58 PM

Jim

@dogfeathers

Same thing is happening with the files in every folder, not just at root level. The issue is in yield_walk_files() in scraper.py. In the "for data in data_files" loop, the "el" variable is not being set in the loop, it's inherited from above. Get rid of that loop and move its lines up into the "if data['fileType'] == 'f':" block. No need for the data_files list. You can similarly get rid of the data_dirs list, but that one isn't hurting anything (but not helping anything either).
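
A minimal sketch of the bug shape and the fix described above, not the real scraper.py: the entry fields other than 'fileType' (e.g. 'name', 'url') and the example data are assumptions for illustration.

```python
def yield_walk_files_buggy(entries):
    # Buggy shape: files are collected into data_files and handled in a
    # second loop, but that loop never rebinds `el`, so every yielded file
    # reuses whatever `el` held when the first loop finished.
    data_files = []
    for data in entries:
        el = data.get('url')              # per-entry element, bound only in the first loop
        if data['fileType'] == 'f':
            data_files.append(data)
    for data in data_files:
        yield data['name'], el            # BUG: `el` is stale here

def yield_walk_files_fixed(entries):
    # Fixed shape: handle the file inside the `if data['fileType'] == 'f':`
    # block so each file uses its own `el`; the data_files list goes away.
    for data in entries:
        el = data.get('url')
        if data['fileType'] == 'f':
            yield data['name'], el

entries = [
    {'fileType': 'f', 'name': 'file1', 'url': 'https://example/file1'},
    {'fileType': 'f', 'name': 'file2', 'url': 'https://example/file2'},
]
print(list(yield_walk_files_buggy(entries)))  # both files point at .../file2
print(list(yield_walk_files_fixed(entries)))  # each file keeps its own URL
```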
