Update batch info, cache, and collected urls & tags #79

GitHub Actions / flake8 succeeded Aug 14, 2024 in 1s

reviewdog [flake8] report

reported by reviewdog 🐶

Findings (0)
Filtered Findings (266)

Tests/__init__.py|1 col 1| Missing docstring in public package
Tests/test_common_crawler_integration.py|1 col 1| Missing docstring in public module
Tests/test_common_crawler_integration.py|15 col 1| Missing docstring in public class
Tests/test_common_crawler_integration.py|17 col 1| Missing docstring in __init__
Tests/test_common_crawler_integration.py|27 col 1| Section has no content
Tests/test_common_crawler_integration.py|87 col 1| Missing docstring in public function
Tests/test_common_crawler_integration.py|87 col 36| Unused argument 'repo_file_path'
Tests/test_common_crawler_integration.py|101 col 1| Missing docstring in public function
Tests/test_common_crawler_integration.py|101 col 47| Unused argument 'repo_file_path'
Tests/test_common_crawler_integration.py|109 col 1| too many blank lines (3)
Tests/test_common_crawler_integration.py|150 col 37| Unused argument '_'
Tests/test_common_crawler_integration.py|156 col 5| too many blank lines (2)
Tests/test_common_crawler_integration.py|203 col 5| too many blank lines (2)
Tests/test_common_crawler_integration.py|235 col 1| Missing docstring in public function
Tests/test_common_crawler_unit.py|1 col 1| Missing docstring in public module
Tests/test_common_crawler_unit.py|4 col 1| 'urllib.parse.quote_plus' imported but unused
Tests/test_html_tag_collector_integration.py|1 col 1| Missing docstring in public module
Tests/test_html_tag_collector_integration.py|2 col 1| 'pytest' imported but unused
Tests/test_identifier_unit.py|1 col 1| Missing docstring in public module
Tests/test_identifier_unit.py|7 col 1| 'from agency_identifier.identifier import *' used; unable to detect undefined names
Tests/test_identifier_unit.py|11 col 1| Missing docstring in public function
Tests/test_identifier_unit.py|15 col 1| Missing docstring in public function
Tests/test_identifier_unit.py|15 col 32| Unused argument 'mock_env'
Tests/test_identifier_unit.py|18 col 16| 'get_page_data' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|22 col 1| Missing docstring in public function
Tests/test_identifier_unit.py|22 col 32| Unused argument 'mock_env'
Tests/test_identifier_unit.py|26 col 13| 'get_page_data' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|35 col 1| Missing docstring in public function
Tests/test_identifier_unit.py|36 col 12| 'parse_hostname' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|43 col 1| Missing docstring in public function
Tests/test_identifier_unit.py|45 col 9| 'parse_hostname' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|55 col 1| Missing docstring in public function
Tests/test_identifier_unit.py|56 col 12| 'remove_http' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|60 col 1| Missing docstring in public function
Tests/test_identifier_unit.py|67 col 1| Missing docstring in public function
Tests/test_identifier_unit.py|69 col 13| 'match_agencies' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|74 col 1| Missing docstring in public function
Tests/test_identifier_unit.py|76 col 13| 'match_agencies' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|81 col 1| Missing docstring in public function
Tests/test_identifier_unit.py|90 col 1| Missing docstring in public function
Tests/test_identifier_unit.py|93 col 13| 'match_agencies' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|98 col 23| 'match_agencies' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|103 col 20| 'match_agencies' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|108 col 1| Missing docstring in public function
Tests/test_identifier_unit.py|108 col 48| Unused argument 'mock_env'
Tests/test_identifier_unit.py|115 col 10| 'get_agencies_data' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|123 col 24| 'polars' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|130 col 18| 'polars' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|136 col 1| Missing docstring in public function
Tests/test_identifier_unit.py|142 col 1| Missing docstring in public function
Tests/test_identifier_unit.py|143 col 20| 'process_data' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|153 col 32| 'polars' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|160 col 28| 'polars' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|169 col 1| Missing docstring in public function
Tests/test_identifier_unit.py|175 col 1| Missing docstring in public function
Tests/test_identifier_unit.py|176 col 18| 'match_urls_to_agencies_and_clean_data' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|189 col 30| 'polars' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|190 col 30| 'polars' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|193 col 1| Missing docstring in public function
Tests/test_identifier_unit.py|201 col 14| 'read_data' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|207 col 9| 'os' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|209 col 1| Missing docstring in public function
Tests/test_identifier_unit.py|212 col 9| 'read_data' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|215 col 1| Missing docstring in public function
Tests/test_identifier_unit.py|217 col 10| 'polars' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|225 col 9| 'write_data' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|234 col 9| 'os' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|237 col 1| Missing docstring in public function
Tests/test_identifier_unit.py|240 col 19| 'polars' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|241 col 32| Unused argument 'file_path'
Tests/test_identifier_unit.py|243 col 18| 'polars' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|244 col 13| 'write_data' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|250 col 1| Missing docstring in public function
Tests/test_identifier_unit.py|252 col 35| 'polars' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|253 col 20| 'polars' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|257 col 5| 'process_and_write_data' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|271 col 1| Missing docstring in public function
Tests/test_identifier_unit.py|271 col 57| Unused argument 'mock_process_data'
Tests/test_identifier_unit.py|271 col 76| Unused argument 'mock_write_data'
Tests/test_identifier_unit.py|275 col 9| 'process_and_write_data' may be undefined, or defined from star imports: agency_identifier.identifier
Tests/test_identifier_unit.py|275 col 68| no newline at end of file
Tests/test_label_studio_interface_integration.py|1 col 1| Missing docstring in public module
Tests/test_label_studio_interface_integration.py|9 col 1| Missing docstring in public function
Tests/test_label_studio_interface_integration.py|14 col 1| Missing docstring in public function
Tests/test_label_studio_interface_integration.py|19 col 1| Missing docstring in public function
Tests/test_label_studio_interface_integration.py|25 col 1| Missing docstring in public function
Tests/test_label_studio_interface_integration.py|31 col 1| Missing docstring in public function
Tests/test_label_studio_interface_integration.py|36 col 1| Missing docstring in public function
Tests/test_label_studio_interface_integration.py|62 col 5| too many blank lines (2)
Tests/test_label_studio_interface_integration.py|64 col 29| no newline at end of file
Tests/test_util_unit.py|1 col 1| Missing docstring in public module
Tests/test_util_unit.py|6 col 1| Missing docstring in public function
Tests/test_util_unit.py|15 col 1| Missing docstring in public function
Tests/test_util_unit.py|20 col 1| Missing docstring in public function
Tests/test_util_unit.py|22 col 44| no newline at end of file
agency_identifier/__init__.py|1 col 1| Missing docstring in public package
agency_identifier/identifier.py|1 col 1| Missing docstring in public module
agency_identifier/identifier.py|49 col 1| Missing docstring in public function
agency_identifier/identifier.py|150 col 1| Missing docstring in public function
agency_identifier/identifier.py|191 col 1| Missing docstring in public function
agency_identifier/identifier.py|199 col 1| Missing docstring in public function
agency_identifier/identifier.py|208 col 1| Missing docstring in public function
agency_identifier/identifier.py|224 col 1| Missing docstring in public function
annotation_pipeline/populate_labelstudio.py|84 col 5| block comment should start with '# '
annotation_pipeline/populate_labelstudio.py|114 col 1| No whitespaces allowed surrounding docstring text
annotation_pipeline/populate_labelstudio.py|114 col 1| First word of the first line should be properly capitalized
annotation_pipeline/populate_labelstudio.py|120 col 1| No whitespaces allowed surrounding docstring text
annotation_pipeline/populate_labelstudio.py|145 col 5| block comment should start with '# '
annotation_pipeline/populate_labelstudio.py|150 col 5| block comment should start with '# '
annotation_pipeline/populate_labelstudio.py|154 col 5| block comment should start with '# '
annotation_pipeline/populate_labelstudio.py|158 col 7| missing whitespace after keyword
annotation_pipeline/populate_labelstudio.py|175 col 5| block comment should start with '# '
annotation_pipeline/populate_labelstudio.py|178 col 5| block comment should start with '# '
annotation_pipeline/populate_labelstudio.py|182 col 5| block comment should start with '# '
annotation_pipeline/populate_labelstudio.py|193 col 5| block comment should start with '# '
annotation_pipeline/populate_labelstudio.py|204 col 5| block comment should start with '# '
annotation_pipeline/populate_labelstudio.py|207 col 5| block comment should start with '# '
annotation_pipeline/populate_labelstudio.py|238 col 9| block comment should start with '# '
annotation_pipeline/populate_labelstudio.py|252 col 1| expected 2 blank lines after class or function definition, found 1
annotation_pipeline/populate_labelstudio.py|255 col 1| blank line at end of file
common_crawler/__init__.py|1 col 1| Missing docstring in public package
common_crawler/argparser.py|1 col 1| Missing docstring in public module
common_crawler/cache.py|1 col 1| Missing docstring in public module
common_crawler/cache.py|12 col 1| 1 blank line required after class docstring
common_crawler/cache.py|45 col 5| too many blank lines (2)
common_crawler/cache.py|62 col 5| too many blank lines (2)
common_crawler/cache.py|75 col 5| too many blank lines (2)
common_crawler/cache.py|86 col 5| too many blank lines (2)
common_crawler/crawler.py|1 col 1| Missing docstring in public module
common_crawler/crawler.py|10 col 1| 'collections.namedtuple' imported but unused
common_crawler/crawler.py|20 col 1| too many blank lines (3)
common_crawler/crawler.py|21 col 1| Missing docstring in public class
common_crawler/crawler.py|34 col 1| Missing docstring in __init__
common_crawler/crawler.py|40 col 1| Missing docstring in public method
common_crawler/crawler.py|129 col 1| Missing docstring in public method
common_crawler/csv_manager.py|1 col 1| Missing docstring in public module
common_crawler/csv_manager.py|13 col 1| Missing docstring in __init__
common_crawler/csv_manager.py|32 col 14| indentation is not a multiple of 4
common_crawler/csv_manager.py|32 col 14| over-indented
common_crawler/main.py|1 col 1| Missing docstring in public module
common_crawler/main.py|30 col 1| Missing docstring in public class
common_crawler/main.py|39 col 1| 1 blank line required after class docstring
common_crawler/main.py|42 col 1| expected 2 blank lines after class or function definition, found 1
common_crawler/main.py|44 col 1| Missing docstring in public function
common_crawler/main.py|48 col 1| Missing docstring in public function
common_crawler/main.py|68 col 1| Missing docstring in public function
common_crawler/main.py|287 col 1| Missing docstring in public function
common_crawler/utils.py|11 col 1| Missing docstring in __init__
common_crawler/utils.py|14 col 1| Missing docstring in public method
common_crawler/utils.py|21 col 1| Missing docstring in magic method
html_tag_collector/DataClassTags.py|1 col 1| Missing docstring in public module
html_tag_collector/DataClassTags.py|5 col 1| Missing docstring in public class
html_tag_collector/RootURLCache.py|1 col 1| Missing docstring in public module
html_tag_collector/RootURLCache.py|13 col 1| Missing docstring in public class
html_tag_collector/RootURLCache.py|14 col 1| Missing docstring in __init__
html_tag_collector/RootURLCache.py|18 col 1| Missing docstring in public method
html_tag_collector/RootURLCache.py|30 col 1| Missing docstring in public method
html_tag_collector/RootURLCache.py|34 col 1| Missing docstring in public method
html_tag_collector/RootURLCache.py|74 col 1| Missing docstring in public method
html_tag_collector/__init__.py|1 col 1| Missing docstring in public package
html_tag_collector/collector.py|1 col 1| Docstring is over-indented
html_tag_collector/collector.py|1 col 1| No whitespaces allowed surrounding docstring text
html_tag_collector/collector.py|18 col 1| 'urllib.request' imported but unused
html_tag_collector/collector.py|19 col 1| 'concurrent.futures.as_completed' imported but unused
html_tag_collector/collector.py|19 col 1| 'concurrent.futures.ThreadPoolExecutor' imported but unused
html_tag_collector/collector.py|29 col 1| redefinition of unused 'tqdm' from line 28
html_tag_collector/collector.py|95 col 1| Missing docstring in public function
html_tag_collector/collector.py|95 col 23| Unused argument 'loop'
html_tag_collector/collector.py|202 col 9| line break before binary operator
html_tag_collector/collector.py|203 col 9| line break before binary operator
html_tag_collector/collector.py|204 col 9| line break before binary operator
html_tag_collector/collector.py|208 col 9| line break before binary operator
html_tag_collector/collector.py|209 col 9| line break before binary operator
html_tag_collector/collector.py|475 col 1| Missing docstring in public function
html_tag_collector/common.py|1 col 1| Missing docstring in public module
html_tag_collector/common.py|9 col 125| no newline at end of file
hugging_face/example/huggingface_test.py|1 col 1| Missing docstring in public module
hugging_face/example/huggingface_test.py|18 col 1| Missing docstring in public function
hugging_face/example/huggingface_test.py|23 col 1| expected 2 blank lines after class or function definition, found 1
hugging_face/example/huggingface_test.py|29 col 61| at least two spaces before inline comment
hugging_face/example/huggingface_test.py|29 col 61| inline comment should start with '# '
hugging_face/example/huggingface_test.py|30 col 59| at least two spaces before inline comment
hugging_face/example/huggingface_test.py|30 col 59| inline comment should start with '# '
hugging_face/example/huggingface_test.py|34 col 115| at least two spaces before inline comment
hugging_face/example/huggingface_test.py|34 col 115| inline comment should start with '# '
hugging_face/example/huggingface_test.py|37 col 1| Missing docstring in public function
hugging_face/example/huggingface_test.py|42 col 1| expected 2 blank lines after class or function definition, found 1
hugging_face/example/huggingface_test.py|53 col 1| block comment should start with '# '
hugging_face/example/huggingface_test.py|54 col 1| block comment should start with '# '
hugging_face/example/split_data.py|1 col 1| Missing docstring in public module
hugging_face/testing/hf_trainer.py|1 col 1| Missing docstring in public module
hugging_face/testing/hf_trainer.py|12 col 1| block comment should start with '# '
hugging_face/testing/hf_trainer.py|21 col 1| Missing docstring in public function
hugging_face/testing/hf_trainer.py|29 col 1| expected 2 blank lines after class or function definition, found 1
hugging_face/testing/hf_trainer.py|29 col 27| undefined name 'filtered_dataset'
hugging_face/testing/hf_trainer.py|30 col 26| undefined name 'filtered_dataset'
hugging_face/testing/hf_trainer.py|48 col 22| at least two spaces before inline comment
hugging_face/testing/hf_trainer.py|66 col 1| Missing docstring in public function
hugging_face/testing/hf_trainer.py|70 col 5| block comment should start with '# '
hugging_face/testing/hf_trainer.py|76 col 1| expected 2 blank lines after class or function definition, found 1
hugging_face/testing/hf_trainer.py|82 col 15| unexpected spaces around keyword / parameter equals
hugging_face/testing/hf_trainer.py|82 col 17| unexpected spaces around keyword / parameter equals
hugging_face/url_relevance/clean_data.py|1 col 1| Missing docstring in public module
hugging_face/url_relevance/huggingface_relevance.py|1 col 1| Missing docstring in public module
hugging_face/url_relevance/huggingface_relevance.py|23 col 1| Missing docstring in public function
hugging_face/url_relevance/huggingface_relevance.py|88 col 1| Missing docstring in public function
identification_pipeline.py|1 col 1| Missing docstring in public module
identification_pipeline.py|7 col 1| 'datetime.datetime as dt' imported but unused
identification_pipeline.py|12 col 1| Missing docstring in public function
identification_pipeline.py|22 col 94| missing whitespace after ','
identification_pipeline.py|32 col 43| no newline at end of file
label_studio_interface/LabelStudioAPIManager.py|1 col 1| Missing docstring in public module
label_studio_interface/LabelStudioAPIManager.py|9 col 1| 'dotenv.load_dotenv' imported but unused
label_studio_interface/LabelStudioAPIManager.py|25 col 1| 1 blank line required after class docstring
label_studio_interface/LabelStudioAPIManager.py|36 col 1| Missing docstring in public function
label_studio_interface/LabelStudioAPIManager.py|41 col 1| Missing docstring in public class
label_studio_interface/LabelStudioAPIManager.py|42 col 1| Missing docstring in __init__
label_studio_interface/LabelStudioAPIManager.py|48 col 1| Missing docstring in public method
label_studio_interface/LabelStudioAPIManager.py|52 col 1| Missing docstring in public method
label_studio_interface/LabelStudioAPIManager.py|56 col 1| Missing docstring in public method
label_studio_interface/LabelStudioAPIManager.py|72 col 1| Missing docstring in __init__
label_studio_interface/LabelStudioAPIManager.py|150 col 1| Missing docstring in public class
label_studio_interface/LabelStudioConfig.py|1 col 1| Missing docstring in public module
label_studio_interface/LabelStudioConfig.py|6 col 1| 1 blank line required after class docstring
label_studio_interface/LabelStudioConfig.py|22 col 1| Missing docstring in public method
label_studio_interface/LabelStudioConfig.py|26 col 1| Missing docstring in public method
label_studio_interface/LabelStudioConfig.py|30 col 1| Missing docstring in public method
label_studio_interface/PreAnnotationCreator.py|7 col 1| 1 blank line required after class docstring
label_studio_interface/PreAnnotationCreator.py|24 col 1| Missing docstring in public class
label_studio_interface/PreAnnotationCreator.py|26 col 1| Missing docstring in __init__
label_studio_interface/PreAnnotationCreator.py|49 col 1| Missing docstring in public class
label_studio_interface/PreAnnotationCreator.py|49 col 1| too many blank lines (4)
label_studio_interface/PreAnnotationCreator.py|51 col 1| Missing docstring in __init__
label_studio_interface/PreAnnotationCreator.py|54 col 33| Unused argument 'raw_taxonomy_data'
label_studio_interface/PreAnnotationCreator.py|81 col 9| local variable 'taxonomy_results' is assigned to but never used
label_studio_interface/PreAnnotationCreator.py|86 col 9| too many blank lines (4)
label_studio_interface/__init__.py|1 col 1| Missing docstring in public package
openai-playground/openai-test.py|1 col 1| Missing docstring in public module
openai-playground/openai-test.py|2 col 1| 'os' imported but unused
openai-playground/openai-test.py|7 col 3| continuation line under-indented for hanging indent
openai-playground/openai-test.py|9 col 5| continuation line under-indented for hanging indent
openai-playground/openai-test.py|14 col 37| no newline at end of file
tests/test_root_url_cache_unit.py|1 col 1| Missing docstring in public module
tests/test_root_url_cache_unit.py|10 col 1| Missing docstring in public function
tests/test_root_url_cache_unit.py|20 col 1| Missing docstring in public function
tests/test_root_url_cache_unit.py|77 col 13| continuation line over-indented for hanging indent
util/__init__.py|1 col 1| Missing docstring in public package
util/db_manager.py|1 col 1| Missing docstring in public module
util/db_manager.py|7 col 1| Missing docstring in public class
util/db_manager.py|9 col 1| Missing docstring in __init__
util/db_manager.py|19 col 1| Missing docstring in magic method
util/db_manager.py|22 col 1| Missing docstring in public method
util/db_manager.py|27 col 1| Missing docstring in public method
util/db_manager.py|30 col 1| Missing docstring in public method
util/db_manager.py|33 col 1| Missing docstring in public method
... (Too many findings. Dropped some findings)
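
Each finding above appears to follow the shape `<path>|<line> col <column>| <message>`. As a rough sketch for triaging a report like this one, the snippet below parses lines of that shape and tallies findings per file. The regular expression is inferred from the lines shown here, not taken from reviewdog's documentation, and the helper names `parse_finding` and `count_by_file` are hypothetical.

```python
import re
from collections import Counter

# Assumed line shape, inferred from the report above (not an official format):
#   <path>|<line> col <column>| <message>
FINDING_RE = re.compile(
    r"^(?P<path>[^|]+)\|(?P<line>\d+) col (?P<col>\d+)\| (?P<message>.*)$"
)


def parse_finding(text):
    """Return a dict for one finding line, or None if the line does not match."""
    match = FINDING_RE.match(text.strip())
    if match is None:
        return None
    return {
        "path": match["path"],
        "line": int(match["line"]),
        "col": int(match["col"]),
        "message": match["message"],
    }


def count_by_file(lines):
    """Count parsed findings per file path, skipping lines that are not findings."""
    parsed = (parse_finding(line) for line in lines)
    return Counter(finding["path"] for finding in parsed if finding is not None)


# A few lines copied from the report, plus the trailing truncation notice.
sample = [
    "common_crawler/cache.py|1 col 1| Missing docstring in public module",
    "common_crawler/cache.py|45 col 5| too many blank lines (2)",
    "util/db_manager.py|7 col 1| Missing docstring in public class",
    "... (Too many findings. Dropped some findings)",
]
counts = count_by_file(sample)
```

Because the report is truncated ("Dropped some findings"), any tally built this way covers only the findings that were listed, not all 266.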