school_mapping.py #17

Open
wants to merge 1 commit into
base: main
Commits on Jun 4, 2024

  1. school_mapping.py

     Description
    This pull request adds a script (`school_matching.py`) to match schools from Source A to Source B using fuzzy matching of transliterated school names and district IDs.
    
    Changes Made
    - Added `school_matching.py` which:
      - Loads school data from `school_list_A.tsv` and `school_list_B.tsv`.
      - Transliterates Devanagari text to Romanized text using the Velthuis scheme.
      - Matches schools based on transliterated names and district IDs using the RapidFuzz library.
      - Saves the matching results to `school_mapping_results.csv`.
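For reviewers, the steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the field names (`school_id`, `name`, `district_id`) and the helpers `to_velthuis`, `similarity`, and `match_schools` are hypothetical, the transliteration table covers only a subset of the Velthuis scheme, and `difflib` from the standard library is swapped in for RapidFuzz so the block runs without extra dependencies (its scores will differ somewhat from RapidFuzz's).

```python
import difflib

# Partial Velthuis table (assumption: a real script would likely use a library).
CONSONANTS = {
    "क": "k", "ख": "kh", "ग": "g", "घ": "gh", "च": "c", "छ": "ch",
    "ज": "j", "झ": "jh", "ट": ".t", "ठ": ".th", "ड": ".d", "ढ": ".dh",
    "ण": ".n", "त": "t", "थ": "th", "द": "d", "ध": "dh", "न": "n",
    "प": "p", "फ": "ph", "ब": "b", "भ": "bh", "म": "m", "य": "y",
    "र": "r", "ल": "l", "व": "v", "श": '"s', "ष": ".s", "स": "s", "ह": "h",
}
VOWELS = {"अ": "a", "आ": "aa", "इ": "i", "ई": "ii", "उ": "u", "ऊ": "uu",
          "ए": "e", "ऐ": "ai", "ओ": "o", "औ": "au"}
# Dependent vowel signs (matras), written as escapes for clarity.
MATRAS = {"\u093e": "aa", "\u093f": "i", "\u0940": "ii", "\u0941": "u",
          "\u0942": "uu", "\u0947": "e", "\u0948": "ai", "\u094b": "o",
          "\u094c": "au"}
VIRAMA = "\u094d"                            # suppresses the inherent "a"
SIGNS = {"\u0902": ".m", "\u0903": ".h"}     # anusvara, visarga


def to_velthuis(text):
    """Romanize Devanagari text with the partial table above."""
    out = []
    for i, ch in enumerate(text):
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt not in MATRAS and nxt != VIRAMA:
                out.append("a")              # inherent vowel
        elif ch in MATRAS:
            out.append(MATRAS[ch])
        elif ch in VOWELS:
            out.append(VOWELS[ch])
        elif ch in SIGNS:
            out.append(SIGNS[ch])
        elif ch != VIRAMA:
            out.append(ch)                   # spaces, digits, unmapped characters
    return "".join(out)


def similarity(a, b):
    """Score in 0-100; a stdlib stand-in for a RapidFuzz scorer."""
    return 100.0 * difflib.SequenceMatcher(None, a, b).ratio()


def match_schools(schools_a, schools_b, threshold=70):
    """For each school in A, pick the best-scoring B school in the same district."""
    by_district = {}
    for school in schools_b:
        by_district.setdefault(school["district_id"], []).append(school)

    results = []
    for a in schools_a:
        name_a = to_velthuis(a["name"])
        best, best_score = None, 0.0
        for b in by_district.get(a["district_id"], []):
            score = similarity(name_a, to_velthuis(b["name"]))
            if score > best_score:
                best, best_score = b, score
        if best is not None and best_score >= threshold:
            results.append((a["school_id"], best["school_id"], round(best_score, 1)))
    return results
```

Grouping Source B by district first means each Source A school is compared only against candidates in its own district, which keeps the number of comparisons down and prevents identically named schools in other districts from matching.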
    
    Assumptions
    - District mapping data is provided in `jilla.tsv` with Devanagari district names.
    - The fuzzy matching threshold is set to 70 on RapidFuzz's 0-100 similarity scale.
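To make the district-mapping assumption concrete, here is one way a lookup from Devanagari district names to IDs could be built from a file like `jilla.tsv`. The column names (`district`, `district_id`), the helper names, and the sample rows are all hypothetical, since the file layout is not described here; Unicode normalization is included because visually identical Devanagari strings can differ in their code point sequences.

```python
import csv
import io
import unicodedata


def _norm(name):
    """Normalize to NFC and trim whitespace so equivalent spellings compare equal."""
    return unicodedata.normalize("NFC", name).strip()


def load_district_ids(tsv_file, name_col="district", id_col="district_id"):
    """Build {district name: district ID} from a tab-separated file object.

    The column names are assumptions, not the actual jilla.tsv header.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return {_norm(row[name_col]): row[id_col] for row in reader}


def district_id_for(name, table):
    """Return the ID for a district name, or None if it is not in the table."""
    return table.get(_norm(name))


# Made-up sample rows standing in for jilla.tsv.
sample = io.StringIO("district_id\tdistrict\n1\tकाठमाडौं\n2\tललितपुर\n")
table = load_district_ids(sample)
```

With a real file, `load_district_ids(open("jilla.tsv", encoding="utf-8", newline=""))` would take the place of the in-memory sample.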
    
    
    Comparing transliterated names only within the same district is intended to limit false matches between similarly named schools. Feedback and suggestions for improvement are welcome.
    
    Contributor
    This contribution was made by Bimal Bhandari.
    Cybertechnnp authored Jun 4, 2024
    Commit c5f4afb