This repo contains a preprocessor, stopwords, and other functionality needed for Urdu NLP work
-
urdu.py contains
URDU_DIACRITICS
URDU_DIGIT
URDU_PUNCTUATIONS
URDU_EXTRA_CHARACTER
URDU_ALPHABET
URDU_STOPWORDS
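
As a rough illustration of how constants like these are typically used, here is a minimal sketch. It assumes the constants are collections of strings. The values below are small stand-ins defined only for this example, and the helper names `remove_diacritics` and `remove_stopwords` are hypothetical; the actual contents and types in urdu.py may differ.

```python
# Stand-in values for illustration only; the real constants live in urdu.py
# and may differ in both contents and type.
URDU_DIACRITICS = ["\u064E", "\u064F", "\u0650", "\u0651"]  # zabar, pesh, zer, tashdid
URDU_STOPWORDS = ["کا", "کی", "کے", "اور"]  # a tiny sample, not the full list


def remove_diacritics(text: str) -> str:
    """Drop every character that appears in URDU_DIACRITICS."""
    return "".join(ch for ch in text if ch not in URDU_DIACRITICS)


def remove_stopwords(text: str) -> str:
    """Split on whitespace and drop tokens that appear in URDU_STOPWORDS."""
    return " ".join(tok for tok in text.split() if tok not in URDU_STOPWORDS)
```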
-
The notebook
preprocessor.ipynb
contains some examples of preprocessing
-
capture_phone_or_email_from_text.py
contains two functions that accept a string and report whether a phone number or an email address is present in the text. Each returns a boolean-style value:
0 -> Not found
1 -> Found
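
The behaviour described above could be sketched roughly as follows. This is not the code from capture_phone_or_email_from_text.py: the function names `has_email` and `has_phone` and both regular expressions are assumptions made for illustration, and the patterns are intentionally simple rather than full validators.

```python
import re

# Hypothetical pattern: something@domain.tld, not a full RFC-compliant check.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical pattern: an optional "+", then at least 10 digits, allowing a
# single space or hyphen between digits. In Python 3, \d on str input also
# matches Unicode decimal digits, so Urdu digits are covered too.
PHONE_PATTERN = re.compile(r"\+?\d(?:[\s-]?\d){9,}")


def has_email(text: str) -> int:
    """Return 1 if the text appears to contain an email address, else 0."""
    return int(EMAIL_PATTERN.search(text) is not None)


def has_phone(text: str) -> int:
    """Return 1 if the text appears to contain a phone number, else 0."""
    return int(PHONE_PATTERN.search(text) is not None)
```

Returning `int` rather than `bool` mirrors the 0/1 convention listed above; the 10-digit minimum is an arbitrary threshold chosen to avoid flagging years and other short numbers.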