Hi! I'm Rajat, a Data Scientist with 6 years of experience, based in Delhi. I have a background in Industrial and Systems Engineering from IIT Kharagpur. I have extensive experience solving problems through Data Science at organizations across a range of industries, including retail, media, and consulting. My key interests lie in Natural Language Processing and Recommendation Systems. My approach centers on developing streamlined, high-impact solutions. Throughout my career, I've driven success through initiative, diligence, and an unwavering commitment to continuous improvement. I have 3 ginger cats, and I enjoy hiking in the Himalayas in my spare time.
Currently I am working as a Lead Data Scientist at The Home Depot.
IIT Kharagpur - Industrial and Systems Engineering - Dual Degree (B.Tech & M.Tech) | Class of 2018
- Next Best Action: Developed a configurable Python package for Next Best Action Sequence Models, enabling data science teams to train performant recommenders and enhancing customer experiences across The Home Depot.
- Causal Inference Framework for Customer Insights: Devised a reusable causal inference framework, enabling cross-functional teams to quantify the impact of adverse events on customer behavior, including:
  - Email Opt-out Revenue Impact Analysis: Quantified potential revenue loss per customer due to email opt-outs, informing targeted retention strategies.
  - Validated Framework Efficacy: Successfully validated the framework's results against a Mobile App team's A/B test, achieving >99% alignment in measured lift, and demonstrating the framework's reliability for future use cases.
- Lookalike Modeling: Built a framework for semi-supervised learning on retail data. Helped create customer personas based on spend history, demographics, and browsing patterns for effective targeting.
- Propensity Models Improvement: Studied existing propensity models and devised new features that improved the performance of Class Propensity Models by up to 15%.
- Economic Times Newsletter: Created a news recommendation pipeline for 1.5 million users, raising click-through rate from 4.8% to 15.3%, using Apache Airflow for scheduling, monitoring, and logging.
- News Summarisation API: Customized Google’s Pegasus-Large model for ET markets data and fine-tuned it for English summarization. Built a Flask API endpoint, which is currently in production.
- Summarization for Indic Languages: Tested various statistical summarization methods on Bangla, Tamil, Telugu, Malayalam, Marathi, and Gujarati news.
- Keyword Recommendation for Articles (BERT + Solr): Designed a system for recommending keywords for articles using Solr. Indexed Wikipedia data in the Solr search engine. Used BM25 for coarse search and BERT embeddings for ranking. Tested at 10 searches per second over 6M articles.
- Hack and Hustle 3.0 Hackathon Finalist 2022: Led a team of 3 senior developers to build a Twitter social listening API that provides sentiment, subjectivity, and engagement insights on user-specified stocks.
- Rare Disease Identification using EHR: Devised a PU classifier using GANs. Automated the rare disease identification pipeline from Electronic Health Records, reducing lead time from weeks to 10 work hours.
- Compliance Prediction and Monitoring: Built resource-level non-compliance prediction for a Fortune 100 Hi-Tech client, leading to a 23% reduction in non-compliance and $150K in savings over 6 months.
- Asymptomatic Liver Disease Progression Modeling: Predicted high-risk NASH patients with 86% AUROC in EHR and identified 4 additional non-invasive markers for fast-progressing NASH. The model was used to identify patients for focused treatment and to increase enrollment in clinical trials.
- Text Mining and Classification: Extracted unstructured clinical trial data for a given therapy area to analyze the current research landscape and trends.
Data Science Intern Jan. 2017 - June 2017
- Text Multi-Class Sentiment Analysis: Built an NLP pipeline for unstructured social listening data using bidirectional LSTM, TF-IDF Random Forest, and Word2Vec Averaging MaxEnt. Achieved 72% accuracy and a Cohen’s kappa score of 0.45.
- Smart Scheduling Tool for Recruiters: Created a tool in Excel VBA for recruiters to plan and monitor daily interviews, with the option to send automated mailers.
Python, pandas, NumPy, scikit-learn, PyTorch, TensorFlow, Keras, Transformers, SQL, VBA
- Sequential Recommender in Production [RecBole] [https://throw-away-qq.github.io/recbole_to_production/]
- Motion-following camera using Arduino Uno and IR sensors
- Smart Scheduling Tools for Recruiters in VBA