Welcome to my GitHub profile!
I am a dedicated and passionate 💻 Computer Science student with a strong background in Software Development, Database Systems, Machine Learning, and Natural Language Processing. I am currently pursuing a Master's degree at 🏛 Carnegie Mellon University, focusing on Computer Systems and Machine Learning Systems.
Programming Languages:
Tools & Technologies:
- Designed and developed a data management platform for over 500 hours of driving-scene data to improve the training efficiency of BITS, a state-of-the-art traffic simulation model for autonomous vehicle development
- Designed a robust schema with immutable data and columnar organization to enable selective data access, using the Apache Parquet format and DuckDB for efficient storage and querying, and Flask to serve the platform as Data as a Service (DaaS)
- Trained BITS on meticulously processed internal MADS data, achieving a 53.9% lower off-road rate when evaluated on the NuScenes-test dataset
- Conducted research on cutting-edge multi-task joint models, such as XLNet and RoBERTa, for resume parsing in recruitment scenarios
- Designed a GLUE-based benchmark of recruitment-specific tasks to evaluate BERT-based LLMs
- Improved F1 score by over 8% on CWS and NER tasks by constructing a specialized dictionary with 100k+ entries extracted from a resume dataset; deployed the ML pipeline on AWS, using Spark for big data processing and analytics
- Hosted weekly recitations and office hours for 100+ students, covering OOP, software system design patterns, concurrency, and Java/TypeScript programming, and provided one-on-one mentoring and code reviews
- Carnegie Mellon University - M.S. in Computer Systems, 08/2023 - 12/2024
- Tsinghua University - B.E. in Electronics and Computer Engineering, 08/2019 - 06/2023
- LinkedIn: https://www.linkedin.com/in/zhewei-tong-074244282/
- Email: [email protected]