Hello, my name is Juan M. Banda and I am currently a Sr. Data Scientist at Stanford Health Care, bringing artificial intelligence and machine learning to the healthcare system. Previously I was an assistant professor of computer science at Georgia State University. In my research lab, Panacea Lab, we build machine learning, computer vision, and NLP methods that help generate insights from multi-modal, large-scale data sources. With applications to precision medicine, medical informatics, astroinformatics, and other domains, our work addresses domain-specific problems with data science methods and practices.

As an engineer at heart and in practice for the last 20 years, I have used Python, Bash, ontologies, and NLP tools to build pipelines that annotate over 68 million clinical notes. I have built custom ETLs to map over 8 million patient electronic health records, from 4 institutions, to common data models (OMOP) for large-scale analytics and machine learning. I have designed pipelines, databases, and processes to build research infrastructure for my current and previous labs. I have used R, SQL, MATLAB, Perl, Java, JavaScript, and other languages to acquire, clean, and operationalize data from multiple sources. I have mined over 9 billion Tweets for NLP tasks. In my earlier days, I built content-based image retrieval systems for NASA's SDO mission, with the capacity to process and index over 40,000 images daily and provide computer vision-aided similarity search for images. I started my engineering days designing and developing point-of-sale systems written in Visual Basic.

Apart from my technical skills, I have strong communication and writing skills (over 50 refereed publications) and management skills (I have managed over 40 employees and 20 students).
Driven by the desire to improve patient outcomes and medical care, and to build things that change people's lives, I am committed to releasing all my work under open-source licenses, following the FAIR data sharing principles.
Currently Learning:
Project | Stars | Forks | Issues | Pull Requests |
---|---|---|---|---|
Covid-19 Twitter dataset | | | | |
Social Media Mining Toolkit | | | | |
APHRODITE | | | | |