I am a Software Engineer specializing in Generative AI and Data Science, currently working at CVS Health in New York City. With a strong background in AI, Machine Learning, and Software Engineering, I focus on building scalable AI solutions and large language model applications. I hold an MS in Applied Data Science from the University of Southern California and have extensive experience developing and deploying AI systems that impact millions of users.
Python, SQL, C, C++, CUDA (Intermediate), OpenMP (Intermediate)
Analytics, Deep Learning, Machine Learning, A/B Testing, GCP Vertex AI, AWS, Docker, Git, Informatica ETL, OpenMP (Intermediate), Web Scraping/Automation, AI Research, Parallel Programming, NumPy, Pandas, scikit-learn, LangChain, NVIDIA DGX Server, Nsight Profiling, NVIDIA NeMo, LLMOps, RAG, VectorDBs, LLM Training and Inference techniques, Hugging Face, FastAPI, RESTful API, Responsible AI, Linux, MongoDB, Selenium, Hadoop HDFS, PySpark, Power BI/Tableau, LLMs, Generative AI, MapReduce, Azure, Google Cloud Platform, BigQuery, HiveQL, Horovod, ArcGIS, MLOps, Jenkins CI/CD, NLP, Forecasting, Unsupervised ML, Generative Models
Working on a high-performance CUDA + OpenMP hybrid vector addition project that combines CPU and GPU parallelism. The implementation achieves an 8.8x speedup over a single-CUDA-stream baseline by running 4 OpenMP threads with parallel CUDA streams to overlap memory transfers and balance the load.
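The partitioning idea behind this kind of hybrid design can be sketched in a few lines. The following is a CPU-only Python analogue, not the project's CUDA/OpenMP code: the function names `chunk_bounds` and `parallel_vector_add` are hypothetical, Python threads stand in for OpenMP threads, and the per-chunk NumPy addition stands in for the work that a CUDA version would presumably issue on each worker's own stream.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def chunk_bounds(n, n_workers):
    """Split range(n) into contiguous chunks whose sizes differ by at most 1."""
    base, extra = divmod(n, n_workers)
    bounds = []
    start = 0
    for i in range(n_workers):
        # The first `extra` workers take one additional element each.
        size = base + (1 if i < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def parallel_vector_add(a, b, n_workers=4):
    """Add two equal-length vectors, one contiguous chunk per worker thread."""
    out = np.empty_like(a)

    def work(bound):
        lo, hi = bound
        # Assumption: in a CUDA version, this is where a worker would copy its
        # chunk to the device, launch the kernel, and copy the result back,
        # all asynchronously on its own stream. Here it is a plain CPU add.
        np.add(a[lo:hi], b[lo:hi], out=out[lo:hi])

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Consume the iterator so any exception raised in a worker surfaces here.
        list(pool.map(work, chunk_bounds(len(a), n_workers)))
    return out
```

Because each worker writes to a disjoint slice of `out`, no locking is needed. The same property is what would let independent streams proceed without synchronizing until the end.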
Building a GPT-3 implementation from scratch to understand the transformer architecture and attention mechanisms. The project implements the core components of the GPT model, including multi-head attention and feed-forward networks, and assembles them into the complete transformer.
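For readers unfamiliar with the central component, here is a minimal NumPy sketch of causal multi-head self-attention as used in GPT-style decoders. It is an illustration under stated assumptions, not the project's code: the function name, the fused `w_qkv` projection, and the omission of biases, dropout, and layer normalization are all simplifications chosen for brevity.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def causal_multi_head_attention(x, w_qkv, w_out, n_heads):
    """x: (seq_len, d_model); w_qkv: (d_model, 3 * d_model); w_out: (d_model, d_model)."""
    seq_len, d_model = x.shape
    d_head = d_model // n_heads

    # One fused projection, then split into queries, keys, and values.
    q, k, v = np.split(x @ w_qkv, 3, axis=-1)

    def split_heads(t):
        # (seq_len, d_model) -> (n_heads, seq_len, d_head)
        return t.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)

    q, k, v = split_heads(q), split_heads(k), split_heads(v)

    # Scaled dot-product scores, shape (n_heads, seq_len, seq_len).
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)

    # Causal mask: position i may not attend to any position j > i.
    future = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)

    weights = softmax(scores, axis=-1)
    heads = weights @ v  # (n_heads, seq_len, d_head)

    # Concatenate the heads and apply the output projection.
    merged = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return merged @ w_out, weights
```

The function returns the attention weights alongside the output so the causal structure can be inspected: every row sums to 1, and every entry above the diagonal is zero.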
June 2024 - Present
February 2024 - May 2024
May 2023 - August 2023
January 2022 - July 2022
November 2020 - June 2021
August 2022 - May 2024
GPA: 3.7/4.0
Coursework: Machine Learning for Data Science and AI, Applications of Data Mining, Predictive Analytics, Fairness, Security and Privacy in AI
August 2018 - June 2022
GPA: 3.95/4.0
Coursework: Data Structures and Algorithms, Data Mining, Machine Learning, Distributed & High-Performance Computing, Cloud Computing
A comprehensive guide to implementing KV caches for efficient LLM inference, covering the fundamental concepts and practical code implementation.
Read Article
An in-depth exploration of LLM system architecture using a three-layer abstraction model for better understanding of large language model systems.
Read Article