AI Systems Engineer · Agentic AI · RAG · Production ML
MS Data Science @ ASU · GPA 4.0/4.0 · 4+ years SWE & ML · Tempe, AZ
Building AI systems that reason over messy, real-world knowledge, and the boring infrastructure that keeps them honest in production.
I'm pursuing an MS in Data Science, Analytics & Engineering at Arizona State University (GPA 4.0/4.0, Dec 2026), with 4+ years of experience shipping NLP pipelines, distributed systems, and analytics platforms across software engineering, data, and research roles. I care about the boring parts of AI work (evaluation, latency, drift, failure modes) as much as the model itself. Right now I'm focused on agentic AI, retrieval-augmented generation, and robust deep learning.
Currently:
| | Track | Focus |
|---|---|---|
| 🛠 | Building | Agentic RAG patterns: tool-use, query rewriting, and re-ranking |
| 🔬 | Researching | Frequency-domain adversarial robustness in Vision Transformers |
| 📖 | Exploring | LLM evaluation harnesses, long-context retrieval, MCP, structured outputs |
| 🎯 | Open to | AI/ML Engineering internships for Summer 2026 · CPT eligible |
Where I've shipped real systems with real users.
Apr 2023 – Dec 2024 · India
- Engineered a multilingual NLP pipeline supporting 114 languages using seq2seq Transformers on AWS (EC2, S3, Lambda) with Google/Azure speech APIs, lifting translation accuracy by 76% and cutting manual review costs.
- Shipped a reviewer web app with automated task allocation that reduced project cycle time by 41%, improved translation quality by 20%, and automated 400+ Amazon Polly voice narrations matched to character profiles across markets.
Sep 2022 – Mar 2023 · India
- Built churn-prediction models (logistic regression, AUC 0.84) on 50K+ customer records using Python and SQL; RFM clustering surfaced fee-driven attrition and informed a retention strategy that cut attrition by 50% in fee-sensitive cohorts within one quarter.
- Raised Net Promoter Score by 10 points via an analytics-driven strategy; automated survey reporting with zero-shot NLP classification and Power BI dashboards for leadership.
Dec 2020 – Jul 2022 · India
- Architected Celery/Redis distributed task queues processing 2M+ daily transactions, reducing pipeline latency by 40% and holding 99.9% SLA across peak campaigns handling 10× normal traffic.
- Built log-analytics dashboards and multithreaded Python services that lifted backend throughput by 35%; automated anomaly-detection alerts to prevent overload incidents.
A handful of projects that show how I think about systems: research-to-production, generative-to-classical.
FreqShield-ViT · Repo →
Frequency-domain adversarial defenses for Vision Transformers.
Stack: PyTorch · DeiT-Small · torch-dct · PyWavelets · SLURM
Investigation of feature-level frequency-domain regularization for adversarially-trained ViTs across four band-weighting configs and three frequency transforms (DCT, DFT, Haar wavelet). Documents a Siamese collapse failure mode and a threat-model-asymmetric robustness finding. Reproducible pipeline with depth-resolved spectral diagnostics, ablations, and patch-attack evaluation. Paper in draft.
Generative diffusion framework for blood-glucose forecasting.
Stack: PyTorch · Conditional Diffusion · Time-series
Conditional diffusion model generating privacy-preserving synthetic CGM data conditioned on meals, insulin, and physical activity. Achieved 18% lower RMSE than LSTM/CNN baselines on the OhioT1DM benchmark.
FinFusion · Repo →
Deep learning for S&P 500 return forecasting.
Stack: PyTorch Lightning · pytorch-forecasting · ARIMAX · LSTM
Benchmarked ARIMAX, LSTM, and Temporal Fusion Transformer across 450+ experiments spanning 11 phases. Identified gradient collapse when training the TFT on financial data; weekly resampling reached 59.1% directional accuracy across a 9-fold rolling evaluation (2016–2024).
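
For readers unfamiliar with the metric, here is a minimal sketch of directional accuracy: the share of periods where the forecast return and the realized return point the same way. The function name and the handling of zero returns are assumptions made for illustration, not necessarily what the FinFusion code does.

```python
from typing import Sequence


def _sign(x: float) -> int:
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def directional_accuracy(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Fraction of periods where predicted and realized returns share a sign."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(predicted) == 0:
        raise ValueError("need at least one observation")
    # Assumption: zero is treated as its own direction, so a zero return
    # only counts as a hit when the other series is also exactly zero.
    hits = sum(1 for p, a in zip(predicted, actual) if _sign(p) == _sign(a))
    return hits / len(predicted)


if __name__ == "__main__":
    # Hypothetical weekly returns, for illustration only.
    forecast = [0.01, -0.02, 0.03, -0.01]
    realized = [0.02, 0.01, 0.01, -0.03]
    print(directional_accuracy(forecast, realized))
```

In a rolling evaluation, a score like this would typically be computed per fold and then averaged, so a single favorable window cannot dominate the result.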
Biosignal-conditioned music generation on mobile.
Stack: CNN-LSTM · Emotion-conditioned Transformer · REMI · PPG
HRV features (SDNN, RMSSD, LF/HF ratio) extracted from smartphone-camera PPG, mood classified into Russell's valence-arousal space via CNN-LSTM, and personalized instrumental MIDI generated by an emotion-conditioned Transformer decoder.
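
As a rough illustration of the time-domain HRV features named above, the sketch below computes SDNN and RMSSD from a list of inter-beat (RR) intervals in milliseconds. It assumes the intervals have already been extracted from the PPG signal and cleaned of artifacts, and it uses the sample standard deviation for SDNN, which is one common convention. The LF/HF ratio is omitted because it needs a spectral estimate. Function names are hypothetical.

```python
import math
import statistics
from typing import Sequence


def sdnn(rr_ms: Sequence[float]) -> float:
    """Standard deviation of RR intervals (sample std, an assumed convention)."""
    if len(rr_ms) < 2:
        raise ValueError("need at least two RR intervals")
    return statistics.stdev(rr_ms)


def rmssd(rr_ms: Sequence[float]) -> float:
    """Root mean square of successive differences between RR intervals."""
    if len(rr_ms) < 2:
        raise ValueError("need at least two RR intervals")
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


if __name__ == "__main__":
    # Hypothetical RR intervals in milliseconds, for illustration only.
    rr = [800, 810, 790, 805, 795]
    print(f"SDNN:  {sdnn(rr):.2f} ms")
    print(f"RMSSD: {rmssd(rr):.2f} ms")
```

SDNN reflects overall variability across the recording, while RMSSD emphasizes beat-to-beat changes, which is why the two are usually reported together.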
Traitlytics · Repo →
Big-Five personality prediction from LinkedIn profile text.
Stack: BERT · RoBERTa · TF-IDF · FastAPI · Docker · AWS (EC2)
NLP pipeline predicting Big-Five personality traits from LinkedIn profile text using BERT and RoBERTa with TF-IDF features. Deployed batch and real-time REST endpoints on AWS.
BasketIQ · Repo →
Market basket analysis on 32.4M Instacart transactions.
Stack: Python · mlxtend (Apriori) · scikit-learn · Tableau
Mined Apriori association rules and segmented users into 5 RFM-based clusters via K-Means. Built an interactive dashboard to support targeted marketing and retention strategies.
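
To make the association-rule vocabulary concrete, here is a small sketch of the three metrics usually used to score a rule "antecedent → consequent": support, confidence, and lift. This is not the Apriori search itself (which prunes candidate itemsets by minimum support) and not the BasketIQ pipeline; the function name and the toy baskets are hypothetical.

```python
from typing import Dict, List, Set


def rule_metrics(
    transactions: List[Set[str]],
    antecedent: Set[str],
    consequent: Set[str],
) -> Dict[str, float]:
    """Score the rule antecedent -> consequent over a list of baskets."""
    n = len(transactions)
    # Count baskets containing each side of the rule, and both together.
    count_a = sum(1 for t in transactions if antecedent <= t)
    count_c = sum(1 for t in transactions if consequent <= t)
    count_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    if n == 0 or count_a == 0 or count_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    support = count_both / n            # P(A and C)
    confidence = count_both / count_a   # P(C | A)
    lift = confidence / (count_c / n)   # P(C | A) / P(C)
    return {"support": support, "confidence": confidence, "lift": lift}


if __name__ == "__main__":
    # Toy baskets, for illustration only.
    baskets = [
        {"bread", "butter"},
        {"bread", "milk"},
        {"bread", "butter", "milk"},
        {"milk"},
    ]
    print(rule_metrics(baskets, {"bread"}, {"butter"}))
```

A lift above 1 suggests the two itemsets appear together more often than independence would predict, which is the usual filter for rules worth showing on a dashboard.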
The tools and frameworks I actually reach for.
Whether you're shipping agentic systems, evaluating RAG honestly, or trying to make ML hold up
under real-world distribution shift, I'd love to talk. Internship, collaboration, or just a hard problem.
My inbox is open.
