Strong C++ systems skills, Python backend experience, and hands-on ML infrastructure expertise. Built high-performance storage engines, low-latency inference systems, and scalable ML training workflows.
Over the next six months, my goal is to build RedOpsAI, a SaaS platform that makes website security testing simple and effective. The platform will provide safe and transparent scans, showing live visual feedback of potential risks without causing any harm to the site.
Feb 2026 – Apr 2026 · C++17, STL, CMake, GoogleTest
C++17/20, Python, SQL, JavaScript
Linux, Concurrency, Multithreading, TCP/IP, SIMD/AVX2, OpenMP
Docker, Kafka, GitHub Actions, AWS (EC2), FastAPI
jQuery, Dataverse, OData, Power Automate, HTML/CSS
Supervised/Unsupervised Learning, Ensemble Methods, Feature Engineering, Model Evaluation
Transformers, CNNs, RNN/LSTM, Transfer Learning, Fine-Tuning, QLoRA
PyTorch, Transformers, QLoRA, vLLM, FAISS, LLM Inference
Nov 2025 – Mar 2026
Engineered a C++ GPT-2 inference engine with SIMD and parallelism. Architected a multi-engine process pool with KV-cache and LRU eviction, sustaining 14 concurrent sessions within 8GB/16GB RAM.
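The LRU eviction mentioned above can be illustrated with a minimal sketch. The names `SessionLru` and `KvBuffer` are hypothetical and not taken from the project; the sketch assumes one cached buffer per session id and a fixed session capacity, and is not the engine's actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for one session's cached attention keys/values.
using KvBuffer = std::vector<float>;

class SessionLru {
public:
    explicit SessionLru(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Returns the session's buffer and marks it most recently used,
    // or nullptr when the session is not cached.
    KvBuffer* get(const std::string& session_id) {
        auto it = index_.find(session_id);
        if (it == index_.end()) return nullptr;
        entries_.splice(entries_.begin(), entries_, it->second);
        return &it->second->second;
    }

    // Inserts or replaces a buffer; when the cache is full, the least
    // recently used session is evicted first.
    void put(const std::string& session_id, KvBuffer buffer) {
        if (auto it = index_.find(session_id); it != index_.end()) {
            it->second->second = std::move(buffer);
            entries_.splice(entries_.begin(), entries_, it->second);
            return;
        }
        if (entries_.size() == capacity_) {
            index_.erase(entries_.back().first);
            entries_.pop_back();
        }
        entries_.emplace_front(session_id, std::move(buffer));
        index_[session_id] = entries_.begin();
    }

    std::size_t size() const { return entries_.size(); }

private:
    using Entry = std::pair<std::string, KvBuffer>;
    std::size_t capacity_;
    std::list<Entry> entries_;  // front = most recently used
    std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};
```

The list keeps recency order and the hash map gives average O(1) access to a list node, so both lookup and eviction avoid scanning the cache.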
Nov 2025 – Feb 2026
Built a CPU-optimized C++ inference engine for a 51M-parameter GPT model. Reduced per-token latency from 91.87 ms to 35.57 ms by eliminating heap allocations, redesigning tensor layout for cache locality, and parallelizing matrix multiplications with OpenMP.
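A minimal sketch of the kind of OpenMP-parallel matrix multiplication described above. It assumes row-major matrices stored in flat, preallocated buffers; the function name `matmul` and its signature are illustrative, not the project's own. Compile with `-fopenmp` to enable the pragma; without it the loop simply runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// C (m x n) = A (m x k) * B (k x n), all row-major in flat buffers.
// The output buffer is supplied by the caller, so the hot path does
// no heap allocation.
void matmul(const std::vector<float>& a, const std::vector<float>& b,
            std::vector<float>& c, std::size_t m, std::size_t k,
            std::size_t n) {
    assert(a.size() == m * k && b.size() == k * n && c.size() == m * n);

    // Each iteration writes a distinct output row, so threads never
    // touch the same element and no synchronization is needed.
    #pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < static_cast<std::ptrdiff_t>(m); ++i) {
        const std::size_t row = static_cast<std::size_t>(i);
        float* c_row = c.data() + row * n;
        for (std::size_t j = 0; j < n; ++j) c_row[j] = 0.0f;

        // i-p-j loop order: the innermost loop walks one row of B and
        // one row of C contiguously, which is friendly to the cache.
        for (std::size_t p = 0; p < k; ++p) {
            const float a_ip = a[row * k + p];
            const float* b_row = b.data() + p * n;
            for (std::size_t j = 0; j < n; ++j) c_row[j] += a_ip * b_row[j];
        }
    }
}
```

Parallelizing over output rows is the simplest split; a tuned engine would likely also block the loops and use SIMD intrinsics, which this sketch leaves out.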
Aug 2025 – Jan 2026
Engineered a robust Order Matching Engine in C++20 using std::map for strict price-time priority and std::unordered_map for average O(1) order lookups. Validated with a comprehensive GoogleTest suite.
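The data-structure choice above can be sketched as follows. This is a simplified illustration under stated assumptions, not the engine's actual code: `Order`, `OrderBook`, `submit`, `cancel`, and `contains` are hypothetical names, prices are integer ticks, and only limit orders are handled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <iterator>
#include <list>
#include <map>
#include <unordered_map>

// Hypothetical order type for illustration only.
struct Order {
    std::uint64_t id;
    bool is_buy;
    std::int64_t price;  // integer ticks
    std::uint64_t qty;
};

class OrderBook {
    using Level = std::list<Order>;  // FIFO queue: time priority
    struct Locator { bool is_buy; std::int64_t price; Level::iterator pos; };

public:
    // Matches against the opposite side, rests any remainder, and
    // returns the quantity filled immediately.
    std::uint64_t submit(Order o) {
        const std::uint64_t filled =
            o.is_buy ? match(asks_, o, [](auto ask, auto lim) { return ask <= lim; })
                     : match(bids_, o, [](auto bid, auto lim) { return bid >= lim; });
        if (o.qty > 0) {
            Level& queue = o.is_buy ? bids_[o.price] : asks_[o.price];
            queue.push_back(o);
            index_[o.id] = Locator{o.is_buy, o.price, std::prev(queue.end())};
        }
        return filled;
    }

    // Average O(1) lookup by id, then removal from its price level.
    bool cancel(std::uint64_t id) {
        auto it = index_.find(id);
        if (it == index_.end()) return false;
        const Locator loc = it->second;
        if (loc.is_buy) erase_from(bids_, loc); else erase_from(asks_, loc);
        index_.erase(it);
        return true;
    }

    bool contains(std::uint64_t id) const { return index_.count(id) != 0; }

private:
    template <class Side, class Crosses>
    std::uint64_t match(Side& side, Order& in, Crosses crosses) {
        std::uint64_t filled = 0;
        while (in.qty > 0 && !side.empty()) {
            auto level = side.begin();  // best price first
            if (!crosses(level->first, in.price)) break;
            Level& queue = level->second;
            while (in.qty > 0 && !queue.empty()) {
                Order& resting = queue.front();  // oldest at this price
                const std::uint64_t traded = std::min(in.qty, resting.qty);
                in.qty -= traded;
                resting.qty -= traded;
                filled += traded;
                if (resting.qty == 0) {
                    index_.erase(resting.id);
                    queue.pop_front();
                }
            }
            if (queue.empty()) side.erase(level);
        }
        return filled;
    }

    template <class Side>
    static void erase_from(Side& side, const Locator& loc) {
        auto level = side.find(loc.price);
        level->second.erase(loc.pos);
        if (level->second.empty()) side.erase(level);
    }

    std::map<std::int64_t, Level, std::greater<>> bids_;  // highest first
    std::map<std::int64_t, Level> asks_;                  // lowest first
    std::unordered_map<std::uint64_t, Locator> index_;
};
```

The ordered maps keep the best price at `begin()`, the per-level list preserves arrival order, and the hash index stores a list iterator so a cancel does not need to scan a price level.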
Sep 2025 – Oct 2025
Architected a compact 10M-parameter GPT model from scratch in PyTorch. Implemented mixed-precision training and gradient accumulation, and reduced per-epoch data loading time by over 90%.
Built a full RAG pipeline (BM25 + FAISS), raising grounded accuracy from 42% to 71%. Optimized inference using vLLM.
Designed an event-driven pipeline for e-commerce data using asynchronous producers/consumers.
Implemented a QLoRA (4-bit) pipeline, reducing memory footprint by ~70% on a single T4 GPU.
Built an AI agent with LangChain that plans and generates multi-file applications, decreasing development cycle time by 30%.
Developed interpretable ML models that improved prediction accuracy by 20% for NBFC clients.
Trained a text classification model with TF-IDF on 100K+ comments, achieving 95% accuracy.
Designed a multi-agent framework that automates 70% of the market research workflow using specialized 'Researcher' and 'Writer' agents.
Forage
Debugged and refactored a 10k-line Python/SQL codebase. Added connection pooling and regression tests, reducing query failure rate by 40%.
Forage | 10/2025 – 11/2025
Designed a scalable Elastic Beanstalk architecture for a client experiencing growth. Translated technical architecture into cost-benefit analysis.