Hi, I'm Wallace.

Building resilient AI solutions, robust full-stack applications, and high-performance data infrastructure.

About Me

My journey and professional philosophy.

From my undergraduate degree in India to my Master's at Carnegie Mellon University, my technical journey has been driven by building systems that scale and matter. At Walmart, I advanced from Data Engineer III to Senior Data Engineer, focusing on scalable architecture, LLM-powered AI agents such as the "My Investigator" chatbot, and high-performance data pipelines. My expertise bridges massive datasets and responsive, user-facing applications.

Experience

My professional track record, focused on impact and scale.

Senior Data Engineer

Walmart Inc.

July 2022 - Present

Reston, VA

  • Launched first production LLM solution (Clarity RX) featuring LangChain RAG for summarization and chat, reducing new GenAI app build time by ~60%.
  • Built and optimized large-scale streaming pipelines using Kafka, Apache Hudi, and BigQuery, cutting runtime by 55% for Legal NextGen and by 83% for Geocoding.
  • Developed Compliance Intelligence, a FastAPI platform combining NL-to-SQL and vector search, improving regulatory data retrieval speed by ~30%.
  • Pioneered Spark Streaming as a Service (on Kubernetes with ArgoCD), replacing Dataproc and saving ~$6.2K per pipeline per month.
  • Created full-stack & platform components, including FastAPI microservices and React UIs for real-time similarity results.

Machine Learning Intern

Cadence Design Systems

June 2021 - Aug 2021

San Jose, CA

  • Developed a DTLN audio noise-suppression model on the DNS Challenge dataset, achieving a mean PESQ score of 2.98.
  • Optimized the model, reducing its size by 81.05%, and quantized it to TFLite int8 format for efficient use on microprocessors.

Hosting Product Specialist

Endurance International Group

July 2019 - Dec 2020

Mumbai, India

  • Deployed and maintained Linux and Windows servers using automated infrastructure scripts.
  • Managed and supported diverse hosting ecosystems across VPS, Cloud, and Dedicated products.

Featured Projects

Showcasing architecture, ML models, and full-stack solutions.

AI & Full-Stack

Generative AI Chatbot Platform

An LLM app framework (Chat2Data) that uses LangChain and vector databases for natural-language interaction with enterprise data, with a React frontend and a FastAPI backend.

React, FastAPI, LangChain, VectorDB

Data Architecture

Real-time Streaming Engine

Streaming ETL pipelines that consume high-volume Kafka data, process it with Spark Streaming on Kubernetes, and load the results into BigQuery.

Kafka, Apache Hudi, BigQuery, Kubernetes

Machine Learning

Entity Resolution ML

Large-scale ML solution using XGBoost and FAISS that achieves over 90% accuracy on a dataset of 600M records and reduces false positives by 15%.

XGBoost, FAISS, Python

Tech Stack & Skills

The tools I use to build scalable systems.

Languages

Python
TypeScript
JavaScript
SQL
C++

Frameworks & UI

React
Vite
Tailwind CSS
FastAPI
Node.js

AI & Data

LangChain
PyTorch
Spark
Kafka
Hudi

Infrastructure

GCP (BigQuery)
Kubernetes
Docker
Linux

Get In Touch

Have a question or want to work together? Leave a message below.