AI Engineer

Darshil Kapadia

10 years at IBM India delivering computer vision, NLP, and agentic AI for global enterprises. I design, train, and ship end-to-end AI systems — from custom model fine-tuning to cloud-scale production deployment.

About

I'm an AI Engineer based in India with a decade of experience at IBM, building production AI systems for global enterprise clients — PepsiCo, Bacardi, Dow Chemicals, JSW, FAA, NedBank, Iffco-Tokio, and more. My work covers the full stack: data pipelines, model training and fine-tuning, agentic system design, and cloud deployment on AWS and Azure. I hold an M.Tech from IIT Kharagpur and a B.Tech in Electronics & Communication Engineering.

Generative AI

LangGraph, LangChain, OpenAI Agents SDK, RAG, GraphRAG, RLHF, PEFT / LoRA / QLoRA, Fine-tuning, Multi-Agent Systems, Prompt Engineering, DSPy, Vector DB, Automated Prompt Generation

Decoder-only transformers trained on next-token prediction. State of the art for open-ended generation, reasoning, and instruction following.

OpenAI Models, Claude, LLaMA 3.3, LLaMA 3.1

Bidirectional transformers that produce rich contextual embeddings. Best suited for classification, NER, and semantic similarity.

BERT, RoBERTa, DistilBERT

Seq2seq architecture mapping input sequences to output sequences. Used for summarisation, translation, and question answering.

T5, BART, mT5

Generative models that learn to reverse a noise process. State of the art for high-fidelity image and media synthesis.

Stable Diffusion, DALL-E, Imagen

ML & Deep Learning

TensorFlow, PyTorch, Transfer Learning, Computer Vision, MLflow

Region-based and anchor-free detectors for localising and classifying objects in images. Applied in insurance claim assessment and industrial inspection.

YOLO, R-CNN, Faster R-CNN, Mask R-CNN

Hierarchical spatial feature extractors. Backbone of most computer vision pipelines for classification and representation learning.

CNN, ResNet, VGG, EfficientNet

Recurrent architectures that model temporal and sequential dependencies. Applied in OCR, time-series analysis, and sequence labelling.

RNN, LSTM, CRNN, GRU

Attention-based architecture that unified NLP and vision. Trained via self-supervised objectives before task-specific fine-tuning.

Self-Supervised Learning, Masked LM, Contrastive Learning, Multi-Head Attention, ViT, CLIP

Adversarial generator–discriminator framework for producing realistic synthetic data and high-quality image generation.

GAN, DCGAN, StyleGAN, CycleGAN

Message-passing networks that learn on graph-structured data — knowledge graphs, recommendation systems, and molecular modelling.

GCN, GraphSAGE, GAT, Graph Transformer

Policy optimisation through environment interaction. Underpins LLM alignment (RLHF) and sequential decision-making agents.

PPO, DQN, A3C

Compressing large teacher models into faster, smaller student models while preserving accuracy — critical for production deployment.

Knowledge Distillation, DistilBERT, TinyBERT, Quantisation

Classical algorithms for structured and tabular data. Fast to train, interpretable, and often the right tool before reaching for deep learning.

Supervised

XGBoost, LightGBM, Random Forest, AdaBoost, SVM, Decision Trees, k-NN, Logistic Regression

Unsupervised

PCA, K-Means, DBSCAN, t-SNE, UMAP, Isolation Forest

NLP

Hugging Face, Text Classification, NER, StarCoder, CodeLLaMA, Watson Speech-to-Text, Watson Visual Recognition

Languages

Python, C++, SQL, Gremlin, Cypher

Python Ecosystem

FastAPI, Pandas, scikit-learn, OpenCV, NetworkX, Bokeh

Cloud & Infrastructure

Docker, Kubernetes, Terraform, CI/CD, OpenTelemetry, WebSockets

Hands-on experience with core AWS services for data engineering, ML workloads, and application infrastructure.

EC2, S3, RDS, VPC, DynamoDB, SageMaker, Bedrock, Lambda, EKS, Fargate, IAM

Deep production experience across the Azure ecosystem — from ML pipelines and vector search to DevOps and real-time compute.

Azure ML, Azure Databricks, Azure AI Studio, Azure AI Search, CosmosDB, Azure Functions, Azure Form Recognizer, AKS, Azure DevOps

Databases

PostgreSQL, MongoDB, CosmosDB, Neo4j, Redis, MySQL, SQLite

Projects

No projects yet.

Experience

IBM India

Data Scientist / AI Engineer, Aug 2016 – Present
  • Built end-to-end computer vision, NLP, and agentic AI solutions for global enterprise clients including PepsiCo, Bacardi, Dow Chemicals, JSW, FAA, NedBank, and Iffco-Tokio.
  • Fine-tuned large language models using PEFT techniques (LoRA, QLoRA) and aligned models with RLHF; deployed RAG and GraphRAG pipelines in production.
  • Designed and deployed ML systems on AWS and Azure with Docker, Kubernetes, and full MLOps practices.

IIT Kharagpur

M.Tech in Telecommunication Systems Engineering, 2014 – 2016

Dharmsinh Desai University (DDIT)

B.Tech in Electronics & Communication Engineering, 2009 – 2013

Contact