emma

Case Study 02 · Demoed at AWS Startup Loft, 2025

PubMed AI Agent

Scalable data pipelines and structured data systems for AI-powered research workflows on top of the PubMed corpus.

Role: Data engineering & agent
Venue: AWS Startup Loft, SF
Stack: Python, RAG, vector store
Surface: Clinical-grade research
PubMed AI Agent demo at AWS Startup Loft, SF

Problem

Clinical researchers rely on PubMed, but its search interface rewards exact-term queries over question-shaped ones. The gap between “what a researcher asks” and “what the index returns” eats hours per week.

What I built

Scalable ingestion and annotation pipelines for the PubMed corpus, plus the structured data schemas a RAG agent could actually reason over. The agent answers natural-language clinical questions with inline citations back to the source papers.
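To make "structured data schemas" and "inline citations" concrete, here is a minimal sketch of how a passage record and a cited answer might be modeled. It is an illustration under assumptions, not the schema used in the project: the names `Passage` and `CitedAnswer`, their fields, and the numbering scheme are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Passage:
    """One retrievable chunk of a paper (hypothetical schema)."""
    pmid: str     # PubMed identifier of the source paper
    section: str  # e.g. "abstract" or "results"
    text: str


@dataclass
class CitedAnswer:
    """An answer in which every sentence points back to a source paper."""
    sentences: list[str] = field(default_factory=list)
    sources: list[Passage] = field(default_factory=list)

    def add(self, sentence: str, passage: Passage) -> None:
        # Reuse the citation number if this paper was already cited.
        pmids = [p.pmid for p in self.sources]
        if passage.pmid not in pmids:
            self.sources.append(passage)
            pmids.append(passage.pmid)
        marker = pmids.index(passage.pmid) + 1
        self.sentences.append(f"{sentence} [{marker}]")

    def render(self) -> str:
        # Answer text first, then a numbered source list.
        body = " ".join(self.sentences)
        refs = "\n".join(
            f"[{i}] PMID {p.pmid} ({p.section})"
            for i, p in enumerate(self.sources, start=1)
        )
        return f"{body}\n\n{refs}"
```

Keeping the PMID on every passage means a citation marker can always be resolved to a paper, whichever component produced the sentence.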

Demo

Shown live at the AWS Startup Loft in SF: I typed an open-ended clinical question, and the agent walked through retrieval, ranking, and synthesis, with sources visible at every step.
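The three stages named above can be sketched in a few lines. This is a toy under stated assumptions, not the demoed system: word-count vectors stand in for a real embedding model and vector store, `synthesize` is a placeholder for the language-model step, and every function name here is hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question: str, corpus: dict[str, str], k: int = 2):
    """Retrieval and ranking: score each document, drop zero matches, keep the top k."""
    q = embed(question)
    scored = [(cosine(q, embed(text)), doc_id) for doc_id, text in corpus.items()]
    scored = [s for s in scored if s[0] > 0]
    # Highest score first; ties broken by document id for determinism.
    return sorted(scored, key=lambda s: (-s[0], s[1]))[:k]


def synthesize(hits, corpus: dict[str, str]) -> list[str]:
    """Placeholder for the synthesis step: quote each hit with its source id."""
    return [f"{corpus[doc_id]} [{doc_id}]" for _, doc_id in hits]


if __name__ == "__main__":
    # Invented placeholder documents, not real PubMed content.
    corpus = {
        "doc-a": "Drug A lowers blood pressure in adults.",
        "doc-b": "Drug B has no effect on blood pressure.",
        "doc-c": "Sleep duration and memory in students.",
    }
    hits = retrieve("Does drug A lower blood pressure?", corpus)
    for line in synthesize(hits, corpus):
        print(line)
```

Returning the score alongside each document id is what lets a demo show its sources at every step: the ranked list can be displayed before any synthesis happens.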