Flagship Agentic AI Builds

Agentic AI, up close.

Each agentic system solves a non-trivial coordination problem — multi-agent orchestration, adversarial reasoning, cross-source signal fusion, human-in-the-loop gating, persistent memory loops. Built solo. Production-deployed. Each a complete proof point.

// 01 — strategic brief

★ Flagship · Product Brief v0.6

Conductor PM

A pre-build feasibility simulator for AI agent products. A product manager describes a proposed agent in 8 questions. Twenty minutes later, they walk into the kickoff with a confidence score and the three risks most likely to kill the project — each with a specific mitigation move.

The pattern today is to start building an agent on a hunch and discover three weeks in that the eval signal is too noisy to ship. The wasted sprint costs $80K–$200K loaded, plus credibility. Gartner forecasts that more than 40% of agentic AI projects will be cancelled by the end of 2027. Conductor PM attacks the decision that precedes all of that: whether to build at all. A scan of fifteen agent-eval platforms (LangSmith, Braintrust, Arize, Maxim, and Coval among them) confirms the wedge: every one scores a built agent; none scores a described one. Three probes run in parallel: eval-signal noise, tool-call composability, and failure-mode tolerance. Output: a 0–100 score, three ranked risks, and a prescriptive GO / PAUSE / KILL call.
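A minimal sketch of how the three parallel probes could roll up into a score, ranked risks, and a call. The probe stubs, the equal weighting, the "weakest probe first" ranking, and the 70 / 40 cut-offs are all assumptions for illustration; the text above only fixes the three probe names and the shape of the output.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class ProbeResult:
    name: str
    score: int  # 0-100, higher means more feasible
    risk: str   # top risk surfaced by this probe


# Hypothetical stubs: the real probes would call an LLM with the PM's answers.
def probe_eval_noise(brief: dict) -> ProbeResult:
    return ProbeResult("eval-signal noise", 45, "Eval labels too noisy to gate a release")


def probe_tool_composability(brief: dict) -> ProbeResult:
    return ProbeResult("tool-call composability", 80, "Two tools return incompatible schemas")


def probe_failure_tolerance(brief: dict) -> ProbeResult:
    return ProbeResult("failure-mode tolerance", 70, "No fallback when the agent stalls")


PROBES = (probe_eval_noise, probe_tool_composability, probe_failure_tolerance)


def simulate(brief: dict) -> tuple[int, list[str], str]:
    # Run the three probes concurrently; map() preserves input order.
    with ThreadPoolExecutor(max_workers=len(PROBES)) as pool:
        results = list(pool.map(lambda probe: probe(brief), PROBES))
    # Equal-weight average of the probe scores (assumed aggregation).
    score = round(sum(r.score for r in results) / len(results))
    # Rank risks weakest probe first (assumed ranking rule).
    risks = [r.risk for r in sorted(results, key=lambda r: r.score)]
    # Assumed cut-offs: 70+ is GO, 40-69 is PAUSE, below 40 is KILL.
    verdict = "GO" if score >= 70 else "PAUSE" if score >= 40 else "KILL"
    return score, risks, verdict
```

Calling `simulate({"agent": "support triage"})` with these stubbed values averages 45, 80, and 70 into a single score and lists the eval-noise risk first, since that probe scored lowest.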

The differentiator: A PM-native buyer at a different moment than the eval incumbents. They sell to engineers after the build. Conductor PM sells to product leaders before the spend — and tells them what their team would otherwise learn in three weeks.

Stack

Next.js · Vercel · Supabase · Claude API

Models

Claude Sonnet 4.6 (the three probes) · Haiku 4.5 (synthetic case generation). Sonnet does the judgment-grade reasoning; Haiku does the dirt-cheap synthesis. Cost: ~$0.40–$0.80 per simulation.

What's different

PM-native buyer · before the build · prescriptive call (GO / PAUSE / KILL) instead of softened confidence bands

Next.js · Vercel · Supabase · Claude Sonnet 4.6 · Claude Haiku 4.5 · Apache 2.0 OSS

// 02 — working build

14-day build · in flight

Conductor

Most "AI for marketing" tools generate content. Conductor manages the work. It plugs into Linear — the tool engineering teams already trust to track tasks — and turns it into a control panel for AI agents that do the actual job: drafting campaigns, refreshing copy, running competitive scans — gated by human approval at every step.

The breakthrough is the control panel. Engineering teams already use issue trackers like Linear to assign work, track progress, and approve completion. Conductor uses that same workflow — but the "worker" is an AI agent, and the "manager" is the human in charge. Each ticket becomes an autonomous task: research, draft, refine, hand back for review. Built for the work most companies aren't automating well yet — marketing ops, RevOps, lifecycle, customer experience. Open-source, runs on Claude, 14-day build.

The differentiator: Conductor is built on top of OpenAI's Symphony pattern — originally designed for engineering workflows — and extends that same approval-gated control panel into marketing, RevOps, and lifecycle teams.

Stack

Claude Opus 4.7 · Haiku 4.5 · Managed Agents · Linear API · Vercel · Braintrust

Models

Opus 4.7 for strategic reasoning · Haiku 4.5 for execution — the Anthropic Advisor pattern. Opus is overkill for fetching a Linear ticket; Haiku is underpowered for synthesis. Routing keeps cost low without quality loss.

How it works

Multi-agent workflow with human-in-the-loop approval gates

What's different

Built for marketing & RevOps teams, not engineering teams — a space no competitor occupies

TypeScript · Next.js · Anthropic SDK · Managed Agents · Linear MCP · Braintrust · Vercel

// 03

Live prototype

Throughline

Reconciliation for agentic payments. As autonomous agents transact across cards, stablecoins, and ACH, the reconciliation surface fragments. Throughline catches the exceptions a finance team can't afford to miss — amount mismatches, duplicate settlements, over-threshold authorizations — across every rail.

Built on direct payments expertise — $3B+ in embedded payments volume at Planet DDS, plus payment-orchestration architecture experience. The premise: when AI agents start moving money on their own, the gaps that humans used to catch in spreadsheet reconciliations become structural. Throughline sits post-settlement and detects three exception types — amount mismatch, over-threshold authorization, duplicate settlement — across card and stablecoin rails uniformly. Built as a portfolio piece to demonstrate domain expertise meeting agentic commerce.
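A minimal sketch of the three exception checks, assuming records from every rail have already been normalized into one shape with a shared `txn_id`. That normalization is the hard part and is not shown. The `Settlement` fields, the exception labels, and the `auth_limit` parameter are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Settlement:
    txn_id: str
    rail: str            # e.g. "card" or "stablecoin"
    authorized: Decimal  # amount the agent was authorized to spend
    settled: Decimal     # amount that actually settled


def find_exceptions(records: list[Settlement], auth_limit: Decimal) -> list[tuple[str, str]]:
    """Return (txn_id, exception_type) pairs in record order."""
    seen: set[str] = set()
    exceptions: list[tuple[str, str]] = []
    for rec in records:
        # A txn_id that already settled once is flagged as a duplicate.
        if rec.txn_id in seen:
            exceptions.append((rec.txn_id, "duplicate_settlement"))
            continue
        seen.add(rec.txn_id)
        if rec.settled != rec.authorized:
            exceptions.append((rec.txn_id, "amount_mismatch"))
        if rec.authorized > auth_limit:
            exceptions.append((rec.txn_id, "over_threshold_authorization"))
    return exceptions
```

`Decimal` is used instead of `float` so that amounts compare exactly, which matters once a mismatch of a single cent counts as an exception.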

The differentiator: Most "AI for payments" pitches generate marketing copy. Throughline solves the actual operations problem agentic commerce creates — reconciliation surface fragmentation as agents move money across rails — using the same finance-grade rigor incumbents apply to human-initiated transactions.

Stack

Next.js · Tailwind · Supabase · Claude API · light-theme editorial UI (Linear / Mercury register)

Models

Claude Sonnet 4.6 for exception reasoning across rails (the hard work is matching semantically equivalent transactions whose shapes differ by processor) · Haiku 4.5 for fast filtering of the matched majority before the expensive call

How it works

Post-settlement record-by-record review · KPI tiles for trust at a glance · drill-down evidence drawer for each exception

What's different

Built on actual payments domain expertise · finance-grade integrity discipline · agentic-commerce-native, not retrofit

Next.js · Tailwind · Supabase · Claude Sonnet 4.6 · Claude Haiku 4.5 · Vercel

// 04

Running daily at 6 AM CT

ProdAgentCo

Autonomous multi-agent product pipeline that ideates, debates, plans, and builds a new product daily, with my involvement limited to three approval gates.

Daily 6 AM autonomous runs via Mac LaunchAgent: Discovery → CEO Debate → Gate 1 → 7 Planning agents → Gate 2 → Build → Gate 3. Three HITL approvals through Telegram. CEO Debate Layer surfaces risk through opposed PM/CTO/CFO/Marketing/Legal agents before the synthesizer renders GO / NO-GO / NEEDS_REVIEW.
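The stage order above can be sketched as a simple gated loop. This is an illustration only: `run_stage` and `approve` are hypothetical callbacks standing in for the agent crews and the Telegram approval step, and the "halt on rejection" behavior is an assumption.

```python
from typing import Callable

# Stage order taken from the description; gates are the human approval points.
STAGES = ["discovery", "ceo_debate", "gate_1", "planning", "gate_2", "build", "gate_3"]


def run_pipeline(
    run_stage: Callable[[str], None],
    approve: Callable[[str], bool],
) -> tuple[list[str], str]:
    """Run stages in order, stopping at the first gate a human rejects."""
    completed: list[str] = []
    for stage in STAGES:
        if stage.startswith("gate_"):
            # Hypothetical approval hook (for example, a Telegram prompt).
            if not approve(stage):
                return completed, f"halted_at_{stage}"
        else:
            run_stage(stage)
        completed.append(stage)
    return completed, "done"
```

Keeping the gates in the same list as the work stages makes the approval points visible in one place, so adding or moving a gate is a one-line change.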

Models

Claude Sonnet 4.6 for the CEO Debate and reasoning roles (PM / CTO / CFO / Marketing / Legal agents) · Groq + Haiku 4.5 for the mechanical worker agents — reserves the expensive reasoning for what actually needs judgment. Major cost reduction without quality loss.

CrewAI · Anthropic API · LangSmith · Telegram HITL · Next.js / Vercel · Mac LaunchAgent

GitHub →

// 05

Session 5 · 232/232 tests passing

Tribunal

Agentic stock-debate orchestrator. Bull / Bear / Contrarian agents reason adversarially over structured evidence bundles, daily.

Three Sonnet agents on Claude Code Agent Teams, restricted to messaging primitives (no tools, no web), debating over a SignalBundle assembled from Databento market state, SEC EDGAR filings, Alpha Vantage news, and Finnhub catalysts. Completeness gate: bundles below 90% complete don't go to debate. The first live run validated the architecture: Bear caught Bull on incorrect base-rate arithmetic, and Bull conceded. The disagreement is the product.
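A minimal sketch of the completeness gate. How completeness is actually measured is not stated, so this assumes the simplest possible rule: the fraction of the four source slots that are populated. The field names are hypothetical, and the project itself uses Pydantic v2 rather than the standard-library dataclass shown here.

```python
from dataclasses import dataclass, fields
from typing import Optional

COMPLETENESS_THRESHOLD = 0.90  # from the text: bundles below 90% skip the debate


@dataclass
class SignalBundle:
    ticker: str
    market_state: Optional[dict] = None  # Databento
    filings: Optional[list] = None       # SEC EDGAR
    news: Optional[list] = None          # Alpha Vantage
    catalysts: Optional[list] = None     # Finnhub

    def completeness(self) -> float:
        # Count every field except the ticker as one evidence slot.
        slots = [getattr(self, f.name) for f in fields(self) if f.name != "ticker"]
        filled = sum(1 for slot in slots if slot)  # None and empty both count as missing
        return filled / len(slots)


def ready_for_debate(bundle: SignalBundle) -> bool:
    return bundle.completeness() >= COMPLETENESS_THRESHOLD
```

With only four slots, this coarse rule means any missing source blocks the debate; a finer-grained measure (per-field coverage inside each source) would make the 90% threshold more meaningful.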

Models

Claude Sonnet 4.6 across all three agents (Bull, Bear, Contrarian). Adversarial reasoning needs sustained chain-of-thought; routing any of the three to Haiku would collapse debate quality, and the disagreement-as-signal would vanish.

Python 3.11 · Pydantic v2 · FastAPI · Claude Agent Teams · Supabase · Next.js 14 · Tailwind

// 06

Round 6 · 14 tests green · 5 ADRs

Convergence Scanner

Three-signal convergence detector — fuses derivatives flow, Polymarket odds, and news-silence windows to surface pre-announcement signals.

Bloomberg fuses this for $24K/year. Nothing under $5K does. Convergence Scanner watches for anomalous order flow in a macro-sensitive instrument, a concurrent Polymarket odds shift on a related event, and silence on the news wire — and fires a Telegram alert with a Claude-generated hypothesis when all three align. Not a front-running tool; a tailing tool on public order flow you can see.

Models

Claude Haiku 4.5 watches the three signals constantly — cheap to monitor 24/7. Only when all three converge does the system call Sonnet 4.6 for hypothesis generation — the expensive model fires once per real signal, not once per tick.
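A minimal sketch of the "cheap watcher, expensive escalation" routing described above. The 15-minute alignment window, the `SignalState` fields, and the `escalate` callback (standing in for the Sonnet hypothesis call) are assumptions for illustration; the signal detectors themselves are not shown.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SignalState:
    flow_anomaly_at: Optional[float] = None  # epoch seconds of last order-flow anomaly
    odds_shift_at: Optional[float] = None    # epoch seconds of last Polymarket odds shift
    news_silent: bool = False                # True while the news wire is quiet


def _recent(ts: Optional[float], now: float, window_s: float) -> bool:
    return ts is not None and 0 <= now - ts <= window_s


def converged(state: SignalState, now: float, window_s: float = 900.0) -> bool:
    """All three signals must line up inside the same window."""
    return (
        _recent(state.flow_anomaly_at, now, window_s)
        and _recent(state.odds_shift_at, now, window_s)
        and state.news_silent
    )


def on_tick(state: SignalState, now: float, escalate: Callable[[SignalState], str]) -> Optional[str]:
    # Cheap check on every tick; the expensive call fires only on convergence.
    if converged(state, now):
        return escalate(state)
    return None
```

Because `escalate` is only reached when `converged` returns true, the expensive model is invoked once per aligned signal rather than once per tick.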

Python / CrewAI · Claude Sonnet 4.6 · Claude Haiku 4.5 · Databento · Polymarket CLOB · Supabase · Telegram HITL

// 07

Live · v2-session8

Job Search Agent v2

Full materials pipeline on top of Claude Managed Agents — and the same system I use to land the interviews this portfolio supports.

V1 used a Managed Agent to search jobs (slow, fragile, 9 production bugs). V2 collapses that layer to SerpAPI + Haiku scoring (5–8 sec end to end) and reserves Managed Agents for what actually needs reasoning — a 5-tab materials pipeline: Resume Tailor, LinkedIn Outreach, Cover Letter, Company Intel (live web search), Interview Prep. A Grader agent writes to Turso memory; future runs read those signals back into prompts. Adaptive without overcorrection.

Models

Claude Haiku 4.5 for the fast job-fit scoring pass (5–8 sec end-to-end) · Claude Managed Agents (Sonnet-class) for the materials pipeline where reasoning depth matters. Earlier versions used Sonnet on every job — the cost-per-scan dropped 80% with no fit-quality loss after the switch.

Next.js 16 · TypeScript · Managed Agents · SerpAPI · Turso · shadcn/ui · Vercel

Live demo → GitHub →

// 08

Working prototype

Autonomous Agent Wallet — AP2

Working prototype of Google's Agent Payments Protocol — a coordinated multi-agent commerce flow with the full mandate chain end to end.

User intent ($700, Palm Springs, first weekend of November) → prime agent negotiates simultaneously with airline + hotel sub-agents → dual cryptographically-linked Cart Mandates execute in one atomic transaction → flight + hotel receipts written together or rolled back together. A 4-agent topology (Gemini Pro prime + three Flash sub-agents) at 64% cost reduction vs all-Pro without quality loss. The coordinated_booking_id is the proof both receipts came from one Cart Mandate.
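A minimal sketch of the "written together or rolled back together" step, using SQLite since it is in the project's stack. It covers only the atomic receipt write; mandate signing and the hash chain are not shown. The table layout, the `book_trip` helper, and the budget check are assumptions for illustration, with only `coordinated_booking_id` taken from the text.

```python
import sqlite3
import uuid


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE receipts ("
        " coordinated_booking_id TEXT NOT NULL,"
        " kind TEXT NOT NULL,"
        " amount_cents INTEGER NOT NULL CHECK (amount_cents > 0))"
    )
    return conn


def book_trip(conn: sqlite3.Connection, flight_cents: int, hotel_cents: int, budget_cents: int) -> str:
    """Write both receipts under one shared id, or write neither."""
    if flight_cents + hotel_cents > budget_cents:
        raise ValueError("combined cart exceeds the user's budget")
    booking_id = str(uuid.uuid4())
    # The connection context manager commits on success and rolls back
    # the open transaction if any statement inside it raises.
    with conn:
        conn.execute(
            "INSERT INTO receipts VALUES (?, 'flight', ?)", (booking_id, flight_cents)
        )
        conn.execute(
            "INSERT INTO receipts VALUES (?, 'hotel', ?)", (booking_id, hotel_cents)
        )
    return booking_id
```

If the hotel insert fails (here, by violating the `CHECK` constraint), the flight receipt written a moment earlier is rolled back with it, so no booking id ever points at half a trip.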

Models

Gemini 2.5 Pro orchestrates the booking flow (mandate generation, atomic commit logic) · Gemini 2.5 Flash handles each sub-agent (flight search, hotel search). 64% cost reduction vs all-Pro topology, no quality loss. Built on Google's models because AP2 is a Google-led protocol — demonstrates the pattern in its native habitat.

Python · Gemini 2.5 Pro · Gemini 2.5 Flash · AP2 Protocol · FastAPI · SQLite · Hash chain

// 09

Running daily at 7 AM ET

EDGAR 8-K Scanner

7 AM ET pre-market scanner that turns the SEC EDGAR firehose into a ranked, Claude-scored Telegram alert before the open.

Pulls the prior ~18 hours of 8-K filings, scores them by Item type (1.01 Material Agreement, 2.01 Acquisition, 2.02 Earnings, 5.02 Officer changes) and catalyst keyword overlap (FDA, DOD, hyperscaler counterparties, rare earths, design wins, scale terms like "billion"), then posts a ranked alert via Telegram with the ticker, the excerpt, and a directional bullish/bearish call. Runs as a Mac LaunchAgent alongside ProdAgentCo — same operational pattern, different signal.
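A minimal sketch of the rule-based part of the scoring, before any model call. The Item numbers and keyword themes come from the description above; the numeric weights, the regex patterns, and the "one point per keyword" rule are assumptions for illustration, and "hyperscaler counterparties" is simplified to the literal word.

```python
import re

# Hypothetical weights per 8-K Item type; the text names the items but not the weights.
ITEM_WEIGHTS = {"1.01": 3, "2.01": 4, "2.02": 2, "5.02": 1}

# Word-bounded patterns so that "dod" does not match inside longer words.
CATALYST_PATTERNS = [
    re.compile(p)
    for p in (
        r"\bfda\b",
        r"\bdod\b",
        r"\bhyperscalers?\b",
        r"\brare earths?\b",
        r"\bdesign wins?\b",
        r"\bbillion\b",
    )
]


def score_filing(items: list[str], text: str) -> int:
    """Item-type weight plus one point per distinct catalyst keyword found."""
    item_score = sum(ITEM_WEIGHTS.get(item, 0) for item in items)
    lowered = text.lower()
    keyword_hits = sum(1 for pattern in CATALYST_PATTERNS if pattern.search(lowered))
    return item_score + keyword_hits


def rank(filings: list[dict]) -> list[dict]:
    # Highest score first; sorted() is stable, so ties keep their input order.
    return sorted(filings, key=lambda f: score_filing(f["items"], f["text"]), reverse=True)
```

A deterministic pre-score like this keeps the ranking explainable; the model's job can then be limited to the excerpt and the directional call.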

Models

Claude Haiku 4.5 only. Structured 8-K filing scoring at ~$0.001 per scan — Sonnet would be overkill for matching Item types and catalyst keywords against a known taxonomy. Runs ~100 filings daily; full month of operation costs less than a coffee.

Python · SEC EDGAR Atom · Claude Haiku 4.5 · Telegram Bot API · Mac LaunchAgent

The Full Portfolio

Everything else I've shipped.

Every project here is deployed, tested, and reachable. Each one started as a question — "could a PM build this in a weekend?" — and ended with a live URL.

// 10

Consulting Diagnostic

AI Readiness Scorecard

Vertical-branched 7-dimension assessment for executive prospects. Generates personalized readiness reports with engagement tier recommendation, preliminary agent strategy, and phased project roadmap. Doubles as ProdAgentCo's proposal-generator.

Next.js · Supabase · Tailwind · Vercel
Claude Sonnet 4.6 — personalized narrative reports need real reasoning; Haiku was too flat for executive output

// 11

Consumer · Personal

The Morning Brief

Daily AI news dashboard built primarily for my 81-year-old father. RSS-fed for cost, Claude Haiku for synthesis, on-demand loading. Includes a dedicated Stillwater Ponies high-school sports tab as the highest-priority feed. Runs at ~$2/month.

React TS · Express · RSS · Vercel
Claude Haiku 4.5 — light news synthesis on a $2/month budget; reasoning depth would be wasted here

// 12

Healthtech

Clinical Notes Generator

Ambient clinical documentation app. Browser MediaRecorder streams to Deepgram nova-2-medical for real-time transcription; Claude generates structured SOAP notes from the transcript. First production AI build — and a clear domain-to-AI bridge for healthtech roles.

Next.js · Deepgram nova-2-medical · Vercel
Claude Sonnet 4.6 for SOAP note structuring — Haiku missed clinical nuance in early tests

// 13

Markets · Agentic

Whale Scanner

Real-time Polymarket whale-activity dashboard. WebSocket connection to the CLOB with 30-sec REST gap-filler, 3-layer dedup, NegRisk detection, and a 4-panel UI (live feed · top markets · volume spikes · whale wallets). Built and deployed in a single session.

Python FastAPI · Next.js 14 · SQLite · Telegram · Railway
No LLM in the hot path — pure WebSocket trade detection. Claude Haiku 4.5 only for alert summaries when fired

// 14

Markets · Sports Analytics

Edge Scanner

3-agent prediction chain for NBA, NHL, and NCAAM — model probability, edge gates, fractional Kelly sizing, Brier-score grading. v3.0 caught a 5-play NHL slate with +10.2% top edge. Originally a Replit build; later wired into a MiroFish multi-agent simulation seed pipeline.

Flask · SQLite · Replit
Claude Sonnet 4.6 for the 3-agent prediction chain — adversarial reasoning matters for edge calls
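For readers unfamiliar with the terms, here is a minimal sketch of an edge gate, fractional Kelly sizing, and Brier-score grading using the standard formulas. The definition of edge as model probability minus implied probability, the 3% minimum edge, and the quarter-Kelly fraction are assumptions; the sketch also ignores bookmaker vig.

```python
def implied_probability(decimal_odds: float) -> float:
    """Break-even win probability for decimal odds (vig not removed)."""
    return 1.0 / decimal_odds


def edge(model_prob: float, decimal_odds: float) -> float:
    # Assumed definition: model probability minus market-implied probability.
    return model_prob - implied_probability(decimal_odds)


def fractional_kelly(
    model_prob: float,
    decimal_odds: float,
    fraction: float = 0.25,
    min_edge: float = 0.03,
) -> float:
    """Bankroll fraction to stake; 0.0 when the play fails the edge gate."""
    if edge(model_prob, decimal_odds) < min_edge:
        return 0.0
    b = decimal_odds - 1.0                       # net payout per unit staked
    full_kelly = (b * model_prob - (1.0 - model_prob)) / b
    return max(0.0, full_kelly * fraction)


def brier_score(forecasts: list[float], outcomes: list[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)
```

For example, a 55% model probability at decimal odds of 2.0 has a 5-point edge and a full-Kelly stake of 10% of bankroll, which quarter-Kelly scales down to 2.5%. A lower Brier score means better-calibrated forecasts.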

// 15

Consumer · Direct Bookings

Johnson Lake House

A book-direct vacation rental site at johnsonlake-home.com — full SEO meta stack, Open Graph cards, tabbed gallery with lightbox, embedded map, staging branch workflow on Vercel. Optimized to win hyper-local searches the big OTAs don't bother indexing for.

HTML · CSS · Vanilla JS · Vercel

// 16

Markets · Modeling

MLB Signal Engine

PRD v1.2 baseball picks model — MLB Stats API, pybaseball, The Odds API, and Open-Meteo for weather signals. Adaptive feedback loop logs outcomes and adjusts signal weights, with locked-signal rules to prevent overcorrection on noise.

Python · MLB Stats API · pybaseball · Open-Meteo
Claude Sonnet 4.6 for signal-weight reasoning · Haiku for fast feature extraction across game logs

Resume

The complete picture.

15+ years of B2B SaaS product leadership across fintech, healthtech, and dental practice management — owning $25M+ portfolios, shipping AI in production, and now building agentic systems end-to-end. Click through, or download the full Word doc.

Download Word doc ↓

Jeremy Serfling

Principal Product Manager · Director of Product

B2B SaaS · Health Tech · Fintech · AI Product Builder

Minneapolis–St. Paul Area, 55125 · 949-885-6498 · Serf439@gmail.com · linkedin.com/in/jeremyserfling

Professional Summary

Product leader with 15+ years scaling B2B SaaS platforms across fintech and health tech — owning $25M+ portfolios and leading high-performing PM teams from strategy through execution. Proven track record shipping AI-powered products in production, with hands-on fluency across LLM pipelines, multi-agent orchestration, and full-stack AI product delivery.

Select Accomplishments

AI Product Builder — Independently designed and shipped production-grade agentic AI applications end-to-end, including an autonomous multi-agent product pipeline (ProdAgentCo) that executes market discovery, strategic planning, and build execution without human intervention — demonstrating hands-on fluency across LLM pipelines, multi-agent orchestration, Claude API, and full-stack AI delivery.
Outcome-Driven Product Management — Built and led a 6-person product management team with clear portfolio operating priorities, decision frameworks anchored in value proposition, and execution aligned to customer impact, operational efficiency, and business outcomes.
Strategic Partnerships — Identified, evaluated, and onboarded strategic technology partners across payments and RCM, driving partner-led capability expansion that generated $5M+ in product revenue while reducing build costs and accelerating time-to-market.
Product Metrics — Defined and operationalized product OKRs and KPIs, including time to activation, adoption, revenue, and client efficiency metrics, that connected team execution to measurable business outcomes.
Product Innovation — Co-authored two U.S. technology patents: a contextual office management system leveraging NLP and semantic analysis, and a device repair application using augmented reality and image recognition.

Relevant Skills

AI & Development Tools

Claude Code, Claude Managed Agents, OpenClaw, MCP, CrewAI, Next.js, TypeScript, LLM Integration, Azure OpenAI, ChatGPT, Replit, Vercel, Figma

Strategy & Growth

Product Vision & Strategy, Product-Led Growth, Roadmap Prioritization, Revenue & Monetization, Business Case Development, B2B2C SaaS

Leadership & Execution

Cross-Functional Leadership, Team Building & Development, Stakeholder Management, Go-to-Market Execution, Strategic Partnerships

Product & Data

Product Discovery & Validation, OKRs & KPIs, Data Analytics & Insights, UX Optimization, AI-Assisted Product Development, Tableau, Power BI, SQL

Domain Expertise

Connected Health Platforms, EMR / Clinical Workflow, Revenue Cycle Management (RCM), Embedded Payments, SaaS Platform Management, API Strategy, Regulated Industries

Methodologies & Frameworks

Agile, SAFe, Lean Six Sigma, Design Thinking, Full SDLC Ownership

Product Management Platforms

Confluence, Jira, Aha!, Pendo

Professional Experience

Planet DDS

Orange County, CA

Sr. Product Manager, Revenue Cycle Automation & Embedded Payments

01/2023 – 10/2025

Leading dental practice management SaaS platform serving 13,000+ practices, with $100M+ ARR

Owned product vision, strategy, and end-to-end delivery for a large-scale, multi-tenant SaaS platform serving thousands of customers across regulated fintech and healthcare verticals — directly responsible for $10M ARR across revenue cycle products.
Conceived and launched AI-powered Insurance Verification by engineering a multi-source data ingestion pipeline paired with an LLM reasoning layer across 300+ payers — cutting processing time by 60%, eliminating thousands of hours of weekly manual work, and generating $2M in new ARR.
Introduced AI-powered product development workflows across discovery, prototyping, and artifact creation — compressing planning and design cycles by 40% and enabling the team to move from concept to build-ready in a fraction of the time, directly increasing throughput without adding headcount.
Defined and executed automation and growth initiatives across the full product lifecycle — from problem discovery through delivery — contributing to 20% YoY revenue growth and a 25% EBITDA improvement.
Built and managed outcome-driven roadmaps tied to activation, adoption, and monetization goals, backed by rigorous business cases, customer journey mapping, and prioritized requirements aligned to revenue and retention targets.
Led product for SaaS-embedded payments, driving $3B+ in payment volume and $5M ARR through improved onboarding, time-to-first-transaction, feature attach rates, and scalable payment infrastructure.
Streamlined onboarding and operational workflows through automation and UX optimization, delivering 20% efficiency gains across implementation and customer success teams.

Equifax

Atlanta, GA

Director Product Management, Fintech – Digital Lending Experience

08/2016 – 03/2022

Leading global data and analytics company delivering credit intelligence, risk insights, and digital lending solutions

Built and led a 6-person product management team, owning portfolio strategy, coaching, and career development across a $25M+ multi-product portfolio.
Partnered with executive leadership to shape portfolio strategy, investment priorities, and operating plans, translating business goals into measurable outcomes.
Defined product vision, strategy, and roadmaps across a multi-product portfolio, translating business priorities into executable plans that drove measurable activation, adoption, and growth.
Owned roadmap prioritization for a data analytics and insights portfolio, driving 25% YoY revenue growth and adoption by top-tier financial institutions.
Championed a first-of-its-kind enhanced mortgage credit risk product leveraging differentiated alternative data sources — expanding credit access for underserved borrowers while driving double-digit portfolio growth and adoption by top-tier financial institutions.

Closing Corp (acquired by CoreLogic)

San Diego, CA

Director Product Management, Fintech – Data Insights & Workflow

06/2015 – 08/2016

Leading provider of real estate closing cost data and analytics, powering fee transparency and workflow efficiency

Launched SmartFee 2.0 with supporting ROI models and stakeholder alignment, delivering 30% YoY revenue growth.
Modernized the core UX and streamlined onboarding and administration tools, improving platform efficiency and client experience.

First American Financial Corporation

Orange County, CA

Director Product Management, Fintech – Digital Experience

06/2014 – 06/2015

Leading Fortune 500 provider of real estate technology solutions powering the U.S. mortgage ecosystem

Defined and executed product strategy and roadmap in partnership with sales, business development, and marketing to identify and prioritize high-value opportunities.
Built market sizing analyses, competitive assessments, and ROI models to support investment decisions across the product portfolio.

Canon Research & Development

Orange County, CA

Sr. Product Manager, Information Intelligence

10/2007 – 04/2014

The advanced R&D Division of Canon Inc., pioneering intelligent office and document technologies for global enterprise markets

Defined product plans, priorities, requirements, and roadmaps; collaborated with development teams to translate business cases into technical designs.
Created and integrated a framework for identifying strategic partnerships with innovative technology companies based in Silicon Valley; connected with 100+ startups annually to gather feedback for leadership in Japan.

Additional Experience: Senior Program Manager, Digital Mortgage Platform — First American Financial (2006–2007) · Senior Consultant, Decision Systems & Automation — Fannie Mae (2000–2006)

Education, Certifications & Patents

Education

Master of Science in Information Systems

The George Washington University

Education

Bachelor of Science in Finance

University of Minnesota, Carlson School of Management

Certification

Lean Six Sigma Green Belt (LSS)

Certified 2015

Certification

Scrum Master Certified (SMC)

Certified 2013

U.S. Patent · 2018

Devices, Systems, and Methods for Context Management

Patent No. 9,922,092 · NLP & semantic analysis

U.S. Patent · 2016

Printing System, Image Forming Apparatus

Patent No. 9,235,819 · AR & image recognition

Product leadership,
at AI scale.

PM-as-Builder is the wedge.

What 90 days of building actually taught me.

Structured disagreement beats single-agent reasoning

Model routing is a cost-architecture decision

HITL gates need a risk framework, not vibes

Context handoff is where multi-agent chains break

Adaptive feedback loops are the real unlock

Framework choice has a half-life

Let's connect.