
AI agent platforms · enterprise data infrastructure · product leadership

Building practical AI agents for real-world operations.

I am a product leader with 8+ years building AI agent platforms, support automation, and enterprise SaaS infrastructure across TikTok Shop and Microsoft Azure. My focus is turning complex workflows into reliable, measurable systems that people can trust in production.

I'm Charlie Zhu, a product and platform operator focused on production GenAI systems. I hold a master’s degree in Computer Science from Stanford University and previously worked on enterprise cloud and data platform products.

See what I am building · Connect on LinkedIn

3M+ DAUs served by AI support automation

9 languages across global support workflows

80% AI containment reached in production

$2.5B ARR enterprise data orchestration platform

Prior public technical writing: 17 Azure Data Factory articles · ~208.6K public views · Microsoft Tech Community

View previous technical writing

Point of view

Agents only matter when they survive contact with operations.

My background sits at the intersection of AI agents, support automation, and data orchestration. I care about the less glamorous parts that make AI useful: tool-use safety, launch gates, evals, policy grounding, human fallback, observability, workflow configuration, and business metrics.

I am building a public body of work around production-grade agent systems for enterprise workflows, with a particular interest in data operations, customer support automation, and research-to-product translation.

Selected work

Operating at AI and data-platform scale

TikTok Shop · AI Agent Platform

Customer-service automation across 15 countries

Led roadmap, PRDs, OKRs, launch criteria, and success metrics for an LLM-powered support automation platform spanning Chat/DM agents, tool-use orchestration, RAG grounding, model routing, configuration tooling, AI quality, agent handoff, and human fallback.

Raised AI containment from 30% to 80%

Prevented roughly 12M tickets from reaching human support

Reduced policy hallucinations from about 15% to about 3%

Cut validation cycles from 5 days to 12 hours with golden evals

Microsoft Azure Data · Data Factory

Enterprise orchestration for cloud-native data pipelines

Owned product strategy and execution for Azure Data Factory’s backend orchestration layer, including low-code pipeline authoring, SDK surfaces, execution infrastructure, and enterprise integrations.

Worked on a $2.5B ARR platform

Shipped 10+ capabilities adopted by 6,000+ enterprise accounts each

Led an 18-month Kubernetes migration while maintaining 99.999% availability

Partnered with 20+ strategic enterprise accounts and Fortune 500 customers

Build in public

Public projects to launch next

GitHub

Prototype

Pipeline Failure Investigator

An AI agent that reads data pipeline logs, classifies failures, checks lineage, recalls similar incidents, and drafts a safe remediation plan.
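To make the "classifies failures" step concrete, here is a minimal sketch in Python. It is an illustration only, not the prototype's code: the category names, the regex rules, the `Finding` type, and the `classify_failures` function are all hypothetical, and the lineage, incident-recall, and remediation steps are left out.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Hypothetical categories and patterns, chosen only for illustration.
RULES = [
    ("timeout", re.compile(r"timed? ?out|deadline exceeded", re.I)),
    ("auth", re.compile(r"unauthorized|forbidden|permission denied", re.I)),
    ("schema", re.compile(r"schema mismatch|column .* not found", re.I)),
]


@dataclass
class Finding:
    category: str  # first matching rule name, or "unknown"
    line: str      # the raw log line that triggered the finding


def classify_failures(log_text: str) -> list[Finding]:
    """Label every ERROR line with the first rule whose pattern matches."""
    findings = []
    for line in log_text.splitlines():
        if "ERROR" not in line:
            continue  # only error lines are treated as failures here
        category = next(
            (name for name, pattern in RULES if pattern.search(line)),
            "unknown",
        )
        findings.append(Finding(category, line.strip()))
    return findings
```

A real agent would likely send the "unknown" bucket to an LLM or a human reviewer rather than guess. A deterministic first pass like this keeps the common cases cheap and auditable.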

Discuss collaboration

Evaluation

OpenEvalGate

An early open-source framework concept for evaluating enterprise workflow agents across policy correctness, tool-call safety, grounding, handoff quality, and regression risk.
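As a rough illustration of the idea, a launch gate over those dimensions could look like the Python sketch below. Everything in it is an assumption made for the example and not OpenEvalGate's actual design: the dimension keys, the threshold values, the `MAX_REGRESSION` limit, and the `evaluate_gate` function.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical minimum pass rates per dimension; real values would be set per workflow.
THRESHOLDS = {
    "policy_correctness": 0.97,
    "tool_call_safety": 0.99,
    "grounding": 0.95,
    "handoff_quality": 0.90,
}
MAX_REGRESSION = 0.02  # assumed largest allowed drop versus the baseline


@dataclass
class GateResult:
    passed: bool
    failures: list[str]


def evaluate_gate(candidate: dict[str, float], baseline: dict[str, float]) -> GateResult:
    """Fail if any dimension is below its minimum or regresses too far from baseline."""
    failures = []
    for dim, minimum in THRESHOLDS.items():
        score = candidate.get(dim, 0.0)  # a missing dimension counts as a zero score
        if score < minimum:
            failures.append(f"{dim}: {score:.2f} below minimum {minimum:.2f}")
        drop = baseline.get(dim, score) - score
        if drop > MAX_REGRESSION:
            failures.append(f"{dim}: regressed by {drop:.2f} versus baseline")
    return GateResult(passed=not failures, failures=failures)
```

Checking absolute minimums and regression separately is deliberate in this sketch: a candidate can clear every threshold and still be meaningfully worse than what is already in production.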

Request details

Research translation

Paper to Production Notes

Short build notes that take agent research papers and translate them into product requirements, prototype architecture, and production constraints.

Writing topics

Writing platform

Themes I want to be known for

Reliable AI agents are product systems, not demos

Launch gates, evals, escalation, permissions, policy grounding, and production feedback loops.

What data orchestration teaches us about agent orchestration

Retries, lineage, idempotency, observability, cost, scheduling, ownership, and failure recovery.

From research paper to useful prototype

A builder’s translation layer between academic ideas and practical product constraints.

Human fallback is a feature, not a failure

Designing agent handoffs, summaries, escalation criteria, and human-agent collaboration.

Speaking and collaboration

Topics I can speak on

I am interested in hackathons, open-source communities, research conversations, and talks where practical AI agent systems meet real operations.

View talks page

Practical AI agents for enterprise workflows

How to move from impressive demos to systems with launch gates, tool safety, and measurable outcomes.

Building evals for customer-service AI agents

Golden cases, production samples, LLM-as-a-judge, human calibration, and regression prevention.

Data platforms as the blueprint for agent platforms

Lessons from Azure Data Factory applied to stateful, observable, governable agent systems.

Get in touch

Send me a research idea, open-source issue, or event invitation.

Contact / Collaborate · LinkedIn · GitHub