Clay Workbook Analysis — HeyDigital ABM Engagers
Clay Workbook Deep-Dive: HeyDigital ABM Engagers
1. Workbook Overview & Architecture
What This Workbook Does
This is a LinkedIn engagement → enriched lead → personalized email → multi-channel push pipeline.
HeyDigital sends LinkedIn post engagement data via webhook. The workbook extracts the engager identity,
hunts for their email via three providers, validates the best result, verifies they still work at the company,
qualifies their job title authority, enriches their company profile, filters by geography and headcount,
runs competitor analysis, cleans the data, generates a cold email opening line, and pushes the lead to
four destinations simultaneously: LeadGrow CRM, Bison (email campaigns), HeyReach (LinkedIn outreach),
and a social post content bank.
Data Flow at 10,000 ft
HeyDigital Webhook → Table 1: Main Pipeline (61 cols · 12 phases) → LeadGrow CUL API + Bison Campaign API + HeyReach Campaign + Table 2: Social Post Bank (via Route Row action)
Supporting tables (separate workflows, same workspace):
Table 3: Webhook table for company data enrichment ·
Table 4: Headcount enrichment for unknown companies ·
Table 5: Missing company info gap-fill
Key Stats
| Metric | Value |
| Tables in workspace | 33 (5 active for this pipeline) |
| Main table columns | 61 |
| AI (Claygent) calls per row | 10-12 (conditional path-dependent) |
| External API calls per row | 4-7 (Icypeas, Kitt, LeadMagic, EmailGuard, LeadGrow, HeyReach) |
| Email providers in waterfall | 3 (Kitt → Icypeas → LeadMagic) |
| Output destinations | 4 (CUL, Bison, HeyReach, Social Post Bank) |
| AI models used | gpt-4.1-mini (6 calls), gpt-4o-mini (4 calls) |
| Total actions (column types) | 21 enrichment actions + 75 formulas + 1 source + 2 date |
2. Main Table: The 12-Phase Pipeline
Table t_0tfqyhkneq3VAFhoM6F — "HeyDigital ABM engagers" —
is the heart of the system. Each row represents one LinkedIn post engager flowing through
a linear enrichment chain with conditional branching (waterfall logic on email find/validate).
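Here is every phase, in pipeline order. Conditional gates mean actual execution can differ: employment verification waits on the job title score, and the email waterfall waits on Competitor Qualification.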
Phase 1
INGEST — HeyDigital Webhook
Receives raw LinkedIn engagement payload: profile URL, company name, job title, post engagement data. Source column feeds the entire pipeline.
Webhook (source)
Extract: firstName · lastName · jobTitle · companyName · linkedinUrl · postUrl · postText
Phase 2
EMAIL FIND — 3-Provider Waterfall
Attempts to find work email using fullName + domain. Waterfall logic: each provider only runs if previous providers did not return a valid result. Conditional chain: CompetitorQualification, then Kitt, then Icypeas, then LeadMagic.
Icypeas Find Email (icypeas-find-email-v2)
Kitt HTTP API (POST api.trykitt.ai/job/find_email)
LeadMagic Find Email (leadmagic-find-work-email)
Phase 3
EMAIL VALIDATION
Validates each provider email result using LeadMagic validate-email endpoint. Only validates safe emails (onlySafe=true). Waterfall: if higher-priority provider validated, skips lower.
Validate Icypeas result
Validate Kitt result
Validate LeadMagic result
Phase 4
EMAIL CONSOLIDATION
Consolidates the best valid email from all providers. Tags which data provider won. Produces final email address.
Work Email (Enterprise) — waterfall consolidation formula
Data Provider tag (which provider won)
Final email
Phase 5
EMPLOYMENT VERIFICATION — Claygent AI
AI agent (gpt-4o-mini, claygent mode) navigates LinkedIn profile. Checks if contact still works at last known company.
Extracts: currentCompanyName, currentJobTitle, companyLinkedinUrl, companyDomain, employmentDateRange.
Outputs: stillAtCompany (bool), subStatus (still_at_last_company / changed_company / open_to_work).
Conditional: only runs if job title qualification score warrants it.
Claygent: LinkedIn Employment Verification (gpt-4o-mini)
Extract: Current Company LinkedIn URL
Extract: Current Company Domain
Extract: Current Company Name
Phase 6
JOB TITLE QUALIFICATION — AI Scoring
GPT-4.1-mini scores job title 1-10 for marketing decision-making authority.
1-4: Not qualified · 5-7: Maybe · 8-10: Qualified.
Outputs: score, 20-word justification, qualification tier.
AI: Job Title Qualification Prompt (gpt-4.1-mini)
OUTPUT: Qualification tier
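The scoring bands in this phase translate directly into code. A minimal sketch, assuming the model returns an integer score; the function and type names are illustrative and do not come from the workbook:

```typescript
type QualificationTier = "Not qualified" | "Maybe" | "Qualified";

// Maps a 1-10 job title score to the tier bands described in this phase.
function tierForScore(score: number): QualificationTier {
  if (!Number.isInteger(score) || score < 1 || score > 10) {
    throw new RangeError(`score must be an integer from 1 to 10, got ${score}`);
  }
  if (score <= 4) return "Not qualified";
  if (score <= 7) return "Maybe";
  return "Qualified";
}
```

Rejecting out-of-range scores keeps a malformed model response from silently landing in a tier.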
Phase 7
COMPANY ENRICHMENT — 4 Claygent Calls
Deep company research. Finds verified LinkedIn company URL, extracts domain, pulls full company profile,
discovers location. Fallback chain: employment data first, then Claygent search.
Claygent: Find LinkedIn Company URL (verifies domain match)
Final Company LinkedIn URL (fallback: employment data → Claygent)
Claygent: Extract Domain from LinkedIn About section (gpt-4o-mini)
domain (consolidated)
Claygent: Full LinkedIn Company Info — 10 fields (gpt-4o-mini)
Claygent: Find Location from Website (gpt-4o-mini)
Official Company Location (fallback: website → LinkedIn)
Phase 8
GEOGRAPHY + HEADCOUNT FILTERS
Converts location to country acronym via AI. Checks against target country whitelist
(US, UK, CA, AU, DE, FR, ES, IT, NL, SE, CH, SA, AE, GB).
Extracts employee count. Combined filter: 10-300 employees AND target country.
AI: Location → Country Acronym (gpt-4.1-mini)
Country Match (whitelist check)
Employees Count
Employee and Country Check (10-300 AND target country)
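The combined filter is a pure boolean check. The sketch below assumes that unknown values fail the filter and that the 10-300 range is inclusive; neither is confirmed by the workbook schema, and the type and function names are hypothetical:

```typescript
// Whitelist copied from the phase description; note it lists both "UK" and "GB".
const TARGET_COUNTRIES = new Set<string>([
  "US", "UK", "CA", "AU", "DE", "FR", "ES", "IT", "NL", "SE", "CH", "SA", "AE", "GB",
]);

interface CompanyFacts {
  countryCode: string | null;   // e.g. "US"; null when the location could not be resolved
  employeeCount: number | null; // null when the headcount is unknown
}

// True only when the country is whitelisted AND headcount is within 10-300 inclusive.
function passesEmployeeAndCountryCheck(c: CompanyFacts): boolean {
  if (c.countryCode === null || c.employeeCount === null) return false; // assumption: unknown fails
  const countryOk = TARGET_COUNTRIES.has(c.countryCode.trim().toUpperCase());
  const headcountOk = c.employeeCount >= 10 && c.employeeCount <= 300;
  return countryOk && headcountOk;
}
```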
Phase 9
COMPETITOR ANALYSIS — AI Qualification
GPT-4.1-mini scores whether company is a competitor (SEO/digital marketing agency) or a prospect.
1-4: Agency (competitor) · 5-7: Maybe · 8-10: Qualified prospect (non-agency).
Conditional: only runs if Employee and Country Check passes.
AI: Competitor Analysis (products+description → score, gpt-4.1-mini)
Competitor Qualification
Phase 10
DATA CLEANING + SECURITY
Cleans names (strips titles, credentials, emojis), cleans company names (removes Inc/LLC, creates acronyms),
checks email host (EmailGuard API), flags problematic MX providers (Barracuda/Mimecast/Proofpoint).
AI: Clean Full Name — strip PhD, P.Eng, emojis (gpt-4.1-mini)
Extract: clean first name · clean last name
AI: Clean Company Name — remove suffixes, create acronyms (gpt-4.1-mini)
Extract: Clean Company Name
EmailGuard: Email Host Lookup (POST app.emailguard.io)
Email Security MX Check (Barracuda/Mimecast/Proofpoint)
Phase 11
COLD EMAIL PERSONALIZATION — AI Copywriting
GPT-4.1-mini generates a Will Allred-style conversational opening sentence referencing the specific
LinkedIn post they engaged with. Classifies company type in 2-3 words of buyer language.
Maps LinkedIn usernames to real author names (Adam Robinson, Neil Patel, Emilia Moller, Anna York).
Has a parallel "v2" enrichment with refined prompt. Applies pluralization rules for grammar.
Author Name (LinkedIn username → real name mapping)
AI: Cold Email Personalization v1 (gpt-4.1-mini)
AI: Cold Email Personalization v2 — refined (gpt-4.1-mini)
Extract: First Line (opening sentence, up to 20 words)
Extract: Company Type (2-3 word buyer classification)
Pluralization Rule (grammar helper)
Phase 12
DESTINATION OUTPUTS — 4 Simultaneous Pushes
Final enriched lead is pushed to all destinations. CUL creates/updates the lead in LeadGrow CRM.
Bison attaches to an email campaign. HeyReach adds to a LinkedIn outreach campaign.
Route Row sends post content to the Social Post Bank for future reference.
CUL: POST send.leadgrow.ai/api/leads/create-or-update
Bison: POST send.leadgrow.ai/api/campaigns/{id}/leads/attach-leads
HeyReach: Add Lead to Campaign (campaignId=290267)
Route Row → Social Post Bank (t_0tfqyiguHZvcGoypZJV)
Key Conditional Logic: The email waterfall and much of the downstream enrichment are
conditional on the Competitor Qualification gate. If a lead scores as "not qualified" (a
competing agency), many later steps are skipped, which saves significant per-row cost. The email find waterfall uses
conditionalRunFieldIds to chain: CompQual blocks Kitt, Kitt blocks Icypeas, Icypeas blocks LeadMagic. In other words, Kitt runs only once the competitor gate passes, and each later provider runs only if the one before it returned no valid email.
3. All Tables & How They Stitch Together
3.1 Table Map
| Table | ID | Cols | Role | Connected To |
| HeyDigital ABM engagers |
t_0tfqyhkneq3VAFhoM6F |
61 |
Main pipeline — processes each LinkedIn engager |
Routes to Social Post Bank; pushes to CUL, Bison, HeyReach |
| Social Post Bank |
t_0tfqyiguHZvcGoypZJV |
12 |
Post content archive — receives routed rows |
Receives from Main via Route Row action; has Hey Digital Relevance AI enrichment |
| Pull in data from a Webhook Table |
t_0tfr7d8dF5mnzfnpzzZ |
19 |
Separate company enrichment entry point |
Independent webhook → LinkedIn Company Info → HTTP API output |
| Enrich 22K unknowns - headcount |
t_0tfnv6iGPatnmaWKaTC |
40 |
Bulk headcount enrichment for companies missing data |
Likely feeds results back to main table or CRM |
| Enrich missing company information |
t_0tcxgijYsmrT5seFVrr |
36 |
Gap-fill enrichment for incomplete company profiles |
Likely feedback loop to main table |
3.2 Stitching Mechanism
How Tables Connect
Primary stitch: The Main table Send table data action (Route Row, Clay Labs package) sends Post Text + Post URL to the Social Post Bank table (t_0tfqyiguHZvcGoypZJV). This is a Clay-native table-to-table routing — no external API needed. The Social Post Bank then enriches with "Hey Digital Relevance" AI scoring.
Secondary stitch (inferred): The enrichment tables (headcount, missing company info) likely feed into the Main table via Clay Lookup or Import from Table actions, pulling enriched company data back into the main pipeline. The "Enrich missing company information" workbook contains multiple tables that appear to be a sub-pipeline for company data.
External stitches: The three external destination pushes leave Clay via API calls to the LeadGrow CUL endpoint, Bison campaigns, and HeyReach. These are the exit points from Clay into the rest of the LeadGrow stack. The fourth destination, the Social Post Bank, stays inside Clay via Route Row.
Webhook entry: Table 3 ("Pull in data from a Webhook Table") is a separate entry point with its own webhook source. It enriches company data and outputs via HTTP API — possibly used to backfill company info that the main pipeline references.
4. Clay Cost Analysis Per Row
Clay Credit Model (approximate)
Clay charges credits per enrichment action. Pricing varies by action type and provider.
Approximate credit costs based on Clay's published pricing and provider tiers:
| Action Type | Count per Row | Credits Each | Subtotal |
| Icypeas Find Email | 1 (conditional) | ~8 | 0-8 |
| Kitt HTTP API (find email) | 1 (conditional) | ~2 | 0-2 |
| LeadMagic Find Email | 1 (conditional) | ~6 | 0-6 |
| LeadMagic Validate Email ×3 | 0-3 (conditional) | ~3 | 0-9 |
| Claygent AI (gpt-4o-mini) ×4 | 4 | ~3-5 | 12-20 |
| Use AI (gpt-4.1-mini) ×6 | 6 | ~1-2 | 6-12 |
| EmailGuard HTTP API | 1 | ~2 | 2 |
| Route Row (Send table data) | 1 | ~1 | 1 |
| HeyReach Add to Campaign | 1 | ~3 | 3 |
| LeadGrow CUL (HTTP API) | 1 | ~2 | 2 |
| LeadGrow Bison (HTTP API) | 1 | ~2 | 2 |
| TOTAL (sum of subtotal ranges) | | | ~28-67 credits |
| TOTAL (avg — waterfall skips 2 finders) | | | ~40-50 credits |
~$0.40-0.50
Clay cost per row (at ~1c/credit)
~$400-500
Clay cost per 1,000 rows
~$4,000-5,000
Clay cost per 10,000 rows (monthly)
Hidden Clay Costs
Claygent actions are expensive. Each Claygent call (the AI agent that navigates LinkedIn for employment verification, company URL finding, domain extraction, company info, and location) costs 3-5 credits minimum. This workbook has 5 Claygent calls per row — that is 15-25 credits just for AI web browsing. The gpt-4.1-mini calls are cheaper but there are 6 of them.
Waterfall email finding: The conditional logic saves costs (only 1-2 of 3 providers run), but Clay still charges for the attempt even if it yields no result.
Credit rounding: Clay rounds up fractional credits. A 0.1 credit action costs 1 credit. This is most punitive for the many small AI calls in this workbook.
Duplicate charges: Cold Email Personalization v1 and v2 both run, producing two AI-generated opening lines per row. The v1 result appears unused; only v2 feeds into First Line and Company Type. That is one wasted AI call per row.
5. Trigger.dev Migration Plan
Architecture Decision: DAG of Tasks, Not One Monolith
Trigger.dev is built on event-driven task execution with retries, idempotency, and
observable state. Each Clay "column" becomes a Trigger.dev task step or
independent subtask. The 12-phase pipeline becomes a parent task
that orchestrates child tasks, with each phase as a discrete retryable unit.
The key insight: Clay columns are just sequential code with field references.
In Trigger.dev, this is TypeScript with variables — no graph resolution engine needed. The
formula expressions become inline code. Conditional branching becomes if statements.
The waterfall becomes explicit try/catch chains.
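As an illustration of the waterfall as an explicit try/catch chain, here is a minimal sketch. The provider calls and the validator are injected as plain functions, so nothing here reflects the real Icypeas, Kitt, or LeadMagic APIs; all names are hypothetical, and the caller passes providers in whatever priority order the pipeline uses:

```typescript
type ProviderName = "kitt" | "icypeas" | "leadmagic";

interface EmailProvider {
  name: ProviderName;
  // Resolves to an email address, or null when nothing was found.
  find(fullName: string, domain: string): Promise<string | null>;
}

interface WaterfallResult {
  email: string | null;
  provider: ProviderName | null; // which provider won, for the "Data Provider" tag
  attempted: ProviderName[];     // providers actually called, useful for cost tracking
}

// Tries providers in order and stops at the first email that validates.
async function findEmailWaterfall(
  providers: EmailProvider[],
  validate: (email: string) => Promise<boolean>,
  fullName: string,
  domain: string,
): Promise<WaterfallResult> {
  const attempted: ProviderName[] = [];
  for (const provider of providers) {
    attempted.push(provider.name);
    try {
      const email = await provider.find(fullName, domain);
      if (email !== null && (await validate(email))) {
        return { email, provider: provider.name, attempted };
      }
    } catch {
      // A provider failure is treated like "no result": fall through to the next one.
    }
  }
  return { email: null, provider: null, attempted };
}
```

Swallowing provider errors inside the loop is a design choice for this sketch; a production version would likely log them and let the platform retry transient failures before moving on.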
5.1 Trigger.dev Task Graph
Trigger.dev Task Graph — Replacement Architecture
[TRIGGER] HeyDigital webhook -> POST /api/triggers/heydigital-engager
[PARENT TASK] processHeyDigitalEngager
  Step 1: Parse webhook -> extract firstName, lastName, jobTitle, companyName, linkedinUrl, postUrl, postText
  Step 2: [CHILD TASK] findEmail
    Waterfall: Kitt API -> Icypeas API -> LeadMagic API (stop on first valid result)
  Step 3: [CHILD TASK] validateEmail
    Validate each found email via LeadMagic, pick best
  Step 4: Consolidate email + tag winning provider
  Step 5: [AI TASK] verifyEmployment
    Serper search LinkedIn + scrape profile + gpt-4o-mini extract: stillAtCompany, currentCompany, domain
  Step 6: [AI TASK] qualifyJobTitle (gpt-4.1-mini)
  Step 7: [AI TASKS] enrichCompany
    7a: Find verified LinkedIn company URL (Serper + scrape + domain match)
    7b: Extract domain from LinkedIn About section
    7c: Pull full company info (10 fields from LinkedIn)
    7d: Find location from company website
  Step 8: Geo/headcount filter (deterministic code)
    Country acronym: JSON lookup table (NO AI needed!)
    Country whitelist check
    Employee count range check (10-300)
  Step 9: [AI TASK] competitorAnalysis (gpt-4.1-mini)
  Step 10: Data cleaning
    AI: Clean full name (gpt-4.1-mini)
    AI: Clean company name (gpt-4.1-mini)
    EmailGuard: host lookup + MX check
  Step 11: [AI TASK] personalizeColdEmail (gpt-4.1-mini)
    Generate first line + company type. ONE call, not two (v1 is unused in Clay)
  Step 12: [OUTPUT TASKS] push to destinations
    CUL: POST send.leadgrow.ai/api/leads/create-or-update
    Bison: POST attach to campaign
    HeyReach: POST add lead to campaign
    Supabase: INSERT post to social_post_bank table
5.2 Key Architectural Differences from Clay
| Clay Pattern | Trigger.dev Equivalent |
| Column formula with field references | TypeScript variable assignments. No graph resolution — just sequential code. |
| Conditional action execution (conditionalRunFieldIds) | if statements + early returns. More readable, easier to debug. |
| Waterfall enrichment (try A, if fails try B) | Explicit try/catch + sequential API calls. Trigger.dev retry config handles transient failures. |
| Claygent AI web browsing | Replace with Serper.dev for search + Spider Cloud or ScrapingBee for scraping + OpenAI for extraction. No Claygent platform markup. |
| Credit-based cost model | Pay for actual API usage only (OpenAI tokens, Serper queries, scraping credits). No platform margin on top. |
| Route Row (table-to-table) | INSERT into Supabase/Postgres. Same data model, zero vendor lock-in. |
| Formula language (Clay formulas) | TypeScript. More expressive, testable with Jest, versionable in git. |
| Error handling | Per-step retries with exponential backoff, dead letter queues, alerting. Clay: column turns red, manual fix. |
5.3 Replacements for Clay-Specific Features
| Clay Feature | Replacement | Cost Impact |
| Icypeas Find Email | Icypeas API directly (bypass Clay ~4x markup) | ~75% cheaper per lookup |
| LeadMagic Find + Validate Email | LeadMagic API directly | ~60% cheaper |
| Kitt (via Clay HTTP API) | Kitt API directly | No change (already direct API call in Clay) |
| Claygent (AI web browsing) | Serper.dev + OpenAI gpt-4o-mini + fetch/scrape | ~80% cheaper (no Claygent markup) |
| Route Row to Social Post Bank | Supabase INSERT | Free (Supabase is already in stack) |
| Country acronym generation (AI) | Deterministic lookup table (200 lines of JSON) | Free — no AI call needed! |
| Author name mapping (formula) | TypeScript Map/switch | Free — no formula engine needed |
| Pluralization rule (formula) | TypeScript pluralize npm package | Free — pure code, tested once |
Biggest Win: The country acronym AI call is completely unnecessary. Clay uses GPT-4.1-mini to convert
"San Francisco, CA" -> "US". That is a deterministic lookup that costs $0 in Trigger.dev vs. 1-2 Clay credits per row.
Same for the author name mapping (4 hardcoded LinkedIn URL -> name mappings in a Clay formula — trivial switch statement in TS).
The pluralization rule is a lodash endsWith check implemented as a Clay formula; in JS it is a one-line helper or an off-the-shelf npm package.
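To make the deterministic country lookup concrete, here is a minimal sketch. It assumes locations arrive as comma-separated text with the country or US state last; the two lookup tables are small illustrative subsets, not the full JSON table the plan calls for, and the function name is hypothetical:

```typescript
// Illustrative subsets only; a real table would cover every country and US state.
const COUNTRY_NAME_TO_CODE = new Map<string, string>([
  ["united states", "US"], ["usa", "US"], ["united kingdom", "GB"], ["canada", "CA"],
  ["australia", "AU"], ["germany", "DE"], ["france", "FR"], ["netherlands", "NL"],
]);
const US_STATE_CODES = new Set<string>(["CA", "NY", "TX", "WA", "MA", "FL", "IL"]);

// Resolves a free-text location to a country code by inspecting its last comma-separated part.
function countryCodeForLocation(location: string): string | null {
  const parts = location.split(",").map((p) => p.trim()).filter((p) => p.length > 0);
  const last = parts.pop();
  if (last === undefined) return null; // empty input
  const byName = COUNTRY_NAME_TO_CODE.get(last.toLowerCase());
  if (byName !== undefined) return byName;
  if (US_STATE_CODES.has(last.toUpperCase())) return "US";
  return null; // unknown: leave for manual review or a fallback step
}
```

One known ambiguity in this approach: a trailing "CA" is read as California, so a location written as "Toronto, CA" would resolve to "US". A production table would need a rule for such cases, or a fallback for unresolved rows.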
6. Trigger.dev Pricing: Real Numbers
Trigger.dev Plans (as of mid-2026)
| Plan | Price | Runs/Month | Concurrency | Environments | Overage |
| Hobby | Free | 1,000 | 1 | 1 (dev only) | N/A |
| Pro | $20/mo | 10,000 | 100 | 1 | $0.002/run |
| Team | $100/mo | 100,000 | 1,000 | 5 | $0.001/run |
| Enterprise | Custom | Unlimited | Unlimited | Unlimited | Negotiated |
A "run" = one task execution. Our parent task + ~5 child tasks per row = ~6 runs per lead processed.
Child tasks count against the run quota. At 10,000 leads/month, that is ~60,000 total runs.
On the Team plan, 100K included runs covers this with room to spare. On Pro, 10K runs are included, so the remaining 50K overage runs at $0.002 = $100 extra.
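The run-quota arithmetic can be checked with a small calculator. This is a sketch that takes the plan figures from the table above at face value and assumes 6 runs per lead; the type and function names are illustrative:

```typescript
interface Plan {
  name: string;
  monthlyFee: number;    // USD per month
  includedRuns: number;  // runs included in the fee
  overagePerRun: number; // USD per run beyond the included quota
}

// Figures copied from the plan table above.
const PRO: Plan = { name: "Pro", monthlyFee: 20, includedRuns: 10000, overagePerRun: 0.002 };
const TEAM: Plan = { name: "Team", monthlyFee: 100, includedRuns: 100000, overagePerRun: 0.001 };

function monthlyPlatformCost(plan: Plan, leads: number, runsPerLead = 6) {
  const totalRuns = leads * runsPerLead;
  const overageRuns = Math.max(0, totalRuns - plan.includedRuns);
  // Round to cents to avoid floating-point noise in the reported figure.
  const overageCost = Math.round(overageRuns * plan.overagePerRun * 100) / 100;
  return { totalRuns, overageRuns, overageCost, total: plan.monthlyFee + overageCost };
}
```

For example, `monthlyPlatformCost(TEAM, 10000)` reports 60,000 total runs and no overage, so the platform cost stays at the flat monthly fee.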
6.1 Per-Row Cost Breakdown — Trigger.dev Version
| Cost Component | Per Row | Notes |
| Trigger.dev runs (~6 tasks/row, Team plan) | $0.006 | 6 tasks x $0.001/run (marginal overage cost) |
| OpenAI gpt-4.1-mini (6 calls x ~1K tokens) | $0.004 | $0.60/1M input + $2.40/1M output tokens |
| OpenAI gpt-4o-mini (1-2 calls x ~2K tokens) | $0.001 | Cheaper model, used for company extraction |
| Serper.dev search (5-6 queries for Claygent replacement) | $0.005 | ~$0.001/query at scale |
| Spider Cloud / ScrapingBee (2-3 pages scraped) | $0.006 | ~$0.002/page for LinkedIn + company website scraping |
| Icypeas email lookup | $0.02 | Direct API pricing (vs Clay ~$0.08) |
| LeadMagic find + validate | $0.03 | Direct API pricing (vs Clay ~$0.09) |
| Kitt email lookup | $0.01 | Direct API — same cost since already direct in Clay |
| EmailGuard host lookup | $0.003 | Direct API |
| LeadGrow CUL + Bison API | $0.00 | Own infrastructure — already running |
| HeyReach API | $0.01 | Direct API |
| Supabase INSERT | $0.00 | Already in stack, marginal cost zero |
| TOTAL per row | ~$0.095 | |
~$0.10
Trigger.dev total per row
~$950
Per 10,000 rows (monthly)
~76-81%
Savings vs Clay per row
What drives the 76-81% savings:
- No Clay platform margin on enrichment actions (Clay marks up API calls 3-5x)
- No Claygent tax: Claygent is Clay's most expensive feature. Replacing it with Serper + scraping + OpenAI is ~80% cheaper
- Deterministic code replaces AI — country acronym lookup, author name mapping, pluralization rule all cost $0 in code vs 3+ credits in Clay
- Direct provider APIs — Icypeas/LeadMagic are 60-75% cheaper when called directly without Clay surcharge
- Eliminated duplicate AI call — Cold Email Personalization v1 runs and its output is never used. That is 1-2 credits wasted per row
- Trigger.dev pricing — at $0.001/run on Team plan, the platform cost is negligible compared to API costs (~6% of total)
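The per-row total and the savings range follow from simple arithmetic on the breakdown table. A minimal sketch, using the table's estimates as inputs; the constant and function names are illustrative:

```typescript
// Per-row dollar costs copied from the breakdown table in 6.1.
const PER_ROW_COSTS: Array<[string, number]> = [
  ["Trigger.dev runs", 0.006],
  ["OpenAI gpt-4.1-mini", 0.004],
  ["OpenAI gpt-4o-mini", 0.001],
  ["Serper.dev search", 0.005],
  ["Scraping", 0.006],
  ["Icypeas", 0.02],
  ["LeadMagic find + validate", 0.03],
  ["Kitt", 0.01],
  ["EmailGuard", 0.003],
  ["LeadGrow CUL + Bison", 0],
  ["HeyReach", 0.01],
  ["Supabase INSERT", 0],
];

function totalPerRow(costs: Array<[string, number]>): number {
  return costs.reduce((sum, [, cost]) => sum + cost, 0);
}

// Fractional saving of newCost relative to oldCost, e.g. 0.76 for 76%.
function savingsFraction(oldCost: number, newCost: number): number {
  return 1 - newCost / oldCost;
}
```

With these inputs the total comes to about $0.095 per row, and `savingsFraction` against the Clay range of $0.40 to $0.50 gives roughly 0.76 to 0.81.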
7. Clay vs Trigger.dev: Total Cost Comparison
7.1 Monthly Cost at Scale
| Volume (leads/month) | Clay Cost | Trigger.dev Cost | Monthly Savings | Annual Savings |
| 1,000 | $400 – $500 | $95 + $20 (Pro plan) = $115 | $285 – $385 | $3,420 – $4,620 |
| 5,000 | $2,000 – $2,500 | $475 + $20 = $495 | $1,505 – $2,005 | $18,060 – $24,060 |
| 10,000 | $4,000 – $5,000 | $950 + $20 = $970 | $3,030 – $4,030 | $36,360 – $48,360 |
| 50,000 | $20,000 – $25,000 | $4,750 + $100 (Team) + ~$150 overage = $5,000 | $15,000 – $20,000 | $180,000 – $240,000 |
| 100,000 | $40,000 – $50,000 | $9,500 + $100 + ~$400 overage = $10,000 | $30,000 – $40,000 | $360,000 – $480,000 |
7.2 Non-Cost Considerations
Advantages of Staying on Clay
- Zero engineering time — already built, already running, no migration effort
- Visual debugging — see exactly which column failed, on which row, with what error. Invaluable for operations.
- No infrastructure — no servers, no deploys, no monitoring to set up
- Claygent infrastructure — their AI web browsing handles LinkedIn auth, rate limiting, proxy rotation. Replicating this well is 2-3 weeks of engineering.
- Enrichment library — 100+ providers pre-integrated. Adding a new enrichment is two clicks, not a PR + deploy.
- Non-technical editing — ops team can modify prompts, add columns, change waterfall logic without code.
Advantages of Migrating to Trigger.dev
- 76-81% cost reduction at any scale — $36K-48K/year savings at just 10K leads/month
- Full code control — version in git, test with Jest, review in PRs, rollback with git revert
- No vendor lock-in — Trigger.dev is open source. Can self-host. Can migrate to Inngest, QStash, Temporal.
- TypeScript — type-safe, composable, debuggable. Clay formulas are a proprietary language with no IDE support.
- Deterministic logic is free — country lookups, string transforms, formulas cost nothing vs Clay credits
- Better retry semantics — per-step retries with exponential backoff, idempotency keys, dead letter queues
- Observability — OpenTelemetry traces span the entire pipeline, from webhook to destination push
- Supabase integration — already in the LeadGrow stack, free state persistence for post bank and intermediate results
- Unlimited AI models: not locked to Clay's AI model catalog. Can use Claude, Gemini, Grok, or fine-tuned models
8. Migration Roadmap
Phase 1: Infrastructure Setup (Week 1)
- Create Trigger.dev project in existing LeadGrow org (use lg-trigger CLI for project scaffolding)
- Set up provider API accounts — Icypeas direct API key, LeadMagic direct, Serper.dev, Spider Cloud or ScrapingBee, EmailGuard direct
- Build SDK wrappers — thin TypeScript wrappers for Icypeas, LeadMagic, Kitt, EmailGuard, Serper, Spider Cloud
- Create Supabase table: social_post_bank to replace the Social Post Bank Clay table
- Set up environments — dev/staging/production in Trigger.dev with separate API keys
Phase 2: Build Parent Task (Week 1-2)
- Implement processHeyDigitalEngager — parent task with all 12 phases as inline steps
- Email waterfall: Kitt -> Icypeas -> LeadMagic with explicit try/catch, stop on first valid result
- Company enrichment — Serper search LinkedIn -> scrape profile -> gpt-4o-mini extract fields
- Deterministic helpers — country acronym lookup, author name mapping, pluralization (npm:pluralize)
- AI tasks — job title qualification, competitor analysis, name cleaning, company cleaning, cold email personalization
- Output tasks — CUL, Bison, HeyReach pushes + Supabase INSERT
- Write tests — unit tests for each phase with mock API responses. Integration test for full pipeline.
Phase 3: Parallel Validation (Week 2-3)
- Duplicate HeyDigital webhook — send to BOTH Clay and Trigger.dev simultaneously (webhook proxy or HeyDigital config)
- Run 500-1,000 rows through both — collect outputs from both pipelines
- Diff comparison:
- Email find rate: what % of rows get a valid email? Is the Trigger.dev rate the same or better?
- Email quality: do the same emails get found? If different, which is better?
- Company enrichment: do LinkedIn company URLs match? Is domain extraction accurate?
- AI quality: are first lines equally good? Are job title qualifications consistent?
- Cost tracking: measure actual per-row API costs vs estimates
- Edge case testing — non-Latin names, non-English company descriptions, missing LinkedIn profiles, personal posts (should output NA)
- Fix discrepancies — iterate on prompts, scraping approach, fallback logic until Trigger.dev output >= Clay output quality
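The email portion of the diff comparison could be computed along these lines. This is a sketch only: it assumes both pipelines export one record per lead keyed by a shared identifier, and the field and function names are hypothetical:

```typescript
interface PipelineOutput {
  leadKey: string;      // shared identifier, e.g. linkedinUrl + postUrl
  email: string | null; // final validated email, or null when none was found
}

interface ComparisonReport {
  sharedLeads: number;     // leads processed by both pipelines
  clayFindRate: number;    // fraction of shared leads where Clay found an email
  triggerFindRate: number; // same, for Trigger.dev
  sameEmailRate: number;   // of leads where both found an email, fraction that agree
}

function compareOutputs(clay: PipelineOutput[], trigger: PipelineOutput[]): ComparisonReport {
  const triggerByKey = new Map<string, PipelineOutput>();
  for (const t of trigger) triggerByKey.set(t.leadKey, t);

  let shared = 0, clayFound = 0, triggerFound = 0, bothFound = 0, sameEmail = 0;
  for (const c of clay) {
    const t = triggerByKey.get(c.leadKey);
    if (t === undefined) continue; // only compare leads both pipelines processed
    shared++;
    if (c.email !== null) clayFound++;
    if (t.email !== null) triggerFound++;
    if (c.email !== null && t.email !== null) {
      bothFound++;
      if (c.email.toLowerCase() === t.email.toLowerCase()) sameEmail++;
    }
  }
  const rate = (n: number, d: number): number => (d === 0 ? 0 : n / d);
  return {
    sharedLeads: shared,
    clayFindRate: rate(clayFound, shared),
    triggerFindRate: rate(triggerFound, shared),
    sameEmailRate: rate(sameEmail, bothFound),
  };
}
```

Rows where the two pipelines found different emails are the ones worth reviewing by hand, since the rates alone do not say which result is better.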
Phase 4: Gradual Cutover (Week 3-4)
- Feature-flag destinations — switch CUL/Bison/HeyReach pushes from Clay to Trigger.dev ONE at a time with monitoring
- Day 1-2: Supabase only — Trigger.dev pushes to Supabase post bank, Clay handles CRM pushes
- Day 3-4: Add CUL — Trigger.dev handles CUL + Supabase, Clay handles Bison + HeyReach
- Day 5-6: Add Bison — Trigger.dev handles CUL + Bison + Supabase, Clay handles HeyReach
- Day 7: Full cutover — Trigger.dev handles all 4 destinations
- Monitor for 1 week — with both pipelines still receiving data, confirm Trigger.dev outputs are production-quality
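The day-by-day cutover schedule can be encoded as a small lookup so that each push checks who owns a destination before sending. A minimal sketch, assuming days are counted from 1 at the start of cutover; the names are illustrative:

```typescript
type Destination = "supabase" | "cul" | "bison" | "heyreach";

// Day (1-based) on which Trigger.dev takes over each destination, per the schedule above.
const CUTOVER_DAY: Record<Destination, number> = {
  supabase: 1,
  cul: 3,
  bison: 5,
  heyreach: 7,
};

// Returns which pipeline should push to the destination on the given cutover day.
function ownerOf(destination: Destination, day: number): "clay" | "trigger.dev" {
  return day >= CUTOVER_DAY[destination] ? "trigger.dev" : "clay";
}
```

In practice the day thresholds would more likely live in a feature-flag service than in code, so a destination can be handed back to Clay without a deploy.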
Phase 5: Archive Clay (Week 4)
- Stop Clay webhook — redirect HeyDigital exclusively to Trigger.dev
- Export Clay data — pull all records from Clay tables as JSON/CSV backup
- Export Clay schemas — already done (see files below). Archive for reference.
- Pause (do not delete) Clay workbook — keep for 30 days as fallback. Delete only after a full billing cycle with zero issues.
- Document the migration — write post-mortem: what was easy, what was hard, what Clay did better, what Trigger.dev did better
Migration Risks (ranked by severity)
- Claygent replacement is the hardest part. Claygent LinkedIn browsing uses BrowserBase under the hood with proxy rotation and auth management. Replacing this with Serper + scraping requires careful handling of LinkedIn anti-bot measures. Mitigation: Use Spider Cloud or ScrapingBee for LinkedIn scraping (they handle proxies). Budget 2-3 weeks just for this component.
- Icypeas direct API may have different rate limits outside of Clay's enterprise agreement. Mitigation: Contact Icypeas for direct API pricing and limits before migration.
- The email waterfall logic is subtly complex — conditionalRunFieldIds create a specific priority chain (CompetitorQualification blocks Kitt, Kitt blocks Icypeas, Icypeas+Clay blocks LeadMagic). Overlooking one condition means running providers that should be skipped. Mitigation: Write integration tests that verify each waterfall branch.
- Clay formula engine handles null/undefined gracefully with fallback operators (||, optional chaining). TypeScript requires explicit null checks. Mitigation: Use TypeScript optional chaining (?.) and nullish coalescing (??), which map closely to Clay formula patterns. Note that ?? falls back only on null/undefined, whereas || also falls back on 0 and empty strings, so each fallback should be ported deliberately.
- Cost spike risk — Clay bills are predictable (credits). Trigger.dev + direct API costs are usage-based and can spike if a webhook floods or a scraping service is over-called. Mitigation: Set usage alerts on all API accounts. Add rate limiting on the webhook endpoint.
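On the null-handling risk in the list above, the two TypeScript fallback operators behave differently for falsy values, which matters when porting formulas. A short illustration with a hypothetical RawCompany shape:

```typescript
interface RawCompany {
  employeeCount?: number | null;
}

// `||` falls back on ANY falsy value, including a legitimate 0.
function headcountWithOr(c: RawCompany): number {
  return c.employeeCount || -1;
}

// `??` falls back only when the value is null or undefined.
function headcountWithNullish(c: RawCompany): number {
  return c.employeeCount ?? -1;
}
```

For `{ employeeCount: 0 }` the first function returns -1 and the second returns 0; for a missing field both return -1.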
Files Generated for This Analysis
C:\Users\mitch\Everything_CC\temp\clay-workbook-analysis\table_main_schema.json (165 KB — 61 columns)
C:\Users\mitch\Everything_CC\temp\clay-workbook-analysis\table_routed_target_schema.json (20 KB — Social Post Bank)
C:\Users\mitch\Everything_CC\temp\clay-workbook-analysis\table_webhook_source_schema.json (32 KB — Webhook Table)
C:\Users\mitch\Everything_CC\temp\clay-workbook-analysis\table_enrich_company_schema.json (68 KB — Enrich Company)
C:\Users\mitch\Everything_CC\temp\clay-workbook-analysis\table_enrich_headcount_schema.json (71 KB — Enrich Headcount)
C:\Users\mitch\Everything_CC\temp\clay-workbook-analysis\index.html (this page)
All schemas extracted via clay pull --table [id] from leadgrow-clay-cli.
Analysis performed 2026-06-10. Clay workspace 206846, workbook wb_0tfqyhki4hnphtPTmom.
No records were available (table was empty or API-filtered at extraction time).
Next step: Deploy this page to Cloudflare Pages for team review.
9. Corrected Cost Model — Gated + Flex API + Prompt Caching
Key corrections applied:
- Gating: only 30% of rows reach the expensive email waterfall
- AI: OpenAI Flex API (50% discount) + prompt caching (50% off cached input)
- First name: REGEX only, no LLM, only if >1 word
- Spider Cloud: replaced with Firecrawl
- EmailGuard: free
- HeyReach: free
- Batching: 10-20 rows per Trigger.dev run
- Parallel: employment verification + company enrichment run concurrently
9.1 AI Costs — Flex API + Prompt Caching
Prompt caching strategy
Every prompt follows this structure to maximize caching:
<!-- CACHED (80% of tokens, 50% discount) -->
System prompt + Instructions + Banned words + Scoring rubric + Examples
<!-- UNCACHED (20% of tokens) -->
Variable 1: {{job_title}}
Variable 2: {{company_name}}
Variable 3: {{post_text}}
Variable 4: {{company_description}}
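The structure above can be enforced with a small builder that always emits the fixed header first and the per-row variables last. This is a sketch with hypothetical names; the header text is a placeholder, and it is worth checking the provider's caching rules (for example any minimum prompt length) before relying on the discount:

```typescript
interface PromptVariables {
  jobTitle: string;
  companyName: string;
  postText: string;
  companyDescription: string;
}

// Builds a prompt whose leading portion is byte-identical across rows,
// so prefix-based prompt caching has the best chance of hitting.
function buildPrompt(staticHeader: string, vars: PromptVariables): string {
  const variableFooter = [
    `Variable 1: ${vars.jobTitle}`,
    `Variable 2: ${vars.companyName}`,
    `Variable 3: ${vars.postText}`,
    `Variable 4: ${vars.companyDescription}`,
  ].join("\n");
  return `${staticHeader}\n${variableFooter}`;
}
```

Keeping timestamps, row IDs, and other per-call values out of the header matters here, since any change near the top of the prompt would invalidate the shared prefix.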
Flex API pricing (gpt-4.1-mini):
$0.30/1M input, $0.15/1M cached input, $1.20/1M output.
| AI Call | Input Tokens | Output Tokens | Cost (cached) | Runs on |
| Job Title Qualification | ~600 (80% cached) | ~40 | $0.00014 | 100% of rows |
| Competitor Analysis | ~400 (80% cached) | ~40 | $0.00010 | 40% of rows |
| Clean Company Name | ~800 (80% cached) | ~50 | $0.00014 | 30% of rows |
| Cold Email Personalization | ~1000 (80% cached) | ~30 | $0.00016 | 30% of rows |
| TOTAL AI (amortized per row) | | | $0.00028 | |
AI is noise. With Flex + caching, all 4 AI calls combined cost less than 3 hundredths of a cent per row.
The real costs are the enrichment APIs.
9.2 Full Amortized Cost Per Row
| Cost Line | Per Row (ungated) | % of Rows That Hit | Amortized |
| harvestapi employment verification | $0.004 | 65% | $0.0026 |
| Serper (4-5 Google searches) | $0.005 | 65% | $0.0033 |
| Firecrawl (2-3 pages scraped) | $0.003 | 65% | $0.0020 |
| AI Ark person enrichment | ~$0.001 | 30% | $0.0003 |
| Email waterfall (Kitt → Icypeas → LeadMagic) | $0.035 | 30% | $0.0105 |
| All AI calls (Flex + cached) | $0.00028 | varies | $0.0003 |
| Trigger.dev (batched 20x) | $0.002 | 100% | $0.0020 |
| TOTAL | | | $0.021 |
~$0.02
Per-row amortized cost
~$210
Per 10,000 leads/month
~95%
Savings vs Clay ($4,000-5,000/mo)
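The amortized total is each cost line multiplied by the share of rows that reach it, then summed. A minimal sketch that reproduces the arithmetic, reading the table values as US dollars; the AI line is entered with a hit rate of 1 because its figure is already amortized, and the names are illustrative:

```typescript
interface CostLine {
  label: string;
  perRow: number;  // USD per row that reaches this step
  hitRate: number; // fraction of all rows that reach this step (0 to 1)
}

const COST_LINES: CostLine[] = [
  { label: "harvestapi employment verification", perRow: 0.004, hitRate: 0.65 },
  { label: "Serper searches", perRow: 0.005, hitRate: 0.65 },
  { label: "Firecrawl scraping", perRow: 0.003, hitRate: 0.65 },
  { label: "AI Ark person enrichment", perRow: 0.001, hitRate: 0.3 },
  { label: "Email waterfall", perRow: 0.035, hitRate: 0.3 },
  { label: "All AI calls (already amortized)", perRow: 0.00028, hitRate: 1 },
  { label: "Trigger.dev (batched)", perRow: 0.002, hitRate: 1 },
];

// Expected cost per incoming row across all gated steps.
function amortizedPerRow(lines: CostLine[]): number {
  return lines.reduce((sum, line) => sum + line.perRow * line.hitRate, 0);
}
```

With these inputs the sum is about $0.0209 per row, which rounds to the $0.021 in the table and to roughly $210 per 10,000 leads.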
10. Workflow Prototype — Build-Ready
Architecture Decisions
- Trigger: HeyDigital webhook → Trigger.dev HTTP endpoint
- Batching: 20 rows per parent task invocation. Trigger.dev batch trigger processes array.
- Parallelism: Employment verification + company enrichment run in parallel (no data dependency). Email waterfall runs after gates clear.
- Gating: Each phase is a step with if (!passes) return early, so no API calls are wasted.
- Caching: All AI prompts split into fixed header (cached) + variable footer (uncached).
- Idempotency: Keyed on linkedinUrl + postUrl. Duplicate webhooks are no-ops.
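The idempotency and batching decisions above could look roughly like this. It is a sketch under stated assumptions: the key is a plain string built from the two URLs after light normalization (trimming and stripping trailing slashes), and the function names are hypothetical rather than Trigger.dev APIs:

```typescript
interface EngagerEvent {
  linkedinUrl: string;
  postUrl: string;
}

// Builds a stable key so the same engager + post pair is processed once.
function idempotencyKey(e: EngagerEvent): string {
  const norm = (u: string): string => u.trim().replace(/\/+$/, "");
  return `${norm(e.linkedinUrl)}|${norm(e.postUrl)}`;
}

// Drops events whose key was already seen, then splits the rest into fixed-size batches.
function dedupeAndBatch(events: EngagerEvent[], batchSize = 20): EngagerEvent[][] {
  const seen = new Set<string>();
  const unique: EngagerEvent[] = [];
  for (const e of events) {
    const key = idempotencyKey(e);
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(e);
    }
  }
  const batches: EngagerEvent[][] = [];
  for (let i = 0; i < unique.length; i += batchSize) {
    batches.push(unique.slice(i, i + batchSize));
  }
  return batches;
}
```

This only removes duplicates within one incoming array; duplicates that arrive in separate webhook deliveries would still need a persisted key, for example a unique constraint in the database.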
10.1 Mermaid Flow Diagram
flowchart TD
subgraph TRIGGER["TRIGGER (always free)"]
WH["HeyDigital Webhook\n→ Trigger.dev HTTP endpoint"] --> PARSE["Parse payload\nExtract: firstName, lastName,\njobTitle, companyName,\nlinkedinUrl, postUrl, postText"]
end
subgraph GATE1["GATE 1 — Job Title (100% of rows)"]
PARSE --> JT_AI["AI: Job Title Qualification\nGPT-4.1-mini, Flex API, CACHED\nCost: $0.00014/row\nOutput: score 1-10 + tier"]
end
JT_AI --> JT_CHECK{"score >= 5?"}
JT_CHECK -->|"No (score 1-4)\n~35% of rows"| PUSH_EARLY["DISQUALIFIED\n→ Push to Supabase only"]
JT_CHECK -->|"Yes (Qualified/Maybe)\n~65% of rows"| PARALLEL["PARALLEL EXECUTION"]
subgraph PARALLEL_RUN["Parallel Phase (~65% of rows)"]
EV["Employment Verification\nApify harvestapi\n$0.004/row\nChecks: stillAtCompany? currentTitle?"]
CE["Company Enrichment\nSerper: search LinkedIn URL, domain\nFirecrawl: scrape About page + website\nExtract: industry, employees, location, description\nCost: $0.008/row"]
end
PARALLEL --> EV
PARALLEL --> CE
EV --> MERGE["Merge Results"]
CE --> MERGE
subgraph GATE2["GATE 2 — Geo + Headcount (~65% of rows)"]
MERGE --> COUNTRY["Country Lookup TABLE\nLocation → Acronym\nNO AI — deterministic JSON\nCost: FREE"]
COUNTRY --> GEO_CHECK{"Target country?\n(US,UK,CA,AU,DE,FR,ES,IT,NL,SE,CH,SA,AE,GB)\nAND 10-300 employees?"}
end
GEO_CHECK -->|"No\n~25% of rows"| PUSH_GEO["DISQUALIFIED\n→ Push to Supabase only"]
GEO_CHECK -->|"Yes\n~40% of rows"| GATE3
subgraph GATE3["GATE 3 — Competitor Analysis (~40% of rows)"]
COMP_AI["AI: Competitor Analysis\nGPT-4.1-mini, Flex API, CACHED\nCost: $0.00010/row\nScores: is this a digital agency?\n1-4 = agency, 5-7 = maybe, 8-10 = prospect"]
end
COMP_AI --> COMP_CHECK{"Is prospect?\n(score >= 5, not agency)"}
COMP_CHECK -->|"No (competitor agency)\n~10% of rows"| PUSH_COMP["DISQUALIFIED\n→ Push to Supabase only"]
COMP_CHECK -->|"Yes (qualified prospect)\n~30% of rows"| EMAIL_WF
subgraph EMAIL_WF["EMAIL WATERFALL (~30% of rows)"]
direction LR
AI_ARK["AI Ark\nPerson enrichment\n(33K credits)\n~$0.001/row"] --> BLITZ["Blitz\n[NEED CLARIFICATION]"]
BLITZ --> KITT["Kitt API\n~$0.01/row"]
KITT -->|"no valid email"| ICY["Icypeas API\n~$0.02/row"]
ICY -->|"no valid email"| LM["LeadMagic API\n~$0.03/row"]
LM --> EMAIL_RESULT["Best email + provider tag"]
end
KITT -->|"valid email"| EMAIL_RESULT
ICY -->|"valid email"| EMAIL_RESULT
EMAIL_RESULT --> EMAIL_CHECK{"Email found?"}
EMAIL_CHECK -->|"No"| PUSH_NOEMAIL["NO EMAIL\n→ Push to destinations"]
EMAIL_CHECK -->|"Yes"| CLEAN
subgraph CLEAN["DATA CLEANING (~30% of rows)"]
FN["First Name: REGEX\nsplit on space, take[0]\nOnly if >1 word\nCost: FREE"]
CN["Company Name: AI\nGPT-4.1-mini, CACHED\nStrip Inc/LLC, create acronym\nCost: $0.00014/row"]
end
FN --> COLD_EMAIL
CN --> COLD_EMAIL
subgraph COLD_EMAIL["COLD EMAIL (~30% of rows)"]
CE_AI["AI: Cold Email Personalization\nGPT-4.1-mini, CACHED\nOutput: firstLine (20 words)\n+ companyType (2-3 words)\nCost: $0.00016/row\nONE merged call, not two"]
end
COLD_EMAIL --> DEST
subgraph DEST["DESTINATIONS (always free)"]
direction LR
CUL["LeadGrow CUL\ncreate-or-update\nCost: FREE"]
BISON["EmailBison\nattach to campaign\nCost: FREE"]
HEY["HeyReach\nadd to campaign\nCost: FREE"]
SUPA["Supabase\nsocial_post_bank\nCost: FREE"]
end
PUSH_EARLY --> SUPA
PUSH_GEO --> SUPA
PUSH_COMP --> SUPA
PUSH_NOEMAIL --> DEST
COLD_EMAIL --> CUL
COLD_EMAIL --> BISON
COLD_EMAIL --> HEY
COLD_EMAIL --> SUPA
CUL --> DONE["DONE"]
BISON --> DONE
HEY --> DONE
SUPA --> DONE
style TRIGGER fill:#0d1117,stroke:#3fb950,color:#3fb950
style GATE1 fill:#0d1117,stroke:#a371f7,color:#a371f7
style PARALLEL_RUN fill:#0d1117,stroke:#39d2c0,color:#39d2c0
style GATE2 fill:#0d1117,stroke:#58a6ff,color:#58a6ff
style GATE3 fill:#0d1117,stroke:#a371f7,color:#a371f7
style EMAIL_WF fill:#0d1117,stroke:#d2991d,color:#d2991d
style CLEAN fill:#0d1117,stroke:#58a6ff,color:#58a6ff
style COLD_EMAIL fill:#0d1117,stroke:#a371f7,color:#a371f7
style DEST fill:#0d1117,stroke:#3fb950,color:#3fb950
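The reach percentages and per-row costs in the diagram combine into a blended cost per inbound webhook row. The sketch below works through that arithmetic under stated assumptions: the per-row figures are read as US dollars, the `Stage` type and function name are illustrative, and the email waterfall is handled separately as a worst-case bound because provider hit rates and the Blitz cost are still open.

```typescript
// Illustrative type: one paid step, the share of inbound rows that reach it,
// and its cost for each row that does.
interface Stage {
  name: string;
  reach: number;      // fraction of inbound rows reaching this stage
  costPerRow: number; // assumed to be US dollars
}

// Figures copied from the flow diagram; email waterfall intentionally excluded.
const FIXED_STAGES: Stage[] = [
  { name: "Job title AI", reach: 1.0, costPerRow: 0.00014 },
  { name: "Employment verification", reach: 0.65, costPerRow: 0.004 },
  { name: "Company enrichment", reach: 0.65, costPerRow: 0.008 },
  { name: "Competitor AI", reach: 0.4, costPerRow: 0.0001 },
  { name: "Company name cleaning AI", reach: 0.3, costPerRow: 0.00014 },
  { name: "Cold email AI", reach: 0.3, costPerRow: 0.00016 },
];

// Expected spend per inbound row: sum of reach x cost over all stages.
export function blendedCostPerInboundRow(stages: Stage[]): number {
  return stages.reduce((sum, s) => sum + s.reach * s.costPerRow, 0);
}

// Worst case for the waterfall: every provider is called and billed for
// each of the ~30% of rows that reach it (AI Ark + Kitt + Icypeas + LeadMagic).
export const worstCaseEmailCost = 0.3 * (0.001 + 0.01 + 0.02 + 0.03);

export const fixedCost = blendedCostPerInboundRow(FIXED_STAGES);
```

With these inputs the non-email stages come to about $0.0081 per inbound row, dominated by the parallel phase (0.65 × $0.012 = $0.0078), and the waterfall adds at most about $0.0183 before any Blitz cost.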
10.2 Trigger.dev Task Structure (TypeScript prototype)
// trigger.ts — parent task
// Helpers (qualifyJobTitle, verifyEmployment, enrichCompany, analyzeCompetitor,
// findEmail, cleanCompanyName, personalizeEmail, resolveAuthor, pushTo*) and the
// COUNTRY_LOOKUP / TARGET_COUNTRIES constants are assumed to be defined elsewhere.
import { task } from "@trigger.dev/sdk";

export const processHeyDigitalBatch = task({
  id: "process-heydigital-batch",
  retry: { maxAttempts: 3 },
  run: async (payload: { rows: HeyDigitalWebhookRow[] }, { ctx }) => {
    for (const row of payload.rows) {
      // GATE 1: Job title qualification (always)
      const { score } = await qualifyJobTitle(row.jobTitle);
      if (score < 5) {
        await pushToSupabase(row, { status: "disqualified_title" });
        continue; // ~35% of rows stop here
      }

      // PARALLEL: employment + company enrichment
      // empResult (stillAtCompany, currentTitle) is fetched but not gated on yet.
      const [empResult, companyResult] = await Promise.all([
        verifyEmployment(row.linkedinUrl), // Apify harvestapi
        enrichCompany(row.companyName), // Serper + Firecrawl
      ]);

      // GATE 2: Geo + headcount (the lookup is a free, deterministic table)
      const country = COUNTRY_LOOKUP[companyResult.location];
      const { employees } = companyResult;
      if (!TARGET_COUNTRIES.has(country) || employees < 10 || employees > 300) {
        await pushToSupabase(row, { status: "disqualified_geo" });
        continue; // ~25% of rows stop here
      }

      // GATE 3: Competitor analysis
      const { isProspect } = await analyzeCompetitor(companyResult);
      if (!isProspect) {
        await pushToSupabase(row, { status: "disqualified_competitor" });
        continue; // ~10% of rows stop here
      }

      // EMAIL WATERFALL: AI Ark -> Blitz -> Kitt -> Icypeas -> LeadMagic
      // TODO: handle the "no email found" branch shown in the flow diagram.
      const email = await findEmail({
        fullName: `${row.firstName} ${row.lastName}`,
        domain: companyResult.domain,
        linkedinUrl: row.linkedinUrl,
      });

      // CLEANING: keep only the first word of a multi-word first name
      const nameParts = row.firstName.trim().split(/\s+/);
      const cleanFirstName = nameParts.length > 1
        ? nameParts[0].replace(/[^a-zA-Z]/g, "")
        : row.firstName;
      const cleanCompany = await cleanCompanyName(companyResult.companyName);

      // COLD EMAIL (one merged call, not two)
      const { firstLine, companyType } = await personalizeEmail({
        authorName: resolveAuthor(row.postAuthorLinkedinUrl),
        postText: row.postText,
        companyName: cleanCompany,
        companyDescription: companyResult.description,
        products: companyResult.products,
      });

      // PUSH TO ALL DESTINATIONS (remaining fields elided in the prototype)
      await Promise.all([
        pushToCUL({ firstName: cleanFirstName, lastName: row.lastName, email, company: cleanCompany /* ... */ }),
        pushToBison({ email, campaignId: process.env.CAMPAIGN_ID }),
        pushToHeyReach({ firstName: cleanFirstName, lastName: row.lastName, linkedinUrl: row.linkedinUrl /* ... */ }),
        pushToSupabase(row, { status: "qualified", firstLine, companyType, email }),
      ]);
    }
  },
});

// Trigger: HeyDigital webhook -> batch of 20 rows
// `batch` and `httpEndpoint` are prototype placeholders for the webhook-ingestion
// layer, not confirmed SDK exports; map them to the real API before building.
export const heydigitalWebhook = batch({
  id: "heydigital-webhook",
  trigger: httpEndpoint(),
  batch: { maxItems: 20, maxWaitMs: 5000 },
  run: async (items: HeyDigitalWebhookRow[]) => {
    await processHeyDigitalBatch.trigger({ rows: items });
  },
});
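The prototype calls `findEmail` without showing how the waterfall falls through from one provider to the next. A provider-agnostic sketch follows; the `Provider` shape, the `findEmailWaterfall` name, and the syntactic email check are assumptions for illustration, and the real provider clients (AI Ark, Kitt, Icypeas, LeadMagic) would have to be wrapped to fit this interface.

```typescript
type EmailLookup = { fullName: string; domain: string; linkedinUrl: string };
type EmailHit = { email: string; provider: string };

// Hypothetical adapter shape: each provider resolves to an email or null.
type Provider = {
  name: string;
  find: (query: EmailLookup) => Promise<string | null>;
};

// Cheap syntactic check only; it does not prove deliverability.
function isPlausibleEmail(value: string): boolean {
  return /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value);
}

// Try providers in order (cheapest first) and stop at the first plausible hit.
export async function findEmailWaterfall(
  query: EmailLookup,
  providers: Provider[],
): Promise<EmailHit | null> {
  for (const provider of providers) {
    try {
      const email = await provider.find(query);
      if (email && isPlausibleEmail(email)) {
        return { email, provider: provider.name }; // keep the provider tag
      }
    } catch {
      // A provider error is treated as a miss so the next provider still runs.
    }
  }
  return null; // corresponds to the "Email found? No" branch in the diagram
}
```

Ordering the `providers` array by per-row cost keeps later, more expensive lookups from being billed once an earlier provider has already returned a usable address.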
10.3 Open Questions Before Build
Need clarification on:
- Blitz: What is this step in the email waterfall? Is it a specific enrichment provider, an internal tool, or a quick email pattern check? In the diagram it sits between AI Ark and the first paid provider (Kitt).
- AI Ark cost: How many credits does export-one consume per person enrichment? 33K credits suggests ~$330 if $0.01/credit, but we need the exact per-lookup cost.
- Firecrawl plan: Which Firecrawl tier are we on? Free tier (500 credits) or paid? Need to confirm per-page cost for company About page scraping.
- harvestapi (Apify) account: Do we have a paid Apify plan, or should we set one up? On the free tier the rate is $0.004/profile with monthly limits. Need to confirm the tier for volume estimates.
- Supabase table schema: Need the social_post_bank table created. What fields are required beyond postUrl + postText?
- Batch size tuning: Is 20 rows per batch optimal? Trade-off: larger batches = lower Trigger.dev costs but more memory + longer execution time per run.