Clay Workbook Deep-Dive: HeyDigital ABM Engagers

Workspace 206846 · Workbook wb_0tfqyhki4hnphtPTmom · Analyzed 2026-06-10 · Source: Clay CLI schema extraction
Contents
  1. Workbook Overview & Architecture
  2. Main Table: 12-Phase Pipeline
  3. All Tables & How They Stitch
  4. Clay Cost Analysis Per Row
  5. Trigger.dev Migration Plan
  6. Trigger.dev Pricing: Real Numbers
  7. Clay vs Trigger.dev: Total Cost Comparison
  8. Migration Roadmap
  9. Corrected Cost Model: Gated + Flex API + Prompt Caching
  10. Workflow Prototype: Build-Ready

1. Workbook Overview & Architecture

What This Workbook Does

This is a LinkedIn engagement → enriched lead → personalized email → multi-channel push pipeline. HeyDigital sends LinkedIn post engagement data via webhook. The workbook extracts the engager identity, hunts for their email via three providers, validates the best result, verifies they still work at the company, qualifies their job title authority, enriches their company profile, filters by geography and headcount, runs competitor analysis, cleans the data, generates a cold email opening line, and pushes the lead to four destinations simultaneously: LeadGrow CRM, Bison (email campaigns), HeyReach (LinkedIn outreach), and a social post content bank.

Data Flow at 10,000 ft
HeyDigital Webhook → Table 1: Main Pipeline (61 cols · 12 phases)
Table 1 → LeadGrow CUL API · Bison Campaign API · HeyReach Campaign
Table 1 → Table 2: Social Post Bank (via Route Row action)
Supporting tables (separate workflows, same workspace):
Table 3: Webhook table for company data enrichment · Table 4: Headcount enrichment for unknown companies · Table 5: Missing company info gap-fill

Key Stats

Metric | Value
Tables in workspace | 33 (5 active for this pipeline)
Main table columns | 61
AI (Claygent) calls per row | 10-12 (conditional, path-dependent)
External API calls per row | 4-7 (Icypeas, Kitt, LeadMagic, EmailGuard, LeadGrow, HeyReach)
Email providers in waterfall | 3 (Kitt → Icypeas → LeadMagic, per the conditionalRunFieldIds chain)
Output destinations | 4 (CUL, Bison, HeyReach, Social Post Bank)
AI models used | gpt-4.1-mini (6 calls), gpt-4o-mini (4 calls)
Total actions (column types) | 21 enrichment actions + 75 formulas + 1 source + 2 date

2. Main Table: The 12-Phase Pipeline

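Table t_0tfqyhkneq3VAFhoM6F ("HeyDigital ABM engagers") is the heart of the system. Each row represents one LinkedIn post engager flowing through a linear enrichment chain with conditional branching (waterfall logic on email find/validate). Here is every phase, in the order the table lays them out; the conditional gates noted below can make the actual run order differ (for example, the email waterfall waits on the Competitor Qualification gate):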

Phase 1. INGEST: HeyDigital Webhook
Receives raw LinkedIn engagement payload: profile URL, company name, job title, post engagement data. Source column feeds the entire pipeline.
Columns: Webhook (source) | Extract: firstName · lastName · jobTitle · companyName · linkedinUrl · postUrl · postText

Phase 2. EMAIL FIND: 3-Provider Waterfall
Attempts to find work email using fullName + domain. Waterfall logic: each provider only runs if previous providers did not return a valid result. Conditional chain: CompetitorQualification, then Kitt, then Icypeas, then LeadMagic.
Columns: Icypeas Find Email (icypeas-find-email-v2) | Kitt HTTP API (POST api.trykitt.ai/job/find_email) | LeadMagic Find Email (leadmagic-find-work-email)

Phase 3. EMAIL VALIDATION
Validates each provider email result using LeadMagic validate-email endpoint. Only validates safe emails (onlySafe=true). Waterfall: if higher-priority provider validated, skips lower.
Columns: Validate Icypeas result | Validate Kitt result | Validate LeadMagic result

Phase 4. EMAIL CONSOLIDATION
Consolidates the best valid email from all providers. Tags which data provider won. Produces final email address.
Columns: Work Email (Enterprise): waterfall consolidation formula | Data Provider tag (which provider won) | Final email

Phase 5. EMPLOYMENT VERIFICATION: Claygent AI
AI agent (gpt-4o-mini, claygent mode) navigates LinkedIn profile. Checks if contact still works at last known company. Extracts: currentCompanyName, currentJobTitle, companyLinkedinUrl, companyDomain, employmentDateRange. Outputs: stillAtCompany (bool), subStatus (still_at_last_company / changed_company / open_to_work). Conditional: only runs if job title qualification score warrants it.
Columns: Claygent: LinkedIn Employment Verification (gpt-4o-mini) | Extract: Current Company LinkedIn URL | Extract: Current Company Domain | Extract: Current Company Name

Phase 6. JOB TITLE QUALIFICATION: AI Scoring
GPT-4.1-mini scores job title 1-10 for marketing decision-making authority. 1-4: Not qualified · 5-7: Maybe · 8-10: Qualified. Outputs: score, 20-word justification, qualification tier.
Columns: AI: Job Title Qualification Prompt (gpt-4.1-mini) | OUTPUT: Qualification tier

Phase 7. COMPANY ENRICHMENT: 4 Claygent Calls
Deep company research. Finds verified LinkedIn company URL, extracts domain, pulls full company profile, discovers location. Fallback chain: employment data first, then Claygent search.
Columns: Claygent: Find LinkedIn Company URL (verifies domain match) | Final Company LinkedIn URL (fallback: employment data → Claygent) | Claygent: Extract Domain from LinkedIn About section (gpt-4o-mini) | domain (consolidated) | Claygent: Full LinkedIn Company Info, 10 fields (gpt-4o-mini) | Claygent: Find Location from Website (gpt-4o-mini) | Official Company Location (fallback: website → LinkedIn)

Phase 8. GEOGRAPHY + HEADCOUNT FILTERS
Converts location to country acronym via AI. Checks against target country whitelist (US, UK, CA, AU, DE, FR, ES, IT, NL, SE, CH, SA, AE, GB). Extracts employee count. Combined filter: 10-300 employees AND target country.
Columns: AI: Location → Country Acronym (gpt-4.1-mini) | Country Match (whitelist check) | Employees Count | Employee and Country Check (10-300 AND target country)

Phase 9. COMPETITOR ANALYSIS: AI Qualification
GPT-4.1-mini scores whether company is a competitor (SEO/digital marketing agency) or a prospect. 1-4: Agency (competitor) · 5-7: Maybe · 8-10: Qualified prospect (non-agency). Conditional: only runs if Employee and Country Check passes.
Columns: AI: Competitor Analysis (products+description → score, gpt-4.1-mini) | Competitor Qualification

Phase 10. DATA CLEANING + SECURITY
Cleans names (strips titles, credentials, emojis), cleans company names (removes Inc/LLC, creates acronyms), checks email host (EmailGuard API), flags problematic MX providers (Barracuda/Mimecast/Proofpoint).
Columns: AI: Clean Full Name, strips PhD, P.Eng, emojis (gpt-4.1-mini) | Extract: clean first name · clean last name | AI: Clean Company Name, removes suffixes, creates acronyms (gpt-4.1-mini) | Extract: Clean Company Name | EmailGuard: Email Host Lookup (POST app.emailguard.io) | Email Security MX Check (Barracuda/Mimecast/Proofpoint)

Phase 11. COLD EMAIL PERSONALIZATION: AI Copywriting
GPT-4.1-mini generates a Will Allred-style conversational opening sentence referencing the specific LinkedIn post they engaged with. Classifies company type in 2-3 words of buyer language. Maps LinkedIn usernames to real author names (Adam Robinson, Neil Patel, Emilia Moller, Anna York). Has a parallel "v2" enrichment with refined prompt. Applies pluralization rules for grammar.
Columns: Author Name (LinkedIn username → real name mapping) | AI: Cold Email Personalization v1 (gpt-4.1-mini) | AI: Cold Email Personalization v2, refined (gpt-4.1-mini) | Extract: First Line (opening sentence, up to 20 words) | Extract: Company Type (2-3 word buyer classification) | Pluralization Rule (grammar helper)

Phase 12. DESTINATION OUTPUTS: 4 Simultaneous Pushes
Final enriched lead is pushed to all destinations. CUL creates/updates the lead in LeadGrow CRM. Bison attaches to an email campaign. HeyReach adds to a LinkedIn outreach campaign. Route Row sends post content to the Social Post Bank for future reference.
Columns: CUL: POST send.leadgrow.ai/api/leads/create-or-update | Bison: POST send.leadgrow.ai/api/campaigns/{id}/leads/attach-leads | HeyReach: Add Lead to Campaign (campaignId=290267) | Route Row → Social Post Bank (t_0tfqyiguHZvcGoypZJV)
Key Conditional Logic: The email waterfall and much of the downstream enrichment is conditional on the Competitor Qualification gate. If a lead scores as "not qualified" (they are a competing agency), many later steps are skipped — saving significant per-row costs. The email find waterfall uses conditionalRunFieldIds to chain: CompQual blocks Kitt, Kitt blocks Icypeas, Icypeas blocks LeadMagic.
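The job title and competitor phases share the same 1-10 scale and tier boundaries, and the gate skips downstream work for the lowest tier. A minimal sketch of that mapping follows. The names `tierFor` and `passesGate` are hypothetical, and treating "Maybe" as passing the gate is an assumption drawn from the description above, not from the Clay schema.

```typescript
// Tier boundaries as described for the scoring phases: 1-4, 5-7, 8-10.
type Tier = "not_qualified" | "maybe" | "qualified";

function tierFor(score: number): Tier {
  if (!Number.isInteger(score) || score < 1 || score > 10) {
    throw new RangeError(`score must be an integer from 1 to 10, got ${score}`);
  }
  if (score <= 4) return "not_qualified";
  if (score <= 7) return "maybe";
  return "qualified";
}

// Assumption: only the lowest tier blocks downstream steps.
function passesGate(score: number): boolean {
  return tierFor(score) !== "not_qualified";
}

// Usage: decide whether the email waterfall should run for a scored lead.
const competitorScore = 6;
if (passesGate(competitorScore)) {
  console.log(`tier=${tierFor(competitorScore)}: run email waterfall`);
} else {
  console.log("skip email waterfall and downstream enrichment");
}
```

Keeping the thresholds in one function means the job title gate and the competitor gate cannot drift apart when the rubric changes.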

3. All Tables & How They Stitch Together

3.1 Table Map

Table | ID | Cols | Role | Connected To
HeyDigital ABM engagers | t_0tfqyhkneq3VAFhoM6F | 61 | Main pipeline: processes each LinkedIn engager | Routes to Social Post Bank; pushes to CUL, Bison, HeyReach
Social Post Bank | t_0tfqyiguHZvcGoypZJV | 12 | Post content archive: receives routed rows | Receives from Main via Route Row action; has Hey Digital Relevance AI enrichment
Pull in data from a Webhook Table | t_0tfr7d8dF5mnzfnpzzZ | 19 | Separate company enrichment entry point | Independent webhook → LinkedIn Company Info → HTTP API output
Enrich 22K unknowns - headcount | t_0tfnv6iGPatnmaWKaTC | 40 | Bulk headcount enrichment for companies missing data | Likely feeds results back to main table or CRM
Enrich missing company information | t_0tcxgijYsmrT5seFVrr | 36 | Gap-fill enrichment for incomplete company profiles | Likely feedback loop to main table

3.2 Stitching Mechanism

How Tables Connect

Primary stitch: The Main table Send table data action (Route Row, Clay Labs package) sends Post Text + Post URL to the Social Post Bank table (t_0tfqyiguHZvcGoypZJV). This is a Clay-native table-to-table routing — no external API needed. The Social Post Bank then enriches with "Hey Digital Relevance" AI scoring.

Secondary stitch (inferred): The enrichment tables (headcount, missing company info) likely feed into the Main table via Clay Lookup or Import from Table actions, pulling enriched company data back into the main pipeline. The "Enrich missing company information" workbook contains multiple tables that appear to be a sub-pipeline for company data.

External stitches: The 4 destination pushes all leave Clay via HTTP API calls — to LeadGrow API, Bison campaigns, and HeyReach. These are the exit points from Clay into the rest of the LeadGrow stack.

Webhook entry: Table 3 ("Pull in data from a Webhook Table") is a separate entry point with its own webhook source. It enriches company data and outputs via HTTP API — possibly used to backfill company info that the main pipeline references.

4. Clay Cost Analysis Per Row

Clay Credit Model (approximate)

Clay charges credits per enrichment action. Pricing varies by action type and provider. The credit costs below are approximations based on Clay's published pricing and provider tiers:

Action Type | Count per Row | Credits Each | Subtotal
Icypeas Find Email | 1 (conditional) | ~8 | 0-8
Kitt HTTP API (find email) | 1 (conditional) | ~2 | 0-2
LeadMagic Find Email | 1 (conditional) | ~6 | 0-6
LeadMagic Validate Email ×3 | 0-3 (conditional) | ~3 | 0-9
Claygent AI (gpt-4o-mini) ×4 | 4 | ~3-5 | 12-20
Use AI (gpt-4.1-mini) ×6 | 6 | ~1-2 | 6-12
EmailGuard HTTP API | 1 | ~2 | 2
Route Row (Send table data) | 1 | ~1 | 1
HeyReach Add to Campaign | 1 | ~3 | 3
LeadGrow CUL (HTTP API) | 1 | ~2 | 2
LeadGrow Bison (HTTP API) | 1 | ~2 | 2
TOTAL (full path) | | | ~28-67 credits
TOTAL (avg, waterfall skips 2 finders) | | | ~40-50 credits

Clay cost per row (at ~1c/credit): ~$0.40-0.50
Clay cost per 1,000 rows: ~$400-500
Clay cost per 10,000 rows (monthly): ~$4,000-5,000
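The totals row can be sanity-checked by summing the Subtotal column line by line. The sketch below does that with the figures from the table and converts credits to dollars at the ~1c/credit rate assumed above. `CreditRange` and `sumRanges` are names introduced for illustration only.

```typescript
// [min, max] credits for one line of the table.
type CreditRange = [number, number];

// Subtotal column of the table above, in the same row order.
const subtotals: Array<[string, CreditRange]> = [
  ["Icypeas Find Email", [0, 8]],
  ["Kitt HTTP API (find email)", [0, 2]],
  ["LeadMagic Find Email", [0, 6]],
  ["LeadMagic Validate Email x3", [0, 9]],
  ["Claygent AI (gpt-4o-mini) x4", [12, 20]],
  ["Use AI (gpt-4.1-mini) x6", [6, 12]],
  ["EmailGuard HTTP API", [2, 2]],
  ["Route Row (Send table data)", [1, 1]],
  ["HeyReach Add to Campaign", [3, 3]],
  ["LeadGrow CUL (HTTP API)", [2, 2]],
  ["LeadGrow Bison (HTTP API)", [2, 2]],
];

// Add the minimums and maximums independently.
function sumRanges(rows: Array<[string, CreditRange]>): CreditRange {
  return rows.reduce<CreditRange>(
    (acc, [, range]) => [acc[0] + range[0], acc[1] + range[1]],
    [0, 0],
  );
}

// ~1 cent per credit, as assumed in the text.
const DOLLARS_PER_CREDIT = 0.01;

function dollarsFor(credits: number, rows: number): string {
  return (credits * DOLLARS_PER_CREDIT * rows).toFixed(2);
}

const [minCredits, maxCredits] = sumRanges(subtotals);
console.log(`full path: ${minCredits}-${maxCredits} credits per row`);
console.log(`avg path, 1,000 rows: $${dollarsFor(40, 1000)}-$${dollarsFor(50, 1000)}`);
```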

Hidden Clay Costs

Claygent actions are expensive. Each Claygent call (the AI agent that navigates LinkedIn for employment verification, company URL finding, domain extraction, company info, and location) costs 3-5 credits minimum. This workbook has 5 Claygent calls per row — that is 15-25 credits just for AI web browsing. The gpt-4.1-mini calls are cheaper but there are 6 of them.

Waterfall email finding: The conditional logic saves costs (only 1-2 of 3 providers run), but Clay still charges for the attempt even if it yields no result.

Credit rounding: Clay rounds up fractional credits. A 0.1 credit action costs 1 credit. This is most punitive for the many small AI calls in this workbook.

Duplicate charges: The Cold Email Personalization v1 and v2 run BOTH — producing two AI-generated opening lines per row. v1 result appears unused; only v2 feeds into First Line and Company Type. That is a wasted AI call per row.

5. Trigger.dev Migration Plan

Architecture Decision: DAG of Tasks, Not One Monolith

Trigger.dev is built on event-driven task execution with retries, idempotency, and observable state. Each Clay "column" becomes a Trigger.dev task step or independent subtask. The 12-phase pipeline becomes a parent task that orchestrates child tasks, with each phase as a discrete retryable unit.

The key insight: Clay columns are just sequential code with field references. In Trigger.dev, this is TypeScript with variables — no graph resolution engine needed. The formula expressions become inline code. Conditional branching becomes if statements. The waterfall becomes explicit try/catch chains.
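As an illustration of the waterfall-as-code idea, here is a minimal sketch of a sequential provider chain that stops at the first valid result. The `EmailFinder` type, `findEmailWaterfall`, and the stub providers are hypothetical; real provider clients and the validation call would be injected in their place.

```typescript
// A finder resolves to an email, or null when the provider has no result.
type EmailFinder = (fullName: string, domain: string) => Promise<string | null>;

interface Provider {
  name: string;
  find: EmailFinder;
}

interface WaterfallResult {
  email: string;
  provider: string; // which provider won, for the "Data Provider" tag
}

// Try providers in priority order; stop at the first validated email.
async function findEmailWaterfall(
  providers: Provider[],
  isValid: (email: string) => Promise<boolean>,
  fullName: string,
  domain: string,
): Promise<WaterfallResult | null> {
  for (const { name, find } of providers) {
    try {
      const email = await find(fullName, domain);
      if (email !== null && (await isValid(email))) {
        return { email, provider: name };
      }
    } catch (err) {
      // A provider error counts as "no result" so the next provider still runs.
      console.warn(`${name} failed, falling through to the next provider`, err);
    }
  }
  return null; // every provider missed
}

// Usage with stub providers (placeholders, not real API clients).
const stubProviders: Provider[] = [
  { name: "first", find: async () => null },
  { name: "second", find: async (_n, d) => `jane.doe@${d}` },
];

findEmailWaterfall(stubProviders, async () => true, "Jane Doe", "example.com")
  .then((result) => console.log(result));
```

Passing the provider list as data keeps the priority order in one place, so reordering the chain is a one-line change rather than a rewrite of nested conditionals.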

5.1 Trigger.dev Task Graph

Trigger.dev Task Graph — Replacement Architecture
[TRIGGER] HeyDigital webhook -> POST /api/triggers/heydigital-engager
[PARENT TASK] processHeyDigitalEngager
  Step 1: Parse webhook -> extract firstName, lastName, jobTitle, companyName, linkedinUrl, postUrl, postText
  Step 2: [CHILD TASK] findEmail
    Waterfall: Kitt API -> Icypeas API -> LeadMagic API (stop on first valid result; same priority order as the Clay conditional chain)
  Step 3: [CHILD TASK] validateEmail
    Validate each found email via LeadMagic, pick best
  Step 4: Consolidate email + tag winning provider
  Step 5: [AI TASK] verifyEmployment
    Serper search LinkedIn + scrape profile + gpt-4o-mini extract: stillAtCompany, currentCompany, domain
  Step 6: [AI TASK] qualifyJobTitle (gpt-4.1-mini)
  Step 7: [AI TASKS] enrichCompany
    7a: Find verified LinkedIn company URL (Serper + scrape + domain match)
    7b: Extract domain from LinkedIn About section
    7c: Pull full company info (10 fields from LinkedIn)
    7d: Find location from company website
  Step 8: Geo/headcount filter (deterministic code)
    Country acronym: JSON lookup table (NO AI needed!)
    Country whitelist check
    Employee count range check (10-300)
  Step 9: [AI TASK] competitorAnalysis (gpt-4.1-mini)
  Step 10: Data cleaning
    AI: Clean full name (gpt-4.1-mini)
    AI: Clean company name (gpt-4.1-mini)
    EmailGuard: host lookup + MX check
  Step 11: [AI TASK] personalizeColdEmail (gpt-4.1-mini)
    Generate first line + company type. ONE call, not two (v1 is unused in Clay)
  Step 12: [OUTPUT TASKS] push to destinations
    CUL: POST send.leadgrow.ai/api/leads/create-or-update
    Bison: POST attach to campaign
    HeyReach: POST add lead to campaign
    Supabase: INSERT post to social_post_bank table

5.2 Key Architectural Differences from Clay

Clay Pattern | Trigger.dev Equivalent
Column formula with field references | TypeScript variable assignments. No graph resolution, just sequential code.
Conditional action execution (conditionalRunFieldIds) | if statements + early returns. More readable, easier to debug.
Waterfall enrichment (try A, if fails try B) | Explicit try/catch + sequential API calls. Trigger.dev retry config handles transient failures.
Claygent AI web browsing | Replace with Serper.dev for search + Spider Cloud or ScrapingBee for scraping + OpenAI for extraction. No Claygent platform markup.
Credit-based cost model | Pay for actual API usage only (OpenAI tokens, Serper queries, scraping credits). No platform margin on top.
Route Row (table-to-table) | INSERT into Supabase/Postgres. Same data model, zero vendor lock-in.
Formula language (Clay formulas) | TypeScript. More expressive, testable with Jest, versionable in git.
Error handling | Per-step retries with exponential backoff, dead letter queues, alerting. Clay: column turns red, manual fix.

5.3 Replacements for Clay-Specific Features

Clay Feature | Replacement | Cost Impact
Icypeas Find Email | Icypeas API directly (bypass Clay ~4x markup) | ~75% cheaper per lookup
LeadMagic Find + Validate Email | LeadMagic API directly | ~60% cheaper
Kitt (via Clay HTTP API) | Kitt API directly | No change (already direct API call in Clay)
Claygent (AI web browsing) | Serper.dev + OpenAI gpt-4o-mini + fetch/scrape | ~80% cheaper (no Claygent markup)
Route Row to Social Post Bank | Supabase INSERT | Free (Supabase is already in stack)
Country acronym generation (AI) | Deterministic lookup table (200 lines of JSON) | Free: no AI call needed
Author name mapping (formula) | TypeScript Map/switch | Free: no formula engine needed
Pluralization rule (formula) | TypeScript pluralize npm package | Free: pure code, tested once
Biggest Win: The country acronym AI call is completely unnecessary. Clay uses GPT-4.1-mini to convert "San Francisco, CA" -> "US". That is a deterministic lookup that costs $0 in Trigger.dev vs. 1-2 Clay credits per row. Same for the author name mapping (4 hardcoded LinkedIn URL -> name mappings in a Clay formula — trivial switch statement in TS). And the pluralization rule is a lodash endsWith check implemented as a Clay formula — an npm package in JS.
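To make the deterministic replacement concrete, here is a minimal sketch of a location-to-country lookup combined with the 10-300 headcount and whitelist check. The lookup table is a small illustrative subset, the last-segment parsing rule is an assumption, and the names `REGION_TO_COUNTRY`, `countryFor`, and `passesGeoHeadcount` are hypothetical.

```typescript
// Illustrative subset; a production table would cover every country, state, and province.
// Keys are the upper-cased last comma-separated segment of the location string.
const REGION_TO_COUNTRY: Record<string, string> = {
  CA: "US", // read as California here; ambiguous with Canada, so real data needs a rule
  NY: "US",
  TX: "US",
  "UNITED STATES": "US",
  "UNITED KINGDOM": "GB",
  ENGLAND: "GB",
  GERMANY: "DE",
  FRANCE: "FR",
  ONTARIO: "CA",
};

// Target whitelist as listed in the geography filter phase.
const TARGET_COUNTRIES = new Set([
  "US", "UK", "CA", "AU", "DE", "FR", "ES", "IT", "NL", "SE", "CH", "SA", "AE", "GB",
]);

// Map "City, Region" to a country acronym; undefined when the region is unknown.
function countryFor(location: string): string | undefined {
  const lastSegment = location.split(",").pop()?.trim().toUpperCase() ?? "";
  return REGION_TO_COUNTRY[lastSegment];
}

// Combined filter: target country AND 10-300 employees.
function passesGeoHeadcount(location: string, employees: number): boolean {
  const country = countryFor(location);
  return (
    country !== undefined &&
    TARGET_COUNTRIES.has(country) &&
    employees >= 10 &&
    employees <= 300
  );
}

console.log(countryFor("San Francisco, CA"));
console.log(passesGeoHeadcount("Berlin, Germany", 120));
```

Unknown locations return `undefined` rather than a guess, so unmatched rows can be logged and the table extended instead of silently misclassified.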

6. Trigger.dev Pricing: Real Numbers

Trigger.dev Plans (as of mid-2026)

Plan | Price | Runs/Month | Concurrency | Environments | Overage
Hobby | Free | 1,000 | 1 | 1 (dev only) | N/A
Pro | $20/mo | 10,000 | 100 | 1 | $0.002/run
Team | $100/mo | 100,000 | 1,000 | 5 | $0.001/run
Enterprise | Custom | Unlimited | Unlimited | Unlimited | Negotiated

A "run" = one task execution. Our parent task + ~5 child tasks per row = ~6 runs per lead processed. Child tasks count against the run quota. At 10,000 leads/month, that is ~60,000 total runs. On the Team plan, the 100K included runs cover this with room to spare. On Pro, only 10K runs are included, so the remaining 50K are overage: 50,000 x $0.002 = $100 extra.
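The run-quota arithmetic can be written down as a small function so it is easy to re-run for other volumes. This is a sketch using the plan figures from the table above; the `Plan` shape and function names are introduced here for illustration.

```typescript
interface Plan {
  name: string;
  monthlyFee: number;    // USD
  includedRuns: number;  // runs covered by the monthly fee
  overagePerRun: number; // USD per run beyond the included quota
}

// Figures copied from the plan table above.
const PRO: Plan = { name: "Pro", monthlyFee: 20, includedRuns: 10_000, overagePerRun: 0.002 };
const TEAM: Plan = { name: "Team", monthlyFee: 100, includedRuns: 100_000, overagePerRun: 0.001 };

// Every lead produces one parent run plus its child runs.
function monthlyRuns(leads: number, runsPerLead: number): number {
  return leads * runsPerLead;
}

// Monthly fee plus overage on runs beyond the included quota.
function platformCost(plan: Plan, runs: number): number {
  const overageRuns = Math.max(0, runs - plan.includedRuns);
  return plan.monthlyFee + overageRuns * plan.overagePerRun;
}

const runsAt10k = monthlyRuns(10_000, 6);
for (const plan of [PRO, TEAM]) {
  console.log(`${plan.name}: ${runsAt10k} runs -> $${platformCost(plan, runsAt10k).toFixed(2)}/mo`);
}
```

Running this for a few lead volumes shows where the cheaper plan flips, which is worth checking before committing to a tier.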

6.1 Per-Row Cost Breakdown — Trigger.dev Version

Cost Component | Per Row | Notes
Trigger.dev runs (~6 tasks/row, Team plan) | $0.006 | 6 tasks x $0.001/run (marginal overage cost)
OpenAI gpt-4.1-mini (6 calls x ~1K tokens) | $0.004 | $0.60/1M input + $2.40/1M output tokens
OpenAI gpt-4o-mini (1-2 calls x ~2K tokens) | $0.001 | Cheaper model, used for company extraction
Serper.dev search (5-6 queries for Claygent replacement) | $0.005 | ~$0.001/query at scale
Spider Cloud / ScrapingBee (2-3 pages scraped) | $0.006 | ~$0.002/page for LinkedIn + company website scraping
Icypeas email lookup | $0.02 | Direct API pricing (vs Clay ~$0.08)
LeadMagic find + validate | $0.03 | Direct API pricing (vs Clay ~$0.09)
Kitt email lookup | $0.01 | Direct API, same cost since already direct in Clay
EmailGuard host lookup | $0.003 | Direct API
LeadGrow CUL + Bison API | $0.00 | Own infrastructure, already running
HeyReach API | $0.01 | Direct API
Supabase INSERT | $0.00 | Already in stack, marginal cost zero
TOTAL per row | ~$0.095 |

Trigger.dev total per row: ~$0.10
Per 1,000 rows: ~$95
Per 10,000 rows (monthly): ~$950
Savings vs Clay per row: ~76-81%
What drives the 76-81% savings:
  1. No Clay platform margin on enrichment actions (Clay marks up API calls 3-5x)
  2. No Claygent tax: Claygent is Clay's most expensive feature. Replacing it with Serper + scraping + OpenAI is ~80% cheaper
  3. Deterministic code replaces AI — country acronym lookup, author name mapping, pluralization rule all cost $0 in code vs 3+ credits in Clay
  4. Direct provider APIs — Icypeas/LeadMagic are 60-75% cheaper when called directly without Clay surcharge
  5. Eliminated duplicate AI call — Cold Email Personalization v1 runs and its output is never used. That is 1-2 credits wasted per row
  6. Trigger.dev pricing — at $0.001/run on Team plan, the platform cost is negligible compared to API costs (~6% of total)

7. Clay vs Trigger.dev: Total Cost Comparison

7.1 Monthly Cost at Scale

Volume (leads/month) | Clay Cost | Trigger.dev Cost | Monthly Savings | Annual Savings
1,000 | $400 – $500 | $95 + $20 (Pro plan) = $115 | $285 – $385 | $3,420 – $4,620
5,000 | $2,000 – $2,500 | $475 + $20 = $495 | $1,505 – $2,005 | $18,060 – $24,060
10,000 | $4,000 – $5,000 | $950 + $20 = $970 | $3,030 – $4,030 | $36,360 – $48,360
50,000 | $20,000 – $25,000 | $4,750 + $100 (Team) + ~$150 overage = $5,000 | $15,000 – $20,000 | $180,000 – $240,000
100,000 | $40,000 – $50,000 | $9,500 + $100 + ~$400 overage = $10,000 | $30,000 – $40,000 | $360,000 – $480,000

7.2 Non-Cost Considerations

Advantages of Staying on Clay

Advantages of Migrating to Trigger.dev

8. Migration Roadmap

Phase 1: Infrastructure Setup (Week 1)

  1. Create Trigger.dev project in existing LeadGrow org (use lg-trigger CLI for project scaffolding)
  2. Set up provider API accounts — Icypeas direct API key, LeadMagic direct, Serper.dev, Spider Cloud or ScrapingBee, EmailGuard direct
  3. Build SDK wrappers — thin TypeScript wrappers for Icypeas, LeadMagic, Kitt, EmailGuard, Serper, Spider Cloud
  4. Create Supabase table social_post_bank to replace the Social Post Bank Clay table
  5. Set up environments — dev/staging/production in Trigger.dev with separate API keys

Phase 2: Build Parent Task (Week 1-2)

  1. Implement processHeyDigitalEngager — parent task with all 12 phases as inline steps
  2. Email waterfall: Kitt -> Icypeas -> LeadMagic (matching the Clay conditional chain) with explicit try/catch, stop on first valid result
  3. Company enrichment — Serper search LinkedIn -> scrape profile -> gpt-4o-mini extract fields
  4. Deterministic helpers — country acronym lookup, author name mapping, pluralization (npm:pluralize)
  5. AI tasks — job title qualification, competitor analysis, name cleaning, company cleaning, cold email personalization
  6. Output tasks — CUL, Bison, HeyReach pushes + Supabase INSERT
  7. Write tests — unit tests for each phase with mock API responses. Integration test for full pipeline.

Phase 3: Parallel Validation (Week 2-3)

  1. Duplicate HeyDigital webhook — send to BOTH Clay and Trigger.dev simultaneously (webhook proxy or HeyDigital config)
  2. Run 500-1,000 rows through both — collect outputs from both pipelines
  3. Diff comparison:
    • Email find rate: what % of rows get a valid email? Is Trigger.dev rate same or better?
    • Email quality: do the same emails get found? If different, which is better?
    • Company enrichment: do LinkedIn company URLs match? Is domain extraction accurate?
    • AI quality: are first lines equally good? Are job title qualifications consistent?
    • Cost tracking: measure actual per-row API costs vs estimates
  4. Edge case testing — non-Latin names, non-English company descriptions, missing LinkedIn profiles, personal posts (should output NA)
  5. Fix discrepancies — iterate on prompts, scraping approach, fallback logic until Trigger.dev output >= Clay output quality
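The email portion of the diff comparison can be sketched as two small functions: one for find rate and one that buckets rows by how the two pipelines agree. Rows are matched on `linkedinUrl`, which is an assumption about the join key. `PipelineRow`, `findRate`, and `diffEmails` are hypothetical names.

```typescript
// One output row from either pipeline; email is null when nothing was found.
interface PipelineRow {
  linkedinUrl: string;
  email: string | null;
}

interface EmailDiff {
  sameEmail: number;
  differentEmail: number;
  onlyClay: number;
  onlyTrigger: number;
  neither: number;
}

// Share of rows that ended up with an email.
function findRate(rows: PipelineRow[]): number {
  if (rows.length === 0) return 0;
  return rows.filter((r) => r.email !== null).length / rows.length;
}

// Compare the two pipelines row by row, matching on LinkedIn URL.
function diffEmails(clay: PipelineRow[], trigger: PipelineRow[]): EmailDiff {
  const triggerByUrl = new Map(trigger.map((r) => [r.linkedinUrl, r.email] as const));
  const diff: EmailDiff = { sameEmail: 0, differentEmail: 0, onlyClay: 0, onlyTrigger: 0, neither: 0 };
  for (const row of clay) {
    const a = row.email?.toLowerCase() ?? null;
    const b = triggerByUrl.get(row.linkedinUrl)?.toLowerCase() ?? null;
    if (a !== null && b !== null) {
      if (a === b) diff.sameEmail++;
      else diff.differentEmail++;
    } else if (a !== null) {
      diff.onlyClay++;
    } else if (b !== null) {
      diff.onlyTrigger++;
    } else {
      diff.neither++;
    }
  }
  return diff;
}
```

The `differentEmail` bucket is the one that needs manual review, since neither pipeline can be assumed correct when they disagree.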

Phase 4: Gradual Cutover (Week 3-4)

  1. Feature-flag destinations — switch CUL/Bison/HeyReach pushes from Clay to Trigger.dev ONE at a time with monitoring
  2. Day 1-2: Supabase only — Trigger.dev pushes to Supabase post bank, Clay handles CRM pushes
  3. Day 3-4: Add CUL — Trigger.dev handles CUL + Supabase, Clay handles Bison + HeyReach
  4. Day 5-6: Add Bison — Trigger.dev handles CUL + Bison + Supabase, Clay handles HeyReach
  5. Day 7: Full cutover — Trigger.dev handles all 4 destinations
  6. Monitor for 1 week — with both pipelines still receiving data, confirm Trigger.dev outputs are production-quality
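The day-by-day schedule above can be encoded as a single table that both pipelines consult before pushing, so exactly one of them owns each destination on any given day. This is a sketch with hypothetical names (`TRIGGER_START_DAY`, `ownerFor`, `shouldPush`); how the flag reaches Clay is left open.

```typescript
type Destination = "supabase" | "cul" | "bison" | "heyreach";
type Pipeline = "clay" | "trigger";

// First cutover day on which Trigger.dev owns each destination, per the schedule above.
const TRIGGER_START_DAY: Record<Destination, number> = {
  supabase: 1,
  cul: 3,
  bison: 5,
  heyreach: 7,
};

// Which pipeline owns a destination on a given cutover day.
function ownerFor(destination: Destination, cutoverDay: number): Pipeline {
  return cutoverDay >= TRIGGER_START_DAY[destination] ? "trigger" : "clay";
}

// Guard to call before every push, from either pipeline.
function shouldPush(pipeline: Pipeline, destination: Destination, cutoverDay: number): boolean {
  return ownerFor(destination, cutoverDay) === pipeline;
}

// Usage: on day 4, Trigger.dev pushes to CUL and Supabase; Clay keeps Bison and HeyReach.
const day = 4;
for (const dest of ["supabase", "cul", "bison", "heyreach"] as const) {
  console.log(`day ${day}: ${dest} -> ${ownerFor(dest, day)}`);
}
```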

Phase 5: Archive Clay (Week 4)

  1. Stop Clay webhook — redirect HeyDigital exclusively to Trigger.dev
  2. Export Clay data — pull all records from Clay tables as JSON/CSV backup
  3. Export Clay schemas — already done (see files below). Archive for reference.
  4. Pause (do not delete) Clay workbook — keep for 30 days as fallback. Delete only after a full billing cycle with zero issues.
  5. Document the migration — write post-mortem: what was easy, what was hard, what Clay did better, what Trigger.dev did better

Migration Risks (ranked by severity)

  1. Claygent replacement is the hardest part. Claygent LinkedIn browsing uses BrowserBase under the hood with proxy rotation and auth management. Replacing this with Serper + scraping requires careful handling of LinkedIn anti-bot measures. Mitigation: Use Spider Cloud or ScrapingBee for LinkedIn scraping (they handle proxies). Budget 2-3 weeks just for this component.
  2. Icypeas direct API may have different rate limits outside of Clay enterprise agreement. Mitigation: Contact Icypeas for direct API pricing and limits before migration.
  3. The email waterfall logic is subtly complex — conditionalRunFieldIds create a specific priority chain (CompetitorQualification blocks Kitt, Kitt blocks Icypeas, Icypeas+Clay blocks LeadMagic). Overlooking one condition means running providers that should be skipped. Mitigation: Write integration tests that verify each waterfall branch.
  4. Clay formula engine handles null/undefined gracefully with fallback operators (||, optional chaining). TypeScript requires explicit null checks. Mitigation: Use TypeScript optional chaining (?.) and nullish coalescing (??) — the syntax maps directly to Clay formula patterns.
  5. Cost spike risk — Clay bills are predictable (credits). Trigger.dev + direct API costs are usage-based and can spike if a webhook floods or a scraping service is over-called. Mitigation: Set usage alerts on all API accounts. Add rate limiting on the webhook endpoint.
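For the null-handling risk in item 4, the sketch below contrasts `??` and `||` on a fallback chain like the consolidated domain field. The `LeadRow` shape and function names are hypothetical. The point is that `??` falls back only on null or undefined, while `||` also falls back on empty strings and 0, so the choice should be made per field.

```typescript
interface EmploymentData {
  companyDomain?: string | null;
  employees?: number | null;
}

interface LeadRow {
  employment?: EmploymentData | null;
  claygentDomain?: string | null;
}

// "??" falls back only on null/undefined, so an empty string is kept as-is.
function consolidatedDomain(lead: LeadRow): string | null {
  return lead.employment?.companyDomain ?? lead.claygentDomain ?? null;
}

// "||" also skips empty strings, which suits fields where a blank cell means "missing".
function consolidatedDomainSkippingBlanks(lead: LeadRow): string | null {
  return lead.employment?.companyDomain || lead.claygentDomain || null;
}

// For counts, 0 is a real value, so "??" is the safer operator.
function employeesOr(lead: LeadRow, fallback: number): number {
  return lead.employment?.employees ?? fallback;
}

const blankDomainLead: LeadRow = {
  employment: { companyDomain: "", employees: 0 },
  claygentDomain: "acme.com",
};
console.log(JSON.stringify(consolidatedDomain(blankDomainLead)));
console.log(JSON.stringify(consolidatedDomainSkippingBlanks(blankDomainLead)));
console.log(employeesOr(blankDomainLead, 50));
```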

Files Generated for This Analysis

C:\Users\mitch\Everything_CC\temp\clay-workbook-analysis\table_main_schema.json (165 KB — 61 columns)
C:\Users\mitch\Everything_CC\temp\clay-workbook-analysis\table_routed_target_schema.json (20 KB — Social Post Bank)
C:\Users\mitch\Everything_CC\temp\clay-workbook-analysis\table_webhook_source_schema.json (32 KB — Webhook Table)
C:\Users\mitch\Everything_CC\temp\clay-workbook-analysis\table_enrich_company_schema.json (68 KB — Enrich Company)
C:\Users\mitch\Everything_CC\temp\clay-workbook-analysis\table_enrich_headcount_schema.json (71 KB — Enrich Headcount)
C:\Users\mitch\Everything_CC\temp\clay-workbook-analysis\index.html (this page)

All schemas extracted via clay pull --table [id] from leadgrow-clay-cli. Analysis performed 2026-06-10. Clay workspace 206846, workbook wb_0tfqyhki4hnphtPTmom. No records were available (table was empty or API-filtered at extraction time).

Next step: Deploy this page to Cloudflare Pages for team review.

9. Corrected Cost Model — Gated + Flex API + Prompt Caching

Key corrections applied:

9.1 AI Costs — Flex API + Prompt Caching

Prompt caching strategy

Every prompt follows this structure to maximize caching:

<!-- CACHED (80% of tokens, 50% discount) -->
System prompt + Instructions + Banned words + Scoring rubric + Examples

<!-- UNCACHED (20% of tokens) -->
Variable 1: {{job_title}}
Variable 2: {{company_name}}
Variable 3: {{post_text}}
Variable 4: {{company_description}}
  

Flex API pricing (gpt-4.1-mini): $0.30/1M input, $0.15/1M cached input, $1.20/1M output.
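The cached/uncached layout above amounts to keeping every static token at the front of the prompt and appending per-row variables last, because prefix-based caching only helps when the leading tokens are identical across calls. A minimal sketch follows. The prompt text is placeholder content based on the job title rubric described earlier, not the workbook's actual prompt, and `buildMessages` is a hypothetical helper.

```typescript
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Static part: identical for every row, so it forms the cacheable prefix.
// Placeholder text, not the workbook's actual prompt.
const STATIC_PROMPT = [
  "You score job titles from 1 to 10 for marketing decision-making authority.",
  "Scoring rubric: 1-4 not qualified, 5-7 maybe, 8-10 qualified.",
  "Return a score and a justification of at most 20 words.",
].join("\n");

// Variable part: per-row values, always placed after the static prefix.
function buildMessages(vars: Record<string, string>): ChatMessage[] {
  const variableBlock = Object.entries(vars)
    .map(([name, value]) => `${name}: ${value}`)
    .join("\n");
  return [
    { role: "system", content: STATIC_PROMPT },
    { role: "user", content: variableBlock },
  ];
}

const rowA = buildMessages({ job_title: "VP Marketing", company_name: "Acme" });
const rowB = buildMessages({ job_title: "Intern", company_name: "Globex" });
console.log(rowA[0].content === rowB[0].content); // the shared prefix is byte-identical
```

Provider-side caching usually has a minimum prompt length (OpenAI documents 1,024 tokens), so prompts in the ~400 to ~1,000 token range may need a longer static prefix before any cached-input discount applies.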

AI Call | Input Tokens | Output Tokens | Cost (cached) | Runs on
Job Title Qualification | ~600 (80% cached) | ~40 | $0.00014 | 100% of rows
Competitor Analysis | ~400 (80% cached) | ~40 | $0.00010 | 40% of rows
Clean Company Name | ~800 (80% cached) | ~50 | $0.00014 | 30% of rows
Cold Email Personalization | ~1000 (80% cached) | ~30 | $0.00016 | 30% of rows
TOTAL AI (amortized per row) | | | $0.00028 |
AI is noise. With Flex + caching, all 4 AI calls combined cost less than 3 hundredths of a cent per row. The real costs are the enrichment APIs.

9.2 Full Amortized Cost Per Row

Cost Line | Per Row (ungated) | % of Rows That Hit | Amortized
harvestapi employment verification | $0.004 | 65% | $0.0026
Serper (4-5 Google searches) | $0.005 | 65% | $0.0033
Firecrawl (2-3 pages scraped) | $0.003 | 65% | $0.0020
AI Ark person enrichment | ~$0.001 | 30% | $0.0003
Email waterfall (Kitt → Icypeas → LeadMagic) | $0.035 | 30% | $0.0105
All AI calls (Flex + cached) | $0.00028 | varies | $0.0003
Trigger.dev (batched 20x) | $0.002 | 100% | $0.0020
TOTAL | | | $0.021

Per-row amortized cost: ~$0.02
Per 10,000 leads/month: ~$210
Savings vs Clay ($4,000-5,000/mo): ~95%
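The amortized column is each ungated per-row cost multiplied by the share of rows that reach that step. The sketch below reproduces the total from the figures in the table. The AI line is entered with a hit rate of 1 because its per-row figure is already amortized. `CostLine` and `amortizedPerRow` are illustrative names.

```typescript
interface CostLine {
  name: string;
  perRow: number;  // USD, ungated
  hitRate: number; // share of rows that reach this step (0-1)
}

// Figures from the table above.
const costLines: CostLine[] = [
  { name: "harvestapi employment verification", perRow: 0.004, hitRate: 0.65 },
  { name: "Serper (4-5 Google searches)", perRow: 0.005, hitRate: 0.65 },
  { name: "Firecrawl (2-3 pages scraped)", perRow: 0.003, hitRate: 0.65 },
  { name: "AI Ark person enrichment", perRow: 0.001, hitRate: 0.3 },
  { name: "Email waterfall", perRow: 0.035, hitRate: 0.3 },
  { name: "All AI calls (already amortized)", perRow: 0.00028, hitRate: 1 },
  { name: "Trigger.dev (batched 20x)", perRow: 0.002, hitRate: 1 },
];

// Expected cost per incoming row, weighting each step by how often it runs.
function amortizedPerRow(lines: CostLine[]): number {
  return lines.reduce((sum, line) => sum + line.perRow * line.hitRate, 0);
}

const perRow = amortizedPerRow(costLines);
console.log(`amortized per row: $${perRow.toFixed(4)}`);
console.log(`per 10,000 leads: $${(perRow * 10_000).toFixed(0)}`);
```

Because the email waterfall dominates the total, the gate pass rates ahead of it matter more to the monthly bill than any of the AI prices.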

10. Workflow Prototype — Build-Ready

Architecture Decisions

10.1 Mermaid Flow Diagram

flowchart TD
  subgraph TRIGGER["TRIGGER (always free)"]
    WH["HeyDigital Webhook\n→ Trigger.dev HTTP endpoint"] --> PARSE["Parse payload\nExtract: firstName, lastName,\njobTitle, companyName,\nlinkedinUrl, postUrl, postText"]
  end
  subgraph GATE1["GATE 1: Job Title (100% of rows)"]
    PARSE --> JT_AI["AI: Job Title Qualification\nGPT-4.1-mini, Flex API, CACHED\nCost: $0.00014/row\nOutput: score 1-10 + tier"]
  end
  JT_AI --> JT_CHECK{"score >= 5?"}
  JT_CHECK -->|"No (score 1-4)\n~35% of rows"| PUSH_EARLY["DISQUALIFIED\n→ Push to Supabase only"]
  JT_CHECK -->|"Yes (Qualified/Maybe)\n~65% of rows"| PARALLEL["PARALLEL EXECUTION"]
  subgraph PARALLEL_RUN["Parallel Phase (~65% of rows)"]
    EV["Employment Verification\nApify harvestapi\n$0.004/row\nChecks: stillAtCompany? currentTitle?"]
    CE["Company Enrichment\nSerper: search LinkedIn URL, domain\nFirecrawl: scrape About page + website\nExtract: industry, employees, location, description\nCost: $0.008/row"]
  end
  PARALLEL --> EV
  PARALLEL --> CE
  EV --> MERGE["Merge Results"]
  CE --> MERGE
  subgraph GATE2["GATE 2: Geo + Headcount (~65% of rows)"]
    MERGE --> COUNTRY["Country Lookup TABLE\nLocation → Acronym\nNO AI: deterministic JSON\nCost: FREE"]
    COUNTRY --> GEO_CHECK{"Target country?\n(US,UK,CA,AU,DE,FR,ES,IT,NL,SE,CH,SA,AE,GB)\nAND 10-300 employees?"}
  end
  GEO_CHECK -->|"No\n~25% of rows"| PUSH_GEO["DISQUALIFIED\n→ Push to Supabase only"]
  GEO_CHECK -->|"Yes\n~40% of rows"| GATE3
  subgraph GATE3["GATE 3: Competitor Analysis (~40% of rows)"]
    COMP_AI["AI: Competitor Analysis\nGPT-4.1-mini, Flex API, CACHED\nCost: $0.00010/row\nScores: is this a digital agency?\n1-4 = agency, 5-7 = maybe, 8-10 = prospect"]
  end
  COMP_AI --> COMP_CHECK{"Is prospect?\n(score >= 5, not agency)"}
  COMP_CHECK -->|"No (competitor agency)\n~10% of rows"| PUSH_COMP["DISQUALIFIED\n→ Push to Supabase only"]
  COMP_CHECK -->|"Yes (qualified prospect)\n~30% of rows"| EMAIL_WF
  subgraph EMAIL_WF["EMAIL WATERFALL (~30% of rows)"]
    direction LR
    AI_ARK["AI Ark\nPerson enrichment\n(33K credits)\n~$0.001/row"] --> BLITZ["Blitz\n[NEED CLARIFICATION]"]
    BLITZ --> KITT["Kitt API\n~$0.01/row"]
    KITT -->|"no valid email"| ICY["Icypeas API\n~$0.02/row"]
    ICY -->|"no valid email"| LM["LeadMagic API\n~$0.03/row"]
    LM --> EMAIL_RESULT["Best email + provider tag"]
  end
  KITT -->|"valid email"| EMAIL_RESULT
  ICY -->|"valid email"| EMAIL_RESULT
  EMAIL_RESULT --> EMAIL_CHECK{"Email found?"}
  EMAIL_CHECK -->|"No"| PUSH_NOEMAIL["NO EMAIL\n→ Push to destinations"]
  EMAIL_CHECK -->|"Yes"| CLEAN
  subgraph CLEAN["DATA CLEANING (~30% of rows)"]
    FN["First Name: REGEX\nsplit on space, take[0]\nOnly if >1 word\nCost: FREE"]
    CN["Company Name: AI\nGPT-4.1-mini, CACHED\nStrip Inc/LLC, create acronym\nCost: $0.00014/row"]
  end
  FN --> COLD_EMAIL
  CN --> COLD_EMAIL
  subgraph COLD_EMAIL["COLD EMAIL (~30% of rows)"]
    CE_AI["AI: Cold Email Personalization\nGPT-4.1-mini, CACHED\nOutput: firstLine (20 words)\n+ companyType (2-3 words)\nCost: $0.00016/row\nONE merged call, not two"]
  end
  COLD_EMAIL --> DEST
  subgraph DEST["DESTINATIONS (always free)"]
    direction LR
    CUL["LeadGrow CUL\ncreate-or-update\nCost: FREE"]
    BISON["EmailBison\nattach to campaign\nCost: FREE"]
    HEY["HeyReach\nadd to campaign\nCost: FREE"]
    SUPA["Supabase\nsocial_post_bank\nCost: FREE"]
  end
  PUSH_EARLY --> DEST
  PUSH_GEO --> DEST
  PUSH_COMP --> DEST
  PUSH_NOEMAIL --> DEST
  COLD_EMAIL --> CUL
  COLD_EMAIL --> BISON
  COLD_EMAIL --> HEY
  COLD_EMAIL --> SUPA
  CUL --> DONE["DONE"]
  BISON --> DONE
  HEY --> DONE
  SUPA --> DONE
  style TRIGGER fill:#0d1117,stroke:#3fb950,color:#3fb950
  style GATE1 fill:#0d1117,stroke:#a371f7,color:#a371f7
  style PARALLEL_RUN fill:#0d1117,stroke:#39d2c0,color:#39d2c0
  style GATE2 fill:#0d1117,stroke:#58a6ff,color:#58a6ff
  style GATE3 fill:#0d1117,stroke:#a371f7,color:#a371f7
  style EMAIL_WF fill:#0d1117,stroke:#d2991d,color:#d2991d
  style CLEAN fill:#0d1117,stroke:#58a6ff,color:#58a6ff
  style COLD_EMAIL fill:#0d1117,stroke:#a371f7,color:#a371f7
  style DEST fill:#0d1117,stroke:#3fb950,color:#3fb950

10.2 Trigger.dev Task Structure (TypeScript prototype)

// trigger.ts: parent task (prototype)
import { task } from "@trigger.dev/sdk";

// One row of the HeyDigital webhook payload, as consumed below.
interface HeyDigitalWebhookRow {
  firstName: string;
  lastName: string;
  jobTitle: string;
  companyName: string;
  linkedinUrl: string;
  postUrl: string;
  postText: string;
  postAuthorLinkedinUrl: string;
}

// The step helpers (qualifyJobTitle, verifyEmployment, enrichCompany, analyzeCompetitor,
// findEmail, cleanCompanyName, personalizeEmail, resolveAuthor, pushToCUL, pushToBison,
// pushToHeyReach, pushToSupabase) and the COUNTRY_LOOKUP / TARGET_COUNTRIES constants
// are not defined in this prototype and still need to be implemented.

export const processHeyDigitalBatch = task({
  id: "process-heydigital-batch",
  retry: { maxAttempts: 3 },

  run: async (payload: { rows: HeyDigitalWebhookRow[] }) => {
    for (const row of payload.rows) {
      // GATE 1: Job title qualification (always)
      const { score } = await qualifyJobTitle(row.jobTitle);
      if (score < 5) {
        await pushToSupabase(row, { status: "disqualified_title" });
        continue; // 35% stop here
      }

      // PARALLEL: employment + company enrichment
      // (empResult is collected but not yet used by any gate in this prototype)
      const [empResult, companyResult] = await Promise.all([
        verifyEmployment(row.linkedinUrl),      // Apify harvestapi
        enrichCompany(row.companyName),          // Serper + Firecrawl
      ]);

      // GATE 2: Geo + headcount
      const country = COUNTRY_LOOKUP[companyResult.location]; // free
      if (!TARGET_COUNTRIES.has(country) || companyResult.employees < 10 || companyResult.employees > 300) {
        await pushToSupabase(row, { status: "disqualified_geo" });
        continue; // 25% stop here
      }

      // GATE 3: Competitor analysis
      const { isProspect } = await analyzeCompetitor(companyResult);
      if (!isProspect) {
        await pushToSupabase(row, { status: "disqualified_competitor" });
        continue; // 10% stop here
      }

      // EMAIL WATERFALL
      const email = await findEmail({
        fullName: row.firstName + " " + row.lastName,
        domain: companyResult.domain,
        linkedinUrl: row.linkedinUrl,
      }); // AI Ark -> Blitz -> Kitt -> Icypeas -> LeadMagic

      // CLEANING: keep only the first word of a multi-word first name
      const nameParts = row.firstName.trim().split(/\s+/);
      const cleanFirstName = nameParts.length > 1
        ? nameParts[0].replace(/[^\p{L}]/gu, "") // strip non-letters but keep non-Latin letters
        : row.firstName;
      const cleanCompany = await cleanCompanyName(companyResult.companyName);

      // COLD EMAIL (merged call, not two)
      const { firstLine, companyType } = await personalizeEmail({
        authorName: resolveAuthor(row.postAuthorLinkedinUrl),
        postText: row.postText,
        companyName: cleanCompany,
        companyDescription: companyResult.description,
        products: companyResult.products,
      });

      // PUSH TO ALL DESTINATIONS (remaining CUL and HeyReach fields are elided here)
      await Promise.all([
        pushToCUL({ firstName: cleanFirstName, lastName: row.lastName, email, company: cleanCompany /* ... */ }),
        pushToBison({ email, campaignId: process.env.CAMPAIGN_ID }),
        pushToHeyReach({ firstName: cleanFirstName, lastName: row.lastName, linkedinUrl: row.linkedinUrl /* ... */ }),
        pushToSupabase(row, { status: "qualified", firstLine, companyType, email }),
      ]);
    }
  },
});

// Trigger: HeyDigital webhook rows -> batches of 20, one parent run per batch.
// The HTTP receiver that collects rows (and the 5-second max wait from the
// original sketch) is assumed to live outside this file.
const BATCH_SIZE = 20;

export async function enqueueHeyDigitalRows(rows: HeyDigitalWebhookRow[]) {
  for (let i = 0; i < rows.length; i += BATCH_SIZE) {
    await processHeyDigitalBatch.trigger({ rows: rows.slice(i, i + BATCH_SIZE) });
  }
}

10.3 Open Questions Before Build

Need clarification on:

  1. Blitz: What is this in the email waterfall? Is it a specific enrichment provider, an internal tool, or a quick email pattern check? This sits between AI Ark and the waterfall.
  2. AI Ark cost: How many credits does export-one consume per person enrichment? 33K credits suggests ~$330 if $0.01/credit, but need exact per-lookup cost.
  3. Firecrawl plan: Which Firecrawl tier are we on? Free tier (500 credits) or paid? Need to confirm per-page cost for company About page scraping.
  4. harvestapi (Apify) account: Do we have a paid Apify plan, or should we set one up? At free tier, $0.004/profile with monthly limits. Need to confirm tier for volume estimates.
  5. Supabase table schema: Need the social_post_bank table created. What fields beyond postUrl + postText?
  6. Batch size tuning: Is 20 rows per batch optimal? Trade-off: larger batches = lower Trigger.dev costs but more memory + longer execution time per run.