Scaling highly personalized outbound with AI.

Build a pipeline to enrich leads with LinkedIn data and generate highly personalized cold emails with AI for under $5.

Illustration of envelopes flowing from a contact card, representing outbound email outreach.

Everyone talks about “AI-powered personalization at scale.”

Nobody actually does it.

Why? Because the tools are expensive—until you look closer at what you actually get.

If you Google “cold email personalization tools,” you’ll see the same trio over and over: Clay, Instantly.ai, Apollo.

  • Clay: Handles enrichment and message generation, but no email sending—you’ll need another tool. For 1,000 leads with enrichment + AI, expect to consume 200,000-350,000 credits, requiring $800+/month minimum. Then add sending costs on top.

  • Instantly.ai: Originally a cold email inbox manager and sender, later added AI message generation and lead enrichment as separate products. Outreach starts at $37/month, but SuperSearch (lead database/enrichment) starts at $197/month—you’ll need both for the full suite, totaling $234+/month minimum.

  • Apollo Professional: Has all features (enrichment, message generation, sending) in one platform at $79/month, but AI quality is template-based and requires heavy editing. You get exactly 1,000 export credits/month—no room for error.

You’re looking at $79-$234+/month for a complete solution, and costs scale quickly as you need more credits, mailboxes, or higher-tier plans.

If you’re just starting out, that’s a lot. Here’s the alternative:

Build it yourself for under $5 per 1,000 leads.

That’s what we’re building: a simple Python script you run locally.

  • Input: a CSV of leads (validated emails + LinkedIn URLs)
  • Enrich: scrape profile context via Bright Data
  • Generate: highly personalized cold emails in minutes

The code for this post is in this repo: GitHub. I’ll keep it updated as we extend the pipeline in future posts.


Part 1: What you need (local)

  • Python 3.12+ and uv (recommended) or pip
  • A leads CSV with: email, first_name, last_name, profile_url
  • API keys (Bright Data + whichever LLM you pick)
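
For reference, a minimal leads file with those four columns might look like this (the rows are placeholders):

```csv
email,first_name,last_name,profile_url
jane@acme.example,Jane,Doe,https://www.linkedin.com/in/jane-doe/
sam@widgets.example,Sam,Lee,https://www.linkedin.com/in/sam-lee/
```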

Install deps once:

uv init && uv venv && uv add requests pandas python-dotenv openai
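
Since python-dotenv is in the dependency list, you can keep both API keys in a `.env` file and load them at startup. A minimal sketch; the variable names are my own convention, not anything the APIs require:

```python
import os

# Load .env if python-dotenv is installed; keys can also come from the shell.
try:
    from dotenv import load_dotenv
    load_dotenv()
except ImportError:
    pass

def require_env(name: str) -> str:
    """Fail fast with a clear message instead of a cryptic HTTP 401 later."""
    value = os.getenv(name, "")
    if not value:
        raise SystemExit(f"Missing required environment variable: {name}")
    return value
```

Call `require_env("BRIGHTDATA_API_KEY")` and `require_env("OPENAI_API_KEY")` once at the top of the script so a missing key fails immediately, not mid-batch.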

Part 2: Lead enrichment with LinkedIn data

This post assumes you already have a lead list with validated emails and LinkedIn profile links.

The goal is straightforward: turn each LinkedIn URL into usable context (role, company, bio, experience) so your personalization has something real to stand on.

Bright Data LinkedIn scraper

Bright Data’s LinkedIn scraper bypasses blocks, handles CAPTCHAs, and delivers structured data. Their free trial gives you a sandbox to test.

Pricing: Pay per successful result, starting at $1.50 per 1K records (pay-as-you-go), about $0.0015 per profile.

Data delivered: Name, current role, company, location, experience history, education, about section, follower count—you name it.

Formats: JSON, CSV. All suitable for downstream processing.

Example workflow:

  1. Prepare a CSV with LinkedIn profile URLs
  2. Submit to Bright Data via API or control panel
  3. Poll the endpoint until results are ready
  4. Download as JSON/CSV
  5. Process locally

import requests
import json
import time

class BrightDataClient:
    """Thin wrapper around Bright Data Scrapers Library trigger + snapshot APIs."""

    def __init__(self, api_key: str, dataset_id: str) -> None:
        self.api_key = api_key
        self.dataset_id = dataset_id
        self.headers = {
            "Authorization": f"Bearer {self.api_key}",
            "Content-Type": "application/json",
        }

    def submit(self, urls: list[str]):
        """
        Submit LinkedIn profile URLs via the Scrapers Library trigger API.
        Returns either results (if synchronous) or a snapshot_id (if async).
        """
        trigger_url = "https://api.brightdata.com/datasets/v3/trigger"
        params = {
            "dataset_id": self.dataset_id,
            "include_errors": "true",
            "format": "json",
        }
        payload = [{"url": url} for url in urls]

        resp = requests.post(
            trigger_url,
            headers=self.headers,
            params=params,
            json=payload,
            timeout=60,
        )
        resp.raise_for_status()
        data = resp.json()

        # API can respond synchronously (list) or async (dict with snapshot_id)
        if isinstance(data, list):
            return data

        snapshot_id = data.get("snapshot_id") or data.get("id")
        return snapshot_id

    def poll_results(self, snapshot_id: str):
        """Poll snapshots until results are ready."""
        snapshot_url = f"https://api.brightdata.com/datasets/v3/snapshot/{snapshot_id}"
        params = {"format": "json"}

        while True:
            resp = requests.get(
                snapshot_url,
                headers=self.headers,
                params=params,
                timeout=60,
            )

            # HTTP 202: snapshot still building
            if resp.status_code == 202:
                time.sleep(5)
                continue

            # HTTP 200: snapshot ready
            resp.raise_for_status()
            return resp.json() or []


# Usage
BRIGHTDATA_API_KEY = "your_api_key"
PROFILE_DATASET_ID = "gd_l1viktl72bvl7bjuj0"  # LinkedIn Profiles dataset

client = BrightDataClient(BRIGHTDATA_API_KEY, PROFILE_DATASET_ID)

profile_urls = [
    "https://www.linkedin.com/in/someone/",
    "https://www.linkedin.com/in/someone-else/"
]

submission = client.submit(profile_urls)

# Handle sync or async response
if isinstance(submission, list):
    results = submission
else:
    results = client.poll_results(submission)

# Save as JSON for next step
with open('enriched_profiles.json', 'w') as f:
    json.dump(results, f, indent=2)
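
Bright Data returns fairly nested JSON per profile, so it's worth flattening each record into the handful of fields the personalization prompt will use. The raw key names below are assumptions about the dataset's schema; inspect one record from your own export and adjust:

```python
def flatten_profile(raw: dict) -> dict:
    """Map one raw scraper record to the flat prospect dict used for
    personalization. The keys read from `raw` are assumptions -- verify
    them against an actual record from your dataset."""
    experience = raw.get("experience") or []
    current = experience[0] if experience else {}
    return {
        "first_name": raw.get("first_name", ""),
        "last_name": raw.get("last_name", ""),
        "title": raw.get("position", "") or current.get("title", ""),
        "company": current.get("company", ""),
        "location": raw.get("city", "") or raw.get("location", ""),
        "about": raw.get("about", ""),
        "profile_url": raw.get("url", ""),
    }
```

Run it over the saved JSON once and you have exactly the dict shape Part 3 expects.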

Yes, there are low-cost/free methods using internal LinkedIn APIs or scraping directly from LinkedIn. But those are risky:

  • Can get your account blocked
  • Often slower
  • Usually require logging in with a real LinkedIn account

For production, I’d still recommend Bright Data. At ~$0.0015/profile (pay-as-you-go), it’s negligible compared to the risk, effort, and maintenance of rolling your own scraper.


At this point, you should have:

  • A CSV of leads with validated emails + LinkedIn profile URLs
  • A JSON/CSV export of those LinkedIn profiles from Bright Data

Now we turn that data into messages that actually get replies.

Part 3: AI personalization

Now that you have data, use it to personalize outreach messages. This is what separates “bulk mail” from outreach that actually gets replies.

The idea: use an LLM (Claude, GPT-5.2, etc.) to generate personalized first messages based on the prospect’s LinkedIn about section and job history.

from openai import OpenAI

def generate_personalized_message(prospect_data):
    """
    Use OpenAI to generate a personalized cold email based on prospect data.
    """
    client = OpenAI()  # Requires OPENAI_API_KEY env var

    about = prospect_data.get('about', '') or ''
    prompt = (
        "You are a sales outreach specialist. Craft a concise, highly personalized cold email "
        "that feels written just for this person. Limit to 140-160 words, avoid fluff, and make one clear CTA.\n\n"
        "Use the data below thoughtfully—reference only what is relevant and authentic. If a field is empty, just skip it.\n"
        f"- Name: {prospect_data.get('first_name', '')} {prospect_data.get('last_name', '')}\n"
        f"- Title: {prospect_data.get('title', '')}\n"
        f"- Location: {prospect_data.get('location', '')}\n"
        f"- Education: {prospect_data.get('education', '')}\n"
        f"- About/Bio: {about[:600]}\n"
        f"- Company: {prospect_data.get('company', '')}\n"
        f"- Company about: {prospect_data.get('company_about', '')}\n"
        f"- Company industry: {prospect_data.get('company_industry', '')}\n"
        f"- Company size: {prospect_data.get('company_size', '')}\n"
        f"- Company website: {prospect_data.get('company_website', '')}\n\n"
        "Product: [YOUR_PRODUCT_DESCRIPTION_HERE]\n\n"
        "Structure:\n"
        "1) One-line opener that shows you've actually read their background (title, location, education, or company mission—pick the best hook).\n"
        "2) One-sentence bridge linking their context to your product's specific value (be concrete: metrics, outcomes, or workflow saved).\n"
        "3) One short bullet or micro-example that proves the benefit (no jargon; relevant to media/tech audiences if applicable).\n"
        "4) Close with a single, low-friction CTA (e.g., 10-minute intro this week) and offer to share a tailored example.\n"
        "Keep tone warm, professional, and direct."
    )

    response = client.chat.completions.create(
        model="gpt-5.2",
        max_completion_tokens=400,
        messages=[{"role": "user", "content": prompt}],
    )

    return (response.choices[0].message.content or "").strip()


# Usage
prospect = {
    'first_name': 'John',
    'last_name': 'Doe',
    'title': 'VP Engineering',
    'company': 'TechCorp',
    'location': 'San Francisco, CA',
    'about': 'Building infrastructure. Love DevOps. 10 years in SaaS.',
    'company_about': 'Leading B2B SaaS platform for enterprise workflows',
    'company_industry': 'Software',
    'company_size': '200-500 employees',
}

email_body = generate_personalized_message(prospect)
print(email_body)
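
To run this over the whole enriched list, a small driver that records per-row failures (instead of aborting the batch on one rate limit) is enough. This is a sketch of my own; `generate` is any callable that takes a prospect dict, e.g. `generate_personalized_message` from above:

```python
def personalize_all(prospects: list[dict], generate) -> list[dict]:
    """Attach a generated message to each prospect under custom_field_1.
    A failed API call marks that row instead of killing the batch."""
    rows = []
    for prospect in prospects:
        row = dict(prospect)
        try:
            row["custom_field_1"] = generate(prospect)
            row["error"] = ""
        except Exception as exc:  # rate limits, timeouts, malformed records
            row["custom_field_1"] = ""
            row["error"] = str(exc)
        rows.append(row)
    return rows
```

Rows with a non-empty `error` can be retried in a second pass rather than re-running everything.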

Claude Sonnet 4.5 is $3 per million input tokens and $15 per million output tokens. OpenAI’s GPT-5.2 is $1.75 per million input tokens and $14 per million output tokens.

For a cold email (~300 input tokens, ~150 output), that’s roughly $0.0032 (Sonnet 4.5) or $0.0026 (GPT-5.2) per personalized email — about $3.20 or $2.60 per 1,000 prospects.
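
Those per-email figures fall straight out of the token prices; a quick sanity check:

```python
def cost_per_email(input_tokens: int, output_tokens: int,
                   in_price_per_m: float, out_price_per_m: float) -> float:
    """Dollar cost of one email at the given per-million-token prices."""
    return (input_tokens * in_price_per_m
            + output_tokens * out_price_per_m) / 1_000_000

# ~300 input / ~150 output tokens per email (the estimate used above)
gpt52 = cost_per_email(300, 150, 1.75, 14)  # GPT-5.2 pricing
sonnet = cost_per_email(300, 150, 3, 15)    # Claude Sonnet 4.5 pricing
print(gpt52, sonnet, gpt52 * 1000, sonnet * 1000)
```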


Part 4: Export (generic CSV)

Every outreach tool wants the same thing: a clean UTF-8 CSV, one lead per row.

Export the fields you care about, then map columns during import.

from datetime import datetime
from pathlib import Path

def export_leads_csv(df, output_dir='output'):
    """Export to UTF-8 CSV with timestamped filename."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    # Generate timestamped filename to avoid overwriting
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    output_file = output_dir / f"leads_{timestamp}.csv"

    columns = [
        'email',
        'first_name',
        'last_name',
        'company',
        'title',
        'custom_field_1',  # contains the personalized message
        'profile_url',
        'company_url',
        'company_about',
    ]

    df[columns].to_csv(output_file, index=False, encoding='utf-8')
    print(f"Exported {len(df)} leads to {output_file}")
    return output_file

# Usage
export_leads_csv(df)

Note: When importing into your outreach platform (Instantly, Smartlead, Lemlist, Apollo, etc.), map these columns to match the platform’s required format during the import step.


Part 5: Cost breakdown

Here’s the economics, without pretending it’s magic:

| Component | Cost | Notes |
| --- | --- | --- |
| Bright Data LinkedIn scraping | $1.50 / 1K records | Pay-as-you-go pricing |
| AI personalization | $2.60-$3.20 / 1K | GPT-5.2 or Sonnet 4.5 |
| Total | $4.10-$4.70 / 1K | Bright Data + GPT-5.2 or Sonnet 4.5 |

You’re saving nearly $300/month (or more) by running this pipeline yourself, compared to the typical SaaS stack that can easily hit $300-$1000+/month for similar outbound tooling.


What’s next?

This is part of a series of posts on building a modern cold outreach system:

Coming soon:

I’ll show you how to build a 100% automated appointment booking pipeline, end-to-end:

  • Automatically researching leads from LinkedIn
  • Automatically sending emails + LinkedIn DMs as a complete outbound sequence to get appointments booked

All built with Python, open-source tools, and cheap “pay-per-result” services.


A few final notes

  1. LinkedIn ToS: Bright Data operates legally (no login required, respects robots.txt). The alternative is riskier but free. Choose based on your risk tolerance.

  2. Email list quality: The weakest link in cold email is list quality. No amount of personalization fixes a bad list. Spend time on targeting and list hygiene first.

  3. Deliverability: Even with all this, ~30% of cold emails hit spam folders. Warm up your sending domains, use proper SPF/DKIM/DMARC setup, and test with tools like Mail-tester.

  4. Legally: Always comply with CAN-SPAM (US), GDPR (EU), and local regulations. Include unsubscribe links. Don’t scrape emails without consent where required.
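
On the deliverability note above: SPF and DMARC are plain DNS TXT records on your sending domain. Illustrative values only; the `include:` host depends on your email provider, and the report address is a placeholder:

```text
example.com.          TXT  "v=spf1 include:_spf.google.com ~all"
_dmarc.example.com.   TXT  "v=DMARC1; p=quarantine; rua=mailto:dmarc-reports@example.com"
; DKIM lives under a provider-issued selector, e.g. selector1._domainkey.example.com.
```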

Build smart. Ship fast. And let me know what you build with this.
