
How I Automated My Job Application Process. (Part 2)


Welcome back! In Part 1, I showed you how I built a proof of concept to automate job applications using Python scripts. Now it's time for the fun part - turning those scripts into a proper application.

Here's what I learned: the gap between "it's a working POC" and "it's a real application" is where dreams go to die. But we're going to cross that gap anyway.

From Scripts to System

Those Python scripts evolved into different script types in the application, each handling a specific part of the process. Every job search became a "campaign" with its own pipeline. Here's how it works:

  1. Raw HTML Storage: You dump in raw HTML from job boards
<!-- example HTML for a job listing -->
<article id="article-42478761" class="action-buttons"><a href="/jobsearch/jobposting/42478761?source=searchresults"
            id="ajaxupdateform:j_id_31_3_3p:1:j_id_31_3_3r" class="resultJobItem">
            <h3 class="title">
                <span class="flag">
                    <span class="new">
                        New
                    </span><span class="telework">On site</span><span class="postedonJB">
                        Posted on Job Bank
                        <span class="description"><span class="fa fa-info-circle" aria-hidden="true"></span>This job was
                            posted directly by the employer on Job Bank.</span>
                    </span>

                </span>
                <span class="job-source job-source-icon-16"><span class="wb-inv">Job Bank</span></span>
                <span class="noctitle"> software developer

                </span>
            </h3>

            <ul class="list-unstyled">
                <li class="date">November 08, 2024
                </li>
                <li class="business">OMEGA SOFTWARE SERVICES LTD.</li>
                <li class="location"><span class="fas fa-map-marker-alt" aria-hidden="true"></span> <span
                        class="wb-inv">Location</span>

                    Scarborough (ON)

                </li>
                <li class="salary"><span class="fa fa-dollar" aria-hidden="true"></span>
                    Salary:
                    $50.00 hourly</li>
                <li class="source"><span class="job-source job-source-icon-16"><span class="wb-inv">Job
                            Bank</span></span>
                    <span class="wb-inv">Job number:</span>
                    <span class="fa fa-hashtag" aria-hidden="true"></span>
                    3146897
                </li>
            </ul>
        </a><span id="ajaxupdateform:j_id_31_3_3p:1:favouritegroup" class="float job-action">
            <a href="/login" data-jobid="42478761" class="favourite saveLoginRedirectURI"
                onclick="saveLoginRedirectURIListener(this);">
                <span class="wb-inv">software developer - Save to favourites</span>
            </a></span>
    </article>

2. Initial Cleanup: A script turns that mess into structured JSON like this:

 {
    "job_link": "https://www.jobbank.gc.ca/jobsearch/jobposting/42478761?source=searchresults",
    "job_id": "42478761",
    "job_role": "software developer",
    "employer": "OMEGA SOFTWARE SERVICES LTD.",
    "location": "Scarborough (ON)",
    "work_arrangement": "On site",
    "salary": "$50.00 hourly"
  }
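
To make the cleanup step concrete, here's a minimal sketch of what such a script can look like. This is an illustration, not the actual script from the app: I'm using plain regexes on a trimmed-down copy of the listing above so the sketch has no dependencies (the real cleanup scripts get a proper HTML parser).

```typescript
// A dependency-free sketch of a cleanup script. Field names match the
// cleaned JSON shown above; the regexes are illustrative only.
interface JobSummary {
  job_link: string;
  job_id: string;
  job_role: string;
  employer: string;
  location: string;
  salary: string;
}

function cleanListing(html: string): JobSummary {
  // Helper: first capture group of a pattern, or "" if the field is missing.
  const grab = (re: RegExp) => (html.match(re)?.[1] ?? "").trim();

  const path = grab(/href="([^"]+)"/);
  return {
    job_link: "https://www.jobbank.gc.ca" + path.replace(/&amp;/g, "&"),
    job_id: grab(/id="article-(\d+)"/),
    job_role: grab(/class="noctitle">\s*([^<]+)/),
    employer: grab(/class="business">([^<]+)</),
    location: grab(/class="wb-inv">Location<\/span>\s*([^<]+)/),
    salary: grab(/Salary:\s*([^<]+)</),
  };
}

// Trimmed-down version of the listing markup shown above.
const sample = `<article id="article-42478761"><a href="/jobsearch/jobposting/42478761?source=searchresults">
  <span class="noctitle"> software developer </span>
  <li class="business">OMEGA SOFTWARE SERVICES LTD.</li>
  <li class="location"><span class="wb-inv">Location</span> Scarborough (ON) </li>
  <li class="salary">Salary: $50.00 hourly</li></a></article>`;

const job = cleanListing(sample);
```

The point isn't the regexes (please don't parse HTML with regexes in production); it's that each cleanup script is a small, pure transform from raw markup to the JSON shape above.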

3. Job Fetching: Another script hits each job URL and grabs the full posting (with polite delays between requests because we're not savages)

4. Job Data Cleaning: This script uses AI to turn job postings into clean, structured data including:

  • Contact email

  • Application instructions

  • Full job description in markdown

  • Additional metadata (salary, location, requirements)

{
    "job_id": "42313964",
    "processed_timestamp": "2024-12-25T19:45:39.829Z",
    "original_fetch_timestamp": "2024-12-25T19:40:46.187Z",
    "job_json": {
      "contact_email": "careers@wiasystems.com",
      "application_instructions": "To apply, please send your resume and cover letter to careers@wiasystems.com.",
      "job_posting_text": "# Job Posting\n\n## Job Title: Software Engineer\n\n**Job Description:**\n\n- Education: Bachelor's degree in Computer Science or related field\n- Experience: 2 years to less than 3 years\n- Location: Vancouver, BC\n- Work Arrangement: Hybrid (in-person and remote)\n\n## Job Responsibilities:\n\n- Collect and document user's requirements\n- Coordinate the development, installation, integration and operation of computer-based systems\n- Define system functionality\n- Develop flowcharts, layouts, and documentation to identify solutions\n- Develop process and network models to optimize architecture\n- Develop software solutions by studying systems flow, data usage, and work processes\n- Evaluate the performance and reliability of system designs\n- Evaluate user feedback\n- Execute full lifecycle software development\n- Prepare plan to maintain software\n- Research technical information to design, develop, and test computer-based systems\n- Synthesize technical information for every phase of the cycle of a computer-based system\n- Upgrade and maintain software\n- Lead and coordinate teams of information systems professionals in the development of software and integrated information systems, process control software, and other embedded software control systems\n\n## Required Skills and Qualifications:\n\n- Agile\n- Cloud\n- Development and operations (DevOps)\n- Eclipse\n- Jira\n- Microsoft Visual Studio\n- HTML\n- Intranet\n- Internet\n- XML Technology (XSL, XSD, DTD)\n- Servers\n- Desktop applications\n- Enterprise Applications Integration (EAI)\n- Java\n- File management software\n- Word processing software\n- X Windows\n- Servlet\n- Object-Oriented programming languages\n- Presentation software\n- Mail server software\n- Project management software\n- Programming software\n- SQL\n- Database software\n- Programming languages\n- Software development\n- XML\n- MS Office\n- Spreadsheet\n- Oracle\n- TCP/IP\n- Amazon Web Services (AWS)\n- Git\n- 
Atlassian Confluence\n- GitHub\n- Performance testing\n- Postman\n- Software quality assurance\n- MS Excel\n- MS Outlook\n- MS SQL Server\n\n### Benefits:\n\n- Health benefits: Dental plan, Health care plan, Vision care benefits\n- Other benefits: Learning/training paid by employer, Other benefits, Paid time off (volunteering or personal days)\n\nFor more information about the position and to apply, please send your resume and cover letter to careers@wiasystems.com.",
      "job_posting_link": "https://www.jobbank.gc.ca/jobsearch/jobposting/42313964?source=searchresults",
      "additional_info": {
        "salary": "CAD 60.50 per hour",
        "location": "Vancouver, BC",
        "job_role": "Software Engineer",
        "company_name": "WIA Software Systems Inc.",
        "job_type": "Permanent, Full-time",
        "required_experience": "2 years to less than 3 years",
        "required_education": "Bachelor's degree in Computer Science or related field",
        "language_requirements": "English",
        "work_arrangement": "Hybrid (in-person and remote)"
      }
    },
    "raw_gpt_responce": ""
  }

5. Email Generation: Takes your resume + job data and crafts personalized applications that don't sound like they came from a robot

6. Email Sending: The final step that actually gets your applications out the door

Each campaign is isolated. While a campaign can only run one script at a time (like going from cleanup to fetching to email generation), different campaigns run independently. Think of it like having multiple assembly lines - if one line stops, the others keep humming along. A script breaking in one campaign won't mess with jobs running in another.
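
The one-script-at-a-time rule per campaign can be sketched as a simple stage guard. The type and field names here are my illustration, not the actual schema:

```typescript
// A sketch of the per-campaign pipeline guard: stages run in order, and
// only one script may run per campaign at any moment. Names are illustrative.
type Stage = "cleanup" | "fetch" | "email_generation" | "email_sending";
const ORDER: Stage[] = ["cleanup", "fetch", "email_generation", "email_sending"];

interface Campaign {
  id: string;
  completed: Stage[];    // stages that have finished, in order
  running: Stage | null; // at most one script runs per campaign
}

// Returns the stage that may start next, or null if busy / finished.
function nextRunnable(c: Campaign): Stage | null {
  if (c.running) return null;               // one script at a time
  return ORDER[c.completed.length] ?? null; // next stage in order
}

const a: Campaign = { id: "toronto-dev", completed: ["cleanup"], running: null };
const b: Campaign = { id: "bc-remote", completed: [], running: "cleanup" };
// Campaigns are independent: a can start fetching while b is mid-cleanup.
```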

The Tech Stack

I could tell you I chose each piece of technology after careful consideration of all possible options. But the truth? I went with what I knew would get the job done:

  • Frontend: Next.js with Shadcn for UI components

  • Backend: Node.js with Express (written in TypeScript)

  • Database: MongoDB for the job data

  • Queue System: Redis for background jobs

  • AI Integration: Modular setup supporting multiple providers
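
"Modular setup supporting multiple providers" boils down to coding against an interface instead of one vendor's SDK. This is a sketch of the idea, not the project's actual code; the names are mine:

```typescript
// A sketch of provider-agnostic AI integration: anything satisfying the
// interface can be swapped in. Interface and names are illustrative.
interface AIProvider {
  name: string;
  complete(prompt: string): Promise<string>;
}

// A stand-in provider for testing; a real one would wrap an API client.
class FakeProvider implements AIProvider {
  name = "fake";
  async complete(prompt: string): Promise<string> {
    return `echo: ${prompt}`;
  }
}

async function cleanJobPosting(provider: AIProvider, rawText: string): Promise<string> {
  return provider.complete(`Extract structured data from: ${rawText}`);
}
```

Swapping OpenAI for a local model then becomes a one-line config change instead of a refactor.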

The application lives at jaas.fun (Job Application Automation System - I'm great at names, I know).

The Campaign System

Each campaign in the system is completely isolated. This was crucial because:

  • Different job boards need different scripts

  • Rate limits hit at different times

  • You want to test new approaches without breaking existing ones

The campaign schema tracks everything:

  • Raw HTML from job boards

  • Cleanup scripts

  • Generated JSON

  • Email templates

  • Processing status

Each type of script gets specific functions based on its role:

  • Cleanup scripts: Access to read raw HTML and save cleaned JSON

  • Fetch scripts: Network access to job boards and data storage

  • Email generation scripts: Access to AI models and resume data

  • Email sending scripts: Access to email services and campaign status updates

No script can access functions outside its type - a cleanup script can't send emails, and an email script can't fetch new jobs. It's like giving each worker exactly the tools they need, nothing more.
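
In code, that scoping can be as simple as a whitelist per script type: a helper that isn't on the list never gets injected into the script's sandbox in the first place. The function names below are illustrative:

```typescript
// A sketch of per-script-type capability scoping. Each script type only
// ever sees the helpers on its list; everything else simply isn't injected.
type ScriptType = "cleanup" | "fetch" | "email_generation" | "email_sending";

const CAPABILITIES: Record<ScriptType, string[]> = {
  cleanup: ["readRawHtml", "saveCleanedJson"],
  fetch: ["httpGet", "saveJobData"],
  email_generation: ["callModel", "readResume"],
  email_sending: ["sendEmail", "updateCampaignStatus"],
};

function isAllowed(type: ScriptType, fn: string): boolean {
  return CAPABILITIES[type].includes(fn);
}
```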

The Script Execution System

This is where things get really interesting. Remember how we need to run untrusted code (our cleanup and processing scripts) safely? Enter the script execution system.

Here's how it works:

  1. Each script gets queued in Redis with:

    • Campaign ID

    • Script type (cleanup, fetch, email generation)

    • Script content

    • Input data

  2. A worker process runs continuously, waiting for new jobs. It uses vm2 to create a sandboxed environment for each script. Why? Because running arbitrary JavaScript is dangerous, and I enjoy sleeping at night.

  3. Each script runs in its own sandbox with:

    • A custom console.log that streams to Redis

    • Access to only its input data

    • Complete isolation from the main system

    • No artificial time limits (because processing 100 jobs takes longer than processing 1)
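
The sandbox idea looks roughly like this. I'm using Node's built-in `vm` module here so the sketch runs anywhere; the real system uses vm2, which provides much stronger isolation (built-in `vm` is famously escapable and not a security boundary on its own):

```typescript
import vm from "node:vm";

// A sketch of sandboxed script execution with a captured console.
// Built-in `vm` stands in for vm2 here; don't treat it as real isolation.
function runScript(code: string, input: unknown): { result: unknown; logs: string[] } {
  const logs: string[] = [];
  const sandbox = {
    input, // the script sees only its input data
    console: {
      // In the real system this pushes to Redis; here we just collect.
      log: (msg: unknown) => logs.push(`${new Date().toISOString()} ${String(msg)}`),
    },
  };
  const result = vm.runInNewContext(code, sandbox);
  return { result, logs };
}

const { result, logs } = runScript(
  `console.log("processing " + input.jobs.length + " jobs"); input.jobs.length * 2`,
  { jobs: [1, 2, 3] }
);
```

Everything the script touches comes through the sandbox object, which is exactly how the per-type capability scoping from earlier gets enforced.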

The logging system is pretty neat. Instead of writing to files or console, every log message gets:

  • Timestamped

  • Stored in Redis by campaign and script type

  • Streamed back to the UI in real-time

The best part? The whole thing is crash-proof. If a script fails, the campaign gets marked as failed but nothing else breaks. If the worker crashes, it restarts and picks up where it left off. You can literally close your browser, go get coffee, maybe actually prepare for those interviews you're about to get.

When a script finishes, the worker:

  1. Takes the output and saves it to the right place in MongoDB

  2. Updates the campaign status

  3. Cleans up any temporary data

  4. Moves on to the next job
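
Those four steps are a few lines of bookkeeping. In this sketch an in-memory map stands in for MongoDB, and the field names are illustrative:

```typescript
// A sketch of the worker's post-run bookkeeping. The Map stands in for
// MongoDB; field names are illustrative.
interface CampaignRecord {
  status: "running" | "completed" | "failed";
  output?: unknown;
  temp?: unknown;
}

const store = new Map<string, CampaignRecord>();

function finishJob(campaignId: string, output: unknown, ok: boolean): void {
  const record = store.get(campaignId);
  if (!record) return;
  record.output = ok ? output : undefined;     // 1. save the output
  record.status = ok ? "completed" : "failed"; // 2. update campaign status
  delete record.temp;                          // 3. clean up temporary data
  // 4. the worker loop then pulls the next job from the queue
}

store.set("c1", { status: "running", temp: { scratch: true } });
finishJob("c1", { emails: 12 }, true);
```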

And because it's all queue-based, you can have multiple workers running if you need to process more campaigns.

The Data Pipeline

Let me walk you through how data actually flows through the system:

  1. Raw HTML Processing:

    • User dumps raw HTML from job boards into a campaign

    • A script using Cheerio extracts basic details (job ID, title, salary)

    • Smart error handling catches missing fields early

    • HTML gets minified to save storage (we went from 175KB to 32KB per job)

  2. Job Details Fetching:

    • System hits each job URL with proper headers (looking like a real browser)

    • Handles different request types (GET for main page, POST for "how to apply")

    • Adds delays between requests (2-3 seconds) to be nice to job boards

    • Handles timeouts and expired job postings gracefully

  3. AI-Powered Data Cleaning:

    • Turns messy HTML into structured job data

    • Extracts everything from salary ranges to required skills

    • Formats job descriptions as clean markdown

    • Every response includes metadata about processing time and data quality

  4. Cover Letter Generation:

    • Pulls your resume from a configured source (GitHub in my case)

    • Matches your skills against job requirements

    • Generates both HTML and plain text versions

    • Even includes metadata about which skills matched

    • Fails fast if critical info is missing
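
The minification step from stage 1 is worth a quick sketch. The exact transforms in my script may differ, but stripping comments, scripts, and inter-tag whitespace is the kind of thing that gets you from 175KB toward 32KB per job:

```typescript
// A sketch of HTML minification for storage: drop comments, drop
// script/style blocks, collapse whitespace. Illustrative, not exhaustive.
function minifyHtml(html: string): string {
  return html
    .replace(/<!--[\s\S]*?-->/g, "")               // drop comments
    .replace(/<(script|style)[\s\S]*?<\/\1>/g, "") // drop scripts and styles
    .replace(/>\s+</g, "><")                       // collapse whitespace between tags
    .replace(/\s{2,}/g, " ")                       // squeeze runs of whitespace
    .trim();
}

const noisy = `<div>\n   <!-- ad slot -->\n   <p>  software   developer </p>\n</div>`;
const lean = minifyHtml(noisy);
```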

The Email System

This is the part I obsessed over most. The email generation system isn't just sending form letters - it's creating completely personalized applications:

  1. Smart Resume Handling:

    • Pulls your resume from a configured source

    • Parses skills and experience

    • Maps your background to job requirements

  2. Template-Free Generation:

    • No generic "I saw your posting" emails

    • Each letter references specific job details

    • System tracks key points addressed

    • Includes metadata about skill matches

  3. Quality Control:

    • Generates both HTML and plain text versions

    • Fails fast if critical info is missing

    • Tracks missing recommended fields

    • Analyzes tone and content

  4. Sending System:

    • Handles rate limiting automatically

    • BCC's you on all applications
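
The "fails fast" check from the quality-control step can be sketched like this. The field names mirror the cleaned job JSON shown earlier; the split between critical and recommended fields is my guess at a reasonable policy, not the project's exact lists:

```typescript
// A sketch of the fail-fast / tracking check before email generation.
// CRITICAL blocks generation; RECOMMENDED is only tracked. Lists are illustrative.
interface CheckResult {
  ok: boolean;
  missingCritical: string[];
  missingRecommended: string[];
}

const CRITICAL = ["contact_email", "job_posting_text"];
const RECOMMENDED = ["application_instructions", "additional_info"];

function checkJob(job: Record<string, unknown>): CheckResult {
  const missing = (keys: string[]) => keys.filter((k) => job[k] == null || job[k] === "");
  const missingCritical = missing(CRITICAL);
  return {
    ok: missingCritical.length === 0,         // no email without critical fields
    missingCritical,
    missingRecommended: missing(RECOMMENDED), // tracked, but not fatal
  };
}

const result = checkJob({
  contact_email: "careers@wiasystems.com",
  job_posting_text: "# Job Posting",
});
```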

The system even includes metadata about how well your experience matches the job requirements. It's like having a really picky editor who happens to be really, really fast.

What's Next

Remember how in Part 1 I mentioned the email system? Oh boy. That deserves its own article. In Part 3, I'm going to tell you about:

  • Getting rejected by AWS

  • The pitfalls of self-hosted SMTP servers

  • Understanding why big companies don't want you sending automated job applications

  • Finally finding a solution that works

Plus, I'll tell you how I got a job offer before even finishing this project. (Spoiler: It involves accidentally automating myself into a corner.)

In the meantime, check out jaas.fun for:

  • The complete source code

  • Guide on writing scripts and using the application (written with the same attention to detail as my commit messages - "fixed stuff")

  • Video demo of the system in action

Want to know when Part 3 drops - the one with all the juicy email server drama? Follow me on Twitter or LinkedIn. Part 3 will cover the lessons learned about email infrastructure, rate limiting, and why not all shortcuts lead where you think they will.


David Dodda


Hi, I'm David, a tech enthusiast who loves bringing creative ideas to life. I write about frontend, backend, AI, homelab setups, and electronics. Follow me on Twitter for updates on my latest projects.