How to Crawl Freelancer Jobs Using Beautiful Soup in Python

Code: https://gist.github.com/KhaledHawwas/a1cdd070c7bbe07bc8476c609938d0ab

Freelancer job platforms like Freelancer.com contain thousands of live opportunities across categories such as Android development, web design, and data science. Crawling these listings with Beautiful Soup in Python allows developers to extract structured job data for analytics, automation, and AI workflows.
In this article, we will walk through the structure and logic behind a real-world freelancer job crawling script, based on the shared gist. Instead of rewriting the code, we will explain what each section does and how it enables scalable job extraction.
Target: Android Jobs on Freelancer
The script focuses on crawling:
https://www.freelancer.com/jobs/android
This category page lists active Android-related freelance jobs. The goal is to crawl multiple pages of listings and extract structured job information.
Understanding the Core Configuration
The script begins with four important variables:
1. API_KEY
API_KEY = "YOUR API KEY"
This is your authentication credential. It authorizes requests to the Crawleo crawling API.
Instead of scraping directly with raw HTTP requests, the script uses an API endpoint to handle:
- Page fetching
- HTML cleaning
- Anti-bot handling
- Structured output formatting
You must replace this placeholder with your actual Crawleo API key.
2. API_ENDPOINT
API_ENDPOINT = "https://api.crawleo.dev/crawl"
This defines the crawling endpoint used to fetch and process the target page.
Rather than manually handling headers, sessions, and parsing complexities, the script delegates crawling to the API endpoint. This simplifies large-scale job scraping and reduces maintenance overhead.
3. base_url
base_url = "https://www.freelancer.com/jobs/android"
This is the root category page containing Android freelancer jobs.
Freelancer uses paginated URLs. When crawling job listings at scale, you must iterate across multiple pages to collect complete datasets.
4. total_pages
total_pages = 3
This defines how many pages of job listings will be crawled.
For example:
- Page 1: /jobs/android
- Page 2: /jobs/android/2
- Page 3: /jobs/android/3
By setting total_pages = 3, the script ensures broader coverage beyond just the first page of listings.
How the Crawling Flow Works
Even without examining the full code, the logic typically follows this structure:
Step 1: Generate Paginated URLs
The script dynamically constructs URLs based on base_url and total_pages.
This enables scalable crawling across multiple result pages instead of scraping a single HTML document.
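Under the pagination pattern described above (page 1 uses the bare category URL, later pages append the page number), URL generation can be sketched in a few lines. The variable names match the configuration section; the list-comprehension approach is illustrative, not necessarily how the gist writes it:

```python
base_url = "https://www.freelancer.com/jobs/android"
total_pages = 3

# Page 1 is the bare category URL; pages 2+ append the page number.
page_urls = [
    base_url if page == 1 else f"{base_url}/{page}"
    for page in range(1, total_pages + 1)
]

print(page_urls)
```

Raising `total_pages` is then the only change needed to widen coverage.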
Step 2: Send Each URL to the Crawling API
Instead of scraping directly using requests, the script sends each constructed page URL to the crawling endpoint defined in API_ENDPOINT.
This approach provides several advantages:
- Cleaner HTML extraction
- Reduced risk of IP blocking
- Automatic handling of dynamic content
- Structured output options such as markdown or text
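A request to the crawling endpoint can be sketched as below. Note that the payload field (`url`), the `Authorization` header format, and the `html` key in the response are assumptions for illustration; check the Crawleo documentation or the linked gist for the exact request shape:

```python
import requests

API_KEY = "YOUR API KEY"  # replace with your real Crawleo API key
API_ENDPOINT = "https://api.crawleo.dev/crawl"

def fetch_page(page_url: str) -> str:
    """Ask the crawling API to fetch a page and return its HTML.

    The payload and header names here are illustrative; consult the
    Crawleo docs (or the linked gist) for the exact request format.
    """
    response = requests.post(
        API_ENDPOINT,
        json={"url": page_url},
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json().get("html", "")
```

Each paginated URL from Step 1 would then be passed through `fetch_page` in a loop.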
Step 3: Parse the Returned HTML Using Beautiful Soup
After receiving the page content, Beautiful Soup is used to:
- Locate job containers
- Extract job titles
- Capture budgets
- Retrieve short descriptions
- Collect posting metadata
Beautiful Soup remains essential here because it allows structured navigation of the returned HTML.
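The parsing step above can be sketched against a small static sample. The CSS class names below (`JobSearchCard-item` and friends) are hypothetical placeholders; Freelancer's real markup changes over time, so inspect the live page to find the current selectors:

```python
from bs4 import BeautifulSoup

# A static sample standing in for the HTML returned by the crawling API.
# Class names are illustrative only; verify them against the live page.
sample_html = """
<div class="JobSearchCard-item">
  <a class="JobSearchCard-primary-heading-link">Build an Android app</a>
  <div class="JobSearchCard-primary-price">$250</div>
  <p class="JobSearchCard-primary-description">Need a Kotlin developer...</p>
</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")
jobs = []
for card in soup.select("div.JobSearchCard-item"):
    jobs.append({
        "title": card.select_one("a.JobSearchCard-primary-heading-link").get_text(strip=True),
        "budget": card.select_one("div.JobSearchCard-primary-price").get_text(strip=True),
        "description": card.select_one("p.JobSearchCard-primary-description").get_text(strip=True),
    })

print(jobs)
```

The resulting list of dictionaries can then be written to CSV or JSON for downstream analytics.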
This hybrid approach combines:
- API-level crawling infrastructure
- Python-level HTML parsing
The result is a scalable and maintainable freelancer job scraping pipeline.
Why This Approach Is Better Than Basic Scraping
Many developers attempt to crawl freelancer job listings using only requests and Beautiful Soup. While that works for small projects, it often fails at scale due to:
- Rate limits
- Anti-bot protections
- Frequent layout changes
- Dynamic content loading
Using an API endpoint for crawling provides:
- Infrastructure abstraction
- Consistent HTML cleaning
- Better reliability
- Production-ready scaling
Beautiful Soup then handles the structured parsing locally.