Football Data Python API Docs | dltHub

Build a Football Data-to-database pipeline in Python using dlt with AI Workbench support for Claude Code, Cursor, and Codex.

Last updated: Mar 11, 2026

Football-Data API is a REST API exposing football (soccer) domain data: competitions, matches, teams, persons, standings, scorers, and related subresources. The REST API base URL is https://api.football-data.org/v4, and requests are authenticated with an API token sent in the X-Auth-Token header.

dlt is an open-source Python library that handles authentication, pagination, and schema evolution automatically. dltHub provides AI context files that enable code assistants to generate production-ready pipelines. Install with uv pip install "dlt[workspace]" and start loading Football Data records in under 10 minutes.

What data can I load from Football Data?

Here are some of the endpoints you can load from Football Data:

Resource	Endpoint	Method	Data selector	Description
areas	/v4/areas	GET	areas	List areas (countries/regions).
competitions	/v4/competitions	GET	competitions	List competitions (leagues/tournaments).
competition	/v4/competitions/{id}	GET	(object)	Single competition resource (object).
competition_matches	/v4/competitions/{id}/matches	GET	matches	Matches for a competition (subresource).
competition_teams	/v4/competitions/{id}/teams	GET	teams	Teams in a competition.
competition_standings	/v4/competitions/{id}/standings	GET	standings	Standings/league tables for a competition.
competition_scorers	/v4/competitions/{id}/scorers	GET	scorers	Top scorers for a competition.
teams	/v4/teams	GET	teams	List teams (paginated).
team	/v4/teams/{id}	GET	(object)	Single team resource.
team_matches	/v4/teams/{id}/matches	GET	matches	Matches for a team (subresource).
matches	/v4/matches	GET	matches	List matches across competitions.
match	/v4/matches/{id}	GET	(object)	Single match resource.
match_head2head	/v4/matches/{id}/head2head	GET	matches	Head-to-head matches list (response also includes aggregates/resultSet).
persons	/v4/persons	GET	persons	List persons (players/coaches) if available.
person	/v4/persons/{id}	GET	(object)	Single person resource.
person_matches	/v4/persons/{id}/matches	GET	matches	Matches for a person.

How do I authenticate with the Football Data API?

Authentication is performed by including your API token in the X-Auth-Token HTTP header for every request (e.g., X-Auth-Token: YOUR_TOKEN). Some example client response headers also include X-Authenticated-Client, X-API-Version and rate-limit related headers.

1. Get your credentials

1) Register or sign in at https://www.football-data.org/client/register, or use "Get started" on the site.
2) Create or view your API client / token on the dashboard.
3) Copy the issued API token.
4) Use the token in the X-Auth-Token header for requests.
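To check a newly issued token outside of dlt, a request can be built with the standard library alone. This is a minimal sketch, not part of the dlt workflow: the helper names build_request and fetch_json are hypothetical, and only the base URL and the X-Auth-Token header come from this page.

```python
import json
import urllib.request

BASE_URL = "https://api.football-data.org/v4"


def build_request(path: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request that carries the API token."""
    url = f"{BASE_URL}/{path.lstrip('/')}"
    return urllib.request.Request(url, headers={"X-Auth-Token": token})


def fetch_json(path: str, token: str) -> dict:
    """Send the request and decode the JSON body. This performs a network call."""
    with urllib.request.urlopen(build_request(path, token), timeout=30) as resp:
        return json.load(resp)


# Example usage (requires a real token and network access):
#   data = fetch_json("competitions", "YOUR_TOKEN")
#   print(len(data.get("competitions", [])))
```

Keeping request construction separate from sending makes it possible to inspect the URL and headers without spending a request against the rate limit.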

2. Add them to .dlt/secrets.toml


[sources.football_data_source]
api_key = "your_api_token_here"

dlt reads this automatically at runtime — never hardcode tokens in your pipeline script. For production environments, see setting up credentials with dlt for environment variable and vault-based options.
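For the environment variable option, dlt maps a secrets.toml key to a variable name by upper-casing each section and joining them with double underscores. The sketch below derives that name for the key shown above; the helper env_var_name is hypothetical and only illustrates the naming convention.

```python
def env_var_name(toml_path: str) -> str:
    """Convert a dotted secrets.toml path into dlt's environment variable form."""
    return "__".join(part.upper() for part in toml_path.split("."))


name = env_var_name("sources.football_data_source.api_key")
print(name)  # SOURCES__FOOTBALL_DATA_SOURCE__API_KEY
```

Exporting that variable in the shell or CI environment should let the pipeline run without a secrets.toml file on disk.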

How do I set up and run the pipeline?

Set up a virtual environment and install dlt:


uv venv && source .venv/bin/activate
uv pip install "dlt[workspace]"

1. Install the dlt AI Workbench:


dlt ai init --agent <your-agent> # <agent>: claude | cursor | codex

This installs project rules, a secrets management skill, appropriate ignore files, and configures the dlt MCP server for your agent.

2. Install the rest-api-pipeline toolkit:


dlt ai toolkit rest-api-pipeline install

This loads the skills and context about dlt the agent uses to build the pipeline iteratively, efficiently, and safely. The agent uses MCP tools to inspect credentials, so it never needs to read your secrets.toml directly.

3. Start LLM-assisted coding:

Use /find-source to load data from the Football Data API into DuckDB.

The rest-api-pipeline toolkit takes over from here — it reads relevant API documentation, presents you with options for which endpoints to load, and follows a structured workflow to scaffold, debug, and validate the pipeline step by step.

4. Run the pipeline:


python football_data_pipeline.py

If everything is configured correctly, you'll see output like this:


Pipeline football_data_pipeline load step completed in 0.26 seconds
1 load package(s) were loaded to destination duckdb and into dataset football_data_data
The duckdb destination used duckdb:/football_data.duckdb location to store data
Load package 1749667187.541553 is LOADED and contains no failed jobs

Inspect your pipeline and data:


dlt pipeline football_data_pipeline show

This opens the Pipeline Dashboard where you can verify pipeline state, load metrics, schema (tables, columns, types), and query the loaded data directly.

Python pipeline example

This example loads matches and competitions from the Football Data API into DuckDB. It mirrors the endpoint and data selector configuration from the table above:


import dlt
from dlt.sources.rest_api import RESTAPIConfig, rest_api_resources

@dlt.source
def football_data_source(api_key=dlt.secrets.value):
    config: RESTAPIConfig = {
        "client": {
            "base_url": "https://api.football-data.org/v4",
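            "auth": {
                "type": "api_key",
                # Send the token in the header the API expects;
                # without "name", dlt defaults to the Authorization header.
                "name": "X-Auth-Token",
                "api_key": api_key,
                "location": "header",
            },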
        },
        "resources": [
            {"name": "matches", "endpoint": {"path": "matches", "data_selector": "matches"}},
            {"name": "competitions", "endpoint": {"path": "competitions", "data_selector": "competitions"}}
        ],
    }
    yield from rest_api_resources(config)


def get_data() -> None:
    pipeline = dlt.pipeline(
        pipeline_name="football_data_pipeline",
        destination="duckdb",
        dataset_name="football_data_data",
    )
    load_info = pipeline.run(football_data_source())
    print(load_info)


if __name__ == "__main__":
    get_data()

To add more endpoints, append entries from the resource table to the "resources" list using the same name, path, and data_selector pattern.
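As an illustration, the entries below add two list endpoints and one subresource from the resource table. The subresource uses the "resolve" parameter form from dlt's rest_api source to fill {id} from the parent competitions resource; treat that syntax as an assumption to verify against your installed dlt version. The helper extend_resources is hypothetical and works on plain dictionaries.

```python
# Extra entries in the same shape as the "resources" list in the example.
extra_resources = [
    {"name": "areas", "endpoint": {"path": "areas", "data_selector": "areas"}},
    {"name": "teams", "endpoint": {"path": "teams", "data_selector": "teams"}},
    {
        "name": "competition_teams",
        "endpoint": {
            "path": "competitions/{id}/teams",
            "data_selector": "teams",
            "params": {
                # Assumption: {id} is resolved from each row of "competitions".
                "id": {"type": "resolve", "resource": "competitions", "field": "id"},
            },
        },
    },
]


def extend_resources(config: dict, extra: list) -> dict:
    """Return a copy of config with extra resources appended, skipping duplicate names."""
    existing = {resource["name"] for resource in config["resources"]}
    merged = list(config["resources"]) + [
        resource for resource in extra if resource["name"] not in existing
    ]
    return {**config, "resources": merged}
```

A subresource issues one request per parent row, so on a free-tier token it is likely to reach the per-minute rate limit quickly.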

How do I query the loaded data?

Once the pipeline runs, dlt creates one table per resource. You can query with Python or SQL.

Python (pandas DataFrame):


import dlt

data = dlt.pipeline("football_data_pipeline").dataset()
matches_df = data.matches.df()
print(matches_df.head())

SQL (DuckDB example):


SELECT * FROM football_data_data.matches LIMIT 10;

In a marimo or Jupyter notebook:


import dlt

data = dlt.pipeline("football_data_pipeline").dataset()
data.matches.df().head()

See how to explore your data in marimo Notebooks and how to query your data in Python with dataset.

What destinations can I load Football Data data to?

dlt supports loading into any of these destinations — only the destination parameter changes:

Destination	Example value
DuckDB (local, default)	`"duckdb"`
PostgreSQL	`"postgres"`
BigQuery	`"bigquery"`
Snowflake	`"snowflake"`
Redshift	`"redshift"`
Databricks	`"databricks"`
Filesystem (S3, GCS, Azure)	`"filesystem"`

Change the destination in dlt.pipeline(destination="snowflake") and add credentials in .dlt/secrets.toml. See the full destinations list.

Troubleshooting

Authentication failures

If the X-Auth-Token header is missing or invalid, the API returns 403 (Restricted Resource) or a JSON error object describing the problem. Ensure X-Auth-Token: YOUR_TOKEN is included for all non-public endpoints.

Rate limits / Too Many Requests

The API is rate-limited per client. Unauthenticated clients get 100 requests per 24 hours. Registered free-tier clients typically get 10 requests per minute, Standard clients 30 per minute, and higher tiers up to 60 per minute. Exceeding the limit returns HTTP 429 Too Many Requests. Monitor the X-Requests-Available-Minute and X-RequestCounter-Reset response headers to stay under it.
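A client can use those two headers to pause before the limit is hit. The sketch below assumes X-Requests-Available-Minute holds the number of requests left and X-RequestCounter-Reset holds the seconds until the counter resets; both interpretations should be confirmed against real responses. The function names are hypothetical.

```python
import time
from typing import Callable, Mapping


def seconds_to_wait(headers: Mapping[str, str], status_code: int = 200) -> int:
    """Return how long to pause before the next request (0 means go ahead)."""
    remaining = int(headers.get("X-Requests-Available-Minute", "1"))
    reset_in = int(headers.get("X-RequestCounter-Reset", "0"))
    if status_code == 429 or remaining <= 0:
        # Wait at least one second even if the reset header is missing.
        return max(reset_in, 1)
    return 0


def throttle(
    headers: Mapping[str, str],
    status_code: int = 200,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Sleep if the headers indicate the quota is exhausted; return the wait used."""
    wait = seconds_to_wait(headers, status_code)
    if wait:
        sleep(wait)
    return wait
```

Header lookups on a plain dict are case-sensitive, so pass the headers object from your HTTP client if it offers case-insensitive access.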

Pagination and result wrappers

List responses wrap their results in an object: meta nodes such as filters and resultSet come first, followed by the list under a resource-specific key (for example 'matches', 'competitions', 'teams', or 'scorers'). That key is the data selector shown in the resource table. Use the limit and offset query parameters to paginate; the default and maximum limit vary per resource. Some list endpoints return a top-level count or resultSet.count, so check the response of the specific resource.
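The limit and offset scheme can be sketched as a generator that keeps requesting pages until a short page comes back. The fetch_page callable is a hypothetical stand-in for whatever performs the HTTP request, and stopping on a short page is an assumption made because the presence of a total count varies per resource. Within dlt, the rest_api source offers an offset-style paginator that plays the same role.

```python
from typing import Callable, Iterator


def iter_items(
    fetch_page: Callable[[int, int], dict],
    key: str,
    limit: int = 100,
) -> Iterator[dict]:
    """Yield every item under `key`, advancing `offset` by `limit` per page."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        items = page.get(key, [])
        yield from items
        # A page shorter than `limit` (including an empty one) is the last page.
        if len(items) < limit:
            break
        offset += limit


# Example usage with an in-memory stand-in for the API:
fake_teams = [{"id": i} for i in range(5)]


def fake_fetch(limit: int, offset: int) -> dict:
    return {"count": len(fake_teams), "teams": fake_teams[offset:offset + limit]}


all_teams = list(iter_items(fake_fetch, "teams", limit=2))
```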

Common HTTP errors

400 Bad Request — malformed request or invalid filter value.
403 Restricted Resource — missing/invalid auth or resource restricted by plan.
404 Not Found — resource id does not exist.
429 Too Many Requests — rate limit exceeded.

When a request fails, first confirm that the API token is valid and that your plan covers the resource (403 Restricted Resource), then verify the endpoint path, IDs, and filter values (404 Not Found, 400 Bad Request).
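The status codes listed above can be turned into short hints for pipeline logs. This is a minimal sketch with a hypothetical explain_status helper; the hint texts paraphrase the list and are not messages returned by the API.

```python
STATUS_HINTS = {
    400: "Bad Request: check the request syntax and filter values.",
    403: "Restricted Resource: check the X-Auth-Token header and your plan.",
    404: "Not Found: check the endpoint path and resource id.",
    429: "Too Many Requests: rate limit exceeded, wait before retrying.",
}


def explain_status(code: int) -> str:
    """Map an HTTP status code to a troubleshooting hint."""
    return STATUS_HINTS.get(
        code, f"Unexpected HTTP status {code}: inspect the JSON error body."
    )
```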

Next steps

Continue your data engineering journey with the other toolkits of the dltHub AI Workbench:

data-exploration — Build custom notebooks, charts, and dashboards for deeper analysis with marimo notebooks.
dlthub-runtime — Deploy, schedule, and monitor your pipeline in production.


dlt ai toolkit data-exploration install
dlt ai toolkit dlthub-runtime install
