Pardot Python API Docs | dltHub

Build a Pardot-to-database pipeline in Python using dlt with AI Workbench support for Claude Code, Cursor, and Codex.

Pardot (Account Engagement) is a B2B marketing automation platform that exposes REST APIs to read and manage marketing objects such as prospects, campaigns, lists, visitors, and related assets. The modern JSON API (v5) uses the base URL https://pi.pardot.com/api/v5/; legacy v3/v4 endpoints live under https://pi.pardot.com/api/ (or https://pi.demo.pardot.com/api/ for demo accounts). All requests require a Salesforce OAuth bearer token, and v5 also requires the Pardot-Business-Unit-Id header.
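For orientation, here is a minimal sketch of a raw v5 request using the requests library. The token and business unit values are placeholders, and the fields/limit parameters are shown as an assumption of typical v5 query usage:

import requests

# Placeholders -- obtain these via Salesforce OAuth (see authentication below).
ACCESS_TOKEN = "your_oauth_access_token"
BUSINESS_UNIT_ID = "your_business_unit_id"

response = requests.get(
    "https://pi.pardot.com/api/v5/objects/prospects",
    headers={
        "Authorization": f"Bearer {ACCESS_TOKEN}",
        "Pardot-Business-Unit-Id": BUSINESS_UNIT_ID,  # required on v5 requests
    },
    params={"fields": "id,email", "limit": 10},  # v5 queries expect a fields list
)
response.raise_for_status()
print(response.json()["data"])  # v5 wraps records in a top-level "data" array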

dlt is an open-source Python library that handles authentication, pagination, and schema evolution automatically. dltHub provides AI context files that enable code assistants to generate production-ready pipelines. Install with uv pip install "dlt[workspace]" and start loading Pardot data in under 10 minutes.


What data can I load from Pardot?

Here are some of the endpoints you can load from Pardot:

| Resource | Endpoint | Method | Data selector | Description |
| --- | --- | --- | --- | --- |
| prospects | /objects/prospects | GET | data | Query prospects (v5). |
| prospect | /objects/prospects/ | GET | data | Read a single prospect (v5). |
| campaigns | /objects/campaigns | GET | data | Query campaigns (v5). |
| lists | /objects/lists | GET | data | Query lists (v5). |
| visitors | /objects/visitors | GET | data | Query visitors (v5). |
| prospect_query_legacy | /api/prospect/version/4/do/query?format=json | GET | result | Legacy v3/v4 query; returns JSON under result with prospect entries. |
| list_memberships_legacy | /api/listMembership/version/4/do/query?format=json | GET | result | Legacy list membership query (v3/v4). |
| visitor_activities_legacy | /api/visitorActivity/version/4/do/query?format=json | GET | result | Legacy visitor activity query (v3/v4). |
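The v5 and legacy endpoints differ in both base URL and response shape, so it is easiest to model them separately. As a hedged sketch, a legacy v4 resource entry for dlt's rest_api source could look like this (the config name is illustrative):

legacy_config = {
    "client": {"base_url": "https://pi.pardot.com/api/"},  # legacy v3/v4 base URL
    "resources": [
        {
            "name": "prospect_query_legacy",
            "endpoint": {
                "path": "prospect/version/4/do/query",
                "params": {"format": "json"},  # request JSON instead of XML
                "data_selector": "result",     # legacy responses wrap records in "result"
            },
        },
    ],
}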

How do I authenticate with the Pardot API?

Authenticate using Salesforce OAuth 2.0 to obtain an access token and include it in the Authorization header as Bearer <access_token>. For Account Engagement API v5, also include the Pardot-Business-Unit-Id header.

1. Get your credentials

  1. In Salesforce, create a Connected App (Setup → App Manager → New Connected App).
  2. Configure OAuth scopes (e.g., api, refresh_token, full) and set a callback URL.
  3. Note the Connected App Client ID and Client Secret.
  4. Use OAuth 2.0 (authorization code, JWT bearer, or username-password flow where allowed) to obtain an access token from Salesforce's token endpoint (https://login.salesforce.com/services/oauth2/token, or the sandbox/developer login host as appropriate); a minimal sketch follows this list.
  
  5. Identify your Account Engagement (Pardot) Business Unit ID in Salesforce Setup (Installed Packages b2bmaIntegration or from Account Engagement settings) and include it in requests via the Pardot-Business-Unit-Id header.
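As one concrete example, the username-password flow from step 4 can be exercised with a plain HTTP call. This is a hedged sketch with placeholder values; your org may require the authorization code or JWT bearer flow instead:

import requests

TOKEN_URL = "https://login.salesforce.com/services/oauth2/token"

response = requests.post(
    TOKEN_URL,
    data={
        "grant_type": "password",  # only where this flow is allowed by your org
        "client_id": "your_connected_app_client_id",
        "client_secret": "your_connected_app_client_secret",
        "username": "your_salesforce_username",
        "password": "your_password_plus_security_token",
    },
)
response.raise_for_status()
access_token = response.json()["access_token"]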

2. Add them to .dlt/secrets.toml

[sources.pardot_source]
access_token = "your_oauth_access_token"
pardot_business_unit_id = "your_business_unit_id"
client_id = "your_connected_app_client_id"
client_secret = "your_connected_app_client_secret"

dlt reads this automatically at runtime — never hardcode tokens in your pipeline script. For production environments, see setting up credentials with dlt for environment variable and vault-based options.
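For example, dlt maps environment variables onto the same secret keys using double underscores. A sketch of the naming convention for the [sources.pardot_source] section above, shown in Python for illustration:

import os

# Environment variables mirror secrets.toml sections via double underscores.
# In production, set these in your CI system or vault rather than in code.
os.environ["SOURCES__PARDOT_SOURCE__ACCESS_TOKEN"] = "your_oauth_access_token"
os.environ["SOURCES__PARDOT_SOURCE__PARDOT_BUSINESS_UNIT_ID"] = "your_business_unit_id"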


How do I set up and run the pipeline?

Set up a virtual environment and install dlt:

uv venv && source .venv/bin/activate
uv pip install "dlt[workspace]"

1. Install the dlt AI Workbench:

dlt ai init --agent <agent>  # <agent>: claude | cursor | codex

This installs project rules, a secrets management skill, appropriate ignore files, and configures the dlt MCP server for your agent. Learn more →

2. Install the rest-api-pipeline toolkit:

dlt ai toolkit rest-api-pipeline install

This loads the skills and context about dlt the agent uses to build the pipeline iteratively, efficiently, and safely. The agent uses MCP tools to inspect credentials — it never needs to read your secrets.toml directly. Learn more →

3. Start LLM-assisted coding:

Use /find-source to load data from the Pardot API into DuckDB.

The rest-api-pipeline toolkit takes over from here — it reads relevant API documentation, presents you with options for which endpoints to load, and follows a structured workflow to scaffold, debug, and validate the pipeline step by step.

4. Run the pipeline:

python pardot_pipeline.py

If everything is configured correctly, you'll see output like this:

Pipeline pardot_pipeline load step completed in 0.26 seconds
1 load package(s) were loaded to destination duckdb and into dataset pardot_data
The duckdb destination used duckdb:/pardot.duckdb location to store data
Load package 1749667187.541553 is LOADED and contains no failed jobs

Inspect your pipeline and data:

dlt pipeline pardot_pipeline show

This opens the Pipeline Dashboard where you can verify pipeline state, load metrics, schema (tables, columns, types), and query the loaded data directly.


Python pipeline example

This example loads prospects and campaigns from the Pardot API into DuckDB. It mirrors the endpoint and data selector configuration from the table above:

import dlt
from dlt.sources.rest_api import RESTAPIConfig, rest_api_resources


@dlt.source
def pardot_source(
    access_token=dlt.secrets.value,
    pardot_business_unit_id=dlt.secrets.value,
):
    config: RESTAPIConfig = {
        "client": {
            "base_url": "https://pi.pardot.com/api/v5/",
            "auth": {
                "type": "bearer",
                "token": access_token,
            },
            # v5 requires the business unit header on every request
            "headers": {"Pardot-Business-Unit-Id": pardot_business_unit_id},
        },
        "resources": [
            {"name": "prospects", "endpoint": {"path": "objects/prospects", "data_selector": "data"}},
            {"name": "campaigns", "endpoint": {"path": "objects/campaigns", "data_selector": "data"}},
        ],
    }
    yield from rest_api_resources(config)


def get_data() -> None:
    pipeline = dlt.pipeline(
        pipeline_name="pardot_pipeline",
        destination="duckdb",
        dataset_name="pardot_data",
    )
    load_info = pipeline.run(pardot_source())
    print(load_info)


if __name__ == "__main__":
    get_data()

To add more endpoints, append entries from the resource table to the "resources" list using the same name, path, and data_selector pattern.
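For example, extending the config inside pardot_source() with the lists and visitors endpoints from the table:

# Append more v5 endpoints to the "resources" list inside pardot_source():
config["resources"] += [
    {"name": "lists", "endpoint": {"path": "objects/lists", "data_selector": "data"}},
    {"name": "visitors", "endpoint": {"path": "objects/visitors", "data_selector": "data"}},
]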


How do I query the loaded data?

Once the pipeline runs, dlt creates one table per resource. You can query with Python or SQL.

Python (pandas DataFrame):

import dlt

data = dlt.pipeline("pardot_pipeline").dataset()
prospects_df = data.prospects.df()
print(prospects_df.head())

SQL (DuckDB example):

SELECT * FROM pardot_data.prospects LIMIT 10;

In a marimo or Jupyter notebook:

import dlt

data = dlt.pipeline("pardot_pipeline").dataset()
data.prospects.df().head()

See how to explore your data in marimo Notebooks and how to query your data in Python with dataset.


What destinations can I load Pardot data to?

dlt supports loading into any of these destinations — only the destination parameter changes:

| Destination | Example value |
| --- | --- |
| DuckDB (local, default) | "duckdb" |
| PostgreSQL | "postgres" |
| BigQuery | "bigquery" |
| Snowflake | "snowflake" |
| Redshift | "redshift" |
| Databricks | "databricks" |
| Filesystem (S3, GCS, Azure) | "filesystem" |

Change the destination in dlt.pipeline(destination="snowflake") and add credentials in .dlt/secrets.toml. See the full destinations list.
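For example, switching the example pipeline to Snowflake:

import dlt

# Same pipeline, different destination; Snowflake credentials belong in
# .dlt/secrets.toml under [destination.snowflake.credentials].
pipeline = dlt.pipeline(
    pipeline_name="pardot_pipeline",
    destination="snowflake",
    dataset_name="pardot_data",
)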


Troubleshooting

Authentication failures

Make sure you obtain a valid Salesforce OAuth access token and include it in the Authorization header, and include the Pardot-Business-Unit-Id header on v5 requests. 401/403 responses indicate an invalid token, an expired token, or insufficient OAuth scopes. Use the refresh-token flow to renew tokens.

Rate limits and throttling

Account Engagement enforces daily and concurrent request limits. Paginate large data pulls (v5 uses limit and offset query parameters) and apply exponential backoff on 429/503 responses.
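dlt's REST API source retries transient failures for you. If you make raw calls outside dlt, a simple exponential-backoff sketch (the function name and retry budget are illustrative):

import time
import requests

def get_with_backoff(url, headers, params=None, max_retries=5):
    # Retry 429/503 with exponential backoff: 1s, 2s, 4s, 8s, 16s.
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers, params=params)
        if response.status_code not in (429, 503):
            response.raise_for_status()
            return response
        time.sleep(2 ** attempt)
    raise RuntimeError("rate limit retries exhausted")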

Pagination quirks

Legacy v3/v4 query endpoints return up to 200 results per request and use offset to retrieve subsequent pages; responses wrap records under the result element. Version 5 uses JSON collection responses with a top-level data array and meta pagination info; use limit and offset (or page-based params per endpoint) to iterate.
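With dlt's rest_api source, the legacy paging behavior can be expressed with an explicit offset paginator. A sketch under the assumption that the response carries no total count (the variable name is illustrative; the paginator keys follow dlt's offset paginator config):

# Endpoint sketch for a legacy v4 resource with explicit offset pagination.
legacy_endpoint = {
    "path": "prospect/version/4/do/query",
    "params": {"format": "json"},
    "data_selector": "result",
    "paginator": {
        "type": "offset",
        "limit": 200,            # legacy endpoints return at most 200 records
        "offset_param": "offset",
        "limit_param": "limit",
        "total_path": None,      # no total in the response; stop on an empty page
    },
}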

Common errors

  • 401 Unauthorized – invalid or expired OAuth token.
  • 403 Forbidden – insufficient user permissions in Pardot.
  • 404 Not Found – incorrect endpoint or object ID.
  • 429 Too Many Requests – rate limit exceeded; implement backoff.
  • 5xx Server errors – transient issues; retry with backoff.

Ensure your OAuth access token is valid and unexpired to avoid 401 Unauthorized errors, and verify endpoint paths and parameters to avoid 404 Not Found errors.


Next steps

Continue your data engineering journey with the other toolkits of the dltHub AI Workbench:

  • data-exploration — Build custom notebooks, charts, and dashboards for deeper analysis with marimo notebooks.
  • dlthub-runtime — Deploy, schedule, and monitor your pipeline in production.
dlt ai toolkit data-exploration install
dlt ai toolkit dlthub-runtime install
