


Strapi is a headless CMS (Content Management System) that allows developers to create API-driven content management systems without having to write a lot of custom code.

Since the available endpoints depend on your Strapi setup, identify the endpoints you want to ingest before transferring data to your warehouse.

This Strapi dlt verified source and pipeline example loads data via the Strapi API to the destination of your choice.

Sources and resources that can be loaded using this verified source are:

strapi_source – Retrieves data from Strapi

Setup Guide

Grab API token

  1. Log in to Strapi.
  2. Click ⚙️ in the sidebar.
  3. Go to API tokens under global settings.
  4. Create a new API token.
  5. Fill in Name, Description, and Duration.
  6. Choose a token type: Read Only, Full Access, or custom (with find and findOne selected).
  7. Save to view your API token.
  8. Copy it for dlt secrets setup.
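Once you have the token, it can help to understand the request shape it is used in: Strapi's REST API expects the token as a Bearer header on `/api/<collection>` routes. A minimal sketch with placeholder values (the `athletes` collection, the token string, and the `localhost:1337` domain are all assumptions, not values from your setup):

```python
# Sketch of the request shape the Strapi REST API expects.
# Placeholder values -- substitute your own token, domain, and collection.
api_secret_key = "my-api-token"  # the token copied in step 8
domain = "localhost:1337"        # where your Strapi instance runs
endpoint = "athletes"            # any collection in your Strapi setup

url = f"http://{domain}/api/{endpoint}"
headers = {"Authorization": f"Bearer {api_secret_key}"}
print(url)
print(headers["Authorization"])
```

A token with Read Only or custom find/findOne permissions is sufficient for this kind of GET request.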

Note: The Strapi UI, which is described here, might change. The full guide is available at this link.

Initialize the verified source

To get started with your data pipeline, follow these steps:

  1. Enter the following command:

    dlt init strapi duckdb

    This command will initialize the pipeline example with Strapi as the source and duckdb as the destination.

  2. If you'd like to use a different destination, simply replace duckdb with the name of your preferred destination.

  3. After running this command, a new directory will be created with the necessary files and configuration settings to get started.

For more information, read the guide on how to add a verified source.

Add credentials

  1. In the .dlt folder, there's a file called secrets.toml. It's where you store sensitive information securely, like access tokens. Keep this file safe. Here's its format:

    # put your secret values and credentials here. do not share this file and do not push it to github
    api_secret_key = "api_secret_key" # please set me up!
    domain = "domain" # please set me up!
  2. Replace api_secret_key with the API token you copied above.

  3. Replace domain with the domain of your Strapi instance.

  4. The domain is the URL that opens in a new tab when you run Strapi.

  5. Finally, enter credentials for your chosen destination as per the docs.

For more information, read the General Usage: Credentials.

Run the pipeline

  1. Before running the pipeline, ensure that you have installed all the necessary dependencies by running the command:

    pip install -r requirements.txt
  2. You're now ready to run the pipeline! To get started, run the following command:

    python strapi_pipeline.py

    In the provided script, we've included a list with one endpoint, "athletes." Simply add any other endpoints from your Strapi setup to this list in order to load them, then execute the file to initiate the data loading process.

  3. Once the pipeline has finished running, you can verify that everything loaded correctly by using the following command:

    dlt pipeline <pipeline_name> show

    For example, the pipeline_name for the above pipeline example is strapi; you may also use any custom name instead.

For more information, read the guide on how to run a pipeline.

Sources and resources

dlt works on the principle of sources and resources.

Source strapi_source

This function retrieves data from Strapi.

def strapi_source(
    endpoints: List[str],
    api_secret_key: str = dlt.secrets.value,
    domain: str = dlt.secrets.value,
) -> Iterable[DltResource]:

endpoints: Collections to fetch data from.

api_secret_key: API secret key for authentication, defaults to dlt secrets.

domain: Strapi API domain name, defaults to dlt secrets.
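Strapi's v4 REST API pages through collections with `pagination[page]` and `pagination[pageSize]` query parameters. A rough sketch of the URLs a source like this requests — this is an illustration of Strapi's paging scheme, not the verified source's actual code, and the function name and values are placeholders:

```python
from urllib.parse import urlencode


def page_url(domain: str, endpoint: str, page: int, page_size: int = 25) -> str:
    # Strapi v4 REST pagination parameters; brackets are percent-encoded.
    params = urlencode({"pagination[page]": page, "pagination[pageSize]": page_size})
    return f"https://{domain}/api/{endpoint}?{params}"


print(page_url("localhost:1337", "athletes", 1))
```

Each page's response carries a `meta.pagination` object, so a client can keep requesting pages until `page` reaches `pageCount`.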


Create your own pipeline

If you wish to create your own pipelines, you can leverage source and resource methods from this verified source.

  1. Configure the pipeline by specifying the pipeline name, destination, and dataset as follows:

    pipeline = dlt.pipeline(
        pipeline_name="strapi",  # Use a custom name if desired
        destination="duckdb",  # Choose the appropriate destination (e.g., duckdb, redshift, postgres)
        dataset_name="strapi_data",  # Use a custom name if desired
    )
  2. To load the specified endpoints:

    endpoints = ["athletes"]
    load_data = strapi_source(endpoints=endpoints)

    load_info = pipeline.run(load_data)
    # pretty print the information on data that was loaded
    print(load_info)
We loaded the "athletes" endpoint above; the endpoint list can be customized to suit your specific requirements.


