dltHub

API playground: Free APIs for personal data projects

  • Adrian Brudaru,
    Co-Founder & CDO

Free APIs for Data Engineering

Practicing data engineering is better with real data sources. If you are planning a data engineering project, keep the following in mind:

  • Ideally, your data has entities and activities, so you can model dimensions and facts.
  • Ideally, the APIs have no auth, so they can be easily tested.
  • Ideally, the API supports a concrete use case that you can model and present data for.
  • Ideally, you build end-to-end pipelines to showcase extraction, ingestion, modelling and displaying data.
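To make the first point concrete, here is a minimal sketch of splitting nested API records into a dimension table (entities) and a fact table (activities). The record shape, the field names, and the `split_dim_fact` helper are hypothetical and not taken from any specific API in this article.

```python
from typing import Any


def split_dim_fact(records: list[dict[str, Any]]) -> tuple[list[dict], list[dict]]:
    """Return (dimension rows, fact rows) from nested user/event records."""
    dim_rows: dict[int, dict] = {}  # keyed by user id so each entity appears once
    fact_rows: list[dict] = []
    for record in records:
        user = record["user"]
        dim_rows[user["id"]] = {"user_id": user["id"], "name": user["name"]}
        for event in record["events"]:
            # Each activity becomes one fact row that points back to its entity
            fact_rows.append(
                {
                    "user_id": user["id"],
                    "event_type": event["type"],
                    "occurred_at": event["at"],
                }
            )
    return list(dim_rows.values()), fact_rows


# Hypothetical sample data, shaped like a nested API response
sample = [
    {
        "user": {"id": 1, "name": "Ada"},
        "events": [
            {"type": "login", "at": "2024-01-01T10:00:00Z"},
            {"type": "purchase", "at": "2024-01-01T10:05:00Z"},
        ],
    },
    {
        "user": {"id": 2, "name": "Linus"},
        "events": [{"type": "login", "at": "2024-01-02T09:00:00Z"}],
    },
    {"user": {"id": 1, "name": "Ada"}, "events": []},
]

dims, facts = split_dim_fact(sample)
print(len(dims), "dimension rows,", len(facts), "fact rows")
```

An API whose responses already separate entities (users, products, countries) from activities (events, orders, measurements) makes this kind of modelling much easier to practice.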

This article outlines 25 APIs, detailing their use cases, any free tier limitations, and authentication needs.

Material teaching data loading with dlt:

Data Talks Club Data Engineering Zoomcamp

Data Talks Club Open Source Spotlight

Docs

APIs Overview

1. PokeAPI

  • URL: PokeAPI.
  • Use: Import Pokémon data for projects on data relationships and stats visualization.
  • Free: Rate-limited to 100 requests/IP/minute.
  • Auth: None.

2. REST Countries API

  • URL: REST Countries.
  • Use: Access country data for projects analyzing global metrics.
  • Free: Unlimited.
  • Auth: None.

3. OpenWeather API

  • URL: OpenWeather.
  • Use: Fetch weather data for climate analysis and predictive modeling.
  • Free: Limited requests and features.
  • Auth: API key.

4. JSONPlaceholder API

  • URL: JSONPlaceholder.
  • Use: Ideal for testing and prototyping with fake data. Use it to simulate CRUD operations on posts, comments, and user data.
  • Free: Unlimited.
  • Auth: None required.

5. Quandl API

  • URL: Quandl (now part of Nasdaq Data Link).
  • Use: For financial market trends and economic indicators analysis.
  • Free: Some datasets require premium.
  • Auth: API key.

6. GitHub API

  • URL: GitHub API.
  • Use: Analyze open-source trends, collaborations, or stargazers data. You can use it from our verified sources repository.
  • Free: 60 requests/hour unauthenticated, 5,000 requests/hour authenticated.
  • Auth: OAuth or personal access token.

7. NASA API

  • URL: NASA API.
  • Use: Space-related data for projects on space exploration or earth science.
  • Free: Rate-limited.
  • Auth: API key.

8. The Movie Database (TMDb) API

  • URL: TMDb API.
  • Use: Movie and TV data for entertainment industry trend analysis.
  • Free: Requires attribution.
  • Auth: API key.

9. CoinGecko API

  • URL: CoinGecko API.
  • Use: Cryptocurrency data for market trend analysis or predictive modeling.
  • Free: Rate-limited.
  • Auth: None.

10. Public APIs GitHub list

  • URL: Public APIs list.
  • Use: Discover APIs for various projects. A meta-resource.
  • Free: Varies by API.
  • Auth: Depends on API.

11. News API

  • URL: News API.
  • Use: Get datasets containing current and historical news articles.
  • Free: Access to current news articles.
  • Auth: API key.

12. Exchangerates API

  • URL: Exchangerates API.
  • Use: Get real-time, intraday, and historical currency rates.
  • Free: 250 requests/month.
  • Auth: API key.

13. Spotify API

  • URL: Spotify API.
  • Use: Get Spotify content and metadata about songs.
  • Free: Rate-limited.
  • Auth: OAuth 2.0 (client ID and secret).

14. Football API

  • URL: Football API.
  • Use: Get information about football leagues and cups.
  • Free: 100 requests/day.
  • Auth: API key.

15. Yahoo Finance API

  • URL: Yahoo Finance API.
  • Use: Access a wide range of financial data.
  • Free: 500 requests/month.
  • Auth: API key.

16. Basketball API

  • URL: Basketball API.
  • Use: Get information about basketball leagues and cups.
  • Free: 100 requests/day.
  • Auth: API key.

17. NY Times API

  • URL: NY Times API.
  • Use: Get info about articles, books, movies and more.
  • Free: 500 requests/day and 5 requests/minute.
  • Auth: API key.

18. Spoonacular API

  • URL: Spoonacular API.
  • Use: Get info about ingredients, recipes, products and menu items.
  • Free: 150 requests/day and 1 request/sec.
  • Auth: API key.

19. Movie Database Alternative API

  • URL: Movie Database Alternative API.
  • Use: Movie data for entertainment industry trend analysis.
  • Free: 1,000 requests/day and 10 requests/sec.
  • Auth: API key.

20. RAWG Video Games Database API

  • URL: RAWG Video Games Database.
  • Use: Gather video game data, such as release dates, platforms, genres, and reviews.
  • Free: Unlimited requests for limited endpoints.
  • Auth: API key.

21. Jikan API

  • URL: Jikan API.
  • Use: Access data from MyAnimeList for anime and manga projects.
  • Free: Rate-limited.
  • Auth: None.

22. Open Library Books API

  • URL: Open Library Books API.
  • Use: Access data about millions of books, including titles, authors, and publication dates.
  • Free: Unlimited.
  • Auth: None.

23. YouTube Data API

  • URL: YouTube Data API.
  • Use: Access YouTube video data, channels, playlists, etc.
  • Free: Limited quota.
  • Auth: Google API key and OAuth 2.0.

24. Reddit API

  • URL: Reddit API.
  • Use: Access Reddit data for social media analysis or content retrieval.
  • Free: Rate-limited.
  • Auth: OAuth 2.0.

25. World Bank API

  • URL: World Bank API.
  • Use: Access economic and development data from the World Bank.
  • Free: Unlimited.
  • Auth: None.

Each API offers unique insights for data engineering, from ingestion to visualization. Check each API's documentation for up-to-date details on limitations and authentication.
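Because many of the free tiers above are rate-limited, it helps to pace requests on the client side. The following is a minimal sketch of a fixed-interval throttle, using the PokeAPI and GitHub limits quoted in this article as inputs. The `min_interval_seconds` and `Throttle` names are hypothetical helpers, not part of dlt or of any API listed here, and real limits may differ from the figures above.

```python
import time
from typing import Callable, Optional


def min_interval_seconds(max_requests: int, per_seconds: float) -> float:
    """Smallest gap between requests that stays within the given limit."""
    return per_seconds / max_requests


class Throttle:
    """Ensures consecutive calls to wait() are at least `interval` seconds apart."""

    def __init__(
        self,
        interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None

    def wait(self) -> float:
        """Sleep if the previous call was too recent; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last_call is not None:
            delay = max(0.0, self.interval - (now - self._last_call))
            if delay > 0:
                self._sleep(delay)
        self._last_call = now + delay
        return delay


# Intervals derived from the limits listed in this article
pokeapi_interval = min_interval_seconds(100, 60)        # 100 requests/minute
github_anon_interval = min_interval_seconds(60, 3600)   # 60 requests/hour
github_auth_interval = min_interval_seconds(5000, 3600) # 5,000 requests/hour
print(pokeapi_interval, github_anon_interval, github_auth_interval)
```

To use it, create one `Throttle(pokeapi_interval)` per API and call `wait()` before each request. The clock and sleep functions are injectable so the pacing logic can be tested without actually sleeping.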

Using the above sources

You can create a pipeline for the APIs discussed above by using dlt's REST API source. Let’s create a PokeAPI pipeline as an example. Follow these steps:

Create a REST API source:

dlt init rest_api duckdb

The following directory structure gets generated:

rest_api_pipeline/
├── .dlt/
│   ├── config.toml          # configs for your pipeline
│   └── secrets.toml         # secrets for your pipeline
├── rest_api/                # folder with source-specific files
│   └── ...
├── rest_api_pipeline.py     # your main pipeline script
├── requirements.txt         # dependencies for your pipeline
└── .gitignore               # ignore files for git (not required)

Configure the source in rest_api_pipeline.py:

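import dlt

# Import path for the local rest_api/ folder generated by `dlt init rest_api duckdb`.
# Newer dlt releases may ship this source as `dlt.sources.rest_api` instead.
from rest_api import rest_api_source


def load_pokemon() -> None:
    pipeline = dlt.pipeline(
        pipeline_name="rest_api_pokemon",
        destination='duckdb',
        dataset_name="rest_api_data",
    )

    pokemon_source = rest_api_source(
        {
            "client": {
                "base_url": "https://pokeapi.co/api/v2/",
            },
            # Defaults applied to every resource listed below
            "resource_defaults": {
                "endpoint": {
                    "params": {
                        "limit": 1000,
                    },
                },
            },
            # Each name is an endpoint path relative to base_url
            "resources": [
                "pokemon",
                "berry",
                "location",
            ],
        }
    )

    # Extract, normalize, and load the resources into DuckDB
    load_info = pipeline.run(pokemon_source)
    print(load_info)


if __name__ == "__main__":
    load_pokemon()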
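The REST API source handles pagination for you, so the configuration above does not need to spell it out. As a conceptual illustration only (this is not dlt's implementation), the sketch below follows the `next` links that PokeAPI list endpoints return alongside `results`. The `iter_results` helper and the in-memory `fake_pages` data are hypothetical, and the fetch function is injected so the example runs without network access.

```python
from typing import Any, Callable, Iterator, Optional

Page = dict[str, Any]


def iter_results(first_url: str, fetch_page: Callable[[str], Page]) -> Iterator[dict]:
    """Yield every item from a paginated list endpoint by following `next` links."""
    url: Optional[str] = first_url
    while url:
        page = fetch_page(url)
        yield from page["results"]
        url = page.get("next")  # None on the last page ends the loop


# Hypothetical stand-in for two pages of a PokeAPI-style response
fake_pages: dict[str, Page] = {
    "https://pokeapi.co/api/v2/pokemon?limit=2": {
        "count": 3,
        "next": "https://pokeapi.co/api/v2/pokemon?offset=2&limit=2",
        "previous": None,
        "results": [{"name": "bulbasaur"}, {"name": "ivysaur"}],
    },
    "https://pokeapi.co/api/v2/pokemon?offset=2&limit=2": {
        "count": 3,
        "next": None,
        "previous": "https://pokeapi.co/api/v2/pokemon?limit=2",
        "results": [{"name": "venusaur"}],
    },
}

names = [
    item["name"]
    for item in iter_results("https://pokeapi.co/api/v2/pokemon?limit=2", fake_pages.__getitem__)
]
print(names)
```

With a real HTTP client, `fetch_page` could be a function that requests the URL and returns the parsed JSON body.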

For a detailed guide on creating a pipeline with the REST API source, see the REST API source documentation.
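Many of the APIs listed above need a key or token. The sketch below builds a source configuration for the GitHub API with bearer-token authentication. It assumes the `auth` dictionary shape (`"type": "bearer"` with a `token` field) accepted by the REST API source, so please verify it against the documentation for your dlt version. The `github_source_config` helper, the `GITHUB_TOKEN` environment variable, and the choice of repository and `issues` resource are illustrative assumptions.

```python
import os
from typing import Any


def github_source_config(token: str) -> dict[str, Any]:
    """Build a REST API source config for GitHub issues with bearer auth."""
    return {
        "client": {
            # Assumed example repository; replace with the one you want to analyze
            "base_url": "https://api.github.com/repos/dlt-hub/dlt/",
            # Assumed auth shape; check the REST API source docs before relying on it
            "auth": {"type": "bearer", "token": token},
        },
        "resources": ["issues"],
    }


# GITHUB_TOKEN is a hypothetical variable name; the placeholder is used if it is unset
config = github_source_config(os.environ.get("GITHUB_TOKEN", "<your-token>"))
print(sorted(config["client"].keys()))
```

The resulting dictionary can be passed to `rest_api_source` in the same way as the PokeAPI configuration. In a dlt project, the token would typically live in `.dlt/secrets.toml` rather than in an environment variable or in code.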

Example projects

Here are some examples from dlt users and working students:

DTC learners showcase

Check out the incredible projects from our DTC learners:

Explore these projects to see the innovative solutions and hard work the learners have put into their data engineering journeys!

Showcase your project

If you want your project to be featured, let us know in the #sharing-and-contributing channel of our community Slack.