API playground: Free APIs for personal data projects
- Adrian Brudaru, Co-Founder & CDO
Free APIs for Data Engineering
Practicing data engineering is better with real data sources. If you are planning a data engineering project, keep the following in mind:
- Ideally, your data has entities and activities, so you can model dimensions and facts.
- Ideally, the APIs have no auth, so they can be easily tested.
- Ideally, the API should have some use case that you are modelling and showing the data for.
- Ideally, you build end-to-end pipelines to showcase extraction, ingestion, modelling and displaying data.
This article outlines 25 APIs, detailing their use cases, any free tier limitations, and authentication needs.
Material teaching data loading with dlt:
- Data Talks Club Data Engineering Zoomcamp
- Data Talks Club Open Source Spotlight
- Docs
APIs Overview
1. PokeAPI
- URL: PokeAPI.
- Use: Import Pokémon data for projects on data relationships and stats visualization.
- Free: Rate-limited to 100 requests/IP/minute.
- Auth: None.
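To show how quickly a no-auth API like this can be tested, here is a minimal sketch using the requests library (endpoint paths as documented by PokeAPI; the batch size and the fields extracted are arbitrary choices):
import requests

BASE_URL = "https://pokeapi.co/api/v2"

# List the first 20 Pokémon (name + detail URL), then fetch a few stats per entity
listing = requests.get(f"{BASE_URL}/pokemon", params={"limit": 20}, timeout=10).json()
rows = []
for item in listing["results"]:
    detail = requests.get(item["url"], timeout=10).json()
    rows.append(
        {
            "name": detail["name"],
            "height": detail["height"],
            "weight": detail["weight"],
            "base_experience": detail["base_experience"],
        }
    )
print(rows[:3])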
2. REST Countries API
- URL: REST Countries.
- Use: Access country data for projects analyzing global metrics.
- Free: Unlimited.
- Auth: None.
3. OpenWeather API
- URL: OpenWeather.
- Use: Fetch weather data for climate analysis and predictive modeling.
- Free: Limited requests and features.
- Auth: API key.
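For key-based APIs like this one, keep the key out of your code. A minimal sketch, assuming OpenWeather's documented current-weather endpoint and an environment variable name of your own choosing:
import os

import requests

# The variable name is only an example; export your own key before running
api_key = os.environ["OPENWEATHER_API_KEY"]
resp = requests.get(
    "https://api.openweathermap.org/data/2.5/weather",
    params={"q": "Berlin", "units": "metric", "appid": api_key},
    timeout=10,
)
resp.raise_for_status()
weather = resp.json()
print(weather["main"]["temp"], weather["weather"][0]["description"])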
4. JSONPlaceholder API
- URL: JSONPlaceholder.
- Use: Ideal for testing and prototyping with fake data. Use it to simulate CRUD operations on posts, comments, and user data.
- Free: Unlimited.
- Auth: None required.
5. Quandl API
- URL: Quandl.
- Use: For financial market trends and economic indicators analysis.
- Free: Some datasets require premium.
- Auth: API key.
6. GitHub API
- URL: GitHub API
- Use: Analyze open-source trends, collaborations, or stargazers data. You can use it from our verified sources repository.
- Free: 60 requests/hour unauthenticated, 5000 authenticated.
- Auth: OAuth or personal access token.
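A sketch of an authenticated call with a personal access token (the environment variable name and the example repository are assumptions, not requirements). The token is optional for public data, but it raises the rate limit from 60 to 5,000 requests/hour:
import os

import requests

headers = {"Accept": "application/vnd.github+json"}
token = os.environ.get("GITHUB_TOKEN")  # example variable name; optional for public data
if token:
    headers["Authorization"] = f"Bearer {token}"

# Example repository; swap in whatever project you want to analyze
resp = requests.get(
    "https://api.github.com/repos/dlt-hub/dlt/stargazers",
    headers=headers,
    params={"per_page": 100},
    timeout=10,
)
resp.raise_for_status()
print(len(resp.json()), "stargazers on this page")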
7. NASA API
- URL: NASA API.
- Use: Space-related data for projects on space exploration or earth science.
- Free: Rate-limited.
- Auth: API key.
8. The Movie Database (TMDb) API
- URL: TMDb API.
- Use: Movie and TV data for entertainment industry trend analysis.
- Free: Requires attribution.
- Auth: API key.
9. CoinGecko API
- URL: CoinGecko API.
- Use: Cryptocurrency data for market trend analysis or predictive modeling.
- Free: Rate-limited.
- Auth: None.
10. Public APIs GitHub list
- URL: Public APIs list.
- Use: Discover APIs for various projects. A meta-resource.
- Free: Varies by API.
- Auth: Depends on API.
11. News API
- URL: News API.
- Use: Get datasets containing current and historic news articles.
- Free: Access to current news articles.
- Auth: API-Key.
12. Exchangerates API
- URL: Exchangerate API.
- Use: Get real-time, intraday, and historical currency rates.
- Free: 250 monthly requests.
- Auth: API-Key.
13. Spotify API
- URL: Spotify API.
- Use: Get Spotify content and metadata about songs.
- Free: Rate limit.
- Auth: API-Key.
14. Football API
- URL: Football API.
- Use: Get information about Football Leagues & Cups.
- Free: 100 requests/day.
- Auth: API-Key.
15. Yahoo Finance API
- URL: Yahoo Finance API.
- Use: Access a wide range of financial data.
- Free: 500 requests/month.
- Auth: API-Key.
16. Basketball API
- URL: Basketball API.
- Use: Get information about basketball leagues & cups.
- Free: 100 requests/day.
- Auth: API-Key.
17. NY Times API
- URL: NY Times API.
- Use: Get info about articles, books, movies and more.
- Free: 500 requests/day or 5 requests/minute.
- Auth: API-Key.
18. Spoonacular API
- URL: Spoonacular API.
- Use: Get info about ingredients, recipes, products and menu items.
- Free: 150 requests/day and 1 request/sec.
- Auth: API-Key.
19. Movie database alternative API
- URL: Movie database alternative API.
- Use: Movie data for entertainment industry trend analysis.
- Free: 1000 requests/day and 10 requests/sec.
- Auth: API-Key.
20. RAWG Video games database API
- URL: RAWG Video Games Database.
- Use: Gather video game data, such as release dates, platforms, genres, and reviews.
- Free: Unlimited requests for limited endpoints.
- Auth: API key.
21. Jikan API
- URL: Jikan API.
- Use: Access data from MyAnimeList for anime and manga projects.
- Free: Rate-limited.
- Auth: None.
22. Open Library Books API
- URL: Open Library Books API.
- Use: Access data about millions of books, including titles, authors, and publication dates.
- Free: Unlimited.
- Auth: None.
23. YouTube Data API
- URL: YouTube Data API.
- Use: Access YouTube video data, channels, playlists, etc.
- Free: Limited quota.
- Auth: Google API key and OAuth 2.0.
24. Reddit API
- URL: Reddit API.
- Use: Access Reddit data for social media analysis or content retrieval.
- Free: Rate-limited.
- Auth: OAuth 2.0.
25. World Bank API
- URL: World Bank API.
- Use: Access economic and development data from the World Bank.
- Free: Unlimited.
- Auth: None.
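As a last sketch, a pull from the World Bank's v2 indicator endpoint (the indicator code and year are just examples). The API pages its results; per_page is set high enough here to fetch all countries in one request, and the response is a two-element list of paging metadata plus data rows:
import requests

# Total population (SP.POP.TOTL) for all countries in a single year
resp = requests.get(
    "https://api.worldbank.org/v2/country/all/indicator/SP.POP.TOTL",
    params={"format": "json", "date": "2022", "per_page": 500},
    timeout=30,
)
resp.raise_for_status()
meta, rows = resp.json()
print(meta["total"], "rows in total,", len(rows), "on this page")
print(rows[0]["country"]["value"], rows[0]["value"])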
Each API offers unique insights for data engineering, from ingestion to visualization. Check each API's documentation for up-to-date details on limitations and authentication.
Using the above sources
You can create a pipeline for any of the APIs discussed above using dlt's REST API source. Let's create a PokeAPI pipeline as an example. Follow these steps:
Create a REST API source:
dlt init rest_api duckdb
The following directory structure gets generated:
rest_api_pipeline/
├── .dlt/
│   ├── config.toml       # configs for your pipeline
│   └── secrets.toml      # secrets for your pipeline
├── rest_api/             # folder with source-specific files
│   └── ...
├── rest_api_pipeline.py  # your main pipeline script
├── requirements.txt      # dependencies for your pipeline
└── .gitignore            # ignore files for git (not required)
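Install the dependencies pinned in the generated requirements.txt (standard pip usage, ideally inside a virtual environment):
pip install -r requirements.txt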
Configure the source in rest_api_pipeline.py:
import dlt

# rest_api is the folder scaffolded by `dlt init rest_api duckdb`
from rest_api import rest_api_source


def load_pokemon() -> None:
    pipeline = dlt.pipeline(
        pipeline_name="rest_api_pokemon",
        destination="duckdb",
        dataset_name="rest_api_data",
    )
    pokemon_source = rest_api_source(
        {
            "client": {
                "base_url": "https://pokeapi.co/api/v2/",
            },
            # defaults applied to every resource listed below
            "resource_defaults": {
                "endpoint": {
                    "params": {
                        "limit": 1000,
                    },
                },
            },
            # each resource name maps to an endpoint under base_url
            "resources": [
                "pokemon",
                "berry",
                "location",
            ],
        }
    )
    load_info = pipeline.run(pokemon_source)
    print(load_info)


if __name__ == "__main__":
    load_pokemon()
For a detailed guide on creating a pipeline using the REST API source, please read the REST API source documentation here.
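To run the script and inspect what was loaded, the usual dlt workflow applies (the pipeline name below matches the one set in the script; the show command requires streamlit to be installed):
python rest_api_pipeline.py
dlt pipeline rest_api_pokemon show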
Example projects
Here are some examples from dlt users and working students:
- A pipeline that pulls data from an API and produces a dashboard in the dbt blog.
- A streaming pipeline on GCP that replaces expensive tools such as Segment/5tran with a setup that is 50-100x cheaper.
- Another streaming pipeline on AWS for a slightly different use case.
- Orchestrator + email + AI + Slack to summarize emails.
- Evaluate a frontend tool to show your ability to deliver end-to-end.
- An end-to-end data lineage implementation from extraction to dashboard.
- A bird pipeline and the associated schema management that keeps it running smoothly (Part 1, Part 2).
- Japanese-language demos: a Notion calendar and exploring CSV to BigQuery with dlt.
- Demos with Dagster and Prefect.
DTC learners showcase
Check out the incredible projects from our DTC learners:
- e2e_de_project by scpkobayashi.
- de-zoomcamp-project by theDataFixer.
- data-engineering-zoomcamp2024-project2 by pavlokurochka.
- de-zoomcamp-2024 by snehangsude.
- zoomcamp-data-engineer-2024 by eokwukwe.
- data-engineering-zoomcamp-alex by aaalexlit.
- Zoomcamp2024 by alfredzou.
- data-engineering-zoomcamp by el-grudge.
Explore these projects to see the innovative solutions and hard work the learners have put into their data engineering journeys!
Showcase your project
If you want your project to be featured, let us know in the #sharing-and-contributing channel of our community Slack.