Facebook-groups Python API Docs | dltHub
Build a Facebook-groups-to-database pipeline in Python using dlt with AI Workbench support for Claude Code, Cursor, and Codex.
Facebook Groups API is an extension of the Facebook Graph API that provides programmatic access to Facebook Group objects and their edges (members, feed, albums, events) via HTTPS REST endpoints. The REST API base URL is https://graph.facebook.com and all requests require an access token (User, Page, or App) — sent as query parameter or Bearer token.
dlt is an open-source Python library that handles authentication, pagination, and schema evolution automatically. dltHub provides AI context files that enable code assistants to generate production-ready pipelines. Install with uv pip install "dlt[workspace]" and start loading Facebook-groups data in under 10 minutes.
What data can I load from Facebook-groups?
Here are some of the endpoints you can load from Facebook-groups:
| Resource | Endpoint | Method | Data selector | Description |
|---|---|---|---|---|
| me_groups | me/groups | GET | data | Lists Groups the current user is a member/administrator of (requires appropriate permissions). |
| groups | {group-id} | GET | — | Read a single Group node (returns object fields for the Group). |
| group_members | {group-id}/members | GET | data | Lists members of the Group (edge returning list under the "data" key; requires groups_access_member_info or related permission). |
| group_feed | {group-id}/feed | GET | data | Lists posts in the Group feed (edge returns posts array under "data"; supports fields parameter). |
| group_albums | {group-id}/albums | GET | data | Lists albums in the Group (edge returns array under "data"). |
| group_events | {group-id}/events | GET | data | Lists events associated with the Group (edge returns array under "data"). |
| group_docs | {group-id}/docs | GET | data | (If available in version) Lists docs/files for the Group (edge returns array under "data"). |
How do I authenticate with the Facebook-groups API?
The Graph API uses OAuth access tokens (User, Page, or App tokens). Include the token as the access_token query parameter (e.g. ?access_token=TOKEN) or in the HTTP Authorization header as Bearer TOKEN. Required permissions depend on the endpoint (e.g., user_groups, groups_access_member_info, pages_read_engagement).
1. Get your credentials
1. Create or use an existing Facebook App at https://developers.facebook.com/apps.
2. Configure Facebook Login and the required permissions (e.g. user_groups or groups_access_member_info).
3. Use the Graph API Explorer or implement an OAuth flow to obtain a User or Page access token with the granted scopes.
4. Exchange short-lived tokens for long-lived tokens if needed via the token exchange endpoint.
5. Install the app in the Group if the Group API requires app installation.
2. Add them to .dlt/secrets.toml
```toml
[sources.facebook_groups_source]
access_token = "your_user_or_page_access_token_here"
```
dlt reads this automatically at runtime — never hardcode tokens in your pipeline script. For production environments, see setting up credentials with dlt for environment variable and vault-based options.
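In production, the same secret can be supplied as an environment variable instead; dlt maps the TOML section path to a double-underscore-delimited name, so for the sources.facebook_groups_source section above that would be:

```shell
export SOURCES__FACEBOOK_GROUPS_SOURCE__ACCESS_TOKEN="your_user_or_page_access_token_here"
```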
How do I set up and run the pipeline?
Set up a virtual environment and install dlt:
```shell
uv venv && source .venv/bin/activate
uv pip install "dlt[workspace]"
```
1. Install the dlt AI Workbench:
```shell
dlt ai init --agent <your-agent>  # <agent>: claude | cursor | codex
```
This installs project rules, a secrets management skill, appropriate ignore files, and configures the dlt MCP server for your agent. Learn more →
2. Install the rest-api-pipeline toolkit:
```shell
dlt ai toolkit rest-api-pipeline install
```
This loads the skills and context about dlt the agent uses to build the pipeline iteratively, efficiently, and safely. The agent uses MCP tools to inspect credentials — it never needs to read your secrets.toml directly. Learn more →
3. Start LLM-assisted coding:
Use /find-source to load data from the Facebook-groups API into DuckDB.
The rest-api-pipeline toolkit takes over from here — it reads relevant API documentation, presents you with options for which endpoints to load, and follows a structured workflow to scaffold, debug, and validate the pipeline step by step.
4. Run the pipeline:
```shell
python facebook_groups_pipeline.py
```
If everything is configured correctly, you'll see output like this:
```
Pipeline facebook_groups_pipeline load step completed in 0.26 seconds
1 load package(s) were loaded to destination duckdb and into dataset facebook_groups_data
The duckdb destination used duckdb:/facebook_groups.duckdb location to store data
Load package 1749667187.541553 is LOADED and contains no failed jobs
```
Inspect your pipeline and data:
```shell
dlt pipeline facebook_groups_pipeline show
```
This opens the Pipeline Dashboard where you can verify pipeline state, load metrics, schema (tables, columns, types), and query the loaded data directly.
Python pipeline example
This example loads me/groups and {group-id}/feed from the Facebook-groups API into DuckDB. It mirrors the endpoint and data selector configuration from the table above:
```python
import dlt
from dlt.sources.rest_api import RESTAPIConfig, rest_api_resources


@dlt.source
def facebook_groups_source(access_token=dlt.secrets.value):
    config: RESTAPIConfig = {
        "client": {
            "base_url": "https://graph.facebook.com",
            "auth": {
                "type": "bearer",
                "token": access_token,
            },
        },
        "resources": [
            {"name": "me_groups", "endpoint": {"path": "me/groups", "data_selector": "data"}},
            # Replace {group-id} with the numeric ID of the Group you want to load
            {"name": "group_feed", "endpoint": {"path": "{group-id}/feed", "data_selector": "data"}},
        ],
    }
    yield from rest_api_resources(config)


def get_data() -> None:
    pipeline = dlt.pipeline(
        pipeline_name="facebook_groups_pipeline",
        destination="duckdb",
        dataset_name="facebook_groups_data",
    )
    load_info = pipeline.run(facebook_groups_source())
    print(load_info)


if __name__ == "__main__":
    get_data()
```
To add more endpoints, append entries from the resource table to the "resources" list using the same name, path, and data_selector pattern.
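For example, a group_members entry built from the corresponding table row (the {group-id} placeholder must be replaced with a real Group ID) would look like:

```python
# Extra resource entry following the same name / path / data_selector pattern
# as the resource table above.
group_members = {
    "name": "group_members",
    "endpoint": {
        "path": "{group-id}/members",  # replace {group-id} with a real Group ID
        "data_selector": "data",
    },
}
```

Appending this dict to the "resources" list in the RESTAPIConfig produces one additional table, group_members, on the next pipeline run.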
How do I query the loaded data?
Once the pipeline runs, dlt creates one table per resource. You can query with Python or SQL.
Python (pandas DataFrame):
```python
import dlt

data = dlt.pipeline("facebook_groups_pipeline").dataset()
me_groups_df = data.me_groups.df()
print(me_groups_df.head())
```
SQL (DuckDB example):
```sql
SELECT * FROM facebook_groups_data.me_groups LIMIT 10;
```
In a marimo or Jupyter notebook:
```python
import dlt

data = dlt.pipeline("facebook_groups_pipeline").dataset()
data.me_groups.df().head()
```
See how to explore your data in marimo Notebooks and how to query your data in Python with dataset.
What destinations can I load Facebook-groups data to?
dlt supports loading into any of these destinations — only the destination parameter changes:
| Destination | Example value |
|---|---|
| DuckDB (local, default) | "duckdb" |
| PostgreSQL | "postgres" |
| BigQuery | "bigquery" |
| Snowflake | "snowflake" |
| Redshift | "redshift" |
| Databricks | "databricks" |
| Filesystem (S3, GCS, Azure) | "filesystem" |
Change the destination in dlt.pipeline(destination="snowflake") and add credentials in .dlt/secrets.toml. See the full destinations list.
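For instance, switching to Snowflake changes only the destination argument plus a credentials section in .dlt/secrets.toml. The field names below follow dlt's usual Snowflake credential layout; all values are placeholders for your own account details:

```toml
[destination.snowflake.credentials]
database = "dlt_data"
username = "loader"
password = "<password>"
host = "<account_identifier>"
warehouse = "COMPUTE_WH"
role = "DLT_LOADER_ROLE"
```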
Troubleshooting
Authentication failures
If you receive an OAuthException or 190 error, check that your access token is valid, not expired, and has the required scopes for the requested endpoint. Use the Access Token Debugger (developers.facebook.com/tools/debug/accesstoken/) to inspect token type and granted permissions. Ensure the app is installed in the Group when required.
Permissions and missing fields
Many Group edges require specific permissions (e.g., user_groups, groups_access_member_info). Requests lacking required permissions will return permission errors. Request only permitted fields using the fields parameter and ensure the token's app and user consent include those fields.
Rate limits and throttling
Facebook enforces rate limiting per app and token. If you see error codes indicating rate limit or Too Many Requests, implement exponential backoff and respect Retry-After headers. Monitor app-level usage in the App Dashboard.
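A minimal backoff sketch of that advice (RateLimitError and the delay policy are illustrative, not part of any Facebook SDK; in practice you would raise it when the response carries a rate-limit error code or a Retry-After header):

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical exception carrying an optional Retry-After hint in seconds."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited")
        self.retry_after = retry_after


def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry `call` on rate limits, preferring the server's Retry-After hint."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimitError as exc:
            if exc.retry_after is not None:
                delay = exc.retry_after  # respect the server's hint
            else:
                # Exponential backoff with jitter: ~1s, ~2s, ~4s, ...
                delay = base_delay * (2 ** attempt) + random.random()
            time.sleep(delay)
    return call()  # final attempt; let any remaining error propagate
```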
Pagination
List edges return paginated results inside the "data" array along with a "paging" object containing cursors and next/previous URLs. Use the provided "paging.next" URL or the cursors (after/before) parameters to iterate pages. Example response shape: {"data": [...], "paging": {"cursors": {"before":"...","after":"..."}, "next":"https://graph.facebook.com/..."}}
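The response shape above can be walked with a small cursor-following loop (fetch stands in for any callable that returns the decoded JSON for a URL; with dlt's rest_api source, configuring a json_link-style paginator pointed at paging.next achieves the same thing declaratively):

```python
def iter_pages(fetch, first_url):
    """Yield every item across pages, following paging.next until it disappears."""
    url = first_url
    while url:
        page = fetch(url)  # decoded JSON: {"data": [...], "paging": {...}}
        yield from page.get("data", [])
        # The last page has no "next" link, which ends the loop.
        url = page.get("paging", {}).get("next")
```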
Common error responses
- OAuthException (code 190): Invalid or expired access token.
- Permissions error (code 10/200): Missing permission or app not installed.
- Unsupported get request (code 100): Invalid object ID or insufficient access.
- Rate limit / (#4) Too many calls: throttle and retry with backoff.
Ensure that the access token is valid to avoid 401 Unauthorized errors, and verify endpoint paths and parameters to avoid 404 Not Found errors.
Next steps
Continue your data engineering journey with the other toolkits of the dltHub AI Workbench:
- data-exploration — Build custom notebooks, charts, and dashboards for deeper analysis with marimo notebooks.
- dlthub-runtime — Deploy, schedule, and monitor your pipeline in production.
```shell
dlt ai toolkit data-exploration install
dlt ai toolkit dlthub-runtime install
```