Don't just take our word for it
Today dlt is used by senior data engineers and other internal data tool builders to create data pipelines, data warehouses, and data lakes that democratize access to data in their organizations. Since launching a few months ago, dlt has built a growing community of Python developers and is deployed in production at several scale-up tech companies.
Leveraging dlt has changed our data operations. It has empowered our internal stakeholders, including product, business, and operations teams, to independently satisfy a majority of their data needs through self-service. This shift in responsibility has accelerated the pace of our DataOps team: we spend less time on the EL and more on the T, whilst still being able to deeply customize our extractors as business requirements evolve. I chose dlt because I immediately saw the value in its core proposition. The most manual, error-prone part of a data engineer's job is twofold: managing evolving schemas, and consistently loading data to a destination from in-memory data structures (lists, dicts). dlt enabled me to completely rewrite all of our core SaaS service pipelines in two weeks, and we now have data pipelines in production with full confidence that they will never break due to changing schemas. Furthermore, we completely removed our reliance on disparate, unreliable external Singer components whilst maintaining a very light code footprint.
The current machine learning revolution has been enabled by the Cambrian explosion of Python open-source tools that have become so accessible that a wide range of practitioners can use them. I believe this revolution will also extend to the field of data, as these Python developers have started to move into the enterprise in recent years. The starting point of every data workflow is to create a dataset. As a simple-to-use Python library, dlt is the first tool that this new wave of people can use. By leveraging this library, we can extend the machine learning revolution into enterprise data.
dlt is empowering our data engineering team with unparalleled flexibility and speed. We at USC Group prioritise speed of delivery, and our engineers agree that it has saved them precious time: dlt is a game-changer for data integration into BigQuery. Its remarkable flexibility allows seamless integration of data from various sources through customizable API endpoints. Unlike traditional ingestion tools with limited options, dlt removes the complexities associated with API programming, enabling us to deliver code at an unprecedented pace.
I've been relying on dltHub in both my founder and freelance roles for months, and it has taken a huge load off my mind when it comes to the gritty details of data loading and schemas. Coming from an ML background, I don't have a deep understanding of the finer points of databases and their best practices, which often turned data handling into a substantial bottleneck. As a founder, a lack of appropriate data can undermine decision-making, and as a freelancer, spending excessive time setting up data loading can eat into your profit margins. In this respect, dltHub has become as fundamental to my workflow as HuggingFace and GitHub.
dlt + dbt can be a killer combination for customers. dlt is built on Python libraries and is compatible with any source that exposes an API or functions returning data in JSON and similar formats. It supports schema inference, is declarative in nature, is easy to install, and can be dockerized. Because it is built on Python libraries, the near future looks very promising: the EL space can incorporate end-to-end governance capabilities such as data contracts, data profiling, data quality, and data observability, and integrate GPT-4 to provide co-pilot functionality.