Real-Time Data Transformation and Analytics with dbt Labs

cnfl.io/podcast-episode-259 | dbt is known as being part of the Modern Data Stack (MDS) for ELT processes. Being in the MDS, dbt Labs believes in having the best of breed for every part of the stack. Oftentimes folks are using an extract-and-load (EL) tool like Fivetran to pull data from the database into the warehouse, then using dbt to manage the transformations in the warehouse. Analysts can then build dashboards on top of that data, or execute tests.
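
For readers new to the pattern, here is a minimal sketch of load-first, transform-in-the-warehouse, and then test. It is an illustration only, not dbt or Fivetran code: an in-memory SQLite database stands in for the warehouse, and the table, view, and function names (raw_orders, customer_totals, run_elt, failing_rows) are hypothetical. The test follows the convention dbt uses for data tests, where a query returns the rows that break a rule and an empty result means the test passes.

```python
import sqlite3


def run_elt(rows):
    """Load raw rows untouched, then transform them inside the 'warehouse' with SQL."""
    con = sqlite3.connect(":memory:")  # assumption: SQLite stands in for a real warehouse
    # Extract + load: land the source rows as-is (the job an EL tool would do).
    con.execute(
        "CREATE TABLE raw_orders (id INTEGER, customer TEXT, amount_cents INTEGER)"
    )
    con.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", rows)
    # Transform: a SQL view plays the role a dbt model would play.
    con.execute(
        "CREATE VIEW customer_totals AS "
        "SELECT customer, SUM(amount_cents) / 100.0 AS total_dollars "
        "FROM raw_orders GROUP BY customer"
    )
    return con


def failing_rows(con):
    """Data test: return the rows that violate the rule; an empty list means pass."""
    return con.execute(
        "SELECT customer FROM customer_totals "
        "WHERE customer IS NULL OR total_dollars < 0"
    ).fetchall()


con = run_elt([(1, "ada", 1250), (2, "ada", 750), (3, "grace", 500)])
totals = dict(con.execute("SELECT customer, total_dollars FROM customer_totals"))
print(totals)
print("test passed" if not failing_rows(con) else "test failed")
```

The transformation lives in SQL next to the data rather than in the loading step, which is what lets it be version-controlled and tested like any other code.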

An analyst can adapt this process to a microservice application built on Apache Kafka®, using the same method to pull batch data out of each and every database. In this episode, however, Amy Chen (Partner Engineering Manager, dbt Labs) tells Kris about a better way forward for analysts willing to adopt the streaming mindset: reusable pipelines built from dbt models that pull events into the warehouse immediately and are materialized as materialized views by default.

dbt Labs is the company that makes and maintains dbt. dbt Core is the open-source data transformation framework that allows data teams to operate with software engineering’s best practices. dbt Cloud is the fastest and most reliable way to deploy dbt.

Inside the world of event streaming, there is a push to expand data access beyond the programmers writing the code, and towards everyone involved in the business. Over at dbt Labs, they’re attempting something of the reverse: getting data analysts to adopt the best practices of software engineers and, more recently, of streaming programmers. They’re improving the process of building data pipelines while empowering businesses to bring more contributors into the analytics process, with an easy-to-deploy, easy-to-maintain platform. It offers version control to analysts who traditionally don’t have access to Git, along with the ability to easily automate testing, all in the same place.

In this episode, Kris and Amy explore:
- How to revolutionize testing for analysts with two of dbt’s core functionalities
- What streaming in a batch-based analytics world should look like
- What can be done to improve workflows
- How to democratize access to data for everyone in the business

EPISODE LINKS
► Learn more about dbt Labs: getdbt.com/
► An Analytics Engineer’s Guide to Streaming: cnfl.io/an-analytics-engineers-guide-to-streaming-episode-259
► Panel discussion: If Streaming Is the Answer, Why Are We Still Doing Batch?: cnfl.io/if-streaming-is-the-answer-why-are-we-still-doing-batch-episode-259
► All Current 2022 sessions and slides: cnfl.io/current-2022-episode-259
► Kris Jenkins’ Twitter: twitter.com/krisajenkins
► Streaming Audio Playlist: youtube.com/playlist?list=PLa7VYi0yPIH1B0i7mhzVi78TIkKSd-0vE
► Join the Confluent Community: cnfl.io/confluent-community-episode-259
► Learn more with Kafka tutorials, resources, and guides at Confluent Developer: cnfl.io/confluent-developer-episode-259
► Live demo: Intro to Event-Driven Microservices with Confluent: cnfl.io/demo-intro-to-event-driven-microservices-with-confluent-episode-259
► Use PODCAST100 to get an additional $100 of free Confluent Cloud usage: cnfl.io/try-cloud-episode-259
► Promo code details: cnfl.io/podcast100-details-episode-259

TIMESTAMPS
0:00 - Intro
3:48 - What is MDS?
8:48 - What is dbt?
10:32 - Who uses dbt?
14:30 - How does someone get started with dbt?
20:44 - How does dbt fit into the world of streaming?
24:04 - How can you do unit testing with dbt?
26:12 - Will batch and streaming always be a part of the solution?
32:54 - What are event streamers doing wrong?
37:19 - What are some things to know about data testing with dbt?
40:41 - What should people be watching for in the industry?
41:52 - It's a wrap!

ABOUT CONFLUENT
Confluent is pioneering a fundamentally new category of data infrastructure focused on data in motion. Confluent’s cloud-native offering is the foundational platform for data in motion – designed to be the intelligent connective tissue enabling real-time data, from multiple sources, to constantly stream across the organization. With Confluent, organizations can meet the new business imperative of delivering rich, digital front-end customer experiences and transitioning to sophisticated, real-time, software-driven backend operations. To learn more, please visit confluent.io.

#streamprocessing #apachekafka #kafka #confluent
