
  • An Introduction To The dbt Plugin API

    28 Aug 2023

    In dbt-core 1.6.0, Michelle Ark made an exciting addition: a plugin system! This system lets third-party code seamlessly integrate models and artifacts into dbt Core’s compilation process, offering new possibilities for dbt users. Initially proposed by Kshitij Aranke in March 2023 as part of a discussion around creating a Python SDK for dbt, the plugin system first came to fruition to support dbt Labs’ proprietary multi-project collaboration product.

    GAI generated image of a pluggable interface

    Despite its quiet release, the dbt Core plugin system brings great benefits to the OSS dbt community: it allows native cross-project references without importing projects as packages, the crafting of synthetic model nodes, and the creation of custom artifacts. Although the plugin interface is undocumented at present, this post aims to unravel it, guiding you through creating your own plugins and tapping into the dbt plugin system’s potential.

    Read more...

  • From Overgrown To Thriving: Scaling Your dbt Project Like a Gardener

    24 Aug 2023

    After taking some inspiration from my last couple of years of work, I decided to put pen to paper and formalize the lessons I’ve learned about how to scale out dbt projects. In a fortuitous turn of events, I had an opportunity to present these thoughts at MDS Fest 2023.

    Here are the slides to my session “From Overgrown to Thriving: Scaling Your dbt Project Like a Gardener”, in which I describe how to go about refining and rehabilitating truly massive dbt projects.

    For talk slides and a full breakdown of the tools I talk about, please visit the presentation’s git repository.

    Read more...

  • Get More Out of Your DAG

    20 Oct 2022

    This year I had the absolute pleasure to present a workshop at Coalesce 2022 in the heart of New Orleans. Along with Wasila Quader, I presented advanced uses of Jinja and dbt Macros to perform dynamic modeling including: run result storage in a data warehouse, dynamic value lookup in models, and leveraging model metadata in macros.

    If you’d like to follow along at home, you can use the Get More out of Your DAG dbt project to get started!

    Read more...

  • A Mosaic of 2020

    17 Jan 2021

    2020 was an odd one. There was jubilation, and there was grief. There was frustration, and there was pride. While I have always been one to reflect, 2020 is the first year that I’ve collected a full year of mood tracking data. In this case, the data can paint a very accurate portrait of the rollercoaster ride that was 2020.

    Figure 1: Daily average mood heatmap for 2020. Mood data was collected at multiple times per day on a 1 (Awful) to 5 (Rad) scale and aggregated at a daily grain. Significant days -- highlighted using red borders -- indicate the day my daughter was born (2/6/2020) and the day I was diagnosed with a bone tumor (10/13/2020).

    Read more...

  • To Hell And Back

    06 Jan 2020

    This summer I learned a new card game called “To Hell and Back”. Similar to “Oh Hell” and “Rats!”, “To Hell and Back” is a trick-taking card game where you bid the number of tricks you intend to take, and you must take exactly that number of tricks per hand in order to win. Bid correctly, and you earn your bid and 10 extra points. Miss your bid, and you get zilch. Unlike its other variations, “To Hell and Back” starts with all players being dealt one card, and the number of cards dealt increases each hand until we hit the maximum for all players. In the case of one-card hands, the differences between success and failure are luck of the draw and careful bidding.

    Figure 1: Total non-sequitur, but I learned how to shuffle cards while playing "To Hell and Back". I've only become decent at shuffling cards this last year after a considerable amount of practice. Now I can bridge in addition to a riffle shuffle!

    Read more...

  • Python Buffalo: Web Scraping for Good

    20 Sep 2017

    The topic of the September joint Python Buffalo/Data Science meet-up is data scraping. To finish up our conversation about how we can use Python to scrape data from public sources, I presented a short slide deck on the ethics of web scraping. The general thesis is that it very much depends on what you are doing and how you do it, and that in the end we should all strive to be good members of the data community. Presentation.

    Read more...