Nick Schrock Podcast Transcript


Nick Schrock joins host Brian Thomas on The Digital Executive Podcast.

Welcome to Coruzant Technologies, home of The Digital Executive Podcast.

Brian Thomas: Welcome to the Digital Executive. Today’s guest is Nick Schrock. Nick Schrock is the founder and CTO of Dagster Labs, the company behind Dagster, a popular open-source data orchestration platform. Before Dagster Labs, he was a principal engineer and director at Facebook from 2009 to 2017, where he founded the product infrastructure team and co-created GraphQL.

After cutting his teeth at Facebook, he pursued his passion for working on engineering pain points. After hearing that data infrastructure was a big issue, he founded Dagster to address it, highlighting how quickly open-source projects were able to make an impact at legacy companies.

Well, good afternoon, Nick. Welcome to the show!

Nick Schrock: Thanks, Brian. Good to be with you.

Brian Thomas: Awesome. Great to be here on a lovely blustery day. You know, the wintertime is just always my favorite time, and I’m a little bit joking there, but we just got dumped on with some really, really cold weather in the Midwest. I know the rest of the country’s kind of experiencing the same.

Nick Schrock: So, yeah, I trudged through snow in New York to get to the office today.

Brian Thomas: Yeah, it’s a little crazy, but gotta embrace it. We got to get through it. But Nick, let’s jump into your questions here. You’ve got an amazing career. You’re the founder of Dagster Labs. What inspired you to create the Dagster data orchestration platform, and how does it innovate in the field of data engineering?

Nick Schrock: Well, you know, it’s an orchestration platform, and orchestration means ordering and scheduling computations. But it really is a lot more than that. I really think about the value it provides on a few dimensions. One, it is designed for the notion that data engineering is software engineering rather than some sort of foreign discipline. And really the foundation of productivity, which is one of the pillars of value in software engineering, is having a highly productive workflow that’s oriented around testing. So, Dagster is built for that from the ground up.

Dagster also is unique among the orchestration category in that the leading abstraction, the core abstraction of the system, is not a task, but a data asset. So, you think about it: what are the data assets in the data platform that you want to design? And then you work backwards from there. And we try and make execution and scheduling more declarative and kind of something that happens behind the scenes. And when your orchestration platform is asset oriented, what’s interesting is you get a bunch of stuff for free, so lineage just kind of comes for free as a result of the programming model.

You get a base level of cataloging as a result, and that kind of leads into the third pillar of value. You get consolidation of tools: because of the asset orientation, you also have a ton of context operationally about what’s going on, and that allows you to kind of unify and consolidate your platform.

But, you know, we really view it as the toolkit for building this asset-oriented data platform inside of a company. It delivers a ton of value across a number of dimensions, and that value compounds over time as you invest more into it.
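
The asset-oriented idea described in this answer can be illustrated with a minimal Python sketch. This is not Dagster’s API: the `asset` decorator, the `ASSETS` registry, and the `upstream`, `lineage`, and `materialize` helpers are hypothetical names made up for this illustration, and the order data is invented. The sketch only shows how declaring data assets as functions, with dependencies read from parameter names, lets execution order and lineage fall out of the definitions.

```python
import inspect
from typing import Any, Callable, Dict, List, Optional

# Hypothetical registry for this sketch only; not Dagster's API.
ASSETS: Dict[str, Callable[..., Any]] = {}


def asset(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function as a data asset named after the function."""
    ASSETS[fn.__name__] = fn
    return fn


def upstream(name: str) -> List[str]:
    """Direct dependencies of an asset, read from its parameter names."""
    return list(inspect.signature(ASSETS[name]).parameters)


def lineage(name: str) -> List[str]:
    """All transitive upstream assets, listed in dependency order."""
    ordered: List[str] = []

    def visit(node: str) -> None:
        for dep in upstream(node):
            visit(dep)
            if dep not in ordered:
                ordered.append(dep)

    visit(name)
    return ordered


def materialize(name: str, cache: Optional[Dict[str, Any]] = None) -> Any:
    """Compute an asset after computing whatever it depends on."""
    cache = {} if cache is None else cache
    if name not in cache:
        inputs = {dep: materialize(dep, cache) for dep in upstream(name)}
        cache[name] = ASSETS[name](**inputs)
    return cache[name]


# Invented example assets: you declare what should exist, not the task order.
@asset
def raw_orders():
    return [
        {"day": "mon", "amount": 10.0},
        {"day": "mon", "amount": None},
        {"day": "tue", "amount": 5.0},
    ]


@asset
def cleaned_orders(raw_orders):
    return [order for order in raw_orders if order["amount"] is not None]


@asset
def daily_revenue(cleaned_orders):
    totals: Dict[str, float] = {}
    for order in cleaned_orders:
        totals[order["day"]] = totals.get(order["day"], 0.0) + order["amount"]
    return totals
```

In this sketch, `materialize("daily_revenue")` computes the upstream assets first, and `lineage("daily_revenue")` lists them, without any separately maintained dependency graph. That is the sense in which lineage comes for free from the programming model.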

Brian Thomas: Thank you. And I appreciate you breaking that down.

I love hearing, you know, that complete end-to-end process where you’ve kind of encapsulated all those services. Really do appreciate that. And moving on to the next question, Nick, can you describe your journey from being a principal engineer and director at Facebook to founding Dagster Labs? And how did your experiences at Facebook influence your entrepreneurial journey?

Nick Schrock: Yeah, so, my time at Facebook was primarily working on our internal developer tools and developer platform. So, I founded a team called Product Infrastructure, whose mission was to make our application developers more efficient and productive. And kind of the lead project, or the initial core project of that, was Facebook’s internal business object layer and a middle-tier query language.

That team evolved from building internal abstractions and frameworks to externalizing that work in the form of open-source projects. So projects like React came out of that team, and the one that I worked on directly was GraphQL, which effectively evolved from that core business object layer and middle-tier query language (I should say API, not language) that came out of that.

So, you know, a lot of the lessons we learned are very direct from that, I think both organizationally and in terms of technological artifacts. I think that in terms of the projects we worked on that gained, you know, a still shockingly large audience and popularity, the core abstractions and the mission and then the marketing around those really matter.

So, React very directly was about the idea that you’re building UI components and it’s a software engineering process. And GraphQL was really oriented around building a query language that is designed with the needs of front-end developers in mind, which led to its kind of hierarchical, query-language-all-the-way-down property.

And then, you know, I left Facebook in 2017 and was looking at what to do next. And I actually think in that process of discovery, I really thought more deeply about what motivates me on a personal level. So one thing I identified is that I naturally gravitate towards engineers who are in pain.

I like to say that I’m, you know, drawn to developer experience dumpster fires like a moth to a flame. To like the project I work on, I have to feel like there’s some novel technical insight to solve that pain, and also to solve an organizational problem, which leads to the third thing that I like, which is a technology that is a strategic point of leverage in the organization, meaning that it kind of defines the interfaces between teams.

And in fact, often the organization molds itself to mimic and model that. And last, just a problem that matters. And as I was exploring and thinking after I left Facebook, I really did find a problem which aligned all those properties, and did so fairly quickly, because I was talking to companies both inside and outside of the Valley.

And I would ask them what their biggest technical liability was, what was the engineering problem that kept them from moving forward with their business? And this notion of data and ML infrastructure came up over and over and over again. And, you know, what I found was engineers just grappling with these tools and processes. It was absolutely brutal.

I looked at how they were interacting with the orchestration layer, and the dominant incumbent was Airflow. And to me, it was clear that orchestration was this incredibly important leverage point in the org, because every practitioner has to interact with it either directly or indirectly.

And then in turn, it is a system which invokes every computation and therefore touches every storage engine. So definitely a strategic point of leverage. And then I was kind of prototyping ideas, and this notion of really leading with the data asset or data set, and defining that in software rather than thinking of it as a physical artifact, I felt was an interesting and valuable way of approaching the problem.

And then last thing, this notion of a problem that matters: I like to work on things with broad horizontal markets that could have a lot of impact. And certainly this is a problem that faces every modern enterprise, but more profoundly, these data pipelines and the assets that come out of them are incredibly important to modern society.

They’re the basis of most decision making in business, from data driving dashboards and making data-informed decision making possible, to automated decision-making processes in machine learning systems. And that has only increased in importance over time.

And it really kind of bothered me that, you know, these systems that determine how you price healthcare, or whether people get approved for mortgages or not, were built on really shaky foundations. And people felt they were building systems that they fundamentally could not control and understand. And I felt that was super motivating.

And that led to creating Dagster and founding Dagster Labs. So that is kind of the story there.

Brian Thomas: Thank you. And I love how, I think there’s a theme here on this podcast of how we’re always trying to build a solution to solve a problem or to help people out, right. Make the world a better place.

And, you know, you talked about that development dumpster fire, right. And helping out your fellow software engineers. And I think that’s just kind of set the foundation for your entrepreneurial journey. So, I appreciate that.

Nick Schrock: Yeah, completely. I guess one thing I didn’t mention from the Facebook experience is that our internal infrastructure teams, and I was, you know, a leader in one of them, really thought of ourselves as service organizations. We used that term a lot, and that speaks to exactly what you’re talking about.

You know, our job was not to convince a stakeholder, some high-level decision maker, to make a top-down decision and coerce teams to use something. Our job was to serve our partner teams directly and drive adoption in a similar way that you do with external teams. And that all comes from a place of being a service team that’s extremely customer focused.

And your job is to empower them to complete their work more quickly and in better ways.

Brian Thomas: Thank you. Appreciate that. And Nick, Dagster is known as a popular open-source platform. In your view, how do open-source projects like Dagster contribute to and shape the tech industry, especially in legacy companies?

Nick Schrock: On the legacy companies, I’d like to speak to that directly. One of the things I found very gratifying about the GraphQL experience is how early in its life cycle it was adopted by so-called legacy technologies. So, you know, yes, there was adoption in Silicon Valley companies, you could walk around San Francisco and kind of talk to people who were using GraphQL, but it was also adopted early in its life cycle by the likes of KLM and Walmart.

So, what that really showed me is that even in the legacy companies, there are pockets of engineers who are empowered to bring in new technologies and do operate more like tech startups in their part of the world within legacy companies. And with open source, people are able to adopt those more permissionlessly, because they don’t have to go through an official economic procurement process.

So, you can really influence and shape the way things are going across both legacy and newer technologies. I think the other thing you can do with open-source technologies is build a community. And that means people participate in that community, and they feel like they have at least, you know, participation and maybe even partial ownership of that technology.

And that makes them more kind of devoted adherents to it, in that they really think of it as an important career investment that will outlive any particular company. So, I think it happens in a couple of dimensions.

Brian Thomas: Thank you so much. I appreciate that. There’s a lot that we could learn from a community, especially in the open-source community.

And I appreciate your insights there. And Nick, last question of the day: with the growing importance of data and ML engineering in decision making, what trends do you foresee in these areas, and how is Dagster positioned to address future challenges?

Nick Schrock: So, I guess the base-level trend is that as they become more important, they have to be correct. And with importance also comes increased resources to these areas, which means that the practitioners are more in demand and productivity becomes more and more important, as well as cost control.

But, you know, what I still go back to is that data engineering is an evolving discipline that has to be formalized and kind of adopt more and more software engineering principles, and with Dagster, that is right where we’re positioned to address those future challenges.

Other future challenges are addressed by having Dagster as a built-in system that has a bunch of different base-level capabilities like cataloging and lineage, which, you know, I think either makes it unnecessary to adopt additional tools in certain situations, or, if you do need to adopt additional tools, means you start from a higher level of integration, so that you don’t have to reconstruct and integrate as many technologies in heavyweight ways.

And so those are the things that come to mind.

Brian Thomas: Thank you. I appreciate the share. Obviously, there’s a lot we could be thinking about. At the end of the day, we’re trying to make it a better customer experience for everybody. And I know the core of, you know, what you’ve shared in your heartfelt message here around helping others today, especially in the development community, is going to make a big difference in how we move things forward in a better way.

So, I appreciate the share, and Nick, it was such a pleasure having you on today. And I look forward to speaking with you real soon.

Nick Schrock: Thanks Brian. Thanks for having me.

Brian Thomas: Bye for now.

Nick Schrock Podcast Transcript. Listen to the audio on the guest’s podcast page.
