The totally reasonable effectiveness of execution-driven science

This post was originally published on Pasteur Labs Insights.


At Pasteur Labs, we work every day to uplift simulation workflows to the machine age. Concretely, this means we turn bleeding-edge research and technology (Simulation Intelligence) into real-world capabilities:

This particular mix (where science meets engineering, technology enthusiasm meets sobering reality, and pie-in-the-sky meets down-to-earth thinking) provides many opportunities but also unique challenges. Scientists and engineers at Pasteur Labs constantly face compromise: How do we answer fundamental questions in a way that leads to measurable impact, without either getting lost in details that don’t matter or providing superficial solutions that don’t hold up in the real world? We believe part of the answer lies in two paradigms that pervade everything we do: use-inspired science and execution-driven science.

Use-inspired science — the what. (covered elsewhere)

Use-inspired science encodes the fact that neither application-ignorant basic science nor singular-purpose applied science is sufficient to solve classes of fundamental, real-world problems. This is ingrained in Pasteur Labs at the deepest level (including the name of our company).

Execution-driven science — the how. (covered here!)

Execution-driven science is the art of navigating complexity by scoping, iterating on, surgically modifying, and stacking existing solutions, and failing fast when encountering dead ends. In particular, execution-driven science allows us to build end-to-end systems of non-trivial complexity — emphasizing execution makes the immense challenges and combinatorial search spaces we are faced with tractable. It helps us guide our day-to-day work towards what is possible, feasible, and worthwhile.

The how of research is typically much less clearly defined than the what, although it is at least as important. Execution-driven science is a particularly powerful way to define this process when faced with the need to build systems that withstand the complexities of the real world. And in the moment, it’s a valuable reminder to the individual researcher facing an overwhelming or constantly moving research problem.

Simulating the world

I clearly remember the moment I fell in love with physics. It was the moment I realized it made reality computable.

Suddenly, I looked at the world with fresh eyes, computing the time it would take for objects to fall or trains to stop or elevators to arrive. How could a simple mathematical equation like \(t=\sqrt{2 s / a}\) generate predictions that we can observe in the real world?
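For the curious: the formula follows from the constant-acceleration relation \(s = \tfrac{1}{2} a t^2\) for an object starting at rest, solved for \(t\). Here is a minimal sketch of that everyday calculation. The function name and the example numbers (a 20 m drop, g = 9.81 m/s², no air resistance) are illustrative assumptions, not taken from any real measurement.

```python
import math


def time_to_cover(distance_m: float, acceleration_ms2: float) -> float:
    """Time to cover a distance from rest under constant acceleration.

    Rearranges s = a * t**2 / 2 into t = sqrt(2 * s / a).
    """
    if distance_m < 0 or acceleration_ms2 <= 0:
        raise ValueError("distance must be non-negative and acceleration positive")
    return math.sqrt(2.0 * distance_m / acceleration_ms2)


# Illustrative values (assumptions): a 20 m drop with g = 9.81 m/s^2,
# ignoring air resistance.
fall_time = time_to_cover(20.0, 9.81)
print(f"An object dropped from 20 m lands after about {fall_time:.2f} s")
```

With these numbers the answer comes out at roughly two seconds, which is easy to check against a stopwatch.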

I later realized that it wasn’t so easy, and that the simplified physical laws we learn in school don’t always hold up to the complexity of the real world. A second revelation came when I saw that we could still model complex physical systems by having computers find approximate solutions. Later still, I decided to write a simulator to forecast ocean dynamics (see Fig. 1).

Ocean simulation
Fig. 1: A high-resolution ocean simulation, executed on a single high-end workstation about the size of an AC unit. From https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2021MS002717.

The ocean is home to a range of complex physical phenomena, like mesoscale turbulence (the vortices or “eddies” most visible around the equator), boundary currents (like the Gulf Stream), an entire zoo of internal and surface waves, and an all-encompassing overturning circulation. Yet it is governed by only 8 so-called primitive equations.

After deriving them from first principles, we can easily write down the mathematical rules that generate the entire wealth of observed ocean dynamics. To actually simulate them, we need to conjure an unholy stack of tricks, approximations, numerical hacks, heuristics, and empirical laws, adding up to well above ten thousand lines of code.

As a grad student, seeing an ocean simulation unfold on a supercomputer in real time was sublime. For the longest time, I believed that science was inherently about simplicity: finding the most concise explanation for a phenomenon, or an elegant theory that predicts observations. Today I know there are many ways to do science. One of them is building systems — like climate models, fusion reactors, and space telescopes — via execution-driven science, bridging the gap between research lab and real world. Execution-driven science still emphasizes simplicity (or rather, parsimony), but introduces compounding as an additional guiding principle.

Science that’s more than the sum of its parts

Execution-driven science is about standing on the shoulders of giants and making sure they’re all standing up straight. It’s about taking the pieces others have built, polishing them, and stacking them in a way that pushes the boundaries of what’s possible. And then it’s about adding some new pieces where they matter the most, and shattering records.

NeuralGCM architecture
Fig. 2: A prime example of execution-driven science: NeuralGCM, an AI-enabled system for climate modelling that is on par with some of the best traditional climate models and orders of magnitude faster. Each of the depicted boxes is unremarkable in isolation, but in combination they create something revolutionary. From https://arxiv.org/pdf/2311.07222.

NeuralGCM is an AI-driven system for atmospheric simulations (the atmosphere is typically the most computationally expensive component of a climate model). Taken at face value, each of its parts seems unremarkable (Fig. 2): a simple encoder-decoder architecture; a differentiable dynamical core (that is, a hand-written traditional simulator) in Python; a feed-forward neural network to machine-learn a data-driven correction; and an off-the-rack solver for ordinary differential equations (ODEs). But acting together, these parts form the first system to demonstrate that AI can aid simulations on climatic time scales, which consist of tens of thousands to millions of iterations, at an efficiency that has never been seen before:
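To make the "physics plus learned correction plus ODE solver" pattern concrete, here is a toy sketch. It is emphatically not NeuralGCM's code: the state is a single number, the "dynamical core" is a one-line decay law, the "neural network" is a tiny tanh layer with fixed, untrained weights, the encoder and decoder are omitted, and every name below is hypothetical. It only shows how such pieces can be stacked.

```python
import math
from typing import Callable, List, Sequence

Tendency = Callable[[float], float]  # maps a state x to its time derivative dx/dt


def physics_tendency(x: float) -> float:
    """Stand-in 'dynamical core': a hand-written rule, here dx/dt = -0.5 * x."""
    return -0.5 * x


def make_learned_correction(w_in: Sequence[float], w_out: Sequence[float]) -> Tendency:
    """Stand-in 'neural network': one tanh hidden layer with fixed weights.

    In a real system these weights would be trained on data; here they are
    assumptions chosen by hand.
    """
    def correction(x: float) -> float:
        return sum(b * math.tanh(a * x) for a, b in zip(w_in, w_out))
    return correction


def hybrid_tendency(physics: Tendency, correction: Tendency) -> Tendency:
    """Total tendency = hand-written physics + data-driven correction."""
    return lambda x: physics(x) + correction(x)


def rk4_step(f: Tendency, x: float, dt: float) -> float:
    """One classical fourth-order Runge-Kutta step (a generic ODE solver)."""
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def rollout(f: Tendency, x0: float, dt: float, n_steps: int) -> List[float]:
    """Apply the solver repeatedly; the result has n_steps + 1 entries."""
    trajectory = [x0]
    for _ in range(n_steps):
        trajectory.append(rk4_step(f, trajectory[-1], dt))
    return trajectory


correction = make_learned_correction(w_in=[1.0, 2.0], w_out=[0.05, -0.02])
model = hybrid_tendency(physics_tendency, correction)
trajectory = rollout(model, x0=1.0, dt=0.1, n_steps=20)
print(f"state after {len(trajectory) - 1} steps: {trajectory[-1]:.4f}")
```

The design point is that each function is replaceable on its own: swap the decay law for a real simulator, the fixed weights for a trained network, or the solver for a library routine, and the rollout loop stays the same.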

“For both weather and climate, our approach offers orders of magnitude computational savings over conventional GCMs” (Kochkov et al. 2023).

The point here is not that all it took was to combine some existing methods and reap the benefits. What actually happened is much more subtle, where success came from a combination of different factors:

This is an example of execution-driven science at its best, and demonstrates several techniques on how to wield it to push the envelope. There are many more examples like it throughout science and industry, and they follow similar patterns; for example:

💡 Notice how some of those examples are clearly use-inspired, while others are not? This illustrates how use-inspired science and execution-driven science act along two different dimensions — you can have one without the other, or be neither use-inspired nor execution-driven.

Explore violently, fail fast, be pragmatic

Let’s take a closer look at what it takes to apply execution-driven science in practice to solve hard, real-world problems.

Fig. 3: Illustration of the A* algorithm navigating major cities (Chicago and Rome). A healthy mix of doubling down on promising routes and switching lanes when getting stuck allows the algorithm to navigate a massive search space without falling back on either random exploration or mindless exploitation.
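For readers who have not met A* before, here is a minimal sketch on a small grid rather than a city map. The grid, the unit step cost, and the Manhattan-distance heuristic are assumptions chosen for illustration; the routing shown in Fig. 3 runs on real road networks.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]  # (row, column)


def a_star(grid: List[str], start: Cell, goal: Cell) -> Optional[List[Cell]]:
    """Shortest 4-connected path on a grid where '#' marks a wall.

    Returns the list of cells from start to goal, or None if no path exists.
    """
    def heuristic(cell: Cell) -> int:
        # Manhattan distance never overestimates the remaining steps here.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    # Each entry is (estimated total cost, cost so far, cell).
    frontier = [(heuristic(start), 0, start)]
    came_from: Dict[Cell, Optional[Cell]] = {start: None}
    cost_so_far: Dict[Cell, int] = {start: 0}

    while frontier:
        _, cost, current = heapq.heappop(frontier)
        if current == goal:
            # Walk the parent links back to the start.
            path: List[Cell] = []
            node: Optional[Cell] = current
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1]
        if cost > cost_so_far[current]:
            continue  # stale entry: a cheaper route to this cell was found later
        row, col = current
        for nxt in ((row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)):
            r, c = nxt
            if not (0 <= r < len(grid) and 0 <= c < len(grid[r])):
                continue  # off the map
            if grid[r][c] == "#":
                continue  # dead end: a wall
            new_cost = cost + 1
            if new_cost < cost_so_far.get(nxt, float("inf")):
                cost_so_far[nxt] = new_cost
                came_from[nxt] = current
                # Prioritize by cost so far plus the optimistic estimate to the goal.
                heapq.heappush(frontier, (new_cost + heuristic(nxt), new_cost, nxt))
    return None


grid = [
    "....",
    ".##.",
    "....",
]
path = a_star(grid, start=(0, 0), goal=(2, 3))
print(path)
```

The priority queue is what produces the behavior described in the caption: the search keeps extending whichever route currently looks cheapest overall, and quietly drops back to an alternative as soon as the favorite hits a wall.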

And in the end, no amount of good advice will replace your best judgment, so don’t be afraid to use it.

Science or engineering?

People sometimes ask us: is this really science, or is it engineering? We don’t believe the distinction is a useful one; there’s certainly a lot of both, often in synergistic ways. Regardless, well-conducted execution-driven and use-inspired science exhibits many of the core values of the scientific method:

Execution-driven science in the wild

At Pasteur Labs, the journey towards next-gen simulation continues: we’re still far away from humanity’s dream of simulating everything. Models need to become orders of magnitude faster, cheaper, and more accurate, and better grounded in reality and causality. Today, we believe that data-driven, AI-infused methods are central to computing the parts of reality that elude classical approaches, and to tackling problems that do not have a concise mathematical representation. Execution-driven science has proven to be a powerful tool in this quest, and we’re excited to see where it will take us next.

And while the musings above are my earned insights, they are in great part realized because of the shared pursuits, rigor, continuous support, and lived examples of top-notch use-inspired and execution-driven research from my teammates at Pasteur Labs.

Eager to help? 👉 pasteurlabs.ai/careers