Where Team Topologies Goes Wrong

Introduction

Gartner places platform engineering almost at the peak of inflated expectations in its Hype Cycle for Software Engineering 2023. Senior architects and CTOs from companies big and small are touring conferences and meetups to evangelize how awesome their new internal development platform is. The new mantra is, 'DevOps is dead; long live platform engineering.' What caused this trend, and why is it so successful, at least in its catchiness? Most talks and introductions about platform engineering base all of their arguments on the book Team Topologies by Matthew Skelton and Manuel Pais, published in 2019. According to Book Authority[1], this book is ranked among the 'Best Product Management Books of All Time' and is widely used by organizations worldwide to transform the way they work. The authors' premise is that organizations that understand and implement these team topologies effectively can foster collaboration, reduce complexity, and accelerate innovation in their software delivery processes.

The core message of the book is that development teams in IT departments should follow a specific fundamental configuration: stream-aligned teams, enabling teams, platform teams, and complicated subsystem teams. The authors also strongly emphasize the importance of platforms, such as SDLC platforms or internal development platforms, for which dedicated platform teams are necessary.

Psychology as a Basis for Team Structure

The authors base their conclusions and recommendations largely on three theories: Conway’s Law[2], Dunbar’s Number[3] and Cognitive Load Theory[4]. Conway’s Law is used to emphasize the importance of communication pathways in organizations. In short, it states that a company's software systems architecture will mirror the communication structure of that company. Based on this, the authors argue that altering a company’s communication structure can lead to the development of the desired software architecture. This technique is called the "reverse Conway's maneuver." The idea predates the book, and the authors cite the earlier sources.

What I find troubling is that Skelton and Pais take a rather rigid approach, going so far as to categorize communication as either 'good' or 'bad.' They even suggest preventing 'bad' communication by physically separating people into different parts of the building. Leaving aside whether such measures can work at all now that digital collaboration platforms are an integral part of the modern work environment, this suggestion is patronizing and harmful to employees' morale and team cohesion. People will always follow the path of least resistance to get their work done. If management doesn’t like what their employees are doing, they should provide a more effective alternative rather than obstruct the method that is already in common use.

Dissecting Dunbar's Number

Their next point is that software development teams today have much more to consider and learn than they did a decade or two ago. Teams are expected not only to build the software but also to build and maintain the tools to deliver and deploy the software, as well as the infrastructure to run the software on. This challenge is central to the DevOps movement: In the past, with separate teams responsible for different stages of the product lifecycle — one team writing the code, another team deploying it, and yet another team running and maintaining the servers — problems were often not discovered until late in the production stage. Ownership of problems was frequently passed around, leading to blame-shifting rather than issue resolution. Software developers were restricted to their black box and did not receive feedback about the application. The operations team had no idea what they were deploying; they were simply following step-by-step instructions. DevOps emerged with the argument that teams should instead take on shared responsibility for their product. Developers should care about and participate in the delivery process and software deployment, while Ops should also take ownership of the source code. With cloud services offering programmatic interfaces to manage hardware, developers and Ops now have a common ground to collaborate effectively. In this approach, you don’t have a pure software development team merely throwing deliverables over the fence to the Ops team for deployment; instead, you have individuals skilled in the entire software development lifecycle, responsible for the software from cradle to grave. The deployment environment, feedback from end-users, and insights from monitoring are all crucial considerations when writing the software code.

The authors argue that with this increased responsibility, teams cannot handle everything alone. They attribute this to what they refer to as cognitive load, which I will discuss later. I would argue that teams are certainly capable of learning how to manage these tasks; they simply lack the support and time to do so. And what is the sensible thing to do when you discover that you’ve given teams more work than they can handle? You either reduce the amount of work, extend the time, or expand the team. The first two options are typically unappealing to managers, because they want to keep the promises they made to stakeholders or customers, or to avoid admitting they were wrong. After all, who assigned the teams the work in the first place or pressured them with tight deadlines? As for the third option, the authors contend that expanding the team is not feasible due to Dunbar's number.

Skelton and Pais emphasize the importance of this concept for the rest of the book's argument (“Team-first software architecture is driven by Dunbar’s number”). Dunbar's number is supposed to describe how many people one can maintain stable social relationships with. In Team Topologies, this theory, combined with the idea that team members should form close social bonds and trust each other, results in recommendations for team sizes and organizational structures. Robin Dunbar, an anthropologist, studied the social groups of non-human primates, correlated their group sizes with a measure of brain size (the relative size of the neocortex), and extrapolated that relationship to arrive at a corresponding number for humans. However, there is no scientific or empirical evidence that this number applies to humans. In fact, attempts to replicate Dunbar’s study have yielded vastly different results. The current scientific consensus is that a cognitive limit on human group sizes cannot be derived in this manner. It also makes no sense to compare other primates and humans this way, because while they may be our closest evolutionary relatives, our brains function fundamentally differently in handling information. This may partly explain why apes continue to live in trees while humans build cities and fly planes. Thus, it makes no sense to impose constraints on team sizes in software development departments based on the ratio of social group sizes of primates to their brain size.
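To make the extrapolation concrete, here is a minimal sketch of the kind of calculation behind the number. The regression coefficients and the human neocortex ratio are assumptions: they are the values commonly cited from Dunbar's 1992 paper and do not appear in Team Topologies or in this post. The function name is made up for illustration.

```python
import math

# Assumed values, as commonly cited from Dunbar's 1992 paper (not from this post):
# a log-log regression of mean group size on neocortex ratio in non-human primates.
INTERCEPT = 0.093
SLOPE = 3.389
HUMAN_NEOCORTEX_RATIO = 4.1  # assumed ratio of neocortex volume to the rest of the brain


def predicted_group_size(neocortex_ratio: float) -> float:
    """Extrapolate group size from log10(N) = INTERCEPT + SLOPE * log10(ratio)."""
    if neocortex_ratio <= 0:
        raise ValueError("neocortex ratio must be positive")
    return 10 ** (INTERCEPT + SLOPE * math.log10(neocortex_ratio))


if __name__ == "__main__":
    # Plugging the human ratio into a line fitted to other primates
    # yields the familiar figure of roughly 150.
    print(round(predicted_group_size(HUMAN_NEOCORTEX_RATIO)))  # prints 148
```

Note what the sketch does and does not show: the result is a single point read off a line fitted to other species, and because the slope is well above 3, small changes in the assumed ratio move the result considerably. Nothing in the calculation involves data about human groups, let alone software teams.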

Unraveling Cognitive Load

But it gets worse. The central argument in Team Topologies against diverse teams is that doing all the tasks now possible with DevOps would impose a massive “cognitive load” on the team. But what exactly is cognitive load? Skelton and Pais cite John Sweller, an Australian educational psychologist who researched instructional design. They provide a definition in the book which they also published in a blog post[5]. They quote Sweller, defining cognitive load as 'the total amount of mental effort being used in the working memory.' The authors then outline the three different types of cognitive load according to Sweller:

Intrinsic cognitive load—relates to aspects of the task fundamental to the problem space (e.g., “What is the structure of a Java class?” “How do I create a new method?”)
Extraneous cognitive load—relates to the environment in which the task is being done (e.g., “How do I deploy this component again?” “How do I configure this service?”)
Germane cognitive load—relates to aspects of the task that need special attention for learning or high performance (e.g., “How should this service interact with the ABC service?”)

The authors do not elaborate on this definition in the book or provide any differentiation; instead, they use the term “cognitive load” as if it were self-explanatory throughout the rest of the book. According to this definition, teams responsible not only for writing application code but also for building the tools to deliver the software, testing, deploying, and maintaining the (cloud) infrastructure are seen as so overburdened that they become slow, make errors, and fail to follow best practices. And since it is supposedly impossible to give the teams more time to tackle all these tasks or to provide them with additional resources, the authors argue that the only solution is to 'reduce' this cognitive load by offloading these 'responsibilities,' so that teams can 'focus' on what they supposedly 'do best': writing code. Enter platform engineering.

Sounds like manager talk? Well, it is. But before we jump to conclusions, let’s first look into what Sweller originally discovered. Sweller conducted research on effective learning methods[6]. He initially concluded that the difference between a novice and an expert is that the expert has formed so-called schemata, which they can quickly apply to a given problem. A novice, on the other hand, has not yet developed these schemata and therefore needs to try multiple approaches to solve a problem. Sweller then defines cognitive load as the sum of the mental effort required to solve the problem. Sweller’s research focused exclusively on mathematical and geometrical problems, examining the cognitive load involved in common problem-solving strategies, like means-ends analysis. Now, his definition of cognitive load goes like this:

Intrinsic cognitive load is the inherent level of difficulty associated with a specific task. For example, learning vocabulary has a low cognitive load, while solving a mathematical equation has a high load.
Extraneous cognitive load is generated by understanding the given problem and is influenced by the manner in which the problem is presented by the instruction materials. Here Sweller identifies the most potential for teachers and coaches to influence how well their students might learn something.
Germane cognitive load is the mental load required to form the aforementioned schemata.

It is worth noting that the exact quote Skelton and Pais use in their definition (“the total amount of mental effort…”) does not actually appear in Sweller's work. Sweller doesn’t provide a concise definition of cognitive load but rather describes its different components. So, at the very least, the authors are misquoting Sweller here. Additionally, when comparing Sweller’s concept of cognitive load with the definition provided in Team Topologies, it becomes apparent that they differ significantly. Furthermore, the context in which Sweller discusses cognitive load is confined to the process of learning or forming the schemata he mentions. It has absolutely nothing to do with software development. While Sweller actually conducted empirical studies (with interesting results — I recommend reading the sources), the concept of cognitive load in software development lacks any scientific evidence. This becomes clear when noting that the 'working memory' mentioned by Skelton and Pais is limited because it refers to the amount of information you can hold in your mind at any given time. According to Sweller, you can expand your working memory (or reduce cognitive load) simply by writing things down instead of trying to remember them. In contrast, long-term memory is commonly defined as unlimited in cognitive psychology and neuroscience. As a software engineer, when was the last time you had to remember all the details of your build pipeline in your head? Why do you need to keep all the details about the software development lifecycle in a DevOps-minded team in your head at the same time? How does integrating a static code analysis tool into your build chain increase “cognitive load” for the team? Ironically, the same people trying to convince you of this are often the ones who wouldn’t make room in a sprint for refactoring a codebase riddled with technical debt. The “cognitive load” a developer faces when integrating a new feature into such a mess is apparently not relevant.

To conclude this section: there is no scientific evidence that the form of cognitive load described by Skelton and Pais in their book, and used as the foundation for their argument, actually exists.

So What?

Now that we’ve established that neither Dunbar’s number nor the concept of cognitive load as described in Team Topologies has any scientific basis, all implications in the book that are founded on these theories are rendered moot. There is no scientific evidence to suggest that teams should have a specific upper limit on group size based on Dunbar’s number. Similarly, there is no valid reason to introduce measures to reduce cognitive load on teams as outlined in the book.

Note that this doesn’t imply that their statements about team structure or platforms are entirely false; it only means that the reasoning behind their conclusions in the book is flawed. That said, I do have my own doubts about the effectiveness of internal platforms and platform engineering as they are often presented. The key justification — reducing cognitive load — doesn’t hold up under scrutiny, and I believe the potential disadvantages of centralized approaches deserve more attention. I’ll explore these criticisms in detail in a future blog post.

[1] https://bookauthority.org/

[2] https://en.wikipedia.org/wiki/Conway%27s_law

[3] https://en.wikipedia.org/wiki/Dunbar's_number

[4] https://en.wikipedia.org/wiki/Cognitive_load

[5] https://itrevolution.com/articles/cognitive-load/