Introducing Viaduct, Airbnb’s data-oriented service mesh.

At Hasura’s Enterprise GraphQL Conf in October 2020, Airbnb presented Viaduct, a data-oriented service mesh that aims to bring a step-function improvement in the modularity of microservices-based Service-Oriented Architectures (SOAs).

Modern applications can consist of thousands to tens of thousands of microservices connected in unconstrained ways.

Airbnb’s dependency graph. Amazon, Netflix, and Uber share similarly tangled dependency graphs.

To help manage the larger number of services inherent in a microservices-based architecture, Airbnb needed organizing principles as well as technical measures to implement those principles. Its investigations led to the concept of a data-oriented service mesh, which can bring a new level of modularity to SOA. Today’s microservice is a collection of procedural endpoints, a classic 1970s-style module.

Central to any scalable SOA application is a service mesh, which routes service invocations to instances of microservices that can handle them. The current industry standard is for service meshes to be organized exclusively around remote procedure invocations, without knowing anything about the data that makes up the application architecture. Airbnb’s vision was to replace these procedure-oriented service meshes with service meshes organized around data.

At Airbnb, they use GraphQL™️ to build a data-oriented service mesh called Viaduct. A Viaduct service mesh is defined in terms of a GraphQL schema consisting of:

  • Types (and interfaces) describing data managed within your service mesh
  • Queries (and subscriptions) providing means to access that data, abstracted from the service entry points that provide the data
  • Mutations providing ways to update data, again abstracted from service entry points

The types (and interfaces) in the schema define a single graph across all of the data managed within the service mesh. For example, at an e-commerce company, a service mesh’s schema may define a field productById(id: ID) that returns results of type Product. From this starting point, a single query allows a data consumer to navigate to information about the product’s manufacturer, e.g., productById { manufacturer }; reviews of the product, e.g., productById { reviews }; and even the authors of those reviews, e.g., productById { reviews { author } }.
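To make the navigation above concrete, here is a minimal Python sketch of a single query walking from a product to its manufacturer, reviews, and review authors. It is not Viaduct's implementation: the sample records, the EDGES table, and the execute function are all hypothetical, and the query is written as a nested dict rather than parsed GraphQL.

```python
# Hypothetical in-memory records standing in for data owned by different services.
PRODUCTS = {"p1": {"id": "p1", "name": "Kettle", "manufacturer_id": "m1", "review_ids": ["r1", "r2"]}}
MANUFACTURERS = {"m1": {"id": "m1", "name": "Acme"}}
REVIEWS = {
    "r1": {"id": "r1", "text": "Boils fast", "author_id": "u1"},
    "r2": {"id": "r2", "text": "Too loud", "author_id": "u2"},
}
USERS = {"u1": {"id": "u1", "name": "Ana"}, "u2": {"id": "u2", "name": "Ben"}}

# (parent type, field) -> (resolver, type of the returned objects).
# Fields not listed here are plain values read straight off the parent record.
EDGES = {
    ("Query", "productById"): (lambda _parent, args: PRODUCTS[args["id"]], "Product"),
    ("Product", "manufacturer"): (lambda p, _args: MANUFACTURERS[p["manufacturer_id"]], "Manufacturer"),
    ("Product", "reviews"): (lambda p, _args: [REVIEWS[i] for i in p["review_ids"]], "Review"),
    ("Review", "author"): (lambda r, _args: USERS[r["author_id"]], "User"),
}


def execute(type_name, parent, selection, args=None):
    """Resolve a nested selection ({field: sub-selection or None}) against the graph."""
    result = {}
    for field, sub in selection.items():
        edge = EDGES.get((type_name, field))
        if edge is None:
            result[field] = parent[field]  # scalar leaf
            continue
        resolver, child_type = edge
        value = resolver(parent, args or {})
        if isinstance(value, list):
            result[field] = [execute(child_type, item, sub) for item in value]
        else:
            result[field] = execute(child_type, value, sub)
    return result


# Roughly: productById(id: "p1") { name manufacturer { name } reviews { author { name } } }
query = {
    "productById": {
        "name": None,
        "manufacturer": {"name": None},
        "reviews": {"author": {"name": None}},
    }
}
response = execute("Query", None, query, args={"id": "p1"})
```

The consumer only describes the shape of the data it wants; which table or backend holds each record stays hidden behind the graph.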

The data elements requested by such a query may come from many different microservices. In a procedure-oriented service mesh, the data consumer would need to take these services as explicit dependencies. In a data-oriented service mesh, it is the service mesh, not the data consumer, that knows which services provide which data element. Viaduct abstracts away the service dependencies from any single consumer.
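The ownership idea in this paragraph can be sketched as a lookup table that only the mesh holds. The service names and the FIELD_OWNERS table below are invented for illustration and are not taken from Viaduct; the point is simply that the set of backing services is computed from the query, not declared by the consumer.

```python
# Hypothetical ownership table kept by the mesh:
# (parent type, field) -> (owning service, type of the returned objects).
FIELD_OWNERS = {
    ("Query", "productById"): ("catalog-service", "Product"),
    ("Product", "manufacturer"): ("supplier-service", "Manufacturer"),
    ("Product", "reviews"): ("review-service", "Review"),
    ("Review", "author"): ("user-service", "User"),
}


def services_for(type_name, selection):
    """Return the set of services the mesh would contact for a nested selection."""
    needed = set()
    for field, sub in selection.items():
        owner = FIELD_OWNERS.get((type_name, field))
        if owner is None:
            continue  # scalar field, delivered together with its parent object
        service, child_type = owner
        needed.add(service)
        if sub:
            needed |= services_for(child_type, sub)
    return needed


# The consumer asks only for data; it never names a service.
selection = {"productById": {"name": None, "reviews": {"author": {"name": None}}}}
touched = services_for("Query", selection)
```

In a procedure-oriented mesh, each of the services in touched would be an explicit dependency of the consumer; here the consumer depends only on the schema.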

Viaduct treats the schema as a single artefact and implements several primitives that keep the schema unified while still letting many teams collaborate on it productively. As Viaduct replaces more and more of Airbnb’s underlying procedure-oriented service mesh, its schema captures the data managed by the application more and more completely. Airbnb has taken advantage of this “central schema,” as they call it, as a place to define the APIs of some microservices.

Among other things, Airbnb expects the central schema to solve one of the bigger challenges of large-scale SOA applications: data agility. In today’s SOA applications, a change to a database schema often needs to be manually reflected in the APIs of two, three, and sometimes even more layers of microservices before it can be exposed to client code. Such changes can require weeks of coordination among multiple teams. By deriving service APIs and database schemas from a single, central schema, a database schema change like this can be propagated to client code with a single update.
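As a rough illustration of deriving several artefacts from one definition, the Python sketch below generates both a GraphQL type and a SQL table from the same field list. The FieldDef class, the type mapping, and both generator functions are assumptions made for this example; the article does not describe how Viaduct performs this derivation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDef:
    name: str
    gql_type: str  # one of the built-in GraphQL scalars mapped below


# Assumed scalar-to-column mapping; a real system would need many more cases.
SQL_TYPES = {"ID": "TEXT", "String": "TEXT", "Int": "INTEGER", "Float": "REAL", "Boolean": "BOOLEAN"}


def to_graphql_sdl(type_name, fields):
    """Render the client-facing GraphQL type."""
    lines = [f"  {f.name}: {f.gql_type}" for f in fields]
    return "type " + type_name + " {\n" + "\n".join(lines) + "\n}"


def to_sql_ddl(type_name, fields):
    """Render the storage-side table definition from the same field list."""
    cols = [f"  {f.name} {SQL_TYPES[f.gql_type]}" for f in fields]
    return "CREATE TABLE " + type_name.lower() + " (\n" + ",\n".join(cols) + "\n);"


product_v1 = [FieldDef("id", "ID"), FieldDef("name", "String")]
# A single update to the central definition...
product_v2 = product_v1 + [FieldDef("weightGrams", "Int")]

# ...shows up in both derived artefacts without hand-editing either one.
sdl = to_graphql_sdl("Product", product_v2)
ddl = to_sql_ddl("Product", product_v2)
```

Adding weightGrams once changes both outputs, which is the kind of single-update propagation the paragraph describes.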

Viaduct has a mechanism for computing what Airbnb calls “derived fields” using serverless cloud functions that operate on top of the graph without knowledge of the underlying services. These functions allow moving transformational logic out of the service mesh and into stateless containers, keeping the graph clean and reducing the number and complexity of services needed.
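A derived field can be pictured as a pure function that declares which graph fields it needs and computes a new value from them. The sketch below is an assumption-laden illustration in Python: the derived_field decorator, resolve_derived, and the averageRating example are hypothetical and do not reflect Viaduct's actual API or its serverless runtime.

```python
def derived_field(requires):
    """Attach the selection a derived field needs from the graph (hypothetical decorator)."""
    def wrap(fn):
        fn.requires = requires
        return fn
    return wrap


@derived_field(requires={"reviews": {"rating": None}})
def average_rating(product):
    """Stateless transformation: sees only graph data, never a backing service."""
    ratings = [review["rating"] for review in product["reviews"]]
    return sum(ratings) / len(ratings) if ratings else None


def resolve_derived(fn, fetch_from_graph, product_id):
    """What a mesh might do: fetch the declared inputs, then call the function."""
    data = fetch_from_graph(product_id, fn.requires)
    return fn(data)


def fake_fetch(product_id, selection):
    # Stand-in for a real graph query; returns canned data for the example.
    return {"reviews": [{"rating": 4}, {"rating": 5}]}


value = resolve_derived(average_rating, fake_fetch, "p1")
```

Because average_rating holds no state and names no service, it could run in any stateless container, and the services behind reviews could change without touching it.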

Viaduct is built on graphql-java and supports fine-grained field selection via GraphQL selection sets. It uses modern data-loading techniques, employs reliability techniques such as short-circuiting and soft dependencies, and implements an intra-request cache. Viaduct provides data observability, making it possible to understand, down to the field level, which services consume which data. Because it exposes a GraphQL interface, Viaduct can take advantage of a large ecosystem of open-source tooling, including live IDEs, mock servers, and schema visualizers.
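Of the techniques listed, the intra-request cache is the easiest to sketch. The Python below shows the general idea only, assuming a cache that lives for one request and is keyed by type and ID; RequestContext and fetch_user are hypothetical names, and Viaduct's real cache (built on graphql-java) is not described in the article.

```python
class RequestContext:
    """Holds a cache that lives only as long as one incoming request."""

    def __init__(self, backend_fetch):
        self._backend_fetch = backend_fetch
        self._cache = {}

    def load(self, kind, key):
        cache_key = (kind, key)
        if cache_key not in self._cache:
            # First time this request needs the object: go to the backend.
            self._cache[cache_key] = self._backend_fetch(kind, key)
        return self._cache[cache_key]


calls = []  # records every backend round trip, for demonstration


def fetch_user(kind, key):
    calls.append((kind, key))
    return {"id": key, "name": "user-" + key}


# Two reviews by the same author appear in one query: u1 is fetched only once.
ctx = RequestContext(fetch_user)
authors = [ctx.load("User", uid) for uid in ["u1", "u2", "u1"]]
```

A fresh RequestContext per request means nothing is shared between requests, so the cache cannot serve stale data to a later caller.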

This article is adapted from the Airbnb tech blog. Read the full story here: https://medium.com/airbnb-engineering/taming-service-oriented-architecture-using-a-data-oriented-service-mesh-da771a841344

More info on service mesh: https://www.opencloudification.com/service-mesh/