Distributed Tracing

Tracing can be an invaluable tool in debugging distributed systems performance, especially for identifying bottlenecks and understanding the latency cost of each component in your system. Linkerd can be configured to emit trace spans from the proxies, allowing you to see exactly how much time requests and responses spend inside them.

Unlike most of the features of Linkerd, distributed tracing requires both code changes and configuration. (See "Distributed tracing in the service mesh: four myths" for why this is.)

Furthermore, Linkerd provides many of the features that are often associated with distributed tracing, without requiring configuration or application changes, including:

  • Live service topology and dependency graphs
  • Aggregated service health, latencies, and request volumes
  • Aggregated path / route health, latencies, and request volumes

For example, Linkerd can display a live topology of all incoming and outgoing dependencies for a service, without requiring distributed tracing or any other such application modification:

The Linkerd dashboard showing an automatically generated topology graph

Likewise, Linkerd can provide golden metrics per service and per route, again without requiring distributed tracing or any other such application modification:

Linkerd dashboard showing an automatically generated route metrics

Using distributed tracing

That said, distributed tracing certainly has its uses, and Linkerd makes this as easy as it can. Linkerd's role in distributed tracing is actually quite simple: when a Linkerd data plane proxy sees a tracing header in a proxied HTTP request, Linkerd will emit a trace span for that request. This span will include information about the exact amount of time spent in the Linkerd proxy. When paired with software to collect, store, and analyze this information, this can provide significant insight into the behavior of the mesh.

To use this feature, you'll also need to introduce several additional components in your system, including an ingress layer that kicks off the trace on particular requests, a client library for your application (or a mechanism to propagate trace headers), a trace collector to collect span data and turn it into traces, and a trace backend to store the trace data and allow the user to view and query it.
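As a rough illustration of the "mechanism to propagate trace headers" mentioned above, the following Python sketch copies trace headers from an incoming request onto an outgoing one, so that the proxy sees the same trace context on both. It assumes the B3 multi-header format used by Zipkin; the header list and the `propagate_trace_headers` helper are hypothetical names chosen for this example, not part of Linkerd, and you should confirm which propagation format your Linkerd version and tracing setup expect. A real client library would typically also create its own spans rather than only forwarding headers.

```python
# Hypothetical helper: copy trace-context headers from an incoming request
# onto an outgoing one so both requests end up in the same trace.

# Assumption: the B3 multi-header names (Zipkin's propagation format).
# Check which format your Linkerd version and tracing backend expect.
B3_HEADERS = (
    "x-b3-traceid",
    "x-b3-spanid",
    "x-b3-parentspanid",
    "x-b3-sampled",
    "x-b3-flags",
)


def propagate_trace_headers(incoming, outgoing=None):
    """Return a copy of `outgoing` with any B3 headers from `incoming` added."""
    result = dict(outgoing or {})
    # HTTP header names are case-insensitive, so normalize before matching.
    lowered = {name.lower(): value for name, value in incoming.items()}
    for name in B3_HEADERS:
        if name in lowered:
            result[name] = lowered[name]
    return result


if __name__ == "__main__":
    incoming = {
        "X-B3-TraceId": "463ac35c9f6413ad48485a3953bb6124",
        "X-B3-SpanId": "a2fb4a1d1a96d312",
        "X-B3-Sampled": "1",
        "Accept": "application/json",
    }
    # Only the trace headers are carried over; "Accept" is left behind.
    print(propagate_trace_headers(incoming, {"content-type": "application/json"}))
```

In an application, a helper like this would be called wherever a handler makes an outbound HTTP call, passing the inbound request's headers as `incoming`.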

For details, please see our guide to adding distributed tracing to your application with Linkerd.