Instrumentation
Instrumentation is the act of adding observability code to an app yourself.
If you’re instrumenting an app, you need to use the OpenTelemetry SDK for your language. You’ll then use the SDK to initialize OpenTelemetry and the API to instrument your code. This will emit telemetry from your app, and any library you installed that also comes with instrumentation.
If you’re instrumenting a library, only install the OpenTelemetry API package for your language. Your library will not emit telemetry on its own. It will only emit telemetry when it is part of an app that uses the OpenTelemetry SDK. For more on instrumenting libraries, see Libraries.
For more information about the OpenTelemetry API and SDK, see the specification.
Setup
Traces
Getting a Tracer
To create spans, you’ll need to acquire or initialize a tracer first.
Ensure you have the right packages installed:
go get go.opentelemetry.io/otel \
go.opentelemetry.io/otel/trace \
go.opentelemetry.io/otel/sdk
Then initialize an exporter, resources, tracer provider, and finally a tracer.
package app
import (
"context"
"log"
"go.opentelemetry.io/otel"
"go.opentelemetry.io/otel/exporters/otlp/otlptrace"
"go.opentelemetry.io/otel/sdk/resource"
sdktrace "go.opentelemetry.io/otel/sdk/trace"
semconv "go.opentelemetry.io/otel/semconv/v1.21.0"
"go.opentelemetry.io/otel/trace"
)
var tracer trace.Tracer
func newExporter(ctx context.Context) /* (someExporter.Exporter, error) */ {
// Your preferred exporter: console, jaeger, zipkin, OTLP, etc.
}
func newTraceProvider(exp sdktrace.SpanExporter) *sdktrace.TracerProvider {
// Ensure default SDK resources and the required service name are set.
r, err := resource.Merge(
resource.Default(),
resource.NewWithAttributes(
semconv.SchemaURL,
semconv.ServiceName("ExampleService"),
),
)
if err != nil {
panic(err)
}
return sdktrace.NewTracerProvider(
sdktrace.WithBatcher(exp),
sdktrace.WithResource(r),
)
}
func main() {
ctx := context.Background()
exp, err := newExporter(ctx)
if err != nil {
log.Fatalf("failed to initialize exporter: %v", err)
}
// Create a new tracer provider with a batch span processor and the given exporter.
tp := newTraceProvider(exp)
// Handle shutdown properly so nothing leaks.
defer func() { _ = tp.Shutdown(ctx) }()
otel.SetTracerProvider(tp)
// Finally, set the tracer that can be used for this package.
tracer = tp.Tracer("ExampleService")
}
You can now access tracer to manually instrument your code.
Creating Spans
Spans are created by tracers. If you don’t have one initialized, you’ll need to do that.
To create a span with a tracer, you’ll also need a handle on a context.Context
instance. These will typically come from things like a request object and may
already contain a parent span from an instrumentation library.
func httpHandler(w http.ResponseWriter, r *http.Request) {
ctx, span := tracer.Start(r.Context(), "hello-span")
defer span.End()
// do some work to track with hello-span
}
In Go, the context package is used to store the active span. When you start a
span, you’ll get a handle on not only the span that’s created, but the modified
context that contains it.
Once a span has completed, it is immutable and can no longer be modified.
Get the current span
To get the current span, you’ll need to pull it out of a context.Context you
have a handle on:
// This context needs to contain the active span you plan to extract.
ctx := context.TODO()
span := trace.SpanFromContext(ctx)
// Do something with the current span, optionally calling `span.End()` if you want it to end
This can be helpful if you’d like to add information to the current span at a point in time.
Create nested spans
You can create a nested span to track work in a nested operation.
If the current context.Context you have a handle on already contains a span
inside of it, creating a new span makes it a nested span. For example:
func parentFunction(ctx context.Context) {
ctx, parentSpan := tracer.Start(ctx, "parent")
defer parentSpan.End()
// call the child function and start a nested span in there
childFunction(ctx)
// do more work - when this function ends, parentSpan will complete.
}
func childFunction(ctx context.Context) {
// Create a span to track `childFunction()` - this is a nested span whose parent is `parentSpan`
ctx, childSpan := tracer.Start(ctx, "child")
defer childSpan.End()
// do work here, when this function returns, childSpan will complete.
}
Once a span has completed, it is immutable and can no longer be modified.
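To make the relationship between contexts and nested spans concrete, here is a minimal sketch that uses only the standard library. It is not the OpenTelemetry implementation: the span type, the spanKey type, and the start, lineage, and childLineage functions are hypothetical names invented for illustration. It shows the general idea that starting a span returns a derived context carrying the new span, and that a span started from that context can find its parent there.

```go
package main

import (
	"context"
	"fmt"
)

// span is a hypothetical stand-in for a real span: a name plus a parent link.
type span struct {
	name   string
	parent *span
}

// spanKey is an unexported key type, so no other package can collide with it.
type spanKey struct{}

// start creates a span whose parent is whatever span ctx already carries,
// and returns a derived context that carries the new span instead.
func start(ctx context.Context, name string) (context.Context, *span) {
	parent, _ := ctx.Value(spanKey{}).(*span) // nil when ctx carries no span
	s := &span{name: name, parent: parent}
	return context.WithValue(ctx, spanKey{}, s), s
}

// lineage walks the parent links and returns a path such as "parent/child".
func lineage(s *span) string {
	if s.parent == nil {
		return s.name
	}
	return lineage(s.parent) + "/" + s.name
}

// childLineage starts a parent span, then a child span from the derived
// context, and reports the child's lineage.
func childLineage() string {
	ctx, _ := start(context.Background(), "parent")
	_, child := start(ctx, "child")
	return lineage(child)
}

func main() {
	fmt.Println(childLineage()) // prints "parent/child"
}
```

Passing the derived context down the call chain is what links the spans in this sketch. If the second call had been given context.Background() instead, the child would have had no parent.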
Span Attributes
Attributes are keys and values that are applied as metadata to your spans and are useful for aggregating, filtering, and grouping traces. Attributes can be added at span creation, or at any other time during the lifecycle of a span before it has completed.
// setting attributes at creation...
ctx, span = tracer.Start(ctx, "attributesAtCreation", trace.WithAttributes(attribute.String("hello", "world")))
// ... and after creation
span.SetAttributes(attribute.Bool("isTrue", true), attribute.String("stringAttr", "hi!"))
Attribute keys can be precomputed, as well:
var myKey = attribute.Key("myCoolAttribute")
span.SetAttributes(myKey.String("a value"))
Semantic Attributes
Semantic Attributes are attributes that are defined by the OpenTelemetry
Specification in order to provide a shared set of attribute keys across
multiple languages, frameworks, and runtimes for common concepts like HTTP
methods, status codes, user agents, and more. These attributes are available in
the go.opentelemetry.io/otel/semconv/v1.21.0
package.
For details, see Trace semantic conventions.
Events
An event is a human-readable message on a span that represents “something happening” during its lifetime. For example, imagine a function that requires exclusive access to a resource that is under a mutex. An event could be created at two points: once when we try to gain access to the resource, and again when we acquire the mutex.
span.AddEvent("Acquiring lock")
mutex.Lock()
span.AddEvent("Got lock, doing work...")
// do stuff
span.AddEvent("Unlocking")
mutex.Unlock()
A useful characteristic of events is that their timestamps are displayed as offsets from the beginning of the span, allowing you to easily see how much time elapsed between them.
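As a rough illustration of how such offsets can be derived, the following standard-library sketch subtracts the span start time from each event timestamp. The event type and the offsets and sampleOffsetsMillis functions are hypothetical names, and the timestamps are fixed values chosen so the result is deterministic. How a real backend displays events is up to that backend.

```go
package main

import (
	"fmt"
	"time"
)

// event is a hypothetical stand-in for a span event: a name and a timestamp.
type event struct {
	name string
	at   time.Time
}

// offsets returns each event's elapsed time since the span started.
func offsets(spanStart time.Time, events []event) []time.Duration {
	out := make([]time.Duration, 0, len(events))
	for _, e := range events {
		out = append(out, e.at.Sub(spanStart))
	}
	return out
}

// sampleOffsetsMillis builds a span with fixed timestamps and returns the
// event offsets in milliseconds.
func sampleOffsetsMillis() []int64 {
	start := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	events := []event{
		{name: "Acquiring lock", at: start.Add(2 * time.Millisecond)},
		{name: "Got lock, doing work...", at: start.Add(15 * time.Millisecond)},
		{name: "Unlocking", at: start.Add(40 * time.Millisecond)},
	}
	var ms []int64
	for _, d := range offsets(start, events) {
		ms = append(ms, d.Milliseconds())
	}
	return ms
}

func main() {
	fmt.Println(sampleOffsetsMillis()) // prints "[2 15 40]"
}
```

In this made-up timeline, the gap between the first two offsets (13 ms) is the time spent waiting for the lock.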
Events can also have attributes of their own:
span.AddEvent("Cancelled wait due to external signal", trace.WithAttributes(attribute.Int("pid", 4328), attribute.String("signal", "SIGHUP")))
Set span status
A Status can be set on a Span, typically to specify that the Span has not
completed successfully (Error). By default, all spans are Unset, which means a
span completed without error. The Ok status is reserved for when you need to
explicitly mark a span as successful rather than stick with the default of
Unset (i.e., “without error”).
The status can be set at any time before the span is finished.
import (
// ...
"go.opentelemetry.io/otel/codes"
// ...
)
// ...
result, err := operationThatCouldFail()
if err != nil {
span.SetStatus(codes.Error, "operationThatCouldFail failed")
}
Record errors
If you have an operation that failed and you wish to capture the error it produced, you can record that error.
import (
// ...
"go.opentelemetry.io/otel/codes"
// ...
)
// ...
result, err := operationThatCouldFail()
if err != nil {
span.SetStatus(codes.Error, "operationThatCouldFail failed")
span.RecordError(err)
}
It is highly recommended that you also set a span’s status to Error when using
RecordError, unless you do not wish to consider the span tracking a failed
operation as an error span. The RecordError function does not automatically set
a span status when called.
Propagators and Context
Traces can extend beyond a single process. This requires context propagation, a mechanism where identifiers for a trace are sent to remote processes.
In order to propagate trace context over the wire, a propagator must be registered with the OpenTelemetry API.
import (
"go.opentelemetry.io/otel"
"go.opentelemetry.io/otel/propagation"
)
// ...
otel.SetTextMapPropagator(propagation.TraceContext{})
OpenTelemetry also supports the B3 header format (go.opentelemetry.io/contrib/propagators/b3), for compatibility with existing tracing systems that do not support the W3C TraceContext standard.
After configuring context propagation, you’ll most likely want to use automatic instrumentation to handle the behind-the-scenes work of actually serializing the context.
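To give a feel for what “sending trace identifiers over the wire” means, here is a minimal standard-library sketch that writes and reads a header in the W3C traceparent layout (version-traceid-parentid-flags). It is a simplified illustration, not the propagation.TraceContext implementation: the traceContext type and the inject, extract, and roundTrip functions are hypothetical, validation is reduced to length checks, and the flags field is treated as a simple sampled/not-sampled switch.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// traceContext holds the identifiers that travel between processes.
// The type and its field names are illustrative, not the OpenTelemetry API.
type traceContext struct {
	traceID string // 32 lowercase hex characters
	spanID  string // 16 lowercase hex characters
	sampled bool
}

// inject writes tc into the headers using the traceparent layout:
// version-traceid-parentid-flags.
func inject(h http.Header, tc traceContext) {
	flags := "00"
	if tc.sampled {
		flags = "01"
	}
	h.Set("traceparent", "00-"+tc.traceID+"-"+tc.spanID+"-"+flags)
}

// extract reads a traceparent header back into a traceContext.
// It only checks field lengths; a real propagator validates much more.
func extract(h http.Header) (traceContext, bool) {
	parts := strings.Split(h.Get("traceparent"), "-")
	if len(parts) != 4 || len(parts[1]) != 32 || len(parts[2]) != 16 || len(parts[3]) != 2 {
		return traceContext{}, false
	}
	return traceContext{
		traceID: parts[1],
		spanID:  parts[2],
		sampled: parts[3] == "01", // simplification: flags is really a bit field
	}, true
}

// roundTrip injects a sample context into fresh headers and extracts it again,
// standing in for a client sending a request and a server receiving it.
func roundTrip() (traceContext, bool) {
	h := http.Header{}
	inject(h, traceContext{
		traceID: "4bf92f3577b34da6a3ce929d0e0e4736",
		spanID:  "00f067aa0ba902b7",
		sampled: true,
	})
	return extract(h)
}

func main() {
	tc, ok := roundTrip()
	fmt.Println(ok, tc.traceID, tc.spanID, tc.sampled)
}
```

In practice you would not write this yourself. The registered propagator and instrumentation libraries perform the equivalent inject and extract steps for you.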
Metrics
To start producing metrics, you’ll need to have an initialized MeterProvider
that lets you create a Meter. Meters let you create instruments that you can
use to create different kinds of metrics.
OpenTelemetry Go currently supports the following instruments:
- Counter, a synchronous instrument that supports non-negative increments
- Asynchronous Counter, an asynchronous instrument which supports non-negative increments
- Histogram, a synchronous instrument that supports arbitrary values that are statistically meaningful, such as histograms, summaries, or percentiles
- Asynchronous Gauge, an asynchronous instrument that supports non-additive values, such as room temperature
- UpDownCounter, a synchronous instrument that supports increments and decrements, such as the number of active requests
- Asynchronous UpDownCounter, an asynchronous instrument that supports increments and decrements
For more on synchronous and asynchronous instruments, and which kind is best suited for your use case, see Supplementary Guidelines.
If a MeterProvider is not created either by an instrumentation library or
manually, the OpenTelemetry Metrics API will use a no-op implementation and fail
to generate data.
Here you can find more detailed package documentation for:
- Metrics API: go.opentelemetry.io/otel/metric
- Metrics SDK: go.opentelemetry.io/otel/sdk/metric
Initialize Metrics
To enable metrics in your app, you’ll need to have an initialized MeterProvider
that will let you create a Meter.
If a MeterProvider is not created, the OpenTelemetry APIs for metrics will use
a no-op implementation and fail to generate data. Therefore, you have to modify
the source code to include the SDK initialization code using the following
packages:
go.opentelemetry.io/otel
go.opentelemetry.io/otel/sdk/metric
go.opentelemetry.io/otel/sdk/resource
go.opentelemetry.io/otel/exporters/stdout/stdoutmetric
Ensure you have the right Go modules installed:
go get go.opentelemetry.io/otel \
go.opentelemetry.io/otel/exporters/stdout/stdoutmetric \
go.opentelemetry.io/otel/sdk \
go.opentelemetry.io/otel/sdk/metric
Then initialize a resource, a metrics exporter, and a meter provider:
package main
import (
"context"
"log"
"time"
"go.opentelemetry.io/otel"
"go.opentelemetry.io/otel/exporters/stdout/stdoutmetric"
"go.opentelemetry.io/otel/sdk/metric"
"go.opentelemetry.io/otel/sdk/resource"
semconv "go.opentelemetry.io/otel/semconv/v1.21.0"
)
func main() {
// Create resource.
res, err := newResource()
if err != nil {
panic(err)
}
// Create a meter provider.
// You can pass this instance directly to your instrumented code if it
// accepts a MeterProvider instance.
meterProvider, err := newMeterProvider(res)
if err != nil {
panic(err)
}
// Handle shutdown properly so nothing leaks.
defer func() {
if err := meterProvider.Shutdown(context.Background()); err != nil {
log.Println(err)
}
}()
// Register as global meter provider so that it can be used via otel.Meter
// and accessed using otel.GetMeterProvider.
// Most instrumentation libraries use the global meter provider as default.
// If the global meter provider is not set then a no-op implementation
// is used, which fails to generate data.
otel.SetMeterProvider(meterProvider)
}
func newResource() (*resource.Resource, error) {
return resource.Merge(resource.Default(),
resource.NewWithAttributes(semconv.SchemaURL,
semconv.ServiceName("my-service"),
semconv.ServiceVersion("0.1.0"),
))
}
func newMeterProvider(res *resource.Resource) (*metric.MeterProvider, error) {
metricExporter, err := stdoutmetric.New()
if err != nil {
return nil, err
}
meterProvider := metric.NewMeterProvider(
metric.WithResource(res),
metric.WithReader(metric.NewPeriodicReader(metricExporter,
// Default is 1m. Set to 3s for demonstrative purposes.
metric.WithInterval(3*time.Second))),
)
return meterProvider, nil
}
Now that a MeterProvider is configured, you can acquire a Meter.
Acquiring a Meter
Anywhere in your application where you have manually instrumented code you can
call otel.Meter to acquire a meter. For example:
import "go.opentelemetry.io/otel"
var meter = otel.Meter("my-service-meter")
Synchronous and asynchronous instruments
OpenTelemetry instruments are either synchronous or asynchronous (observable).
Synchronous instruments take a measurement when they are called. The measurement is done as another call during program execution, just like any other function call. Periodically, the aggregation of these measurements is exported by a configured exporter. Because measurements are decoupled from exporting values, an export cycle may contain zero or multiple aggregated measurements.
Asynchronous instruments, on the other hand, provide a measurement at the request of the SDK. When the SDK exports, a callback that was provided to the instrument on creation is invoked. This callback provides the SDK with a measurement that is immediately exported. All measurements on asynchronous instruments are performed once per export cycle.
Asynchronous instruments are useful in several circumstances, such as:
- When updating a counter is not computationally cheap, and you don’t want the current executing thread to wait for the measurement
- Observations need to happen at frequencies unrelated to program execution (i.e., they cannot be accurately measured when tied to a request lifecycle)
- There is no known timestamp for a measurement value
In cases like these, it’s often better to observe a cumulative value directly, rather than aggregate a series of deltas in post-processing (the synchronous example).
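The difference between the two styles can be sketched with plain Go, without the SDK. Everything here is hypothetical and simplified: syncCounter, asyncCounter, export, and simulate are illustrative names, there is no concurrency safety, and an “export cycle” is just a function call. The point is only where the measurement happens: inline for the synchronous counter, and inside a callback at export time for the asynchronous one.

```go
package main

import "fmt"

// syncCounter is a hypothetical synchronous instrument: the program calls
// Add as work happens, and the running sum is read at export time.
type syncCounter struct{ sum int64 }

// Add records one measurement inline with program execution.
func (c *syncCounter) Add(n int64) { c.sum += n }

// asyncCounter is a hypothetical asynchronous instrument: it only holds a
// callback, which the exporter invokes once per export cycle.
type asyncCounter struct{ observe func() int64 }

// export simulates one export cycle and returns both reported values.
func export(s *syncCounter, a *asyncCounter) (syncValue, asyncValue int64) {
	return s.sum, a.observe()
}

// simulate handles three "requests" before each of two export cycles and
// returns the {sync, async} pair reported by each cycle.
func simulate() [2][2]int64 {
	requests := int64(0) // cumulative state owned by the program
	s := &syncCounter{}
	a := &asyncCounter{observe: func() int64 { return requests }}

	var out [2][2]int64
	for cycle := 0; cycle < 2; cycle++ {
		for i := 0; i < 3; i++ {
			requests++
			s.Add(1) // measured inline, once per request
		}
		sv, av := export(s, a) // the callback runs here, once per cycle
		out[cycle] = [2]int64{sv, av}
	}
	return out
}

func main() {
	fmt.Println(simulate()) // prints "[[3 3] [6 6]]"
}
```

Both instruments report the same cumulative totals here, but the synchronous one paid for six Add calls while the asynchronous callback ran only twice.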
Using Counters
Counters can be used to measure a non-negative, increasing value.
For example, here’s how you report the number of calls for an HTTP handler:
import (
"net/http"
"go.opentelemetry.io/otel/metric"
)
func init() {
apiCounter, err := meter.Int64Counter(
"api.counter",
metric.WithDescription("Number of API calls."),
metric.WithUnit("{call}"),
)
if err != nil {
panic(err)
}
http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
apiCounter.Add(r.Context(), 1)
// do some work in an API call
})
}
Using UpDown Counters
UpDown counters can increment and decrement, allowing you to observe a cumulative value that goes up or down.
For example, here’s how you report the number of items of some collection:
import (
"context"
"go.opentelemetry.io/otel/metric"
)
var itemsCounter metric.Int64UpDownCounter
func init() {
var err error
itemsCounter, err = meter.Int64UpDownCounter(
"items.counter",
metric.WithDescription("Number of items."),
metric.WithUnit("{item}"),
)
if err != nil {
panic(err)
}
}
func addItem() {
// code that adds an item to the collection
itemsCounter.Add(context.Background(), 1)
}
func removeItem() {
// code that removes an item from the collection
itemsCounter.Add(context.Background(), -1)
}
Using Histograms
Histograms are used to measure a distribution of values over time.
For example, here’s how you report a distribution of response times for an HTTP handler:
import (
"net/http"
"time"
"go.opentelemetry.io/otel/metric"
)
func init() {
histogram, err := meter.Float64Histogram(
"task.duration",
metric.WithDescription("The duration of task execution."),
metric.WithUnit("s"),
)
if err != nil {
panic(err)
}
http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
start := time.Now()
// do some work in an API call
duration := time.Since(start)
histogram.Record(r.Context(), duration.Seconds())
})
}
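To show what “a distribution of values” can look like once aggregated, here is a minimal standard-library sketch of explicit-bucket counting, one common way to summarize recorded values. The bucketCounts function and the boundaries are hypothetical and chosen for illustration. They are not the SDK’s defaults, and the SDK’s own aggregation tracks more than counts.

```go
package main

import (
	"fmt"
	"sort"
)

// bucketCounts sorts each value into the first bucket whose upper bound is
// greater than or equal to it. Values above the last bound land in a final
// overflow bucket, so the result has len(bounds)+1 entries.
// bounds must be sorted in ascending order.
func bucketCounts(bounds, values []float64) []uint64 {
	counts := make([]uint64, len(bounds)+1)
	for _, v := range values {
		// SearchFloat64s returns the smallest index i with bounds[i] >= v,
		// or len(bounds) when no such bound exists.
		counts[sort.SearchFloat64s(bounds, v)]++
	}
	return counts
}

func main() {
	bounds := []float64{0.1, 0.5, 1} // hypothetical boundaries, in seconds
	durations := []float64{0.03, 0.2, 0.5, 0.7, 4.2}
	fmt.Println(bucketCounts(bounds, durations)) // prints "[1 2 1 1]"
}
```

With these sample durations, one request finished within 0.1 s, two fell between 0.1 s and 0.5 s, one fell between 0.5 s and 1 s, and one exceeded 1 s.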
Using Observable (Async) Counters
Observable counters can be used to measure an additive, non-negative, monotonically increasing value.
For example, here’s how you report time since the application started:
import (
"context"
"time"
"go.opentelemetry.io/otel/metric"
)
func init() {
start := time.Now()
if _, err := meter.Float64ObservableCounter(
"uptime",
metric.WithDescription("The duration since the application started."),
metric.WithUnit("s"),
metric.WithFloat64Callback(func(_ context.Context, o metric.Float64Observer) error {
o.Observe(float64(time.Since(start).Seconds()))
return nil
}),
); err != nil {
panic(err)
}
}
Using Observable (Async) UpDown Counters
Observable UpDown counters can increment and decrement, allowing you to measure an additive, non-negative, non-monotonically increasing cumulative value.
For example, here’s how you report some database metrics:
import (
"context"
"database/sql"
"go.opentelemetry.io/otel/metric"
)
// registerDBMetrics registers asynchronous metrics for the provided db.
// Make sure to unregister metric.Registration before closing the provided db.
func registerDBMetrics(db *sql.DB, meter metric.Meter, poolName string) (metric.Registration, error) {
max, err := meter.Int64ObservableUpDownCounter(
"db.client.connections.max",
metric.WithDescription("The maximum number of open connections allowed."),
metric.WithUnit("{connection}"),
)
if err != nil {
return nil, err
}
waitTime, err := meter.Int64ObservableUpDownCounter(
"db.client.connections.wait_time",
metric.WithDescription("The time it took to obtain an open connection from the pool."),
metric.WithUnit("ms"),
)
if err != nil {
return nil, err
}
reg, err := meter.RegisterCallback(
func(_ context.Context, o metric.Observer) error {
stats := db.Stats()
o.ObserveInt64(max, int64(stats.MaxOpenConnections))
// WaitDuration is a time.Duration (nanoseconds); convert it to match the "ms" unit.
o.ObserveInt64(waitTime, stats.WaitDuration.Milliseconds())
return nil
},
max,
waitTime,
)
if err != nil {
return nil, err
}
return reg, nil
}
Using Observable (Async) Gauges
Observable Gauges should be used to measure non-additive values.
For example, here’s how you report the memory usage of heap objects in an application:
import (
"context"
"runtime"
"go.opentelemetry.io/otel/metric"
)
func init() {
if _, err := meter.Int64ObservableGauge(
"memory.heap",
metric.WithDescription(
"Memory usage of the allocated heap objects.",
),
metric.WithUnit("By"),
metric.WithInt64Callback(func(_ context.Context, o metric.Int64Observer) error {
var m runtime.MemStats
runtime.ReadMemStats(&m)
o.Observe(int64(m.HeapAlloc))
return nil
}),
); err != nil {
panic(err)
}
}
Adding attributes
You can add attributes by using the WithAttributeSet or WithAttributes options.
import (
"net/http"
"go.opentelemetry.io/otel/metric"
semconv "go.opentelemetry.io/otel/semconv/v1.21.0"
)
func init() {
apiCounter, err := meter.Int64UpDownCounter(
"api.finished.counter",
metric.WithDescription("Number of finished API calls."),
metric.WithUnit("{call}"),
)
if err != nil {
panic(err)
}
http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
// do some work in an API call and set the response HTTP status code
statusCode := http.StatusOK // placeholder: use the status your handler actually wrote
apiCounter.Add(r.Context(), 1,
metric.WithAttributes(semconv.HTTPStatusCode(statusCode)))
})
}
Registering Views
A view provides SDK users with the flexibility to customize the metrics output by the SDK. You can customize which metric instruments are to be processed or ignored. You can also customize aggregation and what attributes you want to report on metrics.
Every instrument has a default view, which retains the original name, description, and attributes, and has a default aggregation that is based on the type of instrument. When a registered view matches an instrument, the default view is replaced by the registered view. Additional registered views that match the instrument are additive, and result in multiple exported metrics for the instrument.
You can use the NewView function to create a view and register it using the
WithView option.
For example, here’s how you create a view that renames the latency instrument
from the v0.34.0 version of the http instrumentation library to
request.latency:
view := metric.NewView(metric.Instrument{
Name: "latency",
Scope: instrumentation.Scope{
Name: "http",
Version: "0.34.0",
},
}, metric.Stream{Name: "request.latency"})
meterProvider := metric.NewMeterProvider(
metric.WithView(view),
)
For example, here’s how you create a view that makes the latency instrument
from the http instrumentation library be reported as an exponential histogram:
view := metric.NewView(
metric.Instrument{
Name: "latency",
Scope: instrumentation.Scope{Name: "http"},
},
metric.Stream{
Aggregation: metric.AggregationBase2ExponentialHistogram{
MaxSize: 160,
MaxScale: 20,
},
},
)
meterProvider := metric.NewMeterProvider(
metric.WithView(view),
)
The SDK filters metrics and attributes before exporting metrics. For example, you can use views to reduce memory usage of high cardinality metrics or drop attributes that might contain sensitive data.
Here’s how you create a view that drops the latency instrument from the http
instrumentation library:
view := metric.NewView(
metric.Instrument{
Name: "latency",
Scope: instrumentation.Scope{Name: "http"},
},
metric.Stream{Aggregation: metric.AggregationDrop{}},
)
meterProvider := metric.NewMeterProvider(
metric.WithView(view),
)
Here’s how you create a view that removes the http.request.method attribute
recorded by the latency instrument from the http instrumentation library:
view := metric.NewView(
metric.Instrument{
Name: "latency",
Scope: instrumentation.Scope{Name: "http"},
},
metric.Stream{AttributeFilter: attribute.NewDenyKeysFilter("http.request.method")},
)
meterProvider := metric.NewMeterProvider(
metric.WithView(view),
)
The Name field of criteria supports wildcard pattern matching. The * wildcard
is recognized as matching zero or more characters, and ? is recognized as
matching exactly one character. For example, a pattern of * matches all
instrument names.
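The wildcard rules above can be sketched with the standard library by translating a pattern into an anchored regular expression. This is an illustration of the matching semantics as described, not necessarily how the SDK implements them. The wildcardToRegexp and matches functions are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// wildcardToRegexp translates a pattern where * matches zero or more
// characters and ? matches exactly one character into an anchored regexp.
func wildcardToRegexp(pattern string) *regexp.Regexp {
	// Escape everything first so characters like '.' are matched literally,
	// then turn the escaped wildcards back into regexp operators.
	p := regexp.QuoteMeta(pattern)
	p = strings.ReplaceAll(p, `\*`, `.*`)
	p = strings.ReplaceAll(p, `\?`, `.`)
	return regexp.MustCompile("^" + p + "$")
}

// matches reports whether name satisfies the wildcard pattern.
func matches(pattern, name string) bool {
	return wildcardToRegexp(pattern).MatchString(name)
}

func main() {
	fmt.Println(matches("*", "latency"))               // true
	fmt.Println(matches("*.ms", "request.latency.ms")) // true
	fmt.Println(matches("*.ms", "request_ms"))         // false
	fmt.Println(matches("http.?", "http.a"))           // true
	fmt.Println(matches("http.?", "http.ab"))          // false
}
```

Escaping before substituting matters: without it, the dot in a pattern such as *.ms would match any character, and request_ms would match as well.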
The following example shows how you create a view that sets unit to milliseconds
for any instrument with a name suffix of .ms:
view := metric.NewView(
metric.Instrument{Name: "*.ms"},
metric.Stream{Unit: "ms"},
)
meterProvider := metric.NewMeterProvider(
metric.WithView(view),
)
The NewView function provides a convenient way of creating views. If NewView
can’t provide the functionalities you need, you can create a custom View
directly.
For example, here’s how you create a view that uses regular expression matching to ensure all data stream names have a suffix of the units they use:
re := regexp.MustCompile(`[._](ms|byte)$`)
var view metric.View = func(i metric.Instrument) (metric.Stream, bool) {
// In a custom View function, you need to explicitly copy
// the name, description, and unit.
s := metric.Stream{Name: i.Name, Description: i.Description, Unit: i.Unit}
// Any instrument that does not have a unit suffix defined, but has a
// dimensional unit defined, update the name with a unit suffix.
if re.MatchString(i.Name) {
return s, false
}
switch i.Unit {
case "ms":
s.Name += ".ms"
case "By":
s.Name += ".byte"
default:
return s, false
}
return s, true
}
meterProvider := metric.NewMeterProvider(
metric.WithView(view),
)
Logs
The logs API is currently unstable; documentation is to be announced.
Next Steps
You’ll also want to configure an appropriate exporter to export your telemetry data to one or more telemetry backends.