API reference - Braintrust

This page covers the key APIs in the Braintrust Go SDK. For setup, see the Quickstart. For the complete reference, see pkg.go.dev.

Tracing

Tracing records what your application does as spans you can inspect in Braintrust. The recommended way to capture AI calls is auto-instrumentation: use the trace/contrib packages to instrument supported provider libraries, either at build time with Orchestrion or with runtime middleware (see Go SDK integrations). Tracing is built on OpenTelemetry, so you trace your own code with the standard OpenTelemetry API. The APIs below create the client, trace your own code, and link to your traces.

`braintrust.New`

Creates a Braintrust client and configures the OpenTelemetry pipeline that exports spans to Braintrust. Call it once on startup, passing your TracerProvider and any options.

import (
	"log"

	"github.com/braintrustdata/braintrust-sdk-go"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/sdk/trace"
)

tp := trace.NewTracerProvider()
otel.SetTracerProvider(tp)

client, err := braintrust.New(tp, braintrust.WithProject("My project"))
if err != nil {
	log.Fatal(err)
}

Returns: (*braintrust.Client, error). braintrust.New reads the API key from BRAINTRUST_API_KEY unless you pass braintrust.WithAPIKey. Configure the rest with functional options or environment variables (see Configuration).

Because tracing is built on OpenTelemetry, you trace your own application code with the standard OpenTelemetry API, and traced AI calls nest under your spans.

ctx, span := otel.Tracer("my-app").Start(ctx, "process-request")
defer span.End()

`Client.Permalink`

Builds a Braintrust UI URL for a span, so you can link straight to a trace from your own logs or app.

url := client.Permalink(span)

Returns: string.

Evaluations

An evaluation runs your task over a set of cases, scores each output, and logs the results to an experiment. This is how you measure quality and catch regressions as you change prompts or models. Create an evaluator with braintrust.NewEvaluator, then call Run.

`braintrust.NewEvaluator`

Creates an evaluator for input type I and result type R, bound to a client. Call Run on it with the cases, task, and scorers to execute the evaluation and log an experiment.

import (
	"context"
	"log"

	"github.com/braintrustdata/braintrust-sdk-go"
	"github.com/braintrustdata/braintrust-sdk-go/eval"
)

evaluator := braintrust.NewEvaluator[string, string](client)

// answerQuestion is the function under test, a plain
// func(ctx context.Context, input string) (string, error) wrapped with eval.T.
_, err := evaluator.Run(context.Background(), eval.Opts[string, string]{
	Experiment: "answers-v1",
	Dataset: eval.NewDataset([]eval.Case[string, string]{
		{Input: "How do I reset my password?", Expected: "Use the account recovery flow."},
		{Input: "How do I export my data?", Expected: "Open Settings and choose Export."},
	}),
	Task: eval.T(answerQuestion),
	Scorers: []eval.Scorer[string, string]{
		// Score 1.0 when the output matches the expected answer exactly, else 0.0.
		eval.NewScorer("exact_match", func(_ context.Context, r eval.TaskResult[string, string]) (eval.Scores, error) {
			v := 0.0
			if r.Output == r.Expected {
				v = 1.0
			}
			return eval.S(v), nil
		}),
	},
})
if err != nil {
	log.Fatal(err)
}

Returns: *eval.Evaluator[I, R]. Run returns (*eval.Result, error). eval.Opts[I, R] fields:

Experiment (string, required): experiment name.
Dataset (eval.Dataset[I, R], required): the cases to run. Build an in-memory one with eval.NewDataset (see Datasets).
Task (eval.TaskFunc[I, R], required): the function under test. Wrap a plain func(ctx, input) (output, error) with eval.T.
Scorers ([]eval.Scorer[I, R]): scorers to apply to each case. Provide Scorers, Classifiers, or both.
Classifiers ([]eval.Classifier[I, R]): classifiers to apply to each case. Provide Scorers, Classifiers, or both.
ProjectName (string): project to log to. Defaults to the client’s configured project.
Tags ([]string): tags to apply to the experiment.
Metadata (eval.Metadata): metadata to attach to the experiment.
Update (bool): append to an existing experiment with the same name. Defaults to false.
Parallelism (int): number of goroutines used to run cases concurrently. Defaults to 1.
TrialCount (int): number of times to run each case. Defaults to 1.
Quiet (bool): suppress result output. Defaults to false.

`eval.NewScorer`

Creates a scorer from a function. A scorer measures how good the task’s output is, returning one or more named scores per case.

scorer := eval.NewScorer("exact_match", func(_ context.Context, r eval.TaskResult[string, string]) (eval.Scores, error) {
	if r.Output == r.Expected {
		return eval.S(1.0), nil
	}
	return eval.S(0.0), nil
})

Returns: eval.Scorer[I, R]. The score function receives an eval.TaskResult[I, R] (with Input, Output, Expected, and Metadata) and returns eval.Scores. Use eval.S to build a single score.
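
The body of a scorer is often easiest to develop and unit test as a plain function that you then call from the closure passed to eval.NewScorer. The sketch below is one such helper, an exact match that ignores case and surrounding whitespace. normalizedMatch is a hypothetical name, not part of the SDK.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizedMatch returns 1.0 when output and expected are equal after
// trimming surrounding whitespace and ignoring case, and 0.0 otherwise.
// Hypothetical helper for illustration; not part of the Braintrust SDK.
func normalizedMatch(output, expected string) float64 {
	if strings.EqualFold(strings.TrimSpace(output), strings.TrimSpace(expected)) {
		return 1.0
	}
	return 0.0
}

func main() {
	fmt.Println(normalizedMatch("  Use the account recovery flow. ", "use the account recovery flow.")) // prints 1
	fmt.Println(normalizedMatch("Open Settings.", "Open Settings and choose Export."))                  // prints 0
}
```

Inside a scorer function you would then return eval.S(normalizedMatch(r.Output, r.Expected)), nil.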

`eval.NewClassifier`

Creates a classifier from a function. Use a classifier to categorize output instead of scoring it numerically.

classifier := eval.NewClassifier("topic", func(_ context.Context, r eval.TaskResult[string, string]) (eval.Classifications, error) {
	return eval.Classifications{{ID: "billing", Label: "Billing"}}, nil
})

Returns: eval.Classifier[I, R].
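
As with scorers, the categorization logic can live in a plain function that the classifier closure calls. The following is a rough keyword-based sketch. The topic type, the classifyTopic function, and the keyword lists are all hypothetical and exist only to illustrate the idea.

```go
package main

import (
	"fmt"
	"strings"
)

// topic mirrors the ID and Label pair used in the classifier example above.
// Hypothetical local type; not part of the Braintrust SDK.
type topic struct {
	ID    string
	Label string
}

// classifyTopic assigns a coarse topic to an output by keyword lookup.
// The keywords are made up for this sketch.
func classifyTopic(output string) topic {
	text := strings.ToLower(output)
	switch {
	case strings.Contains(text, "invoice"), strings.Contains(text, "refund"):
		return topic{ID: "billing", Label: "Billing"}
	case strings.Contains(text, "password"), strings.Contains(text, "login"):
		return topic{ID: "account", Label: "Account"}
	default:
		return topic{ID: "other", Label: "Other"}
	}
}

func main() {
	fmt.Println(classifyTopic("Your refund was issued yesterday.").ID)
	fmt.Println(classifyTopic("Reset your password from the login page.").ID)
}
```

In the classifier function, the result would map onto the return value as eval.Classifications{{ID: t.ID, Label: t.Label}}.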

Datasets

A dataset is the set of cases an evaluation runs against. Define cases inline in memory, or manage datasets in Braintrust through the API client.

`eval.NewDataset`

Groups cases into an in-memory dataset you pass to Evaluator.Run, as an alternative to loading one from Braintrust.

dataset := eval.NewDataset([]eval.Case[string, string]{
	{Input: "How do I reset my password?", Expected: "Use the account recovery flow."},
	{Input: "How do I export my data?", Expected: "Open Settings and choose Export."},
})

Returns: eval.Dataset[I, R]. Each eval.Case[I, R] has an Input and optional Expected, Tags, Metadata, and TrialCount.

Attachments

When your traces involve binary content like images or PDFs, log it as an attachment so it renders in Braintrust rather than appearing as an opaque blob. When you trace AI calls, Braintrust automatically converts base64 attachments in provider messages into uploaded attachments, so you rarely need the APIs below for instrumented calls. Reach for them when you’re attaching binary content to a span yourself.

`attachment.From*`

Creates an attachment from bytes, a file, a URL, or an io.Reader.

import "github.com/braintrustdata/braintrust-sdk-go/trace/attachment"

att, err := attachment.FromFile("image/png", "chart.png")
if err != nil {
	return err // FromFile reads the file, so it fails if the file cannot be read
}

Constructors:

attachment.FromBytes(contentType string, data []byte) → *attachment.Attachment: from raw bytes.
attachment.FromFile(contentType string, path string) → (*attachment.Attachment, error): reads a file.
attachment.FromURL(url string) → (*attachment.Attachment, error): fetches a URL and uses the response content type.
attachment.FromReader(contentType string, r io.Reader) → *attachment.Attachment: from an io.Reader.
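
FromBytes, FromFile, and FromReader all take an explicit content type. When you only have raw bytes and do not know the type up front, one option is to sniff it with the Go standard library before building the attachment. This is a sketch of that idea only. sniffContentType is a hypothetical helper, and nothing here is part of the attachment package.

```go
package main

import (
	"fmt"
	"net/http"
)

// sniffContentType guesses a MIME type from the leading bytes of data using
// net/http's content sniffing. DetectContentType always returns a valid MIME
// type, falling back to "application/octet-stream" when it cannot determine
// a more specific one.
func sniffContentType(data []byte) string {
	return http.DetectContentType(data)
}

func main() {
	png := []byte("\x89PNG\r\n\x1a\n") // PNG file signature
	pdf := []byte("%PDF-1.7\n")        // start of a PDF header
	fmt.Println(sniffContentType(png)) // prints image/png
	fmt.Println(sniffContentType(pdf)) // prints application/pdf
}
```

The result can then be passed as the contentType argument, for example attachment.FromBytes(sniffContentType(data), data).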

API client

For direct access to the Braintrust REST API, use the api package. Reach for it to manage projects, experiments, datasets, and functions programmatically, beyond what the higher-level APIs above cover.

`api.NewClient`

Creates a REST API client from an API key.

import (
	"os"

	"github.com/braintrustdata/braintrust-sdk-go/api"
)

client := api.NewClient(os.Getenv("BRAINTRUST_API_KEY"))

Returns: *api.API. Namespaces:

client.Projects(): project management.
client.Experiments(): experiment management.
client.Datasets(): dataset management. Methods include Create, Insert, InsertEvents, Delete, Fetch, and Query.
client.Functions(): function management, including Invoke(ctx, functionID, input) to call a deployed function.

Configuration

Configure the client with functional options passed to braintrust.New, or with environment variables.

client, err := braintrust.New(tp,
	braintrust.WithProject("My project"),
	braintrust.WithBlockingLogin(true),
)

Client options:

braintrust.WithAPIKey(apiKey string): API key. Defaults to BRAINTRUST_API_KEY.
braintrust.WithAPIURL(apiURL string): Braintrust API URL.
braintrust.WithAppURL(appURL string): Braintrust app URL, used for permalinks.
braintrust.WithOrgName(orgName string): organization name, useful when credentials can access multiple orgs.
braintrust.WithProject(projectName string): project that receives exported spans.
braintrust.WithProjectID(projectID string): project ID. Takes precedence over the project name.
braintrust.WithBlockingLogin(enabled bool): log in synchronously during New instead of in the background.
braintrust.WithExporter(exporter trace.SpanExporter): supply a custom span exporter. Intended for testing.
braintrust.WithEnableTraceConsoleLog(enabled bool): print spans to the console.
braintrust.WithFilterAISpans(enabled bool): export only AI-related spans.
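
If functional options are new to you, the pattern behind the list above can be sketched in a few lines of plain Go. The settings type, the lowercase withProject and withBlockingLogin helpers, and newSettings are hypothetical stand-ins; the SDK's internal representation may differ.

```go
package main

import "fmt"

// settings is a hypothetical stand-in for a client's internal configuration.
type settings struct {
	project       string
	blockingLogin bool
}

// option mutates settings. Each helper below returns one.
type option func(*settings)

func withProject(name string) option {
	return func(s *settings) { s.project = name }
}

func withBlockingLogin(enabled bool) option {
	return func(s *settings) { s.blockingLogin = enabled }
}

// newSettings starts from defaults and applies the options in order, so a
// later option overrides an earlier one that sets the same field.
func newSettings(opts ...option) settings {
	s := settings{project: "default-go-project"}
	for _, opt := range opts {
		opt(&s)
	}
	return s
}

func main() {
	s := newSettings(withProject("My project"), withBlockingLogin(true))
	fmt.Println(s.project, s.blockingLogin)
}
```

The appeal of the pattern is that callers name only the settings they care about, and new options can be added later without changing the constructor's signature.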

Environment variables

BRAINTRUST_API_KEY (required): Braintrust API key.
BRAINTRUST_API_URL: Braintrust API URL. Defaults to https://api.braintrust.dev.
BRAINTRUST_APP_URL: Braintrust app URL, used for permalinks. Defaults to https://www.braintrust.dev.
BRAINTRUST_DEFAULT_PROJECT: project that traced spans route to. Defaults to default-go-project.
BRAINTRUST_DEFAULT_PROJECT_ID: project UUID. Takes precedence over the project name.
BRAINTRUST_ORG_NAME: organization name, useful when credentials can access multiple orgs.
BRAINTRUST_OTEL_FILTER_AI_SPANS: set to true to export only AI-related spans.
BRAINTRUST_ENABLE_TRACE_CONSOLE_LOG: set to true to print spans to the console.
BRAINTRUST_BLOCKING_LOGIN: set to true to log in synchronously at startup.
