# Anthropic

> Trace Anthropic SDK calls, evaluate Claude models, and route them through the Braintrust gateway

If you are a coding agent, prefer the Braintrust [`bt` CLI](/reference/cli/quickstart) for repeatable, scriptable work: running evals, instrumenting code, querying logs, syncing data, managing functions, and configuring coding agents. Use the MCP server for reasoning over Braintrust data in conversation, such as ad-hoc lookups and exploration from your IDE.

Braintrust integrates with [Anthropic](https://www.anthropic.com) so you can call Claude models from the Braintrust playground, API, and SDKs. Braintrust also traces Anthropic SDK calls from your application, including streaming, prompt caching, server-side tool use, and managed agents.

## Add Anthropic as an AI provider

To use Anthropic models in the Braintrust playground, API, and gateway, add Anthropic as a provider in your organization's or project's AI provider settings.

1. Visit [Anthropic's Console](https://console.anthropic.com/settings/keys) and create a new API key.
2. Go to **<Icon icon="settings-2" /> Settings** > [**<Icon icon="sparkle" /> AI providers**](https://www.braintrust.dev/app/~/configuration/org/secrets).
3. Click <Icon icon="plus" /> **Organization provider** or <Icon icon="plus" /> **Project provider**, depending on whether you want the provider to be available across every project in the organization or just the current project.
4. Under **Model providers**, click **Anthropic**.
5. Choose your authentication method:
   * **API key**: Paste an Anthropic API key into the **Secret** field.

     <Note>
       API keys are stored as one-way cryptographic hashes, never in plaintext.
     </Note>
   * **Workload identity federation**: Exchange a Braintrust-signed OIDC token for an Anthropic access token, instead of storing a long-lived Anthropic API key in Braintrust.

     <Note>
       Workload identity federation is available only for organization-level providers on Braintrust-hosted organizations with the Braintrust gateway enabled. Project-level providers and self-hosted deployments must use **API key** authentication.
     </Note>
6. If you chose **Workload identity federation**, use the setup values shown in Braintrust to configure Anthropic:

   1. Create an Anthropic service account. Copy the `svac_...` service account ID.

   2. Register the Braintrust issuer in Anthropic. Use the **Issuer URL** shown in Braintrust, set **JWKS source** to **OIDC discovery**, leave **Discovery base URL** blank, turn **Single-use tokens** on, and set **Max token lifetime** to **1 hour**.

   3. Create a federation rule in Anthropic. Use the **Subject pattern**, **Expected audience**, and **Required claims** shown in Braintrust. Select the service account from the previous step and the Anthropic workspace Braintrust should use.

   4. Paste the Anthropic IDs back into Braintrust:

      * **Federation rule ID**: The `fdrl_...` value in Anthropic's **Workload identity** > **Rules** table.
      * **Organization ID**: The ID shown on Anthropic's **Organization settings** > **Organization** page.
      * **Service account ID**: The `svac_...` ID from the Anthropic service account.
      * **Workspace ID**: Use `default` for Anthropic's default workspace, or the workspace ID selected in the federation rule.
      * **Subject suffix**: A stable suffix for this Anthropic connection. It must match the final part of the subject pattern used in Anthropic.

   For general Anthropic concepts and Console details, see [Anthropic's workload identity federation docs](https://platform.claude.com/docs/en/manage-claude/workload-identity-federation).
7. Click **Save**.
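
If you chose workload identity federation, it can help to sanity-check the copied values before pasting them. The sketch below is a hypothetical pre-flight helper, not part of Braintrust or Anthropic. It only checks the `fdrl_` and `svac_` prefixes mentioned above, and it assumes that "the final part of the subject pattern" means the pattern literally ends with the subject suffix. The function name, parameter names, and sample values are placeholders.

```python
def check_federation_values(
    federation_rule_id: str,
    service_account_id: str,
    subject_pattern: str,
    subject_suffix: str,
) -> list[str]:
    """Return a list of problems found; an empty list means nothing looked wrong."""
    problems = []
    if not federation_rule_id.startswith("fdrl_"):
        problems.append("Federation rule ID should start with 'fdrl_'")
    if not service_account_id.startswith("svac_"):
        problems.append("Service account ID should start with 'svac_'")
    # Assumption: the subject pattern must end with the subject suffix.
    if not subject_suffix or not subject_pattern.endswith(subject_suffix):
        problems.append("Subject suffix should match the end of the subject pattern")
    return problems


# Placeholder values for illustration only
issues = check_federation_values("fdrl_example", "svac_example", "prefix:prod", "prod")
```

An empty `issues` list only means the values have the expected shape. It does not confirm that they exist in your Anthropic organization.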

<View title="TypeScript" icon="https://img.logo.dev/typescriptlang.org?token=pk_BdcHD9e5SCW3j1rnJkNyMQ">
  <h2 id="tracing-typescript">
    Tracing
  </h2>

  Braintrust traces Anthropic calls automatically with the `braintrust/hook.mjs` import hook, or manually with `wrapAnthropic`. Either path produces the same spans.

  Pick the tracing path that fits your application. Auto-instrumentation is the recommended path for most users.

  <Tabs>
    <Tab title="Auto-instrumentation">
      <h3 id="setup-typescript-auto">
        Setup
      </h3>

      Install the Braintrust SDK alongside the Anthropic SDK, then configure your API keys.

      <Steps>
        <Step title="Install packages">
          <CodeGroup>
            ```bash pnpm theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
            pnpm add braintrust @anthropic-ai/sdk
            ```

            ```bash npm theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
            npm install braintrust @anthropic-ai/sdk
            ```
          </CodeGroup>
        </Step>

        <Step title="Get an Anthropic API key">
          Visit [Anthropic's Console](https://console.anthropic.com/settings/keys) and create a new API key, then [add it as a Braintrust AI provider](#add-anthropic-as-an-ai-provider).
        </Step>

        <Step title="Set environment variables">
          ```bash title=".env" theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          ANTHROPIC_API_KEY=<your-anthropic-api-key>
          BRAINTRUST_API_KEY=<your-braintrust-api-key>

          # For organizations on the EU data plane, use https://api-eu.braintrust.dev
          # For self-hosted deployments, use your data plane URL
          # BRAINTRUST_API_URL=<your-braintrust-api-url>
          ```
        </Step>
      </Steps>

      <h3 id="trace-typescript-auto">
        Trace your application
      </h3>

      To trace Anthropic calls without modifying your application code, run your app with Braintrust's import hook to patch `@anthropic-ai/sdk` at startup.

      <CodeGroup>
        ```javascript title="trace-anthropic-auto.js" theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
        import Anthropic from "@anthropic-ai/sdk";
        import { initLogger } from "braintrust";

        initLogger({
          projectName: "My Project", // Replace with your project name
          apiKey: process.env.BRAINTRUST_API_KEY,
        });

        const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

        const result = await client.messages.create({
          model: "claude-sonnet-4-5-20250929",
          max_tokens: 1024,
          messages: [{ role: "user", content: "What is machine learning?" }],
        });
        ```
      </CodeGroup>

      Run with the import hook:

      ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
      node --import braintrust/hook.mjs trace-anthropic-auto.js
      ```

      The auto-instrumentation example uses plain JavaScript so `node --import` can run the file directly. The Braintrust APIs work the same way in TypeScript projects: compile your TypeScript to JavaScript, then run the compiled file with the import hook.

      <Note>
        If you're using a bundler, see [Trace LLM calls](/instrument/trace-llm-calls#auto-instrumentation) for plugin and loader setup.
      </Note>

      <h3 id="anthropic-bedrock-typescript">
        AWS Bedrock
      </h3>

      To trace Claude models on AWS Bedrock through Anthropic's [`@anthropic-ai/bedrock-sdk`](https://www.npmjs.com/package/@anthropic-ai/bedrock-sdk), create an `AnthropicBedrock` client and run with the import hook. The package is built on `@anthropic-ai/sdk`, so calls appear under `anthropic.messages.create` with `provider: "anthropic"` metadata, identical to the direct Anthropic SDK.

      <CodeGroup>
        ```javascript title="trace-anthropic-bedrock-auto.js" theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
        import { AnthropicBedrock } from "@anthropic-ai/bedrock-sdk";
        import { initLogger } from "braintrust";

        initLogger({
          projectName: "My Project", // Replace with your project name
          apiKey: process.env.BRAINTRUST_API_KEY,
        });

        // Resolves AWS credentials from the standard environment chain
        const client = new AnthropicBedrock();

        const result = await client.messages.create({
          model: "anthropic.claude-3-5-sonnet-20241022-v2:0",
          max_tokens: 1024,
          messages: [{ role: "user", content: "What is machine learning?" }],
        });
        ```
      </CodeGroup>

      Run with the import hook:

      ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
      node --import braintrust/hook.mjs trace-anthropic-bedrock-auto.js
      ```
    </Tab>

    <Tab title="Manual instrumentation">
      <h3 id="setup-typescript-manual">
        Setup
      </h3>

      Install the Braintrust SDK alongside the Anthropic SDK, then configure your API keys.

      <Steps>
        <Step title="Install packages">
          <CodeGroup>
            ```bash pnpm theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
            pnpm add braintrust @anthropic-ai/sdk
            ```

            ```bash npm theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
            npm install braintrust @anthropic-ai/sdk
            ```
          </CodeGroup>
        </Step>

        <Step title="Get an Anthropic API key">
          Visit [Anthropic's Console](https://console.anthropic.com/settings/keys) and create a new API key, then [add it as a Braintrust AI provider](#add-anthropic-as-an-ai-provider).
        </Step>

        <Step title="Set environment variables">
          ```bash title=".env" theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          ANTHROPIC_API_KEY=<your-anthropic-api-key>
          BRAINTRUST_API_KEY=<your-braintrust-api-key>

          # For organizations on the EU data plane, use https://api-eu.braintrust.dev
          # For self-hosted deployments, use your data plane URL
          # BRAINTRUST_API_URL=<your-braintrust-api-url>
          ```
        </Step>
      </Steps>

      <h3 id="trace-typescript-manual">
        Trace your application
      </h3>

      To trace Anthropic calls manually, wrap your client with `wrapAnthropic`. Once wrapped, every `messages.create` call (including streaming) emits a span.

      <CodeGroup>
        ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
        import Anthropic from "@anthropic-ai/sdk";
        import { wrapAnthropic, initLogger } from "braintrust";

        // Initialize the Braintrust logger
        const logger = initLogger({
          projectName: "My Project", // Your project name
          apiKey: process.env.BRAINTRUST_API_KEY,
        });

        // Wrap the Anthropic client with the Braintrust logger
        const client = wrapAnthropic(
          new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY }),
        );

        // All API calls are automatically logged
        const result = await client.messages.create({
          model: "claude-sonnet-4-5-20250929",
          max_tokens: 1024,
          messages: [{ role: "user", content: "What is machine learning?" }],
        });
        ```
      </CodeGroup>

      <Tip>
        For more control over tracing, learn how to [customize traces](/instrument/advanced-tracing).
      </Tip>

      <h3 id="anthropic-bedrock-typescript-manual">
        AWS Bedrock
      </h3>

      To trace Claude models on AWS Bedrock through Anthropic's [`@anthropic-ai/bedrock-sdk`](https://www.npmjs.com/package/@anthropic-ai/bedrock-sdk), wrap the `AnthropicBedrock` client with `wrapAnthropic`. The package is built on `@anthropic-ai/sdk`, so calls appear under `anthropic.messages.create` with `provider: "anthropic"` metadata, identical to the direct Anthropic SDK.

      <CodeGroup>
        ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
        import { AnthropicBedrock } from "@anthropic-ai/bedrock-sdk";
        import { wrapAnthropic, initLogger } from "braintrust";

        initLogger({
          projectName: "My Project", // Replace with your project name
          apiKey: process.env.BRAINTRUST_API_KEY,
        });

        // Resolves AWS credentials from the standard environment chain
        const client = wrapAnthropic(new AnthropicBedrock());

        const result = await client.messages.create({
          model: "anthropic.claude-3-5-sonnet-20241022-v2:0",
          max_tokens: 1024,
          messages: [{ role: "user", content: "What is machine learning?" }],
        });
        ```
      </CodeGroup>
    </Tab>
  </Tabs>

  <Note>
    To trace the native AWS Bedrock Runtime client (`@aws-sdk/client-bedrock-runtime`) instead, see [AWS Bedrock](/integrations/ai-providers/bedrock#tracing-typescript).
  </Note>

  <h3 id="gateway-typescript">
    Gateway
  </h3>

  To call Anthropic through the [Braintrust gateway](/deploy/gateway), point an OpenAI-compatible client at the gateway base URL and use your Braintrust API key for authentication.

  <CodeGroup>
    ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    import { OpenAI } from "openai";

    const client = new OpenAI({
      baseURL: "https://gateway.braintrust.dev/v1",
      apiKey: process.env.BRAINTRUST_API_KEY,
    });

    const response = await client.chat.completions.create({
      model: "claude-sonnet-4-5-20250929",
      messages: [{ role: "user", content: "What is a gateway?" }],
      seed: 1, // A seed activates the gateway's cache
    });
    ```
  </CodeGroup>

  <h3 id="structured-outputs-typescript">
    Structured outputs
  </h3>

  Anthropic supports structured outputs natively via the `output_config` parameter on the Anthropic SDK, or through the [Braintrust gateway](/deploy/gateway) using an OpenAI-shaped `response_format`.

  <CodeGroup>
    ```typescript Native Anthropic SDK theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    import Anthropic from "@anthropic-ai/sdk";
    import { zodOutputFormat } from "@anthropic-ai/sdk/helpers/zod";
    import { z } from "zod";

    const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

    // Define a Zod schema for the response
    const ResponseSchema = z.object({
      name: z.string(),
      age: z.number(),
    });

    const message = await client.messages.parse({
      model: "claude-sonnet-4-5-20250929",
      max_tokens: 1024,
      messages: [
        { role: "user", content: "My name is John and I'm 30 years old." },
      ],
      output_config: {
        format: zodOutputFormat(ResponseSchema),
      },
    });

    console.log(message.parsed_output); // { name: "John", age: 30 }
    ```

    ```typescript Via Braintrust gateway theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    import { OpenAI } from "openai";
    import { zodResponseFormat } from "openai/helpers/zod";
    import { z } from "zod";

    const client = new OpenAI({
      baseURL: "https://gateway.braintrust.dev/v1",
      apiKey: process.env.BRAINTRUST_API_KEY,
    });

    // Define a Zod schema for the response
    const ResponseSchema = z.object({
      name: z.string(),
      age: z.number(),
    });

    const completion = await client.beta.chat.completions.parse({
      model: "claude-sonnet-4-5-20250929",
      messages: [
        { role: "system", content: "Extract the person's name and age." },
        { role: "user", content: "My name is John and I'm 30 years old." },
      ],
      // Convert the Zod schema into a JSON Schema response format named "person"
      response_format: zodResponseFormat(ResponseSchema, "person"),
    });
    ```
  </CodeGroup>

  <h3 id="what-traced-typescript">
    What Braintrust traces
  </h3>

  Braintrust emits spans for the Anthropic SDK's messages API and beta tool runner. Message spans capture the input messages, system prompt, model, request parameters, response content, stop reason, and stop sequence. Beta tool-runner spans capture task input, tools, response messages, and aggregated metrics across iterations.

  **Spans**

  | Span                                 | Coverage                             |
  | ------------------------------------ | ------------------------------------ |
  | `anthropic.messages.create`          | Messages API (and beta messages)     |
  | `anthropic.beta.messages.toolRunner` | Beta messages tool-runner iterations |

  **Metrics**

  | Metric                         | Description                                                                          |
  | ------------------------------ | ------------------------------------------------------------------------------------ |
  | `prompt_tokens`                | Input tokens                                                                         |
  | `completion_tokens`            | Output tokens                                                                        |
  | `prompt_cached_tokens`         | Tokens read from the prompt cache                                                    |
  | `prompt_cache_creation_tokens` | Tokens written to the prompt cache                                                   |
  | `time_to_first_token`          | First-token latency (streaming only)                                                 |
  | `server_tool_use_*`            | Server-side tool usage counters (for example, `server_tool_use_web_search_requests`) |

  <h3 id="tracing-resources-typescript">
    Tracing resources
  </h3>

  * [Braintrust JavaScript SDK](https://github.com/braintrustdata/braintrust-sdk-javascript)
  * [Anthropic TypeScript SDK](https://github.com/anthropics/anthropic-sdk-typescript)
  * [Anthropic Messages API reference](https://docs.anthropic.com/en/api/messages)
  * [Prompt caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching)

  <h2 id="evals-typescript">
    Evals
  </h2>

  Evaluations turn the non-deterministic outputs of Anthropic models into a repeatable feedback loop. The Braintrust `Eval` function takes a dataset of user inputs, a task, and a set of scorers. To learn more about evaluations, see the [Experiments](/evaluate/run-evaluations) guide.

  <h3 id="basic-eval-setup-typescript">
    Basic eval setup
  </h3>

  Evaluate the outputs of Anthropic models with Braintrust.

  <CodeGroup>
    ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    import { Eval } from "braintrust";
    import Anthropic from "@anthropic-ai/sdk";

    const client = new Anthropic({
      apiKey: process.env.ANTHROPIC_API_KEY,
    });

    Eval("Anthropic Evaluation", {
      // An array of user inputs and expected outputs
      data: () => [
        { input: "What is 2+2?", expected: "4" },
        { input: "What is the capital of France?", expected: "Paris" },
      ],
      task: async (input) => {
        // Your Anthropic LLM call
        const response = await client.messages.create({
          model: "claude-sonnet-4-5-20250929",
          max_tokens: 1024,
          messages: [{ role: "user", content: input }],
        });
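        // Only text blocks carry a `text` field, so narrow the block type first
        const block = response.content[0];
        return block.type === "text" ? block.text : "";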
      },
      scores: [
        // A simple scorer that returns 1 if the output matches the expected output, 0 otherwise
        ({ output, expected }) => ({
          name: "accuracy",
          score: output === expected ? 1 : 0,
        }),
      ],
    });
    ```
  </CodeGroup>

  <Tip>
    Learn more about eval [data](/annotate/datasets) and [scorers](/evaluate/write-scorers).
  </Tip>

  <h3 id="llm-judge-typescript">
    Use Anthropic as an LLM judge
  </h3>

  You can use Anthropic models to score the outputs of other AI systems. This example uses `LLMClassifierFromSpec` to build a scorer that rates how relevant an output is.

  Install the [`autoevals`](/evaluate/autoevals) package to use the `LLMClassifierFromSpec` scorer.

  <CodeGroup>
    ```bash pnpm theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    pnpm add autoevals
    ```

    ```bash npm theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    npm install autoevals
    ```
  </CodeGroup>

  Create a relevance scorer with `LLMClassifierFromSpec`. You can then add `relevanceScorer` to the `scores` array of your `Eval` call (see above).

  <CodeGroup>
    ```typescript TypeScript theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    import { LLMClassifierFromSpec } from "autoevals";

    const relevanceScorer = LLMClassifierFromSpec("Relevance", {
      choice_scores: { Relevant: 1, Irrelevant: 0 },
      model: "claude-sonnet-4-5-20250929",
      use_cot: true,
    });
    ```
  </CodeGroup>
</View>

<View title="Python" icon="https://img.logo.dev/python.org?token=pk_BdcHD9e5SCW3j1rnJkNyMQ">
  <h2 id="tracing-python">
    Tracing
  </h2>

  Braintrust traces Anthropic calls automatically with `auto_instrument()`, or manually with `wrap_anthropic`. Either path produces the same spans.

  Pick the tracing path that fits your application. Auto-instrumentation is the recommended path for most users.

  <Tabs>
    <Tab title="Auto-instrumentation">
      <h3 id="setup-python-auto">
        Setup
      </h3>

      Install the Braintrust SDK alongside the Anthropic SDK, then configure your API keys.

      <Steps>
        <Step title="Install packages">
          ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          pip install braintrust anthropic
          ```
        </Step>

        <Step title="Get an Anthropic API key">
          Visit [Anthropic's Console](https://console.anthropic.com/settings/keys) and create a new API key, then [add it as a Braintrust AI provider](#add-anthropic-as-an-ai-provider).
        </Step>

        <Step title="Set environment variables">
          ```bash title=".env" theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          ANTHROPIC_API_KEY=<your-anthropic-api-key>
          BRAINTRUST_API_KEY=<your-braintrust-api-key>

          # For organizations on the EU data plane, use https://api-eu.braintrust.dev
          # For self-hosted deployments, use your data plane URL
          # BRAINTRUST_API_URL=<your-braintrust-api-url>
          ```
        </Step>
      </Steps>
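
      Python does not read a `.env` file on its own, so the variables must be exported in your shell or loaded by a tool of your choice before the app starts. The sketch below is a hypothetical startup check, not part of the Braintrust SDK. It fails fast when a key is missing and falls back to `https://api.braintrust.dev` when `BRAINTRUST_API_URL` is unset; that default URL is an assumption, so confirm it for your deployment.

      ```python
      import os

      DEFAULT_API_URL = "https://api.braintrust.dev"  # Assumed default data plane URL


      def resolve_settings(env=None):
          """Collect the variables from the step above and fail fast if a key is missing."""
          env = os.environ if env is None else env
          required = ("ANTHROPIC_API_KEY", "BRAINTRUST_API_KEY")
          missing = [name for name in required if not env.get(name)]
          if missing:
              raise RuntimeError("Missing environment variables: " + ", ".join(missing))
          return {
              "anthropic_api_key": env["ANTHROPIC_API_KEY"],
              "braintrust_api_key": env["BRAINTRUST_API_KEY"],
              # Optional override for the EU data plane or self-hosted deployments
              "braintrust_api_url": env.get("BRAINTRUST_API_URL") or DEFAULT_API_URL,
          }
      ```

      Call `resolve_settings()` once at startup, before you create any clients, so a missing key surfaces as a clear error instead of a failed request.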

      <h3 id="trace-python-auto">
        Trace your application
      </h3>

      To trace Anthropic calls without modifying your client construction, call `auto_instrument()` once at startup. Braintrust patches the `anthropic` SDK so every `messages.create`, `messages.stream`, beta messages, batches, and managed agents call emits a span.

      <CodeGroup>
        ```python Python theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
        import os
        import anthropic
        from braintrust import auto_instrument, init_logger

        init_logger(project="My Project")  # Replace with your project name
        auto_instrument()

        client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

        result = client.messages.create(
            model="claude-sonnet-4-5-20250929",
            max_tokens=1024,
            messages=[{"role": "user", "content": "What is machine learning?"}],
        )
        ```
      </CodeGroup>
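
      To build intuition for what "patching" means here, the toy sketch below replaces a method on a class with a wrapper that records a span per call. It illustrates the general monkeypatching technique only and is not Braintrust's implementation. `FakeMessages`, `patch_method`, and `recorded_spans` are hypothetical names, and the fake client never calls Anthropic.

      ```python
      import functools
      import time

      recorded_spans = []  # Stand-in for a tracing backend


      def patch_method(cls, method_name, span_name):
          """Replace cls.method_name with a wrapper that records one span per call."""
          original = getattr(cls, method_name)

          @functools.wraps(original)
          def traced(self, *args, **kwargs):
              start = time.perf_counter()
              output = original(self, *args, **kwargs)
              recorded_spans.append(
                  {
                      "name": span_name,
                      "input": kwargs,
                      "output": output,
                      "duration_s": time.perf_counter() - start,
                  }
              )
              return output

          setattr(cls, method_name, traced)


      class FakeMessages:
          """Hypothetical stand-in for an SDK resource; not the real anthropic client."""

          def create(self, **kwargs):
              return {"content": "echo: " + kwargs["messages"][0]["content"]}


      patch_method(FakeMessages, "create", "anthropic.messages.create")
      reply = FakeMessages().create(
          model="example-model",
          messages=[{"role": "user", "content": "hi"}],
      )
      ```

      Because the patch is applied to the class rather than to one instance, every client is covered without changing how it is constructed.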
    </Tab>

    <Tab title="Manual instrumentation">
      <h3 id="setup-python-manual">
        Setup
      </h3>

      Install the Braintrust SDK alongside the Anthropic SDK, then configure your API keys.

      <Steps>
        <Step title="Install packages">
          ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          pip install braintrust anthropic
          ```
        </Step>

        <Step title="Get an Anthropic API key">
          Visit [Anthropic's Console](https://console.anthropic.com/settings/keys) and create a new API key, then [add it as a Braintrust AI provider](#add-anthropic-as-an-ai-provider).
        </Step>

        <Step title="Set environment variables">
          ```bash title=".env" theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          ANTHROPIC_API_KEY=<your-anthropic-api-key>
          BRAINTRUST_API_KEY=<your-braintrust-api-key>

          # For organizations on the EU data plane, use https://api-eu.braintrust.dev
          # For self-hosted deployments, use your data plane URL
          # BRAINTRUST_API_URL=<your-braintrust-api-url>
          ```
        </Step>
      </Steps>

      <h3 id="trace-python-manual">
        Trace your application
      </h3>

      To trace Anthropic calls manually, wrap your client with `wrap_anthropic`. Once wrapped, every `messages.create` call (including streaming) emits a span.

      <CodeGroup>
        ```python Python theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
        import os

        import anthropic
        from braintrust import init_logger, wrap_anthropic

        # Initialize the Braintrust logger
        logger = init_logger(project="My Project")

        # Wrap the Anthropic client with the Braintrust logger
        client = wrap_anthropic(anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"]))

        # All API calls are automatically logged
        result = client.messages.create(
            model="claude-sonnet-4-5-20250929",
            max_tokens=1024,
            messages=[{"role": "user", "content": "What is machine learning?"}],
        )
        ```
      </CodeGroup>

      <Tip>
        For more control over tracing, learn how to [customize traces](/instrument/advanced-tracing).
      </Tip>
    </Tab>
  </Tabs>

  <h3 id="gateway-python">
    Gateway
  </h3>

  To call Anthropic through the [Braintrust gateway](/deploy/gateway), point an OpenAI-compatible client at the gateway base URL and use your Braintrust API key for authentication.

  <CodeGroup>
    ```python Python theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    import os

    from openai import OpenAI

    client = OpenAI(
        base_url="https://gateway.braintrust.dev/v1",
        api_key=os.environ["BRAINTRUST_API_KEY"],
    )

    response = client.chat.completions.create(
        model="claude-sonnet-4-5-20250929",
        messages=[{"role": "user", "content": "What is a gateway?"}],
        seed=1,  # A seed activates the gateway's cache
    )
    ```
  </CodeGroup>
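
  "OpenAI-compatible" means the client sends a standard chat completions request with a bearer token. As a rough illustration, the sketch below builds the equivalent raw HTTP request with the standard library, without sending it. The `/chat/completions` path follows the OpenAI API convention, and `build_gateway_request` is a hypothetical helper name.

  ```python
  import json
  import urllib.request

  GATEWAY_BASE_URL = "https://gateway.braintrust.dev/v1"


  def build_gateway_request(api_key, model, prompt, seed=None):
      """Build (but do not send) a chat completions request for the gateway."""
      body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
      if seed is not None:
          body["seed"] = seed
      return urllib.request.Request(
          GATEWAY_BASE_URL + "/chat/completions",
          data=json.dumps(body).encode("utf-8"),
          headers={
              "Authorization": "Bearer " + api_key,
              "Content-Type": "application/json",
          },
          method="POST",
      )


  request = build_gateway_request(
      "<your-braintrust-api-key>",
      "claude-sonnet-4-5-20250929",
      "What is a gateway?",
      seed=1,
  )
  ```

  Passing `request` to `urllib.request.urlopen` would send it. In practice the OpenAI client shown above is the more convenient route, since it also handles retries and response parsing.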

  <h3 id="structured-outputs-python">
    Structured outputs
  </h3>

  Anthropic supports structured outputs natively via the `output_format` parameter on the Anthropic SDK, or through the [Braintrust gateway](/deploy/gateway) using an OpenAI-shaped `response_format`.

  <CodeGroup>
    ```python Native Anthropic SDK theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    import os

    import anthropic
    from pydantic import BaseModel


    class Person(BaseModel):
        name: str
        age: int


    client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

    message = client.messages.parse(
        model="claude-sonnet-4-5-20250929",
        max_tokens=1024,
        messages=[
            {"role": "user", "content": "My name is John and I'm 30 years old."},
        ],
        output_format=Person,
    )

    print(message.content[0].parsed_output)  # Person(name="John", age=30)
    ```

    ```python Via Braintrust gateway theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    import os

    from openai import OpenAI
    from pydantic import BaseModel


    class Person(BaseModel):
        name: str
        age: int


    client = OpenAI(
        base_url="https://gateway.braintrust.dev/v1",
        api_key=os.environ["BRAINTRUST_API_KEY"],
    )

    completion = client.beta.chat.completions.parse(
        model="claude-sonnet-4-5-20250929",
        messages=[
            {"role": "system", "content": "Extract the person's name and age."},
            {"role": "user", "content": "My name is John and I'm 30 years old."},
        ],
        response_format=Person,
    )
    ```
  </CodeGroup>
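
  With structured outputs, the model's reply is a JSON string that conforms to the schema, and the `parse` helpers validate it into a typed object for you. The standard-library sketch below shows roughly what that parse-and-validate step involves for the two `Person` fields. It uses a dataclass instead of Pydantic, and `parse_person` is a hypothetical helper, not an SDK function.

  ```python
  import json
  from dataclasses import dataclass


  @dataclass
  class Person:
      name: str
      age: int


  def parse_person(raw_json: str) -> Person:
      """Parse a JSON string and check it against the two fields of Person."""
      data = json.loads(raw_json)
      if not isinstance(data, dict):
          raise ValueError("expected a JSON object")
      if not isinstance(data.get("name"), str):
          raise ValueError("'name' must be a string")
      age = data.get("age")
      # bool is a subclass of int in Python, so reject it explicitly
      if not isinstance(age, int) or isinstance(age, bool):
          raise ValueError("'age' must be an integer")
      return Person(name=data["name"], age=age)


  person = parse_person('{"name": "John", "age": 30}')
  ```

  A reply such as `{"name": "John", "age": "30"}` raises `ValueError` here, which is the kind of mismatch that schema-constrained output is meant to prevent.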

  <h3 id="what-traced-python">
    What Braintrust traces
  </h3>

  Braintrust emits spans for the Anthropic SDK's messages, batches, and managed agents APIs. Messages spans capture the input messages, system prompt, response content, stop reason, stop sequence, and request parameters (`model`, `max_tokens`, `temperature`, `top_k`, `top_p`, `stop_sequences`, `tool_choice`, `tools`, `stream`, `thinking`, `output_config`, `output_format`).

  **Spans**

  | Span                                                     | Coverage                                                              |
  | -------------------------------------------------------- | --------------------------------------------------------------------- |
  | `anthropic.messages.create`, `anthropic.messages.stream` | Messages API (and beta equivalents under `anthropic.beta.messages.*`) |
  | `anthropic.messages.batches.*`                           | Batches API                                                           |
  | `anthropic.beta.agents.*`, `anthropic.beta.sessions.*`   | Managed agents and sessions                                           |

  **Metrics**

  | Metric                            | Description                                                                          |
  | --------------------------------- | ------------------------------------------------------------------------------------ |
  | `prompt_tokens`                   | Input tokens                                                                         |
  | `completion_tokens`               | Output tokens                                                                        |
  | `prompt_cached_tokens`            | Tokens read from the prompt cache                                                    |
  | `prompt_cache_creation_tokens`    | Tokens written to the prompt cache (aggregate)                                       |
  | `prompt_cache_creation_5m_tokens` | Tokens written to Anthropic's 5-minute ephemeral cache                               |
  | `prompt_cache_creation_1h_tokens` | Tokens written to Anthropic's 1-hour ephemeral cache                                 |
  | `time_to_first_token`             | First-token latency (streaming only)                                                 |
  | `server_tool_use_*`               | Server-side tool usage counters (for example, `server_tool_use_web_search_requests`) |
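
  To relate these metric names to the `usage` object that Anthropic returns, the sketch below shows one plausible mapping. The mapping itself is an assumption for illustration, not Braintrust's implementation. In particular, the real integration may compute `prompt_tokens` differently (for example, by including cached tokens). `usage_to_metrics` is a hypothetical helper.

  ```python
  def usage_to_metrics(usage: dict) -> dict:
      """Map an Anthropic-style usage payload to the metric names in the table above."""
      metrics = {
          "prompt_tokens": usage.get("input_tokens", 0),  # Assumed direct mapping
          "completion_tokens": usage.get("output_tokens", 0),
          "prompt_cached_tokens": usage.get("cache_read_input_tokens", 0),
          "prompt_cache_creation_tokens": usage.get("cache_creation_input_tokens", 0),
      }
      # Per-TTL breakdown of cache writes, when the payload includes it
      cache_creation = usage.get("cache_creation") or {}
      metrics["prompt_cache_creation_5m_tokens"] = cache_creation.get("ephemeral_5m_input_tokens", 0)
      metrics["prompt_cache_creation_1h_tokens"] = cache_creation.get("ephemeral_1h_input_tokens", 0)
      # Flatten counters such as {"web_search_requests": 2} into server_tool_use_* metrics
      for key, value in (usage.get("server_tool_use") or {}).items():
          metrics["server_tool_use_" + key] = value
      return metrics


  example = usage_to_metrics(
      {
          "input_tokens": 12,
          "output_tokens": 40,
          "cache_read_input_tokens": 900,
          "cache_creation_input_tokens": 300,
          "cache_creation": {"ephemeral_5m_input_tokens": 100, "ephemeral_1h_input_tokens": 200},
          "server_tool_use": {"web_search_requests": 2},
      }
  )
  ```

  In this sample the 5-minute and 1-hour counters add up to the aggregate `prompt_cache_creation_tokens`, matching the "aggregate" description in the table.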

  **Metadata**

  | Field                 | Description                           |
  | --------------------- | ------------------------------------- |
  | `usage_service_tier`  | Service tier that handled the request |
  | `usage_inference_geo` | Region that processed the request     |

  <h3 id="tracing-resources-python">
    Tracing resources
  </h3>

  * [Braintrust Python SDK](https://github.com/braintrustdata/braintrust-sdk-python)
  * [Anthropic Python SDK](https://github.com/anthropics/anthropic-sdk-python)
  * [Anthropic Messages API reference](https://docs.anthropic.com/en/api/messages)
  * [Prompt caching](https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching)

  <h2 id="evals-python">
    Evals
  </h2>

  Evaluations turn the non-deterministic outputs of Anthropic models into a repeatable feedback loop. The Braintrust `Eval` function takes a dataset of user inputs, a task, and a set of scorers. To learn more about evaluations, see the [Experiments](/evaluate/run-evaluations) guide.

  <h3 id="basic-eval-setup-python">
    Basic eval setup
  </h3>

  Evaluate the outputs of Anthropic models with Braintrust.

  <CodeGroup>
    ```python Python theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    import os

    import anthropic
    from braintrust import Eval

    client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

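    # Task: send each input to Claude and return the text of the first content block
    def task(input):
        response = client.messages.create(
            model="claude-sonnet-4-5-20250929",
            max_tokens=1024,
            messages=[{"role": "user", "content": input}],
        )
        return response.content[0].text


    # Scorer: 1 if the output exactly matches the expected answer, 0 otherwise
    def accuracy_scorer(output, expected, **kwargs):
        return 1 if output == expected else 0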

    Eval(
        "Anthropic Evaluation",
        data=[
            {"input": "What is 2+2?", "expected": "4"},
            {"input": "What is the capital of France?", "expected": "Paris"},
        ],
        task=task,
        scores=[accuracy_scorer],
    )
    ```
  </CodeGroup>
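
  Exact string equality is strict for free-form model output: a reply such as "The answer is 4." scores 0 against an expected value of "4". As a hedged sketch of one alternative, the scorer below normalizes case and whitespace and checks whether the expected answer appears in the output. `contains_scorer` is an illustrative name, not a Braintrust built-in; it assumes the same `(output, expected, **kwargs)` signature as `accuracy_scorer`.

  ```python theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
  def contains_scorer(output, expected, **kwargs):
      """Return 1 if the normalized expected answer appears in the normalized output."""
      if output is None or expected is None:
          return 0
      # Lowercase and collapse runs of whitespace so formatting differences are ignored
      normalized_output = " ".join(str(output).lower().split())
      normalized_expected = " ".join(str(expected).lower().split())
      if not normalized_expected:
          return 0
      return 1 if normalized_expected in normalized_output else 0


  print(contains_scorer("The capital of France is Paris.", "Paris"))  # 1
  print(contains_scorer("2 + 2 equals 4", "4"))  # 1
  print(contains_scorer("I am not sure.", "Paris"))  # 0
  ```

  Substring matching trades precision for leniency: an output of "14" also matches an expected "4", so treat this as a starting point rather than a finished scorer.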

  <Tip>
    Learn more about eval [data](/annotate/datasets) and [scorers](/evaluate/write-scorers).
  </Tip>

  <h3 id="llm-judge-python">
    Use Anthropic as an LLM judge
  </h3>

  You can use Anthropic models to score the outputs of other AI systems. This example uses the `LLMClassifierFromSpec` scorer to grade how relevant an output is.

  Install the [`autoevals`](/evaluate/autoevals) package to use the `LLMClassifierFromSpec` scorer.

  ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
  pip install autoevals
  ```

  Create a relevance scorer with `LLMClassifierFromSpec`, then add `relevance_scorer` to the `scores` list of your `Eval` call, as in the [basic eval setup](#basic-eval-setup-python).

  <CodeGroup>
    ```python Python theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    from autoevals import LLMClassifierFromSpec

    relevance_scorer = LLMClassifierFromSpec(
        "Relevance",
        choice_scores={"Relevant": 1, "Irrelevant": 0},
        model="claude-sonnet-4-5-20250929",
        use_cot=True,
    )
    ```
  </CodeGroup>
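
  Conceptually, a classifier-style judge asks the model to pick one of the labels in `choice_scores` and reports the number mapped to that label. The snippet below is a simplified illustration of that mapping only; `score_for_choice` is a hypothetical helper and does not reflect how `autoevals` is implemented.

  ```python theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
  choice_scores = {"Relevant": 1, "Irrelevant": 0}


  def score_for_choice(choice: str, choice_scores: dict) -> float:
      """Look up the numeric score for the label the judge selected."""
      if choice not in choice_scores:
          raise ValueError(f"Unexpected choice: {choice!r}")
      return float(choice_scores[choice])


  print(score_for_choice("Relevant", choice_scores))  # 1.0
  print(score_for_choice("Irrelevant", choice_scores))  # 0.0
  ```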
</View>

<View title="Ruby" icon="https://img.logo.dev/ruby-lang.org?token=pk_BdcHD9e5SCW3j1rnJkNyMQ">
  <h2 id="tracing-ruby">
    Tracing
  </h2>

  Braintrust traces Anthropic calls automatically when you load `braintrust/setup`, or manually with `Braintrust.instrument!`. Either path produces the same spans.

  Pick the tracing path that fits your application. Auto-instrumentation is the recommended path for most users.

  <Tabs>
    <Tab title="Auto-instrumentation">
      <h3 id="setup-ruby-auto">
        Setup
      </h3>

      Install the Braintrust gem alongside the Anthropic gem, then configure your API keys.

      <Steps>
        <Step title="Install gems">
          ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          gem install braintrust anthropic
          ```
        </Step>

        <Step title="Get an Anthropic API key">
          Visit [Anthropic's Console](https://console.anthropic.com/settings/keys) and create a new API key, then [add it as a Braintrust AI provider](#add-anthropic-as-an-ai-provider).
        </Step>

        <Step title="Set environment variables">
          ```bash title=".env" theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          ANTHROPIC_API_KEY=<your-anthropic-api-key>
          BRAINTRUST_API_KEY=<your-braintrust-api-key>
          BRAINTRUST_DEFAULT_PROJECT=<your-project-name>  # Project that spans are logged to

          # For organizations on the EU data plane, use https://api-eu.braintrust.dev
          # For self-hosted deployments, use your data plane URL
          # BRAINTRUST_API_URL=<your-braintrust-api-url>
          ```
        </Step>
      </Steps>

      <h3 id="trace-ruby-auto">
        Trace your application
      </h3>

      To trace Anthropic calls without modifying your client construction, load `braintrust/setup` early in your application. Braintrust intercepts `require "anthropic"` and patches the gem so every `messages.create` and `messages.stream` call (including the beta messages API) emits a span.

      <CodeGroup>
        ```ruby Ruby theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
        require 'braintrust/setup'
        require 'anthropic'

        client = Anthropic::Client.new(api_key: ENV.fetch('ANTHROPIC_API_KEY', nil))

        client.messages.create(
          model: 'claude-sonnet-4-5-20250929',
          max_tokens: 1024,
          messages: [{ role: 'user', content: 'What is machine learning?' }]
        )
        ```
      </CodeGroup>

      <Tip>
        In a Rails app, add `gem "braintrust", require: "braintrust/setup"` to your Gemfile to enable auto-instrumentation without an explicit `require` line.
      </Tip>
    </Tab>

    <Tab title="Manual instrumentation">
      <h3 id="setup-ruby-manual">
        Setup
      </h3>

      Install the Braintrust gem alongside the Anthropic gem, then configure your API keys.

      <Steps>
        <Step title="Install gems">
          ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          gem install braintrust anthropic
          ```
        </Step>

        <Step title="Get an Anthropic API key">
          Visit [Anthropic's Console](https://console.anthropic.com/settings/keys) and create a new API key, then [add it as a Braintrust AI provider](#add-anthropic-as-an-ai-provider).
        </Step>

        <Step title="Set environment variables">
          ```bash title=".env" theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          ANTHROPIC_API_KEY=<your-anthropic-api-key>
          BRAINTRUST_API_KEY=<your-braintrust-api-key>

          # For organizations on the EU data plane, use https://api-eu.braintrust.dev
          # For self-hosted deployments, use your data plane URL
          # BRAINTRUST_API_URL=<your-braintrust-api-url>
          ```
        </Step>
      </Steps>

      <h3 id="trace-ruby-manual">
        Trace your application
      </h3>

      To trace Anthropic calls manually, instrument your client with `Braintrust.instrument!`. Once instrumented, every `messages.create` call (including streaming) emits a span.

      <CodeGroup>
        ```ruby Ruby theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
        require 'braintrust'
        require 'anthropic'

        # Initialize Braintrust
        Braintrust.init(default_project: 'My Project')

        # Create Anthropic client
        client = Anthropic::Client.new(api_key: ENV.fetch('ANTHROPIC_API_KEY', nil))

        # Instrument the client with Braintrust tracing
        Braintrust.instrument!(:anthropic, target: client)

        # All API calls are automatically logged
        client.messages.create(
          model: 'claude-sonnet-4-5-20250929',
          max_tokens: 1024,
          messages: [{ role: 'user', content: 'What is machine learning?' }]
        )
        ```
      </CodeGroup>

      <Tip>
        For more control over tracing, learn how to [customize traces](/instrument/advanced-tracing).
      </Tip>
    </Tab>
  </Tabs>

  <h3 id="what-traced-ruby">
    What Braintrust traces
  </h3>

  Braintrust emits spans for the Anthropic SDK's messages API. Each span captures the input messages, system prompt, response content, stop reason, stop sequence, and request parameters (`model`, `max_tokens`, `temperature`, `top_p`, `top_k`, `stop_sequences`, `tools`, `tool_choice`, `thinking`, `metadata`, `service_tier`).

  **Spans**

  | Span                                                     | Coverage                            |
  | -------------------------------------------------------- | ----------------------------------- |
  | `anthropic.messages.create`, `anthropic.messages.stream` | Messages API (and beta equivalents) |

  **Metrics**

  | Metric                         | Description                          |
  | ------------------------------ | ------------------------------------ |
  | `prompt_tokens`                | Input tokens                         |
  | `completion_tokens`            | Output tokens                        |
  | `prompt_cached_tokens`         | Tokens read from the prompt cache    |
  | `prompt_cache_creation_tokens` | Tokens written to the prompt cache   |
  | `time_to_first_token`          | First-token latency (streaming only) |

  <h3 id="tracing-resources-ruby">
    Tracing resources
  </h3>

  * [Braintrust Ruby SDK](https://github.com/braintrustdata/braintrust-sdk-ruby)
  * [Anthropic Ruby SDK](https://github.com/anthropics/anthropic-sdk-ruby)
  * [Anthropic Messages API reference](https://docs.anthropic.com/en/api/messages)

  <h2 id="evals-ruby">
    Evals
  </h2>

  Evaluations distill the non-deterministic outputs of Anthropic models into an effective feedback loop. The Braintrust `Eval` API is composed of a dataset of user inputs, a task, and a set of scorers. To learn more about evaluations, see the [Experiments](/evaluate/run-evaluations) guide.

  <h3 id="basic-eval-setup-ruby">
    Basic eval setup
  </h3>

  Evaluate the outputs of Anthropic models with Braintrust.

  <CodeGroup>
    ```ruby Ruby theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    require 'braintrust'
    require 'anthropic'

    Braintrust.init

    client = Anthropic::Client.new(api_key: ENV.fetch('ANTHROPIC_API_KEY', nil))

    Braintrust::Eval.run(
      project: 'Anthropic Evaluation',
      experiment: 'basic-eval',
      # An array of user inputs and expected outputs
      cases: [
        { input: 'What is 2+2?', expected: '4' },
        { input: 'What is the capital of France?', expected: 'Paris' }
      ],
      # Your Anthropic LLM call
      task: lambda do |input|
        response = client.messages.create(
          model: 'claude-sonnet-4-5-20250929',
          max_tokens: 1024,
          messages: [{ role: 'user', content: input }]
        )
        response.content[0].text
      end,
      # A simple scorer that returns 1 if the output matches the expected output, 0 otherwise
      scorers: [
        Braintrust::Eval.scorer('accuracy') do |_input, expected, output|
          output == expected ? 1.0 : 0.0
        end
      ]
    )
    ```
  </CodeGroup>

  <Tip>
    Learn more about eval [data](/annotate/datasets) and [scorers](/evaluate/write-scorers).
  </Tip>
</View>

<View title="Go" icon="https://img.logo.dev/go.dev?token=pk_BdcHD9e5SCW3j1rnJkNyMQ">
  <h2 id="tracing-go">
    Tracing
  </h2>

  The Braintrust Go SDK ships an Anthropic middleware that you can attach manually, or apply automatically at compile time with [Orchestrion](https://github.com/DataDog/orchestrion). Either path produces the same traces.

  Pick the tracing path that fits your application. Auto-instrumentation is the recommended path for most users.

  <Tabs>
    <Tab title="Auto-instrumentation">
      <h3 id="setup-go-auto">
        Setup
      </h3>

      Install the Braintrust Go SDK alongside the Anthropic Go SDK, then configure your API keys.

      <Steps>
        <Step title="Install packages">
          ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          go get github.com/braintrustdata/braintrust-sdk-go
          go get github.com/braintrustdata/braintrust-sdk-go/trace/contrib/anthropic
          go get github.com/anthropics/anthropic-sdk-go
          ```
        </Step>

        <Step title="Get an Anthropic API key">
          Visit [Anthropic's Console](https://console.anthropic.com/settings/keys) and create a new API key, then [add it as a Braintrust AI provider](#add-anthropic-as-an-ai-provider).
        </Step>

        <Step title="Set environment variables">
          ```bash title=".env" theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          ANTHROPIC_API_KEY=<your-anthropic-api-key>
          BRAINTRUST_API_KEY=<your-braintrust-api-key>

          # For organizations on the EU data plane, use https://api-eu.braintrust.dev
          # For self-hosted deployments, use your data plane URL
          # BRAINTRUST_API_URL=<your-braintrust-api-url>
          ```
        </Step>
      </Steps>

      <h3 id="trace-go-auto">
        Trace your application
      </h3>

      To trace Anthropic calls without modifying your application code, build your app with [Orchestrion](https://github.com/DataDog/orchestrion), a compile-time tool that rewrites every `anthropic.NewClient` call to attach Braintrust's middleware automatically.

      <Steps>
        <Step title="Install Orchestrion">
          ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          go install github.com/DataDog/orchestrion@latest
          ```
        </Step>

        <Step title="Create orchestrion.tool.go in your project root">
          ```go title="orchestrion.tool.go" #skip-compile #skip-format theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          //go:build tools

          package main

          import (
          	_ "github.com/DataDog/orchestrion"
          	_ "github.com/braintrustdata/braintrust-sdk-go/trace/contrib/anthropic"
          )
          ```
        </Step>

        <Step title="Write your app">
          Orchestrion instruments every `anthropic.NewClient` call at build time, so no middleware wiring is needed.

          <CodeGroup>
            ```go Go #skip-compile #skip-format theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
            package main

            import (
            	"context"
            	"log"
            	"os"

            	"github.com/anthropics/anthropic-sdk-go"
            	"go.opentelemetry.io/otel"
            	"go.opentelemetry.io/otel/sdk/trace"

            	"github.com/braintrustdata/braintrust-sdk-go"
            )

            func main() {
            	ctx := context.Background()

            	tp := trace.NewTracerProvider()
            	defer tp.Shutdown(ctx)
            	otel.SetTracerProvider(tp)

            	_, err := braintrust.New(tp,
            		braintrust.WithProject("My Project"),
            		braintrust.WithAPIKey(os.Getenv("BRAINTRUST_API_KEY")),
            	)
            	if err != nil {
            		log.Fatal(err)
            	}

            	client := anthropic.NewClient()

            	message, err := client.Messages.New(ctx, anthropic.MessageNewParams{
            		Model: anthropic.ModelClaudeSonnet4_5_20250929,
            		Messages: []anthropic.MessageParam{
            			anthropic.NewUserMessage(anthropic.NewTextBlock("What is machine learning?")),
            		},
            		MaxTokens: 1024,
            	})
            	if err != nil {
            		log.Fatal(err)
            	}
            	_ = message
            }
            ```
          </CodeGroup>
        </Step>

        <Step title="Build and run with Orchestrion">
          Build with Orchestrion to enable auto-instrumentation:

          ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          go mod tidy
          orchestrion go build -o myapp
          ./myapp
          ```

          <Accordion title="Enable Orchestrion via GOFLAGS">
            Instead of running `orchestrion go build`, you can set a `GOFLAGS` environment variable to enable Orchestrion for normal `go build` commands:

            ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
            export GOFLAGS="-toolexec='orchestrion toolexec'"
            go build ./...
            ```
          </Accordion>
        </Step>
      </Steps>
    </Tab>

    <Tab title="Manual instrumentation">
      <h3 id="setup-go-manual">
        Setup
      </h3>

      Install the Braintrust Go SDK alongside the Anthropic Go SDK, then configure your API keys.

      <Steps>
        <Step title="Install packages">
          ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          go get github.com/braintrustdata/braintrust-sdk-go
          go get github.com/braintrustdata/braintrust-sdk-go/trace/contrib/anthropic
          go get github.com/anthropics/anthropic-sdk-go
          ```
        </Step>

        <Step title="Get an Anthropic API key">
          Visit [Anthropic's Console](https://console.anthropic.com/settings/keys) and create a new API key, then [add it as a Braintrust AI provider](#add-anthropic-as-an-ai-provider).
        </Step>

        <Step title="Set environment variables">
          ```bash title=".env" theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          ANTHROPIC_API_KEY=<your-anthropic-api-key>
          BRAINTRUST_API_KEY=<your-braintrust-api-key>

          # For organizations on the EU data plane, use https://api-eu.braintrust.dev
          # For self-hosted deployments, use your data plane URL
          # BRAINTRUST_API_URL=<your-braintrust-api-url>
          ```
        </Step>
      </Steps>

      <h3 id="trace-go-manual">
        Trace your application
      </h3>

      To trace Anthropic calls manually, attach Braintrust's tracing middleware yourself by passing `traceanthropic.NewMiddleware()` as an option on `anthropic.NewClient`. Once attached, every `Messages.New` call (including streaming) emits a span.

      <CodeGroup>
        ```go Go theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
        package main

        import (
        	"context"
        	"log"
        	"os"

        	"github.com/anthropics/anthropic-sdk-go"
        	"github.com/anthropics/anthropic-sdk-go/option"
        	"go.opentelemetry.io/otel"
        	"go.opentelemetry.io/otel/sdk/trace"

        	"github.com/braintrustdata/braintrust-sdk-go"
        	traceanthropic "github.com/braintrustdata/braintrust-sdk-go/trace/contrib/anthropic"
        )

        func main() {
        	ctx := context.Background()

        	// Set up OpenTelemetry TracerProvider
        	tp := trace.NewTracerProvider()
        	defer tp.Shutdown(ctx)
        	otel.SetTracerProvider(tp)

        	// Initialize Braintrust client
        	_, err := braintrust.New(tp,
        		braintrust.WithProject("My Project"),
        		braintrust.WithAPIKey(os.Getenv("BRAINTRUST_API_KEY")),
        	)
        	if err != nil {
        		log.Fatal(err)
        	}

        	// Create Anthropic client with tracing middleware
        	client := anthropic.NewClient(
        		option.WithMiddleware(traceanthropic.NewMiddleware()),
        	)

        	// All API calls are automatically logged
        	message, err := client.Messages.New(ctx, anthropic.MessageNewParams{
        		Model: anthropic.ModelClaudeSonnet4_5_20250929,
        		Messages: []anthropic.MessageParam{
        			anthropic.NewUserMessage(anthropic.NewTextBlock("What is machine learning?")),
        		},
        		MaxTokens: 1024,
        	})
        	if err != nil {
        		log.Fatal(err)
        	}
        	_ = message
        }
        ```
      </CodeGroup>

      <Tip>
        For more control over tracing, learn how to [customize traces](/instrument/advanced-tracing).
      </Tip>
    </Tab>
  </Tabs>

  <h3 id="what-traced-go">
    What Braintrust traces
  </h3>

  Braintrust emits spans for the Anthropic SDK's messages API. Each span captures the input messages, system prompt, response content, and request parameters (`model`, `max_tokens`, `temperature`, `top_p`, `top_k`, `stop_sequences`, `stream`, `tools`, `tool_choice`, `metadata`, `container`, `mcp_servers`, `service_tier`, `thinking`).

  **Spans**

  | Span                        | Coverage                |
  | --------------------------- | ----------------------- |
  | `anthropic.messages.create` | `/v1/messages` requests |

  **Metrics**

  | Metric                         | Description                          |
  | ------------------------------ | ------------------------------------ |
  | `prompt_tokens`                | Input tokens                         |
  | `completion_tokens`            | Output tokens                        |
  | `prompt_cached_tokens`         | Tokens read from the prompt cache    |
  | `prompt_cache_creation_tokens` | Tokens written to the prompt cache   |
  | `time_to_first_token`          | First-token latency (streaming only) |

  <h3 id="tracing-resources-go">
    Tracing resources
  </h3>

  * [Braintrust Go SDK](https://github.com/braintrustdata/braintrust-sdk-go)
  * [Anthropic Go SDK](https://github.com/anthropics/anthropic-sdk-go)
  * [Anthropic Messages API reference](https://docs.anthropic.com/en/api/messages)

  <h2 id="evals-go">
    Evals
  </h2>

  Evaluations distill the non-deterministic outputs of Anthropic models into an effective feedback loop. The Braintrust `Evaluator` API is composed of a dataset of user inputs, a task, and a set of scorers. To learn more about evaluations, see the [Experiments](/evaluate/run-evaluations) guide.

  <h3 id="basic-eval-setup-go">
    Basic eval setup
  </h3>

  Evaluate the outputs of Anthropic models with Braintrust.

  <CodeGroup>
    ```go Go theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    package main

    import (
    	"context"
    	"log"
    	"os"

    	"github.com/anthropics/anthropic-sdk-go"
    	"github.com/anthropics/anthropic-sdk-go/option"
    	"go.opentelemetry.io/otel"
    	"go.opentelemetry.io/otel/sdk/trace"

    	"github.com/braintrustdata/braintrust-sdk-go"
    	"github.com/braintrustdata/braintrust-sdk-go/eval"
    	traceanthropic "github.com/braintrustdata/braintrust-sdk-go/trace/contrib/anthropic"
    )

    func main() {
    	ctx := context.Background()

    	// Set up OpenTelemetry TracerProvider
    	tp := trace.NewTracerProvider()
    	defer tp.Shutdown(ctx)
    	otel.SetTracerProvider(tp)

    	// Initialize Braintrust
    	bt, err := braintrust.New(tp,
    		braintrust.WithAPIKey(os.Getenv("BRAINTRUST_API_KEY")),
    	)
    	if err != nil {
    		log.Fatal(err)
    	}

    	// Create Anthropic client with tracing
    	client := anthropic.NewClient(
    		option.WithMiddleware(traceanthropic.NewMiddleware()),
    	)

    	// Create evaluator
    	evaluator := braintrust.NewEvaluator[string, string](bt)

    	// Run evaluation
    	_, err = evaluator.Run(ctx, eval.Opts[string, string]{
    		Experiment: "Anthropic Evaluation",
    		// Dataset of user inputs and expected outputs
    		Dataset: eval.NewDataset([]eval.Case[string, string]{
    			{Input: "What is 2+2?", Expected: "4"},
    			{Input: "What is the capital of France?", Expected: "Paris"},
    		}),
    		// Task function with Anthropic LLM call
    		Task: eval.T(func(ctx context.Context, input string) (string, error) {
    			message, err := client.Messages.New(ctx, anthropic.MessageNewParams{
    				Model: anthropic.ModelClaudeSonnet4_5_20250929,
    				Messages: []anthropic.MessageParam{
    					anthropic.NewUserMessage(anthropic.NewTextBlock(input)),
    				},
    				MaxTokens: 1024,
    			})
    			if err != nil {
    				return "", err
    			}
    			return message.Content[0].Text, nil
    		}),
    		// Simple scorer that returns 1 if output matches expected, 0 otherwise
    		Scorers: []eval.Scorer[string, string]{
    			eval.NewScorer("accuracy", func(ctx context.Context, r eval.TaskResult[string, string]) (eval.Scores, error) {
    				score := 0.0
    				if r.Output == r.Expected {
    					score = 1.0
    				}
    				return eval.S(score), nil
    			}),
    		},
    	})
    	if err != nil {
    		log.Fatal(err)
    	}
    }
    ```
  </CodeGroup>

  <Tip>
    Learn more about eval [data](/annotate/datasets) and [scorers](/evaluate/write-scorers).
  </Tip>
</View>

<View title="Java" icon="https://img.logo.dev/java.com?token=pk_BdcHD9e5SCW3j1rnJkNyMQ">
  <h2 id="tracing-java">
    Tracing
  </h2>

  The Braintrust Java SDK ships an Anthropic interceptor that you can attach manually with `BraintrustAnthropic.wrap()`, or have applied automatically by the [Braintrust Java agent](/instrument/trace-llm-calls#auto-instrumentation). Both paths produce the same spans.

  Pick the tracing path that fits your application. Auto-instrumentation is the recommended path for most users.

  <Tabs>
    <Tab title="Auto-instrumentation">
      <h3 id="setup-java-auto">
        Setup
      </h3>

      Install the Braintrust Java SDK alongside the Anthropic Java SDK, then configure your API keys.

      <Steps>
        <Step title="Install packages">
          ```groovy theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          // Add to the dependencies {} block in build.gradle
          implementation 'dev.braintrust:braintrust-sdk-java:<version-goes-here>'
          implementation 'com.anthropic:anthropic-java:<version-goes-here>'
          ```
        </Step>

        <Step title="Get an Anthropic API key">
          Visit [Anthropic's Console](https://console.anthropic.com/settings/keys) and create a new API key, then [add it as a Braintrust AI provider](#add-anthropic-as-an-ai-provider).
        </Step>

        <Step title="Set environment variables">
          ```bash title=".env" theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          ANTHROPIC_API_KEY=<your-anthropic-api-key>
          BRAINTRUST_API_KEY=<your-braintrust-api-key>

          # For organizations on the EU data plane, use https://api-eu.braintrust.dev
          # For self-hosted deployments, use your data plane URL
          # BRAINTRUST_API_URL=<your-braintrust-api-url>
          ```
        </Step>
      </Steps>

      <h3 id="trace-java-auto">
        Trace your application
      </h3>

      To trace Anthropic calls without modifying your application code, attach the [`braintrust-java-agent`](/instrument/trace-llm-calls#auto-instrumentation) at JVM startup. The agent intercepts every Anthropic client build and applies the Braintrust interceptor automatically.

      <Steps>
        <Step title="Add the agent dependency">
          ```groovy theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          // build.gradle
          configurations {
              braintrustAgent
          }

          dependencies {
              braintrustAgent 'dev.braintrust:braintrust-java-agent:+'
          }

          tasks.withType(JavaExec).configureEach {
              jvmArgs "-javaagent:${configurations.braintrustAgent.asPath}"
          }
          ```
        </Step>

        <Step title="Run your app">
          ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          ./gradlew run
          ```

          Anthropic client builds in your application code are now intercepted automatically. No call to `BraintrustAnthropic.wrap()` is required.
        </Step>
      </Steps>
    </Tab>

    <Tab title="Manual instrumentation">
      <h3 id="setup-java-manual">
        Setup
      </h3>

      Install the Braintrust Java SDK alongside the Anthropic Java SDK, then configure your API keys.

      <Steps>
        <Step title="Install packages">
          ```groovy theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          // Add to the dependencies {} block in build.gradle
          implementation 'dev.braintrust:braintrust-sdk-java:<version-goes-here>'
          implementation 'com.anthropic:anthropic-java:<version-goes-here>'
          ```
        </Step>

        <Step title="Get an Anthropic API key">
          Visit [Anthropic's Console](https://console.anthropic.com/settings/keys) and create a new API key, then [add it as a Braintrust AI provider](#add-anthropic-as-an-ai-provider).
        </Step>

        <Step title="Set environment variables">
          ```bash title=".env" theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
          ANTHROPIC_API_KEY=<your-anthropic-api-key>
          BRAINTRUST_API_KEY=<your-braintrust-api-key>

          # For organizations on the EU data plane, use https://api-eu.braintrust.dev
          # For self-hosted deployments, use your data plane URL
          # BRAINTRUST_API_URL=<your-braintrust-api-url>
          ```
        </Step>
      </Steps>

      <h3 id="trace-java-manual">
        Trace your application
      </h3>

      To trace Anthropic calls manually, wrap your client with `BraintrustAnthropic.wrap()`. Once wrapped, every `messages().create()` call (including streaming) emits a span.

      <CodeGroup>
        ```java Java theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
        import com.anthropic.client.AnthropicClient;
        import com.anthropic.client.okhttp.AnthropicOkHttpClient;
        import com.anthropic.models.messages.MessageCreateParams;
        import com.anthropic.models.messages.Model;
        import dev.braintrust.Braintrust;
        import dev.braintrust.instrumentation.anthropic.BraintrustAnthropic;

        class AnthropicTracing {
            public static void main(String[] args) {
                var braintrust = Braintrust.get();
                var openTelemetry = braintrust.openTelemetryCreate();

                // Wrap the Anthropic client with Braintrust instrumentation
                AnthropicClient client = BraintrustAnthropic.wrap(openTelemetry, AnthropicOkHttpClient.fromEnv());

                // All API calls are automatically logged
                var result = client.messages().create(
                    MessageCreateParams.builder()
                        .model(Model.CLAUDE_SONNET_4_5_20250929)
                        .maxTokens(1024)
                        .addUserMessage("What is machine learning?")
                        .build());
            }
        }
        ```
      </CodeGroup>

      <Tip>
        For more control over tracing, learn how to [customize traces](/instrument/advanced-tracing).
      </Tip>
    </Tab>
  </Tabs>

  <h3 id="what-traced-java">
    What Braintrust traces
  </h3>

  Braintrust emits spans for the Anthropic Messages API. Each span captures the input messages and response content.

  **Spans**

  | Span                         | Coverage                                         |
  | ---------------------------- | ------------------------------------------------ |
  | Anthropic Messages API spans | `messages().create()` calls, including streaming |

  **Metrics**

  | Metric                            | Description                                            |
  | --------------------------------- | ------------------------------------------------------ |
  | `prompt_tokens`                   | Input tokens                                           |
  | `completion_tokens`               | Output tokens                                          |
  | `prompt_cached_tokens`            | Tokens read from the prompt cache                      |
  | `prompt_cache_creation_tokens`    | Tokens written to the prompt cache (aggregate)         |
  | `prompt_cache_creation_5m_tokens` | Tokens written to Anthropic's 5-minute ephemeral cache |
  | `prompt_cache_creation_1h_tokens` | Tokens written to Anthropic's 1-hour ephemeral cache   |
  | `time_to_first_token`             | First-token latency (streaming only)                   |

  <h3 id="tracing-resources-java">
    Tracing resources
  </h3>

  * [Braintrust Java SDK](https://github.com/braintrustdata/braintrust-sdk-java)
  * [Trace LLM calls](/instrument/trace-llm-calls) for general Java agent setup
  * [Anthropic Java SDK](https://github.com/anthropics/anthropic-sdk-java)
  * [Anthropic Messages API reference](https://docs.anthropic.com/en/api/messages)

  <h2 id="evals-java">
    Evals
  </h2>

  Evaluations distill the non-deterministic outputs of Anthropic models into an effective feedback loop. To learn more about evaluations, see the [Experiments](/evaluate/run-evaluations) guide.

  <h3 id="basic-eval-setup-java">
    Basic eval setup
  </h3>

  Evaluate the outputs of Anthropic models with Braintrust.

  <CodeGroup>
    ```java Java theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    import com.anthropic.client.AnthropicClient;
    import com.anthropic.client.okhttp.AnthropicOkHttpClient;
    import com.anthropic.models.messages.MessageCreateParams;
    import com.anthropic.models.messages.Model;
    import dev.braintrust.Braintrust;
    import dev.braintrust.eval.DatasetCase;
    import dev.braintrust.eval.Scorer;
    import dev.braintrust.instrumentation.anthropic.BraintrustAnthropic;
    import java.util.function.Function;

    class AnthropicEvaluation {
        public static void main(String[] args) {
            var braintrust = Braintrust.get();
            var openTelemetry = braintrust.openTelemetryCreate();
            // Wrap the Anthropic client so its messages().create() calls are traced
            AnthropicClient client = BraintrustAnthropic.wrap(openTelemetry, AnthropicOkHttpClient.fromEnv());

            // Task under evaluation: send the input to Claude and return the first text block
            Function<String, String> taskFunction = (String input) -> {
                var request = MessageCreateParams.builder()
                    .model(Model.CLAUDE_SONNET_4_5_20250929)
                    .maxTokens(1024)
                    .addUserMessage(input)
                    .build();
                var response = client.messages().create(request);
                return response.content().get(0).text().map(block -> block.text()).orElse("");
            };

            var eval = braintrust.<String, String>evalBuilder()
                .name("Anthropic Evaluation")
                .cases(
                    DatasetCase.of("What is 2+2?", "4"),
                    DatasetCase.of("What is the capital of France?", "Paris"))
                .taskFunction(taskFunction)
                .scorers(
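                    // Scores 1.0 if the output contains either known answer,
                    // regardless of which case is being scored
                    Scorer.of("contains_answer", (evalCase, output) ->
                        output.contains("4") || output.contains("Paris") ? 1.0 : 0.0))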
                .build();

            var result = eval.run();
            System.out.println(result.createReportString());
        }
    }
    ```
  </CodeGroup>
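
  The `contains_answer` scorer above gives full credit to any output that contains either "4" or "Paris", whichever case is being scored. A stricter check compares each output against that case's own expected value. The sketch below shows that logic as a standalone Java function. `ExpectedAnswerScorer` is a hypothetical name, and wiring it into `Scorer.of` depends on how `DatasetCase` exposes its expected value, which this page does not show.

  ```java
  import java.util.Locale;

  // Hypothetical helper, not part of the Braintrust SDK.
  final class ExpectedAnswerScorer {
      private ExpectedAnswerScorer() {}

      // Returns 1.0 when the output contains the expected answer
      // (case-insensitive), otherwise 0.0. Null or blank inputs score 0.0.
      static double score(String expected, String output) {
          if (expected == null || output == null || expected.isBlank()) {
              return 0.0;
          }
          String haystack = output.toLowerCase(Locale.ROOT);
          String needle = expected.trim().toLowerCase(Locale.ROOT);
          return haystack.contains(needle) ? 1.0 : 0.0;
      }

      public static void main(String[] args) {
          System.out.println(score("4", "2 + 2 equals 4."));           // prints 1.0
          System.out.println(score("Paris", "The capital is paris.")); // prints 1.0
          System.out.println(score("Paris", "There are 4 seasons."));  // prints 0.0
      }
  }
  ```

  Substring matching is still lenient (an output of "14" would match an expected "4"), so treat this as a starting point, not a robust correctness check.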

  <Tip>
    Learn more about eval [data](/annotate/datasets) and [scorers](/evaluate/write-scorers).
  </Tip>
</View>

<View title=".NET" icon="https://img.logo.dev/dotnet.microsoft.com?token=pk_BdcHD9e5SCW3j1rnJkNyMQ">
  <h2 id="tracing-dotnet">
    Tracing
  </h2>

  The Braintrust .NET SDK ships Anthropic instrumentation, which you attach to a client with the `.WithBraintrust()` extension method.

  <h3 id="setup-dotnet">
    Setup
  </h3>

  Install the Braintrust .NET SDK alongside the Anthropic .NET SDK, then configure your API keys.

  <Steps>
    <Step title="Install packages">
      ```bash theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
      # Run from your project directory; each command adds a PackageReference to the .csproj file
      dotnet add package Braintrust.Sdk
      dotnet add package Braintrust.Sdk.Anthropic
      ```
    </Step>

    <Step title="Get an Anthropic API key">
      Visit [Anthropic's Console](https://console.anthropic.com/settings/keys) and create a new API key, then [add it as a Braintrust AI provider](#add-anthropic-as-an-ai-provider).
    </Step>

    <Step title="Set environment variables">
      ```bash title=".env" theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
      ANTHROPIC_API_KEY=<your-anthropic-api-key>
      BRAINTRUST_API_KEY=<your-braintrust-api-key>

      # For organizations on the EU data plane, use https://api-eu.braintrust.dev
      # For self-hosted deployments, use your data plane URL
      # BRAINTRUST_API_URL=<your-braintrust-api-url>
      ```
    </Step>
  </Steps>

  <h3 id="manual-instrumentation-dotnet">
    Manual instrumentation
  </h3>

  To trace Anthropic calls, wrap your client with `.WithBraintrust()`. Once wrapped, every `Messages.Create` call (including streaming) emits a span.

  <CodeGroup>
    ```csharp C# theme={"theme":{"light":"github-light","dark":"github-dark-dimmed"}}
    using System;
    using System.Collections.Generic;
    using System.Threading.Tasks;
    using Anthropic;
    using Anthropic.Models.Messages;
    using Braintrust.Sdk.Anthropic;

    class AnthropicTracing
    {
        static async Task Main(string[] args)
        {
            // Wrap the Anthropic client with Braintrust instrumentation
            var client = new AnthropicClient().WithBraintrust();

            // All API calls are automatically logged
            var result = await client.Messages.Create(new MessageCreateParams
            {
                Model = "claude-sonnet-4-5-20250929",
                MaxTokens = 1024,
                Messages = new List<MessageParam>
                {
                    new MessageParam { Role = "user", Content = "What is machine learning?" }
                }
            });

            if (result.Content[0].TryPickText(out var textBlock))
            {
                Console.WriteLine(textBlock.Text);
            }
        }
    }
    ```
  </CodeGroup>

  <Tip>
    For more control over tracing, learn how to [customize traces](/instrument/advanced-tracing).
  </Tip>

  <h3 id="what-traced-dotnet">
    What Braintrust traces
  </h3>

  Braintrust emits spans for the Anthropic Messages API. Each span captures the input messages (including the system prompt), the response content, and the model. Streaming spans also capture the stop reason and stop sequence. Request parameters such as max tokens, temperature, top-p, top-k, stop sequences, and tools are captured as metadata.

  **Spans**

  | Span                        | Coverage                     |
  | --------------------------- | ---------------------------- |
  | `anthropic.messages.create` | Non-streaming messages calls |
  | `Message Stream`            | Streaming messages calls     |

  **Metrics**

  | Metric                | Description                          |
  | --------------------- | ------------------------------------ |
  | `prompt_tokens`       | Input tokens                         |
  | `completion_tokens`   | Output tokens                        |
  | `tokens`              | Total tokens                         |
  | `time_to_first_token` | First-token latency (streaming only) |

  <h3 id="tracing-resources-dotnet">
    Tracing resources
  </h3>

  * [Braintrust .NET SDK](https://github.com/braintrustdata/braintrust-sdk-dotnet)
  * [Braintrust.Sdk.Anthropic](https://www.nuget.org/packages/Braintrust.Sdk.Anthropic)
  * [Anthropic .NET SDK](https://github.com/anthropics/anthropic-sdk-dotnet)
  * [Anthropic Messages API reference](https://docs.anthropic.com/en/api/messages)
</View>
