# Autoevals TypeScript API

> TypeScript API reference for Autoevals v0.0.131

AutoEvals is a tool to quickly and easily evaluate AI model outputs.

## Installation

```bash
npm install autoevals
```

```bash
pnpm add autoevals
```

## RAGAS Evaluators

### AnswerCorrectness

Measures answer correctness compared to ground truth using a weighted average of factuality and semantic similarity.

### AnswerRelevancy

Scores the relevancy of the generated answer to the given question. Answers with incomplete, redundant, or unnecessary information are penalized.

### AnswerSimilarity

Scores the semantic similarity between the generated answer and ground truth.

### ContextEntityRecall

Estimates context recall by estimating true positives (TP) and false negatives (FN) using the annotated answer and the retrieved context.

### ContextPrecision

ContextPrecision evaluator function.

### ContextRecall

ContextRecall evaluator function.

### ContextRelevancy

ContextRelevancy evaluator function.

### Faithfulness

Measures factual consistency of the generated answer with the given context.

## LLM Evaluators

### Battle

Test whether an output performs the `instructions` *better* than the original (`expected`) value.

### ClosedQA

Test whether an output answers the `input` using knowledge built into the model. You can specify `criteria` to further constrain the answer.

### Factuality

Test whether an output is factual, compared to an original (`expected`) value.

### Humor

Test whether an output is funny.

### Possible

Test whether an output is a possible solution to the challenge posed in the input.

### Security

Test whether an output is malicious.

### Sql

Test whether a SQL query is semantically the same as a reference (output) query.

### Summary

Test whether an output is a better summary of the `input` than the original (`expected`) value.

### Translation

Test whether an `output` is as good a translation of the `input` in the specified `language` as an expert (`expected`) value.

## String Evaluators

### EmbeddingSimilarity

A scorer that uses cosine similarity to compare two strings.

### ExactMatch

A simple scorer that tests whether two values are equal. If the value is an object or array, it will be JSON-serialized and the strings compared for equality.

### Levenshtein

A simple scorer that uses the Levenshtein distance to compare two strings.

### LevenshteinScorer

LevenshteinScorer evaluator function.

## JSON Evaluators

### JSONDiff

A simple scorer that compares JSON objects, using a customizable comparison method for strings (defaults to Levenshtein) and numbers (defaults to NumericDiff).

### ValidJSON

A binary scorer that evaluates the validity of JSON output, optionally validating against a JSON Schema definition (see [https://json-schema.org/learn/getting-started-step-by-step#create](https://json-schema.org/learn/getting-started-step-by-step#create)).

## Custom Evaluators

### LLMClassifierFromSpec

LLMClassifierFromSpec evaluator function.

### LLMClassifierFromSpecFile

LLMClassifierFromSpecFile evaluator function.

### LLMClassifierFromTemplate

LLMClassifierFromTemplate evaluator function.

### OpenAIClassifier

OpenAIClassifier evaluator function.

### buildClassificationTools

buildClassificationTools evaluator function.

## List Evaluators

### ListContains

A scorer that semantically evaluates the overlap between two lists of strings. It works by computing the pairwise similarity between each element of the output and the expected value, and then using Linear Sum Assignment to find the best matching pairs.

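The pairwise-similarity-plus-assignment idea behind ListContains can be illustrated with a small self-contained sketch. This is not the autoevals implementation: the function names (`levenshteinDistance`, `similarity`, `bestAssignmentTotal`, `listOverlapScore`) are hypothetical, the similarity measure is assumed to be a length-normalized Levenshtein distance, the final normalization by the longer list is an assumption, and a brute-force search over matchings stands in for a real Linear Sum Assignment solver.

```typescript
// Levenshtein edit distance between two strings (classic dynamic programming).
function levenshteinDistance(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumed similarity in [0, 1]: 1 minus the distance divided by the longer length.
function similarity(a: string, b: string): number {
  const maxLen = Math.max(a.length, b.length);
  return maxLen === 0 ? 1 : 1 - levenshteinDistance(a, b) / maxLen;
}

// Brute-force stand-in for Linear Sum Assignment: explore every one-to-one
// matching of rows to columns and return the largest total similarity.
// Exponential in list size, so only suitable for short lists.
function bestAssignmentTotal(
  sim: number[][],
  row: number = 0,
  usedCols: Set<number> = new Set<number>(),
): number {
  if (row === sim.length) return 0;
  // Option 1: leave this row unmatched.
  let best = bestAssignmentTotal(sim, row + 1, usedCols);
  // Option 2: match this row with any column that is still free.
  for (let col = 0; col < sim[row].length; col++) {
    if (usedCols.has(col)) continue;
    usedCols.add(col);
    best = Math.max(best, sim[row][col] + bestAssignmentTotal(sim, row + 1, usedCols));
    usedCols.delete(col);
  }
  return best;
}

// Hypothetical overlap score: best total similarity divided by the longer list's length
// (an assumed normalization, so unmatched items lower the score).
function listOverlapScore(output: string[], expected: string[]): number {
  if (output.length === 0 && expected.length === 0) return 1;
  const sim = output.map((o) => expected.map((e) => similarity(o, e)));
  return bestAssignmentTotal(sim) / Math.max(output.length, expected.length);
}
```

Under these assumptions, two lists with the same items in a different order score 1, because the assignment step pairs each item with its match regardless of position. For longer lists, a polynomial-time assignment algorithm would replace the brute-force search.
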
## Moderation

### Moderation

A scorer that uses OpenAI's moderation API to determine whether an AI response contains ANY flagged content.

## Numeric Evaluators

### NumericDiff

A simple scorer that compares numbers by normalizing their difference.

## Configuration

### init

init evaluator function.

## Utilities

### makePartial

makePartial evaluator function.

### normalizeValue

normalizeValue evaluator function.

## Source Code

For the complete TypeScript source code and additional examples, visit the [autoevals GitHub repository](https://github.com/braintrustdata/autoevals).
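
As a small illustration of what "normalizing their difference" in the NumericDiff description could look like, here is a minimal sketch. The function name `normalizedNumericScore` is hypothetical, and the formula (one minus the absolute difference divided by the sum of the absolute values) is an assumption chosen for illustration; consult the repository for the scorer's actual behavior.

```typescript
// Hypothetical normalized numeric comparison; the formula is an assumption
// for illustration and is not copied from the autoevals source.
function normalizedNumericScore(output: number, expected: number): number {
  // Two zeros are identical; handle them first to avoid dividing by zero.
  if (output === 0 && expected === 0) return 1;
  const diff = Math.abs(expected - output);
  const scale = Math.abs(expected) + Math.abs(output);
  // Identical values score 1; values of opposite sign, or one zero value, score 0.
  return 1 - diff / scale;
}
```

With this normalization the score always falls in the range 0 to 1, so it can be averaged alongside other scorers that use the same range.
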