Evaluation Concepts
High-quality evaluations are key to creating, refining, and validating AI applications. LangSmith provides tools to structure these evaluations so that you can iterate efficiently, confirm that changes to your application improve performance, and ensure that your system continues to work as intended.
This guide explores LangSmith’s evaluation framework and core concepts, including:
- Datasets, which hold test examples: inputs for your application and, optionally, reference outputs.
- Evaluators, which assess how well your application’s outputs align with the desired criteria.
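
To make the two concepts concrete, here is a minimal sketch in plain Python. It deliberately does not use the LangSmith SDK: the names `Example`, `exact_match`, `run_evaluation`, and `toy_app` are hypothetical and exist only for illustration. It assumes a dataset is a list of examples (inputs plus an optional reference output) and an evaluator is a function that scores one application output against one example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Example:
    """One dataset entry: application inputs plus an optional reference output."""
    inputs: dict
    reference_output: Optional[str] = None


def exact_match(output: str, example: Example) -> float:
    """Hypothetical evaluator: 1.0 if the output matches the reference, else 0.0."""
    if example.reference_output is None:
        # Without a reference there is nothing to compare against.
        return 0.0
    same = output.strip().lower() == example.reference_output.strip().lower()
    return 1.0 if same else 0.0


def run_evaluation(
    app: Callable[[dict], str],
    dataset: list,
    evaluator: Callable[[str, Example], float],
) -> float:
    """Run the app on every example and return the mean evaluator score."""
    scores = [evaluator(app(example.inputs), example) for example in dataset]
    return sum(scores) / len(scores) if scores else 0.0


def toy_app(inputs: dict) -> str:
    """Stand-in for a real application (an assumption for this sketch)."""
    answers = {"What is 2 + 2?": "4"}
    return answers.get(inputs["question"], "I don't know")


dataset = [
    Example({"question": "What is 2 + 2?"}, "4"),
    Example({"question": "What is the capital of France?"}, "Paris"),
]

mean_score = run_evaluation(toy_app, dataset, exact_match)
print(f"mean exact-match score: {mean_score}")
```

Keeping the evaluator separate from the application means the same dataset can be reused to score each new version of the application, which is what makes before-and-after comparisons possible.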