Tag: testing

All the articles with the tag "testing".

genai, evaluation
Evaluating an LLM Agent Like Real Software: Observability and Evals with Langfuse
A vibe check isn't a test. How to trace, score, and gate an LLM agent with Langfuse, and the silent escalation regression that evals catch and a demo never would.
12 Jun, 2026
genai, llm
Testing LLM-Based Applications
LLMs are stochastic: the same prompt can yield different outputs, so deterministic tests break. How to test LLM apps with DeepEval and evals instead.
8 Dec, 2024
