Tag: testing
All the articles with the tag "testing".
- genai, evaluation
Evaluating an LLM Agent Like Real Software: Observability and Evals with Langfuse
A vibe-check isn't a test. How to trace, score, and gate an LLM agent with Langfuse, and the silent escalation regression that evals catch but a demo never would.
- genai, llm
Testing LLM-Based Applications
LLMs are stochastic: the same prompt can yield different outputs, so exact-match tests break. How to test LLM apps with DeepEval and evals instead.