How I stopped burning money on LLM calls with Promptfoo
I used Promptfoo evals to tune cheaper models, catch regressions early, and stop wasting money on flaky LLM calls.
Lessons from shipping multiple defense layers in an AI fitness app, and the tradeoffs behind each one.
A TraceCollector that records every event in the agent's lifecycle and feeds them to an interactive flow graph.
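A collector like this can be sketched in a few lines. The event shape, class name, and `to_graph` output format below are my assumptions, not the app's actual API; the idea is just that every lifecycle event carries a parent link, so the recorded list compiles directly into the nodes and edges a flow-graph view needs.

```python
import time
from dataclasses import dataclass
from typing import Optional

@dataclass
class TraceEvent:
    # Hypothetical event shape: name, parent event, free-form payload, timestamp.
    name: str
    parent: Optional[str]
    payload: dict
    ts: float

class TraceCollector:
    """Records every event in an agent run and exposes them as a graph."""

    def __init__(self):
        self.events: list[TraceEvent] = []

    def record(self, name: str, parent: Optional[str] = None, **payload):
        self.events.append(TraceEvent(name, parent, payload, time.monotonic()))

    def to_graph(self) -> dict:
        # Nodes are event names; edges follow the parent links, which is
        # exactly what an interactive flow-graph renderer consumes.
        nodes = [e.name for e in self.events]
        edges = [(e.parent, e.name) for e in self.events if e.parent]
        return {"nodes": nodes, "edges": edges}
```

Recording `plan` → `tool_call` → `response` then yields a two-edge graph, so the UI can replay the run without instrumenting each component separately.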
A composable prompt architecture where named segments get assembled at request time based on context.
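A minimal sketch of that assembly step, assuming a dict of named segments and a request-context dict (the segment names and context keys here are invented for illustration):

```python
# Named prompt segments, authored and versioned independently.
SEGMENTS = {
    "base": "You are a fitness coach.",
    "injury": "The user reported a knee injury; avoid high-impact exercises.",
    "metric": "Use metric units in all answers.",
}

def build_prompt(context: dict) -> str:
    # Select segments based on the incoming request's context,
    # then assemble them into one system prompt at request time.
    names = ["base"]
    if context.get("injured"):
        names.append("injury")
    if context.get("units") == "metric":
        names.append("metric")
    return "\n\n".join(SEGMENTS[n] for n in names)
```

The payoff is that each segment can be evaluated and edited on its own, while the full prompt only ever contains what the current request needs.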
A structured query DSL that lets the model express intent as JSON while the server compiles it into safe, parameterised SQL.
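One way such a compiler can look, as a sketch: the model emits a JSON object constrained to an allowlist of fields and operators, and the server turns it into SQL with placeholders so no model output ever lands in the query string. Table, field, and operator names here are assumptions for illustration.

```python
ALLOWED_FIELDS = {"date", "exercise", "reps"}
ALLOWED_OPS = {"eq": "=", "gt": ">", "lt": "<"}

def compile_query(dsl: dict) -> tuple[str, list]:
    # dsl example: {"table": "workouts",
    #               "where": [{"field": "reps", "op": "gt", "value": 10}]}
    if dsl.get("table") != "workouts":
        raise ValueError("unknown table")
    clauses, params = [], []
    for cond in dsl.get("where", []):
        if cond["field"] not in ALLOWED_FIELDS or cond["op"] not in ALLOWED_OPS:
            raise ValueError("disallowed field or operator")
        # Only allowlisted identifiers reach the SQL text; values
        # go through driver placeholders, never string formatting.
        clauses.append(f"{cond['field']} {ALLOWED_OPS[cond['op']]} ?")
        params.append(cond["value"])
    sql = "SELECT * FROM workouts"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params
```

The model expresses intent in JSON; injection is impossible by construction because the grammar of the emitted SQL is fixed server-side.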
Intercepting validation errors and asking a cheap model to fix the JSON instead of failing the whole call.
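The interception itself is a small wrapper. This sketch assumes a `repair_fn` callable standing in for the cheap-model call; it gets the broken output plus the parser's error message and returns a candidate fix.

```python
import json

def parse_with_repair(raw: str, repair_fn) -> dict:
    # Try the normal parse first; on failure, hand the raw output and
    # the error message to a cheap model (repair_fn) for one repair
    # attempt instead of failing the whole call.
    try:
        return json.loads(raw)
    except json.JSONDecodeError as err:
        fixed = repair_fn(raw, str(err))
        return json.loads(fixed)  # still raises if the repair also failed
```

The happy path pays nothing extra; only malformed responses incur the second, cheaper call, and a second failure still surfaces as a normal error.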
I halved my AI agent's response time by not loading tools it doesn't need.
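The core of that optimisation can be sketched as a keyword-gated tool registry: instead of attaching every tool schema to every request, only tools whose triggers match the user's query get loaded. The registry shape and trigger words below are hypothetical, not the app's actual routing logic.

```python
from typing import Callable

def select_tools(query: str, registry: dict[str, tuple[list[str], Callable]]) -> dict:
    # Load only the tools whose trigger keywords appear in the query,
    # shrinking the prompt the model has to read on every call.
    q = query.lower()
    return {
        name: tool
        for name, (keywords, tool) in registry.items()
        if any(k in q for k in keywords)
    }
```

A request like "log my workout" then ships one tool schema instead of the whole catalogue, which is where the latency win comes from: fewer prompt tokens to process before the first output token.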