How I use data to optimize AI apps

A video collaboration between Find AI and Velvet
At Find AI, we use OpenAI a lot. Last week, we made 19 million requests.

Understanding what's happening at that scale can be challenging. We treat it as a classic OODA loop:

  • Observe what our application is doing and which systems are triggering requests
  • Orient around what's happening, such as which models are the most costly in aggregate
  • Decide how to make the system more efficient, such as by testing a more efficient model or a shorter prompt
  • Act by rolling out changes
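
To make the "Orient" step concrete, here is a minimal sketch of attributing cost in aggregate from a list of logged requests. Everything in it is an assumption for illustration: the record fields (`model`, `feature`, `input_tokens`, `output_tokens`), the model names, and the prices are hypothetical placeholders, not Velvet's log schema or OpenAI's actual pricing.

```python
from collections import defaultdict

# Hypothetical prices in USD per 1M tokens. Placeholders only, not real pricing.
PRICE_PER_MILLION = {
    "model-a": {"input": 2.00, "output": 8.00},
    "model-b": {"input": 0.20, "output": 0.80},
}


def cost_by(requests, key="model"):
    """Sum estimated cost per value of `key`, most expensive first.

    Each request is a dict with the (assumed) fields
    "model", "feature", "input_tokens", and "output_tokens".
    """
    totals = defaultdict(float)
    for r in requests:
        price = PRICE_PER_MILLION[r["model"]]
        cost = (
            r["input_tokens"] * price["input"]
            + r["output_tokens"] * price["output"]
        ) / 1_000_000
        totals[r[key]] += cost
    # Sort descending so the costliest group comes first.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Made-up sample log records.
sample_requests = [
    {"model": "model-a", "feature": "search", "input_tokens": 1_000_000, "output_tokens": 500_000},
    {"model": "model-b", "feature": "search", "input_tokens": 10_000_000, "output_tokens": 1_000_000},
    {"model": "model-b", "feature": "summaries", "input_tokens": 5_000_000, "output_tokens": 0},
]

for name, usd in cost_by(sample_requests, key="model"):
    print(f"{name}: ${usd:.2f}")
```

Calling `cost_by(sample_requests, key="feature")` groups the same records by the system that triggered them, which maps to the "Observe" step.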

Velvet, an AI Gateway, is the tool in our development stack that enables this observability and optimization loop. I worked with them this week to produce a video about how we use data to optimize our AI-powered apps at Find AI.

The video covers observability tools in development, cost attribution, using the OpenAI Batch API, evaluating new models, and fine-tuning. I hope it's a useful resource for people running AI models in production.

