How I use data to optimize AI apps
A video collaboration between Find AI and Velvet
At Find AI, we use OpenAI a lot. Last week, we made 19 million requests.
Understanding what's happening at that scale can be challenging. Keeping on top of it is a classic OODA loop:
- Observe what our application is doing and which systems are triggering requests
- Orient around what's happening, such as which models are the most costly in aggregate (see the cost sketch after this list)
- Decide how to make the system more efficient, such as by testing a cheaper model or a shorter prompt
- Act by rolling out changes
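To make the orient step concrete, here's the kind of aggregation we're talking about: summing estimated spend per model from logged requests. This is a minimal sketch with a hypothetical log schema and placeholder prices, not Velvet's actual format.

```python
from collections import defaultdict

# Hypothetical per-token pricing and log schema -- the field names and
# dollar figures below are illustrative, not Velvet's actual format.
PRICE_PER_1M_TOKENS = {  # USD per million tokens
    "gpt-4o": {"input": 2.50, "output": 10.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}

def cost_by_model(logs):
    """Aggregate estimated spend per model from request log rows."""
    totals = defaultdict(float)
    for row in logs:
        price = PRICE_PER_1M_TOKENS[row["model"]]
        totals[row["model"]] += (
            row["prompt_tokens"] / 1e6 * price["input"]
            + row["completion_tokens"] / 1e6 * price["output"]
        )
    # Most expensive models first.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

logs = [
    {"model": "gpt-4o", "prompt_tokens": 1200, "completion_tokens": 300},
    {"model": "gpt-4o-mini", "prompt_tokens": 5000, "completion_tokens": 900},
]
print(cost_by_model(logs))
```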
Velvet, an AI Gateway, is the tool in our development stack that enables this observability and optimization loop. I worked with them this week to produce a video about how we use data to optimize our AI-powered apps at Find AI.
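In practice, a proxy-style gateway sits between the SDK and OpenAI, logging every request without changing application code. Here's a minimal sketch, assuming a gateway that accepts the standard OpenAI client with a swapped `base_url`; the endpoint URL and auth header are placeholders, so check Velvet's docs for the real values.

```python
from openai import OpenAI

# A minimal sketch of routing traffic through a logging gateway. The
# gateway URL and header name are placeholders -- use the values from
# your Velvet dashboard, not these.
client = OpenAI(
    base_url="https://gateway.example.com/v1",  # placeholder endpoint
    default_headers={"x-gateway-key": "YOUR_GATEWAY_KEY"},  # placeholder auth
)

# Application code is otherwise unchanged; the gateway forwards the
# request to OpenAI and records it for later analysis.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.choices[0].message.content)
```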
The video covers observability tools in development, cost attribution, using the OpenAI Batch API, evaluating new models, and fine-tuning. I hope it's a useful resource for people running AI models in production.
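As a taste of one of those topics, here's a minimal sketch of the OpenAI Batch API workflow: write one request per line to a JSONL file, upload it, and create a batch job. Batch requests complete within a 24-hour window at half the price of synchronous calls; the file name and requests here are illustrative.

```python
import json
from openai import OpenAI

client = OpenAI()

# Each line of the input file is one self-contained request.
requests = [
    {
        "custom_id": f"request-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",
            "messages": [{"role": "user", "content": f"Summarize item {i}."}],
        },
    }
    for i in range(3)
]
with open("requests.jsonl", "w") as f:
    for r in requests:
        f.write(json.dumps(r) + "\n")

# Upload the file, then create the batch; results arrive within 24 hours.
batch_file = client.files.create(file=open("requests.jsonl", "rb"), purpose="batch")
batch = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
)
print(batch.id, batch.status)
```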