Tag: caching

2 articles filed under this tag, newest first. If you are new here, start with the highlighted pick.

Cost Optimization in LLM Applications

Token budgeting, semantic and exact caching, model routing tiers, and fallback strategies to control spend without turning the product into a smaller model glued to a spreadsheet of hacks.

6 min read