Pricing

Free to route. You pay 20% only of the money FlexInference actually saves you on flex.

How pricing works

FlexInference is free to route. You pay only when a flex request actually saves you money, and then only 20% of what it saved. There is no per-request fee, no subscription, and no markup on tokens.

The 20% commission applies only to flex requests (a duration start_within, like 00h-00m-30s) on priced, non-Anthropic models. Standard routing (a start_within of default, priority, or auto) and every Anthropic model are always free, because they do not run a flex race and so save you nothing to share.

If a flex race does not beat the standard price on a given request, that request is free too. You are never charged more than 20% of real, measured savings.

A worked example

Take one flex request that would have cost $0.1530 on the standard tier but completed on flex for $0.0810:

  • Standard price: $0.1530
  • Flex price (billed to you by your provider): $0.0810
  • Saved on this request: $0.0720
  • FlexInference commission (20% of the savings): $0.0144

Of the $0.0720 saved, you keep $0.0576 and FlexInference earns $0.0144. If flex had not been cheaper, this request would have been free.

Bring your own key (BYOK)

FlexInference runs on your own provider keys. Token usage is billed to you directly by OpenAI, Gemini, or Anthropic, at their prices, drawing down your existing credits. FlexInference never marks up tokens and never resells provider capacity: the only thing we charge for is the flex savings we find you.

When you are charged

Commission accrues per request and is billed monthly. We charge only once your accrued commission reaches $20; a smaller balance carries forward to the next month until it crosses that threshold.

Billing is automatic. Add a card on file in the dashboard Billing page, and when a monthly bill is due we charge that card for the accrued commission.

If a payment is past due

If a charge fails and the balance stays unpaid, billable flex is paused: priced flex requests return a 402 until the balance is settled. Free routing keeps working - the default, priority, and auto tiers, and every Anthropic model, are unaffected.

Update your card in the dashboard Billing page to clear the balance and resume flex.

Taxes and currency

All prices and charges are in US dollars (USD). You are responsible for any taxes, levies, or duties associated with your use of the Service, except for taxes on FlexInference's own income.

Questions

See our Terms of Service for the full billing terms, or email us at [email protected] with any pricing question.