Credits — How It Works

The final cost of a request is calculated in real time from what that request actually costs: the specific model, under the specific conditions. Many of the variables are dynamic, including how much compute the request needs and what it costs us to serve. Internally we calculate in tokens, but we show users a clear and stable unit: a credit.

A credit is an abstraction over real cost. It is simply the unit we use in the interface. Behind the scenes, it translates into tokens and real cost, taking the following factors into account:

  • the chosen model;
  • the number of input tokens (prompt);
  • the number of output tokens (response);
  • system messages and auxiliary contexts;
  • tools and agents used, and any external integrations.

Simply put: we calculate everything internally in tokens and actual cost, then present users with a convenient, stable currency — credits.
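
To make the idea concrete, here is a minimal sketch of how the factors above could combine into a credit amount. It is not the actual billing formula: the model names, the per-1,000-token rates, the flat tool surcharge, and the function name estimate_credits are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical rates in credits per 1,000 tokens.
# These numbers are invented for illustration and are not real prices.
RATES = {
    "light-model": {"input": 0.01, "output": 0.03},
    "large-model": {"input": 0.10, "output": 0.30},
}

# Hypothetical flat surcharge per tool, agent, or integration call.
TOOL_CALL_CREDITS = 0.05


@dataclass
class Request:
    model: str
    input_tokens: int     # user prompt
    system_tokens: int    # system messages and auxiliary context
    output_tokens: int    # model response
    tool_calls: int = 0   # tools, agents, external integrations


def estimate_credits(req: Request) -> float:
    """Combine the listed factors into one credit figure (illustrative only)."""
    rate = RATES[req.model]
    # System messages and auxiliary context are billed like prompt tokens here.
    prompt_side = (req.input_tokens + req.system_tokens) / 1000 * rate["input"]
    response_side = req.output_tokens / 1000 * rate["output"]
    tools = req.tool_calls * TOOL_CALL_CREDITS
    return round(prompt_side + response_side + tools, 4)


if __name__ == "__main__":
    same_work = dict(input_tokens=2000, system_tokens=1000, output_tokens=1000)
    print(estimate_credits(Request(model="light-model", **same_work)))
    print(estimate_credits(Request(model="large-model", **same_work)))
```

Under these assumed rates, the same request costs more on the larger model, and every tool call adds a fixed amount on top of the token cost.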

We target a scale from 0.1 to 20 credits:

  • 0.1 credits — the simplest request, for example: “add an alias to this function on line 126”.
  • ~0.5–1 credit — a standard quick refactor, a short explanation, or a small code generation task.
  • ~1–5 credits — larger tasks: drafting a plan, refactoring a module, or a detailed answer with examples.
  • ~5–20 credits — “fuzzy” or resource-intensive requests, for example: “build me a clone of the Bing search engine that runs on my laptop” — requests with a high degree of ambiguity that require significant compute and multiple iterations.

The more resources a task requires, the higher the cost in credits.

  • Different models consume different amounts of compute.
  • Input/output length directly affects execution time and price.
  • Using additional system messages and agents increases the load.
  • If a request is poorly specified, the model does more computation and makes more clarification attempts.

Therefore, the final amount is calculated at the moment the request is executed — precisely for those specific conditions.

To use credits more efficiently and get predictable results:

  • State your task clearly. The more specific the request, the faster and cheaper the response.
  • Break large tasks into steps. Instead of “do X”, ask “1) prepare a plan; 2) write a skeleton; 3) implement function A”.
  • Choose your model deliberately. For simple assistance, use a lightweight model; for deep generation, use a powerful one. To learn more about available models, visit the Models page.
  • Reuse system prompts. Do not resend the same large context with every request unless it is necessary.

Transparency is very important to us, and we are actively working to show you more detail about how credits are spent. One planned capability is a usage export. The export will include detailed information for each request, such as:

  • request ID and timestamps;
  • model type and name;
  • number of input tokens and output tokens;
  • system messages, prompts used, and contexts;
  • tools / agents and external integrations involved;
  • total cost in tokens and in credits;
  • execution latency and status (success / error);
  • size of transferred files or context size (if applicable);
  • processing region/cluster (if relevant) and other metadata (retries, number of iterations, external API calls).

We plan to make the export easy to analyze and to add filtering by date range, models, and request types, so you can easily reconcile expenses and understand exactly what credits were charged for.
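
As a sketch of the kind of reconciliation the export is meant to support, the snippet below totals credits per model within a date range. The export format has not been published, so the record layout (a list of dictionaries) and the field names "timestamp", "model", and "credits" are assumptions, as is the helper name credits_by_model.

```python
from collections import defaultdict
from datetime import datetime


def credits_by_model(records, start, end):
    """Sum credits per model for records with start <= timestamp < end.

    Each record is assumed to be a dict with hypothetical keys:
    "timestamp" (ISO 8601 string), "model" (str), and "credits" (float).
    """
    totals = defaultdict(float)
    for rec in records:
        ts = datetime.fromisoformat(rec["timestamp"])
        if start <= ts < end:
            totals[rec["model"]] += rec["credits"]
    return dict(totals)


if __name__ == "__main__":
    # Invented sample rows standing in for a real export.
    sample = [
        {"request_id": "r1", "timestamp": "2025-01-10T09:00:00",
         "model": "light-model", "credits": 0.1},
        {"request_id": "r2", "timestamp": "2025-01-12T14:30:00",
         "model": "large-model", "credits": 2.5},
    ]
    january = credits_by_model(sample, datetime(2025, 1, 1), datetime(2025, 2, 1))
    for model, total in sorted(january.items()):
        print(f"{model}: {total:.2f} credits")
```

The end of the range is exclusive, so consecutive monthly windows do not count the same request twice.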

Credits are charged based on what the request actually costs us to serve. We are committed to fair and transparent accounting: precise calculations in tokens and resources internally, and a simple, understandable unit for the user externally. You will be able to get a detailed breakdown of each charge through the usage export or the request logs.

In short: a credit is a convenient unit for displaying the real, dynamic cost of compute. Typical requests range from 0.1 credits (micro-tasks) to 20 credits (complex, ambiguous requests). The more precise and concise the request, the cheaper and faster the result. We are working to give you as much detail as possible and will soon add an export showing all key billing parameters.