ContextBudget

A ContextBudget defines the token budget constraints that control how much context the pipeline can select. All fields are validated at construction time — no invalid budget can exist at runtime.

Fields

| Field | Type | Required | Default | Constraints |
|---|---|---|---|---|
| maxTokens | integer | Yes | - | >= 0. Hard ceiling: the model’s context window size. |
| targetTokens | integer | Yes | - | >= 0, <= maxTokens. Soft goal: the slicer aims for this token count. |
| outputReserve | integer | No | 0 | >= 0, <= maxTokens. Tokens reserved for model output generation, subtracted from available budget. |
| reservedSlots | map of ContextKind to integer | No | empty map | Minimum guaranteed items per kind. Each value >= 0. Used by QuotaSlice. |
| estimationSafetyMarginPercent | float64 | No | 0.0 | >= 0.0, <= 100.0. Percentage buffer for token estimation error. |

Validation Rules

A conforming implementation MUST enforce these validation rules at construction time and reject invalid budgets:

  1. maxTokens >= 0 — Negative maximum tokens are invalid.
  2. targetTokens >= 0 — Negative target tokens are invalid.
  3. targetTokens <= maxTokens — The soft target cannot exceed the hard ceiling.
  4. outputReserve >= 0 — Negative output reserve is invalid.
  5. outputReserve <= maxTokens — The output reserve cannot exceed the context window.
  6. estimationSafetyMarginPercent >= 0.0 AND <= 100.0 — Must be a valid percentage.
  7. Each value in reservedSlots >= 0 — Negative slot reservations are invalid.
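
The seven rules above can be sketched as a constructor-time check. This is a minimal illustration, not a reference implementation: the Python class layout, the snake_case field names, the use of plain strings as a stand-in for ContextKind, and the choice of ValueError as the rejection mechanism are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class ContextBudget:
    """Illustrative budget type; string keys stand in for ContextKind (assumption)."""

    max_tokens: int
    target_tokens: int
    output_reserve: int = 0
    reserved_slots: Dict[str, int] = field(default_factory=dict)
    estimation_safety_margin_percent: float = 0.0

    def __post_init__(self) -> None:
        # Rules 1-3: non-negative ceiling and target, target within ceiling.
        if self.max_tokens < 0:
            raise ValueError("maxTokens must be >= 0")
        if self.target_tokens < 0:
            raise ValueError("targetTokens must be >= 0")
        if self.target_tokens > self.max_tokens:
            raise ValueError("targetTokens must be <= maxTokens")
        # Rules 4-5: output reserve is non-negative and fits in the window.
        if self.output_reserve < 0:
            raise ValueError("outputReserve must be >= 0")
        if self.output_reserve > self.max_tokens:
            raise ValueError("outputReserve must be <= maxTokens")
        # Rule 6: the chained comparison also rejects NaN.
        if not (0.0 <= self.estimation_safety_margin_percent <= 100.0):
            raise ValueError("estimationSafetyMarginPercent must be in [0, 100]")
        # Rule 7: every reserved slot value is non-negative.
        for kind, count in self.reserved_slots.items():
            if count < 0:
                raise ValueError(f"reservedSlots[{kind!r}] must be >= 0")
```

Because the checks run inside the constructor and the instance is frozen, any ContextBudget value that exists in this sketch has already passed all seven rules.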

Effective Budget

The pipeline computes an effective budget for the slicing stage by subtracting tokens already committed and applying any configured safety margin:

reservedTokens  = sum of all values in reservedSlots
effectiveMax    = max(0, maxTokens - outputReserve - pinnedTokens - reservedTokens)
effectiveTarget = max(0, targetTokens - pinnedTokens - reservedTokens)
effectiveTarget = min(effectiveTarget, effectiveMax)

if estimationSafetyMarginPercent > 0:
    multiplier      = 1.0 - estimationSafetyMarginPercent / 100.0
    effectiveMax    = floor(effectiveMax * multiplier)
    effectiveTarget = floor(effectiveTarget * multiplier)
    effectiveTarget = min(effectiveTarget, effectiveMax)

Where:

  • pinnedTokens is the sum of tokens for all pinned items.
  • reservedTokens is the sum of all values in reservedSlots, subtracted alongside outputReserve and pinnedTokens to reserve capacity for per-kind guarantees.
  • The safety margin is applied after all subtractions as a multiplicative reduction. Both effectiveMax and effectiveTarget use floor when converting from the floating-point product. Because both values are already clamped to be non-negative at that point, floor is equivalent to integer truncation toward zero.

See Stage 5: Slice for the full pseudocode.
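
As a quick way to check the arithmetic, the formulas above translate almost line for line into Python. The function name effective_budget and its parameter names are hypothetical, chosen only for this sketch; the computation itself follows the pseudocode in this section.

```python
import math
from typing import Mapping, Tuple


def effective_budget(
    max_tokens: int,
    target_tokens: int,
    output_reserve: int,
    reserved_slots: Mapping[str, int],
    safety_margin_percent: float,
    pinned_tokens: int,
) -> Tuple[int, int]:
    """Return (effectiveMax, effectiveTarget) for the slicing stage."""
    reserved_tokens = sum(reserved_slots.values())

    # Subtract everything already committed, clamping at zero.
    effective_max = max(
        0, max_tokens - output_reserve - pinned_tokens - reserved_tokens
    )
    effective_target = max(0, target_tokens - pinned_tokens - reserved_tokens)
    effective_target = min(effective_target, effective_max)

    # Apply the safety margin last, as a multiplicative reduction.
    if safety_margin_percent > 0:
        multiplier = 1.0 - safety_margin_percent / 100.0
        effective_max = math.floor(effective_max * multiplier)
        effective_target = math.floor(effective_target * multiplier)
        effective_target = min(effective_target, effective_max)

    return effective_max, effective_target


example = effective_budget(
    max_tokens=8000,
    target_tokens=6000,
    output_reserve=1000,
    reserved_slots={"Message": 300, "Document": 200},
    safety_margin_percent=10.0,
    pinned_tokens=500,
)
```

With these illustrative inputs, reservedTokens is 500, so the subtractions give an effectiveMax of 6000 and an effectiveTarget of 5000. The 10% margin then reduces them to 5400 and 4500, so example is (5400, 4500).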

Semantics

  • maxTokens is the absolute ceiling. The total tokens of all selected items (pinned + sliced) MUST NOT exceed maxTokens - outputReserve, except when pinned items alone exceed this value (which is an error reported during classification).

  • targetTokens is the soft goal. The slicer aims to select items whose total tokens are at most targetTokens. The pipeline checks for overflow against targetTokens after merging pinned and sliced items.

  • outputReserve carves out tokens for the model’s response. It reduces the effective budget available for context items.

  • reservedSlots guarantees minimum representation per ContextKind. During pipeline budget computation, the sum of the reserved slot values is subtracted from the effective budget, which lowers the token ceiling available for non-reserved items. QuotaSlice also reads the reserved slots to enforce the per-kind minimums.

  • estimationSafetyMarginPercent provides a buffer for callers whose token counts are estimates rather than exact. It is applied as a multiplicative reduction to the effective budget after reserved slots and other subtractions. A value of 10.0 reduces the effective budget to 90% of its post-subtraction value.
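
The distinction between the hard ceiling and the soft target can be illustrated with a small check over a finished selection. This is only a sketch of the semantics described above: the function name classify_selection and the four string labels are invented for the example, and they do not correspond to any status codes or error types defined by the pipeline.

```python
def classify_selection(
    max_tokens: int,
    target_tokens: int,
    output_reserve: int,
    pinned_tokens: int,
    sliced_tokens: int,
) -> str:
    """Label a merged selection against the hard ceiling and the soft target."""
    hard_limit = max_tokens - output_reserve
    total = pinned_tokens + sliced_tokens

    # Pinned items alone exceeding the ceiling is the one permitted exception;
    # the text above says it is reported as an error during classification.
    if pinned_tokens > hard_limit:
        return "pinned-exceeds-ceiling"
    # Otherwise the combined total must stay within maxTokens - outputReserve.
    if total > hard_limit:
        return "hard-ceiling-violated"
    # Exceeding targetTokens is an overflow of the soft goal, not the ceiling.
    if total > target_tokens:
        return "target-overflow"
    return "within-target"
```

For a budget with maxTokens 8000, targetTokens 6000 and outputReserve 1000, a selection of 500 pinned plus 6000 sliced tokens totals 6500: under the hard limit of 7000 but over the soft target, so the sketch labels it "target-overflow".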