Introduction

Cupel Specification Version 1.0.0

This document is the language-agnostic specification for the Cupel context selection algorithm. Cupel selects, scores, and orders context items for inclusion in a language model’s context window, subject to a token budget.

Purpose

Large language model applications must decide which pieces of context (messages, documents, tool outputs, memories) to include in a finite context window. Cupel defines a deterministic pipeline that takes a set of candidate context items and a token budget, and produces an ordered list of selected items that fits within the budget.

This specification enables interoperable implementations across programming languages. An implementation that conforms to this specification will produce the same selected items in the same order as any other conforming implementation, given identical inputs.

Scope

This specification defines:

The data model: ContextItem, ContextBudget, ScoredItem, and supporting enumerations
The pipeline: a fixed 6-stage transformation (Classify, Score, Deduplicate, Sort, Slice, Place)
All scorer algorithms: RecencyScorer, PriorityScorer, KindScorer, TagScorer, FrequencyScorer, ReflexiveScorer, CompositeScorer, ScaledScorer
All slicer strategies: GreedySlice, KnapsackSlice, QuotaSlice
All placer strategies: ChronologicalPlacer, UShapedPlacer
Overflow handling: Throw, Truncate, Proceed strategies

This specification does not define:

Streaming or asynchronous pipeline execution (implementation-specific)
Diagnostics, tracing, or observability infrastructure
Dependency injection or builder APIs
Serialization formats (JSON, etc.)
Named policies or preset configurations
Tokenizer implementations (the caller provides token counts)

Conformance Model

A conforming implementation must exhibit behavioral equivalence: given the same input items, budget, scorer, slicer, and placer configuration, a conforming implementation must select the same items in the same order as the reference behavior defined in this specification.

Behavioral equivalence does not require bit-exact floating-point scores. Intermediate score values may differ due to floating-point evaluation order, provided the final selection and ordering are identical. Conformance tests compare selected item sets and their ordering, not exact score values. Where individual scores are verified (e.g., per-stage scorer tests), an epsilon tolerance of 1e-9 is used.
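The two comparison modes described above can be sketched as follows. This is an illustrative sketch, not part of the conformance suite: the function names and the use of string identifiers for items are hypothetical, and applying the 1e-9 tolerance as an absolute difference is an assumption.

```python
import math
from typing import Sequence

# Tolerance stated in this specification for per-score verification.
SCORE_EPSILON = 1e-9


def same_selection(actual_ids: Sequence[str], expected_ids: Sequence[str]) -> bool:
    """Selection check: identical items in identical order, with no tolerance."""
    return list(actual_ids) == list(expected_ids)


def scores_match(actual: Sequence[float], expected: Sequence[float]) -> bool:
    """Per-stage score check: element-wise comparison within SCORE_EPSILON.

    Assumption: the tolerance is applied as an absolute difference.
    """
    if len(actual) != len(expected):
        return False
    return all(
        math.isclose(a, e, rel_tol=0.0, abs_tol=SCORE_EPSILON)
        for a, e in zip(actual, expected)
    )
```

Under this sketch, two implementations whose scores differ only in the last few bits still pass `scores_match`, while any difference in the selected items or their order fails `same_selection`.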

See Conformance Levels for the full conformance requirements.

Numeric Precision

All scoring computations MUST use IEEE 754 64-bit double-precision floating-point arithmetic. This includes:

Individual scorer outputs
Composite score aggregation (weighted averages)
Score-based comparisons in sorting, deduplication, and slicing
Density calculations in slicer algorithms

Implementations MUST NOT use 32-bit floats, fixed-point, or arbitrary-precision arithmetic for scoring. Integer arithmetic is permitted for token counts and budget calculations.
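One way to see why the precision is fixed: narrowing a score to 32 bits can turn a strict ordering into a tie, which may change which item is selected. The sketch below uses Python, whose `float` is an IEEE 754 64-bit double, and a hypothetical helper `to_float32` that rounds a value to the nearest 32-bit float. The two score values are arbitrary illustrations, not taken from any scorer in this specification.

```python
import struct


def to_float32(x: float) -> float:
    """Round a 64-bit float to the nearest IEEE 754 32-bit value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


score_a = 0.1 + 0.2  # evaluates to 0.30000000000000004 in 64-bit arithmetic
score_b = 0.3

# In 64-bit arithmetic the scores are distinct, so a sort has a strict winner.
distinct_in_f64 = score_a > score_b

# After narrowing to 32 bits both scores round to the same value and tie.
tie_in_f32 = to_float32(score_a) == to_float32(score_b)
```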

Notation Conventions

Algorithms in this specification use CLRS-style pseudocode with the following conventions:

Convention	Meaning
bold lowercase	Keywords: for, if, else, return, while
`monospace`	Variables, field names, function names
`<-`	Assignment (not `=`, to avoid confusion with equality)
`=`	Equality comparison
`[i]`	0-based array/list indexing
`//`	Comments
`length(x)`	Number of elements in array or list `x`
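As an illustration of how these conventions read in practice, the sketch below pairs a small Python function with the equivalent pseudocode in trailing comments. The procedure `FIND` is a hypothetical example and is not an algorithm defined by this specification.

```python
from typing import Sequence


def find(items: Sequence[str], target: str) -> int:
    # FIND(items, target)           // index of the first match, or -1
    i = 0                           # i <- 0
    while i < len(items):           # while i < length(items)
        if items[i] == target:      #     if items[i] = target
            return i                #         return i
        i += 1                      #     i <- i + 1
    return -1                       # return -1
```

Note that the pseudocode's `=` corresponds to Python's `==`, and `<-` corresponds to Python's `=`.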

Pipeline Overview

The Cupel pipeline is a fixed sequence of six stages. Every execution follows this order; stages cannot be reordered or skipped.

flowchart LR
    A[Candidate Items] --> B[Classify]
    B --> C[Score]
    C --> D[Deduplicate]
    D --> E[Sort]
    E --> F[Slice]
    F --> G[Place]
    G --> H[Ordered Context Window]

Stage	Input	Output	Purpose
Classify	Candidate items	Pinned list + Scoreable list	Partition items; exclude invalid
Score	Scoreable items	Scored items	Compute relevance scores
Deduplicate	Scored items	Unique scored items	Remove duplicate content
Sort	Unique scored items	Sorted scored items	Order by score descending
Slice	Sorted scored items	Budget-fitting items	Select items within token budget
Place	Sliced + pinned items	Ordered items	Determine final presentation order

The pipeline operates on the principle of ordinal-only scoring, with one responsibility per component: scorers assign relevance scores (they rank), slicers select the items that fit within the budget (they drop), and placers determine presentation order (they position). Each concern is strictly separated.
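The stage order and the separation of concerns can be sketched end to end as below. This is a minimal sketch under stated assumptions, not the normative algorithms, which are defined in later sections. The `Item` type and its fields, the validity rule, the duplicate rule, the greedy fill, the input-order placer, and the treatment of pinned tokens are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Item:
    # Hypothetical stand-in for ContextItem; field names are assumptions.
    content: str
    tokens: int
    pinned: bool = False


Scored = Tuple[Item, float]


def classify(items: List[Item]) -> Tuple[List[Item], List[Item]]:
    # Assumption: an item with a negative token count is invalid.
    valid = [it for it in items if it.tokens >= 0]
    return [it for it in valid if it.pinned], [it for it in valid if not it.pinned]


def score(items: List[Item], scorer: Callable[[Item], float]) -> List[Scored]:
    return [(it, float(scorer(it))) for it in items]


def deduplicate(scored: List[Scored]) -> List[Scored]:
    # Assumption: duplicates have identical content; the highest-scoring
    # copy is kept, and the earliest copy wins a tie.
    best = {}
    for it, s in scored:
        if it.content not in best or s > best[it.content][1]:
            best[it.content] = (it, s)
    return list(best.values())


def sort_desc(scored: List[Scored]) -> List[Scored]:
    # sorted() is stable, so equal scores keep their incoming order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def slice_greedy(scored: List[Scored], budget: int) -> List[Item]:
    # Simplified greedy fill; a stand-in, not the specified GreedySlice.
    chosen, used = [], 0
    for it, _ in scored:
        if used + it.tokens <= budget:
            chosen.append(it)
            used += it.tokens
    return chosen


def place(pinned: List[Item], sliced: List[Item], original: List[Item]) -> List[Item]:
    # Stand-in placer: restore input order, matching items by identity.
    selected = pinned + sliced
    return [it for it in original if any(it is s for s in selected)]


def run_pipeline(items: List[Item], budget: int,
                 scorer: Callable[[Item], float]) -> List[Item]:
    pinned, scoreable = classify(items)
    ranked = sort_desc(deduplicate(score(scoreable, scorer)))
    # Assumption: pinned items consume budget before slicing.
    remaining = budget - sum(it.tokens for it in pinned)
    return place(pinned, slice_greedy(ranked, remaining), items)


# Usage with a hypothetical lookup-table scorer.
relevance = {"old message": 0.2, "recent message": 0.9}
items = [
    Item("system prompt", 10, pinned=True),
    Item("old message", 30),
    Item("recent message", 30),
    Item("recent message", 30),  # duplicate content
    Item("broken", -1),          # invalid under the assumption above
]
result = run_pipeline(items, 45, lambda it: relevance[it.content])
# [it.content for it in result] == ["system prompt", "recent message"]
```

In this sketch the scorer never sees the budget, the slicer never reorders for presentation, and the placer never drops items, which mirrors the rank, drop, and position split described above.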