contexty

package module
v0.2.3
Published: Mar 2, 2026 License: MIT Imports: 8 Imported by: 0

README

contexty

TL;DR — contexty is a Go library for dynamic LLM context window management. It lets you assemble, format, and intelligently truncate or drop chat history and RAG documents against strict token limits using configurable eviction strategies.

Installation

go get github.com/skosovsky/contexty

Requires Go 1.23+.

Quick Start (AI-friendly)

The full pipeline: initialize a counter and a builder, add system and RAG blocks, compile, and handle errors.

package main

import (
	"context"
	"errors"
	"log"

	"github.com/skosovsky/contexty"
)

func main() {
	ctx := context.Background()
	counter := &contexty.CharFallbackCounter{CharsPerToken: 4}
	builder := contexty.NewBuilder(contexty.AllocatorConfig{
		MaxTokens:    2000,
		TokenCounter: counter,
	})

	// Required system prompt (must fit; otherwise Compile returns error)
	builder.AddBlock(contexty.MemoryBlock{
		ID:       "system",
		Tier:     contexty.TierSystem,
		Strategy: contexty.NewStrictStrategy(),
		Messages: []contexty.Message{contexty.TextMessage("system", "You are a helpful assistant.")},
	})

	// RAG block: drop entirely if it does not fit
	builder.AddBlock(contexty.MemoryBlock{
		ID:       "rag",
		Tier:     contexty.TierRAG,
		Strategy: contexty.NewDropStrategy(),
		Messages: []contexty.Message{
			contexty.TextMessage("system", "Retrieved doc 1..."),
			contexty.TextMessage("system", "Retrieved doc 2..."),
		},
	})

	msgs, report, err := builder.Compile(ctx)
	if err != nil {
		if errors.Is(err, contexty.ErrBudgetExceeded) {
			log.Fatal("system block or strategy contract: block exceeds budget")
		}
		if errors.Is(err, contexty.ErrInvalidConfig) {
			log.Fatal("invalid config: MaxTokens or TokenCounter")
		}
		log.Fatal(err)
	}
	log.Printf("compiled %d messages, tokens used: %d", len(msgs), report.TotalTokensUsed)
}

Key abstractions and contracts

  • Token Counter (token.go): Implement TokenCounter with Count(ctx context.Context, msgs []Message) (int, error). The context is passed from Compile for cancellation and timeouts (e.g. when calling a remote tokenization service). Use CharFallbackCounter for prototyping; optional EstimateTool for custom tool-call token weights. To plug in your own tokenizer (e.g. tiktoken), implement the interface and pass it in AllocatorConfig.TokenCounter.
  • Strategies (strategies.go): Built-in strategies — Strict (error if block does not fit), Drop (remove block), Truncate (remove oldest messages; options: KeepUserAssistantPairs, MinMessages, ProtectRole), Summarize (call your Summarizer). Custom strategies implement Apply(ctx, msgs, originalTokens, limit, counter) and must return messages whose total token count ≤ limit; Compile enforces this and returns ErrStrategyExceededBudget on violation.
  • Formatter (formatter.go): InjectIntoSystem merges auxiliary blocks into a single system message with XML-style tags (<context>, <fact>); only text parts are included; content is XML-escaped.
  • Fact Extractor (factextractor.go): Interface for extracting facts from conversation history; reserved for v2 — the allocator does not use it yet.

How limits are resolved

Blocks are processed in tier order: System (0), Core (1), RAG (2), History (3), Scratchpad (4). Within the same tier, insertion order (AddBlock) is preserved. For each block:

  1. Token count is computed. If the block sets the optional MaxTokens (> 0) and that value is less than the remaining budget, the strategy receives MaxTokens as its limit (per-block cap); otherwise the limit is the remaining budget.
  2. If the block fits within the remaining budget, it is appended as-is.
  3. If not, the block’s EvictionStrategy.Apply is called; the result is re-counted and must satisfy used ≤ remaining; otherwise Compile returns ErrStrategyExceededBudget.
  4. Remaining budget is decreased by the tokens used.

What gets dropped or truncated is thus determined by block order (tiers + AddBlock) and each block’s strategy, not by a single global “priority” field.

Features

  • Message model: Message has Role, Content ([]ContentPart), Name, ToolCalls, ToolCallID, Metadata. ContentPart: Type (e.g. "text", "image_url"), Text, ImageURL. Helpers: TextMessage, MultipartMessage. Multimodal content is supported without provider-specific validation in core.
  • Tiers: System, Core, RAG, History, Scratchpad; lower number = higher priority. Custom tiers via Tier(N).
  • MemoryBlock: ID, Messages, Tier, Strategy; optional MaxTokens (per-block token cap), CacheControl (for provider prompt caching; not interpreted in core).
  • Builder: NewBuilder(config), AddBlock, Compile(ctx). A builder can be reused: call AddBlock and Compile multiple times; each Compile uses the current list of blocks (blocks are not cleared).
  • Token counting: You inject a TokenCounter; context is passed from Compile for cancellation/timeouts. CharFallbackCounter and optional EstimateTool; or your own implementation (e.g. tiktoken).
  • Eviction strategies: Strict, Drop, TruncateOldest (KeepUserAssistantPairs, MinMessages, ProtectRole), Summarize(Summarizer). Custom strategy: implement Apply; contract enforced by Compile. Eviction labels in report: "rejected", "dropped", "truncated", "summarized", or "evicted" for custom.
  • Summarizer: Interface used by SummarizeStrategy: Summarize(ctx, msgs) (Message, error).
  • CompileReport: TotalTokensUsed, RemainingTokens, OriginalTokens, OriginalTokensPerBlock, TokensPerBlock, Evictions, BlocksDropped (see below).
  • Validation policy: Minimal validation in core (no provider-specific role/URL/JSON checks). The only hard guarantee is TotalTokensUsed ≤ MaxTokens; strategy output is checked and ErrStrategyExceededBudget is returned on violation.

Strategies at a glance

Strategy | When to use | If block doesn't fit
---|---|---
NewStrictStrategy() | System persona, rules (must fit) | Returns error
NewDropStrategy() | RAG, optional facts | Block removed
NewTruncateOldestStrategy(opts...) | Chat history | Oldest messages removed; opts: KeepUserAssistantPairs, MinMessages, ProtectRole
NewSummarizeStrategy(summarizer) | Long blocks to compress | Summarizer called; block dropped if the summary still does not fit

Truncate options: KeepUserAssistantPairs(true) keeps user/assistant pairs; MinMessages(n) drops the block if fewer than n messages would remain; ProtectRole("developer") never removes messages with that role—the first removable message (or pair) is removed instead.
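
To make the interaction of MinMessages and ProtectRole concrete, here is a minimal sketch of oldest-first truncation. It is an illustration of the described behavior, not the library's code: msg and truncateOldest are hypothetical, token weights are precomputed, and pair handling (KeepUserAssistantPairs) is left out.

```go
package main

import "fmt"

// msg is a hypothetical stand-in for a chat message with a precomputed token weight.
type msg struct {
	role   string
	tokens int
}

// truncateOldest removes the oldest removable message until the total fits
// within limit. Messages whose role equals protect are never removed. If
// fewer than minKeep messages would remain, or nothing removable is left,
// it returns nil (the block is dropped).
func truncateOldest(msgs []msg, limit, minKeep int, protect string) []msg {
	kept := append([]msg(nil), msgs...) // copy so the caller's slice is untouched
	total := 0
	for _, m := range kept {
		total += m.tokens
	}
	for total > limit {
		// Find the first (oldest) message that is not protected.
		idx := -1
		for i, m := range kept {
			if m.role != protect {
				idx = i
				break
			}
		}
		if idx < 0 || len(kept)-1 < minKeep {
			return nil
		}
		total -= kept[idx].tokens
		kept = append(kept[:idx], kept[idx+1:]...)
	}
	return kept
}

func main() {
	history := []msg{{"developer", 10}, {"user", 20}, {"assistant", 20}, {"user", 20}}
	// 70 tokens against a limit of 50: the oldest non-developer message goes.
	fmt.Println(truncateOldest(history, 50, 1, "developer"))
	// prints "[{developer 10} {assistant 20} {user 20}]"
}
```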

CompileReport

After Compile(ctx):

  • TotalTokensUsed: tokens in the final []Message.
  • RemainingTokens: MaxTokens minus TotalTokensUsed after compile.
  • OriginalTokens: total tokens before eviction (all blocks).
  • OriginalTokensPerBlock: map block ID → tokens before eviction (before strategy was applied).
  • TokensPerBlock: map block ID → tokens used in output.
  • Evictions: map block ID → eviction label ("rejected", "dropped", "truncated", "summarized", or "evicted" for custom strategies). Only blocks for which an eviction strategy was actually applied appear here.
  • BlocksDropped: slice of block IDs that were fully removed.
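
The report fields relate to each other by simple arithmetic. The sketch below works through it with illustrative numbers (a 300-token budget, a 50-token system block, and a history block trimmed from 300 to 250 tokens). The report type and freedPerBlock helper are hypothetical local stand-ins, not part of the package.

```go
package main

import "fmt"

// report mirrors a few documented CompileReport fields; it is a local stand-in.
type report struct {
	TotalTokensUsed        int
	OriginalTokens         int
	RemainingTokens        int
	OriginalTokensPerBlock map[string]int
	TokensPerBlock         map[string]int
}

// freedPerBlock returns, for each block that shrank, how many tokens eviction removed.
func freedPerBlock(r report) map[string]int {
	freed := make(map[string]int)
	for id, before := range r.OriginalTokensPerBlock {
		if d := before - r.TokensPerBlock[id]; d > 0 {
			freed[id] = d
		}
	}
	return freed
}

func main() {
	const maxTokens = 300 // illustrative budget
	r := report{
		TotalTokensUsed:        300,
		OriginalTokens:         350,
		OriginalTokensPerBlock: map[string]int{"sys": 50, "history": 300},
		TokensPerBlock:         map[string]int{"sys": 50, "history": 250},
	}
	r.RemainingTokens = maxTokens - r.TotalTokensUsed
	fmt.Println(r.RemainingTokens, freedPerBlock(r)) // prints "0 map[history:50]"
}
```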

Error handling

Use errors.Is(err, contexty.Err...) to handle specific failures:

Error | When
---|---
ErrInvalidConfig | MaxTokens ≤ 0 or TokenCounter == nil
ErrNilStrategy | A MemoryBlock has nil Strategy
ErrTokenCountFailed | TokenCounter.Count returned an error
ErrBudgetExceeded | StrictStrategy: block does not fit in remaining budget
ErrStrategyExceededBudget | Strategy returned messages exceeding remaining budget (contract violation)
ErrInvalidCharsPerToken | CharFallbackCounter with CharsPerToken ≤ 0

Example: if the system block with Strict strategy does not fit, Compile returns an error that wraps or equals ErrBudgetExceeded.

Testing

Use testing_helpers.go for unit tests without heavy CGO tokenizers. FixedCounter returns a count based on message structure: set TokensPerMessage, and optionally TokensPerContentPart and TokensPerToolCall, to simulate realistic eviction (e.g. removing one “heavy” message frees many tokens). Example:

counter := &contexty.FixedCounter{
	TokensPerMessage:     10,
	TokensPerContentPart: 5,
	TokensPerToolCall:    20,
}
// Use counter in AllocatorConfig and assert report.TotalTokensUsed, evictions, etc.

Full example

See examples/full_assembly for a multi-tier setup (system, core, RAG, history) that demonstrates compilation when total content exceeds the token limit, with evictions and dropped blocks.

Documentation

Full API: pkg.go.dev/github.com/skosovsky/contexty.

License

MIT. See LICENSE.

Documentation

Overview

Package contexty implements a token budget allocator for LLM context windows.

LLMs have a fixed token limit (e.g. 8192). Contexty helps fit system prompts, pinned facts, RAG results, chat history, and tool outputs into that budget by treating memory as tiers: higher-priority blocks are allocated first, and configurable eviction strategies (strict, drop, truncate, summarize) apply when a block does not fit.

The library does not tokenize text itself. Callers inject a TokenCounter (e.g. tiktoken for a specific model, or CharFallbackCounter for tests). See AllocatorConfig, Builder, and Builder.Compile for the main API.

Example:

counter := &contexty.CharFallbackCounter{CharsPerToken: 4}
builder := contexty.NewBuilder(contexty.AllocatorConfig{
    MaxTokens:    4000,
    TokenCounter: counter,
})
builder.AddBlock(contexty.MemoryBlock{
    ID: "persona", Tier: contexty.TierSystem,
    Strategy: contexty.NewStrictStrategy(),
    Messages: []contexty.Message{contexty.TextMessage("system", "You are a helpful assistant.")},
})
msgs, report, err := builder.Compile(ctx)
// report.TotalTokensUsed, report.Evictions, report.BlocksDropped describe what happened.
// See examples/full_assembly for a full multi-tier setup.
Example (BuilderChatHistory)
package main

import (
	"context"
	"fmt"

	"github.com/skosovsky/contexty"
)

func main() {
	// System (1 msg) + history (6 msgs); 50 tokens each = 350 total, limit 300 -> truncate removes 1.
	counter := &contexty.FixedCounter{TokensPerMessage: 50}
	b := contexty.NewBuilder(contexty.AllocatorConfig{MaxTokens: 300, TokenCounter: counter})
	b.AddBlock(contexty.MemoryBlock{
		ID: "sys", Tier: contexty.TierSystem, Strategy: contexty.NewStrictStrategy(),
		Messages: []contexty.Message{contexty.TextMessage("system", "You are helpful.")},
	})
	history := []contexty.Message{
		contexty.TextMessage("user", "hi"),
		contexty.TextMessage("assistant", "hello"),
		contexty.TextMessage("user", "hi"),
		contexty.TextMessage("assistant", "hello"),
		contexty.TextMessage("user", "hi"),
		contexty.TextMessage("assistant", "hello"),
	}
	b.AddBlock(contexty.MemoryBlock{
		ID: "history", Tier: contexty.TierHistory, Strategy: contexty.NewTruncateOldestStrategy(),
		Messages: history,
	})
	msgs, report, err := b.Compile(context.Background())
	if err != nil {
		return
	}
	fmt.Printf("messages: %d, tokens: %d, eviction(history)=%q\n",
		len(msgs), report.TotalTokensUsed, report.Evictions["history"])
}
Output:

messages: 6, tokens: 300, eviction(history)="truncated"
Example (InjectIntoSystemXML)
package main

import (
	"fmt"
	"strings"

	"github.com/skosovsky/contexty"
)

func main() {
	sys := contexty.TextMessage("system", "Base.")
	got := contexty.InjectIntoSystem(sys,
		contexty.Message{Content: []contexty.ContentPart{{Type: "text", Text: "Fact1"}}},
		contexty.Message{Content: []contexty.ContentPart{{Type: "text", Text: "Fact2"}}},
	)
	text := got.Content[0].Text
	fmt.Println(got.Role)
	fmt.Println(len(text) > 0 && text[:1] == "B")
	fmt.Println(strings.Contains(text, "<context>") && strings.Contains(text, "<fact>"))
}
Output:

system
true
true

Constants

const DefaultTokensPerNonTextPart = 85

DefaultTokensPerNonTextPart is the fallback token count for content parts that are not Type "text" (e.g. image_url). No validation or network checks.

Variables

var (
	// ErrBudgetExceeded is returned by StrictStrategy when a block does not fit
	// within the remaining token budget.
	ErrBudgetExceeded = errors.New("contexty: block exceeds remaining token budget")

	// ErrInvalidConfig is returned by Compile when configuration is invalid
	// (e.g. MaxTokens <= 0 or TokenCounter == nil).
	ErrInvalidConfig = errors.New("contexty: invalid allocator config")

	// ErrTokenCountFailed is returned when TokenCounter.Count returns an error.
	ErrTokenCountFailed = errors.New("contexty: token counting failed")

	// ErrInvalidCharsPerToken is returned by CharFallbackCounter when
	// CharsPerToken is zero or negative.
	ErrInvalidCharsPerToken = errors.New("contexty: CharsPerToken must be positive")

	// ErrNilStrategy is returned by Compile when a MemoryBlock has a nil Strategy.
	ErrNilStrategy = errors.New("contexty: block has nil eviction strategy")

	// ErrStrategyExceededBudget is returned by Compile when an EvictionStrategy.Apply
	// returns messages whose token count exceeds the remaining budget (contract violation).
	ErrStrategyExceededBudget = errors.New("contexty: strategy returned output exceeding remaining budget")
)

Sentinel errors for typical contexty failure modes. Use errors.Is to check for these in calling code.

Functions

This section is empty.

Types

type AllocatorConfig

type AllocatorConfig struct {
	MaxTokens    int          // Total token budget (must be > 0)
	TokenCounter TokenCounter // Required; used by Compile
}

AllocatorConfig configures the token budget and how to count tokens.

type Builder

type Builder struct {
	// contains filtered or unexported fields
}

Builder collects memory blocks and compiles them into a single message slice within the token budget. A Builder can be reused: call AddBlock and Compile multiple times. Each Compile uses the current list of blocks (blocks are not cleared after Compile). For a fresh compile, create a new Builder.

func NewBuilder

func NewBuilder(cfg AllocatorConfig) *Builder

NewBuilder returns a new Builder with the given config. Config is not validated until Compile.

Example
package main

import (
	"context"
	"fmt"

	"github.com/skosovsky/contexty"
)

func main() {
	counter := &contexty.CharFallbackCounter{CharsPerToken: 4}
	builder := contexty.NewBuilder(contexty.AllocatorConfig{
		MaxTokens:    100,
		TokenCounter: counter,
	})
	builder.AddBlock(contexty.MemoryBlock{
		ID: "sys", Tier: contexty.TierSystem, Strategy: contexty.NewStrictStrategy(),
		Messages: []contexty.Message{contexty.TextMessage("system", "You are helpful.")},
	})
	msgs, report, err := builder.Compile(context.Background())
	if err != nil {
		return
	}
	fmt.Printf("messages: %d, tokens: %d\n", len(msgs), report.TotalTokensUsed)
}
Output:

messages: 1, tokens: 4

func (*Builder) AddBlock

func (b *Builder) AddBlock(block MemoryBlock) *Builder

AddBlock appends a block and returns the builder for chaining.

func (*Builder) Compile

func (b *Builder) Compile(ctx context.Context) ([]Message, CompileReport, error)

Compile assembles all blocks into a single []Message that fits within MaxTokens. Blocks are processed in Tier order (stable sort); within the same Tier, insertion order is kept. Returns the final messages, a report, and an error (e.g. invalid config or StrictStrategy overflow). Compile can be called multiple times on the same Builder; each call uses the current blocks.

Example
package main

import (
	"context"
	"fmt"

	"github.com/skosovsky/contexty"
)

func main() {
	counter := &contexty.CharFallbackCounter{CharsPerToken: 4}
	b := contexty.NewBuilder(contexty.AllocatorConfig{MaxTokens: 50, TokenCounter: counter})
	b.AddBlock(contexty.MemoryBlock{
		ID: "core", Tier: contexty.TierCore, Strategy: contexty.NewDropStrategy(),
		Messages: []contexty.Message{contexty.TextMessage("system", "User: Alice")},
	})
	b.AddBlock(contexty.MemoryBlock{
		ID: "history", Tier: contexty.TierHistory, Strategy: contexty.NewTruncateOldestStrategy(),
		Messages: []contexty.Message{
			contexty.TextMessage("user", "Hi"),
			contexty.TextMessage("assistant", "Hello!"),
		},
	})
	msgs, report, err := b.Compile(context.Background())
	if err != nil {
		return
	}
	fmt.Printf("msgs=%d evictions=%v\n", len(msgs), report.Evictions)
}
Output:

msgs=3 evictions=map[]

type CharFallbackCounter

type CharFallbackCounter struct {
	// CharsPerToken is the character-to-token ratio (e.g. 4 for English).
	// Must be positive.
	CharsPerToken int
	// TokensPerNonTextPart is the weight for content parts with Type != "text"
	// (e.g. image_url). Zero means use DefaultTokensPerNonTextPart.
	TokensPerNonTextPart int
	// EstimateTool is optional; when set, used for each ToolCall instead of rune-based fallback.
	EstimateTool ToolCallEstimator
}

CharFallbackCounter approximates token count by dividing character count by a configurable ratio. It does not use a real tokenizer (BPE/tiktoken). Suitable for prototyping and environments where exact counting is not critical. For production, inject a model-specific TokenCounter (e.g. tiktoken).

func (*CharFallbackCounter) Count

func (c *CharFallbackCounter) Count(ctx context.Context, msgs []Message) (int, error)

Count returns the estimated token count for all messages. Text from ContentPart (Type "text") is measured in runes; non-text parts use a constant weight. ToolCalls: if EstimateTool is set, its result is summed; otherwise runes of Arguments+Name are used. Returns ErrInvalidCharsPerToken if CharsPerToken <= 0.

type CompileReport

type CompileReport struct {
	TotalTokensUsed        int               // Total tokens in the final result
	OriginalTokens         int               // Total tokens before eviction (all blocks considered)
	RemainingTokens        int               // MaxTokens minus TotalTokensUsed after compile
	OriginalTokensPerBlock map[string]int    // Block ID -> tokens before eviction (before strategy applied)
	TokensPerBlock         map[string]int    // Block ID -> tokens used in output
	Evictions              map[string]string // Block ID -> strategy applied ("rejected", "dropped", "truncated", "summarized")
	BlocksDropped          []string          // IDs of blocks completely removed (may contain duplicates if multiple blocks shared the same ID)
}

CompileReport describes what happened during Compile: token usage and evictions.

type ContentPart added in v0.2.0

type ContentPart struct {
	Type     string    // "text", "image_url", or provider-specific
	Text     string    `json:"text,omitempty"`
	ImageURL *ImageURL `json:"image_url,omitempty"`
}

ContentPart represents a single part of message content (text or image). Type is not validated in core; typical values are "text", "image_url".

type EvictionStrategy

type EvictionStrategy interface {
	// Apply returns a subset of msgs that fits within limit tokens, or an error.
	// originalTokens is the token count of msgs (from counter.Count(ctx, msgs)); use it to avoid re-counting.
	// Returned messages must have total token count <= limit; Compile enforces this.
	Apply(ctx context.Context, msgs []Message, originalTokens int, limit int, counter TokenCounter) ([]Message, error)
}

EvictionStrategy defines how to shrink or trim a block to fit the remaining budget. Each MemoryBlock has its own strategy (strict, drop, truncate, summarize).

Apply receives originalTokens (pre-counted by Builder) so that implementations do not have to count the input again; implementations must return messages whose total token count <= limit. Compile re-counts the output and returns ErrStrategyExceededBudget if the contract is violated.

func NewDropStrategy

func NewDropStrategy() EvictionStrategy

NewDropStrategy returns a strategy that drops the block entirely when it exceeds the limit. Use for RAG or other optional blocks where partial content is worse than none.

func NewStrictStrategy

func NewStrictStrategy() EvictionStrategy

NewStrictStrategy returns a strategy that fails with ErrBudgetExceeded when the block exceeds the limit. Use for TierSystem and other blocks that must never be evicted.

func NewSummarizeStrategy

func NewSummarizeStrategy(summarizer Summarizer) EvictionStrategy

NewSummarizeStrategy returns a strategy that calls the given Summarizer when the block exceeds the limit. If the summary still does not fit, the block is dropped (empty result). Panics if summarizer is nil (programmer error at init time).

func NewTruncateOldestStrategy

func NewTruncateOldestStrategy(opts ...TruncateOption) EvictionStrategy

NewTruncateOldestStrategy returns a strategy that truncates from the oldest messages. Options: KeepUserAssistantPairs, MinMessages, ProtectRole.

type FactExtractor

type FactExtractor interface {
	Extract(ctx context.Context, history []Message) ([]string, error)
}

FactExtractor analyzes conversation history and extracts new long-term facts. This interface is reserved for v2; the allocator does not use it yet. Implementations typically call an LLM with a prompt like "Extract new facts about the user."

TODO: v2 — integrate with TierCore updates and diffing utilities.

type FixedCounter

type FixedCounter struct {
	// TokensPerMessage is the base weight per message (always applied).
	TokensPerMessage int
	// TokensPerContentPart is added for each ContentPart in a message (0 = not used).
	TokensPerContentPart int
	// TokensPerToolCall is added for each ToolCall in a message (0 = not used).
	TokensPerToolCall int
}

FixedCounter returns a token count derived from message structure for testing. Enables realistic eviction tests: removing one "heavy" message frees many tokens.

func (*FixedCounter) Count

func (c *FixedCounter) Count(ctx context.Context, msgs []Message) (int, error)

Count returns the sum over msgs of (base + len(Content)*TokensPerContentPart + len(ToolCalls)*TokensPerToolCall).
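
The formula above is easy to check by hand. The sketch below reproduces it with hypothetical local types (message, fixedCounter) that track only the number of content parts and tool calls, using the weights 10, 5, and 20.

```go
package main

import "fmt"

// message is a local stand-in that records only what the formula needs.
type message struct {
	parts     int // number of content parts
	toolCalls int // number of tool calls
}

// fixedCounter mirrors the three documented weights; it is a local stand-in.
type fixedCounter struct {
	TokensPerMessage, TokensPerContentPart, TokensPerToolCall int
}

// count sums base + parts*perPart + toolCalls*perToolCall over all messages.
func (c fixedCounter) count(msgs []message) int {
	total := 0
	for _, m := range msgs {
		total += c.TokensPerMessage +
			m.parts*c.TokensPerContentPart +
			m.toolCalls*c.TokensPerToolCall
	}
	return total
}

func main() {
	c := fixedCounter{TokensPerMessage: 10, TokensPerContentPart: 5, TokensPerToolCall: 20}
	light := message{parts: 1}               // 10 + 1*5 = 15
	heavy := message{parts: 2, toolCalls: 1} // 10 + 2*5 + 1*20 = 40
	fmt.Println(c.count([]message{light}), c.count([]message{heavy}), c.count([]message{light, heavy}))
	// prints "15 40 55"
}
```

Removing the single heavy message frees 40 tokens while removing a light one frees 15, which is what makes eviction tests with this counter meaningful.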

type FunctionCall added in v0.2.0

type FunctionCall struct {
	Name      string `json:"name"`
	Arguments string `json:"arguments"`
}

FunctionCall holds function name and arguments (JSON string; not validated in core).

type ImageURL added in v0.2.0

type ImageURL struct {
	URL    string `json:"url"`
	Detail string `json:"detail,omitempty"` // e.g. "low", "high"
}

ImageURL holds URL and optional detail level for image content. No URL validation or network checks in core.

type MemoryBlock

type MemoryBlock struct {
	ID           string
	Messages     []Message
	Tier         Tier
	Strategy     EvictionStrategy
	MaxTokens    int    // Optional: hard per-block token limit (0 = no limit)
	CacheControl string // Optional: caching rules for the block
}

MemoryBlock is a logical group of messages with a Tier and an EvictionStrategy. ID is used in CompileReport; empty ID is allowed. MaxTokens is optional: when > 0 and less than the remaining global budget, Apply receives this value as the limit so the block is capped locally (e.g. RAG block limited to 200 tokens). CacheControl is for provider-specific prompt caching (e.g. Anthropic/Gemini); not interpreted in core.

type Message

type Message struct {
	Role       string
	Content    []ContentPart // Always slice; text-only = one part with Type "text"
	Name       string        // Optional: function name for tool messages
	ToolCalls  []ToolCall
	ToolCallID string
	Metadata   map[string]any
}

Message is the minimal unit of context: a single chat turn with role and content. v2: Content is always []ContentPart; use TextMessage/MultipartMessage helpers. ToolCalls and Metadata support agents and prompt caching; no validation in core.

func InjectIntoSystem

func InjectIntoSystem(systemMsg Message, blocks ...Message) Message

InjectIntoSystem merges auxiliary text blocks into a single system message using XML tags for structured separation. Only text parts (Type "text") are included; other part types (e.g. image_url, audio) are safely ignored to avoid embedding large or binary content. Content is XML-escaped to prevent injection. If blocks is empty, returns systemMsg unchanged.

Example
package main

import (
	"fmt"

	"github.com/skosovsky/contexty"
)

func main() {
	sys := contexty.TextMessage("system", "You are a doctor.")
	got := contexty.InjectIntoSystem(sys,
		contexty.Message{Content: []contexty.ContentPart{{Type: "text", Text: "Patient has fever."}}},
		contexty.Message{Content: []contexty.ContentPart{{Type: "text", Text: "Allergies: none."}}},
	)
	fmt.Println(got.Role)
	fmt.Println(len(got.Content) > 0 && len(got.Content[0].Text) > 0 && got.Content[0].Text[0] == 'Y')
}
Output:

system
true

func MultipartMessage added in v0.2.0

func MultipartMessage(role string, parts ...ContentPart) Message

MultipartMessage creates a message with multiple content parts (text, images, etc.).

func TextMessage added in v0.2.0

func TextMessage(role, text string) Message

TextMessage creates a simple text-only message (single ContentPart with Type "text").

type Summarizer

type Summarizer interface {
	Summarize(ctx context.Context, msgs []Message) (Message, error)
}

Summarizer compresses a slice of messages into a single summary message. Typically implemented via a cheap/fast LLM call; used by SummarizeStrategy.

type Tier

type Tier int

Tier is the priority level of a memory block (lower number = higher priority). The type is int so callers can define custom tiers (e.g. Tier(10) for debug logs). Built-in constants cover typical use cases but the set is not closed.

const (
	// TierSystem is for immutable instructions (persona, rules). Never evicted; error if it doesn't fit.
	TierSystem Tier = 0
	// TierCore is for pinned facts (user name, preferences).
	TierCore Tier = 1
	// TierRAG is for external knowledge (episodic retrieval).
	TierRAG Tier = 2
	// TierHistory is for conversation history (working memory).
	TierHistory Tier = 3
	// TierScratchpad is for temporary reasoning and tool call logs.
	TierScratchpad Tier = 4
)

func (Tier) String

func (t Tier) String() string

String returns the tier name for built-in constants, or "Tier(N)" for custom values.

type TokenCounter

type TokenCounter interface {
	Count(ctx context.Context, msgs []Message) (int, error)
}

TokenCounter counts tokens for a slice of messages. The library does not implement real tokenization; the caller injects an implementation. Count must account for message structure (role, content parts, tool calls) and any per-message overhead; no validation of content types or URLs in core. The context is passed from Compile and may be used for cancellation or timeouts (e.g. when counting involves a network call to a tokenization service).

type ToolCall added in v0.2.0

type ToolCall struct {
	ID       string       `json:"id"`
	Type     string       `json:"type"` // typically "function"
	Function FunctionCall `json:"function"`
}

ToolCall represents a tool/function call in agent messages.

type ToolCallEstimator added in v0.2.1

type ToolCallEstimator func(call ToolCall) int

ToolCallEstimator returns the token weight of a single tool call. When non-nil in CharFallbackCounter, it is used for ToolCalls instead of rune-based fallback.

type TruncateOption

type TruncateOption func(*truncateConfig)

TruncateOption configures TruncateOldestStrategy behavior.

func KeepUserAssistantPairs

func KeepUserAssistantPairs(keep bool) TruncateOption

KeepUserAssistantPairs ensures messages are removed in user-assistant pairs from the start, so that dialog coherence is preserved (no orphan user or assistant).

func MinMessages

func MinMessages(n int) TruncateOption

MinMessages sets the minimum number of messages to keep after truncation. If the remaining budget cannot fit at least MinMessages messages, the block is dropped entirely (empty result).

func ProtectRole added in v0.2.2

func ProtectRole(role string) TruncateOption

ProtectRole marks a role so that messages with this role are never removed when truncating. The first removable message (or user+assistant pair when KeepUserAssistantPairs is set) is removed instead. Duplicate roles are not added; the config stays deduplicated.

Directories

Path | Synopsis
---|---
examples/full_assembly (command) | Full-assembly example: builds a multi-tier context (system, core, RAG, history) and compiles it within a token budget.
