ion7-llm / pool

module

pool

ion7.llm.pool.Slot

session ion7.llm.Session
sampler ion7.core.Sampler
stop ion7.llm.Stop
thinking ion7.llm.chat.Thinking
tool_sm ion7.llm.chat.tool_stream?
raw_parts string[]
toks integer[]
max_tokens integer
on_chunk function? `(slot, chunk) -> nil`. `chunk` is a decoded text fragment for the newly sampled token(s).
next_tok integer? Token to consume on next tick.
n_generated integer
stop_reason string? Set when the slot terminates.
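The `on_chunk` field above is the per-slot streaming hook. A minimal sketch of wiring it up, assuming the documented `(slot, chunk) -> nil` signature and a `pool:add` call as described below (the stdout sink is illustrative):

```lua
-- Hypothetical usage: attach a streaming callback to a slot.
-- `chunk` is assumed to be decoded text, per the field description above.
local slot = pool:add(session)
slot.on_chunk = function(s, chunk)
  io.write(chunk)  -- stream generated text as it arrives
  io.flush()
end
```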

ion7.llm.Pool

_ctx ion7.core.Context
_vocab ion7.core.Vocab
_cm ion7.llm.kv.ContextManager
_batch cdata
_batch_cap integer
_batch_gc cdata
_slots ion7.llm.pool.Slot[]
_opts table

Functions

Pool.new

Build a pool.

Pool.new(ctx, vocab, cm, opts)
ctx ion7.core.Context
vocab ion7.core.Vocab
cm ion7.llm.kv.ContextManager
opts table?
→ ion7.llm.Pool

Pool:add

Register a session into the pool. Prefills it through the context manager, samples the first token, and queues it for the next tick.

Pool:add(session, opts)
session ion7.llm.Session
opts table?
→ ion7.llm.pool.Slot
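Putting `Pool.new` and `Pool:add` together, a hedged usage sketch. The `Session` module path and constructor arguments are assumptions, and `ctx`, `vocab`, and `cm` are presumed to come from the core and kv layers:

```lua
local Pool    = require("ion7.llm.pool")
local Session = require("ion7.llm.session")  -- assumed module path

-- Build a pool over an existing context, vocab, and context manager.
local pool = Pool.new(ctx, vocab, cm, { max_tokens = 256 })

-- Register two sessions; each is prefilled through the context manager,
-- gets its first token sampled, and is queued for the next tick.
local a = pool:add(Session.new("Explain KV caching."))
local b = pool:add(Session.new("Write a haiku."))
```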

Pool:slots

Snapshot of the slot list. Returns the live array — do not mutate.

Pool:slots()
→ ion7.llm.pool.Slot[]

Pool:n_active

Number of slots that have not yet hit a stop condition.

Pool:n_active()
→ integer

Pool:tick

Execute one parallel decode step. Returns true when at least one slot was processed, false when every slot has already terminated.

Pool:tick()
→ boolean
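The boolean return from `tick` makes a manual drive loop straightforward when you want to interleave decoding with other work instead of calling `Pool:run`. A sketch, assuming the API above:

```lua
-- Step the pool by hand: each call runs one parallel decode step
-- across all live slots and returns false once every slot has stopped.
while pool:tick() do
  -- a decode step completed; other event-loop work could run here
  print(("active slots: %d"):format(pool:n_active()))
end
-- every slot has now hit a stop condition
```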

Pool:run

Drive every slot to completion. Calls `tick` in a loop, then runs `:_finalise` on each slot so `slot.session:last_response()` is ready to read.

Pool:run()
→ ion7.llm.pool.Slot[] (same array `:slots()` returns)
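Since `run` finalises every slot before returning, the responses are immediately readable. A sketch assuming the documented `stop_reason` field and `session:last_response()` accessor:

```lua
-- Drive all slots to completion, then collect results.
local slots = pool:run()
for _, slot in ipairs(slots) do
  print(slot.stop_reason, slot.session:last_response())
end
```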

Pool:reset

Drop every slot. Sessions are NOT released — the caller decides whether to keep them around for the next round of chat.

Pool:reset()

Pool:free

Free the pool's batch immediately. Idempotent.

Pool:free()