class ion7.core.Speculative

_ptr cdata `ion7_speculative_t*` (auto-freed via ffi.gc).

_n_draft integer Per-step max draft tokens.

_draft_buf cdata Reusable buffer for `:draft` output.

_ctx_buf cdata Reusable token-history buffer (grows on demand).

_ctx_buf_sz integer Current capacity of `_ctx_buf` in tokens.
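The two `_ctx_buf` fields describe a grow-on-demand buffer. As a rough illustration of that pattern (not the library's actual code), the sketch below reallocates only when the requested size exceeds the current capacity, then copies a 1-based Lua array into 0-based C storage. The `int32_t` element type, the doubling policy, the 64-token minimum, and the helper names `ensure_capacity` and `fill` are all assumptions made for this example; it requires LuaJIT's `ffi` module.

```lua
local ffi = require("ffi")  -- LuaJIT only

-- Hypothetical state table mirroring the documented field names.
local state = { _ctx_buf = nil, _ctx_buf_sz = 0 }

-- Return a buffer that can hold at least n tokens, reallocating only
-- when the current capacity is too small (assumed policy: double, min 64).
local function ensure_capacity(self, n)
  if n <= self._ctx_buf_sz then
    return self._ctx_buf
  end
  local cap = math.max(n, self._ctx_buf_sz * 2, 64)
  self._ctx_buf = ffi.new("int32_t[?]", cap)  -- old contents are not kept
  self._ctx_buf_sz = cap
  return self._ctx_buf
end

-- Copy a 1-based Lua array of token ids into the 0-based C buffer.
local function fill(self, tokens)
  local buf = ensure_capacity(self, #tokens)
  for i = 1, #tokens do
    buf[i - 1] = tokens[i]
  end
  return buf, #tokens
end

local buf, n = fill(state, { 10, 20, 30 })
local cap_after_small = state._ctx_buf_sz

-- A longer history forces one reallocation.
local long = {}
for i = 1, 100 do long[i] = i end
fill(state, long)
local cap_after_long = state._ctx_buf_sz
```

Because the buffer is refilled from the Lua table on every call, this sketch does not copy the old contents when it grows; an implementation that appends incrementally would need to.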

Functions

Speculative.new, Speculative:begin, Speculative:draft, Speculative:accept, Speculative:stats, Speculative:free

Speculative.new

Create a speculative-decoding engine.

Speculative.new(ctx_tgt, ctx_dft, opts)

ctx_tgt cdata|ion7.core.Context Target context (mandatory).

ctx_dft cdata|ion7.core.Context|nil Draft context, only for `DRAFT`.

opts table?

→ ion7.core.Speculative

raises — When `ion7_speculative_init` returns NULL.

Speculative:begin

(Re)initialise the engine with the current prompt history. For a fresh conversation, pass an empty table; the cache warms up on subsequent `:draft` calls.

Speculative:begin(tokens)

tokens integer[] 1-based Lua array of token ids.

Speculative:draft

Generate up to `n_draft` predicted tokens for the next step.

Speculative:draft(tokens, last_tok)

tokens integer[] All generated tokens so far (1-based).

last_tok integer The most recently committed token id.

→ integer[] 1-based array of draft tokens (possibly empty).

Speculative:accept

Tell the engine how many consecutive draft tokens the target model accepted. Call after every speculative step, even when `0`.

Speculative:accept(n_accepted)

n_accepted integer
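`begin`, `draft` and `accept` are meant to be called in a loop around the target model's verification pass. A minimal sketch of that control flow follows. `MockSpec`, `target_next`, `verify` and `generate` are hypothetical stand-ins, not part of ion7.core, so the loop can run without a model; with the real engine, `spec` would come from `Speculative.new` and `verify` would be a target-model decode. This reference does not say whether `tokens` should include `last_tok`; the mock ignores `tokens`, so the sketch takes no position on that.

```lua
-- Hypothetical stand-in exposing the documented begin/draft/accept methods.
-- It "predicts" that the sequence keeps counting upward from last_tok.
local MockSpec = {}
MockSpec.__index = MockSpec

function MockSpec.new(n_draft)
  return setmetatable({ n_draft = n_draft, accepted = 0, drafted = 0 }, MockSpec)
end

function MockSpec:begin(tokens)
  self.history = tokens
end

function MockSpec:draft(tokens, last_tok)
  local out = {}
  for i = 1, self.n_draft do out[i] = last_tok + i end
  self.drafted = self.drafted + #out
  return out
end

function MockSpec:accept(n)
  self.accepted = self.accepted + n
end

-- Hypothetical "target model": counts upward but skips multiples of 5.
local function target_next(tok)
  local nxt = tok + 1
  if nxt % 5 == 0 then nxt = nxt + 1 end
  return nxt
end

-- Count the consecutive draft tokens the target agrees with, and return
-- the target's own token for the first position after them.
local function verify(last_tok, draft)
  local n_ok, cur = 0, last_tok
  for _, d in ipairs(draft) do
    if d ~= target_next(cur) then break end
    n_ok, cur = n_ok + 1, d
  end
  return n_ok, target_next(cur)
end

-- Generate at least max_new tokens (a step may overshoot the limit).
local function generate(spec, prompt, max_new)
  local tokens = {}
  for i, t in ipairs(prompt) do tokens[i] = t end
  spec:begin(tokens)
  local last_tok, produced = tokens[#tokens], 0
  while produced < max_new do
    local draft = spec:draft(tokens, last_tok)
    local n_ok, next_tok = verify(last_tok, draft)
    spec:accept(n_ok)  -- reported every step, even when n_ok == 0
    for i = 1, n_ok do tokens[#tokens + 1] = draft[i] end
    tokens[#tokens + 1] = next_tok
    produced = produced + n_ok + 1
    last_tok = next_tok
  end
  return tokens
end

local spec = MockSpec.new(3)
local tokens = generate(spec, { 3 }, 6)
print(table.concat(tokens, ","))  -- prints 3,4,6,7,8,9,11
```

In this run the first step accepts one of three drafted tokens and the second accepts all three, so the mock ends with `accepted == 4` and `drafted == 6`.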

Speculative:stats

Print acceptance-rate / effective-speedup stats to stderr.

Speculative:stats()

Speculative:free

Explicit release. Idempotent. Disarms the GC finalizer first to avoid a double-free if the GC runs later.

Speculative:free()
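The disarm-then-release order described above is a general LuaJIT FFI pattern. The sketch below illustrates it with plain `malloc`/`free` rather than the library's own allocator; `Handle`, `release` and `free_calls` are hypothetical names used only for this example, and the code requires LuaJIT.

```lua
local ffi = require("ffi")  -- LuaJIT only

ffi.cdef[[
void *malloc(size_t size);
void free(void *ptr);
]]

local free_calls = 0

-- Finalizer; counts invocations so a double-free would be visible.
local function release(p)
  free_calls = free_calls + 1
  ffi.C.free(p)
end

local Handle = {}
Handle.__index = Handle

function Handle.new()
  local p = ffi.C.malloc(16)
  assert(p ~= nil, "malloc failed")
  -- Arm the GC finalizer: release(p) runs if the handle is collected.
  return setmetatable({ _ptr = ffi.gc(p, release) }, Handle)
end

function Handle:free()
  local p = self._ptr
  if p == nil then return end  -- already freed: no-op (idempotent)
  self._ptr = nil
  ffi.gc(p, nil)               -- disarm the finalizer first
  release(p)                   -- then release explicitly, exactly once
end

local h = Handle.new()
h:free()
h:free()          -- second call does nothing
collectgarbage()  -- the disarmed finalizer does not run
```

Clearing `_ptr` before releasing also means later method calls can detect a freed handle instead of dereferencing a dangling pointer.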