class
ion7.core.Speculative
_ptr
cdata
`ion7_speculative_t*` (auto-freed via ffi.gc).
_n_draft
integer
Per-step max draft tokens.
_draft_buf
cdata
Reusable buffer for `:draft` output.
_ctx_buf
cdata
Reusable token-history buffer (grows on demand).
_ctx_buf_sz
integer
Current capacity of `_ctx_buf` in tokens.
Functions
Speculative.new
Speculative:begin
Speculative:draft
Speculative:accept
Speculative:stats
Speculative:free
Speculative.new
Create a speculative-decoding engine.
Speculative.new(ctx_tgt, ctx_dft, opts)
ctx_tgt (cdata|ion7.core.Context): Target context (mandatory).
ctx_dft (cdata|ion7.core.Context|nil): Draft context, only for `DRAFT`.
opts (table?)
→ ion7.core.Speculative
raises: when `ion7_speculative_init` returns NULL.
Speculative:begin
(Re)initialise the engine with the current prompt history. Pass an empty table for a fresh conversation; the cache warms up on subsequent `:draft` calls.
Speculative:begin(tokens)
tokens (integer[]): 1-based Lua array of token ids.
Speculative:draft
Generate up to `n_draft` predicted tokens for the next step.
Speculative:draft(tokens, last_tok)
tokens (integer[]): All generated tokens so far (1-based).
last_tok (integer): The most recently committed token id.
→ integer[]: 1-based array of draft tokens (possibly empty).
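The sketch below shows how `:begin`, `:draft` and `:accept` might fit together in one decoding loop. It does not use the real library: `StubSpec` is a hypothetical stand-in that only mirrors the documented method names, and `target_next` is a hypothetical toy replacement for the target model. Whether `tokens` should include `last_tok` is not stated above; this sketch includes it. Committing one extra target token per step is also an assumption, borrowed from the usual speculative-decoding scheme.

```lua
-- Hypothetical stub with the documented method names; a real engine would
-- come from Speculative.new(ctx_tgt, ctx_dft, opts).
local StubSpec = {}
StubSpec.__index = StubSpec

function StubSpec.new(n_draft)
  return setmetatable({ _n_draft = n_draft, accepted = {} }, StubSpec)
end

function StubSpec:begin(tokens)
  self.n_history = #tokens
end

-- Toy predictor: always guesses last_tok + 1, last_tok + 2, ...
function StubSpec:draft(tokens, last_tok)
  local out = {}
  for i = 1, self._n_draft do out[i] = last_tok + i end
  return out
end

function StubSpec:accept(n_accepted)
  self.accepted[#self.accepted + 1] = n_accepted
end

-- Toy "target model": counts upward but skips multiples of 4.
local function target_next(tok)
  local nxt = tok + 1
  if nxt % 4 == 0 then nxt = nxt + 1 end
  return nxt
end

local spec = StubSpec.new(3)
local tokens = { 1, 2 }            -- prompt history
spec:begin(tokens)
local last_tok = tokens[#tokens]

for _ = 1, 3 do
  local draft = spec:draft(tokens, last_tok)

  -- Keep the longest prefix of the draft that the target agrees with.
  local n_accepted, prev = 0, last_tok
  for i = 1, #draft do
    if draft[i] ~= target_next(prev) then break end
    n_accepted = n_accepted + 1
    prev = draft[i]
    tokens[#tokens + 1] = prev
  end

  -- The target then contributes one token of its own.
  last_tok = target_next(prev)
  tokens[#tokens + 1] = last_tok

  spec:accept(n_accepted)          -- reported every step, even when 0
end

print(table.concat(tokens, " "))         --> 1 2 3 5 6 7 9 10 11 13
print(table.concat(spec.accepted, ","))  --> 1,2,2
```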
Speculative:accept
Tell the engine how many consecutive draft tokens the target model accepted. Call it after every speculative step, even when `n_accepted` is `0`.
Speculative:accept(n_accepted)
n_accepted (integer): Number of consecutive draft tokens accepted (may be `0`).
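One way to obtain `n_accepted` is to count the leading draft tokens that match what the target model produced. `count_accepted` below is a hypothetical helper, not part of `ion7.core`, and it assumes the target's tokens are available as a 1-based array aligned with the draft.

```lua
-- Hypothetical helper: length of the matching prefix between the draft
-- and the tokens the target model actually chose.
local function count_accepted(draft, target)
  local n = 0
  for i = 1, #draft do
    if draft[i] ~= target[i] then break end
    n = n + 1
  end
  return n
end

print(count_accepted({ 11, 12, 13 }, { 11, 12, 99 }))  --> 2
print(count_accepted({}, { 11 }))                      --> 0
```

Counting stops at the first mismatch, so a later token that happens to match again is not included.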
Speculative:stats
Print acceptance-rate / effective-speedup stats to stderr.
Speculative:stats()
Speculative:free
Explicit release. Idempotent. Disarms the GC finalizer first to avoid a double-free if the GC runs later.
Speculative:free()
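The disarm-then-free pattern described above could look roughly like this. The sketch needs LuaJIT and is not the library's actual code: `Handle` is a hypothetical class, and C `malloc`/`free` stand in for the engine's real allocation and release functions.

```lua
local ffi = require("ffi")

ffi.cdef[[
void *malloc(size_t size);
void free(void *ptr);
]]

-- Hypothetical wrapper around a C allocation with a GC finalizer attached.
local Handle = {}
Handle.__index = Handle

function Handle.new()
  local ptr = ffi.C.malloc(16)
  assert(ptr ~= nil, "malloc failed")
  -- ffi.gc arms the finalizer: the memory is freed if the cdata is collected.
  return setmetatable({ _ptr = ffi.gc(ptr, ffi.C.free) }, Handle)
end

function Handle:free()
  if self._ptr == nil then return end  -- already released: no-op
  ffi.gc(self._ptr, nil)               -- disarm the finalizer first
  ffi.C.free(self._ptr)                -- then release explicitly
  self._ptr = nil
end

local h = Handle.new()
h:free()
h:free()  -- second call does nothing
```

Clearing `_ptr` after the release is what makes the second call safe; removing the finalizer beforehand ensures the collector never frees the same pointer again.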