class ion7.core.Vocab
Functions
Vocab.new
Wrap a raw `llama_vocab*` returned by `llama_model_get_vocab`. The caller MUST keep the parent Model alive at least as long as the Vocab; we anchor it as `self._model_ref` for exactly that purpose. The chat-templates handle is initialised eagerly. If the bridge shared library were missing, the `require` of `ion7.core.ffi.bridge` would already have failed at module load; here we still degrade gracefully by checking the call result and leaving `_tmpls = nil` when initialisation fails (e.g. for a model with no embedded template). Calls to `apply_template` then raise a clear error.
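A minimal sketch of the ownership pattern described above; it is not the actual ion7 source, and `lib` (the llama C namespace), the `_ptr` fields and `init_templates` are illustrative assumptions.

```lua
-- Sketch only: `lib`, the `_ptr` fields and `init_templates` are assumed names;
-- llama_model_get_vocab is the real llama.cpp entry point.
local Vocab = {}
Vocab.__index = Vocab

function Vocab.new(model)
  local self = setmetatable({}, Vocab)
  self._ptr = lib.llama_model_get_vocab(model._ptr)
  -- Anchor the parent Model: the llama_model owns the llama_vocab memory,
  -- so it must outlive this wrapper. Holding a reference prevents the GC
  -- from collecting (and freeing) it first.
  self._model_ref = model
  -- Eager chat-templates initialisation; `init_templates` stands in for the
  -- real bridge call and is assumed to return nil on failure, e.g. when the
  -- model embeds no template.
  self._tmpls = init_templates(self._ptr)
  return self
end
```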
Vocab:n_vocab
Vocab:n_tokens
Backwards-compatible alias for `n_vocab`.
Vocab:type
Vocab:tokenize
Tokenise UTF-8 `text` into an int32 cdata array. `llama_tokenize` returns the negated required size when our buffer is too small; we honour that contract by reallocating once and retrying. The initial heuristic (1 token per byte + 16 specials) is a generous overestimate that almost always avoids the retry.
raises — When tokenisation fails after the retry.
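A hedged sketch of that negated-size contract; `lib`, `self._ptr`, the flag defaults and the `(buffer, count)` return shape are assumptions rather than the real ion7 signature.

```lua
local ffi = require("ffi")

-- Sketch only: llama_tokenize returns the token count on success and the
-- negated required count when the destination buffer is too small.
function Vocab:tokenize(text, add_special, parse_special)
  local cap = #text + 16                          -- 1 token per byte + 16 specials
  local buf = ffi.new("int32_t[?]", cap)
  local n = lib.llama_tokenize(self._ptr, text, #text, buf, cap,
                               add_special == true, parse_special == true)
  if n < 0 then
    cap = -n                                      -- exact size reported by llama.cpp
    buf = ffi.new("int32_t[?]", cap)
    n = lib.llama_tokenize(self._ptr, text, #text, buf, cap,
                           add_special == true, parse_special == true)
  end
  if n < 0 then
    error("tokenization failed after retry")
  end
  return buf, n
end
```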
Vocab:detokenize
Detokenise back into a UTF-8 Lua string. Uses the pre-allocated `_dtok_buf` (64 KB). For exceptionally long outputs we retry once with a heap-allocated buffer sized to the exact requested capacity.
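A sketch that follows the description; `lib`, `self._ptr`, `self._dtok_buf` and the boolean flags are assumptions taken from the text above, not verified against the ion7 source.

```lua
local ffi = require("ffi")
local DTOK_CAP = 64 * 1024                        -- matches the 64 KB buffer described above

-- Sketch only: llama_detokenize returns the number of bytes written, or the
-- negated required capacity when the buffer is too small.
function Vocab:detokenize(tokens, n_tokens)
  local written = lib.llama_detokenize(self._ptr, tokens, n_tokens,
                                       self._dtok_buf, DTOK_CAP, false, false)
  if written < 0 then
    -- Exceptionally long output: retry once with a heap buffer of exactly
    -- the requested capacity.
    local cap = -written
    local big = ffi.new("char[?]", cap)
    written = lib.llama_detokenize(self._ptr, tokens, n_tokens, big, cap, false, false)
    return ffi.string(big, written)
  end
  return ffi.string(self._dtok_buf, written)
end
```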
Vocab:piece
Convert a single token to its visible text piece. Memoised: subsequent calls with the same token id return a cached Lua string.
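A memoisation sketch; the cache field, buffer size and flag values are assumptions, while `llama_token_to_piece` is the real llama.cpp call.

```lua
local ffi = require("ffi")
local PIECE_CAP = 256
local piece_buf = ffi.new("char[?]", PIECE_CAP)

-- Sketch only: cache the converted piece so repeat lookups are a plain
-- Lua table access.
function Vocab:piece(tok)
  self._piece_cache = self._piece_cache or {}
  local hit = self._piece_cache[tok]
  if hit then return hit end
  local n = lib.llama_token_to_piece(self._ptr, tok, piece_buf, PIECE_CAP,
                                     0, true)     -- lstrip = 0, render special tokens
  if n < 0 then error("piece buffer too small") end
  local s = ffi.string(piece_buf, n)
  self._piece_cache[tok] = s
  return s
end
```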
Vocab:text
Raw text representation of a token (no lstrip / SPM whitespace normalisation). Empty string if the token has no text view.
Vocab:score
Float score of a token (only meaningful for SPM/Unigram vocabs).
Vocab:attr
Attribute bitmask for a token (`LLAMA_TOKEN_ATTR_*`).
Vocab:is_eog
True for end-of-generation tokens (EOS, EOT, custom STOP, ...).
Vocab:is_control
True for control / special tokens.
Vocab:bos
Beginning-of-sequence token id, or `-1` if the model has none.
Vocab:eos
End-of-sequence token id, or `-1` if absent.
Vocab:eot
End-of-turn token id (chat models).
Vocab:nl
Newline token id, or `-1`.
Vocab:pad
Padding token id.
Vocab:sep
Sentence-separator token id.
Vocab:mask
Mask token id (for masked-LM models).
Vocab:fim_pre
Fill-in-the-Middle prefix token id. `-1` when the model has no FIM.
Vocab:fim_suf
FIM suffix token id.
Vocab:fim_mid
FIM middle token id.
Vocab:fim_pad
FIM padding token id.
Vocab:fim_rep
FIM repository-marker token id.
Vocab:fim_sep
FIM separator token id.
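To show how the six accessors above fit together, here is a hypothetical prefix-suffix-middle (PSM) infill skeleton; the layout is the common llama.cpp convention, not something the ion7 API enforces, and real code would splice the tokenised prefix/suffix text into the gaps.

```lua
-- Hypothetical PSM layout built from the FIM accessors documented above.
local function fim_skeleton(vocab)
  if vocab:fim_pre() == -1 then
    return nil, "model has no FIM tokens"
  end
  return {
    vocab:fim_pre(),   -- e.g. <|fim_prefix|>
    -- ...tokens of the code before the cursor go here...
    vocab:fim_suf(),   -- e.g. <|fim_suffix|>
    -- ...tokens of the code after the cursor go here...
    vocab:fim_mid(),   -- e.g. <|fim_middle|>; generation fills in from here
  }
end
```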
Vocab:get_add_bos
True if the vocab automatically prepends BOS during tokenisation.
Vocab:get_add_eos
True if the vocab automatically appends EOS during tokenisation.
Vocab:get_add_sep
True if the vocab automatically appends a sentence separator.
Vocab:builtin_templates
Names of the chat templates llama.cpp ships with (separate from the per-model template stored inside the GGUF). Useful for debugging / introspection; production code should generally rely on `apply_template`, which uses the model's own template.
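A small introspection example, assuming the method returns a plain Lua table of template names:

```lua
-- Assumption: builtin_templates() returns an array-like Lua table of strings.
for _, name in ipairs(vocab:builtin_templates()) do
  print(name)   -- e.g. "chatml", "llama3", ... (the set depends on the llama.cpp build)
end
```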
Vocab:apply_template
Apply the model's embedded Jinja2 chat template to a sequence of messages. Returns the formatted prompt ready to feed to `Vocab:tokenize`. The returned string excludes the trailing NUL byte; that is what `needed - 1` computes, since `ion7_chat_templates_apply` reports the byte count INCLUDING the NUL.
raises — When the template engine fails (missing `_tmpls` or runtime error).
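A usage sketch; the `role`/`content` message shape and the tokenize return convention are assumptions about the Lua-side API, mirroring the usual llama.cpp chat format.

```lua
-- Assumed message shape: an array of { role = ..., content = ... } tables.
local messages = {
  { role = "system", content = "You are a helpful assistant." },
  { role = "user",   content = "Summarise the plot of Hamlet in one sentence." },
}

local prompt = vocab:apply_template(messages)            -- formatted, NUL-free prompt string
local tokens, n = vocab:tokenize(prompt, true, false)    -- return shape assumed
```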
Vocab:supports_thinking
True if the embedded template recognises `enable_thinking` (Qwen3, DeepSeek-R1 et al.).