global token

The token library provides means to intercept the input and deal with it at the Lua level. It offers a basic scanner infrastructure that can be used to write macros that accept a wide range of arguments. The interface is kept general on purpose, and because performance is quite good, additional parsers can be built without too much overhead. It's up to macro package writers to see how they can benefit from this, as the main principle behind LuaTeX is to provide a minimal set of tools and no solutions. The scanner functions are probably the most intriguing.
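
As a rough illustration of the kind of macro this enables, the sketch below parses optional `width` and `height` keywords, each followed by a dimension, in any order. The macro name `\myrule` and the default values are made up for this example; it assumes plain LuaTeX and relies only on `token.scan_keyword`, `token.scan_dimen` and `tex.sprint`.

```tex
\def\myrule{\directlua{
  % defaults in scaled points (65536sp = 1pt); the values are illustrative
  local width, height = 65536, 65536
  % keep scanning as long as one of the known keywords follows
  while true do
    if token.scan_keyword('width') then
      width = token.scan_dimen()
    elseif token.scan_keyword('height') then
      height = token.scan_dimen()
    else
      break
    end
  end
  % hand the result back to TeX as a rule specification
  tex.sprint('\string\\vrule width ' .. width .. 'sp height ' .. height .. 'sp ')
}}
\myrule width 10pt height 5pt\relax
\bye
```

Comments inside `\directlua` are written with `%` because TeX joins the lines before Lua sees them, so a Lua `--` comment would swallow the rest of the chunk.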

Reference:

😱 Types incomplete or incorrect? 🙏 Please contribute!


methods


token.scan_keyword


function token.scan_keyword(keyword: string) ->  boolean
@param keyword - An ASCII based keyword to scan for.

@return - True if the keyword could be gobbled up, otherwise false.

Scan and gobble a given keyword.

As with the regular TeX keyword scanner this is case insensitive (and ASCII based).

Example:

\def\scanner{\directlua{
  print(token.scan_keyword('keyword'))
}}
\scanner keyword % true
\scanner KEYWORD % true
\scanner not the keyword % false

Reference:

token.scan_keyword_cs


function token.scan_keyword_cs(keyword: string) ->  boolean
@param keyword - A case sensitive and UTF-8 based keyword

@return - True if the case sensitive and UTF-8 based keyword could be gobbled up, otherwise false.

Scan and gobble a given case sensitive and UTF-8 based keyword.

Example:

\def\scanner{\directlua{
  print(token.scan_keyword_cs('Keyword'))
}}
\scanner Keyword % true
\scanner keyword % false

Reference:

token.scan_int


function token.scan_int() ->  integer

Scan and gobble a given integer.

Example:

\def\scanner{\directlua{
  print(token.scan_int())
}}
\scanner 1 % 1
\scanner 1.1 % 1 (Scans only 1 not 1.1)
\scanner -1 % -1
\scanner 1234567890 % 1234567890
\scanner string % Missing number, treated as zero
\scanner 12345678901 % Number too big

Reference:

token.scan_real


function token.scan_real() ->  number

Scan and gobble a floating point number that cannot have an exponent (1E10 is scanned as 1.0).

Example:

\def\scan{\directlua{
  print(token.scan_real())
}}
\scan 1E10 % 1.0 Does not scan “E10“
\scan 1 % 1.0
\scan 1.1 % 1.1
\scan .1 % 0.1
\scan - .1 % -0.1
\scan -1 % -1.0
\scan - 1 % -1.0
\scan 1234567890 % 1234567890.0

Reference:

token.scan_float


function token.scan_float() ->  number

Scan and gobble a floating point number that can be provided with an exponent (e. g. 1E10).

Example:

\def\scan{\directlua{
  print(token.scan_float())
}}
\scan 1E10 % 10000000000.0
\scan .1e-10 % 1e-11
\scan 1 % 1.0
\scan 1.1 % 1.1
\scan .1 % 0.1
\scan - .1 % -0.1
\scan -1 % -1.0
\scan - 1 % -1.0

Reference:

token.scan_dimen


function token.scan_dimen(
  inf: boolean?,
  mu: boolean?
) ->  integer
@param inf - Allow infinite units (fi, fil, fill, filll).

@param mu - Require mu (math units).

Return a number representing a dimension or, when infinite units are allowed, two numbers: the filler and its order.

Example:

Parameter inf:

\directlua{token.scan_dimen(true)}1fi % 1
\directlua{token.scan_dimen(true)}1fil % 2
\directlua{token.scan_dimen(true)}1fill % 3
\directlua{token.scan_dimen(true)}1filll % 4

Parameter mu:

\directlua{token.scan_dimen(false, true)}1mu % 65536
\directlua{token.scan_dimen(false, true)}1cm % Illegal unit of measure (mu inserted).

Reference:

token.scan_glue


function token.scan_glue(mu_units: boolean?) ->  GlueSpecNode {
    width = integer,
    stretch = integer,
    stretch_order = integer,
    shrink = integer,
    shrink_order = integer,
}

Scan a glue specification and return a glue spec node.

Example:

\def\scan{\directlua{
  local node = token.scan_glue()
  print(node.width, node.stretch, node.stretch_order, node.shrink, node.shrink_order)
}}
\def\scanMu{\directlua{
  local node = token.scan_glue(true)
  print(node.width, node.stretch, node.stretch_order, node.shrink, node.shrink_order)
}}
\scan 1pt % 65536 0 0 0 0
\scan 1pt plus 2pt % 65536 131072 0 0 0
\scan 1pt minus 3pt % 65536 0 0 196608 0
\scan 1pt plus 2pt minus 3pt % 65536 131072 0 196608 0
\scan 1pt plus 2fi minus 3fi % 65536 131072 1 196608 1
\scan 1pt plus 2fil minus 3fil % 65536 131072 2 196608 2
\scan 1pt plus 2fill minus 3fill % 65536 131072 3 196608 3
\scan 1pt plus 2filll minus 3filll % 65536 131072 4 196608 4
\scan string % Missing number, treated as zero.
\scanMu 3mu % 196608 0 0 0 0

Reference:

token.scan_toks


function token.scan_toks(
  definer: boolean?,
  expand: boolean?
) ->  Token[]
@param definer - Scan the token list as for a macro definition (\def).

@param expand - Expand the tokens while scanning.

Scan a list of tokens delimited by balanced braces.

Example:

\directlua{
  local t = token.scan_toks()
  for id, tok in ipairs(t) do
    print(id, tok, tok.command, tok.cmdname, tok.csname)
  end
}{Some text}

Reference:

token.scan_code


function token.scan_code(bitset)

Return a character if its category is in the given bitset (representing catcodes)

Reference:

token.scan_string


function token.scan_string() ->  string

@return - A string given between { }, as \macro or as sequence of characters with catcode 11 or 12

Scan and gobble a string.

The string scanner scans for something between curly braces and expands on the way, or when it sees a control sequence it will return its meaning. Otherwise it will scan characters with catcode letter or other.

Example:

\def\scan{\directlua{
  print(token.scan_string())
}}
\def\bar{bar}
\def\foo{\bar}
\scan \foo % bar
\scan {\foo} % bar
\scan {A string} % A string
\scan A string % A
\scan Word1 Word2 % Word1

Reference:

token.scan_argument


function token.scan_argument(expand: boolean?) ->  string
@param expand - When a braced argument is scanned, expansion can be prohibited by passing false (default is true)

Scan and gobble an argument.

This function is similar to token.scan_string but also accepts a \cs. By default it expands the given argument. When a braced argument is scanned, expansion can be prohibited by passing false (default is true). In the case of a control sequence, passing false results in a one-level expansion (the meaning of the macro).

Example:

\def\scan{\directlua{
  print(token.scan_argument(true))
}}
\def\scanNoExpand{\directlua{
  print(token.scan_argument(false))
}}
\def\foo{bar}
\scan \foo % bar
\scan { {\bf text} } % {\fam \bffam \tenbf text}
\scanNoExpand { {\bf text} } % {\bf text}
\scan c % c
\scan \bf % \fam \bffam \tenbf

Reference:

token.scan_word


function token.scan_word()

Return a sequence of characters with catcode 11 or 12 as a string.
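
A minimal sketch of how this could be used, following the pattern of the other scanner examples. The macro name `\scan` is only illustrative; going by the description above, the scan should pick up `Word1` and stop at the space, leaving `Word2` in the input.

```tex
\def\scan{\directlua{
  % prints the run of letter/other characters that follows the macro
  print(token.scan_word())
}}
\scan Word1 Word2
```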

Reference:

token.scan_csname


function token.scan_csname()

Return foo after scanning \foo.
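
The behaviour described above can be tried with a sketch like the following; `\scan` and `\foo` are illustrative names, and the result for `\foo` is the one stated in the description.

```tex
\def\foo{bar}
\def\scan{\directlua{
  % prints the name of the control sequence without the backslash
  print(token.scan_csname())
}}
\scan \foo % foo
```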

Reference:

token.scan_list


function token.scan_list()

Pick up a box specification and return a [h|v]list node.

Reference:

token.get_next


function token.get_next() ->  Token {
    command = integer,
    cmdname = TokenCommandName,
    csname = string?,
    id = integer,
    tok = integer,
    active = boolean,
    expandable = boolean,
    protected = boolean,
    mode = integer,
    index = integer?,
}

Scan and gobble the next token.

The different scanner functions of the token library look for a sequence of tokens. This function scans just the next token.

Reference:

token.scan_token


function token.scan_token() ->  Token {
    command = integer,
    cmdname = TokenCommandName,
    csname = string?,
    id = integer,
    tok = integer,
    active = boolean,
    expandable = boolean,
    protected = boolean,
    mode = integer,
    index = integer?,
}

Scan and gobble the next token, expanding it first. Use scan_token instead of get_next if you want to enforce expansion.
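
The difference between the two functions can be seen in a small sketch. It assumes that `token.scan_token` behaves like TeX's expanding token fetch, so the comments describe the expected behaviour rather than verified output; `\foo` is an illustrative macro.

```tex
\def\foo{bar}
\directlua{
  % get_next: the token for \foo itself is returned, unexpanded
  local t = token.get_next()
  print(t.cmdname, t.csname)
}\foo
\directlua{
  % scan_token: \foo is expanded first, so the token is the first
  % character of its replacement text
  local t = token.scan_token()
  print(t.cmdname, t.mode)
}\foo
```

In the second call only the first token of the expansion is gobbled; the remaining characters stay in the input and are typeset as usual.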

Reference:

token.expand


function token.expand()

Trigger expansion of the next token in the input.

This can be quite unpredictable but when you call it you probably know enough about TeX not to be too worried about that. It basically is a call to the internal expand related function.

Reference:

token.get_command


function token.get_command(t: Token {
    command = integer,
    cmdname = TokenCommandName,
    csname = string?,
    id = integer,
    tok = integer,
    active = boolean,
    expandable = boolean,
    protected = boolean,
    mode = integer,
    index = integer?,
}) -> command integer

@return command - A number representing the internal command number, for example 147.

Return the internal command number.

Reference:

@see Token.command

token.get_cmdname


function token.get_cmdname(t: Token {
    command = integer,
    cmdname = TokenCommandName,
    csname = string?,
    id = integer,
    tok = integer,
    active = boolean,
    expandable = boolean,
    protected = boolean,
    mode = integer,
    index = integer?,
}) -> cmdname TokenCommandName

@return cmdname - The type of the command (for instance the catcode in case of a character, or the classifier that determines the internal treatment), for example letter.

Return the type of the command (for instance the catcode in case of a character, or the classifier that determines the internal treatment), for example letter.

Reference:

@see Token.cmdname

token.get_csname


function token.get_csname(t: Token {
    command = integer,
    cmdname = TokenCommandName,
    csname = string?,
    id = integer,
    tok = integer,
    active = boolean,
    expandable = boolean,
    protected = boolean,
    mode = integer,
    index = integer?,
}) -> csname string?

@return csname - The associated control sequence (if applicable), for example bigskip.

Return the associated control sequence (if applicable), for example bigskip.

Reference:

@see Token.csname

token.get_id


function token.get_id(t: Token {
    command = integer,
    cmdname = TokenCommandName,
    csname = string?,
    id = integer,
    tok = integer,
    active = boolean,
    expandable = boolean,
    protected = boolean,
    mode = integer,
    index = integer?,
}) -> id integer

@return id - The unique id of the token, for example 6876.

Return the unique id of the token.

Reference:

token.get_tok


function token.get_tok(t: Token {
    command = integer,
    cmdname = TokenCommandName,
    csname = string?,
    id = integer,
    tok = integer,
    active = boolean,
    expandable = boolean,
    protected = boolean,
    mode = integer,
    index = integer?,
}) -> tok integer

@return tok - The full token number as stored in TeX, for example 536883863.

Return the full token number as stored in TeX.

Reference:

@see Token.tok

token.get_active


function token.get_active(t: Token {
    command = integer,
    cmdname = TokenCommandName,
    csname = string?,
    id = integer,
    tok = integer,
    active = boolean,
    expandable = boolean,
    protected = boolean,
    mode = integer,
    index = integer?,
}) -> active boolean

@return active - A boolean indicating the active state of the token, for example true.

Return a boolean indicating the active state of the token, for example true.

Reference:

@see Token.active

token.get_expandable


function token.get_expandable(t: Token {
    command = integer,
    cmdname = TokenCommandName,
    csname = string?,
    id = integer,
    tok = integer,
    active = boolean,
    expandable = boolean,
    protected = boolean,
    mode = integer,
    index = integer?,
}) -> expandable boolean

@return expandable - A boolean indicating if the token (macro) is expandable, for example true.

Return a boolean indicating if the token (macro) is expandable.

Reference:

@see Token.expandable

token.get_protected


function token.get_protected(t: Token {
    command = integer,
    cmdname = TokenCommandName,
    csname = string?,
    id = integer,
    tok = integer,
    active = boolean,
    expandable = boolean,
    protected = boolean,
    mode = integer,
    index = integer?,
}) -> protected boolean

@return protected - A boolean indicating if the token (macro) is protected, for example false.

Return a boolean indicating if the token (macro) is protected.

Reference:

@see Token.protected

token.get_mode


function token.get_mode(t: Token {
    command = integer,
    cmdname = TokenCommandName,
    csname = string?,
    id = integer,
    tok = integer,
    active = boolean,
    expandable = boolean,
    protected = boolean,
    mode = integer,
    index = integer?,
}) -> mode integer

@return mode - A number either representing a character or another entity, for example 1007.

Return a number either representing a character or another entity.

Reference:

@see Token.mode

token.get_index


function token.get_index(t: Token {
    command = integer,
    cmdname = TokenCommandName,
    csname = string?,
    id = integer,
    tok = integer,
    active = boolean,
    expandable = boolean,
    protected = boolean,
    mode = integer,
    index = integer?,
}) -> index integer

@return index - A number running from 0x0000 up to 0xFFFF indicating a TeX register index, for example 1007.

Return a number running from 0x0000 up to 0xFFFF indicating a TeX register index.

Reference:

@see Token.index

token.get_macro


function token.get_macro(name: string) ->  string
@param name - The name of the macro without the leading backslash.

@return - for example foo #1 bar.

Get the content of a macro.

Reference:

@see token.set_macro

token.get_meaning


function token.get_meaning(name: string) ->  string
@param name - The name of the macro without the leading backslash.

@return - for example ->foo #1 bar.

Get the meaning of a macro including the argument specification (as usual in TeX separated by ->).
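
To compare token.get_macro and token.get_meaning side by side, a sketch along these lines may help. The macro `\foo` is made up to match the return value examples given above.

```tex
\def\foo#1{foo #1 bar}
\directlua{
  % replacement text only
  print(token.get_macro('foo'))
  % also includes the argument specification before the ->
  print(token.get_meaning('foo'))
}
```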

Reference:

token.commands


function token.commands() ->  table

Ask for a list of commands.

Reference:

token.command_id


function token.command_id(cmdname: TokenCommandName) ->  integer?
@param cmdname - for example letter

Return the id of a token class.
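
A short sketch combining token.command_id with token.commands. It assumes that the table returned by token.commands is indexed by the same ids that token.command_id returns, which is not stated above.

```tex
\directlua{
  local id = token.command_id('letter')
  print(id)
  % assumption: the command list is indexed by command id
  local names = token.commands()
  print(names[id])
}
```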

token.create


function token.create(
  chr: integer,
  cmd: integer?
) ->  Token {
    command = integer,
    cmdname = TokenCommandName,
    csname = string?,
    id = integer,
    tok = integer,
    active = boolean,
    expandable = boolean,
    protected = boolean,
    mode = integer,
    index = integer?,
}

Create a token.

Reference:

@see token.new

token.new


function token.new(
  chr: integer,
  cmd: integer
) ->  Token {
    command = integer,
    cmdname = TokenCommandName,
    csname = string?,
    id = integer,
    tok = integer,
    active = boolean,
    expandable = boolean,
    protected = boolean,
    mode = integer,
    index = integer?,
}

Create a token from a character code and a command code. This is a variant of token.create that ignores the current catcode table.
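
As a tentative illustration, the following sketch builds a letter token and pushes it into the input. It assumes that the command code for letters can be looked up with token.command_id and that a single token may be passed to token.put_next.

```tex
\directlua{
  % look up the command code for letters rather than hard-coding it
  local letter = token.command_id('letter')
  % character code 65 is the letter A
  local t = token.new(65, letter)
  % push the token into the input so that TeX processes it next
  token.put_next(t)
}
```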

Reference:

@see token.create

token.is_defined


function token.is_defined(cs: string) ->  boolean

Check if a control sequence with the given name is defined.

Example:

\def\foo{bar}
\directlua{
  print(token.is_defined('foo')) % true
  print(token.is_defined('nofoo')) % false
  print(token.is_defined('bf')) % true
}

Reference:

token.biggest_char


function token.biggest_char() ->  integer

Return the largest character code that is supported.

Example:

print(token.biggest_char()) -- 1114111

Reference:

token.set_macro


function token.set_macro(
  csname: string,
  content: string?,
  global: "global"?
)

Create a macro.

Example:

\directlua{
  token.set_macro("test", "content")
}
\test
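
The optional third argument makes the definition global. A small sketch, with `\greeting` as an illustrative name, that defines a macro and then inspects it from Lua with token.is_defined and token.get_macro:

```tex
\directlua{
  % define \greeting globally from the Lua end
  token.set_macro('greeting', 'Hello', 'global')
  % check that the control sequence exists and read its content back
  print(token.is_defined('greeting'))
  print(token.get_macro('greeting'))
}
\greeting
```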

Reference:

token.set_macro


function token.set_macro(
  catcodetable: integer,
  csname: string,
  content: string?,
  global: "global"?
)
@param catcodetable - A catcodetable identifier.

Create a macro.

Example:

\directlua{
  token.set_macro("test", "content")
}
\test

Reference:

token.set_char


function token.set_char(
  csname: string,
  number: integer,
  global: "global"?
)

Do a chardef at the Lua end, where invalid assignments are silently ignored.

Example:

\directlua{
  token.set_char('myT', 84)
  token.set_char('mye', 101)
  token.set_char('myX', 88)
}
\myT\mye\myX % TeX

Reference:

token.set_lua


function token.set_lua(
  name: string,
  id: integer,
  ...: ("global"|"protected")
)

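Assign a Lua function, registered in the table returned by lua.get_functions_table, to a control sequence.

Example:

\directlua{
  local t = lua.get_functions_table()
  local index = 1
  while t[index] do
    index = index + 1
  end
  t[index] = function(slot) print(slot) end
  token.set_lua('mycode', index, 'protected', 'global')
}
\mycode
\bye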

Reference:

Corresponding C source code: lnewtokenlib.c#L1168-L1216 (https://gitlab.lisn.upsaclay.fr/texlive/luatex/-/blob/f52b099f3e01d53dc03b315e1909245c3d5418d3/source/texk/web2c/luatexdir/lua/lnewtokenlib.c#L1168-L1216)

@see lua.get_functions_table

token.put_next


function token.put_next(...: Token {
    command = integer,
    cmdname = TokenCommandName,
    csname = string?,
    id = integer,
    tok = integer,
    active = boolean,
    expandable = boolean,
    protected = boolean,
    mode = integer,
    index = integer?,
})

Put the next token back in the input.

Example:

local t1 = token.get_next()
local t2 = token.get_next()
local t3 = token.get_next()
local t4 = token.get_next()
-- watch out, we flush in sequence
token.put_next { t1, t2 }
-- but this one gets pushed in front
token.put_next ( t3, t4 )
-- so when we get wxyz we put yzwx!

Reference:

token.is_token


function token.is_token(t: any) ->  boolean

Check if the given argument is a token.

Example:

\directlua{
  local t = token.get_next()
  print(token.is_token(t)) % true
  print(token.is_token('text')) % false
  print(token.is_token(true)) % false
}Token

Reference:

@see token.type

token.type


function token.type(t: any) ->  "token"?

Return the string token if the given parameter is a token else nil.

Example:

\directlua{
  local t = token.get_next()
  print(token.type(t)) % 'token'
  print(token.type('text')) % nil
  print(token.type(true)) % nil
}Token

Reference:

@see token.is_token