global string

The string library has a few extra functions.


methods


string.explode


function string.explode(
  text: string,
  separator: string?
) ->  string[]
@param text - A text that is to be divided into several substrings.

@param separator - A separator that is used to split the string (default " +", a space followed by a plus sign).

Break a string into pieces.

This function splits a string into substrings based on the value of the string argument separator. The second argument is a string that is either empty (this splits the string into characters), a single character (this splits on each occurrence of that character, possibly introducing empty strings), or a single character followed by the plus sign + (this special version does not create empty substrings). The default value for separator is " +" (a space followed by a plus sign, so a run of multiple spaces counts as one separator). Note: separator is not hidden by surrounding braces as it would be if this function were written in TeX macros.

Example:

for _, word in ipairs(string.explode("one  two three")) do
  print(word)
end
-- one
-- two
-- three

for _, word in ipairs(string.explode("one,,two,three", ',')) do
  print(word)
end

-- one
-- (empty string)
-- two
-- three

for _, word in ipairs(string.explode("one,,two,three", ',+')) do
  print(word)
end
-- one
-- two
-- three
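
The three separator modes described above can be emulated in plain Lua. The following is a minimal sketch, not the C implementation: explode_sketch is a hypothetical helper name, and its handling of edge cases (for example an empty input text) may differ from the real string.explode.

```lua
-- Hypothetical pure-Lua sketch of the separator rules described above.
-- explode_sketch is an illustrative name, not part of LuaTeX.
local function explode_sketch(text, separator)
  separator = separator or " +"
  local pieces = {}
  if separator == "" then
    -- Empty separator: one piece per byte.
    for i = 1, #text do
      pieces[#pieces + 1] = text:sub(i, i)
    end
    return pieces
  end
  local char = separator:sub(1, 1)
  -- A trailing "+" means: do not keep empty pieces.
  local skip_empty = separator:sub(2, 2) == "+"
  local start = 1
  while true do
    local pos = text:find(char, start, true) -- plain (non-pattern) search
    local piece = text:sub(start, (pos or #text + 1) - 1)
    if piece ~= "" or not skip_empty then
      pieces[#pieces + 1] = piece
    end
    if not pos then break end
    start = pos + 1
  end
  return pieces
end

print(table.concat(explode_sketch("one,,two,three", ","), "|"))  -- one||two|three
print(table.concat(explode_sketch("one,,two,three", ",+"), "|")) -- one|two|three
print(table.concat(explode_sketch("abc", ""), "|"))              -- a|b|c
```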

Reference:

  • Corresponding C source code: lstrlibext.c#L247-309
  • https://gitlab.lisn.upsaclay.fr/texlive/luatex/-/blob/4f2b914d365bab8a2747afe6e8c86d0f1c8475f7/manual/luatex-lua.tex#L399-409

string.utfvalue


function string.utfvalue(text: string)
 ->  integer
 ->  integer ...
@param text - The input string.

@return - The Unicode codepoints of the characters in the given string.

Return the Unicode codepoints of the characters in the given string.

Example:

local a = string.utfvalue("abc")
print(a) -- 97

local a, b, c = string.utfvalue("abc")
print(a, b, c) -- 97 98 99

print(string.utfvalue("abc")) -- 97 98 99
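
Because one value is returned per character, a single assignment keeps only the first code point, while a table constructor keeps all of them. The sketch below illustrates this in plain Lua, using utf8.codepoint from the standard Lua 5.3 library as a stand-in. This assumes that library is available; utfvalue_sketch is a hypothetical name, not a LuaTeX function.

```lua
-- Hypothetical stand-in for string.utfvalue, built on the standard
-- Lua 5.3 utf8 library (an assumption; not the LuaTeX implementation).
local function utfvalue_sketch(text)
  return utf8.codepoint(text, 1, -1)
end

-- A single assignment keeps only the first code point.
local first = utfvalue_sketch("a\u{E4}\u{20AC}")
print(first) -- 97

-- A table constructor collects every returned code point.
local all = { utfvalue_sketch("a\u{E4}\u{20AC}") }
print(#all, all[2], all[3]) -- 3 228 8364
```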

Reference:

@see string.utfvalues

string.utfvalues


function string.utfvalues(text: string) -> code_point fun() -> integer
@param text - The input string.

@return code_point - an integer value in the Unicode range

Provide an iterator function that iterates over the characters of the string, returning the Unicode code point of each character as an integer.

Example:

for code_point in string.utfvalues("abc") do
  print(code_point)
end
-- 97
-- 98
-- 99

Reference:

@see string.utfvalue

string.utfcharacter


function string.utfcharacter(
  code_point: integer,
  ...: integer
) ->  string
@param code_point - A Unicode code point

@param ... - One integer argument for each additional character

@return - A string with the characters of the given code points.

Convert one or more Unicode code points into a string.

Example:

print(string.utfcharacter(97, 98, 99)) -- abc
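
Code points above 127 produce more than one byte each in the resulting string. A minimal sketch in plain Lua, with utf8.char and utf8.codepoint from the standard Lua 5.3 library standing in for the LuaTeX functions (an assumption that this library is available):

```lua
-- utf8.char is used here as a stand-in for string.utfcharacter.
local text = utf8.char(228, 246, 252)
print(text == "\u{E4}\u{F6}\u{FC}") -- true
print(#text) -- 6

-- Converting back to code points and forth again restores the string.
print(utf8.char(utf8.codepoint(text, 1, -1)) == text) -- true
```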

string.utfcharacters


function string.utfcharacters(text: string) -> character fun() -> string
@param text - The input string.

@return character - a string containing a single UTF-8 encoded character

Provide an iterator function that iterates over the characters of the string, returning each character as a string containing a single UTF-8 token.

Example:

for character in string.utfcharacters("\u{61}\u{62}\u{63}") do
  print(character)
end
-- a
-- b
-- c
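
Each string produced by such an iterator holds one whole UTF-8 sequence, so its byte length varies from character to character. A minimal sketch of the same idea in plain Lua 5.3, using string.gmatch with utf8.charpattern as a stand-in (an assumption; this is not how LuaTeX implements the function):

```lua
-- Collect the byte length of each UTF-8 character in the string.
local byte_lengths = {}
for character in ("\u{E4}b\u{20AC}"):gmatch(utf8.charpattern) do
  byte_lengths[#byte_lengths + 1] = #character
end
print(table.concat(byte_lengths, " ")) -- 2 1 3
```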

string.utflength


function string.utflength(text: string) ->  integer
@param text - The input string.

@return - The length of the given string, counted in UTF-8 characters

Return the length of the given string, counted in UTF-8 characters rather than bytes.

Example:

print(string.len("äöü")) -- 6 (bytes, assuming UTF-8 encoded source)
print(string.utflength("äöü")) -- 3 (characters)

string.characters


function string.characters(text: string) ->  fun() -> string
@param text - The input string.

@return - A string containing one byte.

Provide an iterator function that iterates over the bytes of the string, returning each byte as a string of length one.

Example:

for character in string.characters('abc') do
  print(character)
end
-- a
-- b
-- c

for character in string.characters('äöü') do
  print(character)
end
-- �
-- �
-- �
-- �
-- �
-- �

Reference:

@see string.bytes

string.characterpairs


function string.characterpairs(text: string) ->  fun() -> (string,string)
@param text - The input string.

@return - Two strings of one byte each, or an empty second string if the string length was odd.

Provide an iterator function that iterates over the string two bytes at a time, returning two strings of one byte each.

If the input string has an odd length, the second string of the final pair is empty.

Example:

for c1, c2 in string.characterpairs('äöü') do
  print(c1, c2)
  print(c1 .. c2)
end
-- �    �
-- ä
-- �    �
-- ö
-- �    �
-- ü

for c1, c2 in string.characterpairs('a') do
  print("'" .. c1 .. "'", "'" .. c2 .. "'")
end
-- 'a'  ''

Reference:

@see string.bytepairs

string.bytes


function string.bytes(text: string) ->  fun() -> integer
@param text - The input string.

@return - A single byte value.

Provide an iterator function that iterates over the bytes of the string, returning each byte as an integer value.

Example:

for byte in string.bytes('abc') do
  print(byte)
end
-- 97
-- 98
-- 99

for byte in string.bytes('äöü') do
  print(byte)
end
-- 195
-- 164
-- 195
-- 182
-- 195
-- 188
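
Byte values are handy for inspecting how a string is encoded. The sketch below prints a hexadecimal dump in plain Lua, with the standard string.byte standing in for string.bytes; the variable names are illustrative only.

```lua
local text = "\u{E4}\u{F6}\u{FC}" -- "äöü" written with escapes
local hex = {}
-- text:byte(1, -1) returns every byte of the string as an integer.
for _, byte in ipairs({ text:byte(1, -1) }) do
  hex[#hex + 1] = string.format("%02X", byte)
end
print(table.concat(hex, " ")) -- C3 A4 C3 B6 C3 BC
```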

Reference:

@see string.characters

string.bytepairs


function string.bytepairs(text: string) ->  fun() -> (integer,integer?)
@param text - The input string.

@return - Two byte values or nil as the second return value if the input string length was odd.

Provide an iterator function that iterates over the string two bytes at a time, returning two byte values.

If the input string has an odd length, the second value of the final pair is nil.

Example:

for b1, b2 in string.bytepairs('abc') do
  print(b1, b2)
end
-- 97   98
-- 99   nil

for b1, b2 in string.bytepairs('äöü') do
  print(b1, b2)
end
-- 195  164
-- 195  182
-- 195  188
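
One possible use for pairwise iteration is combining two bytes into 16-bit values. The sketch below does this in plain Lua; bytepairs_sketch is a hypothetical emulation of the behaviour described above, not the LuaTeX implementation, and the big-endian byte order is an assumption made for the example.

```lua
-- Hypothetical emulation of string.bytepairs: returns two byte values per
-- step, with nil as the second value when the string length is odd.
local function bytepairs_sketch(text)
  local i = -1
  return function()
    i = i + 2
    if i <= #text then
      return text:byte(i), text:byte(i + 1)
    end
  end
end

-- Combine each pair into a 16-bit big-endian value.
local units = {}
for b1, b2 in bytepairs_sketch("\x00\x41\x20\xAC") do
  units[#units + 1] = b1 * 256 + (b2 or 0)
end
print(units[1], units[2]) -- 65 8364
```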

Reference:

@see string.characterpairs