Strings and Characters

Safety

Always limit the maximum length of strings.

Strings in Rhai contain any text sequence of valid Unicode characters.

Internally strings are stored in UTF-8 encoding.

Calling type_of() on a string returns "string".

String and Character Literals

String and character literals follow JavaScript-style syntax.

Type                       Quotes  Escapes?  Continuation?  Interpolation?
Normal string              "..."   yes       with \         no
Multi-line literal string  `...`   no        no             with ${...}
Character                  '...'   yes       no             no

Tip: Building strings

Strings can be built up from other strings and types via the + operator (provided by the MoreStringPackage but excluded when using a raw Engine).

This is particularly useful when printing output.

Standard Escape Sequences

Tip: Character to_int()

Use the to_int method to convert a Unicode character into its 32-bit Unicode code point value.

There is built-in support for Unicode (\uxxxx or \Uxxxxxxxx) and hex (\xxx) escape sequences for normal strings and characters.

Hex sequences map to ASCII characters, while \u maps to 16-bit common Unicode code points and \U maps to the full, 32-bit extended Unicode code points.

Escape sequences are not supported for multi-line literal strings wrapped by back-ticks (`).

Escape sequence  Meaning
\\               back-slash (\)
\t               tab
\r               carriage-return (CR)
\n               line-feed (LF)
\" or ""         double-quote (")
\'               single-quote (')
\xxx             ASCII character in 2-digit hex
\uxxxx           Unicode character in 4-digit hex
\Uxxxxxxxx       Unicode character in 8-digit hex
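
As a rough illustration of how the numeric escapes resolve, the sketch below turns the hex digits that follow \x, \u or \U into a character using only the Rust standard library. It is not Rhai's actual lexer, and decode_code_point is a hypothetical helper named here purely for illustration.

```rust
/// Hypothetical helper: interprets the hex digits of an escape sequence
/// (the part after `\x`, `\u` or `\U`) as a Unicode code point.
/// Returns `None` for non-hex input or for values that are not valid
/// Unicode scalar values (for example, surrogates such as D800).
fn decode_code_point(hex_digits: &str) -> Option<char> {
    u32::from_str_radix(hex_digits, 16)
        .ok()
        .and_then(char::from_u32)
}
```

Under this sketch, the digits 58 yield 'X' and the digits 2764 yield '❤'. The reverse direction, which is roughly what to_int exposes, corresponds to a cast such as 'X' as u32 in Rust.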

Line Continuation

For a normal string wrapped by double-quotes ("), a back-slash (\) character at the end of a line indicates that the string continues onto the next line without any line-break.

Whitespace up to the indentation of the opening double-quote is ignored in order to enable lining up blocks of text.

Spaces are not added, so to separate one line from the next with a space, put a space before the ending back-slash (\) character.

let x = "hello, world!\
         hello world again! \
         this is the ""last"" time!!!";
// ^^^^^^ this leading whitespace is ignored

// The above is the same as:
let x = "hello, world!hello world again! this is the \"last\" time!!!";

A string with continuation does not open up a new line. To do so, a new-line character must be manually inserted at the appropriate position.

let x = "hello, world!\n\
         hello world again!\n\
         this is the last time!!!";

// The above is the same as:
let x = "hello, world!\nhello world again!\nthis is the last time!!!";

No ending quote before the line ends is a syntax error

If the ending double-quote is omitted, it is a syntax error.

let x = "hello
";
//            ^ syntax error: unterminated string literal

Why not go multi-line?

Technically speaking, there is no difficulty in allowing strings to run for multiple lines without the continuation back-slash.

Rhai forces you to manually mark a continuation with a back-slash because the ending quote is easy to omit. Once it happens, the entire remainder of the script would become one giant, multi-line string.

This behavior is different from Rust, where string literals can run for multiple lines.

Indexing

Strings can be indexed to access any individual character. This is similar to many modern languages but different from Rust.

From beginning

Individual characters within a string can be accessed with zero-based, non-negative integer indices:

string [ index from 0 to (total number of characters − 1) ]

From end

A negative index accesses a character in the string counting from the end, with −1 being the last character.

string [ index from −1 to −(total number of characters) ]
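
The two indexing directions can be sketched in plain Rust as follows. This is a minimal illustration of the rule described above, not Rhai's implementation: char_at is a hypothetical name, and returning None for an out-of-range index is an assumption made for the sketch.

```rust
/// Hypothetical helper: returns the character at `index`, counting from
/// the start when `index` is non-negative and from the end when it is
/// negative (-1 is the last character).
/// Out-of-range indices yield `None` (an assumption of this sketch).
fn char_at(s: &str, index: i64) -> Option<char> {
    if index >= 0 {
        // Walk forward, decoding one character at a time.
        s.chars().nth(index as usize)
    } else {
        // -1 maps to position 0 from the back, -2 to position 1, and so on.
        let from_end = index.unsigned_abs() - 1;
        s.chars().rev().nth(from_end as usize)
    }
}
```

For the string "hello", char_at with index 0 gives 'h', index -1 gives 'o', and index -5 gives 'h' again.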

Character indexing can be SLOOOOOOOOW

Internally, a Rhai string is still stored compactly as a Rust UTF-8 string in order to save memory.

Therefore, getting the character at a particular index involves walking through the UTF-8 encoded byte stream from the start, decoding individual Unicode characters and counting them along the way.

Because of this, indexing can be a slow procedure, especially for long strings. Along the same lines, getting the length of a string (which returns the number of characters, not bytes) can also be slow.
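
The cost difference comes from UTF-8 being a variable-width encoding. The sketch below uses only the Rust standard library to show that character counts and character positions need a scan, while the byte length does not. The function names are hypothetical and chosen for illustration.

```rust
/// Number of Unicode characters: must decode the whole string, O(n).
fn char_len(s: &str) -> usize {
    s.chars().count()
}

/// Number of bytes: stored with the string, O(1).
fn byte_len(s: &str) -> usize {
    s.len()
}

/// Byte offset at which the character with the given index starts.
/// Finding it requires scanning from the beginning of the string.
fn byte_offset_of_char(s: &str, char_index: usize) -> Option<usize> {
    s.char_indices().nth(char_index).map(|(offset, _)| offset)
}
```

For "a❤b", char_len is 3 but byte_len is 5, because '❤' occupies three bytes, so the character at index 2 starts at byte offset 4 rather than 2.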

Sub-Strings

Sub-strings, or slices in some programming languages, are parts of strings.

In Rhai, a sub-string can be specified by indexing with a range of characters:

string [ first character (starting from zero) .. last character (exclusive) ]

string [ first character (starting from zero) ..= last character (inclusive) ]

Sub-string ranges always start from zero counting towards the end of the string. Negative ranges are not supported.
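
A character-based range can be sketched in Rust by translating character positions into byte offsets before slicing. This is an illustration under stated assumptions, not Rhai's implementation: substring is a hypothetical name, and clamping out-of-range or reversed bounds instead of reporting an error is a choice made only for this sketch.

```rust
use std::ops::Range;

/// Hypothetical helper: returns the sub-string covering the characters
/// in `range` (start inclusive, end exclusive), counted from zero.
/// Bounds past the end are clamped to the end of the string, and a
/// reversed range yields an empty string (assumptions of this sketch).
fn substring(s: &str, range: Range<usize>) -> &str {
    // Byte offset of the n-th character, or the string's byte length
    // when n is at or beyond the number of characters.
    let byte_offset = |n: usize| s.char_indices().nth(n).map_or(s.len(), |(i, _)| i);

    let start = byte_offset(range.start);
    let end = byte_offset(range.end).max(start);
    &s[start..end]
}
```

With this helper, substring("Bob C. Davis", 4..8) returns "C. D". An inclusive range first ..= last corresponds to the exclusive range first .. last + 1.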

Examples

let name = "Bob";
let middle_initial = 'C';
let last = "Davis";

let full_name = `${name} ${middle_initial}. ${last}`;
full_name == "Bob C. Davis";

// String building with different types
let age = 42;
let record = `${full_name}: age ${age}`;
record == "Bob C. Davis: age 42";

// Unlike Rust, Rhai strings can be indexed to get a character
// (disabled with 'no_index')
let c = record[4];
c == 'C';                               // single character

let slice = record[4..8];               // sub-string slice
slice == "C. D";

ts.s = record;                          // custom type properties can take strings

let c = ts.s[4];
c == 'C';

let c = ts.s[-4];                       // negative index counts from the end
c == 'e';

let c = "foo"[0];                       // indexing also works on string literals...
c == 'f';

let c = ("foo" + "bar")[5];             // ... and expressions returning strings
c == 'r';

let text = "hello, world!";
text[0] = 'H';                          // modify a single character
text == "Hello, world!";

text[7..=11] = "Earth";                 // modify a sub-string slice
text == "Hello, Earth!";

// Escape sequences in strings
record += " \u2764\n";                  // escape sequence of '❤' in Unicode
record == "Bob C. Davis: age 42 ❤\n";  // '\n' = new-line

// Unlike Rust, Rhai strings can be directly modified character-by-character
// (disabled with 'no_index')
record[4] = '\x58'; // 0x58 = 'X'
record == "Bob X. Davis: age 42 ❤\n";

// Use 'in' to test if a substring (or character) exists in a string
"Davis" in record == true;
'X' in record == true;
'C' in record == false;

// Strings can be iterated with a 'for' statement, yielding characters
for ch in record {
    print(ch);
}