Next: , Previous: , Up: Character Type   [Contents][Index]

2.3.3.2 General Escape Syntax

In addition to the specific escape sequences for commonly used control characters, Emacs provides several types of escape syntax that you can use to specify non-ASCII text characters.

Firstly, you can specify characters by their Unicode values. ?\unnnn represents a character with Unicode code point ‘U+nnnn’, where nnnn is (by convention) a hexadecimal number with exactly four digits. The backslash indicates that the subsequent characters form an escape sequence, and the ‘u’ specifies a Unicode escape sequence.

There is a slightly different syntax for specifying Unicode characters with code points higher than U+ffff: ?\U00nnnnnn represents the character with code point ‘U+nnnnnn’, where nnnnnn is a six-digit hexadecimal number. The Unicode Standard only defines code points up to ‘U+10ffff’, so if you specify a code point higher than that, Emacs signals an error.
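As a small illustration of the two Unicode forms, the following sketch compares each escape with its code point written as a plain integer. The particular characters are chosen only for the example; each form should evaluate to t.

```elisp
;; ?\uNNNN: four hexadecimal digits.
(= ?\u00e0 #xe0)          ; U+00E0, 'a' with grave accent
(= ?\u03bb 955)           ; U+03BB, Greek small letter lambda

;; ?\U00NNNNNN: eight hexadecimal digits, the first two being zero.
(= ?\U0001F600 #x1f600)   ; U+1F600, beyond the four-digit range
(= ?\U0010FFFF #x10ffff)  ; the highest code point Unicode defines
```

Because a character in Emacs Lisp is just an integer, `=` is enough to compare the results.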

Secondly, you can specify characters by their hexadecimal character codes. A hexadecimal escape sequence consists of a backslash, ‘x’, and the hexadecimal character code. Thus, ‘?\x41’ is the character A, ‘?\x1’ is the character C-a, and ‘?\xe0’ is the character ‘a’ with grave accent. You can use any number of hex digits, so you can represent any character code in this way.
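The hexadecimal examples above can be checked directly, since each escape reads as an integer. This is a minimal sketch; the last line is an added example showing that more than two hex digits are accepted. Each form should evaluate to t.

```elisp
(= ?\x41 ?A)               ; 65, the character A
(= ?\x1 ?\C-a)             ; 1, the character C-a
(= ?\xe0 ?\u00e0)          ; 224, 'a' with grave accent
(= ?\x1f600 ?\U0001F600)   ; five hex digits, same as the \U form
```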

Thirdly, you can specify characters by their character code in octal. An octal escape sequence consists of a backslash followed by up to three octal digits; thus, ‘?\101’ is the character A, ‘?\001’ is the character C-a, and ‘?\002’ is the character C-b. Only characters up to octal code 777 (decimal 511) can be specified this way.
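The octal examples can be verified in the same way. This is a minimal sketch; the final line is an added example showing the upper limit of the three-digit form. Each form should evaluate to t.

```elisp
(= ?\101 ?A)      ; octal 101 is decimal 65
(= ?\001 ?\C-a)   ; octal 001 is decimal 1
(= ?\002 ?\C-b)   ; octal 002 is decimal 2
(= ?\777 511)     ; the largest code the octal syntax can express
```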

These escape sequences may also be used in strings. See Non-ASCII in Strings.
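As a brief sketch of the same escapes inside string constants, each comparison below should evaluate to t. The one point that differs from character syntax is that a hex escape in a string keeps consuming hex digits, so the example uses ‘\ ’ (backslash and space) to end the escape before a following letter that would otherwise be read as a digit.

```elisp
(string= "\u00e0" (string ?\u00e0))   ; Unicode escape in a string
(string= "\101BC" "ABC")              ; octal escape stops after three digits
(string= "\x41\ bc" "Abc")            ; "\ " terminates the hex escape
```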
