Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc. 59 Temple Place, Suite 330, Boston, MA 02111, USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.
The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Library General Public License instead.) You can apply it to your programs, too.
When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights.
We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations.
Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and modification follow.
NO WARRANTY
If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found.
one line to give the program's name and an idea of what it does. Copyright (C) 19yy name of author This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111, USA.
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this when it starts in an interactive mode:
Gnomovision version 69, Copyright (C) 19yy name of author Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than `show w' and `show c'; they could even be mouse-clicks or menu items--whatever suits your program.
You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the program, if necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (which makes passes at compilers) written by James Hacker. signature of Ty Coon, 1 April 1989 Ty Coon, President of Vice
This General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Library General Public License instead of this License.
Most of the GNU Emacs text editor is written in the programming language called Emacs Lisp. You can write new code in Emacs Lisp and install it as an extension to the editor. However, Emacs Lisp is more than a mere "extension language"; it is a full computer programming language in its own right. You can use it as you would any other programming language.
Because Emacs Lisp is designed for use in an editor, it has special features for scanning and parsing text as well as features for handling files, buffers, displays, subprocesses, and so on. Emacs Lisp is closely integrated with the editing facilities; thus, editing commands are functions that can also conveniently be called from Lisp programs, and parameters for customization are ordinary Lisp variables.
This manual attempts to be a full description of Emacs Lisp. For a beginner's introduction to Emacs Lisp, see An Introduction to Emacs Lisp Programming, by Bob Chassell, also published by the Free Software Foundation. This manual presumes considerable familiarity with the use of Emacs for editing; see The GNU Emacs Manual for this basic information.
Generally speaking, the earlier chapters describe features of Emacs Lisp that have counterparts in many programming languages, and later chapters describe features that are peculiar to Emacs Lisp or relate specifically to editing.
This is edition 2.5.
This manual has gone through numerous drafts. It is nearly complete but not flawless. There are a few topics that are not covered, either because we consider them secondary (such as most of the individual modes) or because they are yet to be written. Because we are not able to deal with them completely, we have left out several parts intentionally. This includes most information about usage on VMS.
The manual should be fully correct in what it does cover, and it is therefore open to criticism on anything it says--from specific examples and descriptive text, to the ordering of chapters and sections. If something is confusing, or you find that you have to look at the sources or experiment to learn something not covered in the manual, then perhaps the manual should be fixed. Please let us know.
As you use the manual, we ask that you mark pages with corrections so you can later look them up and send them in. If you think of a simple, real-life example for a function or group of functions, please make an effort to write it up and send it in. Please reference any comments to the chapter name, section name, and function name, as appropriate, since page numbers and chapter and section numbers will change and we may have trouble finding the text you are talking about. Also state the number of the edition you are criticizing.
Please mail comments and corrections to
bug-lisp-manual@gnu.org
We let mail to this list accumulate unread until someone decides to
apply the corrections. Months, and sometimes years, go by between
updates. So please attach no significance to the lack of a reply--your
mail will be acted on in due time. If you want to contact the
Emacs maintainers more quickly, send mail to
bug-gnu-emacs@gnu.org.
Lisp (LISt Processing language) was first developed in the late 1950s at the Massachusetts Institute of Technology for research in artificial intelligence. The great power of the Lisp language makes it ideal for other purposes as well, such as writing editing commands.
Dozens of Lisp implementations have been built over the years, each with its own idiosyncrasies. Many of them were inspired by Maclisp, which was written in the 1960s at MIT's Project MAC. Eventually the implementors of the descendants of Maclisp came together and developed a standard for Lisp systems, called Common Lisp. In the meantime, Gerry Sussman and Guy Steele at MIT developed a simplified but very powerful dialect of Lisp, called Scheme.
GNU Emacs Lisp is largely inspired by Maclisp, and a little by Common Lisp. If you know Common Lisp, you will notice many similarities. However, many features of Common Lisp have been omitted or simplified in order to reduce the memory requirements of GNU Emacs. Sometimes the simplifications are so drastic that a Common Lisp user might be very confused. We will occasionally point out how GNU Emacs Lisp differs from Common Lisp. If you don't know Common Lisp, don't worry about it; this manual is self-contained.
A certain amount of Common Lisp emulation is available via the `cl' library See section `Common Lisp Extension' in Common Lisp Extensions.
Emacs Lisp is not at all influenced by Scheme; but the GNU project has an implementation of Scheme, called Guile. We use Guile in all new GNU software that calls for extensibility.
This section explains the notational conventions that are used in this manual. You may want to skip this section and refer back to it later.
Throughout this manual, the phrases "the Lisp reader" and "the Lisp printer" refer to those routines in Lisp that convert textual representations of Lisp objects into actual Lisp objects, and vice versa. See section Printed Representation and Read Syntax, for more details. You, the person reading this manual, are thought of as "the programmer" and are addressed as "you". "The user" is the person who uses Lisp programs, including those you write.
Examples of Lisp code appear in this font or form: (list 1 2
3). Names that represent metasyntactic variables, or arguments to a
function being described, appear in this font or form:
first-number.
nil and t
In Lisp, the symbol nil has three separate meanings: it
is a symbol with the name `nil'; it is the logical truth value
false; and it is the empty list--the list of zero elements.
When used as a variable, nil always has the value nil.
As far as the Lisp reader is concerned, `()' and `nil' are
identical: they stand for the same object, the symbol nil. The
different ways of writing the symbol are intended entirely for human
readers. After the Lisp reader has read either `()' or `nil',
there is no way to determine which representation was actually written
by the programmer.
In this manual, we use () when we wish to emphasize that it
means the empty list, and we use nil when we wish to emphasize
that it means the truth value false. That is a good convention to use
in Lisp programs also.
(cons 'foo ()) ; Emphasize the empty list (not nil) ; Emphasize the truth value false
In contexts where a truth value is expected, any non-nil value
is considered to be true. However, t is the preferred way
to represent the truth value true. When you need to choose a
value which represents true, and there is no other basis for
choosing, use t. The symbol t always has the value
t.
In Emacs Lisp, nil and t are special symbols that always
evaluate to themselves. This is so that you do not need to quote them
to use them as constants in a program. An attempt to change their
values results in a setting-constant error. The same is true of
any symbol whose name starts with a colon (`:'). See section Variables That Never Change.
A Lisp expression that you can evaluate is called a form. Evaluating a form always produces a result, which is a Lisp object. In the examples in this manual, this is indicated with `=>':
(car '(1 2))
=> 1
You can read this as "(car '(1 2)) evaluates to 1".
When a form is a macro call, it expands into a new form for Lisp to evaluate. We show the result of the expansion with `==>'. We may or may not show the result of the evaluation of the expanded form.
(third '(a b c))
==> (car (cdr (cdr '(a b c))))
=> c
Sometimes to help describe one form we show another form that produces identical results. The exact equivalence of two forms is indicated with `=='.
(make-sparse-keymap) == (list 'keymap)
Many of the examples in this manual print text when they are
evaluated. If you execute example code in a Lisp Interaction buffer
(such as the buffer `*scratch*'), the printed text is inserted into
the buffer. If you execute the example by other means (such as by
evaluating the function eval-region), the printed text is
displayed in the echo area. You should be aware that text displayed in
the echo area is truncated to a single line.
Examples in this manual indicate printed text with `-|',
irrespective of where that text goes. The value returned by evaluating
the form (here bar) follows on a separate line.
(progn (print 'foo) (print 'bar))
-| foo
-| bar
=> bar
Some examples signal errors. This normally displays an error message in the echo area. We show the error message on a line starting with `error-->'. Note that `error-->' itself does not appear in the echo area.
(+ 23 'x) error--> Wrong type argument: number-or-marker-p, x
Some examples show modifications to text in a buffer, with "before" and "after" versions of the text. These examples show the contents of the buffer in question between two lines of dashes containing the buffer name. In addition, `-!-' indicates the location of point. (The symbol for point, of course, is not part of the text in the buffer; it indicates the place between two characters where point is currently located.)
---------- Buffer: foo ----------
This is the -!-contents of foo.
---------- Buffer: foo ----------
(insert "changed ")
=> nil
---------- Buffer: foo ----------
This is the changed -!-contents of foo.
---------- Buffer: foo ----------
Functions, variables, macros, commands, user options, and special forms are described in this manual in a uniform format. The first line of a description contains the name of the item followed by its arguments, if any. The category--function, variable, or whatever--is printed next to the right margin. The description follows on succeeding lines, sometimes with examples.
In a function description, the name of the function being described appears first. It is followed on the same line by a list of argument names. These names are also used in the body of the description, to stand for the values of the arguments.
The appearance of the keyword &optional in the argument list
indicates that the subsequent arguments may be omitted (omitted
arguments default to nil). Do not write &optional when
you call the function.
The keyword &rest (which must be followed by a single argument
name) indicates that any number of arguments can follow. The single
following argument name will have a value, as a variable, which is a
list of all these remaining arguments. Do not write &rest when
you call the function.
Here is a description of an imaginary function foo:
foo subtracts integer1 from integer2,
then adds all the rest of the arguments to the result. If integer2
is not supplied, then the number 19 is used by default.
(foo 1 5 3 9)
=> 16
(foo 5)
=> 14
More generally,
(foo w x y...) == (+ (- x w) y...)
Any argument whose name contains the name of a type (e.g., integer, integer1 or buffer) is expected to be of that type. A plural of a type (such as buffers) often means a list of objects of that type. Arguments named object may be of any type. (See section Lisp Data Types, for a list of Emacs object types.) Arguments with other sorts of names (e.g., new-file) are discussed specifically in the description of the function. In some sections, features common to the arguments of several functions are described at the beginning.
See section Lambda Expressions, for a more complete description of optional and rest arguments.
Command, macro, and special form descriptions have the same format, but the word `Function' is replaced by `Command', `Macro', or `Special Form', respectively. Commands are simply functions that may be called interactively; macros process their arguments differently from functions (the arguments are not evaluated), but are presented the same way.
Special form descriptions use a more complex notation to specify optional and repeated arguments because they can break the argument list down into separate arguments in more complicated ways. `[optional-arg]' means that optional-arg is optional and `repeated-args...' stands for zero or more arguments. Parentheses are used when several arguments are grouped into additional levels of list structure. Here is an example:
(count-loop (i 0 10) (prin1 i) (princ " ") (prin1 (aref vector i)) (terpri))
If from and to are omitted, var is bound to
nil before the loop begins, and the loop exits if var is
non-nil at the beginning of an iteration. Here is an example:
(count-loop (done)
(if (pending)
(fixit)
(setq done t)))
In this special form, the arguments from and to are optional, but must both be present or both absent. If they are present, inc may optionally be specified as well. These arguments are grouped with the argument var into a list, to distinguish them from body, which includes all remaining elements of the form.
A variable is a name that can hold a value. Although any variable can be set by the user, certain variables that exist specifically so that users can change them are called user options. Ordinary variables and user options are described using a format like that for functions except that there are no arguments.
Here is a description of the imaginary electric-future-map
variable.
User option descriptions have the same format, but `Variable' is replaced by `User Option'.
These facilities provide information about which version of Emacs is in use.
(emacs-version) => "GNU Emacs 20.3.5 (i486-pc-linux-gnulibc1, X toolkit) of Sat Feb 14 1998 on psilocin.gnu.org"
Called interactively, the function prints the same information in the echo area.
current-time (see section Time of Day).
emacs-build-time
=> (13623 62065 344633)
"20.3.1". The last number in this string is not
really part of the Emacs release version number; it is incremented each
time you build Emacs in any given directory.
The following two variables have existed since Emacs version 19.23:
This manual was written by Robert Krawitz, Bil Lewis, Dan LaLiberte, Richard M. Stallman and Chris Welty, the volunteers of the GNU manual group, in an effort extending over several years. Robert J. Chassell helped to review and edit the manual, with the support of the Defense Advanced Research Projects Agency, ARPA Order 6082, arranged by Warren A. Hunt, Jr. of Computational Logic, Inc.
Corrections were supplied by Karl Berry, Jim Blandy, Bard Bloom, Stephane Boucher, David Boyes, Alan Carroll, Richard Davis, Lawrence R. Dodd, Peter Doornbosch, David A. Duff, Chris Eich, Beverly Erlebacher, David Eckelkamp, Ralf Fassel, Eirik Fuller, Stephen Gildea, Bob Glickstein, Eric Hanchrow, George Hartzell, Nathan Hess, Masayuki Ida, Dan Jacobson, Jak Kirman, Bob Knighten, Frederick M. Korz, Joe Lammens, Glenn M. Lewis, K. Richard Magill, Brian Marick, Roland McGrath, Skip Montanaro, John Gardiner Myers, Thomas A. Peterson, Francesco Potorti, Friedrich Pukelsheim, Arnold D. Robbins, Raul Rockwell, Per Starback, Shinichirou Sugou, Kimmo Suominen, Edward Tharp, Bill Trost, Rickard Westman, Jean White, Matthew Wilding, Carl Witty, Dale Worley, Rusty Wright, and David D. Zuhn.
A Lisp object is a piece of data used and manipulated by Lisp programs. For our purposes, a type or data type is a set of possible objects.
Every object belongs to at least one type. Objects of the same type have similar structures and may usually be used in the same contexts. Types can overlap, and objects can belong to two or more types. Consequently, we can ask whether an object belongs to a particular type, but not for "the" type of an object.
A few fundamental object types are built into Emacs. These, from which all other types are constructed, are called primitive types. Each object belongs to one and only one primitive type. These types include integer, float, cons, symbol, string, vector, subr, byte-code function, plus several special types, such as buffer, that are related to editing. (See section Editing Types.)
Each primitive type has a corresponding Lisp function that checks whether an object is a member of that type.
Note that Lisp is unlike many other languages in that Lisp objects are self-typing: the primitive type of the object is implicit in the object itself. For example, if an object is a vector, nothing can treat it as a number; Lisp knows it is a vector, not a number.
In most languages, the programmer must declare the data type of each variable, and the type is known by the compiler but not represented in the data. Such type declarations do not exist in Emacs Lisp. A Lisp variable can have any type of value, and it remembers whatever value you store in it, type and all.
This chapter describes the purpose, printed representation, and read syntax of each of the standard types in GNU Emacs Lisp. Details on how to use these types can be found in later chapters.
The printed representation of an object is the format of the
output generated by the Lisp printer (the function prin1) for
that object. The read syntax of an object is the format of the
input accepted by the Lisp reader (the function read) for that
object. See section Reading and Printing Lisp Objects.
Most objects have more than one possible read syntax. Some types of object have no read syntax, since it may not make sense to enter objects of these types directly in a Lisp program. Except for these cases, the printed representation of an object is also a read syntax for it.
In other languages, an expression is text; it has no other form. In Lisp, an expression is primarily a Lisp object and only secondarily the text that is the object's read syntax. Often there is no need to emphasize this distinction, but you must keep it in the back of your mind, or you will occasionally be very confused.
Every type has a printed representation. Some types have no read
syntax--for example, the buffer type has none. Objects of these types
are printed in hash notation: the characters `#<' followed by
a descriptive string (typically the type name followed by the name of
the object), and closed with a matching `>'. Hash notation cannot
be read at all, so the Lisp reader signals the error
invalid-read-syntax whenever it encounters `#<'.
(current-buffer)
=> #<buffer objects.texi>
When you evaluate an expression interactively, the Lisp interpreter
first reads the textual representation of it, producing a Lisp object,
and then evaluates that object (see section Evaluation). However,
evaluation and reading are separate activities. Reading returns the
Lisp object represented by the text that is read; the object may or may
not be evaluated later. See section Input Functions, for a description of
read, the basic function for reading objects.
A comment is text that is written in a program only for the sake of humans that read the program, and that has no effect on the meaning of the program. In Lisp, a semicolon (`;') starts a comment if it is not within a string or character constant. The comment continues to the end of line. The Lisp reader discards comments; they do not become part of the Lisp objects which represent the program within the Lisp system.
The `#@count' construct, which skips the next count characters, is useful for program-generated comments containing binary data. The Emacs Lisp byte compiler uses this in its output files (see section Byte Compilation). It isn't meant for source files, however.
See section Tips on Writing Comments, for conventions for formatting comments.
There are two general categories of types in Emacs Lisp: those having to do with Lisp programming, and those having to do with editing. The former exist in many Lisp implementations, in one form or another. The latter are unique to Emacs Lisp.
The range of values for integers in Emacs Lisp is -134217728 to
134217727 (28 bits; i.e.,
to
on most machines. (Some machines may provide a wider range.) It is
important to note that the Emacs Lisp arithmetic functions do not check
for overflow. Thus (1+ 134217727) is -134217728 on most
machines.
The read syntax for integers is a sequence of (base ten) digits with an optional sign at the beginning and an optional period at the end. The printed representation produced by the Lisp interpreter never has a leading `+' or a final `.'.
-1 ; The integer -1. 1 ; The integer 1. 1. ; Also The integer 1. +1 ; Also the integer 1. 268435457 ; Also the integer 1 on a 28-bit implementation.
See section Numbers, for more information.
Emacs supports floating point numbers (though there is a compilation option to disable them). The precise range of floating point numbers is machine-specific.
The printed representation for floating point numbers requires either a decimal point (with at least one digit following), an exponent, or both. For example, `1500.0', `15e2', `15.0e2', `1.5e3', and `.15e4' are five ways of writing a floating point number whose value is 1500. They are all equivalent.
See section Numbers, for more information.
A character in Emacs Lisp is nothing more than an integer. In other words, characters are represented by their character codes. For example, the character A is represented as the integer 65.
Individual characters are not often used in programs. It is far more common to work with strings, which are sequences composed of characters. See section String Type.
Characters in strings, buffers, and files are currently limited to the range of 0 to 524287--nineteen bits. But not all values in that range are valid character codes. Codes 0 through 127 are ASCII codes; the rest are non-ASCII (see section Non-ASCII Characters). Characters that represent keyboard input have a much wider range, to encode modifier keys such as Control, Meta and Shift.
Since characters are really integers, the printed representation of a character is a decimal number. This is also a possible read syntax for a character, but writing characters that way in Lisp programs is a very bad idea. You should always use the special read syntax formats that Emacs Lisp provides for characters. These syntax formats start with a question mark.
The usual read syntax for alphanumeric characters is a question mark followed by the character; thus, `?A' for the character A, `?B' for the character B, and `?a' for the character a.
For example:
?Q => 81 ?q => 113
You can use the same syntax for punctuation characters, but it is often a good idea to add a `\' so that the Emacs commands for editing Lisp code don't get confused. For example, `?\ ' is the way to write the space character. If the character is `\', you must use a second `\' to quote it: `?\\'.
You can express the characters Control-g, backspace, tab, newline, vertical tab, formfeed, return, and escape as `?\a', `?\b', `?\t', `?\n', `?\v', `?\f', `?\r', `?\e', respectively. Thus,
?\a => 7 ; C-g ?\b => 8 ; backspace, BS, C-h ?\t => 9 ; tab, TAB, C-i ?\n => 10 ; newline, C-j ?\v => 11 ; vertical tab, C-k ?\f => 12 ; formfeed character, C-l ?\r => 13 ; carriage return, RET, C-m ?\e => 27 ; escape character, ESC, C-[ ?\\ => 92 ; backslash character, \
These sequences which start with backslash are also known as escape sequences, because backslash plays the role of an escape character; this usage has nothing to do with the character ESC.
Control characters may be represented using yet another read syntax. This consists of a question mark followed by a backslash, caret, and the corresponding non-control character, in either upper or lower case. For example, both `?\^I' and `?\^i' are valid read syntax for the character C-i, the character whose value is 9.
Instead of the `^', you can use `C-'; thus, `?\C-i' is equivalent to `?\^I' and to `?\^i':
?\^I => 9 ?\C-I => 9
In strings and buffers, the only control characters allowed are those that exist in ASCII; but for keyboard input purposes, you can turn any character into a control character with `C-'. The character codes for these non-ASCII control characters include the bit as well as the code for the corresponding non-control character. Ordinary terminals have no way of generating non-ASCII control characters, but you can generate them straightforwardly using X and other window systems.
For historical reasons, Emacs treats the DEL character as the control equivalent of ?:
?\^? => 127 ?\C-? => 127
As a result, it is currently not possible to represent the character Control-?, which is a meaningful input character under X, using `\C-'. It is not easy to change this, as various Lisp files refer to DEL in this way.
For representing control characters to be found in files or strings, we recommend the `^' syntax; for control characters in keyboard input, we prefer the `C-' syntax. Which one you use does not affect the meaning of the program, but may guide the understanding of people who read it.
A meta character is a character typed with the META modifier key. The integer that represents such a character has the bit set (which on most machines makes it a negative number). We use high bits for this and other modifiers to make possible a wide range of basic character codes.
In a string, the bit attached to an ASCII character indicates a meta character; thus, the meta characters that can fit in a string have codes in the range from 128 to 255, and are the meta versions of the ordinary ASCII characters. (In Emacs versions 18 and older, this convention was used for characters outside of strings as well.)
The read syntax for meta characters uses `\M-'. For example, `?\M-A' stands for M-A. You can use `\M-' together with octal character codes (see below), with `\C-', or with any other syntax for a character. Thus, you can write M-A as `?\M-A', or as `?\M-\101'. Likewise, you can write C-M-b as `?\M-\C-b', `?\C-\M-b', or `?\M-\002'.
The case of a graphic character is indicated by its character code; for example, ASCII distinguishes between the characters `a' and `A'. But ASCII has no way to represent whether a control character is upper case or lower case. Emacs uses the bit to indicate that the shift key was used in typing a control character. This distinction is possible only when you use X terminals or other special terminals; ordinary terminals do not report the distinction to the computer in any way.
The X Window System defines three other modifier bits that can be set in a character: hyper, super and alt. The syntaxes for these bits are `\H-', `\s-' and `\A-'. (Case is significant in these prefixes.) Thus, `?\H-\M-\A-x' represents Alt-Hyper-Meta-x.
Finally, the most general read syntax for a character represents the
character code in either octal or hex. To use octal, write a question
mark followed by a backslash and the octal character code (up to three
octal digits); thus, `?\101' for the character A,
`?\001' for the character C-a, and ?\002 for the
character C-b. Although this syntax can represent any ASCII
character, it is preferred only when the precise octal value is more
important than the ASCII representation.
?\012 => 10 ?\n => 10 ?\C-j => 10 ?\101 => 65 ?A => 65
To use hex, write a question mark followed by a backslash, `x',
and the hexadecimal character code. You can use any number of hex
digits, so you can represent any character code in this way.
Thus, `?\x41' for the character A, `?\x1' for the
character C-a, and ?\x8e0 for the character
``a'.
A backslash is allowed, and harmless, preceding any character without a special escape meaning; thus, `?\+' is equivalent to `?+'. There is no reason to add a backslash before most characters. However, you should add a backslash before any of the characters `()\|;'`"#.,' to avoid confusing the Emacs commands for editing Lisp code. Also add a backslash before whitespace characters such as space, tab, newline and formfeed. However, it is cleaner to use one of the easily readable escape sequences, such as `\t', instead of an actual whitespace character such as a tab.
A symbol in GNU Emacs Lisp is an object with a name. The symbol name serves as the printed representation of the symbol. In ordinary use, the name is unique--no two symbols have the same name.
A symbol can serve as a variable, as a function name, or to hold a property list. Or it may serve only to be distinct from all other Lisp objects, so that its presence in a data structure may be recognized reliably. In a given context, usually only one of these uses is intended. But you can use one symbol in all of these ways, independently.
A symbol name can contain any characters whatever. Most symbol names are written with letters, digits, and the punctuation characters `-+=*/'. Such names require no special punctuation; the characters of the name suffice as long as the name does not look like a number. (If it does, write a `\' at the beginning of the name to force interpretation as a symbol.) The characters `_~!@$%^&:<>{}' are less often used but also require no special punctuation. Any other characters may be included in a symbol's name by escaping them with a backslash. In contrast to its use in strings, however, a backslash in the name of a symbol simply quotes the single character that follows the backslash. For example, in a string, `\t' represents a tab character; in the name of a symbol, however, `\t' merely quotes the letter `t'. To have a symbol with a tab character in its name, you must actually use a tab (preceded with a backslash). But it's rare to do such a thing.
Common Lisp note: In Common Lisp, lower case letters are always "folded" to upper case, unless they are explicitly escaped. In Emacs Lisp, upper case and lower case letters are distinct.
Here are several examples of symbol names. Note that the `+' in the fifth example is escaped to prevent it from being read as a number. This is not necessary in the sixth example because the rest of the name makes it invalid as a number.
foo ; A symbol named `foo'.
FOO ; A symbol named `FOO', different from `foo'.
char-to-string ; A symbol named `char-to-string'.
1+ ; A symbol named `1+'
; (not `+1', which is an integer).
\+1 ; A symbol named `+1'
; (not a very readable name).
\(*\ 1\ 2\) ; A symbol named `(* 1 2)' (a worse name).
+-*/_~!@$%^&=:<>{} ; A symbol named `+-*/_~!@$%^&=:<>{}'.
; These characters need not be escaped.
A sequence is a Lisp object that represents an ordered set of elements. There are two kinds of sequence in Emacs Lisp, lists and arrays. Thus, an object of type list or of type array is also considered a sequence.
Arrays are further subdivided into strings, vectors, char-tables and
bool-vectors. Vectors can hold elements of any type, but string
elements must be characters, and bool-vector elements must be t
or nil. The characters in a string can have text properties like
characters in a buffer (see section Text Properties); vectors and
bool-vectors do not support text properties even when their elements
happen to be characters. Char-tables are like vectors except that they
are indexed by any valid character code.
Lists, strings and the other array types are different, but they have
important similarities. For example, all have a length l, and all
have elements which can be indexed from zero to l minus one.
Several functions, called sequence functions, accept any kind of
sequence. For example, the function elt can be used to extract
an element of a sequence, given its index. See section Sequences, Arrays, and Vectors.
It is generally impossible to read the same sequence twice, since
sequences are always created anew upon reading. If you read the read
syntax for a sequence twice, you get two sequences with equal contents.
There is one exception: the empty list () always stands for the
same object, nil.
A cons cell is an object that consists of two pointers or slots, called the CAR slot and the CDR slot. Each slot can point to or hold to any Lisp object. We also say that the "the CAR of this cons cell is" whatever object its CAR slot currently points to, and likewise for the CDR.
A list is a series of cons cells, linked together so that the CDR slot of each cons cell holds either the next cons cell or the empty list. See section Lists, for functions that work on lists. Because most cons cells are used as part of lists, the phrase list structure has come to refer to any structure made out of cons cells.
The names CAR and CDR derive from the history of Lisp. The
original Lisp implementation ran on an IBM 704 computer which
divided words into two parts, called the "address" part and the
"decrement"; CAR was an instruction to extract the contents of
the address part of a register, and CDR an instruction to extract
the contents of the decrement. By contrast, "cons cells" are named
for the function cons that creates them, which in turn is named
for its purpose, the construction of cells.
Because cons cells are so central to Lisp, we also have a word for "an object which is not a cons cell". These objects are called atoms.
The read syntax and printed representation for lists are identical, and consist of a left parenthesis, an arbitrary number of elements, and a right parenthesis.
Upon reading, each object inside the parentheses becomes an element
of the list. That is, a cons cell is made for each element. The
CAR slot of the cons cell points to the element, and its CDR
slot points to the next cons cell of the list, which holds the next
element in the list. The CDR slot of the last cons cell is set to
point to nil.
A list can be illustrated by a diagram in which the cons cells are
shown as pairs of boxes, like dominoes. (The Lisp reader cannot read
such an illustration; unlike the textual notation, which can be
understood by both humans and computers, the box illustrations can be
understood only by humans.) This picture represents the three-element
list (rose violet buttercup):
--- --- --- --- --- ---
| | |--> | | |--> | | |--> nil
--- --- --- --- --- ---
| | |
| | |
--> rose --> violet --> buttercup
In this diagram, each box represents a slot that can point to any Lisp object. Each pair of boxes represents a cons cell. Each arrow is a pointer to a Lisp object, either an atom or another cons cell.
In this example, the first box, which holds the CAR of the first
cons cell, points to or "contains" rose (a symbol). The second
box, holding the CDR of the first cons cell, points to the next
pair of boxes, the second cons cell. The CAR of the second cons
cell is violet, and its CDR is the third cons cell. The
CDR of the third (and last) cons cell is nil.
Here is another diagram of the same list, (rose violet
buttercup), sketched in a different manner:
--------------- ---------------- ------------------- | car | cdr | | car | cdr | | car | cdr | | rose | o-------->| violet | o-------->| buttercup | nil | | | | | | | | | | --------------- ---------------- -------------------
A list with no elements in it is the empty list; it is identical
to the symbol nil. In other words, nil is both a symbol
and a list.
Here are examples of lists written in Lisp syntax:
(A 2 "A") ; A list of three elements.
() ; A list of no elements (the empty list).
nil ; A list of no elements (the empty list).
("A ()") ; A list of one element: the string "A ()".
(A ()) ; A list of two elements: A and the empty list.
(A nil) ; Equivalent to the previous.
((A B C)) ; A list of one element
; (which is a list of three elements).
Here is the list (A ()), or equivalently (A nil),
depicted with boxes and arrows:
--- --- --- ---
| | |--> | | |--> nil
--- --- --- ---
| |
| |
--> A --> nil
Dotted pair notation is an alternative syntax for cons cells
that represents the CAR and CDR explicitly. In this syntax,
(a . b) stands for a cons cell whose CAR is
the object a, and whose CDR is the object b. Dotted
pair notation is therefore more general than list syntax. In the dotted
pair notation, the list `(1 2 3)' is written as `(1 . (2 . (3
. nil)))'. For nil-terminated lists, you can use either
notation, but list notation is usually clearer and more convenient.
When printing a list, the dotted pair notation is only used if the
CDR of a cons cell is not a list.
Here's an example using boxes to illustrate dotted pair notation.
This example shows the pair (rose . violet):
--- ---
| | |--> violet
--- ---
|
|
--> rose
You can combine dotted pair notation with list notation to represent
conveniently a chain of cons cells with a non-nil final CDR.
You write a dot after the last element of the list, followed by the
CDR of the final cons cell. For example, (rose violet
. buttercup) is equivalent to (rose . (violet . buttercup)).
The object looks like this:
--- --- --- ---
| | |--> | | |--> buttercup
--- --- --- ---
| |
| |
--> rose --> violet
The syntax (rose . violet . buttercup) is invalid because
there is nothing that it could mean. If anything, it would say to put
buttercup in the CDR of a cons cell whose CDR is already
used for violet.
The list (rose violet) is equivalent to (rose . (violet)),
and looks like this:
--- --- --- ---
| | |--> | | |--> nil
--- --- --- ---
| |
| |
--> rose --> violet
Similarly, the three-element list (rose violet buttercup)
is equivalent to (rose . (violet . (buttercup))).
An association list or alist is a specially-constructed list whose elements are cons cells. In each element, the CAR is considered a key, and the CDR is considered an associated value. (In some cases, the associated value is stored in the CAR of the CDR.) Association lists are often used as stacks, since it is easy to add or remove associations at the front of the list.
For example,
(setq alist-of-colors
'((rose . red) (lily . white) (buttercup . yellow)))
sets the variable alist-of-colors to an alist of three elements. In the
first element, rose is the key and red is the value.
See section Association Lists, for a further explanation of alists and for functions that work on alists.
An array is composed of an arbitrary number of slots for pointing to other Lisp objects, arranged in a contiguous block of memory. Accessing any element of an array takes approximately the same amount of time. In contrast, accessing an element of a list requires time proportional to the position of the element in the list. (Elements at the end of a list take longer to access than elements at the beginning of a list.)
Emacs defines four types of array: strings, vectors, bool-vectors, and char-tables.
A string is an array of characters and a vector is an array of
arbitrary objects. A bool-vector can hold only t or nil.
These kinds of array may have any length up to the largest integer.
Char-tables are sparse arrays indexed by any valid character code; they
can hold arbitrary objects.
The first element of an array has index zero, the second element has index 1, and so on. This is called zero-origin indexing. For example, an array of four elements has indices 0, 1, 2, and 3. The largest possible index value is one less than the length of the array. Once an array is created, its length is fixed.
All Emacs Lisp arrays are one-dimensional. (Most other programming languages support multidimensional arrays, but they are not essential; you can get the same effect with an array of arrays.) Each type of array has its own read syntax; see the following sections for details.
The array type is contained in the sequence type and contains the string type, the vector type, the bool-vector type, and the char-table type.
A string is an array of characters. Strings are used for many purposes in Emacs, as can be expected in a text editor; for example, as the names of Lisp symbols, as messages for the user, and to represent text extracted from buffers. Strings in Lisp are constants: evaluation of a string returns the same string.
See section Strings and Characters, for functions that operate on strings.
The read syntax for strings is a double-quote, an arbitrary number of
characters, and another double-quote, "like this". To include a
double-quote in a string, precede it with a backslash; thus, "\""
is a string containing just a single double-quote character. Likewise,
you can include a backslash by preceding it with another backslash, like
this: "this \\ is a single embedded backslash".
The newline character is not special in the read syntax for strings; if you write a new line between the double-quotes, it becomes a character in the string. But an escaped newline--one that is preceded by `\'---does not become part of the string; i.e., the Lisp reader ignores an escaped newline while reading a string. An escaped space `\ ' is likewise ignored.
"It is useful to include newlines
in documentation strings,
but the newline is \
ignored if escaped."
=> "It is useful to include newlines
in documentation strings,
but the newline is ignored if escaped."
You can include a non-ASCII international character in a string constant by writing it literally. There are two text representations for non-ASCII characters in Emacs strings (and in buffers): unibyte and multibyte. If the string constant is read from a multibyte source, such as a multibyte buffer or string, or a file that would be visited as multibyte, then the character is read as a multibyte character, and that makes the string multibyte. If the string constant is read from a unibyte source, then the character is read as unibyte and that makes the string unibyte.
You can also represent a multibyte non-ASCII character with its character code, using a hex escape, `\xnnnnnnn', with as many digits as necessary. (Multibyte non-ASCII character codes are all greater than 256.) Any character which is not a valid hex digit terminates this construct. If the character that would follow is a hex digit, write `\ ' (backslash and space) to terminate the hex escape--for example, `\x8e0\ ' represents one character, `a' with grave accent. `\ ' in a string constant is just like backslash-newline; it does not contribute any character to the string, but it does terminate the preceding hex escape.
Using a multibyte hex escape forces the string to multibyte. You can represent a unibyte non-ASCII character with its character code, which must be in the range from 128 (0200 octal) to 255 (0377 octal). This forces a unibyte string. See section Text Representations, for more information about the two text representations.
You can use the same backslash escape-sequences in a string constant
as in character literals (but do not use the question mark that begins a
character constant). For example, you can write a string containing the
nonprinting characters tab and C-a, with commas and spaces between
them, like this: "\t, \C-a". See section Character Type, for a
description of the read syntax for characters.
However, not all of the characters you can write with backslash escape-sequences are valid in strings. The only control characters that a string can hold are the ASCII control characters. Strings do not distinguish case in ASCII control characters.
Properly speaking, strings cannot hold meta characters; but when a
string is to be used as a key sequence, there is a special convention
that provides a way to represent meta versions of ASCII characters in a
string. If you use the `\M-' syntax to indicate a meta character
in a string constant, this sets the
bit of the character in the string. If the string is used in
define-key or lookup-key, this numeric code is translated
into the equivalent meta character. See section Character Type.
Strings cannot hold characters that have the hyper, super, or alt modifiers.
A string can hold properties for the characters it contains, in addition to the characters themselves. This enables programs that copy text between strings and buffers to copy the text's properties with no special effort. See section Text Properties, for an explanation of what text properties mean. Strings with text properties use a special read and print syntax:
#("characters" property-data...)
where property-data consists of zero or more elements, in groups of three as follows:
beg end plist
The elements beg and end are integers, and together specify a range of indices in the string; plist is the property list for that range. For example,
#("foo bar" 0 3 (face bold) 3 4 nil 4 7 (face italic))
represents a string whose textual contents are `foo bar', in which
the first three characters have a face property with value
bold, and the last three have a face property with value
italic. (The fourth character has no text properties, so its
property list is nil. It is not actually necessary to mention
ranges with nil as the property list, since any characters not
mentioned in any range will default to having no properties.)
A vector is a one-dimensional array of elements of any type. It takes a constant amount of time to access any element of a vector. (In a list, the access time of an element is proportional to the distance of the element from the beginning of the list.)
The printed representation of a vector consists of a left square bracket, the elements, and a right square bracket. This is also the read syntax. Like numbers and strings, vectors are considered constants for evaluation.
[1 "two" (three)] ; A vector of three elements.
=> [1 "two" (three)]
See section Vectors, for functions that work with vectors.
A char-table is a one-dimensional array of elements of any type, indexed by character codes. Char-tables have certain extra features to make them more useful for many jobs that involve assigning information to character codes--for example, a char-table can have a parent to inherit from, a default value, and a small number of extra slots to use for special purposes. A char-table can also specify a single value for a whole character set.
The printed representation of a char-table is like a vector except that there is an extra `#^' at the beginning.
See section Char-Tables, for special functions to operate on char-tables. Uses of char-tables include:
A bool-vector is a one-dimensional array of elements that
must be t or nil.
The printed representation of a Bool-vector is like a string, except
that it begins with `#&' followed by the length. The string
constant that follows actually specifies the contents of the bool-vector
as a bitmap--each "character" in the string contains 8 bits, which
specify the next 8 elements of the bool-vector (1 stands for t,
and 0 for nil). The least significant bits of the character
correspond to the lowest indices in the bool-vector. If the length is not a
multiple of 8, the printed representation shows extra elements, but
these extras really make no difference.
(make-bool-vector 3 t)
=> #&3"\007"
(make-bool-vector 3 nil)
=> #&3"\0"
;; These are equal since only the first 3 bits are used.
(equal #&3"\377" #&3"\007")
=> t
Just as functions in other programming languages are executable,
Lisp function objects are pieces of executable code. However,
functions in Lisp are primarily Lisp objects, and only secondarily the
text which represents them. These Lisp objects are lambda expressions:
lists whose first element is the symbol lambda (see section Lambda Expressions).
In most programming languages, it is impossible to hav