| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
A machine description has two parts: a file of instruction patterns (`.md' file) and a C header file of macro definitions.
The `.md' file for a target machine contains a pattern for each instruction that the target machine supports (or at least each instruction that is worth telling the compiler about). It may also contain comments. A semicolon causes the rest of the line to be a comment, unless the semicolon is inside a quoted string.
See the next chapter for information on the C header file.
16.1 Everything about Instruction Patterns How to write instruction patterns. 16.2 Example of define_insnAn explained example of a define_insnpattern.16.3 RTL Template The RTL template defines what insns match a pattern. 16.4 Output Templates and Operand Substitution The output template says how to make assembler code from such an insn. 16.5 C Statements for Assembler Output For more generality, write C code to output the assembler code. 16.6 Operand Constraints When not all operands are general operands. 16.7 Standard Pattern Names For Generation Names mark patterns to use for code generation. 16.8 When the Order of Patterns Matters When the order of patterns makes a difference. 16.9 Interdependence of Patterns Having one pattern may make you need another. 16.10 Defining Jump Instruction Patterns Special considerations for patterns for jump insns. 16.11 Canonicalization of Instructions 16.12 Machine-Specific Peephole Optimizers Defining machine-specific peephole optimizations. 16.13 Defining RTL Sequences for Code Generation Generating a sequence of several RTL insns for a standard operation. 16.14 Defining How to Split Instructions Splitting Instructions into Multiple Instructions 16.15 Instruction Attributes Specifying the value of attributes for generated insns.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Each instruction pattern contains an incomplete RTL expression, with pieces
to be filled in later, operand constraints that restrict how the pieces can
be filled in, and an output pattern or C code to generate the assembler
output, all wrapped up in a define_insn expression.
A define_insn is an RTL expression containing four or five operands:
The absence of a name is indicated by writing an empty string where the name should go. Nameless instruction patterns are never used for generating RTL code, but they may permit several simpler insns to be combined later on.
Names that are not thus known and used in RTL-generation have no effect; they are equivalent to no name at all.
match_operand,
match_operator, and match_dup expressions that stand for
operands of the instruction.
If the vector has only one element, that element is the template for the
instruction pattern. If the vector has multiple elements, then the
instruction pattern is a parallel expression containing the
elements described.
For a named pattern, the condition (if present) may not depend on the data in the insn being matched, but only the target-machine-type flags. The compiler needs to test these conditions during initialization in order to learn exactly which named instructions are available in a particular run.
For nameless patterns, the condition is applied only when matching an
individual insn, and only after the insn has matched the pattern's
recognition template. The insn's operands may be found in the vector
operands.
When simple substitution isn't general enough, you can specify a piece of C code to compute the output. See section 16.5 C Statements for Assembler Output.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
define_insn Here is an actual example of an instruction pattern, for the 68000/68020.
(define_insn "tstsi"
[(set (cc0)
(match_operand:SI 0 "general_operand" "rm"))]
""
"*
{ if (TARGET_68020 || ! ADDRESS_REG_P (operands[0]))
return \"tstl %0\";
return \"cmpl #0,%0\"; }")
|
This is an instruction that sets the condition codes based on the value of
a general operand. It has no condition, so any insn whose RTL description
has the form shown may be handled according to this pattern. The name
`tstsi' means "test a SImode value" and tells the RTL generation
pass that, when it is necessary to test such a value, an insn to do so
can be constructed using this pattern.
The output control string is a piece of C code which chooses which output template to return based on the kind of operand and the specific type of CPU for which code is being generated.
`"rm"' is an operand constraint. Its meaning is explained below.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The RTL template is used to define which insns match the particular pattern and how to find their operands. For named patterns, the RTL template also says how to construct an insn from specified operands.
Construction involves substituting specified operands into a copy of the template. Matching involves determining the values that serve as the operands in the insn being matched. Both of these activities are controlled by special expression types that direct matching and substitution of the operands.
(match_operand:m n predicate constraint)
Operand numbers must be chosen consecutively counting from zero in
each instruction pattern. There may be only one match_operand
expression in the pattern for each operand number. Usually operands
are numbered in the order of appearance in match_operand
expressions. In the case of a define_expand, any operand numbers
used only in match_dup expressions have higher values than all
other operand numbers.
predicate is a string that is the name of a C function that accepts two
arguments, an expression and a machine mode. During matching, the
function will be called with the putative operand as the expression and
m as the mode argument (if m is not specified,
VOIDmode will be used, which normally causes predicate to accept
any mode). If it returns zero, this instruction pattern fails to match.
predicate may be an empty string; then it means no test is to be done
on the operand, so anything which occurs in this position is valid.
Most of the time, predicate will reject modes other than m---but
not always. For example, the predicate address_operand uses
m as the mode of memory ref that the address should be valid for.
Many predicates accept const_int nodes even though their mode is
VOIDmode.
constraint controls reloading and the choice of the best register class to use for a value, as explained later (see section 16.6 Operand Constraints).
People are often unclear on the difference between the constraint and the predicate. The predicate helps decide whether a given insn matches the pattern. The constraint plays no role in this decision; instead, it controls various decisions in the case of an insn which does match.
On CISC machines, the most common predicate is
"general_operand". This function checks that the putative
operand is either a constant, a register or a memory reference, and that
it is valid for mode m.
For an operand that must be a register, predicate should be
"register_operand". Using "general_operand" would be
valid, since the reload pass would copy any non-register operands
through registers, but this would make GNU CC do extra work, it would
prevent invariant operands (such as constant) from being removed from
loops, and it would prevent the register allocator from doing the best
possible job. On RISC machines, it is usually most efficient to allow
predicate to accept only objects that the constraints allow.
For an operand that must be a constant, you must be sure to either use
"immediate_operand" for predicate, or make the instruction
pattern's extra condition require a constant, or both. You cannot
expect the constraints to do this work! If the constraints allow only
constants, but the predicate allows something else, the compiler will
crash when that case arises.
(match_scratch:m n constraint)
scratch or reg
expression.
When matching patterns, this is equivalent to
(match_operand:m n "scratch_operand" pred) |
but, when generating RTL, it produces a (scratch:m)
expression.
If the last few expressions in a parallel are clobber
expressions whose operands are either a hard register or
match_scratch, the combiner can add or delete them when
necessary. See section 15.13 Side Effect Expressions.
(match_dup n)
In construction, match_dup acts just like match_operand:
the operand is substituted into the insn being constructed. But in
matching, match_dup behaves differently. It assumes that operand
number n has already been determined by a match_operand
appearing earlier in the recognition template, and it matches only an
identical-looking expression.
(match_operator:m n predicate [operands...])
When constructing an insn, it stands for an RTL expression whose expression code is taken from that of operand n, and whose operands are constructed from the patterns operands.
When matching an expression, it matches an expression if the function predicate returns nonzero on that expression and the patterns operands match the operands of the expression.
Suppose that the function commutative_operator is defined as
follows, to match any expression whose operator is one of the
commutative arithmetic operators of RTL and whose mode is mode:
int
commutative_operator (x, mode)
rtx x;
enum machine_mode mode;
{
enum rtx_code code = GET_CODE (x);
if (GET_MODE (x) != mode)
return 0;
return (GET_RTX_CLASS (code) == 'c'
|| code == EQ || code == NE);
}
|
Then the following pattern will match any RTL expression consisting of a commutative operator applied to two general operands:
(match_operator:SI 3 "commutative_operator" [(match_operand:SI 1 "general_operand" "g") (match_operand:SI 2 "general_operand" "g")]) |
Here the vector [operands...] contains two patterns
because the expressions to be matched all contain two operands.
When this pattern does match, the two operands of the commutative
operator are recorded as operands 1 and 2 of the insn. (This is done
by the two instances of match_operand.) Operand 3 of the insn
will be the entire commutative expression: use GET_CODE
(operands[3]) to see which commutative operator was used.
The machine mode m of match_operator works like that of
match_operand: it is passed as the second argument to the
predicate function, and that function is solely responsible for
deciding whether the expression to be matched "has" that mode.
When constructing an insn, argument 3 of the gen-function will specify the operation (i.e. the expression code) for the expression to be made. It should be an RTL expression, whose expression code is copied into a new expression whose operands are arguments 1 and 2 of the gen-function. The subexpressions of argument 3 are not used; only its expression code matters.
When match_operator is used in a pattern for matching an insn,
it usually best if the operand number of the match_operator
is higher than that of the actual operands of the insn. This improves
register allocation because the register allocator often looks at
operands 1 and 2 of insns to see if it can do register tying.
There is no way to specify constraints in match_operator. The
operand of the insn which corresponds to the match_operator
never has any constraints because it is never reloaded as a whole.
However, if parts of its operands are matched by
match_operand patterns, those parts may have constraints of
their own.
(match_op_dup:m n[operands...])
match_dup, except that it applies to operators instead of
operands. When constructing an insn, operand number n will be
substituted at this point. But in matching, match_op_dup behaves
differently. It assumes that operand number n has already been
determined by a match_operator appearing earlier in the
recognition template, and it matches only an identical-looking
expression.
(match_parallel n predicate [subpat...])
parallel expression with a variable number of elements. This
expression should only appear at the top level of an insn pattern.
When constructing an insn, operand number n will be substituted at
this point. When matching an insn, it matches if the body of the insn
is a parallel expression with at least as many elements as the
vector of subpat expressions in the match_parallel, if each
subpat matches the corresponding element of the parallel,
and the function predicate returns nonzero on the
parallel that is the body of the insn. It is the responsibility
of the predicate to validate elements of the parallel beyond
those listed in the match_parallel.
A typical use of match_parallel is to match load and store
multiple expressions, which can contain a variable number of elements
in a parallel. For example,
(define_insn ""
[(match_parallel 0 "load_multiple_operation"
[(set (match_operand:SI 1 "gpc_reg_operand" "=r")
(match_operand:SI 2 "memory_operand" "m"))
(use (reg:SI 179))
(clobber (reg:SI 179))])]
""
"loadm 0,0,%1,%2")
|
This example comes from `a29k.md'. The function
load_multiple_operations is defined in `a29k.c' and checks
that subsequent elements in the parallel are the same as the
set in the pattern, except that they are referencing subsequent
registers and memory locations.
An insn that matches this pattern might look like:
(parallel
[(set (reg:SI 20) (mem:SI (reg:SI 100)))
(use (reg:SI 179))
(clobber (reg:SI 179))
(set (reg:SI 21)
(mem:SI (plus:SI (reg:SI 100)
(const_int 4))))
(set (reg:SI 22)
(mem:SI (plus:SI (reg:SI 100)
(const_int 8))))])
|
(match_par_dup n [subpat...])
match_op_dup, but for match_parallel instead of
match_operator.
(match_insn predicate)
match_* recognizers,
match_insn does not take an operand number.
The machine mode m of match_insn works like that of
match_operand: it is passed as the second argument to the
predicate function, and that function is solely responsible for
deciding whether the expression to be matched "has" that mode.
(match_insn2 n predicate)
The machine mode m of match_insn2 works like that of
match_operand: it is passed as the second argument to the
predicate function, and that function is solely responsible for
deciding whether the expression to be matched "has" that mode.
(address (match_operand:m n "address_operand" ""))
address expressions never appear in RTL code, only in machine
descriptions. And they are used only in machine descriptions that do
not use the operand constraint feature. When operand constraints are
in use, the letter `p' in the constraint serves this purpose.
m is the machine mode of the memory location being
addressed, not the machine mode of the address itself. That mode is
always the same on a given target machine (it is Pmode, which
normally is SImode), so there is no point in mentioning it;
thus, no machine mode is written in the address expression. If
some day support is added for machines in which addresses of different
kinds of objects appear differently or are used differently (such as
the PDP-10), different formats would perhaps need different machine
modes and these modes might be written in the address
expression.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The output template is a string which specifies how to output the assembler code for an instruction pattern. Most of the template is a fixed string which is output literally. The character `%' is used to specify where to substitute an operand; it can also be used to identify places where different variants of the assembler require different syntax.
In the simplest case, a `%' followed by a digit n says to output operand n at that point in the string.
`%' followed by a letter and a digit says to output an operand in an
alternate fashion. Four letters have standard, built-in meanings described
below. The machine description macro PRINT_OPERAND can define
additional letters with nonstandard meanings.
`%cdigit' can be used to substitute an operand that is a constant value without the syntax that normally indicates an immediate operand.
`%ndigit' is like `%cdigit' except that the value of the constant is negated before printing.
`%adigit' can be used to substitute an operand as if it were a memory reference, with the actual operand treated as the address. This may be useful when outputting a "load address" instruction, because often the assembler syntax for such an instruction requires you to write the operand as if it were a memory reference.
`%ldigit' is used to substitute a label_ref into a jump
instruction.
`%=' outputs a number which is unique to each instruction in the entire compilation. This is useful for making local labels to be referred to more than once in a single template that generates multiple assembler instructions.
`%' followed by a punctuation character specifies a substitution that
does not use an operand. Only one case is standard: `%%' outputs a
`%' into the assembler code. Other nonstandard cases can be
defined in the PRINT_OPERAND macro. You must also define
which punctuation characters are valid with the
PRINT_OPERAND_PUNCT_VALID_P macro.
The template may generate multiple assembler instructions. Write the text for the instructions, with `\;' between them.
When the RTL contains two operands which are required by constraint to match each other, the output template must refer only to the lower-numbered operand. Matching operands are not always identical, and the rest of the compiler arranges to put the proper RTL expression for printing into the lower-numbered operand.
One use of nonstandard letters or punctuation following `%' is to
distinguish between different assembler languages for the same machine; for
example, Motorola syntax versus MIT syntax for the 68000. Motorola syntax
requires periods in most opcode names, while MIT syntax does not. For
example, the opcode `movel' in MIT syntax is `move.l' in Motorola
syntax. The same file of patterns is used for both kinds of output syntax,
but the character sequence `%.' is used in each place where Motorola
syntax wants a period. The PRINT_OPERAND macro for Motorola syntax
defines the sequence to output a period; the macro for MIT syntax defines
it to do nothing.
As a special case, a template consisting of the single character #
instructs the compiler to first split the insn, and then output the
resulting instructions separately. This helps eliminate redundancy in the
output templates. If you have a define_insn that needs to emit
multiple assembler instructions, and there is an matching define_split
already defined, then you can simply use # as the output template
instead of writing an output template that emits the multiple assembler
instructions.
If the macro ASSEMBLER_DIALECT is defined, you can use construct
of the form `{option0|option1|option2}' in the templates. These
describe multiple variants of assembler language syntax.
See section 17.16.7 Output of Assembler Instructions.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Often a single fixed template string cannot produce correct and efficient assembler code for all the cases that are recognized by a single instruction pattern. For example, the opcodes may depend on the kinds of operands; or some unfortunate combinations of operands may require extra machine instructions.
If the output control string starts with a `@', then it is actually a series of templates, each on a separate line. (Blank lines and leading spaces and tabs are ignored.) The templates correspond to the pattern's constraint alternatives (see section 16.6.2 Multiple Alternative Constraints). For example, if a target machine has a two-address add instruction `addr' to add into a register and another `addm' to add a register to memory, you might write this pattern:
(define_insn "addsi3"
[(set (match_operand:SI 0 "general_operand" "=r,m")
(plus:SI (match_operand:SI 1 "general_operand" "0,0")
(match_operand:SI 2 "general_operand" "g,r")))]
""
"@
addr %2,%0
addm %2,%0")
|
If the output control string starts with a `*', then it is not an
output template but rather a piece of C program that should compute a
template. It should execute a return statement to return the
template-string you want. Most such templates use C string literals, which
require doublequote characters to delimit them. To include these
doublequote characters in the string, prefix each one with `\'.
The operands may be found in the array operands, whose C data type
is rtx [].
It is very common to select different ways of generating assembler code
based on whether an immediate operand is within a certain range. Be
careful when doing this, because the result of INTVAL is an
integer on the host machine. If the host machine has more bits in an
int than the target machine has in the mode in which the constant
will be used, then some of the bits you get from INTVAL will be
superfluous. For proper results, you must carefully disregard the
values of those bits.
It is possible to output an assembler instruction and then go on to output
or compute more of them, using the subroutine output_asm_insn. This
receives two arguments: a template-string and a vector of operands. The
vector may be operands, or it may be another array of rtx
that you declare locally and initialize yourself.
When an insn pattern has multiple alternatives in its constraints, often
the appearance of the assembler code is determined mostly by which alternative
was matched. When this is so, the C code can test the variable
which_alternative, which is the ordinal number of the alternative
that was actually satisfied (0 for the first, 1 for the second alternative,
etc.).
For example, suppose there are two opcodes for storing zero, `clrreg'
for registers and `clrmem' for memory locations. Here is how
a pattern could use which_alternative to choose between them:
(define_insn ""
[(set (match_operand:SI 0 "general_operand" "=r,m")
(const_int 0))]
""
"*
return (which_alternative == 0
? \"clrreg %0\" : \"clrmem %0\");
")
|
The example above, where the assembler code to generate was solely determined by the alternative, could also have been specified as follows, having the output control string start with a `@':
(define_insn ""
[(set (match_operand:SI 0 "general_operand" "=r,m")
(const_int 0))]
""
"@
clrreg %0
clrmem %0")
|
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Each match_operand in an instruction pattern can specify a
constraint for the type of operands allowed.
Constraints can say whether
an operand may be in a register, and which kinds of register; whether the
operand can be a memory reference, and which kinds of address; whether the
operand may be an immediate constant, and which possible values it may
have. Constraints can also require two operands to match.
16.6.1 Simple Constraints Basic use of constraints. 16.6.2 Multiple Alternative Constraints When an insn has two alternative constraint-patterns. 16.6.3 Register Class Preferences Constraints guide which hard register to put things in. 16.6.4 Constraint Modifier Characters More precise control over effects of constraints. 16.6.5 Constraints for Particular Machines Existing constraints for some particular machines. 16.6.6 Not Using Constraints Describing a clean machine without constraints.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The simplest kind of constraint is a string full of letters, each of which describes one kind of operand that is permitted. Here are the letters that are allowed:
For example, an address which is constant is offsettable; so is an address that is the sum of a register and a constant (as long as a slightly larger constant is also within the range of address-offsets supported by the machine); but an autoincrement or autodecrement address is not offsettable. More complicated indirect/indexed addresses may or may not be offsettable depending on the other addressing modes that the machine supports.
Note that in an output operand which can be matched by another operand, the constraint letter `o' is valid only when accompanied by both `<' (if the target machine has predecrement addressing) and `>' (if the target machine has preincrement addressing).
const_double) is
allowed, but only if the target floating point format is the same as
that of the host machine (on which the compiler is running).
const_double) is
allowed.
This might appear strange; if an insn allows a constant operand with a value not known at compile time, it certainly must allow any known value. So why use `s' instead of `i'? Sometimes it allows better code to be generated.
For example, on the 68000 in a fullword instruction it is possible to use an immediate operand; but if the immediate value is between -128 and 127, better code results from loading the value into a register and using the register. This is because the load into the register can be done with a `moveq' instruction. We arrange for this to happen by defining the letter `K' to mean "any integer outside the range -128 to 127", and then specifying `Ks' in the operand constraints.
general_operand. This is normally used in the constraint of
a match_scratch when certain alternatives will not actually
require a scratch register.
This is called a matching constraint and what it really means is that the assembler has only a single operand that fills two roles considered separate in the RTL insn. For example, an add insn has two input operands and one output operand in the RTL, but on most CISC machines an add instruction really has only two operands, one of them an input-output operand:
addl #35,r12 |
Matching constraints are used in these circumstances. More precisely, the two operands that match must include one input-only operand and one output-only operand. Moreover, the digit must be a smaller number than the number of the operand that uses it in the constraint.
For operands to match in a particular case usually means that they
are identical-looking RTL expressions. But in a few special cases
specific kinds of dissimilarity are allowed. For example, *x
as an input operand will match *x++ as an output operand.
For proper results in such cases, the output template should always
use the output-operand's number when printing the operand.
`p' in the constraint must be accompanied by address_operand
as the predicate in the match_operand. This predicate interprets
the mode specified in the match_operand as the mode of the memory
reference for which the address would be valid.
EXTRA_CONSTRAINT is passed the
operand as its first argument and the constraint letter as its
second operand.
A typical use for this would be to distinguish certain types of memory references that affect other insn operands.
Do not define these constraint letters to accept register references
(reg); the reload pass does not expect this and would not handle
it properly.
In order to have valid assembler code, each operand must satisfy its constraint. But a failure to do so does not prevent the pattern from applying to an insn. Instead, it directs the compiler to modify the code so that the constraint will be satisfied. Usually this is done by copying an operand into a register.
Contrast, therefore, the two instruction patterns that follow:
(define_insn ""
[(set (match_operand:SI 0 "general_operand" "=r")
(plus:SI (match_dup 0)
(match_operand:SI 1 "general_operand" "r")))]
""
"...")
|
which has two operands, one of which must appear in two places, and
(define_insn ""
[(set (match_operand:SI 0 "general_operand" "=r")
(plus:SI (match_operand:SI 1 "general_operand" "0")
(match_operand:SI 2 "general_operand" "r")))]
""
"...")
|
which has three operands, two of which are required by a constraint to be identical. If we are considering an insn of the form
(insn n prev next
(set (reg:SI 3)
(plus:SI (reg:SI 6) (reg:SI 109)))
...)
|
the first pattern would not apply at all, because this insn does not contain two identical subexpressions in the right place. The pattern would say, "That does not look like an add instruction; try other patterns." The second pattern would say, "Yes, that's an add instruction, but there is something wrong with it." It would direct the reload pass of the compiler to generate additional insns to make the constraint true. The results might look like this:
(insn n2 prev n
(set (reg:SI 3) (reg:SI 6))
...)
(insn n n2 next
(set (reg:SI 3)
(plus:SI (reg:SI 3) (reg:SI 109)))
...)
|
It is up to you to make sure that each operand, in each pattern, has constraints that can handle any RTL expression that could be present for that operand. (When multiple alternatives are in use, each pattern must, for each possible combination of operand expressions, have at least one alternative which can handle that combination of operands.) The constraints don't need to allow any possible operand--when this is the case, they do not constrain--but they must at least point the way to reloading any possible operand so that it will fit.
For example, an operand whose constraints permit everything except registers is safe provided its predicate rejects registers.
An operand whose predicate accepts only constant values is safe provided its constraints include the letter `i'. If any possible constant value is accepted, then nothing less than `i' will do; if the predicate is more selective, then the constraints may also be more selective.
If the operand's predicate can recognize registers, but the constraint does not permit them, it can make the compiler crash. When this operand happens to be a register, the reload pass will be stymied, because it does not know how to copy a register temporarily into memory.
If the predicate accepts a unary operator, the constraint applies to the
operand. For example, the MIPS processor at ISA level 3 supports an
instruction which adds two registers in SImode to produce a
DImode result, but only if the registers are correctly sign
extended. This predicate for the input operands accepts a
sign_extend of an SImode register. Write the constraint
to indicate the type of register that is required for the operand of the
sign_extend.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Sometimes a single instruction has multiple alternative sets of possible operands. For example, on the 68000, a logical-or instruction can combine register or an immediate value into memory, or it can combine any kind of operand into a register; but it cannot combine one memory location into another.
These constraints are represented as multiple alternatives. An alternative can be described by a series of letters for each operand. The overall constraint for an operand is made from the letters for this operand from the first alternative, a comma, the letters for this operand from the second alternative, a comma, and so on until the last alternative. Here is how it is done for fullword logical-or on the 68000:
(define_insn "iorsi3"
[(set (match_operand:SI 0 "general_operand" "=m,d")
(ior:SI (match_operand:SI 1 "general_operand" "%0,0")
(match_operand:SI 2 "general_operand" "dKs,dmKs")))]
...)
|
The first alternative has `m' (memory) for operand 0, `0' for operand 1 (meaning it must match operand 0), and `dKs' for operand 2. The second alternative has `d' (data register) for operand 0, `0' for operand 1, and `dmKs' for operand 2. The `=' and `%' in the constraints apply to all the alternatives; their meaning is explained in the next section (see section 16.6.3 Register Class Preferences).
If all the operands fit any one alternative, the instruction is valid. Otherwise, for each alternative, the compiler counts how many instructions must be added to copy the operands so that that alternative applies. The alternative requiring the least copying is chosen. If two alternatives need the same amount of copying, the one that comes first is chosen. These choices can be altered with the `?' and `!' characters:
?
!
When an insn pattern has multiple alternatives in its constraints, often
the appearance of the assembler code is determined mostly by which
alternative was matched. When this is so, the C code for writing the
assembler code can use the variable which_alternative, which is
the ordinal number of the alternative that was actually satisfied (0 for
the first, 1 for the second alternative, etc.). See section 16.5 C Statements for Assembler Output.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
The operand constraints have another function: they enable the compiler to decide which kind of hardware register a pseudo register is best allocated to. The compiler examines the constraints that apply to the insns that use the pseudo register, looking for the machine-dependent letters such as `d' and `a' that specify classes of registers. The pseudo register is put in whichever class gets the most "votes". The constraint letters `g' and `r' also vote: they vote in favor of a general register. The machine description says which registers are considered general.
Of course, on some machines all registers are equivalent, and no register classes are defined. Then none of this complexity is relevant.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Here are constraint modifier characters.
When the compiler fixes up the operands to satisfy the constraints, it needs to know which operands are inputs to the instruction and which are outputs from it. `=' identifies an output; `+' identifies an operand that is both input and output; all other operands are assumed to be input only.
`&' applies only to the alternative in which it is written. In constraints with multiple alternatives, sometimes one alternative requires `&' while others do not. See, for example, the `movdf' insn of the 68000.
An input operand can be tied to an earlyclobber operand if its only use as an input occurs before the early result is written. Adding alternatives of this form often allows GCC to produce better code when only some of the inputs can be affected by the earlyclobber. See, for example, the `mulsi3' insn of the ARM.
`&' does not obviate the need to write `='.
(define_insn "addhi3"
[(set (match_operand:HI 0 "general_operand" "=m,r")
(plus:HI (match_operand:HI 1 "general_operand" "%0,0")
(match_operand:HI 2 "general_operand" "di,g")))]
...)
|
Here is an example: the 68000 has an instruction to sign-extend a halfword in a data register, and can also sign-extend a value by copying it into an address register. While either kind of register is acceptable, the constraints on an address-register destination are less strict, so it is best if register allocation makes an address register its goal. Therefore, `*' is used so that the `d' constraint letter (for data register) is ignored when computing register preferences.
(define_insn "extendhisi2"
[(set (match_operand:SI 0 "general_operand" "=*d,a")
(sign_extend:SI
(match_operand:HI 1 "general_operand" "0,g")))]
...)
|
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Whenever possible, you should use the general-purpose constraint letters
in asm arguments, since they will convey meaning more readily to
people reading your code. Failing that, use the constraint letters
that usually have very similar meanings across architectures. The most
commonly used constraints are `m' and `r' (for memory and
general-purpose registers respectively; see section 16.6.1 Simple Constraints), and
`I', usually the letter indicating the most common
immediate-constant format.
For each machine architecture, the `config/machine.h' file
defines additional constraints. These constraints are used by the
compiler itself for instruction generation, as well as for asm
statements; therefore, some of the constraints are not particularly
interesting for asm. The constraints are defined through these
macros:
REG_CLASS_FROM_LETTER
CONST_OK_FOR_LETTER_P
CONST_DOUBLE_OK_FOR_LETTER_P
EXTRA_CONSTRAINT
Inspecting these macro definitions in the compiler source for your machine is the best way to be certain you have the right constraints. However, here is a summary of the machine-dependent constraints available on some particular machines.
f
F
G
I
J
K
L
M
Q
asm statements)
R
S
l
b
q
h
A
a
f
I
J
K
L
M
N
O
P
G
H
asm statements, use the machine
independent `E' or `F' instead)
b
f
h
q
c
l
x
y
z
I
J
K
L
M
N
O
P
G
Q
asm statements)
R
S
U
q
b, c, or d register
A
d register (for 64-bit ints)
f
t
u
a
b
c
d
D
S
I
J
K
L
M
lea instruction)
N
out instruction)
G
f
fp0 to fp3)
l
r0 to r15)
b
g0 to g15)
d
I
J
K
G
H
d
f
h
l
x
y
z
I
J
K
L
lui)
M
N
O
P
G
Q
asm statements)
R
asm statements)
S
asm statements)
a
d
f
x
y
I
J
K
L
M
G
H
f
e
I
J
K
sethi instruction)
G
H
Q
asm statements)
S
T
U
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Some machines are so clean that operand constraints are not required. For example, on the Vax, an operand valid in one context is valid in any other context. On such a machine, every operand constraint would be `g', excepting only operands of "load address" instructions which are written as if they referred to a memory location's contents but actual refer to its address. They would have constraint `p'.
For such machines, instead of writing `g' and `p' for all
the constraints, you can choose to write a description with empty constraints.
Then you write `""' for the constraint in every match_operand.
Address operands are identified by writing an address expression
around the match_operand, not by their constraints.
When the machine description has just empty constraints, certain parts of compilation are skipped, making the compiler faster. However, few machines actually do not need constraints; all machine descriptions now in existence use constraints.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Here is a table of the instruction names that are meaningful in the RTL generation pass of the compiler. Giving one of these names to an instruction pattern tells the RTL generation pass that it can use the pattern to accomplish a certain task.
If operand 0 is a subreg with mode m of a register whose
own mode is wider than m, the effect of this instruction is
to store the specified value in the part of the register that corresponds
to mode m. The effect on the rest of the register is undefined.
This class of patterns is special in several ways. First of all, each of these names must be defined, because there is no other way to copy a datum from one place to another.
Second, these patterns are not used solely in the RTL generation pass. Even the reload pass can generate move insns to copy values from stack slots into temporary registers. When it does so, one of the operands is a hard register and the other is an operand that can need to be reloaded into a register.
Therefore, when given such a pair of operands, the pattern must generate
RTL which needs no reloading and needs no temporary registers--no
registers other than the operands. For example, if you support the
pattern with a define_expand, then in such a case the
define_expand mustn't call force_reg or any other such
function which might generate new pseudo registers.
This requirement exists even for subword modes on a RISC machine where fetching those modes from memory normally requires several insns and some temporary registers. Look in `spur.md' to see how the requirement can be satisfied.
During reload a memory reference with an invalid address may be passed
as an operand. Such an address will be replaced with a valid address
later in the reload pass. In this case, nothing may be done with the
address except to use it as it stands. If it is copied, it will not be
replaced with a valid address. No attempt should be made to make such
an address into a valid address and no routine (such as
change_address) that will do so may be called. Note that
general_operand will fail when applied to such an address.
The global variable reload_in_progress (which must be explicitly
declared if required) can be used to determine whether such special
handling is required.
The variety of operands that have reloads depends on the rest of the machine description, but typically on a RISC machine these can only be pseudo registers that did not get hard registers, while on other machines explicit memory references will get optional reloads.
If a scratch register is required to move an object to or from memory,
it can be allocated using gen_reg_rtx prior to life analysis.
If there are cases needing
scratch registers after reload, you must define
SECONDARY_INPUT_RELOAD_CLASS and perhaps also
SECONDARY_OUTPUT_RELOAD_CLASS to detect them, and provide
patterns `reload_inm' or `reload_outm' to handle
them. See section 17.6 Register Classes.
The global variable no_new_pseudos can be used to determine if it
is unsafe to create new pseudo registers. If this variable is nonzero, then
it is unsafe to call gen_reg_rtx to allocate a new pseudo.
The constraints on a `movm' must permit moving any hard
register to any other hard register provided that
HARD_REGNO_MODE_OK permits mode m in both registers and
REGISTER_MOVE_COST applied to their classes returns a value of 2.
It is obligatory to support floating point `movm'
instructions into and out of any registers that can hold fixed point
values, because unions and structures (which have modes SImode or
DImode) can be in those registers and they may have floating
point members.
There may also be a need to support fixed point `movm'
instructions in and out of floating point registers. Unfortunately, I
have forgotten why this was so, and I don't know whether it is still
true. If HARD_REGNO_MODE_OK rejects fixed point values in
floating point registers, then the constraints of the fixed point
`movm' instructions must be designed to avoid ever trying to
reload into a floating point register.
SECONDARY_RELOAD_CLASS
macro in see section 17.6 Register Classes.
subreg
with mode m of a register whose natural mode is wider,
the `movstrictm' instruction is guaranteed not to alter
any of the register except the part which belongs to mode m.
Define this only if the target machine really has such an instruction; do not define this if the most efficient way of loading consecutive registers from memory is to do them one at a time.
On some machines, there are restrictions as to which consecutive
registers can be stored into memory, such as particular starting or
ending register numbers or only a range of valid counts. For those
machines, use a define_expand (see section 16.13 Defining RTL Sequences for Code Generation)
and make the pattern fail if the restrictions are not met.
Write the generated insn as a parallel with elements being a
set of one register from the appropriate memory location (you may
also need use or clobber elements). Use a
match_parallel (see section 16.3 RTL Template) to recognize the insn. See
`a29k.md' and `rs6000.md' for examples of the use of this insn
pattern.
HImode, and store
a SImode product in operand 0.
For machines with an instruction that produces both a quotient and a remainder, provide a pattern for `divmodm4' but do not provide patterns for `divm3' and `modm3'. This allows optimization in the relatively common case when both the quotient and remainder are computed.
If an instruction that just produces a quotient or just a remainder
exists and is more efficient than the instruction that produces both,
write the output routine of `divmodm4' to call
find_reg_note and look for a REG_UNUSED note on the
quotient or remainder and generate the appropriate instruction.
ashlm3 instructions.
The sqrt built-in function of C always uses the mode which
corresponds to the C data type double.
The ffs built-in function of C always uses the mode which
corresponds to the C data type int.
(set (cc0) (compare (match_operand:m 0 ...)
(match_operand:m 1 ...)))
|
(set (cc0) (match_operand:m 0 ...)) |
`tstm' patterns should not be defined for machines that do
not use (cc0). Doing so would confuse the optimizer since it
would no longer be clear which set operations were comparisons.
The `cmpm' patterns should be used instead.
Pmode.
The number of bytes to move is the third operand, in mode m.
Usually, you specify word_mode for m. However, if you can
generate better code knowing the range of valid lengths is smaller than
those representable in a full word, you should provide a pattern with a
mode corresponding to the range of values you can handle efficiently
(e.g., QImode for values in the range 0--127; note we avoid numbers
that appear negative) and also a pattern with word_mode.
The fourth operand is the known shared alignment of the source and
destination, in the form of a const_int rtx. Thus, if the
compiler knows that both source and destination are word-aligned,
it may provide the value 4 for this operand.
Descriptions of multiple movstrm patterns can only be
beneficial if the patterns for smaller modes have fewer restrictions
on their first, second and fourth operands. Note that the mode m
in movstrm does not impose any restriction on the mode of
individually moved data units in the block.
These patterns need not give special consideration to the possibility that the source and destination strings might overlap.
Pmode. The number of bytes to clear is
the second operand, in mode m. See `movstrm' for
a discussion of the choice of mode.
The third operand is the known alignment of the destination, in the form
of a const_int rtx. Thus, if the compiler knows that the
destination is word-aligned, it may provide the value 4 for this
operand.
The use for multiple clrstrm is as for movstrm.
mem referring to the first character of the string,
operand 2 is the character to search for (normally zero),
and operand 3 is a constant describing the known alignment
of the beginning of the string.
word_mode.
Operand 1 may have mode byte_mode or word_mode; often
word_mode is allowed only for registers. Operands 2 and 3 must
be valid for word_mode.
The RTL generation pass generates this instruction only with constants for operands 2 and 3.
The bit-field value is sign-extended to a full word integer before it is stored in operand 0.
word_mode) into a bit
field in operand 0, where operand 1 specifies the width in bits and
operand 2 the starting bit. Operand 0 may have mode byte_mode or
word_mode; often word_mode is allowed only for registers.
Operands 1 and 2 must be valid for word_mode.
The RTL generation pass generates this instruction only with constants for operands 1 and 2.
The mode of the operands being compared need not be the same as the operands being moved. Some machines, sparc64 for example, have instructions that conditionally move an integer value based on the floating point condition codes and vice versa.
If the machine does not have conditional move instructions, do not define these patterns.
eq, lt or leu.
You specify the mode that the operand must have when you write the
match_operand expression. The compiler automatically sees
which mode you have used and supplies an operand of that mode.
The value stored for a true condition must have 1 as its low bit, or
else must be negative. Otherwise the instruction is not suitable and
you should omit it from the machine description. You describe to the
compiler exactly which value is stored by defining the macro
STORE_FLAG_VALUE (see section 17.19 Miscellaneous Parameters). If a description cannot be
found that can be used for all the `scond' patterns, you
should omit those operations from the machine description.
These operations may fail, but should do so only in relatively uncommon cases; if they would fail for common cases involving integer comparisons, it is best to omit these patterns.
If these operations are omitted, the compiler will usually generate code
that copies the constant one to the target and branches around an
assignment of zero to the target. If this code is more efficient than
the potential instructions used for the `scond' pattern
followed by those required to convert the result into a 1 or a zero in
SImode, you should omit the `scond' operations from
the machine description.
label_ref that
refers to the label to jump to. Jump if the condition codes meet
condition cond.
Some machines do not follow the model assumed here where a comparison
instruction is followed by a conditional branch instruction. In that
case, the `cmpm' (and `tstm') patterns should
simply store the operands away and generate all the required insns in a
define_expand (see section 16.13 Defining RTL Sequences for Code Generation) for the conditional
branch operations. All calls to expand `bcond' patterns are
immediately preceded by calls to expand either a `cmpm'
pattern or a `tstm' pattern.
Machines that use a pseudo register for the condition code value, or where the mode used for the comparison depends on the condition being tested, should also use the above mechanism. See section 16.10 Defining Jump Instruction Patterns.
The above discussion also applies to the `movmodecc' and `scond' patterns.
const_int; operand 2 is the number of registers used as
operands.
On most machines, operand 2 is not actually stored into the RTL pattern. It is supplied for the sake of some RISC machines which need to put this information into the assembler code; they can put it in the RTL instead of operand 1.
Operand 0 should be a mem RTX whose address is the address of the
function. Note, however, that this address can be a symbol_ref
expression even if it would not be a legitimate memory address on the
target machine. If it is also not a valid argument for a call
instruction, the pattern for this operation should be a
define_expand (see section 16.13 Defining RTL Sequences for Code Generation) that places the
address into a register and uses that register in the call instruction.
Subroutines that return BLKmode objects use the `call'
insn.
RETURN_POPS_ARGS is non-zero. They should emit a parallel
that contains both the function call and a set to indicate the
adjustment made to the frame pointer.
For machines where RETURN_POPS_ARGS can be non-zero, the use of these
patterns increases the number of functions for which the frame pointer
can be eliminated, if desired.
parallel
expression where each element is a set expression that indicates
the saving of a function return value into the result block.
This instruction pattern should be defined to support
__builtin_apply on machines where special instructions are needed
to call a subroutine with arbitrary arguments or to save the value
returned. This instruction pattern is required on machines that have
multiple registers that can hold a return value (i.e.
FUNCTION_VALUE_REGNO_P is true for more than one register).
Like the `movm' patterns, this pattern is also used after the RTL generation phase. In this case it is to support machines where multiple instructions are usually needed to return from a function, but some class of functions only requires one instruction to implement a return. Normally, the applicable functions are those which do not need to save any registers or allocate stack space.
For such machines, the condition specified in this pattern should only
be true when reload_completed is non-zero and the function's
epilogue would only be a single instruction. For machines with register
windows, the routine leaf_function_p may be used to determine if
a register window push is required.
Machines that have conditional return instructions should define patterns such as
(define_insn ""
[(set (pc)
(if_then_else (match_operator
0 "comparison_operator"
[(cc0) (const_int 0)])
(return)
(pc)))]
"condition"
"...")
|
where condition would normally be the same condition specified on the named `return' pattern.
__builtin_return on machines where special
instructions are needed to return a value of any type.
Operand 0 is a memory location where the result of calling a function
with __builtin_apply is stored; operand 1 is a parallel
expression where each element is a set expression that indicates
the restoring of a function return value from the result block.
(const_int 0) will do as an
RTL pattern.
SImode.
CASE_DROPS_THROUGH is defined,
then an out-of-bounds index drops through to the code following
the jump table instead of jumping to this label. In that case,
this label is not actually used by the `casesi' instruction,
but it is always provided as an operand.)
The table is a addr_vec or addr_diff_vec inside of a
jump_insn. The number of elements in the table is one plus the
difference between the upper bound and the lower bound.
This pattern requires two operands: the address or offset, and a label
which should immediately precede the jump table. If the macro
CASE_VECTOR_PC_RELATIVE evaluates to a nonzero value then the first
operand is an offset which counts from the address of the table; otherwise,
it is an absolute address to jump to. In either case, the first operand has
mode Pmode.
The `tablejump' insn is always the last insn before the jump table it uses. Its assembler code normally has no need to use the second operand, but you should incorporate it in the RTL pattern so that the jump optimizer will not delete the table as unreachable code.
Operand 0 is always a reg and has mode Pmode; operand 1
may be a reg, mem, symbol_ref, const_int, etc
and also has mode Pmode.
Canonicalization of a function pointer usually involves computing the address of the function which would be called if the function pointer were used in an indirect call.
Only define this pattern if function pointers on the target machine can have different values but still call the same function when used in an indirect call.
Pmode. Do not define these patterns on
such machines.
Some machines require special handling for stack pointer saves and
restores. On those machines, define the patterns corresponding to the
non-standard cases by using a define_expand (see section 16.13 Defining RTL Sequences for Code Generation) that produces the required insns. The three types of
saves and restores are:
alloca. Only
the epilogue uses the restored stack pointer, allowing a simpler save or
restore sequence on some machines.
When saving the stack pointer, operand 0 is the save area and operand 1
is the stack pointer. The mode used to allocate the save area defaults
to Pmode but you can override that choice by defining the
STACK_SAVEAREA_MODE macro (see section 17.3 Storage Layout). You must
specify an integral mode, or VOIDmode if no save area is needed
for a particular type of save (either because no save is needed or
because a machine-specific save area can be used). Operand 0 is the
stack pointer and operand 1 is the save area for restore operations. If
`save_stack_block' is defined, operand 0 must not be
VOIDmode since these saves can be arbitrarily nested.
A save area is a mem that is at a constant offset from
virtual_stack_vars_rtx when the stack pointer is saved for use by
nonlocal gotos and a reg in the other two cases.
STACK_GROWS_DOWNWARD is undefined) operand 1 from
the stack pointer to create space for dynamically allocated data.
Store the resultant pointer to this space into operand 0. If you
are allocating space from the main stack, do this by emitting a
move insn to copy virtual_stack_dynamic_rtx to operand 0.
If you are allocating the space elsewhere, generate code to copy the
location of the space to operand 0. In the latter case, you must
ensure this space gets freed when the corresponding space on the main
stack is free.
Do not define this pattern if all that must be done is the subtraction. Some machines require other operations such as stack probes or maintaining the back chain. Define this pattern to emit those operations in addition to updating the stack pointer.
If you need to emit instructions before the stack has been adjusted, put them into the `allocate_stack' pattern. Otherwise, define this pattern to emit the required instructions.
No operands are provided.
On most machines you need not define this pattern, since GNU CC will already generate the correct code, which is to load the frame pointer and static chain, restore the stack (using the `restore_stack_nonlocal' pattern, if defined), and jump indirectly to the dispatcher. You need only define this pattern if this code will not work on your machine.
jmp_buf. You will not normally need to define this pattern.
A typical reason why you might need this pattern is if some value, such
as a pointer to a global table, must be restored. Though it is
preferred that the pointer value be recalculated if possible (given the
address of a label for instance). The single argument is a pointer to
the jmp_buf. Note that the buffer is five words long and that
the first three are normally used by the generic mechanism.
builtin_setjmp_setup. The single argument is a pointer to the
jmp_buf.
__builtin_eh_return,
and thence __throw are built. It is intended to allow communication
between the exception handling machinery and the normal epilogue code
for the target.
The pattern takes three arguments. The first is the exception context pointer. This will have already been copied to the function return register appropriate for a pointer; normally this can be ignored. The second argument is an offset to be added to the stack pointer. It will have been copied to some arbitrary call-clobbered hard reg so that it will survive until after reload to when the normal epilogue is generated. The final argument is the address of the exception handler to which the function should return. This will normally need to copied by the pattern to some special register.
This pattern must be defined if RETURN_ADDR_RTX does not yield
something that can be reliably and permanently modified, i.e. a fixed
hard register or a stack memory reference.
Using a prologue pattern is generally preferred over defining
FUNCTION_PROLOGUE to emit assembly code for the prologue.
The prologue pattern is particularly useful for targets which perform
instruction scheduling.
Using an epilogue pattern is generally preferred over defining
FUNCTION_EPILOGUE to emit assembly code for the prologue.
The epilogue pattern is particularly useful for targets which perform
instruction scheduling or which have delay slots for their return instruction.
The sibcall_epilogue pattern must not clobber any arguments used for
parameter passing or any stack slots for arguments passed to the current
function.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
Sometimes an insn can match more than one instruction pattern. Then the pattern that appears first in the machine description is the one used. Therefore, more specific patterns (patterns that will match fewer things) and faster instructions (those that will produce better code when they do match) should usually go first in the description.
In some cases the effect of ordering the patterns can be used to hide a pattern when it is not valid. For example, the 68000 has an instruction for converting a fullword to floating point and another for converting a byte to floating point. An instruction converting an integer to floating point could match either one. We put the pattern to convert the fullword first to make sure that one will be used rather than the other. (Otherwise a large integer might be generated as a single-byte immediate quantity, which would not work.) Instead of using this pattern ordering it would be possible to make the pattern for convert-a-byte smart enough to deal properly with any constant value.
| [ < ] | [ > ] | [ << ] | [ Up ] | [ >> ] |