deep

a Cross Development Platform for Java


Register Allocation

The ARM architecture offers 16 GPRs. Depending on the implemented VFP version, a certain number of floating-point registers (extension registers, EXTRs) are present. The register usage table below assumes 32 EXTRs of 64 bits each. The first 16 of these EXTRs are interleaved with 32 EXTRs of 32-bit precision.

Register Usage

Register  State        Use
R0        volatile     1st parameter, local variables, return value
R1        volatile     2nd parameter, local variables, return value (if long, lower 4 bytes of long)
R2-R5     volatile     further parameters, local variables
R6        dedicated    scratch register
R7-R12    nonvolatile  local variables
R13       dedicated    stack pointer
R14       dedicated    link register, scratch register
R15       dedicated    PC
D0        dedicated    scratch register
D1        dedicated    scratch register, return value
D2-D6     volatile     parameters, local variables
D7-D10    volatile     local variables
D11-D31   nonvolatile  local variables
S0,S1     dedicated    scratch register
S2        volatile     1st parameter, local variables, return value
S3-S13    volatile     further parameters, local variables
S14-S21   volatile     local variables
S22-S31   nonvolatile  local variables

Besides the 32 double-precision extension registers of the VFPv3, there are 32 single-precision registers, which are interleaved with the first 16 double-precision registers. When the register allocator reserves a double-precision register, e.g. d3, it marks s6 and s7 as reserved as well. When it reserves a single-precision register, e.g. s28, it marks d14 as reserved as well.
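The aliasing rule above can be sketched with two bit masks. This is only an illustration: the class and method names are hypothetical and not taken from the deep sources. It assumes the standard VFPv3 layout in which Dn overlaps S(2n) and S(2n+1) for n < 16.

```java
// Illustrative sketch only; names are hypothetical, not part of deep.
final class ExtRegAliasing {
    private int reservedD; // bit i set: Di is reserved (D0..D31)
    private int reservedS; // bit i set: Si is reserved (S0..S31)

    /** Reserves Dd and, for d < 16, the two single registers sharing its storage. */
    void reserveDouble(int d) {
        reservedD |= 1 << d;
        if (d < 16) {                    // only D0..D15 overlap with S registers
            reservedS |= 3 << (2 * d);   // S(2d) and S(2d+1)
        }
    }

    /** Reserves Ss and the double register that contains it. */
    void reserveSingle(int s) {
        reservedS |= 1 << s;
        reservedD |= 1 << (s / 2);       // S(2n) and S(2n+1) both live in Dn
    }

    boolean isDoubleReserved(int d) { return (reservedD & (1 << d)) != 0; }

    boolean isSingleReserved(int s) { return (reservedS & (1 << s)) != 0; }
}
```

With this sketch, reserving d3 marks s6 and s7, and reserving s28 marks d14, matching the two examples in the text.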

Local variables are assigned to volatile registers. However, if the live range of an SSA value spans a method call, a nonvolatile register has to be used. Nonvolatile registers are assigned in decreasing order, starting from R12/D31. Working registers are taken from the volatile registers. If there are not enough of them, nonvolatile registers (just below those used for locals) are assigned.
Care must be taken when defining which extension registers are volatile and which are nonvolatile. Because S and D registers are interleaved, there are about twice as many nonvolatile D registers (D11-D31) as nonvolatile S registers (S22-S31).
Optimization: If a method is a leaf method, it would not be necessary to copy the parameters from their volatile registers into nonvolatile registers.
For the translation of certain SSA instructions (e.g. of type long) further auxiliary registers are needed. They are assigned and reserved by the register allocator as well.
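A minimal sketch of the selection rule for GPRs, using the register ranges from the usage table. The class name, the ascending search order within the volatile registers, and the return value -1 for "no register free" are assumptions made for illustration. The fallback from volatile to nonvolatile registers follows what the text states for working registers.

```java
// Illustrative sketch only; names and search order are hypothetical.
final class GprChooser {
    static final int FIRST_VOLATILE = 0, LAST_VOLATILE = 5;        // R0..R5
    static final int FIRST_NONVOLATILE = 7, LAST_NONVOLATILE = 12; // R7..R12

    private final boolean[] used = new boolean[16];

    /**
     * Picks a register for an SSA value.
     * Returns the register number, or -1 if no register is free.
     */
    int allocate(boolean liveAcrossCall) {
        if (!liveAcrossCall) {
            // Values that do not span a call may live in a volatile register.
            for (int r = FIRST_VOLATILE; r <= LAST_VOLATILE; r++) {
                if (!used[r]) { used[r] = true; return r; }
            }
        }
        // Nonvolatile registers are handed out in decreasing order from R12.
        for (int r = LAST_NONVOLATILE; r >= FIRST_NONVOLATILE; r--) {
            if (!used[r]) { used[r] = true; return r; }
        }
        return -1;
    }
}
```

A leaf method never passes `liveAcrossCall = true`, so in this sketch its parameters could stay in their volatile registers, which is the optimization mentioned above.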

Parameter Passing

Parameters are passed in R0..R5 and D2..D6.
The interface arm/Registers defines which registers are volatile, which are nonvolatile, and which are used for parameter passing. Important: the number of parameter registers must be less than or equal to the number of volatile registers. Because D and S registers are interleaved in the extension registers, parameters must be copied carefully into the parameter registers. D2 to D6 hold the parameters regardless of whether a parameter is of type single or double. For example, for the method m1(float a, double b, float c, double d) the four parameters are passed as follows:

S registers:  S0 S1   S2 S3   S4 S5   S6 S7   S8 S9   S10 S11
D registers:  D0      D1      D2      D3      D4      D5
Parameter:    -       -       a       b       c       d

If the parameters do not fit into the parameter registers, the remaining ones are passed on the stack in a special parameter block. Note that the block is first filled with the parameters of type integer (the offset increases with the parameter number). After these, the parameters of type float and double are pushed into the block. Parameters of type float and double both occupy 8 bytes.
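The passing rules can be sketched as a small assignment routine. This is an illustration under stated assumptions, not the code used in deep: the class `ParamLayout` and its `Kind` enum are hypothetical, parameters of type long are left out because their register pairing is not described here, and stack positions are given as slot indices in block order rather than byte offsets.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; names are hypothetical, long parameters are not modeled.
final class ParamLayout {
    enum Kind { INT, FLOAT, DOUBLE }

    static final int FIRST_GPR = 0, LAST_GPR = 5;   // R0..R5
    static final int FIRST_EXTR = 2, LAST_EXTR = 6; // D2..D6

    /** Returns the location of each parameter, in declaration order. */
    static List<String> assign(Kind... params) {
        String[] loc = new String[params.length];
        int gpr = FIRST_GPR, extr = FIRST_EXTR, slot = 0;

        // Integer parameters first: they also come first in the stack block.
        for (int i = 0; i < params.length; i++) {
            if (params[i] != Kind.INT) continue;
            loc[i] = (gpr <= LAST_GPR) ? "R" + gpr++ : "stack[" + slot++ + "]";
        }
        // Float and double parameters each take one D register,
        // or one 8-byte slot in the block after the integer slots.
        for (int i = 0; i < params.length; i++) {
            if (params[i] == Kind.INT) continue;
            loc[i] = (extr <= LAST_EXTR) ? "D" + extr++ : "stack[" + slot++ + "]";
        }
        return Arrays.asList(loc);
    }
}
```

For the m1 example above, `ParamLayout.assign(Kind.FLOAT, Kind.DOUBLE, Kind.FLOAT, Kind.DOUBLE)` yields D2, D3, D4 and D5 in that order.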

Locals on the Stack

Most SSA instructions have two operands and a result. Each of them can reside on the stack. Operands must be fetched into a free register, and the result must be stored onto the stack. Only one SSA instruction, sCstoreToArray, has three operands but no result. Therefore, if stack slots are used, the following registers are reserved and cannot be used freely during register allocation.

Register  State      Use
R7        dedicated  result register or third operand of sCstoreToArray
R8        dedicated  first operand of SSA instruction
R9        dedicated  second operand of SSA instruction
R10       dedicated  auxiliary register
R5        dedicated  long part of result or long part of third operand of sCstoreToArray
R4        dedicated  long part of first operand of SSA instruction
R3        dedicated  long part of second operand of SSA instruction
D11       dedicated  result register or third operand of sCstoreToArray
D12       dedicated  first operand of SSA instruction
D13       dedicated  second operand of SSA instruction
S22       dedicated  result register or third operand of sCstoreToArray
S23       dedicated  first operand of SSA instruction
S24       dedicated  second operand of SSA instruction

Dividing numbers of type long requires a large number of auxiliary registers. In these cases the register allocation is run with a reduced register set, using the same reserved registers as given above.
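As a rough illustration of what the reduced register set means for the GPRs, the reserved registers from the table can be removed from the allocatable set with a bit mask. The class and method names are hypothetical. The sketch only models the GPR part and assumes that the allocatable GPRs are exactly the volatile and nonvolatile ones from the usage table.

```java
// Illustrative sketch only; names are hypothetical, only GPRs are modeled.
final class ReducedRegSet {
    /** Builds a mask with one bit set per given register number. */
    static int mask(int... regs) {
        int m = 0;
        for (int r : regs) m |= 1 << r;
        return m;
    }

    // R0-R5 (volatile) and R7-R12 (nonvolatile) from the usage table.
    static final int ALLOCATABLE = mask(0, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12);

    // R3-R5 and R7-R10 are dedicated when stack slots are in use.
    static final int STACK_SLOT_RESERVED = mask(3, 4, 5, 7, 8, 9, 10);

    /** Returns the mask of GPRs the allocator may hand out freely. */
    static int available(boolean usesStackSlots) {
        return usesStackSlots ? ALLOCATABLE & ~STACK_SLOT_RESERVED : ALLOCATABLE;
    }
}
```

Under these assumptions only R0, R1, R2, R11 and R12 remain freely allocatable once stack slots are used.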

dev/crosscompiler/backend_arm/register_allocation.txt · Last modified: 2019/10/03 10:22 by ursgraf