The ARM architecture offers 16 general-purpose registers (GPRs). Depending on the implemented VFP version, a certain number of floating-point registers (extension registers, EXTRs) are present. The figure below assumes 32 EXTRs of 64 bits each. The first 16 of these are interleaved with 32 EXTRs of 32-bit precision.
|R0||volatile||1st parameter, local variables, return value|
|R1||volatile||2nd parameter, local variables, return value (if long, lower 4 bytes of long)|
|R2-R5||volatile||further parameters, local variables|
|R14||dedicated||link register, scratch register|
|D1||dedicated||scratch register, return value|
|D2-D6||volatile||parameters, local variables|
|S2||volatile||1st parameter, local variables, return value|
|S3-S13||volatile||further parameters, local variables|
Besides the 32 double-precision extension registers of the VFPv3, there are 32 single-precision registers, which are interleaved with the first 16 double-precision registers. When the register allocator reserves a double-precision register, e.g. d3, it marks s6 and s7 as reserved as well. When it reserves a single-precision register, e.g. s28, it marks d14 as reserved as well.
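The aliasing rule above can be sketched in a few lines of Java. This is only an illustration, not the allocator's actual code: the class `ExtRegAliasing`, its methods and the bit-mask representation are hypothetical names chosen for this example. It assumes that Dn (for n below 16) overlays S(2n) and S(2n+1), which matches the d3/s6/s7 and s28/d14 examples in the text.

```java
// Hypothetical sketch of the S/D aliasing bookkeeping; names are not taken from the compiler sources.
final class ExtRegAliasing {
    static final int NOF_DOUBLE_REGS = 32; // D0..D31
    static final int NOF_SINGLE_REGS = 32; // S0..S31, overlaid on D0..D15

    private int reservedD; // bit n set: Dn is reserved
    private int reservedS; // bit n set: Sn is reserved

    // Reserves Dd and, if Dd has single-precision aliases, S(2d) and S(2d+1) as well.
    void reserveDouble(int d) {
        if (d < 0 || d >= NOF_DOUBLE_REGS) throw new IllegalArgumentException("no register d" + d);
        reservedD |= 1 << d;
        if (d < NOF_SINGLE_REGS / 2) {
            reservedS |= 3 << (2 * d);
        }
    }

    // Reserves Ss and the double-precision register D(s/2) that contains it.
    void reserveSingle(int s) {
        if (s < 0 || s >= NOF_SINGLE_REGS) throw new IllegalArgumentException("no register s" + s);
        reservedS |= 1 << s;
        reservedD |= 1 << (s / 2);
    }

    boolean isDoubleReserved(int d) { return (reservedD & (1 << d)) != 0; }

    boolean isSingleReserved(int s) { return (reservedS & (1 << s)) != 0; }
}
```

Note that reserving s28 blocks d14 but leaves s29 free for another single-precision value, whereas D16 to D31 have no single-precision aliases at all.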
Local variables are assigned volatile registers. However, if the live range of an SSA value includes a method call, a nonvolatile register has to be used. Nonvolatile registers are assigned in decreasing order, starting from R12/D31. Working registers are taken from the volatile registers. If there are not enough of them, nonvolatile registers (just below the locals) are assigned.
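The selection rule for GPRs might look roughly like the following sketch. It is an assumption-laden illustration: the class `GprChooser`, the method `choose` and the use of bit masks for the free registers are hypothetical, and the text does not say in which order volatile registers are handed out, so the sketch simply takes the lowest free one.

```java
// Hypothetical sketch of the register choice for one SSA value; not the actual allocator.
final class GprChooser {
    static final int TOP_NONVOLATILE = 12; // nonvolatiles are assigned downwards from R12

    // freeVolatiles / freeNonvolatiles: bit n set means Rn is currently free.
    // Returns the chosen register number, or -1 if no register is available.
    static int choose(boolean liveAcrossCall, int freeVolatiles, int freeNonvolatiles) {
        if (!liveAcrossCall && freeVolatiles != 0) {
            // Assumption: take the lowest free volatile register.
            return Integer.numberOfTrailingZeros(freeVolatiles);
        }
        // Value must survive a call, or no volatile is left: take the highest free nonvolatile.
        for (int r = TOP_NONVOLATILE; r >= 0; r--) {
            if ((freeNonvolatiles & (1 << r)) != 0) return r;
        }
        return -1;
    }
}
```

With R2 to R5 free as volatiles and R6, R10, R11 and R12 free as nonvolatiles, a value that lives across a call would get R12 in this sketch, and one that does not would get R2.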
Care must be taken when defining which extension registers should be volatile or nonvolatile. As S and D registers are interleaved, there are twice as many nonvolatile D registers as there are nonvolatile S registers.
Optimization: If a method is a leaf method, it would not be necessary to copy the parameters from their volatile registers into nonvolatile registers.
For the translation of certain SSA instructions (e.g. of type long) further auxiliary registers are needed. They are assigned and reserved by the register allocator as well.
Parameters are passed in R0..R5 and D2..D6.
The interface arm/Registers defines which registers are used as volatiles and nonvolatiles and which are used for parameter passing. Important: the number of parameter registers must be less than or equal to the number of volatiles. Because D and S registers are interleaved in the extension registers, parameters must be copied into the parameter registers carefully. D2 to D6 hold the parameters regardless of whether a parameter is of type single or double. For example, for the method m1(float a, double b, float c, double d) the four parameters are passed as follows:
If the parameters do not fit into the parameter registers, they are passed on the stack in a special parameter block. Note that the block is first filled with the parameters of type integer (the offset increases with the parameter number). After these, the parameters of type float and double are pushed into the block. Parameters of type float and double both occupy 8 bytes.
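As a rough illustration of this layout, the following sketch computes block offsets for the parameters that did not fit into registers. Several points are assumptions that the text does not confirm: integer parameters are given 4-byte slots, offsets are counted from the start of the block, floating-point offsets also increase with the parameter number, and no alignment padding is inserted. The class `ParamBlockLayout` and its members are hypothetical names.

```java
import java.util.List;

// Hypothetical sketch of the stack parameter block layout; slot sizes partly assumed.
final class ParamBlockLayout {
    enum Kind { INT, FLOAT, DOUBLE }

    static final int INT_SLOT_SIZE = 4; // assumption: one 32-bit word per integer parameter
    static final int FP_SLOT_SIZE = 8;  // from the text: float and double both occupy 8 bytes

    // stackParams: kinds of the overflow parameters in declaration order.
    // Returns the byte offset of each parameter within the block.
    static int[] offsets(List<Kind> stackParams) {
        int[] offsets = new int[stackParams.size()];
        int next = 0;
        // First pass: integer parameters, in order of parameter number.
        for (int i = 0; i < offsets.length; i++) {
            if (stackParams.get(i) == Kind.INT) {
                offsets[i] = next;
                next += INT_SLOT_SIZE;
            }
        }
        // Second pass: float and double parameters follow the integers.
        for (int i = 0; i < offsets.length; i++) {
            if (stackParams.get(i) != Kind.INT) {
                offsets[i] = next;
                next += FP_SLOT_SIZE;
            }
        }
        return offsets;
    }
}
```

For overflow parameters of kinds (float, int, double, int), this sketch places the two integers at offsets 0 and 4, the float at 8 and the double at 16.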
Most SSA instructions have two operands and a result. Each of them can reside on the stack. Operands must be fetched into a free register, and the result must be stored onto the stack. A single SSA instruction, sCstoreToArray, has three operands but no result. Therefore, if stack slots are used, the following registers are reserved and cannot be used freely during register allocation.
|R7||dedicated|| result register or third operand of
|R8||dedicated||first operand of SSA instruction|
|R9||dedicated||second operand of SSA instruction|
|R5||dedicated|| long part of result or long part of third operand of
|R4||dedicated||long part of first operand of SSA instruction|
|R3||dedicated||long part of second operand of SSA instruction|
|D11||dedicated|| result register or third operand of
|D12||dedicated||first operand of SSA instruction|
|D13||dedicated||second operand of SSA instruction|
|S22||dedicated|| result register or third operand of
|S23||dedicated||first operand of SSA instruction|
|S24||dedicated||second operand of SSA instruction|
When dividing numbers of type long, a large number of auxiliary registers is necessary. For these cases the register allocation is run with a reduced register set and the same reserved registers as given above.