All results of all SSA instructions now have an assigned register, so each SSA instruction can be translated into one machine instruction or a sequence of machine instructions. In order to do this, we must first define the stack frame, which is used when calling a method.
We use a stack pointer (R1) but no frame pointer.
Explanation:
LR is saved onto the stack first. Side note: this could be optimized, because LR need not be saved if the method is a leaf method. CTR, CR and XER need not be saved. All of them might be used, but they are never used across method calls. As for the GPRs and FPRs, all nonvolatile registers which are used within this method must be saved on the stack. Important: volatile FPRs must be saved as well if US.ENABLE_FLOATS() is called in this method (see Exceptions).
The field local variables is only used if the number of registers does not suffice. When dealing with FPRs, some temporary space on the stack might be necessary (temp. memory). Some compiler specific subroutines dealing with long operations need the same space for saving and restoring the registers which they use.
Padding ensures that the stack frame is always a multiple of 16 bytes (quad-word aligned).
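The padding rule can be illustrated with a minimal sketch. The class and method names below are hypothetical and not taken from the compiler; the sketch only shows how a raw frame size could be rounded up to the next multiple of 16 bytes.

```java
final class FrameSize {
    static final int ALIGNMENT = 16; // quad-word alignment in bytes

    /** Rounds a raw frame size up to the next multiple of 16 bytes. */
    static int align(int sizeInBytes) {
        // -ALIGNMENT is the bit mask ~15, which clears the low four bits.
        return (sizeInBytes + ALIGNMENT - 1) & -ALIGNMENT;
    }

    /** Number of padding bytes that must be added to reach the aligned size. */
    static int padding(int sizeInBytes) {
        return align(sizeInBytes) - sizeInBytes;
    }
}
```

For example, a frame that needs 36 bytes for its fields would receive 12 bytes of padding and occupy 48 bytes.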
The field parameters serves to hold parameters for method calls which cannot be placed directly into registers. The size of the field parameters is determined by considering all the calls to other methods within this method and taking the maximum size of their parameters. The field parameters must be at the top of the stack! This guarantees that in all called methods these parameters can be accessed with the same offset.
The stack pointer always points to the top of the current frame. At the top, the stack pointer of the caller has to be stored. When creating a new frame, the relocation of the pointer and the storing of its previous value must be atomic. This is achieved with the instruction stwu. When leaving the method in the epilogue, a simple addi instruction suffices, since the compiler knows the size of the frame. The back chain pointer is used by the debugger, for exceptions and for garbage collection.
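As an illustration of the two instructions mentioned here, the following sketch encodes them as 32-bit words. It relies on the standard PowerPC D-form layout (primary opcode 37 for stwu, 14 for addi) and assumes that the stack grows towards lower addresses; the class and method names are hypothetical and do not come from the code generator.

```java
final class FrameInstructions {
    static final int OPCODE_STWU = 37; // store word with update
    static final int OPCODE_ADDI = 14; // add immediate
    static final int SP = 1;           // R1 is the stack pointer

    /** D-form: opcode (6 bits), rT/rS (5 bits), rA (5 bits), signed 16-bit immediate. */
    static int dForm(int opcode, int rt, int ra, int imm) {
        if (imm < -32768 || imm > 32767) {
            throw new IllegalArgumentException("immediate does not fit in 16 bits: " + imm);
        }
        return (opcode << 26) | (rt << 21) | (ra << 16) | (imm & 0xFFFF);
    }

    /** stwu R1, -frameSize(R1): stores the old SP at the new top and updates SP in one step. */
    static int prologue(int frameSize) {
        return dForm(OPCODE_STWU, SP, SP, -frameSize);
    }

    /** addi R1, R1, frameSize: releases the frame again. */
    static int epilogue(int frameSize) {
        return dForm(OPCODE_ADDI, SP, SP, frameSize);
    }
}
```

With a frame size of 16 bytes, prologue(16) yields the word 0x9421FFF0 and epilogue(16) yields 0x38210010.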
In case of an exception, the registers LR, CTR, CR, XER, SRR0 and SRR1 must additionally be saved. Then follow all GPRs. In fact, it would be sufficient to store all the volatile GPRs and the nonvolatile GPRs which are used within this method. However, stmw and lmw can be applied very efficiently, but they can only store or load a whole contiguous range of registers.
FPRs need no saving, as they are normally not allowed to be used in exceptions (FP bit in MSR = 0). Here again, if US.ENABLE_FLOATS() is called in this method, the FPRs must be saved as well.
Optimization: The PPC architecture has many FPRs. One could use half of them in normal methods and the other half in exception methods. In that case, all normal methods which could be called from within exception methods would have to be translated a second time with the second set. The compiler would have to find out how to handle each method.
All parameters must be copied into the appropriate registers, see Register Allocation. During this, it might be necessary to swap two or more registers in a cycle. For this purpose two arrays destGPR and destFPR are determined. They show which source register goes into which destination register, if the register holds a parameter. If a cycle is found, it is resolved with the aid of R0 or FR0, respectively.
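The following sketch shows one way such a copy sequence could be ordered. It is not the code generator's actual implementation: the class name, the NONE marker and the representation of a move as a {destination, source} pair are assumptions made for illustration. The input array plays the role of destGPR (or destFPR), and register 0 stands for R0 (or FR0), which is assumed never to hold a parameter itself.

```java
import java.util.ArrayList;
import java.util.List;

final class ParameterMoves {
    static final int NONE = -1;   // register holds no parameter
    static final int SCRATCH = 0; // R0 (or FR0), assumed to be free

    /**
     * dest[src] is the register that the value in src must reach, or NONE.
     * Returns the moves as {destination, source} pairs in a safe order.
     */
    static List<int[]> schedule(int[] dest) {
        int[] pending = dest.clone();
        int remaining = 0;
        for (int src = 0; src < pending.length; src++) {
            if (pending[src] == src) pending[src] = NONE; // already in place
            if (pending[src] != NONE) remaining++;
        }
        List<int[]> moves = new ArrayList<>();
        while (remaining > 0) {
            boolean progress = false;
            for (int src = 0; src < pending.length; src++) {
                int dst = pending[src];
                // A move is safe once the old value of dst is no longer needed.
                if (dst != NONE && pending[dst] == NONE) {
                    moves.add(new int[] {dst, src});
                    pending[src] = NONE;
                    remaining--;
                    progress = true;
                }
            }
            if (!progress) {
                // Only cycles are left: park one value in the scratch register.
                int src = 0;
                while (pending[src] == NONE) src++;
                int dst = pending[src];
                moves.add(new int[] {SCRATCH, dst});
                pending[SCRATCH] = pending[dst]; // the parked value continues from R0
                pending[dst] = NONE;
            }
        }
        return moves;
    }
}
```

If the values in R3 and R4 must be exchanged, the sketch produces three moves: R4 into R0, R3 into R4, and finally R0 into R3.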
As shown in Register Allocation, return values must be passed in R2 (together with R3 for longs) or FPR1. For longs we must consider the case that R2 and R3 must be swapped in order to contain the correct return value.
While generating code, certain addresses are not yet known. Such addresses must be loaded with the aid of an auxiliary (with addi and addis). After linking, these addresses are known and must be corrected in the code. For this purpose a table called fixups is maintained. It contains all references to the objects which were created by the class file reader and whose final addresses must be inserted into the code. To record at what position in the machine code array each correction has to be made, the positions are chained: the immediate operand of each addi instruction stores the position of the next correction, beginning with the last position (lastFixup).
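The chaining of fixup positions can be sketched as follows. This is a simplified model under several assumptions, not the actual code generator: the class FixupChain, its methods, the use of strings as item references, the instruction order (addi first, then addis) and the fixed code array size are all hypothetical. Because addi sign-extends its immediate, the sketch adjusts the upper half of the address accordingly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class FixupChain {
    private static final int ADDI = 14 << 26;  // primary opcode of addi
    private static final int ADDIS = 15 << 26; // primary opcode of addis

    final int[] code = new int[64];
    int count = 0;
    int lastFixup = 0; // position of the most recent placeholder
    final List<String> fixups = new ArrayList<>(); // items whose addresses are still unknown

    void emit(int instruction) {
        code[count++] = instruction;
    }

    /** Emits addi/addis placeholders; the addi immediate links to the previous fixup. */
    void emitAddressLoad(int reg, String item) {
        code[count] = ADDI | (reg << 21) | (lastFixup & 0xFFFF);
        code[count + 1] = ADDIS | (reg << 21) | (reg << 16);
        lastFixup = count;
        fixups.add(item);
        count += 2;
    }

    /** Walks the chain backwards from lastFixup and patches in the final addresses. */
    void applyFixups(Map<String, Integer> addresses) {
        int pos = lastFixup;
        for (int i = fixups.size() - 1; i >= 0; i--) {
            int previous = code[pos] & 0xFFFF; // read the link before overwriting it
            int address = addresses.get(fixups.get(i));
            int low = address & 0xFFFF;
            // addi sign-extends the low half, so compensate in the high half.
            int high = ((address - (short) address) >>> 16) & 0xFFFF;
            code[pos] = (code[pos] & 0xFFFF0000) | low;
            code[pos + 1] = (code[pos + 1] & 0xFFFF0000) | high;
            pos = previous;
        }
    }
}
```

In this model the chain needs no separate list of positions: the number of entries in fixups tells the linker how many links to follow. A 16-bit immediate limits the chain to positions below 65536.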
Java does not allow direct access to and manipulation of absolute memory locations. Nevertheless, this is essential for embedded programming. We therefore include this possibility by using a special class org.deepjava.unsafe.ppc.US.java (US stands for unsafe). When methods of this class are used, the code generator has to insert machine code directly. The register allocator does not have to allocate registers for these instructions. US.java therefore contains only simple stubs.
There is a set of methods contained in the class LL.java. These methods handle functions which should cause no call to a method but which could be implemented directly with one or several machine instructions depending on the target architecture. Here again, LL.java contains simple stubs.
Subroutines are methods for which there is no Java code (and hence no Bytecode or SSA) but only machine code. This is useful for the delegation of interface methods (see Interfaces) or for dividing longs.
Such methods are listed in Method.compSpecSubroutines, and the code generator emits code for them (if the subroutines are used). These subroutines must be linked as well.
Currently, there are three types of compiler specific methods
imDelegIiMm needs 3 auxiliary registers. At compile time we cannot reserve such auxiliary registers, as these registers must always be the same. Therefore, we use parameter registers. They are volatile and can be freely used, though this means that interface methods can pass fewer parameters. This is generally not a problem; if it is, the compiler will report it.
R10 holds the necessary information for the delegate method. The first two bytes are the ID of the sought-after interface, the last two bytes contain the method offset. Loading of R10 should happen after parameter copying, as R10 might be used there.
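The layout of the value in R10 can be sketched as follows. The class and method names are hypothetical; the sketch assumes that "the first two bytes" are the upper 16 bits of the 32-bit register, as is natural on the big-endian PPC.

```java
final class DelegateInfo {
    /** Packs the interface ID into the upper half and the method offset into the lower half. */
    static int pack(int interfaceId, int methodOffset) {
        if ((interfaceId & ~0xFFFF) != 0 || (methodOffset & ~0xFFFF) != 0) {
            throw new IllegalArgumentException("both values must fit into 16 bits");
        }
        return (interfaceId << 16) | methodOffset;
    }

    /** Extracts the interface ID (upper two bytes). */
    static int interfaceId(int packed) {
        return packed >>> 16;
    }

    /** Extracts the method offset (lower two bytes). */
    static int methodOffset(int packed) {
        return packed & 0xFFFF;
    }
}
```

For example, an interface ID of 0x0012 and a method offset of 0x0034 would be combined into the word 0x00120034.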