====== Implementation of Strings ======
In Java a string like
String str = new String("abc");
causes the allocation of two objects. Strings in Java are //immutable//. With the field //offset// it is possible to form substrings in the constant string.
[{{ .:strings1.png?450&direct | //A string in Java// }}]
Creating two objects and managing them leads to unfavorable behaviour in terms of memory and runtime. A more elegant solution would create a single object. What to do?
- Change of the class file and replacement of the class //String// (relatively costly)
- Change of code generator
We choose the second alternative. When a //new// is encountered we must check if a string is to be created. If this is the case a special string allocation method is to be called, see [[.:objects|Objects]]. The translation of the SSA instructions //sCloadFromArray// and //sCstoreToArray// nedd to be adapted as well.
===== Optimized String =====
The creation and manipulation of strings is done solely in //java/lang/String//. This class extends //BString// (this stands for //Base String//). //BString// has the single field //count//, which is the lenght of the string. This ensures that the field //count// will reside at index 0. //String// has another field //value// of type //char[]//. Instead of containing a reference to a character array, this field contains the characters directly.
[{{ .strings2.png?300&direct | //Optimized String// }}]
The field //size// holds the number of bytes of the whole object. This is analogous to other objects (see [[dev:rts:heap|Heap Manager and Garbage Collection]]). It is used for an efficient sweep phase.
===== Creation =====
The Bytecode for the creation of a string is
new java/lang/String
invokespecial constructor
//new// must not yet be executed, because the size of the character array cannot be fetched from the type descriptor. The character array will be inserted directly into the object and this size varies from string to string. The code generator has to check if //new// creates something of type //String//. If true, nothing will be done except putting the reference to the type descriptor of //String// into the result register of //new//. \\
The next Bytecode instruction is //invokespecial//. In our case, it must not call the constructor, because the object does not yet exist. The linker will insert a call to a special factory method with similar parameters.
public static String allocateString(int ref, char value[]) {
int len = value.length;
String str = newstring(ref, len * 2);
str.count = len;
for (int i = 0; i < len; i++) str.value[i] = value[i];
return str;
}
Foremost, the necessary size must be calculated and the object must be allocated on the heap. The linker must replace the method address of the stub //newstring// with the method //newstring// in the class //Heap//. //newstring// does:
* set the tag of the string object to the type descriptor of //String// (similar to //newobject//).
* create the necessary number of array fields (similar to //newarray//).
* set the field //count//.
* call the constructor of the super class. This call has to be inserted in the code generator.
* initialize the string.
Finally, the reference to the newly created string has to be put into the result register of the //sCnew// instruction.
===== Access to Fields =====
When accessing fields of a string, the code generator has to modify the code. Such an access shows up in the SSA as follows
1: MonadicRef[sCloadFromField] {0} (Char-Array)
2: NoOpnd[sCloadLocal] (Integer)
3: Dyadic[sCloadFromArray] {1, 2} (Char)
The field //value// is loaded. //value// is a reference. With this reference and an index into the array the proper character can be accessed. The code generator has to check for the instruction //sCloadFromField// if the field is //String.value//. If true, the fetching is ommited and the offset of //value// is stored. With the subsequent instruction //sCloadFromArray//, the offset must be added to the index. At the same time one cannot use //value// as the array reference but must use the reference to the string. \\
Similarly, writing to a string must be customized. However, strings in java are immutable. Writing to strings happens only in the factory methods during initialization. Checking for array index out of bounds may be ommited as well.
===== Constant String =====
Constant strings are stored in the constant block of a class, see [[dev:crosscompiler:backend_ppc:linker32|Linker32]]. The layout there must be identical to the layout on the heap as given above.
===== Special Linking =====
The type descriptor of the class //String// must have its field //nofInstPtr// set to 0. Though there is a field //char[] values// we access this field in an omtimized way as decribed above. The garbage collector must not follow this field!