Characters are immediate objects (that is, they require no heap
allocation) in all permutations of build-time options. Even on a 32-bit
platform with :SB-UNICODE, there are three bits to spare after
allocating 8 bits for the character widetag and 21 for the character
code. There is only one such layout, and consequently only one widetag
is needed: the difference between base-char and character lies purely
in the magnitude of the char-code.
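The bit budget above can be checked with a small C sketch. It assumes the widetag occupies the low 8 bits of the word and the character code the 21 bits above it; the widetag value and the function names are placeholders invented for illustration, and the base-char code limit is passed in as a parameter rather than assumed.

```c
#include <stdint.h>

/* Assumed layout for this sketch only: widetag in the low 8 bits,
   character code in the next 21 bits. The widetag value below is a
   placeholder, not taken from any particular build. */
enum {
    WORD_BITS = 32,
    WIDETAG_BITS = 8,
    CHAR_CODE_BITS = 21,
    SPARE_BITS = WORD_BITS - WIDETAG_BITS - CHAR_CODE_BITS,
    PLACEHOLDER_CHARACTER_WIDETAG = 0x49
};

/* Pack a character code and the widetag into one 32-bit immediate word. */
static uint32_t make_immediate_char(uint32_t code)
{
    return (code << WIDETAG_BITS) | PLACEHOLDER_CHARACTER_WIDETAG;
}

/* Recover the 21-bit character code from an immediate word. */
static uint32_t immediate_char_code(uint32_t word)
{
    return (word >> WIDETAG_BITS) & ((UINT32_C(1) << CHAR_CODE_BITS) - 1);
}

/* A single widetag check identifies every character. */
static int is_immediate_char(uint32_t word)
{
    return (word & 0xFF) == PLACEHOLDER_CHARACTER_WIDETAG;
}

/* base-char versus character is only a comparison on the code. */
static int is_base_char(uint32_t word, uint32_t base_char_code_limit)
{
    return is_immediate_char(word)
        && immediate_char_code(word) < base_char_code_limit;
}
```

Under these assumptions, the largest 21-bit code still leaves the top three bits of the word clear, which is the spare room the paragraph refers to.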
Objects of type (simple-array nil (*)) are represented in memory
as two words: the first is the object header, with the appropriate
widetag, and the second is the length field. No memory is needed for
elements of these objects, as they can have none.
Objects of type simple-base-string have the header word
with widetag, then a word for the length, and after that a sequence of
8-bit char-code bytes. The system arranges for there to be a
null byte after the sequence of Lisp character codes.
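As a rough illustration of this layout, the following C sketch builds a buffer with a header word, a length word, the 8-bit codes, and a trailing null byte. The header value is a placeholder, the length is stored as a raw integer (the runtime may encode it differently), alignment padding is ignored, and all names are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uintptr_t word_t;

/* Placeholder header word; a real header carries the widetag. */
#define PLACEHOLDER_BASE_STRING_HEADER ((word_t)0)

/* Bytes occupied: header word + length word + codes + one null byte. */
static size_t base_string_bytes(size_t length)
{
    return 2 * sizeof(word_t) + length + 1;
}

/* Build a heap image of the layout described above. The caller frees it. */
static word_t *make_base_string_image(const char *codes, size_t length)
{
    word_t *image = calloc(1, base_string_bytes(length));
    if (image == NULL)
        return NULL;
    image[0] = PLACEHOLDER_BASE_STRING_HEADER;
    image[1] = (word_t)length;          /* raw length, a simplification */
    memcpy(image + 2, codes, length);   /* calloc already zeroed the null byte */
    return image;
}

/* The data area starts right after the two leading words. */
static const char *base_string_data(const word_t *image)
{
    return (const char *)(image + 2);
}
```

One consequence of the trailing null byte, visible in this sketch, is that the data area can be read by C routines such as strlen without copying.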
Objects of type (simple-array character (*)), where this is a
distinct type from simple-base-string, have the header word with
widetag, length, and then a sequence of 32-bit char-code bytes.
Again, the system arranges for there to be a null word after the
sequence of character codes.
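The storage cost of the wide representation can be sketched the same way. This C fragment assumes the null terminator is one 32-bit element, takes the machine word size as a parameter, and ignores alignment padding; the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes occupied: header word + length word + 32-bit codes + one
   32-bit null element. word_bytes is 4 or 8 depending on the platform. */
static size_t character_string_bytes(size_t word_bytes, size_t length)
{
    return 2 * word_bytes + (length + 1) * sizeof(uint32_t);
}

/* Fill a data area with character codes followed by the null element.
   out must have room for length + 1 elements. */
static void fill_character_data(uint32_t *out, const uint32_t *codes,
                                size_t length)
{
    for (size_t i = 0; i < length; i++)
        out[i] = codes[i];
    out[length] = 0;
}
```

For a three-character string on a 32-bit platform this gives 8 bytes of header and length plus 16 bytes of data, against 8 plus 4 for the 8-bit representation.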
Non-simple character arrays, and simple character arrays of non-unit dimensionality, have an array header with a reference to an underlying data array of the appropriate form from the above representations.