Microprocessor 4. Philippe Darche
nop (no operation, cf. § 2.8.5) will classically take up one byte, compared with a word of several bytes in a fixed format. The variability of the format makes it difficult to use a pipeline or superscalar execution (this will be covered in a future book by the author on microprocessors). As an example of a fixed format, we cite the n = 32-bit format of MIPS Technologies microprocessors. Even when the instruction length is fixed, the number of fields and their layout may vary. Encoding uses three types, which are the Register (R-type), Immediate (I-type) and Jump (J-type) formats (Figure 1.5). The operation code, possibly completed by the function field, specifies the instruction. For the I type, the second field is a specifier of the source register (rs). The following field specifies the target or destination register (rt), which receives the result or the branching condition. The last field is an immediate value or a jump or address displacement. For the J type, the operand is the jump address in a 26-bit format. For the R type, the third register field is a destination register specifier (rd). The penultimate field indicates the value of a possible shift (0 = no shift). Note the conventions rt = rs + immediate and rd = rs + rt. This simple encoding should be compared with that of the Arm® family, which can show as many as 21 types (Arm 2000).
Figure 1.5. Three fixed formats for MIPS instructions
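The three formats can be illustrated by a short field extractor. This is a minimal sketch in Python, not taken from the book: the function name `decode_mips` is hypothetical, and the type selection is deliberately simplified (operation code 0 is treated as R-type, codes 2 and 3 as J-type, everything else as I-type), which ignores coprocessor and other special encodings.

```python
def decode_mips(word):
    """Split a 32-bit MIPS instruction word into its fields (simplified sketch)."""
    word &= 0xFFFFFFFF
    opcode = (word >> 26) & 0x3F          # 6-bit operation code, common to all types
    if opcode == 0:                       # assumption: opcode 0 means R-type
        return {
            "type": "R",
            "opcode": opcode,
            "rs": (word >> 21) & 0x1F,    # first source register
            "rt": (word >> 16) & 0x1F,    # second source register
            "rd": (word >> 11) & 0x1F,    # destination register
            "shamt": (word >> 6) & 0x1F,  # shift amount (0 = no shift)
            "funct": word & 0x3F,         # function field completing the opcode
        }
    if opcode in (2, 3):                  # j and jal
        return {"type": "J", "opcode": opcode, "address": word & 0x03FFFFFF}
    return {
        "type": "I",
        "opcode": opcode,
        "rs": (word >> 21) & 0x1F,        # source register
        "rt": (word >> 16) & 0x1F,        # target register
        "immediate": word & 0xFFFF,       # 16-bit immediate or displacement
    }


r = decode_mips(0x012A4020)   # add $t0, $t1, $t2
i = decode_mips(0x21280005)   # addi $t0, $t1, 5
j = decode_mips(0x08000100)   # j with a 26-bit address field of 0x100
```

Because every field sits at a fixed bit position, the extraction reduces to shifts and masks, which is what makes a fixed format convenient for pipelined decoding.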
None of these fields has been standardized; they depend on the manufacturer and the MPU family. For example, for Bayliss et al. (1981), an instruction is formed of four fields, which are the function (opcode), reference, format and class fields. The class field specifies the number of operands and their types. The format field, required if there is at least one operand, indicates their location (memory, register or stack, for example). The reference field gives their location explicitly. The operation code field specifies the operation to be executed.
Figure 1.6 shows a typical variable-format instruction of an existing microprocessor. The operation code occupies 6 bits. The direction bit D indicates the direction of transfer (0 = source specified by the reg field, 1 = destination specified by the reg field). The bit W specifies the transfer format (0 = byte, 1 = word of 16 bits). The second byte is called a “post-byte”. The mode field indicates whether the transfer involves only registers or whether memory is involved; in the latter case, it also indicates the length of the displacement held in the two displacement fields. We recognize the Little Endian byte order (LE (Cohen 1981), cf. § 2.6.2 from Darche (2012)) typical of Intel architecture, since the Least Significant Byte (LSB) is stored first in memory, in order of increasing addresses. Finally, the poorly named R/M (Register/Memory) field specifies the addressing mode, that is, the method of calculating the effective address (cf. § 1.2). Another format exists in which the instruction is coded on a single byte. Thus, the format of these instructions can vary from 1 to 6 bytes. Three types of prefix can be added to them to modify the behavior of the instruction.
Figure 1.6. Typical instruction format from 8086/88
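The layout of the first two bytes can be sketched as follows. This is an illustrative Python fragment, not the author's code: the function name `decode_8086_header` is hypothetical, and only the register-to-register/memory form described above is handled (prefixes, immediates and the single-byte formats are ignored).

```python
def decode_8086_header(first, post):
    """Extract the fields of the opcode byte and the post-byte (simplified sketch)."""
    opcode = (first >> 2) & 0x3F   # 6-bit operation code
    d = (first >> 1) & 1           # direction: 0 = reg is the source, 1 = reg is the destination
    w = first & 1                  # width: 0 = byte, 1 = 16-bit word
    mod = (post >> 6) & 0b11       # mode field
    reg = (post >> 3) & 0b111      # register field
    rm = post & 0b111              # R/M field (register or addressing mode)

    # Number of displacement bytes that follow the post-byte.
    if mod == 0b01:
        disp_len = 1
    elif mod == 0b10:
        disp_len = 2
    elif mod == 0b00 and rm == 0b110:
        disp_len = 2               # direct address: 16-bit displacement, no base register
    else:
        disp_len = 0               # mod = 11 (registers only) or mod = 00 without displacement
    return {"opcode": opcode, "d": d, "w": w,
            "mod": mod, "reg": reg, "rm": rm, "disp_len": disp_len}


# Bytes 89 D8 encode mov ax, bx: D = 0, so reg (3 = BX) is the source and R/M (0 = AX) the destination.
fields = decode_8086_header(0x89, 0xD8)
```

When a two-byte displacement is present, its low-order byte comes first in memory, consistent with the Little Endian order noted above.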
The architecture can also add a field, before or after the operation code, to code the instruction class (called an extension of the operation code) or to specify a variable format. One example is the IBM System/370 mainframe with its first 2 bits. The encoding of an instruction of the i486 by Intel is a typical example of the CISC approach (Complex Instruction Set Computer, this will be covered in a future book by the author on microprocessors). This type of instruction has a size ranging from 1 to 13 bytes. The word-code is formed of one or two bytes for the operation code, a modify Register or Memory (mod R/M) byte, a Scale-Index-Base (SIB) byte, the bytes for displacement and the bytes for the immediate values. The reg/operation code field specifies a register or makes it possible to add information for the operation code. The R/M field specifies a register (2^3 = 8 at most) or, if it is combined with the mode field, makes it possible to specify an addressing mode (24 maximum). The SIB byte makes it possible to specify the scale factor (1, 2, 4 or 8), an index register number and the base register number. In addition, one or more prefix bytes (in any order except for REX, see below) can change how the following instruction is interpreted. Figure 1.7 shows the instruction format for Intel IA-32 and Intel 64 architectures, which has changed with the evolution of MPUs. For example, the operation code for Pentium had a maximum size of two bytes. Today, the maximum length of an instruction is 15 bytes. The instruction format has never stopped growing.
Another example is the Arm® architecture, which adds a condition field to the left of the operation code (Figure 2.23). Today, there are instruction sets in multiple formats, a sort of compromise between fixed and variable formats with only two sizes, for example, 32 bits and another value such as 16 bits, with 19 different forms of encoding for Thumb® (Arm®) technology, linked to the compression of these instruction codes (cf. § 1.1.1).
Figure 1.7. Variable instruction format of the Intel IA-32 and Intel 64 architectures (Intel 2016)
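As an illustration of the SIB byte in this format, the following Python sketch extracts its three fields. The function name `decode_sib` is hypothetical; the only fact relied upon is the 2-bit scale, 3-bit index, 3-bit base layout of the byte, with the scale field acting as a power-of-two exponent.

```python
def decode_sib(sib):
    """Split a Scale-Index-Base byte into its fields (illustrative sketch)."""
    scale_bits = (sib >> 6) & 0b11
    return {
        "scale": 1 << scale_bits,     # 2-bit exponent gives a factor of 1, 2, 4 or 8
        "index": (sib >> 3) & 0b111,  # index register number
        "base": sib & 0b111,          # base register number
    }


# 0x88 = 10 001 000b: scale 4, index register 1, base register 0.
sib_fields = decode_sib(0x88)
```

The effective address is then formed as base + index × scale + displacement, the displacement bytes following the SIB byte when the mode field requires them.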
Several technical solutions exist for retaining upward binary compatibility (cf. § 3.3.3). Intel has chosen the instruction prefix, which affects how the instruction is interpreted. For example, a REX (Register Extension) prefix, which in 64-bit mode indicates that the instruction uses extended registers, is a valid instruction (inc or dec) in IA-32 mode. This solution had already been used by the Z80, with four unassigned machine codes (hexadecimal values CB, DD, ED and FD) serving as prefixes to extend its instruction set while remaining compatible with the 8080. Another solution was to add a post-byte to distinguish between the sets of instructions. A more recent example of a prefix is VEX (Vector Extensions), which makes it possible to encode the AVX (Advanced Vector eXtensions, cf. § 2.7.1) extension from Intel.
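The double meaning of the byte values 40h to 4Fh can be sketched as follows. This Python fragment is only an illustration: the function name `interpret_4x` and the returned dictionary layout are hypothetical, and the IA-32 branch assumes the default 32-bit operand size.

```python
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def interpret_4x(byte, long_mode):
    """Interpret a byte in the range 0x40-0x4F according to the execution mode."""
    if not 0x40 <= byte <= 0x4F:
        raise ValueError("expected a byte between 0x40 and 0x4F")
    if long_mode:
        # 64-bit mode: REX prefix with layout 0100WRXB.
        return {
            "kind": "REX prefix",
            "W": (byte >> 3) & 1,  # 64-bit operand size
            "R": (byte >> 2) & 1,  # extends the reg field of the mod R/M byte
            "X": (byte >> 1) & 1,  # extends the index field of the SIB byte
            "B": byte & 1,         # extends the R/M or base field
        }
    # IA-32 mode: 0x40-0x47 are inc r32, 0x48-0x4F are dec r32.
    mnemonic = "inc" if byte < 0x48 else "dec"
    return {"kind": "instruction", "text": f"{mnemonic} {REGS32[byte & 0b111]}"}


as_prefix = interpret_4x(0x48, long_mode=True)
as_instruction = interpret_4x(0x48, long_mode=False)
```

The same byte therefore decodes either as a prefix or as a complete one-byte instruction, which is why the decoder must know the current mode before it can delimit the instruction.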
The number of instructions, the type of architecture (stack-based, register-based, etc.), the number of addressable registers, the number of internal busses and the type, format and location of the operands all influence the format i of an instruction. For access to primary memory, the memory organization, in particular the exchange format (byte or word), the byte order (remember the Endian story! cf. § V1-2.2.1) and the alignment (cf. § 2.6.1 from Darche (2012)), also has some influence. The ISA can be evaluated by the number of instructions F, their complexity, their format i and the memory space they occupy. The designer's choice will depend on the desired performance (execution time, memory requirement, etc.), the usage domains and the manufacturing cost. Complexity that is not handled in hardware is shifted to the software, in particular to the compiler, as in the RISC approach, and to the programmer. The appendix shows the instruction encoding table for the MPU 6809E from Motorola. For information, the decoding of an instruction has been discussed in the previous volume.
1.1.1. Code compression