1 files changed, 849 insertions, 0 deletions
diff --git a/yasm_arch.7 b/yasm_arch.7
new file mode 100644
index 0000000..bd72d3f
--- /dev/null
+++ b/yasm_arch.7
@@ -0,0 +1,849 @@
+'\" t
+.\"     Title: yasm_arch
+.\"    Author: Peter Johnson <peter@tortall.net>
+.\" Generator: DocBook XSL Stylesheets v1.75.2 <http://docbook.sf.net/>
+.\"      Date: October 2006
+.\"    Manual: Yasm Supported Architectures
+.\"    Source: Yasm
+.\"  Language: English
+.\"
+.TH "YASM_ARCH" "7" "October 2006" "Yasm" "Yasm Supported Architectures"
+.\" -----------------------------------------------------------------
+.\" * set default formatting
+.\" -----------------------------------------------------------------
+.\" disable hyphenation
+.nh
+.\" disable justification (adjust text to left margin only)
+.ad l
+.\" -----------------------------------------------------------------
+.\" * MAIN CONTENT STARTS HERE *
+.\" -----------------------------------------------------------------
+.SH "NAME"
+yasm_arch \- Yasm Supported Target Architectures
+.SH "SYNOPSIS"
+.HP \w'\fByasm\fR\ 'u
+\fByasm\fR \fB\-a\ \fR\fB\fIarch\fR\fR [\fB\-m\ \fR\fB\fImachine\fR\fR] \fB\fI\&.\&.\&.\fR\fR
+.SH "DESCRIPTION"
+.PP
+The standard Yasm distribution includes a number of modules for different target architectures\&. Each target architecture can support one or more machine architectures\&.
+.PP
+The architecture and machine are selected on the
+\fByasm\fR(1)
+command line by use of the
+\fB\-a \fR\fB\fIarch\fR\fR
+and
+\fB\-m \fR\fB\fImachine\fR\fR
+command line options, respectively\&.
+.PP
+The machine architecture may also automatically be selected by certain object formats\&. For example, the
+\(lqelf32\(rq
+object format selects the
+\(lqx86\(rq
+machine architecture by default, while the
+\(lqelf64\(rq
+object format selects the
+\(lqamd64\(rq
+machine architecture by default\&.
+.SH "X86 ARCHITECTURE"
+.PP
+The
+\(lqx86\(rq
+architecture supports the IA\-32 instruction set and derivatives and the AMD64 instruction set\&. It consists of two machines:
+\(lqx86\(rq
+(for the IA\-32 and derivatives) and
+\(lqamd64\(rq
+(for the AMD64 and derivatives)\&. The default machine for the
+\(lqx86\(rq
+architecture is the
+\(lqx86\(rq
+machine\&.
+.SS "BITS Setting"
+.PP
+The x86 architecture BITS setting specifies to Yasm the processor mode in which the generated code is intended to execute\&. x86 processors can run in three different major execution modes: 16\-bit, 32\-bit, and on AMD64\-supporting processors, 64\-bit\&. As the x86 instruction set contains portions whose function is execution\-mode dependent (such as operand\-size and address\-size override prefixes), Yasm cannot assemble x86 instructions correctly unless it is told by the user in what processor mode the code will execute\&.
+.PP
+The BITS setting can be changed in a variety of ways\&. When using the NASM\-compatible parser, the BITS setting can be changed directly via the use of the
+\fBBITS xx\fR
+assembler directive\&. The default BITS setting is determined by the object format in use\&.
+.SS "BITS 64 Extensions"
+.PP
+The AMD64 architecture is a new 64\-bit architecture developed by AMD, based on the 32\-bit x86 architecture\&. It extends the original x86 architecture by doubling the number of general purpose and SIMD registers, extending the arithmetic operations and address space to 64 bits, as well as other features\&.
+.PP
+Recently, Intel has introduced an essentially identical version of AMD64 called EM64T\&.
+.PP
+When an AMD64\-supporting processor is executing in 64\-bit mode, a number of additional extensions are available, including extra general purpose registers, extra SSE2 registers, and RIP\-relative addressing\&.
+.PP
+Yasm extends the base NASM syntax to support AMD64 as follows\&. To enable assembly of instructions for the 64\-bit mode of AMD64 processors, use the directive
+\fBBITS 64\fR\&. As with NASM\'s BITS directive, this does not change the format of the output object file to 64 bits; it only changes the assembler mode to assume that the instructions being assembled will be run in 64\-bit mode\&. To specify an AMD64 object file, use
+\fB\-m amd64\fR
+on the Yasm command line, or explicitly target a 64\-bit object format such as
+\fB\-f win64\fR
+or
+\fB\-f elf64\fR\&.
+.sp
+.it 1 an-trap
+.nr an-no-space-flag 1
+.nr an-break-flag 1
+.br
+.ps +1
+\fBRegister Changes\fR
+.RS 4
+.PP
+The additional 64\-bit general purpose registers are named r8\-r15\&. There are also 8\-bit (rXb), 16\-bit (rXw), and 32\-bit (rXd) subregisters that map to the least significant 8, 16, or 32 bits of the 64\-bit register\&. The original 8 general purpose registers have also been extended to 64\-bits: eax, edx, ecx, ebx, esi, edi, esp, and ebp have new 64\-bit versions called rax, rdx, rcx, rbx, rsi, rdi, rsp, and rbp respectively\&. The old 32\-bit registers map to the least significant bits of the new 64\-bit registers\&.
+.PP
+New 8\-bit registers are also available that map to the 8 least significant bits of rsi, rdi, rsp, and rbp\&. These are called sil, dil, spl, and bpl respectively\&. Unfortunately, due to the way instructions are encoded, these new 8\-bit registers are encoded the same as the old 8\-bit registers ah, dh, ch, and bh\&. The processor tells which is being used by the presence of the new REX prefix that is used to specify the other extended registers\&. This means it is illegal to mix the use of ah, dh, ch, and bh with an instruction that requires the REX prefix for other reasons\&. For instance:
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+add ah, [r10]
+.fi
+.if n \{\
+.RE
+.\}
+.PP
+(NASM syntax) is not a legal instruction because the use of r10 requires a REX prefix, making it impossible to use ah\&.
+.PP
+In 64\-bit mode, an additional 8 SSE2 registers are also available\&. These are named xmm8\-xmm15\&.
+.RE
+.sp
+.it 1 an-trap
+.nr an-no-space-flag 1
+.nr an-break-flag 1
+.br
+.ps +1
+\fB64 Bit Instructions\fR
+.RS 4
+.PP
+By default, most operations in 64\-bit mode remain 32\-bit; operations that are 64\-bit usually require a REX prefix (one bit in the REX prefix determines whether an operation is 64\-bit or 32\-bit)\&. Thus, essentially all 32\-bit instructions have a 64\-bit version, and the 64\-bit versions of instructions can use extended registers
+\(lqfor free\(rq
+(as the REX prefix is already present)\&. Examples in NASM syntax:
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov eax, 1  ; 32\-bit instruction
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov rcx, 1  ; 64\-bit instruction
+.fi
+.if n \{\
+.RE
+.\}
+.PP
+Instructions that modify the stack (push, pop, call, ret, enter, and leave) are implicitly 64\-bit\&. Their 32\-bit counterparts are not available, but their 16\-bit counterparts are\&. Examples in NASM syntax:
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+push eax  ; illegal instruction
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+push rbx  ; 1\-byte instruction
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+push r11  ; 2\-byte instruction with REX prefix
+.fi
+.if n \{\
+.RE
+.\}
+.RE
+.sp
+.it 1 an-trap
+.nr an-no-space-flag 1
+.nr an-break-flag 1
+.br
+.ps +1
+\fBImplicit Zero Extension\fR
+.RS 4
+.PP
+Results of 32\-bit operations are implicitly zero\-extended to the upper 32 bits of the corresponding 64\-bit register\&. 16 and 8 bit operations, on the other hand, do not affect upper bits of the register (just as in 32\-bit and 16\-bit modes)\&. This can be used to generate smaller code in some instances\&. Examples in NASM syntax:
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov ecx, 1  ; 1 byte shorter than mov rcx, 1
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+and edx, 3  ; equivalent to and rdx, 3
+.fi
+.if n \{\
+.RE
+.\}
+.RE
+.sp
+.it 1 an-trap
+.nr an-no-space-flag 1
+.nr an-break-flag 1
+.br
+.ps +1
+\fBImmediates\fR
+.RS 4
+.PP
+For most instructions in 64\-bit mode, immediate values remain 32 bits; their value is sign\-extended into the upper 32 bits of the target register prior to being used\&. The exception is the mov instruction, which can take a 64\-bit immediate when the destination is a 64\-bit register\&. Examples in NASM syntax:
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+add rax, 1           ; optimized down to signed 8\-bit
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+add rax, dword 1     ; force size to 32\-bit
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+add rax, 0xffffffff  ; sign\-extended 32\-bit
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+add rax, \-1          ; same as above
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+add rax, 0xffffffffffffffff ; truncated to 32\-bit (warning)
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov eax, 1           ; 5 byte
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov rax, 1           ; 5 byte (optimized to signed 32\-bit)
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov rax, qword 1     ; 10 byte (forced 64\-bit)
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov rbx, 0x1234567890abcdef ; 10 byte
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov rcx, 0xffffffff  ; 10 byte (does not fit in signed 32\-bit)
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov ecx, \-1          ; 5 byte, equivalent to above
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov rcx, sym         ; 5 byte, 32\-bit size default for symbols
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov rcx, qword sym   ; 10 byte, override default size
+.fi
+.if n \{\
+.RE
+.\}
+.PP
+The handling of mov reg64, unsized immediate is different between YASM and NASM 2\&.x; YASM follows the above behavior, while NASM 2\&.x does the following:
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+add rax, 0xffffffff  ; sign\-extended 32\-bit immediate
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+add rax, \-1          ; same as above
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+add rax, 0xffffffffffffffff ; truncated 32\-bit (warning)
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+add rax, sym         ; sign\-extended 32\-bit immediate
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov eax, 1           ; 5 byte (32\-bit immediate)
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov rax, 1           ; 10 byte (64\-bit immediate)
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov rbx, 0x1234567890abcdef ; 10 byte instruction
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov rcx, 0xffffffff  ; 10 byte instruction
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov ecx, \-1          ; 5 byte, equivalent to above
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov ecx, sym         ; 5 byte (32\-bit immediate)
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov rcx, sym         ; 10 byte instruction
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov rcx, qword sym   ; 10 byte (64\-bit immediate)
+.fi
+.if n \{\
+.RE
+.\}
+.RE
+.sp
+.it 1 an-trap
+.nr an-no-space-flag 1
+.nr an-break-flag 1
+.br
+.ps +1
+\fBDisplacements\fR
+.RS 4
+.PP
+Just like immediates, displacements, for the most part, remain 32 bits and are sign extended prior to use\&. Again, the exception is one restricted form of the mov instruction: between the al/ax/eax/rax register and a 64\-bit absolute address (no registers allowed in the effective address)\&. In NASM syntax, use of the 64\-bit absolute form requires
+\fB[qword]\fR\&. Examples in NASM syntax:
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov eax, [1]    ; 32 bit, with sign extension
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov al, [rax\-1] ; 32 bit, with sign extension
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov al, [qword 0x1122334455667788] ; 64\-bit absolute
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov al, [0x1122334455667788] ; truncated to 32\-bit (warning)
+.fi
+.if n \{\
+.RE
+.\}
+.RE
+.sp
+.it 1 an-trap
+.nr an-no-space-flag 1
+.nr an-break-flag 1
+.br
+.ps +1
+\fBRIP Relative Addressing\fR
+.RS 4
+.PP
+In 64\-bit mode, a new form of effective addressing is available to make it easier to write position\-independent code\&. Any memory reference may be made RIP relative (RIP is the instruction pointer register, which contains the address of the location immediately following the current instruction)\&.
+.PP
+In NASM syntax, there are two ways to specify RIP\-relative addressing:
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov dword [rip+10], 1
+.fi
+.if n \{\
+.RE
+.\}
+.PP
+stores the value 1 ten bytes after the end of the instruction\&.
+\fB10\fR
+can also be a symbolic constant, and will be treated the same way\&. On the other hand,
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov dword [symb wrt rip], 1
+.fi
+.if n \{\
+.RE
+.\}
+.PP
+stores the value 1 into the address of symbol
+\fBsymb\fR\&. This is distinctly different than the behavior of:
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov dword [symb+rip], 1
+.fi
+.if n \{\
+.RE
+.\}
+.PP
+which takes the address of the end of the instruction, adds the address of
+\fBsymb\fR
+to it, then stores the value 1 there\&. If
+\fBsymb\fR
+is a variable, this will
+\fInot\fR
+store the value 1 into the
+\fBsymb\fR
+variable!
+.PP
+Yasm also supports the following syntax for RIP\-relative addressing:
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov [rel sym], rax  ; RIP\-relative
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov [abs sym], rax  ; not RIP\-relative
+.fi
+.if n \{\
+.RE
+.\}
+.PP
+The behavior of:
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov [sym], rax
+.fi
+.if n \{\
+.RE
+.\}
+.PP
+Depends on a mode set by the DEFAULT directive, as follows\&. The default mode is always "abs", and in "rel" mode, use of registers, an fs or gs segment override, or an explicit "abs" override will result in a non\-RIP\-relative effective address\&.
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+default rel
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov [sym], rbx      ; RIP\-relative
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov [abs sym], rbx  ; not RIP\-relative (explicit override)
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov [rbx+1], rbx    ; not RIP\-relative (register use)
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov [fs:sym], rbx   ; not RIP\-relative (fs or gs use)
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov [ds:sym], rbx   ; RIP\-relative (segment, but not fs or gs)
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov [rel sym], rbx  ; RIP\-relative (redundant override)
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+default abs
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov [sym], rbx      ; not RIP\-relative
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov [abs sym], rbx  ; not RIP\-relative
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov [rbx+1], rbx    ; not RIP\-relative
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov [fs:sym], rbx   ; not RIP\-relative
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov [ds:sym], rbx   ; not RIP\-relative
+.fi
+.if n \{\
+.RE
+.\}
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov [rel sym], rbx  ; RIP\-relative (explicit override)
+.fi
+.if n \{\
+.RE
+.\}
+.RE
+.sp
+.it 1 an-trap
+.nr an-no-space-flag 1
+.nr an-break-flag 1
+.br
+.ps +1
+\fBMemory references\fR
+.RS 4
+.PP
+Usually the size of a memory reference can be deduced by which registers you\'re moving\-\-for example, "mov [rax],ecx" is a 32\-bit move, because ecx is 32 bits\&. YASM currently gives the non\-obvious "invalid combination of opcode and operands" error if it can\'t figure out how much memory you\'re moving\&. The fix in this case is to add a memory size specifier: qword, dword, word, or byte\&.
+.PP
+Here\'s a 64\-bit memory move, which sets 8 bytes starting at rax:
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov qword [rax], 1
+.fi
+.if n \{\
+.RE
+.\}
+.PP
+Here\'s a 32\-bit memory move, which sets 4 bytes:
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov dword [rax], 1
+.fi
+.if n \{\
+.RE
+.\}
+.PP
+Here\'s a 16\-bit memory move, which sets 2 bytes:
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov word [rax], 1
+.fi
+.if n \{\
+.RE
+.\}
+.PP
+Here\'s an 8\-bit memory move, which sets 1 byte:
+.sp
+.if n \{\
+.RS 4
+.\}
+.nf
+mov byte [rax], 1
+.fi
+.if n \{\
+.RE
+.\}
+.RE
+.SH "LC3B ARCHITECTURE"
+.PP
+The
+\(lqlc3b\(rq
+architecture supports the LC\-3b ISA as used in the ECE 312 (now ECE 411) course at the University of Illinois, Urbana\-Champaign, as well as other university courses\&. See
+\m[blue]\fB\%http://courses.ece.uiuc.edu/ece411/\fR\m[]
+for more details and example code\&. The
+\(lqlc3b\(rq
+architecture consists of only one machine:
+\(lqlc3b\(rq\&.
+.SH "SEE ALSO"
+.PP
+\fByasm\fR(1)
+.SH "BUGS"
+.PP
+When using the
+\(lqx86\(rq
+architecture, it is overly easy to generate AMD64 code (using the
+\fBBITS 64\fR
+directive) and generate a 32\-bit object file (by failing to specify
+\fB\-m amd64\fR
+on the command line or selecting a 64\-bit object format)\&. Similarly, specifying
+\fB\-m amd64\fR
+does not default the BITS setting to 64\&. An easy way to avoid this is by directly specifying a 64\-bit object format such as
+\fB\-f elf64\fR\&.
+.SH "AUTHOR"
+.PP
+\fBPeter Johnson\fR <\&peter@tortall\&.net\&>
+.RS 4
+Author.
+.RE
+.SH "COPYRIGHT"
+.br
+Copyright \(co 2004, 2005, 2006, 2007 Peter Johnson
+.br