LTP KVM Test API
================

Testing KVM is more complex than testing other Linux features. Some KVM bugs allow
userspace code running inside the virtual machine to bypass (emulated) hardware
access restrictions and elevate its privileges inside the guest operating
system. The worst types of KVM bugs may even allow the guest code to crash or
compromise the physical host. KVM tests therefore need to be split into two
components – a KVM controller program running on the physical host and a guest
payload program running inside the VM. The cooperation of these two components
allows testing even of bugs that somehow cross the virtualization boundary.

NOTE: See also
      https://github.com/linux-test-project/ltp/wiki/Test-Writing-Guidelines[Test Writing Guidelines],
      https://github.com/linux-test-project/ltp/wiki/C-Test-Case-Tutorial[C Test Case Tutorial],
      https://github.com/linux-test-project/ltp/wiki/C-Test-API[C Test API].

1. Basic KVM test structure
---------------------------

KVM tests are simple C source files containing both the KVM controller code
and the guest payload code separated by an `#ifdef COMPILE_PAYLOAD` preprocessor
condition. The file will be compiled twice: once to compile the payload part
and once to compile the KVM controller part and embed the payload binary inside.
The result is a single self-contained binary that'll execute the embedded
payload inside a KVM virtual machine and print results in the same format as
a normal LTP test.

A KVM test source should start with `#include "kvm_test.h"` instead of the
usual `tst_test.h`. The `kvm_test.h` header file will include the other basic
headers appropriate for the current compilation pass. Everything else in the
source file should be enclosed in `#ifdef COMPILE_PAYLOAD ... #else ... #endif`
condition, including any other header file includes. Note that the standard
LTP headers are not available in the payload compilation pass, only the KVM
guest library headers can be included.

.Example KVM test
[source,c]
-------------------------------------------------------------------------------
#include "kvm_test.h"

#ifdef COMPILE_PAYLOAD

/* Guest payload code */

void main(void)
{
	tst_res(TPASS, "Hello, world!");
}

#else /* COMPILE_PAYLOAD */

/* KVM controller code */

static struct tst_test test = {
	.test_all = tst_kvm_run,
	.setup = tst_kvm_setup,
	.cleanup = tst_kvm_cleanup,
};

#endif /* COMPILE_PAYLOAD */
-------------------------------------------------------------------------------

The KVM controller code is a normal LTP test and needs to define an instance
of `struct tst_test` with metadata and the usual setup, cleanup, and test
functions. A basic implementation of all three functions is provided by the KVM
host library.

On the other hand, the payload is essentially a tiny kernel that'll run
on bare virtual hardware. It cannot access any files, Linux syscalls, standard
library functions, etc. except for the small subset provided by the KVM guest
library. The payload code must define a `void main(void)` function which will
be the VM entry point of the test.

2. KVM host library
-------------------

The KVM host library provides helper functions for creating and running
a minimal KVM virtual machine.

2.1 Data structures
~~~~~~~~~~~~~~~~~~~

[source,c]
-------------------------------------------------------------------------------
struct tst_kvm_instance {
	int vm_fd, vcpu_fd;
	struct kvm_run *vcpu_info;
	size_t vcpu_info_size;
	struct kvm_userspace_memory_region ram[MAX_KVM_MEMSLOTS];
	struct tst_kvm_result *result;
};
-------------------------------------------------------------------------------

`struct tst_kvm_instance` holds the file descriptors and memory buffers
of a single KVM virtual machine:

- `int vm_fd` is the main VM file descriptor created by `ioctl(KVM_CREATE_VM)`
- `int vcpu_fd` is the virtual CPU file descriptor created by
  `ioctl(KVM_CREATE_VCPU)`
- `struct kvm_run *vcpu_info` is the VCPU state structure created by
  `mmap(vcpu_fd)`
- `size_t vcpu_info_size` is the size of `vcpu_info` buffer
- `struct kvm_userspace_memory_region ram[MAX_KVM_MEMSLOTS]` is the list
  of memory slots defined in this VM. Unused memory slots have zero
  in the `userspace_addr` field.
- `struct tst_kvm_result *result` is a buffer for passing test result data
  from the VM to the controller program, mainly `tst_res()`/`tst_brk()` flags
  and messages.

[source,c]
-------------------------------------------------------------------------------
struct tst_kvm_result {
	int32_t result;
	int32_t lineno;
	uint64_t file_addr;
	char message[0];
};
-------------------------------------------------------------------------------

`struct tst_kvm_result` is used to pass test results and synchronization data
between the KVM guest and the controller program. Most often, it is used
to pass `tst_res()` and `tst_brk()` messages from the VM, but special values
can also be used to send control flow requests both ways.

- `int32_t result` is the message type, either one of the `TPASS`, `TFAIL`,
  `TWARN`, `TBROK`, `TINFO` flags or a special control flow value. Errno flags
  are not supported.
- `int32_t lineno` is the line number for `tst_res()`/`tst_brk()` messages.
- `uint64_t file_addr` is the VM address of the filename string for
  `tst_res()`/`tst_brk()` messages.
- `char message[0]` is the buffer for arbitrary message data, most often used
  to pass `tst_res()`/`tst_brk()` message strings.
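
Because `message` is a zero-length array at the end of the structure, the
space available for message text depends on the size of the buffer the
structure is placed in. The following is a minimal standalone sketch of that
idea, not the library code. The names `demo_result`, `demo_report()`,
`demo_message_capacity()` and the 4096-byte `DEMO_BUF_SIZE` are assumptions
made for illustration only.

[source,c]
-------------------------------------------------------------------------------
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical buffer size, assumed to be one 4KB memory page. */
#define DEMO_BUF_SIZE 4096

/* Same field layout as struct tst_kvm_result, using a flexible array. */
struct demo_result {
	int32_t result;
	int32_t lineno;
	uint64_t file_addr;
	char message[];
};

/* Bytes available for the message, including the terminating NUL. */
static size_t demo_message_capacity(void)
{
	return DEMO_BUF_SIZE - offsetof(struct demo_result, message);
}

/* Store one result message. Returns 0 on success, -1 if it does not fit. */
static int demo_report(void *buf, int32_t result, int32_t lineno,
	const char *message)
{
	struct demo_result *res = buf;
	size_t len = strlen(message) + 1;

	assert(buf != NULL);

	if (len > demo_message_capacity())
		return -1;

	res->result = result;
	res->lineno = lineno;
	res->file_addr = 0;	/* no filename string in this sketch */
	memcpy(res->message, message, len);
	return 0;
}
-------------------------------------------------------------------------------

In this sketch the header occupies 16 bytes, so a 4096-byte buffer leaves
4080 bytes for the message text.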

2.2 Working with virtual machines
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The KVM host library provides a default implementation of the setup, cleanup
and test functions for `struct tst_test` in cases where you do not need
to customize the VM configuration. You can either assign these functions
to the `struct tst_test` instance directly or call them from your own function
that does some additional steps. All three functions must be used together.

- `void tst_kvm_setup(void)`
- `void tst_kvm_run(void)`
- `void tst_kvm_cleanup(void)`

Note: `tst_kvm_run()` calls `tst_free_all()`. Calling it will free all
previously allocated guarded buffers.

- `void tst_kvm_validate_result(int value)` – Validate whether the value
  returned in `struct tst_kvm_result.result` can be safely passed
  to `tst_res()` or `tst_brk()`. If the value is not valid, the controller
  program will be terminated with error.

- `uint64_t tst_kvm_get_phys_address(const struct tst_kvm_instance *inst,
  uint64_t addr)` – Converts pointer value (virtual address) from KVM virtual
  machine `inst` to the corresponding physical address. Returns 0 if
  the virtual address is not mapped to any physical address. If virtual memory
  mapping is not enabled in the VM or not available on the arch at all, this
  function simply returns `addr` as is.

- `int tst_kvm_find_phys_memslot(const struct tst_kvm_instance *inst,
  uint64_t paddr)` – Returns index of the memory slot in KVM virtual machine
  `inst` which contains the physical address `paddr`. If the address is not
  backed by a memory buffer, returns -1.

- `int tst_kvm_find_memslot(const struct tst_kvm_instance *inst,
  uint64_t addr)` – Returns index of the memory slot in KVM virtual machine
  `inst` which contains the virtual address `addr`. If the virtual address
  is not mapped to a valid physical address backed by a memory buffer,
  returns -1.

- `void *tst_kvm_get_memptr(const struct tst_kvm_instance *inst,
  uint64_t addr)` – Converts pointer value (virtual address) from KVM virtual
  machine `inst` to host-side pointer.

- `void *tst_kvm_alloc_memory(struct tst_kvm_instance *inst, unsigned int slot,
  uint64_t baseaddr, size_t size, unsigned int flags)` – Allocates a guarded
  buffer of given `size` in bytes and installs it into specified memory `slot`
  of the KVM virtual machine `inst` at base address `baseaddr`. The buffer
  will be automatically page aligned at both ends. See the kernel
  documentation of the `KVM_SET_USER_MEMORY_REGION` ioctl for the list of valid
  `flags`. Returns pointer to page-aligned beginning of the allocated buffer.
  The actual requested `baseaddr` will be located at
  `ret + baseaddr % pagesize`.

- `struct kvm_cpuid2 *tst_kvm_get_cpuid(int sysfd)` – Get a list of supported
  virtual CPU features returned by `ioctl(KVM_GET_SUPPORTED_CPUID)`.
  The argument must be an open file descriptor returned by `open("/dev/kvm")`.

- `void tst_kvm_create_instance(struct tst_kvm_instance *inst,
  size_t ram_size)` – Creates and fully initializes a new KVM virtual machine
  with at least `ram_size` bytes of memory. The VM instance info will be
  stored in `inst`.

- `int tst_kvm_run_instance(struct tst_kvm_instance *inst, int exp_errno)` –
  Executes the program installed in KVM virtual machine `inst`. Any result
  messages returned by the VM will be automatically printed to controller
  program output. Returns zero. If `exp_errno` is non-zero, the VM execution
  syscall is allowed to fail with the `exp_errno` error code and
  `tst_kvm_run_instance()` will return -1 instead of terminating the test.

- `void tst_kvm_destroy_instance(struct tst_kvm_instance *inst)` – Deletes
  the KVM virtual machine `inst`. Note that the guarded buffers assigned
  to the VM by `tst_kvm_create_instance()` or `tst_kvm_alloc_memory()` will
  not be freed.
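
The memory slot lookup described for `tst_kvm_find_phys_memslot()` and
the address translation described for `tst_kvm_get_memptr()` can be sketched
as follows. This is a standalone model, not the library implementation. It
assumes identity mapping (virtual address = physical address), uses a reduced
copy of the memory region structure, and returns `NULL` for unmapped
addresses. All `demo_*` and `DEMO_*` names are hypothetical.

[source,c]
-------------------------------------------------------------------------------
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MEMSLOTS 8	/* hypothetical stand-in for MAX_KVM_MEMSLOTS */

/* Only the fields of struct kvm_userspace_memory_region used here. */
struct demo_region {
	uint64_t guest_phys_addr;
	uint64_t memory_size;
	uint64_t userspace_addr;	/* zero marks an unused slot */
};

/* Return the index of the slot containing paddr, or -1 if none does. */
static int demo_find_phys_memslot(const struct demo_region *ram,
	uint64_t paddr)
{
	int i;

	assert(ram != NULL);

	for (i = 0; i < DEMO_MEMSLOTS; i++) {
		if (!ram[i].userspace_addr)
			continue;

		if (paddr >= ram[i].guest_phys_addr &&
			paddr - ram[i].guest_phys_addr < ram[i].memory_size)
			return i;
	}

	return -1;
}

/* Translate a guest physical address to a host pointer, or NULL. */
static void *demo_get_memptr(const struct demo_region *ram, uint64_t paddr)
{
	int slot = demo_find_phys_memslot(ram, paddr);

	if (slot < 0)
		return NULL;

	return (void *)(uintptr_t)(ram[slot].userspace_addr +
		(paddr - ram[slot].guest_phys_addr));
}
-------------------------------------------------------------------------------

The host pointer is the slot's host base address plus the offset of the guest
address within that slot.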

The KVM host library does not provide any way to reset a VM instance back
to initial state. Running multiple iterations of the test requires destroying
the old VM instance and creating a new one. Otherwise the VM will exit
without reporting any results on the second iteration and the test will fail.
The `tst_kvm_run()` function handles this issue correctly.

3. KVM guest library
--------------------

The KVM guest library provides a minimal implementation of both the LTP
test library and the standard C library functions. Do not try to include
the usual LTP or C headers in guest payload code, it will not work.

3.1 Standard C functions
~~~~~~~~~~~~~~~~~~~~~~~~

`#include "kvm_test.h"`

The functions listed below are implemented according to the C standard, except
for `memzero()`, which is not a standard C function:

- `void *memset(void *dest, int val, size_t size)`
- `void *memzero(void *dest, size_t size)`
- `void *memcpy(void *dest, const void *src, size_t size)`
- `char *strcpy(char *dest, const char *src)`
- `char *strcat(char *dest, const char *src)`
- `size_t strlen(const char *str)`

3.2 LTP library functions
~~~~~~~~~~~~~~~~~~~~~~~~~

`#include "kvm_test.h"`

The KVM guest library currently provides the LTP functions for reporting test
results. All standard result flags except for `T*ERRNO` are supported
with the same rules as usual. However, the printf-like formatting is not
implemented yet.

- `void tst_res(int result, const char *message)`
- `void tst_brk(int result, const char *message) __attribute__((noreturn))`

A handful of useful macros is also available:

- `TST_TEST_TCONF(message)` – Generates a test program that will simply print
  a `TCONF` message and exit. This is useful when the real test cannot be
  built due to missing dependencies or arch limitations.

- `ARRAY_SIZE(arr)` – Returns the number of items in statically allocated
  array `arr`.

- `LTP_ALIGN(x, a)` – Aligns integer `x` to be a multiple of `a`, which
  must be a power of 2.
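
The power-of-2 requirement of `LTP_ALIGN()` follows from the usual bit mask
technique for alignment. The sketch below shows that technique as a plain
function, assuming that the value is rounded up to the next multiple.
The name `demo_align_up()` is hypothetical and the code is not the macro
definition itself.

[source,c]
-------------------------------------------------------------------------------
#include <assert.h>
#include <stdint.h>

/* Round x up to the next multiple of a. The value a must be a power of 2. */
static uint64_t demo_align_up(uint64_t x, uint64_t a)
{
	/* A power of 2 has exactly one bit set. */
	assert(a != 0 && (a & (a - 1)) == 0);

	/* Adding a - 1 and clearing the low bits rounds up to a multiple. */
	return (x + a - 1) & ~(a - 1);
}
-------------------------------------------------------------------------------

With `a` set to 4096, the values 1 and 4096 both align to 4096, while 4097
aligns to 8192.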

3.3 Arch independent functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

`#include "kvm_test.h"`

Memory management in KVM guest library currently uses only primitive linear
buffer for memory allocation. There are no checks whether the VM can allocate
more memory and the already allocated memory cannot be freed.

- `void *tst_heap_alloc(size_t size)` – Allocates a block of memory on the heap.

- `void *tst_heap_alloc_aligned(size_t size, size_t align)` – Allocates
  a block of memory on the heap with the starting address aligned to given
  value.
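
A linear buffer allocator of the kind described above can be sketched in
a few lines. This is a standalone illustration, not the guest library code.
The heap size, the `demo_*` names and the pointer-size default alignment are
assumptions. Unlike the guest library as described, the sketch checks whether
the request fits.

[source,c]
-------------------------------------------------------------------------------
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed heap. The real guest heap location is not shown. */
static unsigned char demo_heap[4096];
static size_t demo_heap_used;

/* Allocate size bytes aligned to align (a power of 2). Never freed. */
static void *demo_heap_alloc_aligned(size_t size, size_t align)
{
	uintptr_t base = (uintptr_t)demo_heap;
	uintptr_t start;
	size_t offset;

	assert(align != 0 && (align & (align - 1)) == 0);

	/* Round the next free address up to the requested alignment. */
	start = (base + demo_heap_used + align - 1) & ~((uintptr_t)align - 1);
	offset = start - base;

	/* Bounds check added for the sketch only. */
	if (offset > sizeof(demo_heap) || size > sizeof(demo_heap) - offset)
		return NULL;

	/* Move the watermark past the new block. */
	demo_heap_used = offset + size;
	return (void *)start;
}

/* Allocate size bytes with an assumed default alignment. */
static void *demo_heap_alloc(size_t size)
{
	return demo_heap_alloc_aligned(size, sizeof(void *));
}
-------------------------------------------------------------------------------

Each allocation only moves the watermark forward, which is why individual
blocks cannot be returned to the heap.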

3.4 x86 specific functions
~~~~~~~~~~~~~~~~~~~~~~~~~~

`#include "kvm_test.h"` +
`#include "kvm_x86.h"`

- `struct kvm_interrupt_frame` – Opaque arch-dependent structure which holds
  interrupt frame information. Use the functions below to get individual values:

- `uintptr_t kvm_get_interrupt_ip(const struct kvm_interrupt_frame *ifrm)` –
  Get instruction pointer value from interrupt frame structure. This may be
  the instruction which caused an interrupt or the one immediately after,
  depending on the interrupt vector semantics.

- `int (*tst_interrupt_callback)(void *userdata,
  struct kvm_interrupt_frame *ifrm, unsigned long errcode)` – Interrupt handler
  callback prototype. When an interrupt occurs, the assigned callback function
  will be passed the `userdata` pointer that was given
  to `tst_set_interrupt_callback()`, interrupt frame `ifrm` and the error
  code `errcode` defined by the interrupt vector semantics. If the interrupt
  vector does not generate an error code, `errcode` will be set to zero.
  The callback function must return 0 if the interrupt was successfully
  handled and test execution should resume. Non-zero return value means that
  the interrupt could not be handled and the test will terminate with error.

- `void tst_set_interrupt_callback(unsigned int vector,
  tst_interrupt_callback func, void *userdata)` – Register new interrupt
  handler callback function `func` for interrupt `vector`. The `userdata`
  argument is an arbitrary pointer that will be passed to `func()` every time
  it gets called. The previous interrupt handler callback will be removed.
  Setting `func` to `NULL` will remove any existing interrupt handler
  from `vector` and the interrupt will become a fatal error.
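
The callback contract described above (one handler per vector, a `userdata`
pointer passed back on every call, zero return value to resume execution) can
be modeled with a small dispatch table. The sketch below is a standalone
illustration with hypothetical `demo_*` names. It does not show how the guest
library installs real interrupt handlers.

[source,c]
-------------------------------------------------------------------------------
#include <assert.h>
#include <stddef.h>

#define DEMO_VECTORS 256	/* x86 has 256 interrupt vectors */

struct demo_frame;	/* opaque, like struct kvm_interrupt_frame */

typedef int (*demo_callback)(void *userdata, struct demo_frame *ifrm,
	unsigned long errcode);

static struct {
	demo_callback func;
	void *userdata;
} demo_handlers[DEMO_VECTORS];

/* Replace the handler for vector. Passing NULL as func removes it. */
static void demo_set_callback(unsigned int vector, demo_callback func,
	void *userdata)
{
	assert(vector < DEMO_VECTORS);
	demo_handlers[vector].func = func;
	demo_handlers[vector].userdata = userdata;
}

/* Returns 0 if execution may resume, non-zero if the interrupt is fatal. */
static int demo_dispatch(unsigned int vector, struct demo_frame *ifrm,
	unsigned long errcode)
{
	assert(vector < DEMO_VECTORS);

	if (!demo_handlers[vector].func)
		return -1;

	return demo_handlers[vector].func(demo_handlers[vector].userdata,
		ifrm, errcode);
}

/* Example callback: count how many times the interrupt fired. */
static int demo_count_hits(void *userdata, struct demo_frame *ifrm,
	unsigned long errcode)
{
	(void)ifrm;
	(void)errcode;
	(*(int *)userdata)++;
	return 0;
}
-------------------------------------------------------------------------------

Registering `demo_count_hits()` with a pointer to an `int` counter as
`userdata` lets the test code check later how many times the vector fired.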

[source,c]
-------------------------------------------------------------------------------
struct page_table_entry_pae {
	unsigned int present: 1;
	unsigned int writable: 1;
	unsigned int user_access: 1;
	unsigned int write_through: 1;
	unsigned int disable_cache: 1;
	unsigned int accessed: 1;
	unsigned int dirty: 1;
	unsigned int page_type: 1;
	unsigned int global: 1;
	unsigned int padding: 3;
	uint64_t address: 40;
	unsigned int padding2: 7;
	unsigned int prot_key: 4;
	unsigned int noexec: 1;
} __attribute__((__packed__));

struct kvm_cpuid {
	unsigned int eax, ebx, ecx, edx;
};

struct kvm_cregs {
	unsigned long cr0, cr2, cr3, cr4;
};

struct kvm_sregs {
	uint16_t cs, ds, es, fs, gs, ss;
};
-------------------------------------------------------------------------------

`struct page_table_entry_pae` is the page table entry structure for PAE and
64-bit paging modes. See Intel(R) 64 and IA-32 Architectures Software
Developer's Manual, Volume 3, Chapter 4 for explanation of the fields.
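
The bit positions implied by the field widths of `struct page_table_entry_pae`
can be illustrated on a raw 64-bit value: the `address` field occupies bits
12 to 51, so the referenced physical address is that field shifted left by
12 bits. The sketch below is a standalone illustration with hypothetical
`demo_*` and `DEMO_*` names. It assumes that an entry without the `present`
bit references no page and it ignores the special meaning of low address bits
in large page entries.

[source,c]
-------------------------------------------------------------------------------
#include <assert.h>
#include <stdint.h>

#define DEMO_PTE_PRESENT	(UINT64_C(1) << 0)
#define DEMO_PTE_PAGE_TYPE	(UINT64_C(1) << 7)
#define DEMO_PTE_NOEXEC		(UINT64_C(1) << 63)
#define DEMO_PTE_ADDR_SHIFT	12
#define DEMO_PTE_ADDR_BITS	40

/* Physical address referenced by a raw entry, or 0 if it is not present. */
static uint64_t demo_pte_address(uint64_t entry)
{
	uint64_t mask = (UINT64_C(1) << DEMO_PTE_ADDR_BITS) - 1;

	if (!(entry & DEMO_PTE_PRESENT))
		return 0;

	return ((entry >> DEMO_PTE_ADDR_SHIFT) & mask) << DEMO_PTE_ADDR_SHIFT;
}

/* Build a raw entry from a page-aligned physical address and flag bits. */
static uint64_t demo_pte_make(uint64_t paddr, uint64_t flags)
{
	/* The address must be 4KB aligned and fit into the 40-bit field. */
	assert((paddr & 0xfff) == 0);
	assert(paddr < (UINT64_C(1) << (DEMO_PTE_ADDR_SHIFT + DEMO_PTE_ADDR_BITS)));

	return paddr | flags;
}
-------------------------------------------------------------------------------

Flag bits above and below the address field, such as `DEMO_PTE_NOEXEC`, do not
affect the decoded address.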

- `uintptr_t kvm_get_page_address_pae(const struct page_table_entry_pae *entry)`
  – Returns the physical address of the memory page referenced by the given
  page table `entry`. Depending on memory mapping changes done by the test,
  the physical address may not be a valid pointer. The caller must determine
  whether the address points to another page table entry or a data page, using
  the known position in page table hierarchy and `entry->page_type`. Returns
  zero if the `entry` does not reference any memory page.

- `void kvm_set_segment_descriptor(struct segment_descriptor *dst, uint64_t baseaddr, uint32_t limit, unsigned int flags)` -
  Fill the `dst` segment descriptor with given values. The maximum value
  of `limit` is `0xfffff` (inclusive) regardless of `flags`.

- `void kvm_parse_segment_descriptor(struct segment_descriptor *src, uint64_t *baseaddr, uint32_t *limit, unsigned int *flags)` -
  Parse data in the `src` segment descriptor and copy them to variables
  pointed to by the other arguments. Any parameter except the first one can
  be `NULL`.

- `int kvm_find_free_descriptor(const struct segment_descriptor *table, size_t size)` -
  Find the first segment descriptor in `table` which does not have
  the `SEGFLAG_PRESENT` bit set. The function handles double-size descriptors
  correctly. Returns index of the first available descriptor or -1 if all
  `size` descriptors are taken.

- `unsigned int kvm_create_stack_descriptor(struct segment_descriptor *table, size_t tabsize, void *stack_base)` -
  Convenience function for registering a stack segment descriptor. It'll
  automatically find a free slot in `table` and fill the necessary flags.
  The `stack_base` pointer must point to the bottom of the stack.

- `void kvm_get_cpuid(unsigned int eax, unsigned int ecx,
  struct kvm_cpuid *buf)` – Executes the CPUID instruction with the given
  `eax` and `ecx` arguments and stores the results in `buf`.

- `void kvm_read_cregs(struct kvm_cregs *buf)` – Copies the current values
  of control registers to `buf`.

- `void kvm_read_sregs(struct kvm_sregs *buf)` - Copies the current values
  of segment registers to `buf`.

- `uint64_t kvm_rdmsr(unsigned int msr)` – Returns the current value
  of model-specific register `msr`.

- `void kvm_wrmsr(unsigned int msr, uint64_t value)` – Stores `value`
  into model-specific register `msr`.

- `void kvm_exit(void) __attribute__((noreturn))` – Terminate the test.
  Similar to calling `exit(0)` in a regular LTP test, although `kvm_exit()`
  will terminate only one iteration of the test, not the whole host process.

See Intel(R) 64 and IA-32 Architectures Software Developer's Manual
for documentation of standard and model-specific x86 registers.
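
The `0xfffff` maximum of `limit` in `kvm_set_segment_descriptor()` matches
the 20-bit limit field of the x86 segment descriptor format. The sketch below
packs and parses a regular 8-byte descriptor as a raw integer to show where
the base address and limit bits are stored. It is a standalone illustration,
not the library code. The `demo_*` names and the 12-bit `flags` encoding
(access byte in bits 0 to 7, the G, D/B, L and AVL bits in bits 8 to 11) are
assumptions, and double-size descriptors with 64-bit base addresses are not
handled.

[source,c]
-------------------------------------------------------------------------------
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SEG_PRESENT 0x80	/* P bit of the access byte */

/* Pack base, limit and flags into a raw 8-byte descriptor. */
static uint64_t demo_pack_descriptor(uint32_t base, uint32_t limit,
	unsigned int flags)
{
	uint64_t raw = 0;

	assert(limit <= 0xfffff);	/* the limit field is 20 bits wide */
	assert(flags <= 0xfff);

	raw |= (uint64_t)(limit & 0xffff);		/* limit 15:0 */
	raw |= (uint64_t)(base & 0xffffff) << 16;	/* base 23:0 */
	raw |= (uint64_t)(flags & 0xff) << 40;		/* access byte */
	raw |= (uint64_t)((limit >> 16) & 0xf) << 48;	/* limit 19:16 */
	raw |= (uint64_t)((flags >> 8) & 0xf) << 52;	/* G, D/B, L, AVL */
	raw |= (uint64_t)(base >> 24) << 56;		/* base 31:24 */
	return raw;
}

/* Unpack a raw descriptor. Any output pointer may be NULL. */
static void demo_parse_descriptor(uint64_t raw, uint32_t *base,
	uint32_t *limit, unsigned int *flags)
{
	if (base)
		*base = (uint32_t)((raw >> 16) & 0xffffff) |
			((uint32_t)(raw >> 56) << 24);

	if (limit)
		*limit = (uint32_t)(raw & 0xffff) |
			((uint32_t)((raw >> 48) & 0xf) << 16);

	if (flags)
		*flags = (unsigned int)((raw >> 40) & 0xff) |
			((unsigned int)((raw >> 52) & 0xf) << 8);
}
-------------------------------------------------------------------------------

With this encoding, base address 0, limit `0xfffff` and flags `0xc9a` produce
the raw value `0x00cf9a000000ffff`, which is the common flat 32-bit code
segment descriptor.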

3.5 AMD SVM helper functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

`#include "kvm_test.h"` +
`#include "kvm_x86.h"` +
`#include "kvm_x86_svm.h"`

The KVM guest library provides basic helper functions for creating and running
nested virtual machines using the AMD SVM technology.

.Example code to execute nested VM
[source,c]
-------------------------------------------------------------------------------
int guest_main(void)
{
	...
	return 0;
}

void main(void)
{
	struct kvm_svm_vcpu *vm;

	kvm_init_svm();
	vm = kvm_create_svm_vcpu(guest_main, 1);
	kvm_svm_vmrun(vm);
}
-------------------------------------------------------------------------------

- `int kvm_is_svm_supported(void)` - Returns non-zero value if the CPU
  supports AMD SVM, otherwise returns 0.

- `int kvm_get_svm_state(void)` - Returns non-zero value if SVM is currently
  enabled, otherwise returns 0.

- `void kvm_set_svm_state(int enabled)` - Enable or disable SVM according
  to argument. If SVM is disabled by host or not supported, the test will exit
  with `TCONF`.

- `void kvm_init_svm(void)` - Enable and fully initialize SVM, including
  allocating and setting up host save area VMCB. If SVM is disabled by host or
  not supported, the test will exit with `TCONF`.

- `struct kvm_vmcb *kvm_alloc_vmcb(void)` - Allocate new VMCB structure
  with correct memory alignment and fill it with zeroes.

- `void kvm_vmcb_set_intercept(struct kvm_vmcb *vmcb, unsigned int id, unsigned int state)` -
  Set SVM intercept bit `id` to given `state`.

- `void kvm_init_guest_vmcb(struct kvm_vmcb *vmcb, uint32_t asid, uint16_t ss, void *rsp, int (*guest_main)(void))` -
  Initialize new SVM virtual machine. The `asid` parameter is the nested
  page table ID. The `ss` and `rsp` parameters set the stack segment and stack
  pointer values, respectively. The `guest_main` parameter sets the code entry
  point of the virtual machine. All control registers, segment registers
  (except stack segment register), GDTR and IDTR will be copied
  from the current CPU state.

- `struct kvm_svm_vcpu *kvm_create_svm_vcpu(int (*guest_main)(void), int alloc_stack)` -
  Convenience function for allocating and initializing new SVM virtual CPU.
  The `guest_main` parameter is passed to `kvm_init_guest_vmcb()`,
  the `alloc_stack` parameter controls whether a new 8KB stack will be
  allocated and registered in GDT. Interception will be enabled for `VMSAVE`
  and `HLT` instructions. If you set `alloc_stack` to zero, you must configure
  the stack segment register and stack pointer manually.

- `void kvm_svm_vmrun(struct kvm_svm_vcpu *cpu)` - Start or continue execution
  of a nested virtual machine. Beware that FPU state is not saved. Do not use
  floating point types or values in nested guest code. Also do not use
  `tst_res()` or `tst_brk()` functions in nested guest code.

See AMD64 Architecture Programmer's Manual Volume 2 for documentation
of the Secure Virtual Machine (SVM) technology.

4. KVM guest environment
------------------------

KVM guest payload execution begins with bootstrap code which will perform
the minimal guest environment setup required for running C code:

- Activate the appropriate CPU execution mode (IA-32 protected mode
  on 32-bit x86 or the 64-bit mode on x86_64).
- Create identity mapping (virtual address = physical address) of the lower
  2GB memory region, even if parts of the region are not backed by any host
  memory buffers. The memory region above 2GB threshold is left unmapped
  except for one memory page reserved for the `struct tst_kvm_result` buffer.
- Initialize 8KB stack.
- Install default interrupt handlers for standard CPU exception vectors.

When the environment setup is complete, the bootstrap code will call the `void main(void)`
function implemented by the test program. To finish execution of guest payload,
the test can either return from the `main()` function or call `kvm_exit()`
at any point.