.\" libnvmm.3 revision 1.1
.Dd September 12, 2018
.Dt LIBNVMM 3
.Os
.Sh NAME
.Nm libnvmm
.Nd NetBSD Virtualization API
.Sh LIBRARY
.Lb libnvmm
.Sh SYNOPSIS
.In nvmm.h
.Ft int
.Fn nvmm_capability "struct nvmm_capability *cap"
.Ft int
.Fn nvmm_machine_create "struct nvmm_machine *mach"
.Ft int
.Fn nvmm_machine_destroy "struct nvmm_machine *mach"
.Ft int
.Fn nvmm_machine_configure "struct nvmm_machine *mach" "uint64_t op" \
    "void *conf"
.Ft int
.Fn nvmm_vcpu_create "struct nvmm_machine *mach" "nvmm_cpuid_t cpuid"
.Ft int
.Fn nvmm_vcpu_destroy "struct nvmm_machine *mach" "nvmm_cpuid_t cpuid"
.Ft int
.Fn nvmm_vcpu_getstate "struct nvmm_machine *mach" "nvmm_cpuid_t cpuid" \
    "void *state" "uint64_t flags"
.Ft int
.Fn nvmm_vcpu_setstate "struct nvmm_machine *mach" "nvmm_cpuid_t cpuid" \
    "void *state" "uint64_t flags"
.Ft int
.Fn nvmm_vcpu_inject "struct nvmm_machine *mach" "nvmm_cpuid_t cpuid" \
    "struct nvmm_event *event"
.Ft int
.Fn nvmm_vcpu_run "struct nvmm_machine *mach" "nvmm_cpuid_t cpuid" \
    "struct nvmm_exit *exit"
.Ft int
.Fn nvmm_gpa_map "struct nvmm_machine *mach" "uintptr_t hva" "gpaddr_t gpa" \
    "size_t size" "int flags"
.Ft int
.Fn nvmm_gpa_unmap "struct nvmm_machine *mach" "uintptr_t hva" "gpaddr_t gpa" \
    "size_t size"
.Ft int
.Fn nvmm_gva_to_gpa "struct nvmm_machine *mach" "nvmm_cpuid_t cpuid" \
    "gvaddr_t gva" "gpaddr_t *gpa" "nvmm_prot_t *prot"
.Ft int
.Fn nvmm_gpa_to_hva "struct nvmm_machine *mach" "gpaddr_t gpa" \
    "uintptr_t *hva"
.Ft int
.Fn nvmm_assist_io "struct nvmm_machine *mach" "nvmm_cpuid_t cpuid" \
    "struct nvmm_exit *exit" "void (*cb)(struct nvmm_io *)"
.Ft int
.Fn nvmm_assist_mem "struct nvmm_machine *mach" "nvmm_cpuid_t cpuid" \
    "struct nvmm_exit *exit" "void (*cb)(struct nvmm_mem *)"
.Sh DESCRIPTION
.Nm
provides a library for VMM software to handle hardware-accelerated virtual
machines in
.Nx .
A virtual machine is described by an opaque structure,
.Cd nvmm_machine .
VMM software should not attempt to modify this structure directly, and should
use the API provided by
.Nm
to handle virtual machines.
.Pp
.Fn nvmm_capability
gets the capabilities of NVMM.
.Pp
.Fn nvmm_machine_create
creates a virtual machine in the kernel.
The
.Fa mach
structure is initialized, and describes the machine.
.Pp
.Fn nvmm_machine_destroy
destroys the virtual machine described in
.Fa mach .
.Pp
.Fn nvmm_machine_configure
configures, on the machine
.Fa mach ,
the parameter indicated in
.Fa op .
.Fa conf
describes the value of the parameter.
.Pp
.Fn nvmm_vcpu_create
creates a virtual CPU in the machine
.Fa mach ,
giving it the CPU id
.Fa cpuid .
.Pp
.Fn nvmm_vcpu_destroy
destroys the virtual CPU identified by
.Fa cpuid
in the machine
.Fa mach .
.Pp
.Fn nvmm_vcpu_getstate
gets the state of the virtual CPU identified by
.Fa cpuid
in the machine
.Fa mach .
The
.Fa state
argument is the address of a state area, and
.Fa flags
is the bitmap of the components that are to be retrieved.
See
.Sx VCPU State Area
below for details.
.Pp
.Fn nvmm_vcpu_setstate
sets the state of the virtual CPU identified by
.Fa cpuid
in the machine
.Fa mach .
The
.Fa state
argument is the address of a state area, and
.Fa flags
is the bitmap of the components that are to be set.
See
.Sx VCPU State Area
below for details.
.Pp
.Fn nvmm_vcpu_inject
injects the event described by
.Fa event
into the virtual CPU identified by
.Fa cpuid
in the machine
.Fa mach .
See
.Sx Event Injection
below for details.
.Pp
.Fn nvmm_vcpu_run
runs the CPU identified by
.Fa cpuid
in the machine
.Fa mach ,
until a VM exit is triggered.
The
.Fa exit
structure is filled to indicate the exit reason, and the associated parameters
if any.
.Pp
.Fn nvmm_gpa_map
makes the guest physical memory area beginning on address
.Fa gpa
and of size
.Fa size
available in the machine
.Fa mach .
The area is mapped in the calling process' virtual address space, at address
.Fa hva .
.Pp
.Fn nvmm_gpa_unmap
removes the guest physical memory area beginning on address
.Fa gpa
and of size
.Fa size
from the machine
.Fa mach .
It also unmaps the area beginning on
.Fa hva
from the calling process' virtual address space.
.Pp
.Fn nvmm_gva_to_gpa
translates, on the CPU
.Fa cpuid
from the machine
.Fa mach ,
the guest virtual address given in
.Fa gva
into a guest physical address returned in
.Fa gpa .
The associated page permissions are returned in
.Fa prot .
.Fa gva
must be page-aligned.
.Pp
.Fn nvmm_gpa_to_hva
translates, on the machine
.Fa mach ,
the guest physical address indicated in
.Fa gpa
into a host virtual address returned in
.Fa hva .
.Fa gpa
must be page-aligned.
.Pp
.Fn nvmm_assist_io
emulates the I/O operation described in
.Fa exit
on CPU
.Fa cpuid
from machine
.Fa mach .
.Fa cb
will be called to handle the transaction.
See
.Sx I/O Assist
below for details.
.Pp
.Fn nvmm_assist_mem
emulates the Mem operation described in
.Fa exit
on CPU
.Fa cpuid
from machine
.Fa mach .
.Fa cb
will be called to handle the transaction.
See
.Sx Mem Assist
below for details.
.Ss NVMM Capability
The
.Cd nvmm_capability
structure helps VMM software identify the capabilities offered by NVMM on the
host:
.Bd -literal
struct nvmm_capability {
	uint64_t version;
	uint64_t state_size;
	uint64_t max_machines;
	uint64_t max_vcpus;
	uint64_t max_ram;
	union {
		struct {
			...
		} x86;
		uint64_t rsvd[8];
	} u;
};
.Ed
.Pp
For example, the
.Cd max_machines
field indicates the maximum number of virtual machines supported, while
.Cd max_vcpus
indicates the maximum number of VCPUs supported per virtual machine.
.Ss VCPU State Area
A VCPU state area is a structure that entirely defines the content of the
registers of a VCPU.
Only one such structure exists, for x86:
.Bd -literal
struct nvmm_x64_state {
	...
};
.Ed
.Pp
Refer to functional examples to see precisely how to use this structure.
.Ss Exit Reasons
The
.Cd nvmm_exit
structure is used to handle VM exits:
.Bd -literal
enum nvmm_exit_reason {
	NVMM_EXIT_NONE		= 0x0000000000000000,

	/* General. */
	NVMM_EXIT_MEMORY	= 0x0000000000000001,
	NVMM_EXIT_IO		= 0x0000000000000002,
	NVMM_EXIT_MSR		= 0x0000000000000003,
	NVMM_EXIT_INT_READY	= 0x0000000000000004,
	NVMM_EXIT_NMI_READY	= 0x0000000000000005,
	NVMM_EXIT_SHUTDOWN	= 0x0000000000000006,

	/* Instructions (x86). */
	...

	NVMM_EXIT_INVALID	= 0xFFFFFFFFFFFFFFFF
};

struct nvmm_exit {
	enum nvmm_exit_reason reason;
	union {
		...
	} u;
	uint64_t exitstate[8];
};
.Ed
.Pp
The
.Va reason
field indicates the reason for the VM exit.
Additional parameters describing the exit can be present in
.Va u .
.Va exitstate
contains a partial, implementation-specific VCPU state, usable as a fast-path
to retrieve certain state values.
.Pp
A VM exit may be caused by a reason internal to the host kernel that VMM
software need not be concerned with.
In this case, the exit reason is set to
.Cd NVMM_EXIT_NONE .
This gives a chance for VMM software to halt the VM in its tracks.
.Pp
Refer to functional examples to see precisely how to handle VM exits.
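.Pp
As a rough illustration of how a run loop might react to the general exit
reasons listed above, the following sketch maps a reason to an action.
The anonymous enum is a local mirror of the values shown above so that the
fragment builds on its own, and
.Cd vmm_action
and
.Cd classify_exit
are hypothetical names that are not part of
.Nm .
Real VMM software would include nvmm.h and inspect the
.Va reason
field after the VCPU has been run.
.Bd -literal
```c
#include <assert.h>
#include <stdint.h>

/*
 * Local mirror of the general exit reasons listed above.
 * Real code would take these from <nvmm.h>.
 */
enum {
	NVMM_EXIT_NONE		= 0,
	NVMM_EXIT_MEMORY	= 1,
	NVMM_EXIT_IO		= 2,
	NVMM_EXIT_MSR		= 3,
	NVMM_EXIT_INT_READY	= 4,
	NVMM_EXIT_NMI_READY	= 5,
	NVMM_EXIT_SHUTDOWN	= 6
};

/* Hypothetical actions for a run loop; not part of libnvmm. */
enum vmm_action {
	VMM_RESUME,	/* nothing to emulate, run the VCPU again */
	VMM_ASSIST_MEM,	/* hand the exit to nvmm_assist_mem() */
	VMM_ASSIST_IO,	/* hand the exit to nvmm_assist_io() */
	VMM_REINJECT,	/* retry an event that was refused earlier */
	VMM_STOP,	/* the guest shut down */
	VMM_UNHANDLED	/* reason not covered by this sketch */
};

enum vmm_action
classify_exit(uint64_t reason)
{
	switch (reason) {
	case NVMM_EXIT_NONE:
		/* Host-internal exit; a VMM could also halt the VM here. */
		return VMM_RESUME;
	case NVMM_EXIT_MEMORY:
		return VMM_ASSIST_MEM;
	case NVMM_EXIT_IO:
		return VMM_ASSIST_IO;
	case NVMM_EXIT_INT_READY:
	case NVMM_EXIT_NMI_READY:
		return VMM_REINJECT;
	case NVMM_EXIT_SHUTDOWN:
		return VMM_STOP;
	default:
		/* MSR, x86 instruction exits and invalid reasons. */
		return VMM_UNHANDLED;
	}
}
```
.Ed
.Pp
Keeping the classification separate from the run loop is only a design choice
of this sketch; it makes the mapping easy to check without a running machine.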
.Ss Event Injection
It is possible to inject an event into a VCPU.
An event can be a hardware interrupt, a software interrupt, or a software
exception, defined by:
.Bd -literal
enum nvmm_event_type {
	NVMM_EVENT_INTERRUPT_HW,
	NVMM_EVENT_INTERRUPT_SW,
	NVMM_EVENT_EXCEPTION
};

struct nvmm_event {
	enum nvmm_event_type type;
	uint64_t vector;
	union {
		uint64_t error;
		uint64_t prio;
	} u;
};
.Ed
.Pp
This describes an event of type
.Va type ,
to be sent to vector number
.Va vector ,
with a possible additional
.Va error
or
.Va prio
code that is implementation-specific.
.Pp
It is possible that the VCPU is in a state where it cannot receive this
event, if:
.Pp
.Bl -bullet -offset indent -compact
.It
the event is a hardware interrupt, and the VCPU runs with interrupts disabled,
or
.It
the event is a non-maskable interrupt (NMI), and the VCPU is already in an
in-NMI context.
.El
.Pp
In this case,
.Fn nvmm_vcpu_inject
will return
.Er EAGAIN ,
and NVMM will cause a VM exit with reason
.Cd NVMM_EXIT_INT_READY
or
.Cd NVMM_EXIT_NMI_READY
to indicate that VMM software can now reinject the desired event.
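.Pp
One way to handle the
.Er EAGAIN
case described above is to remember the refused event and retry it once the
matching ready exit arrives.
The sketch below assumes a hypothetical function pointer type,
.Cd inject_fn ,
standing in for a call to
.Fn nvmm_vcpu_inject
on a fixed machine and VCPU, so that the logic can be exercised without the
kernel; it also assumes that the failure is reported through
.Va errno
as described in RETURN VALUES.
The structures are local mirrors of the ones above, and
.Cd pending_event ,
.Cd inject_or_defer
and the two fake injectors are made-up names for illustration only.
.Bd -literal
```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Local mirrors of the definitions above; real code uses <nvmm.h>. */
enum nvmm_event_type {
	NVMM_EVENT_INTERRUPT_HW,
	NVMM_EVENT_INTERRUPT_SW,
	NVMM_EVENT_EXCEPTION
};

struct nvmm_event {
	enum nvmm_event_type type;
	uint64_t vector;
	union {
		uint64_t error;
		uint64_t prio;
	} u;
};

/* Hypothetical stand-in for nvmm_vcpu_inject(mach, cpuid, event). */
typedef int (*inject_fn)(struct nvmm_event *);

/* Hypothetical slot holding one event that was refused with EAGAIN. */
struct pending_event {
	bool valid;
	struct nvmm_event event;
};

/* Returns 0 if injected, 1 if deferred for a later retry, -1 on error. */
int
inject_or_defer(inject_fn inject, struct nvmm_event *ev,
    struct pending_event *pend)
{
	if (inject(ev) == 0) {
		pend->valid = false;
		return 0;
	}
	if (errno == EAGAIN) {
		/* Keep a copy; retry on INT_READY or NMI_READY. */
		pend->event = *ev;
		pend->valid = true;
		return 1;
	}
	return -1;
}

/* Fake injectors used only to exercise the helper. */
int
fake_inject_busy(struct nvmm_event *ev)
{
	(void)ev;
	errno = EAGAIN;
	return -1;
}

int
fake_inject_ok(struct nvmm_event *ev)
{
	(void)ev;
	return 0;
}
```
.Ed
.Pp
On a later
.Cd NVMM_EXIT_INT_READY
or
.Cd NVMM_EXIT_NMI_READY
exit, the run loop would call the helper again with the stored event.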
.Ss I/O Assist
When a VM exit occurs with reason
.Cd NVMM_EXIT_IO ,
it is necessary for VMM software to emulate the associated I/O operation.
.Nm
provides an easy way for VMM software to perform that emulation.
.Pp
.Fn nvmm_assist_io
will call the
.Fa cb
callback function and give it a
.Cd nvmm_io
structure as argument.
This structure describes an I/O transaction:
.Bd -literal
struct nvmm_io {
	uint64_t port;
	bool in;
	size_t size;
	uint8_t data[8];
};
.Ed
.Pp
The callback can emulate the operation using this descriptor, which falls into
one of two cases:
.Pp
.Bl -bullet -offset indent -compact
.It
The operation is an input.
In this case, the callback should fill
.Va data
with the desired value.
.It
The operation is an output.
In this case, the callback should read
.Va data
to retrieve the desired value.
.El
.Pp
In either case,
.Va port
will indicate the I/O port,
.Va in
will indicate if the operation is an input, and
.Va size
will indicate the size of the access.
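.Pp
The two cases above can be sketched as a small callback.
The fragment below is a minimal illustration, not the method used by the
functional examples: it redeclares
.Cd nvmm_io
locally so that it builds on its own, and the port number, the
.Cd toy_latch
buffer and the name
.Cd toy_io_callback
are all made up.
Returning all ones for reads of an unknown port is likewise an assumption of
this sketch.
.Bd -literal
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct nvmm_io as shown above. */
struct nvmm_io {
	uint64_t port;
	bool in;
	size_t size;
	uint8_t data[8];
};

/* Hypothetical device: one 8-byte latch behind a made-up port. */
#define TOY_PORT	0x80
uint8_t toy_latch[8];

/* Has the callback signature expected by nvmm_assist_io(). */
void
toy_io_callback(struct nvmm_io *io)
{
	assert(io->size <= sizeof(io->data));

	if (io->port != TOY_PORT) {
		/* Unknown port: inputs read all ones, outputs are dropped. */
		if (io->in)
			memset(io->data, 0xFF, io->size);
		return;
	}

	if (io->in) {
		/* Input: fill data with the value the guest will read. */
		memcpy(io->data, toy_latch, io->size);
	} else {
		/* Output: retrieve the value the guest wrote. */
		memcpy(toy_latch, io->data, io->size);
	}
}
```
.Ed
.Pp
Such a function would be passed as the
.Fa cb
argument of
.Fn nvmm_assist_io .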
.Ss Mem Assist
When a VM exit occurs with reason
.Cd NVMM_EXIT_MEMORY ,
it is necessary for VMM software to emulate the associated memory operation.
.Nm
provides an easy way for VMM software to perform that emulation, similar to
the I/O Assist.
.Pp
.Fn nvmm_assist_mem
will call the
.Fa cb
callback function and give it a
.Cd nvmm_mem
structure as argument.
This structure describes a Mem transaction:
.Bd -literal
struct nvmm_mem {
	gvaddr_t gva;
	gpaddr_t gpa;
	bool write;
	size_t size;
	uint8_t data[8];
};
.Ed
.Pp
The callback can emulate the operation using this descriptor, which falls into
one of two cases:
.Pp
.Bl -bullet -offset indent -compact
.It
The operation is a read.
In this case, the callback should fill
.Va data
with the desired value.
.It
The operation is a write.
In this case, the callback should read
.Va data
to retrieve the desired value.
.El
.Pp
In either case,
.Va gva
will indicate the guest virtual address,
.Va gpa
will indicate the guest physical address,
.Va write
will indicate if the access is a write, and
.Va size
will indicate the size of the access.
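.Pp
A memory callback can follow the same pattern as the I/O one.
The sketch below emulates a tiny register window; it is only an illustration
under stated assumptions.
.Cd nvmm_mem
is redeclared locally,
.Cd gvaddr_t
and
.Cd gpaddr_t
are assumed to be 64-bit integers, and the base address, the
.Cd toy_regs
array and the name
.Cd toy_mem_callback
are hypothetical.
Reads outside the window return zeros here, which is a choice of this sketch
rather than documented behaviour.
.Bd -literal
```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed widths; real code takes these types from <nvmm.h>. */
typedef uint64_t gvaddr_t;
typedef uint64_t gpaddr_t;

/* Local mirror of struct nvmm_mem as shown above. */
struct nvmm_mem {
	gvaddr_t gva;
	gpaddr_t gpa;
	bool write;
	size_t size;
	uint8_t data[8];
};

/* Hypothetical 16-byte register window at a made-up guest address. */
#define TOY_MMIO_BASE	0xF0000000ULL
#define TOY_MMIO_SIZE	16
uint8_t toy_regs[TOY_MMIO_SIZE];

/* Has the callback signature expected by nvmm_assist_mem(). */
void
toy_mem_callback(struct nvmm_mem *mem)
{
	size_t off;

	assert(mem->size <= sizeof(mem->data));

	/* Ignore accesses that are not fully inside the window. */
	if (mem->gpa < TOY_MMIO_BASE ||
	    mem->gpa - TOY_MMIO_BASE > TOY_MMIO_SIZE - mem->size) {
		if (!mem->write)
			memset(mem->data, 0, mem->size);
		return;
	}

	off = (size_t)(mem->gpa - TOY_MMIO_BASE);
	if (mem->write) {
		/* Write: retrieve the value the guest stored. */
		memcpy(&toy_regs[off], mem->data, mem->size);
	} else {
		/* Read: fill data with the value the guest will load. */
		memcpy(mem->data, &toy_regs[off], mem->size);
	}
}
```
.Ed
.Pp
The sketch keys on
.Va gpa
rather than
.Va gva ,
since a device window is usually placed in guest physical space.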
.Sh RETURN VALUES
Upon successful completion, each of these functions returns zero.
Otherwise, a value of \-1 is returned and the global
variable
.Va errno
is set to indicate the error.
.Sh FILES
Functional examples:
.Pp
.Bl -tag -width XXXX -compact
.It Pa src/share/examples/nvmm/toyvirt/
Example of virtualizer.
Launches the binary given as argument in a virtual machine.
.It Pa src/share/examples/nvmm/smallkern/
Example of a kernel that can be executed by toyvirt.
.El
.Sh ERRORS
These functions will fail if:
.Bl -tag -width [ENOBUFS]
.It Bq Er EEXIST
An attempt was made to create a machine or a VCPU that already exists.
.It Bq Er EFAULT
An attempt was made to emulate a memory-based operation in a guest, and the
guest page tables did not have the permissions necessary for the operation
to complete successfully.
.It Bq Er EINVAL
An inappropriate parameter was used.
.It Bq Er ENOBUFS
The maximum number of machines or VCPUs was reached.
.It Bq Er ENOENT
A query was made on a machine or a VCPU that does not exist.
.It Bq Er EPERM
An attempt was made to access a machine that does not belong to the process.
.El
.Pp
In addition,
.Fn nvmm_vcpu_inject
uses the following error codes:
.Bl -tag -width [ENOBUFS]
.It Bq Er EAGAIN
The VCPU cannot receive the event immediately.
.El
.Sh AUTHORS
NVMM was designed and implemented by
.An Maxime Villard .