.Pp
.Nm
also supports injecting errors.
.Nm
also supports reading/writing/clearing error records in a persistent
firmware store (XXX not yet: nothing uses the ERST).
.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh DIAGNOSTICS
When the hardware detects an error and reports it to
.Nm ,
the driver prints information about the error to the console.
.Pp
Example of a correctable memory error, automatically corrected by the
system, with no further intervention needed:
.Bd -literal
apei0: error source 1 reported hardware error: severity=corrected nentries=1 status=0x12<CE,GEDE_COUNT=0x1>
apei0: error source 1 entry 0: SectionType={0xa5bc1114,0x6f64,0x4ede,0xb8b8,{0x3e,0x83,0xed,0x7c,0x83,0xb1}} (memory error)
apei0: error source 1 entry 0: ErrorSeverity=2 (corrected)
apei0: error source 1 entry 0: Revision=0x201
apei0: error source 1 entry 0: Flags=0x1<PRIMARY>
apei0: error source 1 entry 0: FruText=CorrectedErr
apei0: error source 1 entry 0: MemoryErrorType=8 (PARITY_ERROR)
.Ed
.Pp
Example of a fatal uncorrectable memory error:
.Bd -literal
apei0: error source 0 reported hardware error: severity=fatal nentries=1 status=0x11<UE,GEDE_COUNT=0x1>
apei0: error source 0 entry 0: SectionType={0xa5bc1114,0x6f64,0x4ede,0xb8b8,{0x3e,0x83,0xed,0x7c,0x83,0xb1}} (memory error)
apei0: error source 0 entry 0: ErrorSeverity=1 (fatal)
apei0: error source 0 entry 0: Revision=0x201
apei0: error source 0 entry 0: Flags=0x1<PRIMARY>
apei0: error source 0 entry 0: FruText=UncorrectedErr
apei0: error source 0 entry 0: ErrorStatus=0x400<ErrorType=0x4=ERR_MEM>
apei0: error source 0 entry 0: Node=0x0
apei0: error source 0 entry 0: Module=0x0
apei0: error source 0 entry 0: Device=0x0
panic: fatal hardware error
.Ed
.Pp
Details of the hardware error sources can be dumped with
.Xr acpidump 8 .
.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh SEE ALSO
.Xr acpi 4 ,
.Xr acpihed 4 ,
.Xr acpidump 8
.Rs
.%B ACPI Specification 6.5
.%O Chapter 18: ACPI Platform Error Interfaces (APEI)
.%U https://uefi.org/specs/ACPI/6.5/18_Platform_Error_Interfaces.html
.Re
.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh HISTORY
The
.Nm
driver first appeared in
.Nx 10.1 .
.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh AUTHORS
The
.Nm
driver was written by
.An Taylor R Campbell Aq Mt riastradh@NetBSD.org .
.\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
.Sh BUGS
No sysctl interface to read BERT after boot.
.Pp
No simple sysctl interface to inject errors with EINJ, or any way to
inject errors at physical addresses in pages allocated for testing.
Perhaps there should be a separate kernel module for that.
.Pp
Nothing reads, writes, or clears ERST.
.Nx
could use it to store dmesg or other diagnostic information on panic.
.Pp
Many hardware error source types in the HEST are missing, such as
.Tn PCIe
errors.
.Pp
.Nm
is not wired to any machine-dependent machine check exception
notifications.
.Pp
No formal log format or sysctl/device interface that programs can
reliably act on.
.Pp
.Nx
makes no attempt to recover from uncorrectable but recoverable errors,
such as discarding a clean cached page where an uncorrectable memory
error has occurred.