History log of /src/sys/arch/x86/x86/errata.c
Revision  Date  Author  Comments
 1.35  27-Oct-2023  mrg x86: handle AMD errata 1474: A CPU core may hang after about 1044 days

from the new comment:

* This requires disabling CC6 power level, which can be a performance
* issue since it stops full turbo in some implementations (eg, half the
* cores must be in CC6 to achieve the highest boost level.) Set a timer
* to fire in 1000 days -- except NetBSD timers end up having a signed
* 32-bit hz-based value, which rolls over in under 25 days with HZ=1000,
* and doing xcall(9) or kthread(9) from a callout is not allowed anyway,
* so just have a kthread wait 1 day for 1000 times.

documented in:

https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/revision-guides/56323-PUB_1_01.pdf
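The rollover arithmetic in the comment above can be checked with a short sketch. The helper names are hypothetical and are not taken from errata.c; the only assumption is that a timeout is counted in ticks held in a signed 32-bit integer, as the commit message describes.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECONDS_PER_DAY 86400

/* Days until a signed 32-bit tick count overflows at `hz` ticks per second. */
static double
rollover_days(unsigned hz)
{
	return (double)INT32_MAX / (double)hz / SECONDS_PER_DAY;
}

/* True if a timeout of `days` days, expressed in ticks, fits in int32_t. */
static bool
timeout_fits(unsigned days, unsigned hz)
{
	/* Widen first so the multiplication itself cannot overflow. */
	int64_t ticks = (int64_t)days * SECONDS_PER_DAY * hz;

	return ticks <= INT32_MAX;
}
```

With hz = 1000, rollover_days() gives a little under 25 days, so a single 1000-day timeout cannot be represented, while a one-day wait (86,400,000 ticks) fits easily. That is consistent with the choice to repeat a one-day wait 1000 times.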
 1.34  27-Oct-2023  mrg x86: add names for errata that don't have actual numbers

zenbleed is reported as "erratum 65535" currently, this adds a name
for it, and enables the name for any others as well.

pull logging into a function with a tag message.
 1.33  28-Jul-2023  mrg x86: make the CPUID list for errata be far less confusing

the 0x80000001 CPUID result needs some parsing to match against
actual family/model/stepping values. 4-bit 'family' values of
15 or 6 change how to parse the 4-bit extended model and 8-bit
extended family value - for family 6 or 15, the extended model
bits (4) are concatenated with the base 4-bits to create an
8-bit value, and for family 15, the family value is the sum
of the base family value and the 8-bit extended-family value, giving
a range of 0 to 15 + 0xff aka 270.

use a CPUREV(family, model, stepping) macro that builds the
relevant bit-representation of a CPUID, making it far easier
to understand what each entry means, and to add new ones too.

i have confirmed that the emitted cpurevs[] array has the same
values before/after this change, ie, NFCI or observed.
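The parsing rules described above can be sketched as a pair of helpers. This is an illustration only: the struct and function names are hypothetical and this is not the CPUREV macro from errata.c. It assumes the standard CPUID signature layout (stepping in bits 3:0, base model in bits 7:4, base family in bits 11:8, extended model in bits 19:16, extended family in bits 27:20).

```c
#include <stdint.h>

struct cpu_sig {
	unsigned family;
	unsigned model;
	unsigned stepping;
};

/* Decode a raw CPUID signature into effective family/model/stepping. */
static struct cpu_sig
cpu_sig_decode(uint32_t eax)
{
	unsigned base_family = (eax >> 8) & 0xf;
	unsigned base_model = (eax >> 4) & 0xf;
	unsigned ext_model = (eax >> 16) & 0xf;
	unsigned ext_family = (eax >> 20) & 0xff;
	struct cpu_sig sig;

	/* Extended family is only added when the base family is 15. */
	sig.family = (base_family == 15) ? base_family + ext_family : base_family;
	/* Extended model is only concatenated for base family 6 or 15. */
	sig.model = (base_family == 6 || base_family == 15)
	    ? (ext_model << 4) | base_model : base_model;
	sig.stepping = eax & 0xf;
	return sig;
}

/* Build the raw signature back from effective family/model/stepping. */
static uint32_t
cpu_sig_encode(unsigned family, unsigned model, unsigned stepping)
{
	unsigned base_family = (family < 15) ? family : 15;
	unsigned ext_family = (family < 15) ? 0 : family - 15;
	unsigned ext_model = (base_family == 6 || base_family == 15)
	    ? (model >> 4) & 0xf : 0;

	return ((uint32_t)ext_family << 20) | ((uint32_t)ext_model << 16) |
	    (base_family << 8) | ((model & 0xf) << 4) | (stepping & 0xf);
}
```

For example, the raw value 0x00870F10 decodes to family 0x17, model 0x71, stepping 0, and encoding those three values gives the same raw value back. Writing table entries as (family, model, stepping) triples is what makes them readable without decoding hex by hand.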
 1.32  26-Jul-2023  mrg fix the cpuids for the zen2 client CPUs.

i'm not exactly sure how i came up with the values i had, though one
of them was still valid and matched my test systems.

XXX: pullup-*
 1.31  25-Jul-2023  mrg x86: turn off zenbleed chicken bit on Zen2 cpus.

this is based upon Taylor's original work. i just made the list
of CPUs to run on as correct as i could determine. (also, add some
Zen3 and Zen4 cpuids not yet used by any errata.)

(might be nice to have a better way to express revision ranges
rather than specific cpuid matches, eg, 0x30-0x4f models in a cpu
family, etc.)
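One possible shape for such a range match is sketched below. It is not part of errata.c: the struct and function names are hypothetical, and the decoding assumes the standard CPUID signature layout (base model in bits 7:4, base family in bits 11:8, extended model in bits 19:16, extended family in bits 27:20).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical table entry: one family plus an inclusive model range. */
struct cpurev_range {
	unsigned family;
	unsigned model_lo;
	unsigned model_hi;
};

/* Effective family: extended family is added only for base family 15. */
static unsigned
sig_family(uint32_t eax)
{
	unsigned base = (eax >> 8) & 0xf;

	return (base == 15) ? base + ((eax >> 20) & 0xff) : base;
}

/* Effective model: extended model is used only for base family 6 or 15. */
static unsigned
sig_model(uint32_t eax)
{
	unsigned base_family = (eax >> 8) & 0xf;
	unsigned base = (eax >> 4) & 0xf;

	if (base_family == 6 || base_family == 15)
		return (((eax >> 16) & 0xf) << 4) | base;
	return base;
}

/* True if the raw signature falls inside the family/model range. */
static bool
cpurev_in_range(uint32_t eax, const struct cpurev_range *r)
{
	unsigned model = sig_model(eax);

	return sig_family(eax) == r->family &&
	    model >= r->model_lo && model <= r->model_hi;
}
```

With a range of family 0x17, models 0x30 to 0x4f, the raw value 0x00830F10 (model 0x31) matches and 0x00870F10 (model 0x71) does not. Stepping is ignored here; a real table might need a stepping range as well.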

tested on ryzen 3600, and a ported zenbleed PoC that no longer
shows any obtained text. (a similar module-version of it stopped
the PoC on a ryzen 3950x without having to reboot.)

https://www.amd.com/en/resources/product-security/bulletin/amd-sb-7008.html
https://lock.cmpxchg8b.com/zenbleed.html
 1.30  24-Jul-2023  riastradh x86/errata.c: Only say the errata revision search for cpu0.
 1.29  24-Jul-2023  riastradh x86/errata.c: Say what revision we're searching for.
 1.28  24-Jul-2023  riastradh x86/errata.c: Link to original AMD errata guide.

This one is no longer updated; need to link to newer ones for
individual families too. That's where all the cryptic nomenclature
comes from here.
 1.27  07-Oct-2021  msaitoh branches: 1.27.4;
KNF. No functional change.
 1.26  18-May-2019  maxv branches: 1.26.2;
Disable errata #1091. We are the only OS to apply it, and it seems to be
causing trouble to VirtualBox (PR/54143).
 1.25  12-Aug-2018  maxv enable the two errata for AMD Family 16h, tested by mrg@, thanks
 1.24  07-Aug-2018  maxv Add five errata for AMD Family 17h (Ryzen etc), tested by Patrick Welche,
thanks. Also add two errata for Family 16h, not yet tested, so not yet
enabled.
 1.23  05-Jan-2016  hannken branches: 1.23.10; 1.23.16; 1.23.18;
Adapt prototypes and usage of rdmsr_locked() and wrmsr_locked() to
their implementation. Both functions don't take the passcode as
argument.

As wrmsr_locked() no longer writes the passcode to the msr the
erratum 721 on my Opteron 2356 really gets patched and cc1 no longer
crashes with SIGSEGV.
 1.22  27-Jul-2015  msaitoh KNF.
 1.21  21-Mar-2013  christos branches: 1.21.12; 1.21.14; 1.21.16;
PR/47677 Aktado: x86_errata() should be avoided if NetBSD runs as a KVM guest.
XXX: pullup to 6
 1.20  06-Apr-2012  chs branches: 1.20.2;
bring in this change from openbsd:
Implement the AMD suggested workaround for family 10h & 12h errata 721
"Processor May Incorrectly Update Stack Pointer" by setting a bit
marked 'reserved' in an MSR that is only "documented" to exist on 12h.
 1.19  23-Jul-2010  cegger branches: 1.19.8; 1.19.12; 1.19.14;
use __arraycount
 1.18  25-May-2008  chris branches: 1.18.12; 1.18.18; 1.18.20;
Check for erratum 261 on AMD Family 10h Stepping 3 processors.

Also output any detected errata at verbose, rather than debug, level so
they can be seen with dmesg, and at least have a clue if a BIOS update
would fix the errata.
 1.17  25-May-2008  chris Add detection of errata for AMD Family 10h steppings A and 2. Covering
errata:
254: Internal Resource Livelock Involving Cached TLB Reload
261: Processor May Stall Entering Stop-Grant Due to Pending Data
Cache Scrub
298: L2 Eviction May Occur During Processor Operation To Set
Accessed or Dirty Bit
309: Processor Core May Execute Incorrect Instructions on
Concurrent L2 and Northbridge Response
 1.16  21-May-2008  ad Be a bit less pointed with the errata warning.
 1.15  28-Apr-2008  martin branches: 1.15.2;
Remove clause 3 and 4 from TNF licenses
 1.14  16-Apr-2008  cegger branches: 1.14.2; 1.14.4;
- use aprint_*_dev and device_xname
- use POSIX integer types
 1.13  14-Nov-2007  ad branches: 1.13.14;
- Remove I486_CPU, I586_CPU, I686_CPU options. They buy us nothing and
clutter the code significantly.
- Remove pccons.
 1.12  12-Nov-2007  ad - cpu_vendor was both an int and char[] on amd64 - fix it.
- Run the errata check/patch on all CPUs, not just the boot processor.
 1.11  17-Oct-2007  garbled branches: 1.11.2;
Merge the ppcoea-renovation branch to HEAD.

This branch was a major cleanup and rototill of many of the various OEA
cpu based PPC ports that focused on sharing as much code as possible
between the various ports to eliminate near-identical copies of files in
every tree. Additionally there is a new PIC system that unifies the
interface to interrupt code for all different OEA ppc arches. The work
for this branch was done by a variety of people, too long to list here.

TODO:
bebox still needs work to complete the transition to -renovation.
ofppc still needs a bunch of work, which I will be looking at.
ev64260 still needs to be renovated
amigappc was not attempted.

NOTES:
pmppc was removed as an arch, and moved to a evbppc target.
 1.10  03-Oct-2007  veego branches: 1.10.2;
Add a debug printf (aprint_debug) when an erratum was patched.
 1.9  26-Sep-2007  ad x86 changes for pcc and LKMs.

- Replace most inline assembly with proper functions. As a side effect
this reduces the size of amd64 GENERIC by about 120kB, and i386 by a
smaller amount. Nearly all of the inlines did something slow, or something
that does not need to be fast.
- Make curcpu() and curlwp functions proper, unless __GNUC__ && _KERNEL.
In that case make them inlines. Makes curlwp LKM and preemption safe.
- Make bus_space and bus_dma more LKM friendly.
- Share a few more files between the ports.
- Other minor changes.
 1.8  25-Mar-2007  tls branches: 1.8.4; 1.8.12; 1.8.14; 1.8.16;
Revert revision 1.6: with a -current GENERIC.MP kernel we cannot reproduce
the TLB shootdown IPI storms on any of the machines in question.
 1.7  21-Feb-2007  thorpej branches: 1.7.2; 1.7.6; 1.7.8; 1.7.10;
Replace the Mach-derived boolean_t type with the C99 bool type. A
future commit will replace use of TRUE and FALSE with true and false.
 1.6  05-Feb-2007  ad branches: 1.6.2;
The TLB flush filter workaround causes TLB shootdown storms on our build
machines. Disable it for now until that problem is solved.
 1.5  11-Jan-2007  ad branches: 1.5.2;
x86_errata: correct the definition of MSR_HWCR and re-enable. Problem
noted and debugged by Murray Armfield (murray at river-styx.org).
 1.4  02-Jan-2007  ad - Don't print any specifics unless booted with -d.
- Disable for now, at least one model of CPU throws a GPF.
 1.3  01-Jan-2007  ad Cut size of tables slightly.
 1.2  01-Jan-2007  ad Oops, issue a warning only once.
 1.1  01-Jan-2007  ad Report on and, where possible, try to work around some of the known errata
for Athlon 64 and Opteron processors. Tested briefly by cube@ and elad@.
 1.5.2.3  09-Feb-2007  ad Sync with HEAD.
 1.5.2.2  12-Jan-2007  ad Sync with head.
 1.5.2.1  11-Jan-2007  ad file errata.c was added on branch newlock2 on 2007-01-12 01:01:01 +0000
 1.6.2.2  15-Apr-2007  yamt sync with head.
 1.6.2.1  27-Feb-2007  yamt - sync with head.
- move sched_changepri back to kern_synch.c as it doesn't know PPQ anymore.
 1.7.10.1  29-Mar-2007  reinoud Pullup to -current
 1.7.8.1  11-Jul-2007  mjf Sync with head.
 1.7.6.3  03-Dec-2007  ad Sync with HEAD.
 1.7.6.2  09-Oct-2007  ad Sync with head.
 1.7.6.1  10-Apr-2007  ad Sync with head.
 1.7.2.5  15-Nov-2007  yamt sync with head.
 1.7.2.4  27-Oct-2007  yamt sync with head.
 1.7.2.3  03-Sep-2007  yamt sync with head.
 1.7.2.2  26-Feb-2007  yamt sync with head.
 1.7.2.1  21-Feb-2007  yamt file errata.c was added on branch yamt-lazymbuf on 2007-02-26 09:08:50 +0000
 1.8.16.1  06-Oct-2007  yamt sync with head.
 1.8.14.2  09-Jan-2008  matt sync with HEAD
 1.8.14.1  06-Nov-2007  matt sync with HEAD
 1.8.12.4  21-Nov-2007  joerg Sync with HEAD.
 1.8.12.3  14-Nov-2007  joerg Sync with HEAD.
 1.8.12.2  04-Oct-2007  joerg Sync with HEAD.
 1.8.12.1  02-Oct-2007  joerg Sync with HEAD.
 1.8.4.1  03-Oct-2007  garbled Sync with HEAD
 1.10.2.3  18-Nov-2007  bouyer Sync with HEAD
 1.10.2.2  13-Nov-2007  bouyer Sync with HEAD
 1.10.2.1  17-Oct-2007  bouyer amd64 (aka x86-64) support for Xen. Based on the OpenBSD port done by
Mathieu Ropert in 2006.
DomU-only for now. An INSTALL_XEN3_DOMU kernel with a ramdisk will boot to
sysinst if you're lucky. Often it panics because a runnable LWP has
a NULL stack (really, it's all of l->l_addr which has been zeroed out
while the process was on the queue !)
TODO:
- bug fixes :)
- Most of the xpq_* functions should be shared with xen/i386
- The xen/i386 assembly bootstrap code should be replaced with the C
version in xenamd64/amd64/xpmap.c
- see if a config(5) trick could allow to merge xenamd64 back to xen or amd64.
 1.11.2.1  19-Nov-2007  mjf Sync with HEAD.
 1.13.14.1  02-Jun-2008  mjf Sync with HEAD.
 1.14.4.3  11-Aug-2010  yamt sync with head.
 1.14.4.2  04-May-2009  yamt sync with head.
 1.14.4.1  16-May-2008  yamt sync with head.
 1.14.2.2  04-Jun-2008  yamt sync with head
 1.14.2.1  18-May-2008  yamt sync with head.
 1.15.2.1  23-Jun-2008  wrstuden Sync w/ -current. 34 merge conflicts to follow.
 1.18.20.1  05-Mar-2011  rmind sync with head
 1.18.18.1  17-Aug-2010  uebayasi Sync with HEAD.
 1.18.12.1  24-Oct-2010  jym Sync with HEAD
 1.19.14.2  14-Jul-2016  snj Pull up following revision(s) (requested by hannken in ticket #1361):
sys/arch/x86/include/cpufunc.h: revision 1.19
sys/arch/x86/x86/errata.c: revision 1.23
Adapt prototypes and usage of rdmsr_locked() and wrmsr_locked() to
their implementation. Both functions don't take the passcode as
argument.
As wrmsr_locked() no longer writes the passcode to the msr the
erratum 721 on my Opteron 2356 really gets patched and cc1 no longer
crashes with SIGSEGV.
 1.19.14.1  09-Apr-2012  riz branches: 1.19.14.1.4; 1.19.14.1.6;
Pull up following revision(s) (requested by chs in ticket #168):
sys/arch/x86/include/specialreg.h: revision 1.57
sys/arch/x86/x86/errata.c: revision 1.20
bring in this change from openbsd:
Implement the AMD suggested workaround for family 10h & 12h errata 721
"Processor May Incorrectly Update Stack Pointer" by setting a bit
marked 'reserved' in an MSR that is only "documented" to exist on 12h.
 1.19.14.1.6.1  14-Jul-2016  snj Pull up following revision(s) (requested by hannken in ticket #1361):
sys/arch/x86/include/cpufunc.h: revision 1.19
sys/arch/x86/x86/errata.c: revision 1.23
Adapt prototypes and usage of rdmsr_locked() and wrmsr_locked() to
their implementation. Both functions don't take the passcode as
argument.
As wrmsr_locked() no longer writes the passcode to the msr the
erratum 721 on my Opteron 2356 really gets patched and cc1 no longer
crashes with SIGSEGV.
 1.19.14.1.4.1  14-Jul-2016  snj Pull up following revision(s) (requested by hannken in ticket #1361):
sys/arch/x86/include/cpufunc.h: revision 1.19
sys/arch/x86/x86/errata.c: revision 1.23
Adapt prototypes and usage of rdmsr_locked() and wrmsr_locked() to
their implementation. Both functions don't take the passcode as
argument.
As wrmsr_locked() no longer writes the passcode to the msr the
erratum 721 on my Opteron 2356 really gets patched and cc1 no longer
crashes with SIGSEGV.
 1.19.12.1  29-Apr-2012  mrg sync to latest -current.
 1.19.8.2  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.19.8.1  17-Apr-2012  yamt sync with head
 1.20.2.2  03-Dec-2017  jdolecek update from HEAD
 1.20.2.1  23-Jun-2013  tls resync from head
 1.21.16.1  06-Feb-2016  snj Pull up following revision(s) (requested by hannken in ticket #1073):
sys/arch/x86/x86/errata.c: revision 1.23
sys/arch/x86/include/cpufunc.h: revision 1.19
Adapt prototypes and usage of rdmsr_locked() and wrmsr_locked() to
their implementation. Both functions don't take the passcode as
argument.
As wrmsr_locked() no longer writes the passcode to the msr the
erratum 721 on my Opteron 2356 really gets patched and cc1 no longer
crashes with SIGSEGV.
 1.21.14.2  19-Mar-2016  skrll Sync with HEAD
 1.21.14.1  22-Sep-2015  skrll Sync with HEAD
 1.21.12.1  26-Jan-2016  snj Pull up following revision(s) (requested by hannken in ticket #1073):
sys/arch/x86/x86/errata.c: revision 1.23
sys/arch/x86/include/cpufunc.h: revision 1.19
Adapt prototypes and usage of rdmsr_locked() and wrmsr_locked() to
their implementation. Both functions don't take the passcode as
argument.
As wrmsr_locked() no longer writes the passcode to the msr the
erratum 721 on my Opteron 2356 really gets patched and cc1 no longer
crashes with SIGSEGV.
 1.23.18.1  10-Jun-2019  christos Sync with HEAD
 1.23.16.1  06-Sep-2018  pgoyette Sync with HEAD

Resolve a couple of conflicts (result of the uimin/uimax changes)
 1.23.10.3  27-Jul-2023  martin Pull up following revision(s) (requested by mrg in ticket #1852):

sys/arch/x86/x86/errata.c: revision 1.32

fix the cpuids for the zen2 client CPUs.

i'm not exactly sure how i came up with the values i had, though one
of them was still valid and matched my test systems.
 1.23.10.2  25-Jul-2023  martin Pull up following revision(s) (requested by mrg in ticket #1851):

sys/arch/x86/include/specialreg.h: revision 1.207
sys/arch/x86/x86/errata.c: revision 1.31

x86: turn off zenbleed chicken bit on Zen2 cpus.

this is based upon Taylor's original work. i just made the list
of CPUs to run on as correct as i could determine. (also, add some
Zen3 and Zen4 cpuids not yet used by any errata.)

(might be nice to have a better way to express revision ranges
rather than specific cpuid matches, eg, 0x30-0x4f models in a cpu
family, etc.)

tested on ryzen 3600, and a ported zenbleed PoC that no longer
shows any obtained text. (a similar module-version of it stopped
the PoC on a ryzen 3950x without having to reboot.)

https://www.amd.com/en/resources/product-security/bulletin/amd-sb-7008.html
https://lock.cmpxchg8b.com/zenbleed.html
 1.23.10.1  05-Aug-2020  martin Pull up the following, requested by msaitoh in ticket #1595:

sys/arch/x86/include/specialreg.h 1.129 via patch
sys/arch/x86/x86/errata.c 1.24-1.26

- Add six errata for AMD Family 17h (Ryzen etc), tested by
Patrick Welche and mrg@.
 1.26.2.2  27-Jul-2023  martin Pull up following revision(s) (requested by mrg in ticket #1667):

sys/arch/x86/x86/errata.c: revision 1.32

fix the cpuids for the zen2 client CPUs.

i'm not exactly sure how i came up with the values i had, though one
of them was still valid and matched my test systems.
 1.26.2.1  25-Jul-2023  martin Pull up following revision(s) (requested by mrg in ticket #1664):

sys/arch/x86/include/specialreg.h: revision 1.207
sys/arch/x86/x86/errata.c: revision 1.31

x86: turn off zenbleed chicken bit on Zen2 cpus.

this is based upon Taylor's original work. i just made the list
of CPUs to run on as correct as i could determine. (also, add some
Zen3 and Zen4 cpuids not yet used by any errata.)

(might be nice to have a better way to express revision ranges
rather than specific cpuid matches, eg, 0x30-0x4f models in a cpu
family, etc.)

tested on ryzen 3600, and a ported zenbleed PoC that no longer
shows any obtained text. (a similar module-version of it stopped
the PoC on a ryzen 3950x without having to reboot.)

https://www.amd.com/en/resources/product-security/bulletin/amd-sb-7008.html
https://lock.cmpxchg8b.com/zenbleed.html
 1.27.4.3  03-Oct-2024  martin Pull up following revision(s) (requested by rin in ticket #919):

sys/arch/x86/x86/errata.c: revision 1.28
sys/arch/x86/x86/errata.c: revision 1.29
sys/arch/x86/include/specialreg.h: revision 1.209
usr.sbin/cpuctl/arch/i386.c: revision 1.144
sys/arch/x86/x86/errata.c: revision 1.30
sys/arch/x86/x86/errata.c: revision 1.33
sys/arch/x86/x86/errata.c: revision 1.34
sys/arch/x86/x86/errata.c: revision 1.35
sys/arch/x86/include/specialreg.h: revision 1.210
sys/arch/x86/include/specialreg.h: revision 1.211

x86/errata.c: Link to original AMD errata guide.

This one is no longer updated; need to link to newer ones for
individual families too. That's where all the cryptic nomenclature
comes from here.

x86/errata.c: Say what revision we're searching for.

x86/errata.c: Only say the errata revision search for cpu0.

x86: make the CPUID list for errata be far less confusing
the 0x80000001 CPUID result needs some parsing to match against
actual family/model/stepping values. 4-bit 'family' values of
15 or 6 change how to parse the 4-bit extended model and 8-bit
extended family value - for family 6 or 15, the extended model
bits (4) are concatenated with the base 4-bits to create an
8-bit value, and for family 15, the family value is the sum
of the base family value and the 8-bit extended-family value, giving
a range of 0 to 15 + 0xff aka 270.

use a CPUREV(family, model, stepping) macro that builds the
relevant bit-representation of a CPUID, making it far easier
to understand what each entry means, and to add new ones too.
i have confirmed that the emitted cpurevs[] array has the same
values before/after this change, ie, NFCI or observed.

x86: add names for errata that don't have actual numbers
zenbleed is reported as "erratum 65535" currently, this adds a name
for it, and enables the name for any others as well.
pull logging into a function with a tag message.

x86: handle AMD errata 1474: A CPU core may hang after about 1044 days
from the new comment:
* This requires disabling CC6 power level, which can be a performance
* issue since it stops full turbo in some implementations (eg, half the
* cores must be in CC6 to achieve the highest boost level.) Set a timer
* to fire in 1000 days -- except NetBSD timers end up having a signed
* 32-bit hz-based value, which rolls over in under 25 days with HZ=1000,
* and doing xcall(9) or kthread(9) from a callout is not allowed anyway,
* so just have a kthread wait 1 day for 1000 times.
documented in:
https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/revision-guides/56323-PUB_1_01.pdf

add MSR stuff for AMD errata 1474.

cpuctl: fix i386 bit descriptions for CPUID_SEF_FLAGS1
warning: non-printing character '\31' in description
'BUS_LOCK_DETECT""b\31' [363]
s/RPMQUERY/RMPQUERY/
 1.27.4.2  27-Jul-2023  martin Pull up following revision(s) (requested by mrg in ticket #247):

sys/arch/x86/x86/errata.c: revision 1.32

fix the cpuids for the zen2 client CPUs.

i'm not exactly sure how i came up with the values i had, though one
of them was still valid and matched my test systems.
 1.27.4.1  25-Jul-2023  martin Pull up following revision(s) (requested by mrg in ticket #243):

sys/arch/x86/include/specialreg.h: revision 1.207
sys/arch/x86/x86/errata.c: revision 1.31

x86: turn off zenbleed chicken bit on Zen2 cpus.

this is based upon Taylor's original work. i just made the list
of CPUs to run on as correct as i could determine. (also, add some
Zen3 and Zen4 cpuids not yet used by any errata.)

(might be nice to have a better way to express revision ranges
rather than specific cpuid matches, eg, 0x30-0x4f models in a cpu
family, etc.)

tested on ryzen 3600, and a ported zenbleed PoC that no longer
shows any obtained text. (a similar module-version of it stopped
the PoC on a ryzen 3950x without having to reboot.)

https://www.amd.com/en/resources/product-security/bulletin/amd-sb-7008.html
https://lock.cmpxchg8b.com/zenbleed.html
