History log of /src/sys/dev/tprof
Revision  Date  Author  Comments
 1.1 01-Jan-2008  yamt branches: 1.1.2; 1.1.4; 1.1.6; 1.1.8;
a simple performance monitor based profiler, inspired by Linux oprofile.
 1.1.8.2 18-Feb-2008  mjf Sync with HEAD.
 1.1.8.1 01-Jan-2008  mjf file files.tprof was added on branch mjf-devfs on 2008-02-18 21:06:25 +0000
 1.1.6.2 21-Jan-2008  yamt sync with head
 1.1.6.1 01-Jan-2008  yamt file files.tprof was added on branch yamt-lazymbuf on 2008-01-21 09:44:39 +0000
 1.1.4.2 09-Jan-2008  matt sync with HEAD
 1.1.4.1 01-Jan-2008  matt file files.tprof was added on branch matt-armv6 on 2008-01-09 01:54:35 +0000
 1.1.2.2 02-Jan-2008  bouyer Sync with HEAD
 1.1.2.1 01-Jan-2008  bouyer file files.tprof was added on branch bouyer-xeni386 on 2008-01-02 21:55:16 +0000
 1.23 11-Apr-2023  msaitoh KNF. No functional change.
 1.22 16-Dec-2022  ryo tprof_lock is not a spin mutex. use mutex_{enter,exit}(). oops
 1.21 16-Dec-2022  ryo branches: 1.21.2;
- Add support for select(2)/poll(2) on /dev/tprof.
- Changed sampling buffer switching frequency (which is the frequency of tprof_worker()
calls and also the maximum block time of read(2) of /dev/tprof) from 1sec to 125ms.
This improves tprof top responsiveness.
- The maximum number of sampling buffers is now adjusted according to the number of CPUs.
Previously it was fixed at 100 and was insufficient if ncpu was greater than this.

The maximum number of samples per second per CPU is calculated by
"TPROF_MAX_SAMPLES_PER_BUF * (HZ of tprof_worker)".
Therefore, currently, 10000 * (1000/125) = 80000 maximum samplings per CPU.
The actual value will vary slightly from this due to tprof_worker and read(2) timing.
This value may need to be adjusted more in the future.
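
The arithmetic in the message above can be written out as a small helper. This is only a sketch of the stated formula; the function name and the WORKER_PERIOD_MS macro are made up for illustration and are not taken from tprof.c.

```c
#include <assert.h>

/* Values quoted in the commit message above. */
#define TPROF_MAX_SAMPLES_PER_BUF 10000
#define WORKER_PERIOD_MS          125   /* hypothetical name for the 125ms period */

/*
 * Upper bound on samples per second per CPU:
 * samples per buffer times the number of buffer switches per second.
 */
static unsigned long
max_samples_per_sec(unsigned long samples_per_buf, unsigned long period_ms)
{
	assert(period_ms > 0 && period_ms <= 1000);
	return samples_per_buf * (1000 / period_ms);
}
```

With the values from the message, max_samples_per_sec(TPROF_MAX_SAMPLES_PER_BUF, WORKER_PERIOD_MS) gives 80000; with the previous 1sec period the same buffer size would allow only 10000.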
 1.20 11-Dec-2022  chs make sure error is initialized before we return it.
 1.19 01-Dec-2022  ryo Improve tprof(4)

- Multiple events can now be handled simultaneously.
- Counters should be configured with TPROF_IOC_CONFIGURE_EVENT in advance,
instead of being configured at TPROF_IOC_START.
- The configured counters can be started and stopped repeatedly by
TPROF_IOC_START/TPROF_IOC_STOP.
- The value of the performance counter can be obtained at any time as a 64-bit
value with TPROF_IOC_GETCOUNTS.
- Backend common parts are handled in tprof.c as much as possible, and functions
on the tprof_backend side have been reimplemented to be more primitive.
- The reset value of counter overflows for profiling can now be adjusted.
It is calculated by default from the CPU clock (speed of cycle counter) and
TPROF_HZ, but for some events the value may be too large to be sufficient for
profiling. The event counter can be specified as a ratio to the default or as
an absolute value when configuring the event counter.
- Due to overall changes, API and ABI have been changed. TPROF_VERSION and
TPROF_BACKEND_VERSION were updated.
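
The reset value item above can be illustrated roughly as follows. This is a sketch under assumptions: the log only says the default is derived from the CPU clock and TPROF_HZ, so the division below, the enum, and the numerator/denominator form of the ratio are all hypothetical and do not reflect the actual tprof parameter layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical modes; the real tprof flag names are not given in the log. */
enum reset_mode { RESET_DEFAULT, RESET_RATIO, RESET_ABSOLUTE };

/*
 * Pick the number of events to count before an overflow interrupt fires.
 * cpu_freq_hz:  speed of the cycle counter
 * profile_hz:   desired sampling rate (the role TPROF_HZ plays)
 * value:        ratio numerator or absolute reset value, depending on mode
 * ratio_den:    ratio denominator, only used by RESET_RATIO
 */
static uint64_t
counter_reset_value(uint64_t cpu_freq_hz, uint64_t profile_hz,
    enum reset_mode mode, uint64_t value, uint64_t ratio_den)
{
	assert(profile_hz > 0);
	uint64_t def = cpu_freq_hz / profile_hz;	/* assumed default */

	switch (mode) {
	case RESET_RATIO:
		assert(ratio_den > 0);
		return def * value / ratio_den;	/* fraction of the default */
	case RESET_ABSOLUTE:
		return value;
	case RESET_DEFAULT:
	default:
		return def;
	}
}
```

For an event that fires far less often than a CPU cycle, a ratio such as 1/100 of the default keeps the sampling rate useful, which is the case the commit message describes.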
 1.18 01-Dec-2022  ryo don't call kpreempt_{disable,enable}() from an interrupt handler.

Fixed a problem in which the system would freeze if a high load (e.g., build.sh -j20)
was applied while running `tprof monitor -e LsNotHaltedCyc ...' on x86.

This almost eliminates the problem, but still is not enough. tprof_x86 uses NMI
interrupts, which are interrupted even in splhigh(), leaving the possibility of
being interrupted in the splhigh section of percpu_cpu_swap().
 1.17 28-Mar-2022  riastradh driver(9): devsw_detach never fails. Make it return void.

Prune a whole lotta dead branches as a result of this. (Some logic
calling this is also wrong for other reasons; devsw_detach is final
-- you should never have any reason to decide to roll it back. To be
cleaned up in subsequent commits...)

XXX kernel ABI change to devsw_detach signature requires bump
 1.16 01-Nov-2021  skrll Trailing whitespace
 1.15 27-Nov-2020  riastradh tprof: Use percpu rather than a MAXCPUS-element array.
 1.14 13-Jul-2018  maxv branches: 1.14.6; 1.14.14;
Revamp tprof.

Rewrite the Intel backend to use the generic PMC interface, which is
available on all Intel CPUs. Synchronize the AMD backend with the new
interface.

The kernel identifies the PMC interface, and gives its id to userland.
Userland then queries the events itself (via cpuid etc). These events
depend on the PMC interface.

The tprof utility is rewritten to allow the user to choose which event
to count (which was not possible until now, the event was hardcoded in
the backend). The command line format is based on usr.bin/pmc, eg:

tprof -e llc-misses:k -o output sleep 20

The man page is updated too, but the arguments will likely change soon
anyway so it doesn't matter a lot.

The tprof utility has three tables:

Intel Architectural Version 1
Intel Skylake/Kabylake
AMD Family 10h

A CPU can support a combination of tables. For example Kabylake has
Intel-Architectural-Version-1 and its own Intel-Kabylake table.

For now the Intel Skylake/Kabylake table contains only one event, just
to demonstrate that the combination of tables works. Tested on an
Intel Core i5 Kabylake.

The code for AMD Family 10h is taken from the code I had written for
usr.bin/pmc. I haven't tested it yet, but it's the same as pmc(1), so
I guess it works as-is.

The whole thing is written in such a way that (I think) it is not
complicated to add more CPU models, and more architectures (other than
x86).
 1.13 20-Aug-2015  christos branches: 1.13.8; 1.13.16; 1.13.18;
include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.
 1.12 25-Jul-2014  dholland branches: 1.12.4;
Add d_discard to all struct cdevsw instances I could find.

All have been set to "nodiscard"; some should get a real implementation.
 1.11 16-Mar-2014  dholland branches: 1.11.2;
Change (mostly mechanically) every cdevsw/bdevsw I can find to use
designated initializers.

I have not built every extant kernel so I have probably broken at
least one build; however I've also found and fixed some wrong
cdevsw/bdevsw entries so even if so I think we come out ahead.
 1.10 14-Apr-2011  yamt branches: 1.10.4; 1.10.14; 1.10.18;
for each sample, record and report cpuid and lwpid.
 1.9 25-Feb-2011  yamt tprof_start: don't forget to restore refcount when failed to start backend.
 1.8 05-Feb-2011  yamt tprof: record pid and userland events.
 1.7 11-Aug-2010  pgoyette branches: 1.7.2; 1.7.4;
Keep condvar wmesg within 8-char limit
 1.6 13-Mar-2009  yamt branches: 1.6.2; 1.6.4;
tprof_stop1: add an assertion.
 1.5 11-Mar-2009  yamt fix breakage where db_regs_t != trapframe.
the problem pointed out by Martin Husemann on tech-kern@.
 1.4 10-Mar-2009  yamt - adapt to MODULAR.
- some preparations to have more backends.
- add some comments.
 1.3 20-Jan-2009  yamt branches: 1.3.2;
comment
 1.2 07-May-2008  yamt branches: 1.2.8;
tprof_start: fix workqueue's IPL.
 1.1 01-Jan-2008  yamt branches: 1.1.2; 1.1.4; 1.1.6; 1.1.8; 1.1.14; 1.1.16; 1.1.18;
a simple performance monitor based profiler, inspired by Linux oprofile.
 1.1.18.3 09-Oct-2010  yamt sync with head
 1.1.18.2 04-May-2009  yamt sync with head.
 1.1.18.1 16-May-2008  yamt sync with head.
 1.1.16.1 18-May-2008  yamt sync with head.
 1.1.14.1 02-Jun-2008  mjf Sync with HEAD.
 1.1.8.2 18-Feb-2008  mjf Sync with HEAD.
 1.1.8.1 01-Jan-2008  mjf file tprof.c was added on branch mjf-devfs on 2008-02-18 21:06:25 +0000
 1.1.6.2 21-Jan-2008  yamt sync with head
 1.1.6.1 01-Jan-2008  yamt file tprof.c was added on branch yamt-lazymbuf on 2008-01-21 09:44:40 +0000
 1.1.4.2 09-Jan-2008  matt sync with HEAD
 1.1.4.1 01-Jan-2008  matt file tprof.c was added on branch matt-armv6 on 2008-01-09 01:54:36 +0000
 1.1.2.2 02-Jan-2008  bouyer Sync with HEAD
 1.1.2.1 01-Jan-2008  bouyer file tprof.c was added on branch bouyer-xeni386 on 2008-01-02 21:55:17 +0000
 1.2.8.2 28-Apr-2009  skrll Sync with HEAD.
 1.2.8.1 03-Mar-2009  skrll Sync with HEAD.
 1.3.2.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.6.4.2 21-Apr-2011  rmind sync with head
 1.6.4.1 05-Mar-2011  rmind sync with head
 1.6.2.1 17-Aug-2010  uebayasi Sync with HEAD.
 1.7.4.2 05-Mar-2011  bouyer Sync with HEAD
 1.7.4.1 08-Feb-2011  bouyer Sync with HEAD
 1.7.2.1 06-Jun-2011  jruoho Sync with HEAD.
 1.10.18.1 18-May-2014  rmind sync with head
 1.10.14.2 03-Dec-2017  jdolecek update from HEAD
 1.10.14.1 20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.10.4.1 22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was split into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.11.2.1 10-Aug-2014  tls Rebase.
 1.12.4.1 22-Sep-2015  skrll Sync with HEAD
 1.13.18.1 10-Jun-2019  christos Sync with HEAD
 1.13.16.1 28-Jul-2018  pgoyette Sync with HEAD
 1.13.8.2 29-Apr-2017  pgoyette Revise previous. Rather than explicitly including <sys/localcount.h>
in all the places where {b,c}devsw is initialized, just include it
from <sys/conf.h>. This avoids an include-sequence dependency.
 1.13.8.1 29-Apr-2017  pgoyette Add DEVSW_MODULE_INIT to existing device-driver modules, so that they
will have a localcount defined and thus be permitted to load. Without
a localcount, loading the module will return EINVAL.

XXX the dtrace and drm stuff might need to be fed back upstream?
 1.14.14.1 14-Dec-2020  thorpej Sync w/ HEAD.
 1.14.6.1 01-Aug-2023  martin Pull up the following revisions, requested by msaitoh in ticket #1697:

usr.sbin/tprof/tprof.8 1.16,1.22,1.25,1.29 via patch
usr.sbin/tprof/tprof_analyze.c 1.4
usr.sbin/tprof/arch/tprof_x86.c 1.13-1.19
sys/dev/tprof/tprof.c 1.23 via patch
sys/dev/tprof/tprof_x86_amd.c 1.7-1.8 via patch
sys/dev/tprof/tprof_x86_intel.c 1.8 via patch

- Add AMD family 19h (zen3 and zen4) support.
- Add Intel Comet Lake support.
- Add support for Intel Skylake-X and Cascade Lake.
- Print the path that we failed to open on error.
- Use lowercase consistently for hexadecimal numbers.
- KNF
 1.21.2.2 21-Jun-2023  martin Pull up following revision(s) (requested by msaitoh in ticket #210):

usr.sbin/tprof/tprof.8: revision 1.30
sys/dev/tprof/tprof_x86_amd.c: revision 1.8
sys/dev/tprof/tprof_armv8.c: revision 1.20
sys/dev/tprof/tprof_types.h: revision 1.7
sys/dev/tprof/tprof_x86_intel.c: revision 1.6
sys/dev/tprof/tprof_x86_intel.c: revision 1.7
sys/dev/tprof/tprof_x86_intel.c: revision 1.8
sys/dev/tprof/tprof.c: revision 1.23
usr.sbin/tprof/tprof.8: revision 1.25
usr.sbin/tprof/tprof.8: revision 1.26
usr.sbin/tprof/arch/tprof_x86.c: revision 1.16
usr.sbin/tprof/tprof.8: revision 1.27
usr.sbin/tprof/arch/tprof_x86.c: revision 1.17
usr.sbin/tprof/tprof.8: revision 1.28
usr.sbin/tprof/tprof.h: revision 1.5
usr.sbin/tprof/tprof.8: revision 1.29
sys/dev/tprof/tprof_armv7.c: revision 1.13
usr.sbin/tprof/tprof_top.c: revision 1.9
usr.sbin/tprof/tprof.c: revision 1.21

Add Cometlake support.

Obtain the number of general counters from CPUID 0xa.

Test cpuid_level in tprof_intel_ncounters().
This function is called before tprof_intel_ident().

KNF. No functional change.

Add two notes to the tprof(8) manual page.
- "list" command prints the maximum number of counters that can be used
simultaneously.
- multiple -e arguments can be specified.

Use the default counter if -e argument is not specified.
monitor command:
The default counter is selected if -e argument is not specified.
list command:
Print the name of the default counter for monitor and top command.

tprof.8: new sentence, new line

tprof(8): fix markup nits

tprof.8: fix typo, s/speficied/specified/
 1.21.2.1 23-Dec-2022  martin Pull up following revision(s) (requested by ryo in ticket #20):

sys/arch/arm/arm/cpufunc.c: revision 1.185
sys/dev/tprof/tprof.c: revision 1.22
sys/arch/arm/arm32/arm32_boot.c: revision 1.45
sys/dev/tprof/tprof_armv8.c: revision 1.19
sys/dev/tprof/tprof_armv7.c: revision 1.12
sys/arch/aarch64/aarch64/cpu.c: revision 1.71
sys/arch/aarch64/aarch64/cpu.c: revision 1.72

tprof_lock is not a spin mutex. use mutex_{enter,exit}(). oops

Explicitly disable overflow interrupts before enabling the cycle counter.

PMCR_EL0.LC should be set. ARM deprecates use of PMCR_EL0.LC=0

Even if an overflow interrupt occurs for a counter outside tprof management,
its bit in the overflow status register must be cleared to prevent an interrupt storm.
 1.7 01-Dec-2022  ryo Improve tprof(4)

- Multiple events can now be handled simultaneously.
- Counters should be configured with TPROF_IOC_CONFIGURE_EVENT in advance,
instead of being configured at TPROF_IOC_START.
- The configured counters can be started and stopped repeatedly by
TPROF_IOC_START/TPROF_IOC_STOP.
- The value of the performance counter can be obtained at any time as a 64-bit
value with TPROF_IOC_GETCOUNTS.
- Backend common parts are handled in tprof.c as much as possible, and functions
on the tprof_backend side have been reimplemented to be more primitive.
- The reset value of counter overflows for profiling can now be adjusted.
It is calculated by default from the CPU clock (speed of cycle counter) and
TPROF_HZ, but for some events the value may be too large to be sufficient for
profiling. The event counter can be specified as a ratio to the default or as
an absolute value when configuring the event counter.
- Due to overall changes, API and ABI have been changed. TPROF_VERSION and
TPROF_BACKEND_VERSION were updated.
 1.6 13-Jul-2018  maxv Revamp tprof.

Rewrite the Intel backend to use the generic PMC interface, which is
available on all Intel CPUs. Synchronize the AMD backend with the new
interface.

The kernel identifies the PMC interface, and gives its id to userland.
Userland then queries the events itself (via cpuid etc). These events
depend on the PMC interface.

The tprof utility is rewritten to allow the user to choose which event
to count (which was not possible until now, the event was hardcoded in
the backend). The command line format is based on usr.bin/pmc, eg:

tprof -e llc-misses:k -o output sleep 20

The man page is updated too, but the arguments will likely change soon
anyway so it doesn't matter a lot.

The tprof utility has three tables:

Intel Architectural Version 1
Intel Skylake/Kabylake
AMD Family 10h

A CPU can support a combination of tables. For example Kabylake has
Intel-Architectural-Version-1 and its own Intel-Kabylake table.

For now the Intel Skylake/Kabylake table contains only one event, just
to demonstrate that the combination of tables works. Tested on an
Intel Core i5 Kabylake.

The code for AMD Family 10h is taken from the code I had written for
usr.bin/pmc. I haven't tested it yet, but it's the same as pmc(1), so
I guess it works as-is.

The whole thing is written in such a way that (I think) it is not
complicated to add more CPU models, and more architectures (other than
x86).
 1.5 05-Feb-2011  yamt branches: 1.5.54; 1.5.56;
tprof: record pid and userland events.
 1.4 18-Nov-2009  yamt branches: 1.4.4; 1.4.6; 1.4.8;
comment
 1.3 11-Mar-2009  yamt fix breakage where db_regs_t != trapframe.
the problem pointed out by Martin Husemann on tech-kern@.
 1.2 10-Mar-2009  yamt - adapt to MODULAR.
- some preparations to have more backends.
- add some comments.
 1.1 01-Jan-2008  yamt branches: 1.1.2; 1.1.4; 1.1.6; 1.1.8; 1.1.18; 1.1.26; 1.1.32;
a simple performance monitor based profiler, inspired by Linux oprofile.
 1.1.32.1 13-May-2009  jym Sync with HEAD.

Commit is split, to avoid a "too many arguments" protocol error.
 1.1.26.1 28-Apr-2009  skrll Sync with HEAD.
 1.1.18.2 11-Mar-2010  yamt sync with head
 1.1.18.1 04-May-2009  yamt sync with head.
 1.1.8.2 18-Feb-2008  mjf Sync with HEAD.
 1.1.8.1 01-Jan-2008  mjf file tprof.h was added on branch mjf-devfs on 2008-02-18 21:06:25 +0000
 1.1.6.2 21-Jan-2008  yamt sync with head
 1.1.6.1 01-Jan-2008  yamt file tprof.h was added on branch yamt-lazymbuf on 2008-01-21 09:44:40 +0000
 1.1.4.2 09-Jan-2008  matt sync with HEAD
 1.1.4.1 01-Jan-2008  matt file tprof.h was added on branch matt-armv6 on 2008-01-09 01:54:36 +0000
 1.1.2.2 02-Jan-2008  bouyer Sync with HEAD
 1.1.2.1 01-Jan-2008  bouyer file tprof.h was added on branch bouyer-xeni386 on 2008-01-02 21:55:17 +0000
 1.4.8.1 08-Feb-2011  bouyer Sync with HEAD
 1.4.6.1 06-Jun-2011  jruoho Sync with HEAD.
 1.4.4.1 05-Mar-2011  rmind sync with head
 1.5.56.1 10-Jun-2019  christos Sync with HEAD
 1.5.54.1 28-Jul-2018  pgoyette Sync with HEAD
 1.13 11-Apr-2023  msaitoh KNF. No functional change.
 1.12 22-Dec-2022  ryo Even if an overflow interrupt occurs for a counter outside tprof management,
its bit in the overflow status register must be cleared to prevent an interrupt storm.
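
The idea can be sketched with a simulated write-one-to-clear status register. Everything here is illustrative: fake_ovsr stands in for the hardware overflow status register, and the handler is not the actual tprof_armv7.c interrupt routine.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated write-one-to-clear overflow status register. */
static uint32_t fake_ovsr;

static uint32_t ovsr_read(void) { return fake_ovsr; }
static void ovsr_write(uint32_t bits) { fake_ovsr &= ~bits; }

/* Returns how many pending overflows belong to counters tprof manages. */
static unsigned int
pmu_intr(uint32_t managed_mask)
{
	uint32_t pending = ovsr_read();
	unsigned int handled = 0;

	/*
	 * Acknowledge every pending bit, managed or not. Leaving a foreign
	 * bit set would keep the interrupt asserted and cause a storm.
	 */
	ovsr_write(pending);
	assert((ovsr_read() & pending) == 0);

	/* Count (and, in a real handler, sample) only the managed counters. */
	for (uint32_t bits = pending & managed_mask; bits != 0; bits &= bits - 1)
		handled++;

	return handled;
}
```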
 1.11 03-Dec-2022  ryo branches: 1.11.2;
move ARMv7 PMC register definitions to armreg.h from tprof_armv7.c
 1.10 01-Dec-2022  ryo Improve tprof(4)

- Multiple events can now be handled simultaneously.
- Counters should be configured with TPROF_IOC_CONFIGURE_EVENT in advance,
instead of being configured at TPROF_IOC_START.
- The configured counters can be started and stopped repeatedly by
TPROF_IOC_START/TPROF_IOC_STOP.
- The value of the performance counter can be obtained at any time as a 64-bit
value with TPROF_IOC_GETCOUNTS.
- Backend common parts are handled in tprof.c as much as possible, and functions
on the tprof_backend side have been reimplemented to be more primitive.
- The reset value of counter overflows for profiling can now be adjusted.
It is calculated by default from the CPU clock (speed of cycle counter) and
TPROF_HZ, but for some events the value may be too large to be sufficient for
profiling. The event counter can be specified as a ratio to the default or as
an absolute value when configuring the event counter.
- Due to overall changes, API and ABI have been changed. TPROF_VERSION and
TPROF_BACKEND_VERSION were updated.
 1.9 01-Dec-2022  ryo tprof_armv7 initializes on each CPU, like tprof_armv8.
 1.8 01-Dec-2022  ryo PMCR.E should not be disabled from tprof.

PMCR.E controls not only performance event counters but also the cycle
counter operation, and the cycle counter may be used for cpu_counter.
Similarly, the 31st bit in PMINTENCLR and PMCNTENCLR controls the cycle
counter, not performance event counters, and should not be modified.
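
A minimal sketch of the masking this implies: when building a value to write to PMCNTENCLR or PMINTENCLR, only the low event-counter bits are included and bit 31 is always left out. The macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 31 of PMCNTENCLR/PMINTENCLR controls the cycle counter. */
#define PMU_CYCLE_COUNTER_BIT	(UINT32_C(1) << 31)

/*
 * Build a clear mask covering the first ncounters event counters.
 * The cycle counter bit is never set, so cpu_counter keeps running.
 */
static uint32_t
event_counter_clear_mask(unsigned int ncounters)
{
	assert(ncounters <= 31);	/* at most 31 event counters exist */

	uint32_t mask = (ncounters == 0) ? 0 : (UINT32_MAX >> (32 - ncounters));
	return mask & ~PMU_CYCLE_COUNTER_BIT;
}
```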
 1.7 01-Nov-2022  jmcneill Add support for Cortex-A9.
 1.6 26-Nov-2021  christos declare xc
 1.5 25-Nov-2021  skrll Improve error handling.

Hypervisors can return a PMCR.N of 0.
 1.4 30-Oct-2020  skrll Retire arm_[di]sb in favour of the isb() and dsb(sy) macro invocations.
 1.3 24-Feb-2020  rin 0x%#x --> %#x for non-external codes.
Also, stop mixing up 0x%x and %#x in single files as far as possible.
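
The reason "0x%#x" is wrong is that the # flag already emits the 0x prefix for nonzero values, so the prefix ends up doubled. A small sketch with hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Old style: literal "0x" plus the '#' flag, which doubles the prefix. */
static int
format_hex_doubled(char *buf, size_t len, unsigned int v)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "0x%#x", v);
}

/* Fixed style: let the '#' flag supply the prefix on its own. */
static int
format_hex_fixed(char *buf, size_t len, unsigned int v)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%#x", v);
}
```

For the value 0x1f the first helper produces "0x0x1f" and the second "0x1f".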
 1.2 16-Jul-2018  jmcneill branches: 1.2.2; 1.2.8; 1.2.12;
RW fields in performance monitor registers are reset to architecturally
UNKNOWN values. Initialize the PMU to a known state - all interrupts and
counters disabled, performance monitor disabled, and user access disabled.
 1.1 15-Jul-2018  jmcneill Add tprof backend for ARMv7 performance monitors.
 1.2.12.1 29-Feb-2020  ad Sync with head.
 1.2.8.3 08-Apr-2020  martin Merge changes from current as of 20200406
 1.2.8.2 10-Jun-2019  christos Sync with HEAD
 1.2.8.1 16-Jul-2018  christos file tprof_armv7.c was added on branch phil-wifi on 2019-06-10 22:07:33 +0000
 1.2.2.2 28-Jul-2018  pgoyette Sync with HEAD
 1.2.2.1 16-Jul-2018  pgoyette file tprof_armv7.c was added on branch pgoyette-compat on 2018-07-28 04:37:57 +0000
 1.11.2.2 21-Jun-2023  martin Pull up following revision(s) (requested by msaitoh in ticket #210):

usr.sbin/tprof/tprof.8: revision 1.30
sys/dev/tprof/tprof_x86_amd.c: revision 1.8
sys/dev/tprof/tprof_armv8.c: revision 1.20
sys/dev/tprof/tprof_types.h: revision 1.7
sys/dev/tprof/tprof_x86_intel.c: revision 1.6
sys/dev/tprof/tprof_x86_intel.c: revision 1.7
sys/dev/tprof/tprof_x86_intel.c: revision 1.8
sys/dev/tprof/tprof.c: revision 1.23
usr.sbin/tprof/tprof.8: revision 1.25
usr.sbin/tprof/tprof.8: revision 1.26
usr.sbin/tprof/arch/tprof_x86.c: revision 1.16
usr.sbin/tprof/tprof.8: revision 1.27
usr.sbin/tprof/arch/tprof_x86.c: revision 1.17
usr.sbin/tprof/tprof.8: revision 1.28
usr.sbin/tprof/tprof.h: revision 1.5
usr.sbin/tprof/tprof.8: revision 1.29
sys/dev/tprof/tprof_armv7.c: revision 1.13
usr.sbin/tprof/tprof_top.c: revision 1.9
usr.sbin/tprof/tprof.c: revision 1.21

Add Cometlake support.

Obtain the number of general counters from CPUID 0xa.

Test cpuid_level in tprof_intel_ncounters().
This function is called before tprof_intel_ident().

KNF. No functional change.

Add two notes to the tprof(8) manual page.
- "list" command prints the maximum number of counters that can be used
simultaneously.
- multiple -e arguments can be specified.

Use the default counter if -e argument is not specified.
monitor command:
The default counter is selected if -e argument is not specified.
list command:
Print the name of the default counter for monitor and top command.

tprof.8: new sentence, new line

tprof(8): fix markup nits

tprof.8: fix typo, s/speficied/specified/
 1.11.2.1 23-Dec-2022  martin Pull up following revision(s) (requested by ryo in ticket #20):

sys/arch/arm/arm/cpufunc.c: revision 1.185
sys/dev/tprof/tprof.c: revision 1.22
sys/arch/arm/arm32/arm32_boot.c: revision 1.45
sys/dev/tprof/tprof_armv8.c: revision 1.19
sys/dev/tprof/tprof_armv7.c: revision 1.12
sys/arch/aarch64/aarch64/cpu.c: revision 1.71
sys/arch/aarch64/aarch64/cpu.c: revision 1.72

tprof_lock is not a spin mutex. use mutex_{enter,exit}(). oops

Explicitly disable overflow interrupts before enabling the cycle counter.

PMCR_EL0.LC should be set. ARM deprecates use of PMCR_EL0.LC=0

Even if an overflow interrupt occurs for a counter outside tprof management,
its bit in the overflow status register must be cleared to prevent an interrupt storm.
 1.1 15-Jul-2018  jmcneill branches: 1.1.2; 1.1.8;
Add tprof backend for ARMv7 performance monitors.
 1.1.8.2 10-Jun-2019  christos Sync with HEAD
 1.1.8.1 15-Jul-2018  christos file tprof_armv7.h was added on branch phil-wifi on 2019-06-10 22:07:33 +0000
 1.1.2.2 28-Jul-2018  pgoyette Sync with HEAD
 1.1.2.1 15-Jul-2018  pgoyette file tprof_armv7.h was added on branch pgoyette-compat on 2018-07-28 04:37:57 +0000
 1.20 11-Apr-2023  msaitoh KNF. No functional change.
 1.19 22-Dec-2022  ryo Even if an overflow interrupt occurs for a counter outside tprof management,
its bit in the overflow status register must be cleared to prevent an interrupt storm.
 1.18 01-Dec-2022  ryo branches: 1.18.2;
Improve tprof(4)

- Multiple events can now be handled simultaneously.
- Counters should be configured with TPROF_IOC_CONFIGURE_EVENT in advance,
instead of being configured at TPROF_IOC_START.
- The configured counters can be started and stopped repeatedly by
TPROF_IOC_START/TPROF_IOC_STOP.
- The value of the performance counter can be obtained at any time as a 64-bit
value with TPROF_IOC_GETCOUNTS.
- Backend common parts are handled in tprof.c as much as possible, and functions
on the tprof_backend side have been reimplemented to be more primitive.
- The reset value of counter overflows for profiling can now be adjusted.
It is calculated by default from the CPU clock (speed of cycle counter) and
TPROF_HZ, but for some events the value may be too large to be sufficient for
profiling. The event counter can be specified as a ratio to the default or as
an absolute value when configuring the event counter.
- Due to overall changes, API and ABI have been changed. TPROF_VERSION and
TPROF_BACKEND_VERSION were updated.
 1.17 01-Dec-2022  ryo PMCR.E should not be disabled from tprof.

PMCR.E controls not only performance event counters but also the cycle
counter operation, and the cycle counter may be used for cpu_counter.
Similarly, the 31st bit in PMINTENCLR and PMCNTENCLR controls the cycle
counter, not performance event counters, and should not be modified.
 1.16 10-Nov-2022  ryo revert my previous commit.

since armv8_pmu_init is only called when the PMU is reliably reported by ACPI or fdt,
there is no need for dynamic checks.

pointed out by jmcneill@, thanks
 1.15 09-Nov-2022  ryo If the hardware does not support PMU, return an error instead of KASSERT.
 1.14 16-May-2022  jmcneill tprof: armv8: Only attach to known PMU types.
 1.13 03-Dec-2021  skrll fix the typo that martin spotted.
 1.12 03-Dec-2021  skrll Add a comment and simplify the code ever so slightly.
 1.11 03-Dec-2021  skrll Use the first (not second) event counter as there might only be one
available.
 1.10 26-Nov-2021  christos declare xc
 1.9 25-Nov-2021  skrll Improve error handling.

Hypervisors can return a PMCR.N of 0.
 1.8 01-Nov-2021  skrll Trailing whitespace
 1.7 26-Sep-2021  jmcneill Make sure setup happens on all CPUs.
 1.6 30-Oct-2020  skrll Retire arm_[di]sb in favour of the isb() and dsb(sy) macro invocations.
 1.5 30-Mar-2020  jmcneill Enable the cycle counter when a CPU hatches and store an estimate of the
frequency in ci_data.cpu_cc_freq.
 1.4 17-Jul-2018  christos branches: 1.4.2; 1.4.8;
use PRI?64 instead of ll?
 1.3 16-Jul-2018  jmcneill Spaces -> tabs
 1.2 16-Jul-2018  jmcneill RW fields in performance monitor registers are reset to architecturally
UNKNOWN values. Initialize the PMU to a known state - all interrupts and
counters disabled, performance monitor disabled, and user access disabled.
 1.1 15-Jul-2018  jmcneill Add tprof backend for ARMv8 performance monitors.
 1.4.8.3 08-Apr-2020  martin Merge changes from current as of 20200406
 1.4.8.2 10-Jun-2019  christos Sync with HEAD
 1.4.8.1 17-Jul-2018  christos file tprof_armv8.c was added on branch phil-wifi on 2019-06-10 22:07:33 +0000
 1.4.2.2 28-Jul-2018  pgoyette Sync with HEAD
 1.4.2.1 17-Jul-2018  pgoyette file tprof_armv8.c was added on branch pgoyette-compat on 2018-07-28 04:37:57 +0000
 1.18.2.2 21-Jun-2023  martin Pull up following revision(s) (requested by msaitoh in ticket #210):

usr.sbin/tprof/tprof.8: revision 1.30
sys/dev/tprof/tprof_x86_amd.c: revision 1.8
sys/dev/tprof/tprof_armv8.c: revision 1.20
sys/dev/tprof/tprof_types.h: revision 1.7
sys/dev/tprof/tprof_x86_intel.c: revision 1.6
sys/dev/tprof/tprof_x86_intel.c: revision 1.7
sys/dev/tprof/tprof_x86_intel.c: revision 1.8
sys/dev/tprof/tprof.c: revision 1.23
usr.sbin/tprof/tprof.8: revision 1.25
usr.sbin/tprof/tprof.8: revision 1.26
usr.sbin/tprof/arch/tprof_x86.c: revision 1.16
usr.sbin/tprof/tprof.8: revision 1.27
usr.sbin/tprof/arch/tprof_x86.c: revision 1.17
usr.sbin/tprof/tprof.8: revision 1.28
usr.sbin/tprof/tprof.h: revision 1.5
usr.sbin/tprof/tprof.8: revision 1.29
sys/dev/tprof/tprof_armv7.c: revision 1.13
usr.sbin/tprof/tprof_top.c: revision 1.9
usr.sbin/tprof/tprof.c: revision 1.21

Add Cometlake support.

Obtain the number of general counters from CPUID 0xa.

Test cpuid_level in tprof_intel_ncounters().
This function is called before tprof_intel_ident().

KNF. No functional change.

Add two notes to the tprof(8) manual page.
- "list" command prints the maximum number of counters that can be used
simultaneously.
- multiple -e arguments can be specified.

Use the default counter if -e argument is not specified.
monitor command:
The default counter is selected if -e argument is not specified.
list command:
Print the name of the default counter for monitor and top command.

tprof.8: new sentence, new line

tprof(8): fix markup nits

tprof.8: fix typo, s/speficied/specified/
 1.18.2.1 23-Dec-2022  martin Pull up following revision(s) (requested by ryo in ticket #20):

sys/arch/arm/arm/cpufunc.c: revision 1.185
sys/dev/tprof/tprof.c: revision 1.22
sys/arch/arm/arm32/arm32_boot.c: revision 1.45
sys/dev/tprof/tprof_armv8.c: revision 1.19
sys/dev/tprof/tprof_armv7.c: revision 1.12
sys/arch/aarch64/aarch64/cpu.c: revision 1.71
sys/arch/aarch64/aarch64/cpu.c: revision 1.72

tprof_lock is not a spin mutex. use mutex_{enter,exit}(). oops

Explicitly disable overflow interrupts before enabling the cycle counter.

PMCR_EL0.LC should be set. ARM deprecates use of PMCR_EL0.LC=0

Even if an overflow interrupt occurs for a counter outside tprof management,
its bit in the overflow status register must be cleared to prevent an interrupt storm.
 1.2 16-May-2022  jmcneill tprof: armv8: Only attach to known PMU types.
 1.1 15-Jul-2018  jmcneill branches: 1.1.2; 1.1.8;
Add tprof backend for ARMv8 performance monitors.
 1.1.8.2 10-Jun-2019  christos Sync with HEAD
 1.1.8.1 15-Jul-2018  christos file tprof_armv8.h was added on branch phil-wifi on 2019-06-10 22:07:33 +0000
 1.1.2.2 28-Jul-2018  pgoyette Sync with HEAD
 1.1.2.1 15-Jul-2018  pgoyette file tprof_armv8.h was added on branch pgoyette-compat on 2018-07-28 04:37:57 +0000
 1.5 01-Dec-2022  ryo Improve tprof(4)

- Multiple events can now be handled simultaneously.
- Counters should be configured with TPROF_IOC_CONFIGURE_EVENT in advance,
instead of being configured at TPROF_IOC_START.
- The configured counters can be started and stopped repeatedly by
TPROF_IOC_START/TPROF_IOC_STOP.
- The value of the performance counter can be obtained at any time as a 64-bit
value with TPROF_IOC_GETCOUNTS.
- Backend common parts are handled in tprof.c as much as possible, and functions
on the tprof_backend side have been reimplemented to be more primitive.
- The reset value of counter overflows for profiling can now be adjusted.
It is calculated by default from the CPU clock (speed of cycle counter) and
TPROF_HZ, but for some events the value may be too large to be sufficient for
profiling. The event counter can be specified as a ratio to the default or as
an absolute value when configuring the event counter.
- Due to overall changes, API and ABI have been changed. TPROF_VERSION and
TPROF_BACKEND_VERSION were updated.
 1.4 13-Jul-2018  maxv Revamp tprof.

Rewrite the Intel backend to use the generic PMC interface, which is
available on all Intel CPUs. Synchronize the AMD backend with the new
interface.

The kernel identifies the PMC interface, and gives its id to userland.
Userland then queries the events itself (via cpuid etc). These events
depend on the PMC interface.

The tprof utility is rewritten to allow the user to choose which event
to count (which was not possible until now, the event was hardcoded in
the backend). The command line format is based on usr.bin/pmc, eg:

tprof -e llc-misses:k -o output sleep 20

The man page is updated too, but the arguments will likely change soon
anyway so it doesn't matter a lot.

The tprof utility has three tables:

Intel Architectural Version 1
Intel Skylake/Kabylake
AMD Family 10h

A CPU can support a combination of tables. For example Kabylake has
Intel-Architectural-Version-1 and its own Intel-Kabylake table.

For now the Intel Skylake/Kabylake table contains only one event, just
to demonstrate that the combination of tables works. Tested on an
Intel Core i5 Kabylake.

The code for AMD Family 10h is taken from the code I had written for
usr.bin/pmc. I haven't tested it yet, but it's the same as pmc(1), so
I guess it works as-is.

The whole thing is written in such a way that (I think) it is not
complicated to add more CPU models, and more architectures (other than
x86).
 1.3 14-Apr-2011  yamt branches: 1.3.54; 1.3.56;
for each sample, record and report cpuid and lwpid.
 1.2 05-Feb-2011  yamt tprof: record pid and userland events.
 1.1 01-Jan-2008  yamt branches: 1.1.2; 1.1.4; 1.1.6; 1.1.8; 1.1.40; 1.1.46; 1.1.48;
a simple performance monitor based profiler, inspired by Linux oprofile.
 1.1.48.1 08-Feb-2011  bouyer Sync with HEAD
 1.1.46.1 06-Jun-2011  jruoho Sync with HEAD.
 1.1.40.2 21-Apr-2011  rmind sync with head
 1.1.40.1 05-Mar-2011  rmind sync with head
 1.1.8.2 18-Feb-2008  mjf Sync with HEAD.
 1.1.8.1 01-Jan-2008  mjf file tprof_ioctl.h was added on branch mjf-devfs on 2008-02-18 21:06:25 +0000
 1.1.6.2 21-Jan-2008  yamt sync with head
 1.1.6.1 01-Jan-2008  yamt file tprof_ioctl.h was added on branch yamt-lazymbuf on 2008-01-21 09:44:40 +0000
 1.1.4.2 09-Jan-2008  matt sync with HEAD
 1.1.4.1 01-Jan-2008  matt file tprof_ioctl.h was added on branch matt-armv6 on 2008-01-09 01:54:37 +0000
 1.1.2.2 02-Jan-2008  bouyer Sync with HEAD
 1.1.2.1 01-Jan-2008  bouyer file tprof_ioctl.h was added on branch bouyer-xeni386 on 2008-01-02 21:55:18 +0000
 1.3.56.1 10-Jun-2019  christos Sync with HEAD
 1.3.54.1 28-Jul-2018  pgoyette Sync with HEAD
 1.7 11-Apr-2023  msaitoh KNF. No functional change.
 1.6 01-Dec-2022  ryo branches: 1.6.2;
Improve tprof(4)

- Multiple events can now be handled simultaneously.
- Counters should be configured with TPROF_IOC_CONFIGURE_EVENT in advance,
instead of being configured at TPROF_IOC_START.
- The configured counters can be started and stopped repeatedly by
TPROF_IOC_START/TPROF_IOC_STOP.
- The value of the performance counter can be obtained at any time as a 64-bit
value with TPROF_IOC_GETCOUNTS.
- Backend common parts are handled in tprof.c as much as possible, and functions
on the tprof_backend side have been reimplemented to be more primitive.
- The reset value of counter overflows for profiling can now be adjusted.
It is calculated by default from the CPU clock (speed of cycle counter) and
TPROF_HZ, but for some events the default may be too large to collect enough
samples for profiling. The reset value can be specified as a ratio to the
default or as an absolute value when configuring the event counter.
- Due to overall changes, API and ABI have been changed. TPROF_VERSION and
TPROF_BACKEND_VERSION were updated.
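
The reset value arithmetic described in this log entry can be illustrated with a small sketch. This is not the code in tprof.c: the function names, the `sample_hz` parameter (standing in for TPROF_HZ) and the `counter_bits` parameter are assumptions made for illustration. It assumes a counter that counts upward and raises an overflow interrupt when it wraps.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: default reset value for a cycle counter, chosen so
 * that the counter overflows about sample_hz times per second.
 */
static uint64_t
default_reset_value(uint64_t cpu_freq_hz, uint32_t sample_hz)
{
	assert(sample_hz != 0);
	return cpu_freq_hz / sample_hz;
}

/*
 * Hypothetical helper: initial value to load into an up-counting counter of
 * counter_bits width so that it wraps after `reset` events.
 */
static uint64_t
counter_start_value(uint64_t reset, unsigned int counter_bits)
{
	uint64_t mask;

	assert(counter_bits >= 1 && counter_bits <= 64);
	mask = (counter_bits == 64) ?
	    UINT64_MAX : ((UINT64_C(1) << counter_bits) - 1);

	/* Two's complement negation, truncated to the counter width. */
	return (UINT64_C(0) - reset) & mask;
}
```

With a 2 GHz clock and a sampling rate of 1000 Hz, the default reset value in this sketch is 2000000 cycles. An event that fires far less often than a cycle would rarely reach that count, which is why a smaller ratio or an absolute value is useful.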
 1.5 15-Jul-2018  jmcneill Add TPROF_IDENT_ARMV7_GENERIC
 1.4 15-Jul-2018  jmcneill Define TPROF_IDENT_ARMV8_GENERIC
 1.3 13-Jul-2018  maxv Revamp tprof.

Rewrite the Intel backend to use the generic PMC interface, which is
available on all Intel CPUs. Synchronize the AMD backend with the new
interface.

The kernel identifies the PMC interface, and gives its id to userland.
Userland then queries the events itself (via cpuid etc). These events
depend on the PMC interface.

The tprof utility is rewritten to allow the user to choose which event
to count (this was not possible until now; the event was hardcoded in
the backend). The command line format is based on usr.bin/pmc, e.g.:

tprof -e llc-misses:k -o output sleep 20

The man page is updated too, but the arguments will likely change soon
anyway so it doesn't matter a lot.

The tprof utility has three tables:

Intel Architectural Version 1
Intel Skylake/Kabylake
AMD Family 10h

A CPU can support a combination of tables. For example Kabylake has
Intel-Architectural-Version-1 and its own Intel-Kabylake table.

For now the Intel Skylake/Kabylake table contains only one event, just
to demonstrate that the combination of tables works. Tested on an
Intel Core i5 Kabylake.

The code for AMD Family 10h is taken from the code I had written for
usr.bin/pmc. I haven't tested it yet, but it's the same as pmc(1), so
I guess it works as-is.

The whole thing is written in such a way that (I think) it is not
complicated to add more CPU models, and more architectures (other than
x86).
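
The combination of event tables mentioned in this log entry can be pictured as a chain of tables that is searched in order. The sketch below is only an illustration under that assumption: the structure names, the `lookup_event` function and the placeholder model-specific event are hypothetical and are not taken from the tprof sources. The two architectural event encodings shown (0x3c/0x00 and 0x2e/0x41) are the standard Intel architectural ones.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types; not the structures used by tprof(8). */
struct event_name {
	const char *name;
	unsigned int event;	/* event select code */
	unsigned int umask;	/* unit mask */
};

struct event_table {
	const char *tablename;
	const struct event_name *names;
	size_t nevents;
	const struct event_table *next;	/* next table this CPU supports */
};

static const struct event_name arch_v1_names[] = {
	{ "unhalted-core-cycles", 0x3c, 0x00 },
	{ "llc-misses", 0x2e, 0x41 },
};

static const struct event_table arch_v1_table = {
	"Intel Architectural Version 1", arch_v1_names,
	sizeof(arch_v1_names) / sizeof(arch_v1_names[0]), NULL
};

static const struct event_name model_names[] = {
	{ "example-model-event", 0x00, 0x00 },	/* placeholder, not real */
};

/* The model-specific table chains to the architectural table. */
static const struct event_table model_table = {
	"Intel Skylake/Kabylake", model_names,
	sizeof(model_names) / sizeof(model_names[0]), &arch_v1_table
};

/* Walk the chain of tables; return the first match, or NULL if none. */
static const struct event_name *
lookup_event(const struct event_table *table, const char *name)
{
	assert(name != NULL);
	for (; table != NULL; table = table->next) {
		for (size_t i = 0; i < table->nevents; i++) {
			if (strcmp(table->names[i].name, name) == 0)
				return &table->names[i];
		}
	}
	return NULL;
}
```

Starting the search at `model_table` finds both model-specific and architectural names, while a CPU that only supports the architectural table would start at `arch_v1_table`.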
 1.2 14-Apr-2011  yamt branches: 1.2.4; 1.2.56; 1.2.58;
for each sample, record and report cpuid and lwpid.
 1.1 05-Feb-2011  yamt branches: 1.1.2; 1.1.4;
tprof: record pid and userland events.
 1.1.4.3 21-Apr-2011  rmind sync with head
 1.1.4.2 05-Mar-2011  rmind sync with head
 1.1.4.1 05-Feb-2011  rmind file tprof_types.h was added on branch rmind-uvmplock on 2011-03-05 20:54:10 +0000
 1.1.2.2 08-Feb-2011  bouyer Sync with HEAD
 1.1.2.1 05-Feb-2011  bouyer file tprof_types.h was added on branch bouyer-quota2 on 2011-02-08 16:19:55 +0000
 1.2.58.1 10-Jun-2019  christos Sync with HEAD
 1.2.56.1 28-Jul-2018  pgoyette Sync with HEAD
 1.2.4.2 06-Jun-2011  jruoho Sync with HEAD.
 1.2.4.1 14-Apr-2011  jruoho file tprof_types.h was added on branch jruoho-x86intr on 2011-06-06 09:08:40 +0000
 1.6.2.1 21-Jun-2023  martin Pull up following revision(s) (requested by msaitoh in ticket #210):

usr.sbin/tprof/tprof.8: revision 1.30
sys/dev/tprof/tprof_x86_amd.c: revision 1.8
sys/dev/tprof/tprof_armv8.c: revision 1.20
sys/dev/tprof/tprof_types.h: revision 1.7
sys/dev/tprof/tprof_x86_intel.c: revision 1.6
sys/dev/tprof/tprof_x86_intel.c: revision 1.7
sys/dev/tprof/tprof_x86_intel.c: revision 1.8
sys/dev/tprof/tprof.c: revision 1.23
usr.sbin/tprof/tprof.8: revision 1.25
usr.sbin/tprof/tprof.8: revision 1.26
usr.sbin/tprof/arch/tprof_x86.c: revision 1.16
usr.sbin/tprof/tprof.8: revision 1.27
usr.sbin/tprof/arch/tprof_x86.c: revision 1.17
usr.sbin/tprof/tprof.8: revision 1.28
usr.sbin/tprof/tprof.h: revision 1.5
usr.sbin/tprof/tprof.8: revision 1.29
sys/dev/tprof/tprof_armv7.c: revision 1.13
usr.sbin/tprof/tprof_top.c: revision 1.9
usr.sbin/tprof/tprof.c: revision 1.21

Add Cometlake support.

Obtain the number of general counters from CPUID 0xa.

Test cpuid_level in tprof_intel_ncounters().
This function is called before tprof_intel_ident().

KNF. No functional change.

Add two notes to the tprof(8) manual page.
- "list" command prints the maximum number of counters that can be used
simultaneously.
- multiple -e arguments can be specified.

Use the default counter if -e argument is not specified.
monitor command:
The default counter is selected if -e argument is not specified.
list command:
Print the name of the default counter for monitor and top command.

tprof.8: new sentence, new line

tprof(8): fix markup nits

tprof.8: fix typo, s/speficied/specified/
 1.2 01-Dec-2022  ryo Improve tprof(4)

- Multiple events can now be handled simultaneously.
- Counters should be configured with TPROF_IOC_CONFIGURE_EVENT in advance,
instead of being configured at TPROF_IOC_START.
- The configured counters can be started and stopped repeatedly by
TPROF_IOC_START/TPROF_IOC_STOP.
- The value of the performance counter can be obtained at any time as a 64-bit
value with TPROF_IOC_GETCOUNTS.
- Backend common parts are handled in tprof.c as much as possible, and functions
on the tprof_backend side have been reimplemented to be more primitive.
- The reset value of counter overflows for profiling can now be adjusted.
It is calculated by default from the CPU clock (speed of cycle counter) and
TPROF_HZ, but for some events the default may be too large to collect enough
samples for profiling. The reset value can be specified as a ratio to the
default or as an absolute value when configuring the event counter.
- Due to overall changes, API and ABI have been changed. TPROF_VERSION and
TPROF_BACKEND_VERSION were updated.
 1.1 24-Jul-2018  maxv branches: 1.1.2; 1.1.8;
Merge the tprof_pmi and tprof_amdpmi modules into a single tprof_x86
module.
 1.1.8.2 10-Jun-2019  christos Sync with HEAD
 1.1.8.1 24-Jul-2018  christos file tprof_x86.c was added on branch phil-wifi on 2019-06-10 22:07:33 +0000
 1.1.2.2 28-Jul-2018  pgoyette Sync with HEAD
 1.1.2.1 24-Jul-2018  pgoyette file tprof_x86.c was added on branch pgoyette-compat on 2018-07-28 04:37:57 +0000
 1.8 11-Apr-2023  msaitoh KNF. No functional change.
 1.7 08-Dec-2022  msaitoh branches: 1.7.2;
Add AMD family 19h (zen3 and zen4) support to tprof.
 1.6 01-Dec-2022  ryo Improve tprof(4)

- Multiple events can now be handled simultaneously.
- Counters should be configured with TPROF_IOC_CONFIGURE_EVENT in advance,
instead of being configured at TPROF_IOC_START.
- The configured counters can be started and stopped repeatedly by
TPROF_IOC_START/TPROF_IOC_STOP.
- The value of the performance counter can be obtained at any time as a 64-bit
value with TPROF_IOC_GETCOUNTS.
- Backend common parts are handled in tprof.c as much as possible, and functions
on the tprof_backend side have been reimplemented to be more primitive.
- The reset value of counter overflows for profiling can now be adjusted.
It is calculated by default from the CPU clock (speed of cycle counter) and
TPROF_HZ, but for some events the default may be too large to collect enough
samples for profiling. The reset value can be specified as a ratio to the
default or as an absolute value when configuring the event counter.
- Due to overall changes, API and ABI have been changed. TPROF_VERSION and
TPROF_BACKEND_VERSION were updated.
 1.5 11-Oct-2019  jmcneill Match Family 15h
 1.4 14-Jun-2019  msaitoh branches: 1.4.2;
Fix compile error (s/LAPIC_PCINT/LAPIC_LVT_PCINT/)
 1.3 29-May-2019  maxv branches: 1.3.2;
Add support for AMD Family 17h.
 1.2 24-Jul-2018  maxv branches: 1.2.2;
Merge the tprof_pmi and tprof_amdpmi modules into a single tprof_x86
module.
 1.1 16-Jul-2018  maxv Move
arch/x86/x86/tprof_pmi.c
arch/x86/x86/tprof_amdpmi.c
into
dev/tprof/tprof_x86_intel.c
dev/tprof/tprof_x86_amd.c
 1.2.2.2 28-Jul-2018  pgoyette Sync with HEAD
 1.2.2.1 24-Jul-2018  pgoyette file tprof_x86_amd.c was added on branch pgoyette-compat on 2018-07-28 04:37:57 +0000
 1.3.2.3 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.3.2.2 10-Jun-2019  christos Sync with HEAD
 1.3.2.1 29-May-2019  christos file tprof_x86_amd.c was added on branch phil-wifi on 2019-06-10 22:07:33 +0000
 1.4.2.2 01-Aug-2023  martin Pull up the following revisions, requested by msaitoh in ticket #1697:

usr.sbin/tprof/tprof.8 1.16,1.22,1.25,1.29 via patch
usr.sbin/tprof/tprof_analyze.c 1.4
usr.sbin/tprof/arch/tprof_x86.c 1.13-1.19
sys/dev/tprof/tprof.c 1.23 via patch
sys/dev/tprof/tprof_x86_amd.c 1.7-1.8 via patch
sys/dev/tprof/tprof_x86_intel.c 1.8 via patch

- Add AMD family 19h (zen3 and zen4) support.
- Add Intel Comet Lake support.
- Add support for Intel Skylake-X and Cascade Lake.
- Print the path that we failed to open on error.
- Use lowercase consistently for hexadecimal numbers.
- KNF
 1.4.2.1 12-Oct-2019  martin Pull up following revision(s) (requested by jmcneill in ticket #301):

usr.sbin/tprof/tprof.8: revision 1.15
sys/dev/tprof/tprof_x86_amd.c: revision 1.5
usr.sbin/tprof/arch/tprof_x86.c: revision 1.9

Match Family 15h

-

Add support for AMD Family 15h

-

Add AMD Family 15h to supported model list
 1.7.2.1 21-Jun-2023  martin Pull up following revision(s) (requested by msaitoh in ticket #210):

usr.sbin/tprof/tprof.8: revision 1.30
sys/dev/tprof/tprof_x86_amd.c: revision 1.8
sys/dev/tprof/tprof_armv8.c: revision 1.20
sys/dev/tprof/tprof_types.h: revision 1.7
sys/dev/tprof/tprof_x86_intel.c: revision 1.6
sys/dev/tprof/tprof_x86_intel.c: revision 1.7
sys/dev/tprof/tprof_x86_intel.c: revision 1.8
sys/dev/tprof/tprof.c: revision 1.23
usr.sbin/tprof/tprof.8: revision 1.25
usr.sbin/tprof/tprof.8: revision 1.26
usr.sbin/tprof/arch/tprof_x86.c: revision 1.16
usr.sbin/tprof/tprof.8: revision 1.27
usr.sbin/tprof/arch/tprof_x86.c: revision 1.17
usr.sbin/tprof/tprof.8: revision 1.28
usr.sbin/tprof/tprof.h: revision 1.5
usr.sbin/tprof/tprof.8: revision 1.29
sys/dev/tprof/tprof_armv7.c: revision 1.13
usr.sbin/tprof/tprof_top.c: revision 1.9
usr.sbin/tprof/tprof.c: revision 1.21

Add Cometlake support.

Obtain the number of general counters from CPUID 0xa.

Test cpuid_level in tprof_intel_ncounters().
This function is called before tprof_intel_ident().

KNF. No functional change.

Add two notes to the tprof(8) manual page.
- "list" command prints the maximum number of counters that can be used
simultaneously.
- multiple -e arguments can be specified.

Use the default counter if -e argument is not specified.
monitor command:
The default counter is selected if -e argument is not specified.
list command:
Print the name of the default counter for monitor and top command.

tprof.8: new sentence, new line

tprof(8): fix markup nits

tprof.8: fix typo, s/speficied/specified/
 1.8 11-Apr-2023  msaitoh KNF. No functional change.
 1.7 11-Apr-2023  msaitoh Test cpuid_level in tprof_intel_ncounters().

This function is called before tprof_intel_ident().
 1.6 11-Apr-2023  msaitoh Obtain the number of general counters from CPUID 0xa.
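
Revisions 1.6 and 1.7 above refer to CPUID leaf 0xa, whose EAX register describes the architectural performance monitoring facility. The sketch below decodes that register from a raw value. It is an illustration, not the code in tprof_x86_intel.c: the structure and function names are hypothetical, and the `max_basic_leaf` parameter stands in for the kernel's cpuid_level check.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the fields of CPUID.0AH:EAX. */
struct perfmon_info {
	unsigned int version;		/* EAX[7:0]: version ID */
	unsigned int ncounters;		/* EAX[15:8]: general-purpose counters */
	unsigned int bitwidth;		/* EAX[23:16]: counter width in bits */
	unsigned int ebx_length;	/* EAX[31:24]: length of EBX bit vector */
};

static struct perfmon_info
decode_cpuid_0a_eax(uint32_t eax)
{
	struct perfmon_info info;

	info.version = eax & 0xff;
	info.ncounters = (eax >> 8) & 0xff;
	info.bitwidth = (eax >> 16) & 0xff;
	info.ebx_length = (eax >> 24) & 0xff;
	return info;
}

/*
 * Return the number of general-purpose counters, or 0 if the CPU does not
 * implement leaf 0xa at all (highest basic leaf is below 0xa).
 */
static unsigned int
ncounters_from_cpuid(uint32_t max_basic_leaf, uint32_t eax_of_leaf_0a)
{
	if (max_basic_leaf < 0x0a)
		return 0;
	return decode_cpuid_0a_eax(eax_of_leaf_0a).ncounters;
}
```

For example, an EAX value of 0x07300404 decodes to version 4, four general-purpose counters, 48-bit counters and an EBX vector length of 7. Checking the highest basic leaf first matters because reading an unimplemented leaf does not return meaningful data.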
 1.5 01-Dec-2022  ryo branches: 1.5.2;
Improve tprof(4)

- Multiple events can now be handled simultaneously.
- Counters should be configured with TPROF_IOC_CONFIGURE_EVENT in advance,
instead of being configured at TPROF_IOC_START.
- The configured counters can be started and stopped repeatedly by
TPROF_IOC_START/TPROF_IOC_STOP.
- The value of the performance counter can be obtained at any time as a 64-bit
value with TPROF_IOC_GETCOUNTS.
- Backend common parts are handled in tprof.c as much as possible, and functions
on the tprof_backend side have been reimplemented to be more primitive.
- The reset value of counter overflows for profiling can now be adjusted.
It is calculated by default from the CPU clock (speed of cycle counter) and
TPROF_HZ, but for some events the default may be too large to collect enough
samples for profiling. The reset value can be specified as a ratio to the
default or as an absolute value when configuring the event counter.
- Due to overall changes, API and ABI have been changed. TPROF_VERSION and
TPROF_BACKEND_VERSION were updated.
 1.4 26-May-2022  msaitoh Use CPUID_PERF_* macros defined in specialreg.h. No functional change.
 1.3 14-Jun-2019  msaitoh branches: 1.3.2;
Fix compile error (s/LAPIC_PCINT/LAPIC_LVT_PCINT/)
 1.2 24-Jul-2018  maxv branches: 1.2.2; 1.2.8;
Merge the tprof_pmi and tprof_amdpmi modules into a single tprof_x86
module.
 1.1 16-Jul-2018  maxv Move
arch/x86/x86/tprof_pmi.c
arch/x86/x86/tprof_amdpmi.c
into
dev/tprof/tprof_x86_intel.c
dev/tprof/tprof_x86_amd.c
 1.2.8.3 13-Apr-2020  martin Mostly merge changes from HEAD up to 20200411
 1.2.8.2 10-Jun-2019  christos Sync with HEAD
 1.2.8.1 24-Jul-2018  christos file tprof_x86_intel.c was added on branch phil-wifi on 2019-06-10 22:07:33 +0000
 1.2.2.2 28-Jul-2018  pgoyette Sync with HEAD
 1.2.2.1 24-Jul-2018  pgoyette file tprof_x86_intel.c was added on branch pgoyette-compat on 2018-07-28 04:37:57 +0000
 1.3.2.2 01-Aug-2023  martin Pull up the following revisions, requested by msaitoh in ticket #1697:

usr.sbin/tprof/tprof.8 1.16,1.22,1.25,1.29 via patch
usr.sbin/tprof/tprof_analyze.c 1.4
usr.sbin/tprof/arch/tprof_x86.c 1.13-1.19
sys/dev/tprof/tprof.c 1.23 via patch
sys/dev/tprof/tprof_x86_amd.c 1.7-1.8 via patch
sys/dev/tprof/tprof_x86_intel.c 1.8 via patch

- Add AMD family 19h (zen3 and zen4) support.
- Add Intel Comet Lake support.
- Add support for Intel Skylake-X and Cascade Lake.
- Print the path that we failed to open on error.
- Use lowercase consistently for hexadecimal numbers.
- KNF
 1.3.2.1 15-Oct-2022  martin Pull up following revision(s) (requested by msaitoh in ticket #1543):

sys/dev/tprof/tprof_x86_intel.c: revision 1.4
usr.sbin/tprof/arch/tprof_x86.c: revision 1.10
usr.sbin/tprof/arch/tprof_x86.c: revision 1.11
usr.sbin/tprof/arch/tprof_x86.c: revision 1.12

Fix typo in a comment.

Use CPUID_PERF_* macros defined in specialreg.h. No functional change.

Add topdown-slots to Intel architectural performance monitoring version 1.

Disable the unsupported events from the bit vector length in EAX.
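
The last item above relies on how CPUID leaf 0xa reports architectural events: EAX[31:24] gives the number of valid bits in EBX, and a set bit in EBX means the corresponding event is not available. A minimal sketch of that rule follows. The function name is hypothetical and the code is not taken from usr.sbin/tprof/arch/tprof_x86.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return a bitmask in which bit i is set if architectural event i is
 * usable. Events at or beyond the vector length in EAX[31:24] are treated
 * as unsupported, and so are events whose EBX bit is set.
 */
static uint32_t
arch_events_available(uint32_t eax, uint32_t ebx)
{
	unsigned int length = (eax >> 24) & 0xff;
	uint32_t valid;

	if (length >= 32)
		valid = UINT32_MAX;
	else
		valid = (UINT32_C(1) << length) - 1;

	/* A set EBX bit means "not available", so invert before masking. */
	return ~ebx & valid;
}
```

With a vector length of 7, bit 7 (topdown-slots) falls outside the valid range and is reported as unavailable even when its EBX bit is clear.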
 1.5.2.1 21-Jun-2023  martin Pull up following revision(s) (requested by msaitoh in ticket #210):

usr.sbin/tprof/tprof.8: revision 1.30
sys/dev/tprof/tprof_x86_amd.c: revision 1.8
sys/dev/tprof/tprof_armv8.c: revision 1.20
sys/dev/tprof/tprof_types.h: revision 1.7
sys/dev/tprof/tprof_x86_intel.c: revision 1.6
sys/dev/tprof/tprof_x86_intel.c: revision 1.7
sys/dev/tprof/tprof_x86_intel.c: revision 1.8
sys/dev/tprof/tprof.c: revision 1.23
usr.sbin/tprof/tprof.8: revision 1.25
usr.sbin/tprof/tprof.8: revision 1.26
usr.sbin/tprof/arch/tprof_x86.c: revision 1.16
usr.sbin/tprof/tprof.8: revision 1.27
usr.sbin/tprof/arch/tprof_x86.c: revision 1.17
usr.sbin/tprof/tprof.8: revision 1.28
usr.sbin/tprof/tprof.h: revision 1.5
usr.sbin/tprof/tprof.8: revision 1.29
sys/dev/tprof/tprof_armv7.c: revision 1.13
usr.sbin/tprof/tprof_top.c: revision 1.9
usr.sbin/tprof/tprof.c: revision 1.21

Add Cometlake support.

Obtain the number of general counters from CPUID 0xa.

Test cpuid_level in tprof_intel_ncounters().
This function is called before tprof_intel_ident().

KNF. No functional change.

Add two notes to the tprof(8) manual page.
- "list" command prints the maximum number of counters that can be used
simultaneously.
- multiple -e arguments can be specified.

Use the default counter if -e argument is not specified.
monitor command:
The default counter is selected if -e argument is not specified.
list command:
Print the name of the default counter for monitor and top command.

tprof.8: new sentence, new line

tprof(8): fix markup nits

tprof.8: fix typo, s/speficied/specified/
