History log of /src/sys/dev/tprof/tprof.c |
Revision | | Date | Author | Comments |
1.23 |
| 11-Apr-2023 |
msaitoh | KNF. No functional change.
|
1.22 |
| 16-Dec-2022 |
ryo | tprof_lock is not a spin mutex. use mutex_{enter,exit}(). oops
|
1.21 |
| 16-Dec-2022 |
ryo | branches: 1.21.2;
- Add support for select(2)/poll(2) on /dev/tprof.
- Changed the sampling buffer switching frequency (which is the frequency of tprof_worker() calls and also the maximum block time of read(2) on /dev/tprof) from 1 sec to 125 ms. This improves tprof top responsiveness.
- The maximum number of sampling buffers is now adjusted according to the number of CPUs. Previously it was fixed at 100 and was insufficient if ncpu was greater than this.
The maximum number of samples per second per CPU is calculated by "TPROF_MAX_SAMPLES_PER_BUF * (HZ of tprof_worker)". Therefore, currently, the maximum is 10000 * (1000/125) = 80000 samples per second per CPU. The actual value will vary slightly from this due to tprof_worker and read(2) timing. This value may need to be adjusted further in the future.
|
1.20 |
| 11-Dec-2022 |
chs | make sure error is initialized before we return it.
|
1.19 |
| 01-Dec-2022 |
ryo | Improve tprof(4)
- Multiple events can now be handled simultaneously.
- Counters should be configured with TPROF_IOC_CONFIGURE_EVENT in advance, instead of being configured at TPROF_IOC_START.
- The configured counters can be started and stopped repeatedly by TPROF_IOC_START/TPROF_IOC_STOP.
- The value of the performance counter can be obtained at any time as a 64-bit value with TPROF_IOC_GETCOUNTS.
- Backend common parts are handled in tprof.c as much as possible, and functions on the tprof_backend side have been reimplemented to be more primitive.
- The reset value of counter overflows for profiling can now be adjusted. It is calculated by default from the CPU clock (speed of cycle counter) and TPROF_HZ, but for some events the value may be too large to be sufficient for profiling. The reset value can be specified as a ratio to the default or as an absolute value when configuring the event counter.
- Due to overall changes, API and ABI have been changed. TPROF_VERSION and TPROF_BACKEND_VERSION were updated.
|
1.18 |
| 01-Dec-2022 |
ryo | don't call kpreempt_{disable,enable}() from an interrupt handler.
Fixed a problem in which the system would freeze if a high load (e.g., build.sh -j20) was applied while running `tprof monitor -e LsNotHaltedCyc ...' on x86.
This almost eliminates the problem, but it is still not enough. tprof_x86 uses NMIs, which are delivered even in splhigh(), leaving the possibility of being interrupted in the splhigh section of percpu_cpu_swap().
|
1.17 |
| 28-Mar-2022 |
riastradh | driver(9): devsw_detach never fails. Make it return void.
Prune a whole lotta dead branches as a result of this. (Some logic calling this is also wrong for other reasons; devsw_detach is final -- you should never have any reason to decide to roll it back. To be cleaned up in subsequent commits...)
XXX kernel ABI change to devsw_detach signature requires bump
|
1.16 |
| 01-Nov-2021 |
skrll | Trailing whitespace
|
1.15 |
| 27-Nov-2020 |
riastradh | tprof: Use percpu rather than a MAXCPUS-element array.
|
1.14 |
| 13-Jul-2018 |
maxv | branches: 1.14.6; 1.14.14; Revamp tprof.
Rewrite the Intel backend to use the generic PMC interface, which is available on all Intel CPUs. Synchronize the AMD backend with the new interface.
The kernel identifies the PMC interface, and gives its id to userland. Userland then queries the events itself (via cpuid etc). These events depend on the PMC interface.
The tprof utility is rewritten to allow the user to choose which event to count (which was not possible until now; the event was hardcoded in the backend). The command line format is based on usr.bin/pmc, e.g.:
tprof -e llc-misses:k -o output sleep 20
The man page is updated too, but the arguments will likely change soon anyway so it doesn't matter a lot.
The tprof utility has three tables:
- Intel Architectural Version 1
- Intel Skylake/Kabylake
- AMD Family 10h
A CPU can support a combination of tables. For example Kabylake has Intel-Architectural-Version-1 and its own Intel-Kabylake table.
For now the Intel Skylake/Kabylake table contains only one event, just to demonstrate that the combination of tables works. Tested on an Intel Core i5 Kabylake.
The code for AMD Family 10h is taken from the code I had written for usr.bin/pmc. I haven't tested it yet, but it's the same as pmc(1), so I guess it works as-is.
The whole thing is written in such a way that (I think) it is not complicated to add more CPU models, and more architectures (other than x86).
|
1.13 |
| 20-Aug-2015 |
christos | branches: 1.13.8; 1.13.16; 1.13.18; include "ioconf.h" to get the 'void <driver>attach(int count);' prototype.
|
1.12 |
| 25-Jul-2014 |
dholland | branches: 1.12.4; Add d_discard to all struct cdevsw instances I could find.
All have been set to "nodiscard"; some should get a real implementation.
|
1.11 |
| 16-Mar-2014 |
dholland | branches: 1.11.2; Change (mostly mechanically) every cdevsw/bdevsw I can find to use designated initializers.
I have not built every extant kernel so I have probably broken at least one build; however I've also found and fixed some wrong cdevsw/bdevsw entries so even if so I think we come out ahead.
|
1.10 |
| 14-Apr-2011 |
yamt | branches: 1.10.4; 1.10.14; 1.10.18; for each sample, record and report cpuid and lwpid.
|
1.9 |
| 25-Feb-2011 |
yamt | tprof_start: don't forget to restore refcount when failed to start backend.
|
1.8 |
| 05-Feb-2011 |
yamt | tprof: record pid and userland events.
|
1.7 |
| 11-Aug-2010 |
pgoyette | branches: 1.7.2; 1.7.4; Keep condvar wmesg within 8-char limit
|
1.6 |
| 13-Mar-2009 |
yamt | branches: 1.6.2; 1.6.4; tprof_stop1: add an assertion.
|
1.5 |
| 11-Mar-2009 |
yamt | fix breakage where db_regs_t != trapframe. the problem pointed out by Martin Husemann on tech-kern@.
|
1.4 |
| 10-Mar-2009 |
yamt | - adapt to MODULAR. - some preparations to have more backends. - add some comments.
|
1.3 |
| 20-Jan-2009 |
yamt | branches: 1.3.2; comment
|
1.2 |
| 07-May-2008 |
yamt | branches: 1.2.8; tprof_start: fix workqueue's IPL.
|
1.1 |
| 01-Jan-2008 |
yamt | branches: 1.1.2; 1.1.4; 1.1.6; 1.1.8; 1.1.14; 1.1.16; 1.1.18; a simple performance monitor based profiler, inspired by Linux oprofile.
|
1.1.18.3 |
| 09-Oct-2010 |
yamt | sync with head
|
1.1.18.2 |
| 04-May-2009 |
yamt | sync with head.
|
1.1.18.1 |
| 16-May-2008 |
yamt | sync with head.
|
1.1.16.1 |
| 18-May-2008 |
yamt | sync with head.
|
1.1.14.1 |
| 02-Jun-2008 |
mjf | Sync with HEAD.
|
1.1.8.2 |
| 18-Feb-2008 |
mjf | Sync with HEAD.
|
1.1.8.1 |
| 01-Jan-2008 |
mjf | file tprof.c was added on branch mjf-devfs on 2008-02-18 21:06:25 +0000
|
1.1.6.2 |
| 21-Jan-2008 |
yamt | sync with head
|
1.1.6.1 |
| 01-Jan-2008 |
yamt | file tprof.c was added on branch yamt-lazymbuf on 2008-01-21 09:44:40 +0000
|
1.1.4.2 |
| 09-Jan-2008 |
matt | sync with HEAD
|
1.1.4.1 |
| 01-Jan-2008 |
matt | file tprof.c was added on branch matt-armv6 on 2008-01-09 01:54:36 +0000
|
1.1.2.2 |
| 02-Jan-2008 |
bouyer | Sync with HEAD
|
1.1.2.1 |
| 01-Jan-2008 |
bouyer | file tprof.c was added on branch bouyer-xeni386 on 2008-01-02 21:55:17 +0000
|
1.2.8.2 |
| 28-Apr-2009 |
skrll | Sync with HEAD.
|
1.2.8.1 |
| 03-Mar-2009 |
skrll | Sync with HEAD.
|
1.3.2.1 |
| 13-May-2009 |
jym | Sync with HEAD.
Commit is split, to avoid a "too many arguments" protocol error.
|
1.6.4.2 |
| 21-Apr-2011 |
rmind | sync with head
|
1.6.4.1 |
| 05-Mar-2011 |
rmind | sync with head
|
1.6.2.1 |
| 17-Aug-2010 |
uebayasi | Sync with HEAD.
|
1.7.4.2 |
| 05-Mar-2011 |
bouyer | Sync with HEAD
|
1.7.4.1 |
| 08-Feb-2011 |
bouyer | Sync with HEAD
|
1.7.2.1 |
| 06-Jun-2011 |
jruoho | Sync with HEAD.
|
1.10.18.1 |
| 18-May-2014 |
rmind | sync with head
|
1.10.14.2 |
| 03-Dec-2017 |
jdolecek | update from HEAD
|
1.10.14.1 |
| 20-Aug-2014 |
tls | Rebase to HEAD as of a few days ago.
|
1.10.4.1 |
| 22-May-2014 |
yamt | sync with head.
for a reference, the tree before this commit was tagged as yamt-pagecache-tag8.
this commit was split into small chunks to avoid a limitation of cvs. ("Protocol error: too many arguments")
|
1.11.2.1 |
| 10-Aug-2014 |
tls | Rebase.
|
1.12.4.1 |
| 22-Sep-2015 |
skrll | Sync with HEAD
|
1.13.18.1 |
| 10-Jun-2019 |
christos | Sync with HEAD
|
1.13.16.1 |
| 28-Jul-2018 |
pgoyette | Sync with HEAD
|
1.13.8.2 |
| 29-Apr-2017 |
pgoyette | Revise previous. Rather than explicitly including <sys/localcount.h> in all the places where {b,c}devsw is initialized, just include it from <sys/conf.h>. This avoids an include-sequence dependency.
|
1.13.8.1 |
| 29-Apr-2017 |
pgoyette | Add DEVSW_MODULE_INIT to existing device-driver modules, so that they will have a localcount defined and thus be permitted to load. Without a localcount, loading the module will return EINVAL.
XXX the dtrace and drm stuff might need to be fed back upstream?
|
1.14.14.1 |
| 14-Dec-2020 |
thorpej | Sync w/ HEAD.
|
1.14.6.1 |
| 01-Aug-2023 |
martin | Pull up the following revisions, requested by msaitoh in ticket #1697:
usr.sbin/tprof/tprof.8 1.16,1.22,1.25,1.29 via patch
usr.sbin/tprof/tprof_analyze.c 1.4
usr.sbin/tprof/arch/tprof_x86.c 1.13-1.19
sys/dev/tprof/tprof.c 1.23 via patch
sys/dev/tprof/tprof_x86_amd.c 1.7-1.8 via patch
sys/dev/tprof/tprof_x86_intel.c 1.8 via patch
- Add AMD family 19h (zen3 and zen4) support.
- Add Intel Comet Lake support.
- Add support for Intel Skylake-X and Cascade Lake.
- Print the path that we failed to open on error.
- Use lowercase consistently for hexadecimal numbers.
- KNF
|
1.21.2.2 |
| 21-Jun-2023 |
martin | Pull up following revision(s) (requested by msaitoh in ticket #210):
usr.sbin/tprof/tprof.8: revision 1.30
sys/dev/tprof/tprof_x86_amd.c: revision 1.8
sys/dev/tprof/tprof_armv8.c: revision 1.20
sys/dev/tprof/tprof_types.h: revision 1.7
sys/dev/tprof/tprof_x86_intel.c: revision 1.6
sys/dev/tprof/tprof_x86_intel.c: revision 1.7
sys/dev/tprof/tprof_x86_intel.c: revision 1.8
sys/dev/tprof/tprof.c: revision 1.23
usr.sbin/tprof/tprof.8: revision 1.25
usr.sbin/tprof/tprof.8: revision 1.26
usr.sbin/tprof/arch/tprof_x86.c: revision 1.16
usr.sbin/tprof/tprof.8: revision 1.27
usr.sbin/tprof/arch/tprof_x86.c: revision 1.17
usr.sbin/tprof/tprof.8: revision 1.28
usr.sbin/tprof/tprof.h: revision 1.5
usr.sbin/tprof/tprof.8: revision 1.29
sys/dev/tprof/tprof_armv7.c: revision 1.13
usr.sbin/tprof/tprof_top.c: revision 1.9
usr.sbin/tprof/tprof.c: revision 1.21
Add Cometlake support.
Obtain the number of general counters from CPUID 0xa.
Test cpuid_level in tprof_intel_ncounters(). This function is called before tprof_intel_ident().
KNF. No functional change.
Add two notes to the tprof(8) manual page.
- The "list" command prints the maximum number of counters that can be used simultaneously.
- Multiple -e arguments can be specified.
Use the default counter if the -e argument is not specified.
- monitor command: The default counter is selected if the -e argument is not specified.
- list command: Print the name of the default counter for the monitor and top commands.
tprof.8: new sentence, new line
tprof(8): fix markup nits
tprof.8: fix typo, s/speficied/specified/
|
1.21.2.1 |
| 23-Dec-2022 |
martin | Pull up following revision(s) (requested by ryo in ticket #20):
sys/arch/arm/arm/cpufunc.c: revision 1.185
sys/dev/tprof/tprof.c: revision 1.22
sys/arch/arm/arm32/arm32_boot.c: revision 1.45
sys/dev/tprof/tprof_armv8.c: revision 1.19
sys/dev/tprof/tprof_armv7.c: revision 1.12
sys/arch/aarch64/aarch64/cpu.c: revision 1.71
sys/arch/aarch64/aarch64/cpu.c: revision 1.72
tprof_lock is not a spin mutex. use mutex_{enter,exit}(). oops
Explicitly disable overflow interrupts before enabling the cycle counter.
PMCR_EL0.LC should be set. ARM deprecates use of PMCR_EL0.LC=0
Even if an overflow interrupt occurs for a counter outside tprof management, the corresponding bit of the overflow status register must be cleared to prevent an interrupt storm.
|