History log of /src/sys/netinet/tcp_vtw.c |
Revision | | Date | Author | Comments |
1.25 |
| 07-Oct-2024 |
jakllsch | Allow CACHE_LINE_SIZE 256 with uint64_t fatp_word_t
|
1.24 |
| 04-Nov-2022 |
ozaki-r | branches: 1.24.8; inpcb: rename functions to inpcb_*
Inspired by rmind-smpnet patches.
|
1.23 |
| 28-Oct-2022 |
ozaki-r | inpcb: separate inpcb again to reduce the size of PCB for IPv4
The data size of PCB for IPv4 increased because of the merge of struct in6pcb. The change decreases the size to the original size by separating struct inpcb (again). struct in4pcb and in6pcb that embed struct inpcb are introduced.
Even after the separation, users don't need to be aware of it and only have to use some macros to access dedicated data. For example, inp->inp_laddr is now accessed through in4p_laddr(inp).
|
1.22 |
| 28-Oct-2022 |
ozaki-r | inpcb: integrate data structures of PCB into one
Data structures of network protocol control blocks (PCBs), i.e., struct inpcb, in6pcb and inpcb_hdr, are not organized well. Users of the data structures have to handle them separately and thus the code is cluttered and duplicated.
The commit integrates the data structures into one, struct inpcb. As a result, users of PCBs only have to handle just one data structure, so the code becomes simple.
One drawback is that the data size of PCB for IPv4 increases by 40 bytes (from 248 bytes to 288 bytes).
|
1.21 |
| 13-Aug-2021 |
andvar | fix typos in words "pointer" and s/fram /frame/
|
1.20 |
| 01-Oct-2019 |
chs | in many device attach paths, allocate memory with KM_SLEEP instead of KM_NOSLEEP and remove code to handle failures that can no longer happen.
|
1.19 |
| 03-May-2018 |
maxv | branches: 1.19.2; Remove now unused tcpip.h includes. Some were already unused before.
|
1.18 |
| 01-Jun-2017 |
chs | branches: 1.18.8; remove checks for failure after memory allocation calls that cannot fail:
- kmem_alloc() with KM_SLEEP
- kmem_zalloc() with KM_SLEEP
- percpu_alloc()
- pserialize_create()
- psref_class_create()
all of these paths include an assertion that the allocation has not failed, so callers should not assert that again.
|
1.17 |
| 13-Dec-2016 |
ozaki-r | Remove unnecessary inclusions of nd6.h
|
1.16 |
| 28-Jul-2016 |
martin | PR kern/51371: avoid shifting negative values
|
1.15 |
| 26-Apr-2016 |
ozaki-r | branches: 1.15.2; Sweep unnecessary route.h inclusions
|
1.14 |
| 24-Aug-2015 |
pooka | sprinkle _KERNEL_OPT
|
1.13 |
| 31-Mar-2015 |
ozaki-r | Remove unnecessary opt_ipsec.h inclusions
|
1.12 |
| 10-Nov-2014 |
maxv | branches: 1.12.2; Do not uselessly include <sys/malloc.h>.
|
1.11 |
| 05-Sep-2014 |
matt | Don't use C++ keywords (class, template) as variables
|
1.10 |
| 15-Sep-2013 |
martin | branches: 1.10.4; ifdef a variable like its use
|
1.9 |
| 13-Apr-2012 |
yamt | branches: 1.9.2; 1.9.4; add a big comment (copy and paste from cvs log rev.1.1)
|
1.8 |
| 17-Jul-2011 |
joerg | branches: 1.8.2; 1.8.6; Retire varargs.h support. Move machine/stdarg.h logic into MI sys/stdarg.h and expect compiler to provide proper builtins, defaulting to the GCC interface. lint still has a special fallback. Reduce abuse of _BSD_VA_LIST_ by defining __va_list by default and derive va_list as required by standards.
|
1.7 |
| 06-Jun-2011 |
dyoung | Don't allocate resources for vtw until/unless it is enabled. This will further help those machines where memory is in short supply.
TBD: release resources after vtw is disabled and all entries have expired.
|
1.6 |
| 03-Jun-2011 |
dyoung | branches: 1.6.2; Don't sleep until memory becomes available.
Use kmem_zalloc() instead of kmem_alloc() + bzero().
During initialization, try to get all of the memory we need for the vestigial time-wait structures before we set any of the structures up, and if any single allocation fails, release all of the memory.
This should help low-memory hosts. A much better fix postpones allocating any memory until vtw is enabled through the sysctl.
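The all-or-nothing allocation described in this entry can be sketched in userland C as follows. This is only an illustration under stated assumptions: the struct, the function name, and the element sizes are hypothetical, and calloc() stands in for a non-sleeping, zeroing kernel allocation; it is not the code from tcp_vtw.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bundle of the regions a VTW-like subsystem might need. */
struct vtw_mem {
	void *pcbs;	/* vestigial PCB pool */
	void *hash;	/* hash table elements */
	void *links;	/* other bookkeeping */
};

/*
 * Try to get every region before any structure is set up.
 * Returns 0 on success. If any single allocation fails, releases
 * everything that was obtained and returns -1, leaving *m empty.
 * calloc() returns zeroed memory and does not wait for memory to
 * become available, which is the userland analogue assumed here.
 */
static int
vtw_mem_alloc(struct vtw_mem *m, size_t npcb, size_t nhash, size_t nlink)
{
	assert(m != NULL);

	m->pcbs = calloc(npcb, 64);	/* element sizes are made up */
	m->hash = calloc(nhash, 64);
	m->links = calloc(nlink, 16);

	if (m->pcbs == NULL || m->hash == NULL || m->links == NULL) {
		free(m->pcbs);		/* free(NULL) is a no-op */
		free(m->hash);
		free(m->links);
		m->pcbs = m->hash = m->links = NULL;
		return -1;
	}
	return 0;
}

static void
vtw_mem_free(struct vtw_mem *m)
{
	free(m->pcbs);
	free(m->hash);
	free(m->links);
	m->pcbs = m->hash = m->links = NULL;
}
```

A caller would set up its structures only after vtw_mem_alloc() returns 0, so a failure never leaves a half-initialized state behind.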
|
1.5 |
| 03-Jun-2011 |
dyoung | Defer scheduling vtw_tick() and setting the vtw hooks until vtw_control() is called. In this way, vtw_tick() will be re-scheduled repeatedly while vtw is in use.
Pay tcp_vtw_was_enabled no attention in vtw_earlyinit(), since it's always going to be 0 during initialization.
|
1.4 |
| 17-May-2011 |
dholland | branches: 1.4.2; 1.4.4; typo in comment
|
1.3 |
| 11-May-2011 |
drochner | use getmicrouptime(9) rather than microtime(9) for TIME_WAIT duration calculation, because this doesn't get confused by system time changes and uses fewer CPU cycles. Reviewed by dyoung.
|
1.2 |
| 06-May-2011 |
drochner | remove an empty function
|
1.1 |
| 03-May-2011 |
dyoung | Reduces the resources demanded by TCP sessions in TIME_WAIT-state using methods called Vestigial Time-Wait (VTW) and Maximum Segment Lifetime Truncation (MSLT).
MSLT and VTW were contributed by Coyote Point Systems, Inc.
Even after a TCP session enters the TIME_WAIT state, its corresponding socket and protocol control blocks (PCBs) stick around until the TCP Maximum Segment Lifetime (MSL) expires. On a host whose workload necessarily creates and closes down many TCP sockets, the sockets & PCBs for TCP sessions in TIME_WAIT state amount to many megabytes of dead weight in RAM.
Maximum Segment Lifetime Truncation (MSLT) assigns each TCP session to a class based on the nearness of the peer. Corresponding to each class is an MSL, and a session uses the MSL of its class. The classes are loopback (local host equals remote host), local (local host and remote host are on the same link/subnet), and remote (local host and remote host communicate via one or more gateways). Classes corresponding to nearer peers have lower MSLs by default: 2 seconds for loopback, 10 seconds for local, 60 seconds for remote. Loopback and local sessions expire more quickly when MSLT is used.
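The class selection described above can be illustrated with a minimal C sketch for IPv4. All names here are hypothetical, and the "same subnet" test is simplified to a single netmask comparison; the kernel's actual classification may differ. Only the three classes and their default MSLs (2, 10, and 60 seconds) come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical class names; ordered from nearest to farthest peer. */
enum mslt_class { MSLT_LOOPBACK = 0, MSLT_LOCAL = 1, MSLT_REMOTE = 2 };

/* Default MSL per class, in seconds, as quoted in the commit message. */
static const unsigned mslt_default_msl[] = { 2, 10, 60 };

/*
 * Classify a session by the nearness of its peer.
 * laddr/faddr are the local and foreign IPv4 addresses in host byte
 * order; netmask is assumed to be the mask of the local subnet.
 */
static enum mslt_class
mslt_classify(uint32_t laddr, uint32_t faddr, uint32_t netmask)
{
	if (laddr == faddr)
		return MSLT_LOOPBACK;	/* local host equals remote host */
	if ((laddr & netmask) == (faddr & netmask))
		return MSLT_LOCAL;	/* same link/subnet */
	return MSLT_REMOTE;		/* reached via one or more gateways */
}

/* A session uses the MSL of its class. */
static unsigned
mslt_msl_seconds(uint32_t laddr, uint32_t faddr, uint32_t netmask)
{
	enum mslt_class c = mslt_classify(laddr, faddr, netmask);

	assert(c <= MSLT_REMOTE);
	return mslt_default_msl[c];
}
```

For example, under these assumptions a session between 192.168.1.5 and 192.168.1.9 with a /24 mask falls in the local class and uses a 10-second MSL.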
Vestigial Time-Wait (VTW) replaces a TIME_WAIT session's PCB/socket dead weight with a compact representation of the session, called a "vestigial PCB". VTW data structures are designed to be very fast and memory-efficient: for fast insertion and lookup of vestigial PCBs, the PCBs are stored in a hash table that is designed to minimize the number of cacheline visits per lookup/insertion. The memory both for vestigial PCBs and for elements of the PCB hashtable comes from fixed-size pools, and linked data structures exploit this to conserve memory by representing references with a narrow index/offset from the start of a pool instead of a pointer. When space for new vestigial PCBs runs out, VTW makes room by discarding old vestigial PCBs, oldest first. VTW cooperates with MSLT.
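Two of the ideas in this paragraph, links stored as narrow pool indices and oldest-first discarding, can be sketched in C as below. This is a simplified illustration, not the tcp_vtw.c layout: the struct names, the tiny sizes, and the plain chained hash are assumptions, and the sketch makes no attempt at the cacheline-conscious hash table the commit describes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VP_NSLOT   4		/* tiny pool, for illustration only */
#define VP_NBUCKET 2
#define VP_NIL     UINT16_MAX	/* "null" index */

struct vpcb {			/* hypothetical compact entry */
	uint32_t faddr;
	uint16_t fport;
	uint16_t next;		/* 16-bit pool index instead of a pointer */
	uint8_t  in_use;
};

struct vpool {
	struct vpcb slot[VP_NSLOT];	/* fixed-size pool */
	uint16_t bucket[VP_NBUCKET];	/* chain heads, also indices */
	uint16_t next_alloc;		/* slots are reused in ring order */
};

static unsigned
vp_hash(uint32_t faddr, uint16_t fport)
{
	return (faddr ^ fport) % VP_NBUCKET;
}

static void
vp_init(struct vpool *p)
{
	memset(p, 0, sizeof(*p));
	for (unsigned i = 0; i < VP_NBUCKET; i++)
		p->bucket[i] = VP_NIL;
}

/* Remove slot idx from its hash chain by walking index links. */
static void
vp_unlink(struct vpool *p, uint16_t idx)
{
	uint16_t *link = &p->bucket[vp_hash(p->slot[idx].faddr,
	    p->slot[idx].fport)];

	while (*link != idx) {
		assert(*link != VP_NIL);
		link = &p->slot[*link].next;
	}
	*link = p->slot[idx].next;
	p->slot[idx].in_use = 0;
}

/*
 * Insert an entry. Because slots are handed out in ring order and
 * never freed early in this sketch, the slot about to be reused is
 * always the oldest one, so a full pool discards oldest first.
 */
static uint16_t
vp_insert(struct vpool *p, uint32_t faddr, uint16_t fport)
{
	uint16_t idx = p->next_alloc;
	struct vpcb *v = &p->slot[idx];
	uint16_t *head = &p->bucket[vp_hash(faddr, fport)];

	if (v->in_use)
		vp_unlink(p, idx);
	v->faddr = faddr;
	v->fport = fport;
	v->in_use = 1;
	v->next = *head;
	*head = idx;
	p->next_alloc = (uint16_t)((idx + 1) % VP_NSLOT);
	return idx;
}

/* Return the slot index of a matching entry, or -1 if absent. */
static int
vp_lookup(const struct vpool *p, uint32_t faddr, uint16_t fport)
{
	for (uint16_t i = p->bucket[vp_hash(faddr, fport)]; i != VP_NIL;
	    i = p->slot[i].next)
		if (p->slot[i].faddr == faddr && p->slot[i].fport == fport)
			return i;
	return -1;
}
```

With 16-bit indices each link costs 2 bytes rather than the 8 a pointer needs on a 64-bit host, which is the memory saving the paragraph refers to; the trade-off is that a pool can hold at most 65535 entries under this encoding.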
It may help to think of VTW as a "FIN cache" by analogy to the SYN cache.
A 2.8-GHz Pentium 4 running a test workload that creates TIME_WAIT sessions as fast as it can is approximately 17% idle when VTW is active versus 0% idle when VTW is inactive. It has 103 megabytes more free RAM when VTW is active (approximately 64k vestigial PCBs are created) than when it is inactive.
|
1.4.4.1 |
| 23-Jun-2011 |
cherry | Catchup with rmind-uvmplock merge.
|
1.4.2.3 |
| 12-Jun-2011 |
rmind | sync with head
|
1.4.2.2 |
| 31-May-2011 |
rmind | sync with head
|
1.4.2.1 |
| 17-May-2011 |
rmind | file tcp_vtw.c was added on branch rmind-uvmplock on 2011-05-31 03:05:08 +0000
|
1.6.2.2 |
| 06-Jun-2011 |
jruoho | Sync with HEAD.
|
1.6.2.1 |
| 03-Jun-2011 |
jruoho | file tcp_vtw.c was added on branch jruoho-x86intr on 2011-06-06 09:09:57 +0000
|
1.8.6.1 |
| 29-Apr-2012 |
mrg | sync to latest -current.
|
1.8.2.2 |
| 22-May-2014 |
yamt | sync with head.
for a reference, the tree before this commit was tagged as yamt-pagecache-tag8.
this commit was split into small chunks to avoid a limitation of cvs. ("Protocol error: too many arguments")
|
1.8.2.1 |
| 17-Apr-2012 |
yamt | sync with head
|
1.9.4.3 |
| 18-May-2014 |
rmind | sync with head
|
1.9.4.2 |
| 23-Sep-2013 |
rmind | - Add some initial locking to the IPv4 PCB.
- Rename inpcb_lookup_*() routines to be more accurate and add comments.
- Add some comments about connection life-cycle WRT socket layer.
|
1.9.4.1 |
| 17-Jul-2013 |
rmind | Checkpoint work in progress:
- Move PCB structures under __INPCB_PRIVATE, adjust most of the callers and thus make IPv4 PCB structures mostly opaque. Any volunteers for merging in6pcb with inpcb (see rpaulo-netinet-merge-pcb branch)?
- Move various global vars to the modules where they belong, make them static.
- Some preliminary work for IPv4 PCB locking scheme.
- Make raw IP code mostly MP-safe. Simplify some of it.
- Rework "fast" IP forwarding (ipflow) code to be mostly MP-safe. It should run from a software interrupt, rather than hard.
- Rework tun(4) pseudo interface to be MP-safe.
- Work towards making some other interfaces more strict.
|
1.9.2.2 |
| 03-Dec-2017 |
jdolecek | update from HEAD
|
1.9.2.1 |
| 20-Aug-2014 |
tls | Rebase to HEAD as of a few days ago.
|
1.10.4.1 |
| 17-Jan-2015 |
martin | Pull up following revision(s) (requested by maxv in ticket #427):
sys/compat/svr4/svr4_schedctl.c: revision 1.8
sys/netinet/tcp_timer.c: revision 1.88
sys/miscfs/genfs/layer_vfsops.c: revision 1.45
sys/compat/svr4/svr4_ioctl.c: revision 1.37
sys/ufs/chfs/chfs_vfsops.c: revision 1.14
sys/miscfs/fdesc/fdesc_vfsops.c: revision 1.91
sys/compat/linux/arch/i386/linux_ptrace.c: revision 1.30
sys/compat/common/kern_time_50.c: revision 1.28
sys/netinet6/ip6_forward.c: revision 1.74
sys/miscfs/umapfs/umap_vnops.c: revision 1.57
sys/compat/svr4/svr4_fcntl.c: revision 1.74
distrib/sets/lists/comp/mi: revision 1.1931
sys/netinet6/udp6_output.c: revision 1.46
sys/fs/puffs/puffs_compat.c: revision 1.3
sys/fs/udf/udf_rename.c: revision 1.11
sys/compat/svr4/svr4_filio.c: revision 1.24
sys/fs/udf/udf_rename.c: revision 1.12
sys/netinet/tcp_usrreq.c: revision 1.202
sys/miscfs/umapfs/umap_subr.c: revision 1.29
sys/compat/linux/common/linux_fadvise64.c: revision 1.3
sys/netinet/if_atm.c: revision 1.34
sys/miscfs/procfs/procfs_subr.c: revision 1.106
sys/miscfs/genfs/layer_subr.c: revision 1.37
sys/netinet/tcp_sack.c: revision 1.30
sys/compat/freebsd/freebsd_misc.c: revision 1.33
sys/compat/freebsd/freebsd_file.c: revision 1.33
sys/ufs/chfs/chfs_vnode.c: revision 1.12
sys/compat/svr4/svr4_ttold.c: revision 1.34
sys/compat/linux/common/linux_file.c: revision 1.114
sys/compat/linux/arch/mips/linux_machdep.c: revision 1.43
sys/compat/linux/common/linux_signal.c: revision 1.76
sys/compat/common/compat_util.c: revision 1.46
sys/compat/linux/arch/arm/linux_ptrace.c: revision 1.18
sys/compat/svr4/svr4_sockio.c: revision 1.36
sys/compat/linux/arch/arm/linux_machdep.c: revision 1.32
sys/compat/svr4/svr4_signal.c: revision 1.66
sys/kern/kern_exec.c: revision 1.410
sys/fs/puffs/puffs_vfsops.c: revision 1.115
sys/compat/svr4/svr4_exec_elf64.c: revision 1.15
sys/compat/linux/arch/i386/linux_machdep.c: revision 1.159
sys/compat/linux/arch/alpha/linux_machdep.c: revision 1.50
sys/compat/linux32/common/linux32_misc.c: revision 1.24
sys/netinet/in_pcb.c: revision 1.153
sys/sys/malloc.h: revision 1.116
sys/compat/common/if_43.c: revision 1.9
share/man/man9/Makefile: revision 1.380
sys/netinet/tcp_vtw.c: revision 1.12
sys/miscfs/umapfs/umap_vfsops.c: revision 1.95
sys/ufs/ext2fs/ext2fs_vfsops.c: revision 1.186
sys/compat/common/uipc_syscalls_43.c: revision 1.46
sys/ufs/ext2fs/ext2fs_vnops.c: revision 1.115
sys/fs/puffs/puffs_msgif.c: revision 1.97
sys/compat/svr4/svr4_ipc.c: revision 1.27
sys/compat/linux/common/linux_exec.c: revision 1.117
sys/ufs/ext2fs/ext2fs_readwrite.c: revision 1.66
sys/netinet/tcp_output.c: revision 1.179
sys/compat/svr4/svr4_termios.c: revision 1.28
sys/fs/udf/udf_strat_bootstrap.c: revision 1.4
sys/fs/puffs/puffs_subr.c: revision 1.67
sys/fs/puffs/puffs_node.c: revision 1.36
sys/miscfs/overlay/overlay_vnops.c: revision 1.21
sys/fs/cd9660/cd9660_node.c: revision 1.34
sys/netinet/raw_ip.c: revision 1.146
sys/sys/mallocvar.h: revision 1.13
sys/miscfs/overlay/overlay_vfsops.c: revision 1.63
share/man/man9/malloc.9: revision 1.50
sys/netinet6/dest6.c: revision 1.18
sys/compat/linux/common/linux_uselib.c: revision 1.33
sys/compat/linux/common/linux_socket.c: revision 1.120
share/man/man9/malloc.9: revision 1.51
sys/netinet/tcp_subr.c: revision 1.257
sys/compat/linux/common/linux_socketcall.c: revision 1.45
sys/compat/linux/common/linux_fadvise64_64.c: revision 1.3
sys/compat/freebsd/freebsd_ipc.c: revision 1.17
sys/compat/linux/common/linux_misc_notalpha.c: revision 1.109
sys/compat/linux/arch/alpha/linux_pipe.c: revision 1.17
sys/netinet6/in6_pcb.c: revision 1.132
sys/netinet6/in6_ifattach.c: revision 1.94
sys/compat/svr4/svr4_exec_elf32.c: revision 1.15
sys/miscfs/nullfs/null_vfsops.c: revision 1.90
sys/fs/cd9660/cd9660_util.c: revision 1.12
sys/compat/linux/arch/powerpc/linux_machdep.c: revision 1.48
sys/compat/freebsd/freebsd_exec_elf32.c: revision 1.20
sys/miscfs/procfs/procfs_vfsops.c: revision 1.94
sys/compat/linux/arch/powerpc/linux_ptrace.c: revision 1.28
sys/compat/linux/common/linux_sched.c: revision 1.67
sys/compat/linux/common/linux_exec_aout.c: revision 1.67
sys/compat/linux/common/linux_pipe.c: revision 1.67
sys/compat/linux/common/linux_llseek.c: revision 1.34
sys/compat/linux/arch/mips/linux_ptrace.c: revision 1.10
Do not uselessly include <sys/malloc.h>.
Cleanup:
- remove struct kmembuckets (dead)
- correctly deadify MALLOC_XX
- remove MALLOC_DEFINE_LIMIT and MALLOC_JUSTDEFINE_LIMIT (dead)
- remove malloc_roundup(), malloc_type_setlimit(), MALLOC_DEFINE_LIMIT() and MALLOC_JUSTDEFINE_LIMIT() from man 9 malloc
New sentence, new line.
Bump date for previous.
Obsolete malloc_roundup(9), malloc_type_setlimit(9) and MALLOC_DEFINE_LIMIT(9) man pages.
|
1.12.2.6 |
| 28-Aug-2017 |
skrll | Sync with HEAD
|
1.12.2.5 |
| 05-Feb-2017 |
skrll | Sync with HEAD
|
1.12.2.4 |
| 05-Oct-2016 |
skrll | Sync with HEAD
|
1.12.2.3 |
| 29-May-2016 |
skrll | Sync with HEAD
|
1.12.2.2 |
| 22-Sep-2015 |
skrll | Sync with HEAD
|
1.12.2.1 |
| 06-Apr-2015 |
skrll | Sync with HEAD
|
1.15.2.2 |
| 07-Jan-2017 |
pgoyette | Sync with HEAD. (Note that most of these changes are simply $NetBSD$ tag issues.)
|
1.15.2.1 |
| 06-Aug-2016 |
pgoyette | Sync with HEAD
|
1.18.8.1 |
| 21-May-2018 |
pgoyette | Sync with HEAD
|
1.19.2.1 |
| 13-Apr-2020 |
martin | Mostly merge changes from HEAD up to 20200411
|
1.24.8.1 |
| 02-Aug-2025 |
perseant | Sync with HEAD
|