Home | History | Annotate | Download | only in raidframe
History log of /src/sys/dev/raidframe/rf_aselect.c
RevisionDateAuthorComments
 1.31  20-Mar-2022  andvar s/initialiase/initialise/ in comments.
 1.30  23-Jul-2021  oster Extensive mechanical changes to the pools used in RAIDframe.

Alloclist remains not per-RAID, so initialize that pool
separately/differently than the rest.

The remainder of pools in RF_Pools_s are now per-RAID pools. Mostly
mechanical changes to functions to allocate/destroy per-RAID pools.
Needed to make raidPtr available in certain cases to be able to find
the per-RAID pools.

Extend rf_pool_init() to now populate a per-RAID wchan value that is
unique to each pool for a given RAID device.

TODO: Complete the analysis of the minimum number of items that are
required for each pool to allow IO to progress (i.e. so that a request
for pool resources can always be satisfied), and dynamically scale
minimum pool sizes based on RAID configuration.
 1.29  04-Jan-2017  christos branches: 1.29.34;
PR/51775: David Binderman: Remove unused variable.
 1.28  15-Sep-2013  martin branches: 1.28.6; 1.28.10;
Remove unused variables
 1.27  31-Aug-2011  plunky branches: 1.27.2; 1.27.12; 1.27.16;
NULL does not need a cast
 1.26  07-Feb-2009  oster Nuke #define MAXNSTRIPES which is no longer useful.
 1.25  04-Mar-2007  christos branches: 1.25.40; 1.25.50;
Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.
 1.24  23-Mar-2006  oster branches: 1.24.14;
This one is good for making one's brain hurt. Turns out that the
original RAIDframe code had the same bug with dag_h being used when
possibly NULL. Use dagList as the starting point for any potential
dag_h's. Move the initialization of dag_h in this part to a
little later. Loop now runs through in equivalent lock-step with the
construction of the dagList earlier in the function.

Addresses Coverity CID 1129 (id=6841 Run 5).
 1.23  18-Mar-2006  oster dag_h will always be NULL in this case, and thus we can eliminate a
bit of dead code. Addresses Coverity CID 728 NetBSD Scan 5.
 1.22  11-Dec-2005  christos branches: 1.22.4; 1.22.6; 1.22.8; 1.22.10; 1.22.12;
merge ktrace-lwp.
 1.21  27-Feb-2005  perry branches: 1.21.4;
nuke trailing whitespace
 1.20  09-Apr-2004  oster branches: 1.20.4; 1.20.6;
These changes complete the effective removal of malloc() from all
write paths within RAIDframe. They also resolve the "panics with
RAID 5 sets with more than 3 components" issue which was present
(briefly) in the commits which were previously supposed to address
the malloc() issue.

With this new code the 5-component RAID 5 set panics are now gone.

It is also now also possible to swap to RAID 5.

The changes made are:

1) Introduce rf_AllocStripeBuffer() and rf_FreeStripeBuffer() to
allocate/free one stripe's worth of space. rf_AllocStripeBuffer() is
used in rf_MapUnaccessedPortionOfStripe() where it is not sufficient to
allocate memory using just rf_AllocBuffer(). rf_FreeStripeBuffer() is
called from rf_FreeRaidAccDesc(), well after the DAG is finished.

2) Add a set of emergency "stripe buffers" to struct RF_Raid_s.
Arrange for their initialization in rf_Configure(). In low-memory
situations these buffers will be returned by rf_AllocStripeBuffer()
and re-populated by rf_FreeStripeBuffer().

3) Move RF_VoidPointerListElem_t *iobufs from the dagHeader into
into struct RF_RaidAccessDesc_s. This is more consistent with the
original code, and will not result in items being freed "too early".

4) Add a RF_RaidAccessDesc_t *desc to RF_DagHeader_s so that we have a
way to find desc->iobufs.

5) Arrange for desc in the DagHeader to be initialized in InitHdrNode().

6) Don't cleanup iobufs in rf_FreeDAG() -- the freeing is now delayed
until rf_FreeRaidAccDesc() (which is how the original code handled the
allocList, and for which there seem to be some subtle, undocumented
assumptions).

7) Rename rf_AllocBuffer2() to be rf_AllocBuffer() and remove the
former rf_AllocBuffer(). Fix all callers of rf_AllocBuffer().
(This was how it was *supposed* to be after the last time these
changes were made, before they were backed out).

8) Remove RF_IOBufHeader and all references to it.

9) Remove desc->cleanupList and all references to it.

Fixes PR#20191
 1.19  20-Mar-2004  oster branches: 1.19.2;
Can't conditionalize cleanup on numStripeUnitsBailed -- have to
cleanup regardless.

More importantly, we can't free any of the AccessStripeMaps here!
 1.18  19-Mar-2004  oster Introduce 3 more pools and 6 functions to handle allocating/freeing
elements from the pools.

Re-work rf_SelectAlgorithm() to get rid of all the 8 malloc's, and to
use the new functions to get/put these 'support structures'. I'm not
overly happy with some of the variable names, but them's the breaks.

In the process of changing things, fix a bug:
- in the case where we can't create a dag, free asmh_b and blockFuncs
too!!

[if you were able to look at the source code related to these changes,
and comprehend what was going on without having your eyes bleed or
getting dizzy, please contact me... I'm sure I'll have more code
which would benefit by you having a look at it before I commit it :) ]
 1.17  18-Mar-2004  oster - Introduce a 'dagnode' pool. Initialize it and allow for cleanup.
Provide rf_AllocDAGNode() and rf_FreeDAGNode() to handle
allocation/freeing.

- Introduce a "nodes" linked list of RF_DagNode_t's into the DAG header.
Initialize nodes in InitHdrNode(). Arrange for nodes cleanup in rf_FreeDAG().

- Add a "list_next" to RF_DagNode_t to keep track of nodes on the
above "nodes" list. (This is distinct from the "next" field of
RF_DagNode_t, which keeps track of the firing order of nodes.)
"list_next" gets used in the cleanup routines, and in traversing
through a set of nodes that belong to a particular set of nodes
(e.g. those belonging to xorNodes for a given DAG).

- use rf_AllocDAGNode() instead of mallocs of variable-sized arrays of
RF_DagNode_t's. Mostly mechanical changes to convert the DAG construction
from "access nodes via an array index" to "access nodes via a 'nextnode'
pointer".

- rework a couple of tricky spots where assumptions about the node order
was being abused.

- performance remains consistent with performance before these changes.

[Thanks to Simon Burge (simonb at you.know.where) for looking over
the mechanical changes to make sure I didn't biff anything.]
 1.16  29-Feb-2004  oster InitHdrNode() might as well return 'void'. Nothing ever pays attention
to what it returns anyway!
 1.15  29-Feb-2004  oster rf_MakeAllocList() will always get memory. No point in checking for
a case that won't ever occur.
 1.14  29-Feb-2004  oster TransferDagMemory() doesn't exist, so these lines are just wasting space.
 1.13  29-Feb-2004  oster Stripe functions are now handled by a linked-list instead of a
runtime-variable array.

Fix a bug where stripeFuncs was being freed, and then being used after
(in the case of numStripesBailed > 0).
 1.12  27-Feb-2004  oster Use a dynamically allocated linked list of dagLists instead of using a
dynamically allocated variable-sized array (dagArray). Convert code
to use the new linked list stuff instead of the array stuff (the ratio
of one dagList per stripe still applies). The big advantage is in
being able to more efficiently allocate the dagLists on-the-fly, and
not have to know the size(s) of the array beforehand.
 1.11  02-Jan-2004  oster Fix the "We panic if we can't create a DAG" problem that's existed
~forever. This requires a number of things:

1) If we can't create a DAG, set desc->numStripes to 0 in
rf_SelectAlgorithm. This will ensure that we don't attempt to free
any dagArray[] elements in rf_StateCleanup.

2) Modify rf_State_CreateDAG() to not panic in the event of a DAG
failure. Instead, set the bp->b_flags and bp->b_error, and set things
up to skip to rf_State_Cleanup().

3) Need to mark desc->status as "bad" so that we actually stop looking
for a different DAG. (which we won't find... no matter how many times
we try).

4) rf_State_LastState() will then do the biodone(), and return EIO for
the IO in question.

5) Remove some " || 1 "'s from ProcessNode(). These were for
debugging, and we don't need the failure notices spewing
over and over again as the failing DAGs are processed.

6) Needed to change

if (asmap->numDataFailed + asmap->numParityFailed > 1)

to

if ((asmap->numDataFailed + asmap->numParityFailed > 1) ||
(raidPtr->numFailures > 1)){

in rf_raid5.c so that it doesn't try to return
rf_CreateNonRedundantWriteDAG as the creation function.

7) Note that we can't apply the above change to the RAID 1 code as
with the silly "fake 2-D" RAID 1 sets, it is possible to have 2 failed
components in the RAID 1 set, and that would stop them from working.
(I really don't know why/how those "fake 2-D" RAID 1 sets even work
with all the "single-fault" assumptions present in the rest of the
code.)

8) Needed to protect rf_RAID0DagSelect() in a similar way -- it should
return NULL as the createFunc.

9) No point printing out "Multiple disks failed..." a zillion times.
 1.10  30-Dec-2003  oster Some days you wonder if some of the function declaration consistency
was just an accident in the first place. Cleanup function decls and
a few comments. [ok.. so I wasn't going to fix this many.. but once
you're on a roll....]
 1.9  29-Dec-2003  oster - first kick at a major reworking of RAIDframe's memory allocation code:
- all freelists converted to pools
- initialization of structure members in certain cases where
code was relying on specific allocation and usage properties
to keep structures in a "known state" (that doesn't work with
pools!).
- make most pool_get() be "PR_WAITOK" until they can be analyzed
further, and/or have proper error handling added.
- all RF_Mallocs zero the space returned, so there is no difference
between RF_Calloc and RF_Malloc. In fact, all the RF_Calloc()'s
do is tend to do is get things horribly confused.
Make RF_Malloc() the "general memory allocator", with
RF_MallocAndAdd() the "general memory allocator with
allocation list".
- some of these RF_Malloc's et al. are destined to disappear.
- remove rf_rdp_freelist entirely (it's not used anywhere!)
- remove: #include "rf_freelist.h"
- to the files that were relying on the above, add: #include "rf_general.h"
- add: #include "rf_debugMem.h" to rf_shutdown.h to make it happy
about the loss of: #include "rf_freelist.h".

This shrinks an i386 GENERIC kernel by approx 5K. RAIDframe now
weighs in at about 162K on i386.
 1.8  01-Jul-2003  oster branches: 1.8.2;
UpdateNodeHdrPtr() isn't used anywhere. Turf.
 1.7  02-Aug-2002  oster - remove memChunkEnable as an arg to InitHdrNode
 1.6  02-Aug-2002  oster Unused code go bye-bye.
 1.5  13-Nov-2001  lukem branches: 1.5.8;
add RCSIDs
 1.4  04-Oct-2001  oster Step 2 of the disentanglement. We now look to <dev/raidframe/*> for
the stuff that used to live in rf_types.h, rf_raidframe.h, rf_layout.h,
rf_netbsd.h, rf_raid.h, rf_decluster,h, and a few other places.
Believe it or not, when this is all done, things will be cleaner.

No functional changes to RAIDframe.
 1.3  05-Feb-1999  oster branches: 1.3.20; 1.3.22; 1.3.24;
Phase 2 of the RAIDframe cleanup. The source is now closer to KNF
and is much easier to read. No functionality changes.
 1.2  26-Jan-1999  oster RAIDframe cleanup, phase 1. Nuke simulator support, user-land driver,
out-dated comments, and other unneeded stuff. This helps prepare
for cleaning up the rest of the code, and adding new functionality.

No functional changes to the kernel code in this commit.
 1.1  13-Nov-1998  oster RAIDframe, version 1.1, from the Parallel Data Laboratory at
Carnegie Mellon University. Full RAID implementation, including
levels 0, 1, 4, 5, 6, parity logging, and a few other goodies.
Ported to NetBSD by Greg Oster.
 1.3.24.1  11-Oct-2001  fvdl Catch up with -current. Fix some bogons in the sparc64 kbd/ms
attach code. cd18xx conversion provided by mrg.
 1.3.22.2  06-Sep-2002  jdolecek sync kqueue branch with HEAD
 1.3.22.1  10-Jan-2002  thorpej Sync kqueue branch with -current.
 1.3.20.3  13-Aug-2002  nathanw Catch up to -current.
 1.3.20.2  14-Nov-2001  nathanw Catch up to -current.
 1.3.20.1  22-Oct-2001  nathanw Catch up to -current.
 1.5.8.1  29-Aug-2002  gehenna catch up with -current.
 1.8.2.4  04-Mar-2005  skrll Sync with HEAD.

Hi Perry!
 1.8.2.3  21-Sep-2004  skrll Fix the sync with head I botched.
 1.8.2.2  18-Sep-2004  skrll Sync with HEAD.
 1.8.2.1  03-Aug-2004  skrll Sync with HEAD
 1.19.2.1  11-Apr-2004  tron Pull up revision 1.20 (requested by oster in ticket #123):
These changes complete the effective removal of malloc() from all
write paths within RAIDframe. They also resolve the "panics with
RAID 5 sets with more than 3 components" issue which was present
(briefly) in the commits which were previously supposed to address
the malloc() issue.
With this new code the 5-component RAID 5 set panics are now gone.
It is also now also possible to swap to RAID 5.
The changes made are:
1) Introduce rf_AllocStripeBuffer() and rf_FreeStripeBuffer() to
allocate/free one stripe's worth of space. rf_AllocStripeBuffer() is
used in rf_MapUnaccessedPortionOfStripe() where it is not sufficient to
allocate memory using just rf_AllocBuffer(). rf_FreeStripeBuffer() is
called from rf_FreeRaidAccDesc(), well after the DAG is finished.
2) Add a set of emergency "stripe buffers" to struct RF_Raid_s.
Arrange for their initialization in rf_Configure(). In low-memory
situations these buffers will be returned by rf_AllocStripeBuffer()
and re-populated by rf_FreeStripeBuffer().
3) Move RF_VoidPointerListElem_t *iobufs from the dagHeader into
into struct RF_RaidAccessDesc_s. This is more consistent with the
original code, and will not result in items being freed "too early".
4) Add a RF_RaidAccessDesc_t *desc to RF_DagHeader_s so that we have a
way to find desc->iobufs.
5) Arrange for desc in the DagHeader to be initialized in InitHdrNode().
6) Don't cleanup iobufs in rf_FreeDAG() -- the freeing is now delayed
until rf_FreeRaidAccDesc() (which is how the original code handled the
allocList, and for which there seem to be some subtle, undocumented
assumptions).
7) Rename rf_AllocBuffer2() to be rf_AllocBuffer() and remove the
former rf_AllocBuffer(). Fix all callers of rf_AllocBuffer().
(This was how it was *supposed* to be after the last time these
changes were made, before they were backed out).
8) Remove RF_IOBufHeader and all references to it.
9) Remove desc->cleanupList and all references to it.
Fixes PR#20191
 1.20.6.1  19-Mar-2005  yamt sync with head. xen and whitespace. xen part is not finished.
 1.20.4.1  29-Apr-2005  kent sync with -current
 1.21.4.2  03-Sep-2007  yamt sync with head.
 1.21.4.1  21-Jun-2006  yamt sync with head.
 1.22.12.1  28-Mar-2006  tron Merge 2006-03-28 NetBSD-current into the "peter-altq" branch.
 1.22.10.1  19-Apr-2006  elad sync with head.
 1.22.8.1  01-Apr-2006  yamt sync with head.
 1.22.6.1  22-Apr-2006  simonb Sync with head.
 1.22.4.1  09-Sep-2006  rpaulo sync with head
 1.24.14.1  12-Mar-2007  rmind Sync with HEAD.
 1.25.50.1  03-Mar-2009  skrll Sync with HEAD.
 1.25.40.1  04-May-2009  yamt sync with head.
 1.27.16.1  18-May-2014  rmind sync with head
 1.27.12.2  03-Dec-2017  jdolecek update from HEAD
 1.27.12.1  20-Aug-2014  tls Rebase to HEAD as of a few days ago.
 1.27.2.1  22-May-2014  yamt sync with head.

for a reference, the tree before this commit was tagged
as yamt-pagecache-tag8.

this commit was splitted into small chunks to avoid
a limitation of cvs. ("Protocol error: too many arguments")
 1.28.10.1  07-Jan-2017  pgoyette Sync with HEAD. (Note that most of these changes are simply $NetBSD$
tag issues.)
 1.28.6.1  05-Feb-2017  skrll Sync with HEAD
 1.29.34.1  01-Aug-2021  thorpej Sync with HEAD.

RSS XML Feed