# README for the matrix-vector multiplication demo code

## Synopsis

This program implements the multiplication of a matrix and a vector. It is
written in C and has been parallelized using the Pthreads parallel programming
model. Each thread is assigned a contiguous set of rows of the matrix to
work on, and the results are stored in the output vector.

The code initializes the data, executes the matrix-vector multiplication, and
checks the correctness of the results. In case of an error, a message to this
effect is printed and the program aborts. Otherwise it prints a one-line
message on the screen.

## About this code

This is a standalone program, not a library. It is meant as a simple example to
experiment with gprofng.

## Directory structure

There are four directories:

1. `bindir` - after the build, it contains the executable.

2. `experiments` - after the installation, it contains the executable and
also has an example profiling script called `profile.sh`.

3. `objects` - after the build, it contains the object files.

4. `src` - contains the source code and the Makefile to build, install,
and check the correct functioning of the executable.

## Code internals

This is the main execution flow:

* Parse the user options.
* Compute the internal settings for the algorithm.
* Initialize the data and compute the reference results needed for the correctness
check.
* Create and execute the threads. Each thread performs the matrix-vector
multiplication on a pre-determined set of rows.
* Verify the results are correct.
* Print statistics and release the allocated memory.

## Installation

The Makefile in the `src` subdirectory can be used to build, install, and check the
code.

Use `make` at the command line to (re)build the executable called `mxv-pthreads`.
It will be stored in the directory `bindir`:

```
$ make
gcc -o ../objects/main.o -c -g -O -Wall -Werror=undef -Wstrict-prototypes main.c
gcc -o ../objects/manage_data.o -c -g -O -Wall -Werror=undef -Wstrict-prototypes manage_data.c
gcc -o ../objects/workload.o -c -g -O -Wall -Werror=undef -Wstrict-prototypes workload.c
gcc -o ../objects/mxv.o -c -g -O -Wall -Werror=undef -Wstrict-prototypes mxv.c
gcc -o ../bindir/mxv-pthreads ../objects/main.o ../objects/manage_data.o ../objects/workload.o ../objects/mxv.o -lm -lpthread
ldd ../bindir/mxv-pthreads
    linux-vdso.so.1 (0x0000ffff9ea8b000)
    libm.so.6 => /lib64/libm.so.6 (0x0000ffff9e9ad000)
    libc.so.6 => /lib64/libc.so.6 (0x0000ffff9e7ff000)
    /lib/ld-linux-aarch64.so.1 (0x0000ffff9ea4e000)
$
```

The `make install` command installs the executable in directory `experiments`.

```
$ make install
Installed mxv-pthreads in ../experiments
$
```

The `make check` command may be used to verify the program works as expected:

```
$ make check
Running mxv-pthreads in ../experiments
mxv: error check passed - rows = 1000 columns = 1500 threads = 2
$
```

The `make clean` command removes the object files from the `objects` directory
and the executable from the `bindir` directory.

The `make veryclean` command implies `make clean`, but also removes the
executable from directory `experiments`.

## Usage

The code takes several options, but all have a default value. If the code is
executed without any options, these defaults will be used.
To get an overview of all the options supported, and the defaults, use the `-h` option:

```
$ ./mxv-pthreads -h
Usage: ./mxv-pthreads [-m <number of rows>] [-n <number of columns] [-r <repeat count>] [-t <number of threads] [-v] [-h]
  -m - number of rows, default = 2000
  -n - number of columns, default = 3000
  -r - the number of times the algorithm is repeatedly executed, default = 200
  -t - the number of threads used, default = 1
  -v - enable verbose mode, off by default
  -h - print this usage overview and exit
$
```

For more extensive run-time diagnostic messages, use the `-v` option.

As an example, these are the options to compute the product of a 2000x1000 matrix
with a vector of length 1000 using 4 threads, with verbose mode enabled:

```
$ ./mxv-pthreads -m 2000 -n 1000 -t 4 -v
Verbose mode enabled
Allocated data structures
Initialized matrix and vectors
Defined workload distribution
Assigned work to threads
Thread 0 has been created
Thread 1 has been created
Thread 2 has been created
Thread 3 has been created
Matrix vector multiplication has completed
Verify correctness of result
Error check passed
mxv: error check passed - rows = 2000 columns = 1000 threads = 4
$
```

## Executing the examples

Directory `experiments` contains the `profile.sh` script. This script
checks that gprofng can be found and that the executable has been installed.

The script then runs a data collection experiment, followed by a series
of invocations of `gprofng display text` to show various views. The results
are printed on stdout.

To include the commands executed in the output of the script, and store the
results in a file called `LOG`, execute the script as follows:

```
$ bash -x ./profile.sh >& LOG
```

## Additional comments

* Compiler-based inlining is disabled to make the call tree look more
interesting. For the same reason, the core multiplication function
`mxv_core` has inlining disabled through the `void __attribute__ ((noinline))`
attribute. Of course you're free to change this. Doing so does not affect
the workings of the code.

* This distribution includes a script called `profile.sh`. It is in the
`experiments` directory and is meant as an example for (new) users of gprofng.
It can be used to produce profiles at the command line. It is also suitable
as a starting point to develop your own profiling script(s).
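
To make the row distribution from the Synopsis and the `noinline` attribute mentioned above more concrete, here is a minimal sketch in C. It is not the code shipped in `src`: the signature of `mxv_core`, the `work_t` structure, the helper `mxv_run`, the test data, and the way leftover rows are handed to the first threads are all assumptions made for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Work description for one thread (hypothetical layout). */
typedef struct {
    const double *A;   /* m x n matrix, stored row by row */
    const double *b;   /* input vector of length n */
    double *c;         /* output vector of length m */
    int n;             /* number of columns */
    int row_start;     /* first row handled by this thread */
    int row_end;       /* one past the last row handled */
} work_t;

/* Kept out of line so that it appears as its own entry in a profile.
   The argument list is an assumption, not the real signature. */
void __attribute__ ((noinline))
mxv_core(int row_start, int row_end, int n,
         const double *A, const double *b, double *c)
{
    for (int i = row_start; i < row_end; i++) {
        double sum = 0.0;
        for (int j = 0; j < n; j++)
            sum += A[(size_t)i * (size_t)n + (size_t)j] * b[j];
        c[i] = sum;
    }
}

/* Thread entry point: unpack the work description and do the rows. */
static void *thread_main(void *arg)
{
    work_t *w = arg;
    mxv_core(w->row_start, w->row_end, w->n, w->A, w->b, w->c);
    return NULL;
}

/* Multiply an m x n matrix by a vector using nthreads threads.
   Returns 0 if the result matches the reference, 1 if it does not,
   and -1 on an allocation or thread creation failure. */
int mxv_run(int m, int n, int nthreads)
{
    assert(m > 0 && n > 0 && nthreads > 0);

    double *A = malloc((size_t)m * (size_t)n * sizeof *A);
    double *b = malloc((size_t)n * sizeof *b);
    double *c = calloc((size_t)m, sizeof *c);
    pthread_t *tid = malloc((size_t)nthreads * sizeof *tid);
    work_t *work = malloc((size_t)nthreads * sizeof *work);
    int status = -1;

    if (A && b && c && tid && work) {
        /* Row i holds the value i + 1 and b is all ones,
           so the reference result is c[i] = (i + 1) * n. */
        for (int j = 0; j < n; j++)
            b[j] = 1.0;
        for (int i = 0; i < m; i++)
            for (int j = 0; j < n; j++)
                A[(size_t)i * (size_t)n + (size_t)j] = (double)(i + 1);

        /* Contiguous row blocks; the first (m % nthreads) threads
           each take one extra row. */
        int chunk = m / nthreads, extra = m % nthreads;
        int start = 0, created = 0;
        for (int t = 0; t < nthreads; t++) {
            int rows = chunk + (t < extra ? 1 : 0);
            work[t] = (work_t){ .A = A, .b = b, .c = c, .n = n,
                                .row_start = start,
                                .row_end = start + rows };
            start += rows;
            if (pthread_create(&tid[t], NULL, thread_main, &work[t]) != 0)
                break;
            created++;
        }
        for (int t = 0; t < created; t++)
            pthread_join(tid[t], NULL);

        if (created == nthreads) {
            status = 0;
            for (int i = 0; i < m; i++) {
                double diff = c[i] - (double)(i + 1) * (double)n;
                if (diff > 1.0e-9 || diff < -1.0e-9)
                    status = 1;
            }
        }
    }

    free(A); free(b); free(c); free(tid); free(work);
    return status;
}
```

The block defines no `main` function. Calling `mxv_run(1000, 1500, 2)` from one uses the same dimensions and thread count as the `make check` run shown earlier, and the sketch links with `-lpthread` in the same way as the build output. Because each thread writes only to its own rows of the output vector, no locking is needed in this sketch.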