TEST PLAN for NSD.

By W.C.A. Wijngaards, July 2006, NLnetLabs.


1. Introduction
---------------
NSD 3 contains far more features than a typical point release. These
features need to be tested and checked to make sure they work well.
This document describes a plan to test all the features that have
been added to NSD.

Regression testing is also very important. The old features must
remain working. We have a set of tpkg packages to help with it.
We also have root-trace speed tests to regression test NSD.

The feature tests are to be automated, using tpkg packages where
possible.

2. Minor Features
-----------------
Some minor features for the test:

2.1. DNAME
----------
DNAME support - there are already extensive DNAME tests.
(closed).

2.2. NSEC3
----------
NSEC3 support
    - use the Perl automated nsec3 test
    - port to tpkg perhaps.

Note NSEC3 hash length byte to be implemented, test against others.
Test interoperability of that. A simple zone transfer with BIND.
(experimental, no need to test any more).

2.3. NSID
---------
Would make a nice nsid.tpkg package.

NSID support - run NSD with different NSIDs and queries.
    a- test NSID with zero length, query with NSID.
    b- very long length, query with NSID
    c- 012345678 and query for different things.
        1- query OK things.
        - query error
        2- nxdomain,
        3- loop,
        4- nodata.
        5- query error (bad queries, wrong zone)
        - is NSID present.
    d- NSID and TSIG.
        1- query has OK TSIG
        2- query has BAD TSIG
        3- query for nxdomain
        4- bad query, wrong zone
    e- configure NSID from config file?
        - is this possible

    f- test if NSID in NOTIFY responses. (should it be there?)
        ldns-notify and parse result packet for nsid.
    g- test if NSID in AXFR responses. (should it be there?)
        drill axfr <zone> and see if nsid in result packets.
(experimental, so low priority).
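The NSID tests need queries that carry the NSID option. A minimal sketch
of building such a query by hand is given here, in Python. It assumes
EDNS0 option code 3 for NSID (the code later assigned in RFC 5001) and an
empty option payload in the query. The name build_nsid_query, the query
id and the 4096 byte UDP payload size are illustrative choices, not part
of NSD or ldns.

```python
import struct

NSID_OPTION_CODE = 3   # assumption: the code assigned to NSID in RFC 5001
OPT_RR_TYPE = 41       # EDNS0 OPT pseudo-RR type


def build_nsid_query(qname, qtype=1, query_id=0x1234, udp_size=4096):
    """Return the wire format of a DNS query with an empty NSID option."""
    # Header: id, flags (plain query), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=1
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 1)

    # Question: length-prefixed labels, root label, then QTYPE and QCLASS=IN
    labels = [label for label in qname.split(".") if label]
    question = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels)
    question += b"\x00" + struct.pack("!HH", qtype, 1)

    # NSID option in a query: option code, option length 0, no payload
    nsid_option = struct.pack("!HH", NSID_OPTION_CODE, 0)

    # OPT RR: root name, type, class (UDP payload size), TTL (ext. rcode,
    # version, flags all zero), RDLENGTH, then the option itself
    opt_rr = b"\x00" + struct.pack(
        "!HHIH", OPT_RR_TYPE, udp_size, 0, len(nsid_option)
    ) + nsid_option

    return header + question + opt_rr
```

The returned bytes can be sent over a UDP socket to the server under
test. A server that supports NSID is expected to answer with the same
option code and its identifier as the option payload, which a test can
then look for in the OPT record of the reply.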
(need a way to send NSID enabled queries - no test).

3. Transfers
------------
For the transfers the tests are to be done using
    - NSD as a master (AXFR) in 3.1, or an ldns-ixfr miniserver (IXFR)
      as a master in 3.2 that serves pre-made ixfr answers.

The zone transfer tests can be put in one tpkg by using servers
at different ports. The allow- lines are then for localhost, all
ports (since the sending process uses ephemeral ports, all must
be allowed). The request- lines contain the correct port numbers
to send to.

3.1. AXFR
---------
3.1.1. AXFR features
--------------------
Setup is a secondary zone which requests transfers from a master.
The master zone is updated. Then, the secondary should be
informed with the notify: statements.
Then test if the secondary got the same zone as the master, by doing
an axfr from both servers and checking that the contents and the
serial nr. are the same.

Tests 3.1.1 can be one tpkg.

(with serial numbers for SOA, to perform serial rollover).
    - secondary starts with a zone without content (soa=1)
      so the zone is only mentioned in the config, the zonefile is
      empty/nonexistent on the slave. Master has a soa and three text
      records.
    - axfr an empty zone - only the SOA (soa=2)
    - axfr a zone with only little data. (soa=3)
      some NS, MX, A, AAAA records.
      [NOTE: apparently, due to the linked list mgt in domains (of rrset*)
      the ordering of rrtypes for a domain is reversed after a zone transfer
      for NSD, i.e. for query type=any. Ordering within an rrset is preserved.
      Created fix to ordering, but is slow for many rr types... ]
    - old zone unsigned, new zone signed. (soa=4)
      sign with two KSKs and one ZSK. And a prepublished ZSK in the zone.
        ZSK1: Kexample.com.+005+44537
        ZSK2: Kexample.com.+005+03824 (prepublish)
        KSK1: Kexample.com.+005+53988
        KSK2: Kexample.com.+005+25320 (presign)
    - old zone signed, new zone unsigned. (soa=5)
      different zone contents: some names are still there unchanged,
      some names have changed RRs, some names have different RR types,
      some names are removed and some names are added.
        www: unchanged (including nsec,rrsig).
        webmail: mail prio changed.
        printer: name removed.
        terms: different RR types, now A type.
        mail: type TXT added.
        apex: type DNSKEY, nsec, rrsig removed.
        newservice, ooo: new names
    - new zone with nsec3. (soa=6)
        iter=33 salt=AA44FF11
      slave detects NSEC3 settings.
    - new parameters for nsec3. (soa=7)
        iter=1078 salt=00998877665544332211AADDCCFF
      slave detects NSEC3 settings.
    - new zone no longer uses nsec3. (soa=8)
      uses nsec.
    - axfr an empty zone (only the SOA) (soa=9 + a lot for serial wraparound)
        2**31 = 2147483648
        9 + 2**31 : 2147483657
      also tested
        9 + 2**31-3 -- works. notify and transfer, zone updated.
        9 + 2**31-2 -- works. notify and transfer, zone updated.
        9 + 2**31-1 -- notify works, but at transfer time 'serial old'.
                       fixed: works, zone updated.
        9 + 2**31   -- notify is ignored, 'serial old'.
        9 + 2**31+1 -- notify is ignored, 'serial old'.
    - axfr wraparound zone, couple txts. (soa=2)
      serial=2 works after serial=9 + 2**31-2 before.
These can be done in order.

Test done for AXFR. RRset-type ordering preservation fixed. Serial
rollover fixed. Serial printed as unsigned.

3.1.2. Huge xfr - see test tpkg for this.
It works already.

3.1.3. AXFR and TSIG
Like 3.1.1. but enable tsig.
Tested, it works.

3.2. IXFR
---------
3.2.1. IXFR Features
--------------------
Setup is a secondary with ldns-mini-ixfr server as a master.
(ldns/examples/nsd-test/ldns-testns.c).
The mini ixfr server responds with canned replies to an IXFR query.

    - secondary loaded with only the soa = 1. notified with 10.
      ixfr server responds with soa=1. (i.e. no update available).
    - same, ixfr server responds with soa=2, and TC (udp).
      on tcp connection, it responds with a simple difference package.
        SOA2 SOA1 SOA2 newTXTA newTXTB SOA2
      adds a couple records.
    - to soa is 3, and this one removes the TXT records,
      makes a new TXTB record and a TXTC record.
      test that TXTA domain does not exist (NXDOMAIN).
      test that TXTB is updated.
      test that TXTC exists.
    - from 3 to 5, with 4 in between.
      make an ixfr packet that is 3 .. 4 and 4 .. 5 concatenated without
      compression, so it means more work for processing.
      So, in the ixfr packet TXTB is removed from 3, added in 4, removed
      from 4, added in 5.
      Also in version 4 a txt5 record is added, which stays around.
    - from 5 to 7, but this time the redundant work is removed from the
      ixfr packet.
    - 7 to 8, have one domain name where two RR types exist, A and DNAME.
      remove one RR type in IXFR then make sure the other type still exists.
    - 8 to 9, have a name with many different A records. Remove one A record
      from it. Add another A record to it. Test if the rest is there.
      [Note: when you delete one RR from an RRset, the ordering of the RRset
      changes, the contents of the rrset get shuffled (last put in empty slot).]

3.2.2. a huge ixfr.
-------------------
create test for it. Should be several MBs worth of data removed, MBs of
data that stays the same, and MBs that are added.
To make sure that the code can handle multiple packet IXFRs, and the
state memory between IXFR packets.
    - created testns version for multiple packet reply. Small, multiple
      packets.
      Test from 3.1. but using multiple packets: one RR per packet.
      This test also falls over from udp to tcp for ixfr.
    - This works. The MBs in size is tested in huge axfr test already.

3.2.3. Test remove domain
-------------------------
    - is_existing = 0 used to remove a domain. Check and test carefully.
    - test delete middle name
      i.e. you have a zone with:
        c.example.com TXT "x"
        b.c.example.com TXT "x"
        a.b.c.example.com TXT "x"
      and you delete the b.c. record. The b.c becomes an empty
      nonterminal. If you then delete a.b.c. TXT, the b.c becomes
      NXdomain.
      [- fixed delete with IXFR for empty nonterminals.]
    - test delete/add a domain and NXdomain/exist replies.
        - tested in 3.2.1 already, works.
    - test delete domain and wildcard replies.
        - Tested, it works, fix for IXFR that makes empty nonterminals.

3.3. Timeouts
-------------
Get zone to expire. Check it does not answer.
    Start only a secondary server, no master. Set expire timeout short.
    Timers set as refresh=1 retry=1 expire=10 minimum=10
Provide an update. Check it does answer again.
    Start up the master server after a while. Transfer should happen
    within the retry interval.
Wait for zone to expire again. Check that.
Provide old zone on the master, after expire the slave must transfer it.
The above works, old zone is transferred and served.

Test that the master says that serial number is OK, in 3.2.1 tests.
This test also includes IXFR reply from the master that contains AXFR contents.

3.4. TSIG zone transfers
------------------------
Already TSIG tpkg tests, with transfers TSIG protected, so that is ok.
TSIG notifies - test it, create test for it.
    - notify accepted, nsd->nsd notify
      by starting master and slave server with tsig keys
      for a zone, update zone at master.
      Done in 3.1.3.
    - notify refused. nsd->nsd notify
      same but use different secret at one server.
      Test done.

4. IPC
------

4.1. deadlocks
--------------
Have 100,000 zones, all with short SOA timeouts, expire=1 sec. refresh=10.
Expire very quickly. This gives many messages from xfrd to server.
Send notifies to the server in a loop from a shell script. Lots of
messages the other way around.
Provide a master server that will serve all the zones (and say they are ok).

Then proceed to send queries for the zones to the server and see if you
get answers. Wait for an hour and try again.

Result, the IPC works okay, but xfrd uses much memory, 16 KB for TSIG
regions, per zone. With the 2.5 KB in xfrd that is almost 20 KB per zone,
or 2 GB for 100,000 zones.
A bit much memory, for the largely unused tsig regions.
Fixed, tsig for xfrd uses no preallocated worst case memory use, but only
a small footprint. During use this may grow; about 1 KB per zone perhaps.

About 2.5 KB per secondary zone in xfrd, below 1 KB for a master zone,
that works out for 100,000 secondary zones as 250 MB for xfrd.

Perhaps do also with 100 child servers for the NSD. See if it can keep
up and the result if it cannot keep up sending to child servers.
Since it has to send for each zone to each child a message, this will
take more resources.
Tested, it cannot keep up. Child servers operate using old zone status
of expired/ok, also the machine load is 100%.
Also fixed tsig.other_size to be checked when reading TSIG from network.

Due to the length and size, more an incidental test, but can be tpkg-ed.

4.2. IPC FORKS
--------------
Infinite loop of reloads on a server. Has 10 child servers. Wait.
See if it runs out of sockets, file descriptors, etc.
Incidental test.
Tested, with adjusted source that repeats reloads. This puts strain on the
reload ipc handshake code. And ipc socket code. It works fine.

5. Random mess test
-------------------
Setup 7 servers. In master->intermed->slave,
with multiple master(2), intermed(3) and slave(2) servers.
TSIGs (different) for everyone.
Perhaps also include never respond entries (fake address) in acls.

    - Load random SOA + random data in servers.
      Backup the setup so it is repeatable.
      Let them work out what version to run.
    - Provide updated zone for a master.
      See what happens.
    - Send notifies to the slave servers.
    - Send notifies to the intermed servers.
    - Send notifies to the master servers.
    - Kill some server. Start it again.
    - Kill some server & delete some file (ixfr.db or xfrd.state).
        - delete ixfr.db
        - delete xfrd.state
        - delete ixfr and xfrd files.
    - run nsdc patch on a server.
    - pretend an intermediary was offline for a long time
      with old zone files and old ixfr.db and xfrd.state(!!) files.
      and see what happens :-)
      It should refresh/expire and so on, based on timers in xfrd.state.

Tested:
    - nsd returns formerr on IXFR queries because of data in NS section.
      But this is correct, fixed NSD, so it is no longer formerr, but
      refused / not authorised instead. (or whatever we put in axfr.c).
    - depending on which server they are asking, servers will use one of
      the master zones (after expiry time exceeded). If master updated,
      intermediaries, then slaves update themselves too.
    - NSD would not start with a corrupt diff file. Now logs error and
      ignores, fixes, the diff file.

6. Portability test
-------------------
Port NSD to as many platforms as possible
    - local: sparc5(ok), alpha(ok), amd64/OpenBSD(Jelte, at home),
      open=FreeBSD(ok), linuxes(ok), MacOSX(ok), SunOS4(ok).
    - sf compilefarm for more.
        - x86-linux2 has ip6 disabled. Tests don't work with that.
    - minix3 if we can get it working (the minix3 setup fails somehow).

    - would be good to have a test set of tpkg (and tools required) to
      run after a port-test. A very portable set of tpkgs.
        OSTYPE: (g)make. autoreconf. (g)indent.
            -> defaults for * systems.
        dig 8.3 too old (format of output). Need 9+.
            however dig/bind is not portable enough.
        ldns: pcat, pcat-diff, pcat-print. xfr1,2:nsd-ldnsd. pcat-grep.pl
        manual: md5sum/md5. hping(sudo).
        long: ldns-testns.
Made tests more portable, ran tests on Linux, FreeBSD, Solaris.
Full testset run on SPARC/SunOS2.5, and fixed two unaligned memory accesses,
all tests succeed now. Full testset runs on PowerPC/MacOSX.

7. CODE REVIEW
--------------
Code has already had 1x review by Wouter, some review by Miek.
More review (again), Jelte, Wouter.
    - Do some spots of interest.
    - perhaps a full review as well.

8. todo-tests ideas
-------------------
These would be nice as tpkgs, but perhaps manual tests are needed.

8.1. test combinations of configure options and shells
------------------------------------------------------
"
run tests with different shells, aka ==-bug
    bash 1.1. is too old for [[ in tpkg and tests.
    Some hosts have awk that puts a space before .pre files in tpkg.
    Some hosts have bash in /usr/local/bin so tpkg fails on that.
run tests with different configure options and combinations
of them.
    Many tests fail with disable-ipv6.
implement this in a xen-like environment so that different OSs can
be checked.
run this daily or only when subversion changes
for each test, run our "test-suite"
"

8.2. patch file remove
----------------------
rm patch file, check xfrd's behavior. Refetches zones.
    Checked in section transfer_axfr.

8.3. 64 bit
-----------
GB 64 bit file size transfers. On alpha so nastiest
alignment on 64bit machine. Do transfer of > 4 GB zone.
Needs lots of memory (swap space) and disk space.
    Not done; no host for test.

8.4. Valgrind
-------------
run with valgrind - on two nsds.
then do the nsd-nsd test, and notify the master to get an axfr
to happen, with tsig enabled as well.
    Done, found one uninit variable.

8.5. Chroot
-----------
test chroot and the new files/directories.
(And the file/dir not in chroot problem, and if all is OK that it works).
    Done, default locations for ixfr.db and xfrd.state have full pathnames.

8.6. nsdc
---------
In temporary test setup above, test nsdc tool.
    works.

Make sure that if nsdc patch breaks a zone transfer in progress it is
reattempted later on.
    hard to test.

8.7. nsd-patch
--------------
nsd-patch - run nsd patch and compare zone files, like AXFR/IXFR tests.
    Done test axfr run, or test-mess.

8.8. gcov
---------
gcov to look at code coverage of the tests. Tests added to improve coverage.
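
Note on the serial rollover cases in 3.1.1: the results listed there are
what one would expect from serial number arithmetic as defined in
RFC 1982, where a serial counts as newer only if it is ahead of the old
one by less than 2**31. A minimal sketch in Python follows. It assumes
that the secondary holds serial 9 when the rollover serials are offered
(the plan does not state the baseline explicitly), and the function name
serial_newer is illustrative, not taken from the NSD source.

```python
SERIAL_MOD = 1 << 32    # SOA serials are 32 bit unsigned values
SERIAL_HALF = 1 << 31   # increments of 2**31 or more do not count as newer


def serial_newer(new, old):
    """Return True if `new` is a later serial than `old` (RFC 1982 style)."""
    diff = (new - old) % SERIAL_MOD
    # diff == 0 is the same serial; diff == 2**31 is undefined in RFC 1982
    # and is treated here as "not newer", matching the 'serial old' result.
    return 0 < diff < SERIAL_HALF


# The cases from 3.1.1, with 9 as the assumed serial on the secondary.
baseline = 9
for offset in (2**31 - 3, 2**31 - 2, 2**31 - 1, 2**31, 2**31 + 1):
    candidate = (baseline + offset) % SERIAL_MOD
    print(candidate, serial_newer(candidate, baseline))
```

Under this assumption the first three serials are accepted and the last
two are seen as 'serial old'. The wraparound step also fits:
serial_newer(2, 9 + 2**31 - 2) is true, so serial=2 is accepted after
serial=9 + 2**31-2.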