---
title: "Introduction to Trial Design"
author: "Adam H. Sparks"
date: "2024-09-05"
institute: Curtin Biometry and Agricultural Data Analytics
from: markdown+emoji
format:
  aagi-revealjs:
    incremental: true
    html-q-tags: true
bibliography: references.bib
---
```{r}
#| include: false
library(agricolae)
library(gt)
library(tidyverse)
library(AAGIThemes)
library(AAGIPalettes)
# set ggplot2 theme
theme_set(theme_aagi())
```
## Outline
Interactive session with exercises throughout
- Role of the Experimental Design
- Local Controls of Variability
- Replication
- Randomisation
- Blocking
## Outline
Designs
- Complete Randomised Design;
- Randomised Complete Block Design (RCBD);
- Split plot and
- OFE Strip Trial
# Role of Experimental Design{background-image="_extensions/AAGI-AUS/aagi/assets/title-slide-main.png" background-size="1050px auto" background-position="50% 85%"}
## Good Experimental Design
![](./assets/dartboard-and-dart-svgrepo-com.svg){fig-align="center" width="80%"}
:::notes
A good design should address the objectives of the experiment/hypothesis to be tested
- Introducing experimental design in the trial aims at:
  - Reducing or at least controlling the error (variation);
  - Attaining maximum information, precision and accuracy of the result, by producing the best unbiased estimates of the treatment means; and
  - The most efficient use of existing resources.
:::
## Poor Experimental Design{.nostretch}
![<span style="font-size:0.25em;">From: *The Hunt for Red October* (1990)</span>](./assets/wrong_conclusions.gif){fig-align="center" width="80%"}
:::notes
Keep in mind that nothing can compensate for a poor design.
Start by consulting with AAGI first to help ensure that you don't waste your time and resources.
Poor design leads to:
- Wrong conclusions and wasted resources; and
- Data for which no valid statistical analysis is possible.
:::
# Sources and Controls of Variability{background-image="_extensions/AAGI-AUS/aagi/assets/title-slide-main.png" background-size="1050px auto" background-position="50% 85%"}
## Data Collection
The technique used for collecting the data affects variation and may introduce bias, *e.g.*:
- Bad calibration of the measuring equipment;
- Human error when measuring/recording;
- Operator/scorer differences in measuring/assessing (inter- and intra-rater repeatability).
## Trial Placement
![](assets/3-soils-paddock.png){fig-align="center"}
:::notes
Consider an example paddock that has three soil types.
Selecting uniform (homogeneous) plots as the experimental units produces a small experimental error variance in small plot trials.
In field trials, soil fertility and moisture trends usually affect the uniformity of the plots.
This variability also affects large, paddock-scale OFEs.
:::
## Small Plot Trial Placement
![](assets/3-soils-paddock-4-small-blocks-1-soil){fig-align="center"}
:::notes
Placing the small plot trials in an area of one soil type reduces the variability and increases uniformity of the plots.
:::
## OFE Paddock Scale or Strip Design Trial Placement
![](assets/3-soils-paddock-full-length-OFE.png){fig-align="center"}
:::notes
However, with an OFE trial, we want to cover these differences.
Here we have plots that run the entire length of the paddock and cover the differences in soil types.
This means that these plots can be used to determine how the different treatments might act in other parts of the paddock.
:::
## Blocking
Blocking
- the plots are grouped in blocks such that the variability of the plots within the blocks is less than that among all plots prior to grouping.
:::notes
Among the most frequently used criteria for blocking are: proximity and homogeneity (field plots); time; physical characteristics (age, height, weight); and management/agronomic practices.
Failure to use blocks will result in distortion of relative means and inflated standard errors.
:::
## Blocking
- In field trials blocking is often done on the basis of soil fertility and moisture trends, *i.e.*, soil homogeneity;
- On a sloping trial site the moisture differs at different levels of the slope;
- Blocks are usually chosen at different levels up the slope so that the difference in moisture between blocks is maximised and the difference in moisture within the blocks is minimised;
- A trial site known to have different soil fertility trends should be blocked accordingly, separating the blocks based on the soil quality.
## Blocking Example
![<span style="font-size:0.25em;">Source: Prof Sarita Bennett, Curtin University</span>](assets/Plate_1_S_Bennett.jpg){fig-align="center" height=500}
::: notes
Here's a chickpea experiment where the lower block was affected with higher levels of Ascochyta blight and browning off following waterlogging during the winter.
Without blocking the trial would have been of no use.
:::
## Blocking Exercise
![<span style="font-size:0.25em;">Source: Prof Sarita Bennett, Curtin University</span>](assets/Plate_1_S_Bennett.jpg){fig-align="center" height=300}
::: {.callout-warning icon=false}
## {{< fa clock >}} Exercise (2 min)
What do you think blocking did that made this experiment useable?
:::
::: notes
Here's a chickpea experiment where the lower block was affected with higher levels of Ascochyta blight and browning off following waterlogging during the winter.
Without blocking the trial would have been of no use.
:::
## Blocking
It is important to avoid *confounding* when defining blocks.
Examples:
- Time of sowing (TOS) is a treatment, but each TOS is assigned to a different block in the field;
- Seeding rate (SR) is a treatment, but each SR is applied to a different block in the field;
- Nitrogen rate (NR) is a treatment, but each NR is applied to a different block in the field;
- To avoid confounding, a complete block should contain every treatment exactly once.
:::notes
In poorly designed trials, blocks can be confounded with some treatments.
In such cases the statistical model cannot distinguish between the variation due to the blocking and the variation due to the treatment.
:::
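
The effect of confounding can be seen in a minimal sketch. It is not part of the original trial material: the yields are simulated, and the names `confounded`, `block`, `tos` and `yield` are illustrative assumptions. Because each time of sowing occupies exactly one block, the model cannot separate the block effect from the treatment effect and R reports the aliased coefficients as `NA`.

```r
# Simulated example only: three blocks of four plots, with each time of
# sowing (TOS) applied to one whole block, so TOS is confounded with block.
set.seed(1) # arbitrary seed
confounded <- data.frame(
  block = factor(rep(1:3, each = 4)),
  tos   = factor(rep(c("early", "mid", "late"), each = 4))
)
confounded$yield <- rnorm(nrow(confounded), mean = 1200, sd = 100)

# Block and TOS carry identical information, so the design matrix is rank
# deficient and the TOS coefficients cannot be estimated.
fit <- lm(yield ~ block + tos, data = confounded)
coef(fit)

# Number of coefficients that could not be estimated (aliased)
n_aliased <- sum(is.na(coef(fit)))
n_aliased
```

Randomising every time of sowing within every block removes the aliasing, because each block then contains all treatment levels.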
## Blocking Barley Varieties
```{r}
#| label: blocking-table
#| message: false
#| echo: false
#| tbl-cap: Yield of barley varieties A, B, C and D in kg/ha.
replicates <- tibble(
  replicate = as_factor(c(1L, 2L, 3L, 4L, 5L)),
  A = c(1120L, 880L, 1120L, 1240L, 1310L),
  B = c(1240L, 940L, 1250L, 1360L, 1440L),
  C = c(1360L, 1080L, 1440L, 1340L, 1460L),
  D = c(1480L, 1170L, 1570L, 1420L, 1560L)
)
replicates |>
  gt() |>
  tab_options(table.font.size = 28) |>
  theme_gt_aagi()
```
:::notes
The blocking structure here is introduced by replicates.
:::
## Blocking Exercise
```{r}
#| label: blocking-table-2
#| message: false
#| echo: false
#| tbl-cap: Yield of barley varieties A, B, C and D in kg/ha.
replicates |>
  gt() |>
  tab_options(table.font.size = 28) |>
  theme_gt_aagi()
```
::: {.callout-warning icon=false}
## {{< fa clock >}} Exercise (2 min)
What pattern or patterns can you see in these data?
:::
:::notes
Spend a few minutes and look at this table.
Can you see patterns?
What are they?
:::
## Blocking Barley Varieties
```{r}
#| label: blocking-table-facet-figure
#| message: false
#| echo: false
#| fig-cap: Barley variety on the x-axis by yield (kg/ha) on the y-axis with replicate represented as shape.
replicates |>
  pivot_longer(!replicate, names_to = "variety", values_to = "yield") |>
  ggplot(aes(
    x = variety,
    y = yield,
    shape = replicate
  )) +
  geom_point(size = 4.5, colour = "#414042") +
  ylab("yield (kg/ha)")
```
:::notes
Does the pattern stand out more now that we've plotted the data?
:::
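
One way to quantify what the table and plot suggest is to fit the same barley yields with and without the replicate (block) term. The following is a sketch in base R rather than the analysis used for the original trial; the object names (`barley`, `fit_no_block`, `fit_block`, `residual_ms`) are illustrative assumptions.

```r
# Barley yields (kg/ha) from the table above, rearranged into long format
barley <- data.frame(
  variety   = factor(rep(c("A", "B", "C", "D"), each = 5)),
  replicate = factor(rep(1:5, times = 4)),
  yield     = c(
    1120,  880, 1120, 1240, 1310, # variety A, replicates 1 to 5
    1240,  940, 1250, 1360, 1440, # variety B
    1360, 1080, 1440, 1340, 1460, # variety C
    1480, 1170, 1570, 1420, 1560  # variety D
  )
)

# Model that ignores the blocks and model that accounts for them
fit_no_block <- aov(yield ~ variety, data = barley)
fit_block    <- aov(yield ~ replicate + variety, data = barley)

# Residual mean square = residual sum of squares / residual degrees of freedom
residual_ms <- function(fit) deviance(fit) / df.residual(fit)
ms_no_block <- residual_ms(fit_no_block)
ms_block    <- residual_ms(fit_block)

summary(fit_no_block)
summary(fit_block)
```

With the replicate term in the model, the variation between replicates is removed from the residual, so the residual mean square is much smaller and the varieties are compared against a smaller error.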
## Blocking Exercise
![<span style="font-size:0.25em;">Source: Dr Karyn Reeves, SAGI West</span>](assets/Plate_4_K_Reeves.png){fig-align="center"}
::: {.callout-warning icon=false}
## {{< fa clock >}} Exercise (2 min)
Which design is valid?
What makes one invalid and the other valid?
:::
## Replication
**Replication** implies independent repetition of the basic experiment.
Replication is considered very important for valid experimental results due to the fact that it:
- provides the means to estimate the experimental error variance;
- provides the capacity to increase the precision of the estimates of the treatment means;
- demonstrates the reproducibility of the results under current experimental settings; and
- provides additional data in case of inconsistent results (*e.g.*, outliers caused by environmental conditions such as waterlogged plots or plots damaged by birds or elephants).
## Replication
### How Many?
- The number of replications affects the precision of treatment means estimates
- Should be chosen to provide acceptable power of statistical tests to identify differences between the means of treatment groups
- Consider whether the planned level of replication may be expected to give standard errors which are acceptably small
- Provided there is no huge variability, 2 to 3 replicates should be sufficient, meaning:
  - the area allocated to the trial is relatively homogeneous; and
  - the traits of interest are reasonably variable
:::notes
1. The level of replication is usually determined mainly by the resources available.
2. What counts as an acceptably small standard error depends on the nature of the experiment and treatments.
:::
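
The link between replication and precision can be illustrated with a short calculation. This is a sketch under an assumed plot-to-plot standard deviation (`sigma`, an illustrative value that is not taken from any trial): the standard error of a treatment mean is `sigma / sqrt(r)` and the standard error of a difference between two means is `sqrt(2 * sigma^2 / r)`, where `r` is the number of replicates.

```r
# Assumed plot-to-plot standard deviation in kg/ha (illustrative value only)
sigma <- 150
n_reps <- 2:8

# Standard error of a treatment mean for each level of replication
se_mean <- sigma / sqrt(n_reps)

# Standard error of the difference between two treatment means
sed <- sqrt(2 * sigma^2 / n_reps)

data.frame(n_reps, se_mean = round(se_mean, 1), sed = round(sed, 1))
```

The standard errors shrink with the square root of the number of replicates, so quadrupling the replication is needed to halve them; this is why resources usually limit replication before precision does.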
## Pseudo-replication
![<span style="font-size:0.25em;">Source: Dr Karyn Reeves, SAGI West</span>](assets/Plate_5_K_Reeves.png){fig-align="center"}
## Pseudo-replication
![<span style="font-size:0.25em;">Source: Dr Karyn Reeves, SAGI West</span>](assets/Plate_5_K_Reeves.png){fig-align="center"}
::: {.callout-warning icon=false}
## {{< fa clock >}} Exercise (5 min)
What makes this design invalid?
What would you suggest doing to make it a valid design?
:::
## Randomisation
![](assets/math-lady.gif){fig-align="center" height="350"}
:::notes
**Randomisation** is the random assignment of treatments to plots (experimental units).
- Randomisation is used to avoid:
  - systematic bias;
  - selection bias;
  - accidental bias; and
  - cheating.
Consider a paddock that has a gradient in the soil.
If you put three treatments in the same order across the paddock, one treatment will always be in the best part of the soil in the block and one in the worst, which biases the conclusions you can draw about your treatments.
:::
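
The soil-gradient argument can be sketched with a small simulation. Everything here is assumed for illustration: the treatment labels, the number of blocks, the `slope` of the gradient and the helper `layout_means()`. Yield depends only on plot position, so any difference between treatment means is an artefact of the layout, not a treatment effect.

```r
set.seed(2024) # arbitrary seed
trts <- c("T1", "T2", "T3")
n_blocks <- 4
slope <- 100 # assumed yield gain (kg/ha) per plot position along the gradient

# Build a layout with the given ordering rule and return the treatment means.
# Yield is driven by position alone; there is no true treatment effect.
layout_means <- function(order_fun) {
  plots <- do.call(rbind, lapply(seq_len(n_blocks), function(b) {
    data.frame(block = b, position = 1:3, treatment = order_fun(trts))
  }))
  plots$yield <- 1000 + slope * plots$position
  tapply(plots$yield, plots$treatment, mean)
}

systematic <- layout_means(identity) # same treatment order in every block
randomised <- layout_means(sample)   # fresh random order in every block

systematic
randomised
```

In the systematic layout the last treatment always sits on the best soil, so it appears to outyield the first by twice the slope even though the treatments are identical. Random ordering cannot make this spurious gap larger and, on average over randomisations, removes it.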
# Trial Designs{background-image="_extensions/AAGI-AUS/aagi/assets/title-slide-main.png" background-size="1050px auto" background-position="50% 85%"}
## Trial Designs
We will discuss the following four designs:
- Complete Randomised Design (CRD);
- Randomised Complete Block Design (RCBD);
- Split-plot; and
- OFE paddock-scale
## Complete Randomised Design (CRD)
- CRD is the simplest design without blocking
- Treatments are allocated to the plots at random
- *CRD is most useful in experimental settings where there are no other sources of variation than treatments*
## Complete Randomised Design (CRD)
:::{.nonincremental}
- CRD is the simplest design without blocking
- Treatments are allocated to the plots at random
- *CRD is most useful in experimental settings where there are no other sources of variation than treatments*
:::
:::{.callout-note}
Uncommonly used in agricultural paddocks for this reason
:::
:::notes
I won't spend much time on this one. I wanted to mention it because some of the examples sent through by GGA referred to this layout when they were really randomised complete block designs, which is the next design I'll talk about.
:::
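
A CRD layout can be generated by shuffling the full list of treatment labels across all plots, with no blocking. This is a minimal base R sketch; the treatment labels, the number of replicates and the seed are assumptions chosen for illustration.

```r
set.seed(101) # arbitrary seed so the layout can be reproduced
trts <- c("A", "B", "C", "D")
n_reps <- 5

# Repeat each treatment n_reps times, then shuffle across every plot at once
crd <- data.frame(
  plot      = seq_len(length(trts) * n_reps),
  treatment = sample(rep(trts, times = n_reps))
)

# Each treatment appears n_reps times, but in no particular grouping
table(crd$treatment)
head(crd)
```

Because the shuffle ignores plot location, nothing prevents several plots of one treatment from falling together on the same patch of soil, which is why this design suits only very uniform experimental areas.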
## Randomised Complete Block Design (RCBD)
RCBD is an experimental design with one blocking criterion, usually replicates.
**Each treatment occurs the same number of times (usually once) in every block, and treatments are randomised within each block.**
## Randomised Complete Block Design (RCBD) Exercise
::: {.callout-warning icon=false}
## {{< fa clock >}} Exercise (5 min)
Thinking back to the earlier barley yield example and following the description of blocking and randomisation, draw a trial map that has four varieties, "A", "B", "C" and "D" and five replicates (blocks) to test varietal differences in yield.
Each variety should be represented in each replicate only once.
***Recognise that this is just an exercise; randomising by hand is not recommended.***
*Random number tables or a sequence of random numbers generated by a computer program are preferred.*
*AAGI can help with this.*
:::
## Randomised Complete Block Design (RCBD)
```{r}
#| label: rcbd-agricolae
#| echo: false
trt <- c("A", "B", "C", "D")
design <- design.rcbd(trt = trt, r = 5, seed = -513, serie = 2)
design <- as.data.frame(design$sketch)
colnames(design) <- c("Col 1", "Col 2", "Col 3", "Col 4")
rownames(design) <- c("Rep 1", "Rep 2", "Rep 3", "Rep 4", "Rep 5")
gt(design, rownames_to_stub = TRUE) |>
theme_gt_aagi()
```
:::notes
Here the replicates are the rows of the layout.
This was generated in R using the {agricolae} package, which specialises in designing agricultural trials like this, so it has been randomised programmatically.
:::
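
The principle behind the layout can also be sketched in base R without any package: shuffle the complete set of treatments separately for every block. This is an illustration of the idea, not the routine used to build the table above; the seed and object names are assumptions.

```r
set.seed(101) # arbitrary seed so the layout can be reproduced
trts <- c("A", "B", "C", "D")
n_blocks <- 5

# One independent shuffle of all treatments per block; rows are replicates
rcbd <- t(replicate(n_blocks, sample(trts)))
dimnames(rcbd) <- list(
  paste("Rep", seq_len(n_blocks)),
  paste("Col", seq_along(trts))
)
rcbd
```

Every row contains each variety exactly once, which is what makes the blocks complete, while the order within a row changes from block to block.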
## Split Plot Design
### Example: Lupin Seeding Rate Trial
:::{.nonincremental}
- 6 commercial lupin varieties:
  - Jenabillup (Je),
  - Jindalee (Ji),
  - Quilinock (Qu),
  - Belara (Be),
  - Mandelup (Ma) and
  - Tanjil (Ta)
- 3 seeding rates
- 3 replicates
:::
:::notes
What would be the best design to use?
We may reason in the following way:
There are 18 main plots = 6 varieties x 3 replicates. We may allocate the 3 seeding rates as subplots within each main plot. The seeding rates will be randomised within the main plot.
We may allocate the main plots in an array of 6 columns by 3 rows, so that columns 1 and 2 constitute the 1st replicate, columns 3 and 4 the 2nd replicate, and columns 5 and 6 the 3rd replicate.
:::
## Split Plot Design Exercise
:::{.callout-warning icon=false .nonincremental}
## {{< fa clock >}} Exercise (10 min)
Draw a map that has 6 columns and 3 rows, where columns 1 and 2 hold the main plots of the 1st replicate (there will be six in each replicate) and so on.
- 6 commercial lupin varieties:
  - Jenabillup (Je),
  - Jindalee (Ji),
  - Quilinock (Qu),
  - Belara (Be),
  - Mandelup (Ma) and
  - Tanjil (Ta)
- 3 seeding rates
- 3 replicates
:::
## Split Plot Design
### Answer
![<span style="font-size:0.25em;">Source: Dr Karyn Reeves, SAGI West</span>](assets/Plate_6_K_Reeves.png){fig-align="center"}
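
The two-stage randomisation of a split-plot design can be sketched in base R: varieties are first randomised to main plots within each replicate, then seeding rates are randomised to subplots within each main plot. The seeding rate labels `R1` to `R3` are hypothetical placeholders, and the seed and object names are assumptions; this will not reproduce the layout in the figure.

```r
set.seed(101) # arbitrary seed so the layout can be reproduced
varieties <- c("Je", "Ji", "Qu", "Be", "Ma", "Ta")
rates <- c("R1", "R2", "R3") # hypothetical labels for the 3 seeding rates
n_reps <- 3

split_plot <- do.call(rbind, lapply(seq_len(n_reps), function(r) {
  # Stage 1: randomise the varieties to the six main plots of this replicate
  main_plots <- sample(varieties)
  do.call(rbind, lapply(seq_along(main_plots), function(m) {
    data.frame(
      replicate    = r,
      main_plot    = m,
      variety      = main_plots[m],
      subplot      = seq_along(rates),
      # Stage 2: randomise the seeding rates within this main plot
      seeding_rate = sample(rates)
    )
  }))
}))

head(split_plot, 9)
```

The result has 54 subplots: 3 replicates x 6 main plots x 3 subplots, with every variety appearing once per replicate and every seeding rate once per main plot.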
## OFE Paddock Scale or Strip Design
Depending on your goals:
- A single strip is useful for demonstration and discussion;
- At least one strip should be a nil strip if you wish to measure the response to the treatments;
- Replicated treatments will provide more robust results.
## OFE Strip Trial
:::{.callout-warning icon=false .nonincremental}
## {{< fa clock >}} Exercise (10 min)
Design a randomised complete block strip trial design with three replicates.
Strips will be arranged to overlay 2-3 farm management units (soil types, soil constraints); consider the 3-soil paddock I showed earlier.
Treatments:
- Depth of seeding: 5 cm, 10 cm
- Fertiliser: Standard Rate (S), Nil (N)
:::
## OFE Strip Trial Design
### One Possible Answer
```{r}
#| label: strip-plot-design
#| echo: false
#| fig-cap: Randomised complete block strip trial with four treatments and three replicates
set.seed(42) # arbitrary seed so the layout is identical on every render
# fertiliser (N = nil, S = standard rate) combined with seeding depth in cm
trt_levels <- c("N5", "N10", "S5", "S10")
strip_plots <- tibble(
  plot = as.factor(1:12),
  rep = as.factor(rep(1:3, each = 4)),
  # randomise the four treatments independently within each replicate
  treatment = as.factor(unlist(lapply(1:3, function(i) sample(trt_levels)))),
  y = 1 # constant height, only so ggplot has a y-axis to draw
)
ggplot(strip_plots, aes(x = plot, y = y)) +
  geom_col(aes(fill = treatment)) +
  ylab("") +
  scale_fill_manual(values = c(
    "N5" = "#B6D438",
    "N10" = "#54921E",
    "S5" = "#FFBC42",
    "S10" = "#ec8525"
  )) +
  facet_wrap(. ~ rep, scales = "free_x") +
  theme(axis.text.y = element_blank(), axis.ticks.y = element_blank())
```
## Wrapping Up
Remember
- Keep the treatments simple:
  - complexity adds cost and time; and
  - weakens the ability of the trial to measure differences;
- Replicate;
- Randomise;
- Talk with AAGI first, it may save you time, money and headaches!
##
::: {.callout-important icon=false}
## {{< fa quote-right >}}
*To consult the statistician after an experiment is finished is often merely to ask him to conduct a post mortem examination. He can perhaps say what the experiment died of.*
- R.A. Fisher, [-@fisher1938]
:::
# Thank You{background-image="_extensions/AAGI-AUS/aagi/assets/title-slide-main.png" background-size="1050px auto" background-position="50% 85%"}
# References