Top: An on-farm research plot in Minnesota that compared crop yields under different tillage systems.
Bottom: Corn grain yield for each plot by replication (rep) and average yield by tillage system at one on-farm, long-term tillage trial site in Minnesota in 2011. Although average yields were numerically different, statistical analysis determined we could not say with any confidence that the tillage systems resulted in different yields.
Field Studies: What do you mean 5 bushels per acre is not significant?
By Lizabeth Stahl, University of Minnesota; Sara Berg, South Dakota State University; Josh Coltrain, Kansas State University; John Thomas, University of Nebraska-Lincoln
Part 3 of a four-part series on agricultural research and interpretation by University Extension Educators in the North Central Region. Part 1 discussed the importance of arranging research plots in replicated patterns instead of simple side-by-side comparisons. Part 2 discussed setting up an in-field trial so the data are statistically valid.
Utilizing sound research results to help make decisions on the farm is a wise business practice. It can be confusing, however, when you see two numbers that are clearly not the same labeled as “not significantly different.”
One can quickly calculate the value of a few bushels per acre over hundreds of corn or soybean acres. It is therefore important to understand what this terminology means, and what it implies in practice, before using such results to make decisions.
First, consider why research is conducted in the first place -- so that we can use the results to help make the best decisions possible in the future. We want to use practices that have a high likelihood or probability of paying off. Statistically sound research trials help determine the likelihood that a practice really did influence yield versus any differences being due to some other factor(s) or random variability.
A term commonly used in research is the Least Significant Difference, or LSD. In a hybrid variety trial, for example, this is the minimum bushels per acre that two hybrids must differ by before we could consider them to be “significantly different.”
Note there is no way to calculate the LSD if a person simply splits a field in half and puts one treatment on one side of the field and a different treatment on the other. In this scenario, you have no way to sort out whether a difference in observed yields was due to the treatment or to underlying factors such as soil type, planting population, drainage, compaction, disease, insect pressure, harvest issues, topography, etc.
When you see the LSD calculated at the .05 significance level, this means we can be 95 percent certain that the treatments (or hybrids, etc.) really did differ in yield if the difference between them was equal to or greater than the LSD. Significance levels of .05 and .10 are most commonly used in agricultural research.
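For readers who want to see the arithmetic, here is a minimal sketch in Python of how an LSD might be calculated for a trial with three treatments replicated four times in a randomized complete block layout. The yields below are hypothetical numbers made up for illustration, not data from any trial discussed here, and the treatment names "A", "B" and "C" and the function name `lsd_rcbd` are assumptions of this example. It uses the standard formula LSD = t x sqrt(2 x MSE / r), where MSE is the error mean square from the analysis of variance and r is the number of replications.

```python
from math import sqrt
from statistics import mean

# Two-tailed critical t values for 6 error degrees of freedom,
# which is what 3 treatments x 4 reps gives: (3 - 1) * (4 - 1) = 6.
T_CRIT_DF6 = {0.05: 2.447, 0.10: 1.943}


def lsd_rcbd(yields, alpha=0.05):
    """Return (treatment means, LSD) for a randomized complete block trial.

    yields maps each treatment name to a list of plot yields, one per rep,
    with reps listed in the same order for every treatment.
    """
    treatments = list(yields)
    t = len(treatments)                 # number of treatments
    r = len(yields[treatments[0]])      # number of replications
    df_error = (t - 1) * (r - 1)
    if df_error != 6:
        raise ValueError("this sketch only carries t values for 6 error df")

    all_plots = [y for trt in treatments for y in yields[trt]]
    grand_mean = mean(all_plots)
    trt_means = {trt: mean(yields[trt]) for trt in treatments}
    rep_means = [mean(yields[trt][j] for trt in treatments) for j in range(r)]

    # Split the total variation into treatment, rep (block) and error parts.
    ss_total = sum((y - grand_mean) ** 2 for y in all_plots)
    ss_trt = r * sum((m - grand_mean) ** 2 for m in trt_means.values())
    ss_rep = t * sum((m - grand_mean) ** 2 for m in rep_means)
    ss_error = ss_total - ss_trt - ss_rep
    mse = ss_error / df_error           # error mean square

    lsd = T_CRIT_DF6[alpha] * sqrt(2 * mse / r)
    return trt_means, lsd


# HYPOTHETICAL yields in bushels per acre (rep 1 through rep 4).
example_yields = {
    "A": [180, 190, 170, 200],
    "B": [195, 180, 200, 185],
    "C": [175, 185, 190, 170],
}

means, lsd = lsd_rcbd(example_yields, alpha=0.05)
for name, avg in means.items():
    print(f"Treatment {name}: average {avg:.1f} bu/acre")
print(f"LSD (.05): {lsd:.1f} bu/acre")
```

In this made-up example the treatment averages are 5 to 10 bushels per acre apart, but the plots bounce around so much from rep to rep that the LSD works out to about 22 bushels per acre. None of the averages differ by that much, so none of the treatments would be called significantly different.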
How do we end up with “no significant difference”? This can occur when there is so much variability in the results due to other factors that we can’t make a conclusion with confidence, or when the treatments or hybrids in the study simply don’t differ in yield.
Results from a University of Minnesota tillage trial demonstrate the importance of statistical analysis in helping determine if a yield difference is likely “real.” Three long-term tillage systems were evaluated at multiple locations over three years across southern Minnesota. Tillage treatments were randomized and replicated four times at each location.
The accompanying table lists the results for each tillage system and each replication at one site in 2011. Average corn yield for strip tillage (ST) was 10 bushels per acre greater than for moldboard plow (MP). The difference was not statistically significant, however, so we couldn’t say one tillage system resulted in a higher yield than another.
Closer examination of the data shows that while ST out-yielded MP in the first replication (rep) by 13 bushels per acre, MP out-yielded ST by 14 bushels per acre in the fourth rep. Also, chisel plow (CP) was the lowest yielding treatment in rep 2, while it was the top yielding treatment in rep 3.
Due to this variability, statistical analysis revealed we couldn’t say with confidence that any of the tillage systems resulted in a higher yield than another. Other factors we couldn’t account for appeared to have impacted results at this site, highlighting the value of conducting research over a number of locations and years.
Lastly, if yields are not statistically different, don’t treat them differently. Resist the temptation to put economics to average yields if they are not significantly different. Doing so could lead to poor and costly decisions in the future.