Title: | Kornbrot's Rank Difference Test |
---|---|
Description: | Implements Kornbrot's rank difference test as described in <doi:10.1111/j.2044-8317.1990.tb00939.x>. This method is a modified Wilcoxon signed-rank test which produces consistent and meaningful results for ordinal or monotonically-transformed data. |
Authors: | Brett Klamer [aut, cre] |
Maintainer: | Brett Klamer <[email protected]> |
License: | MIT + file LICENSE |
Version: | 2021.11.25.9000 |
Built: | 2025-01-24 03:39:04 UTC |
Source: | https://bitbucket.org/bklamer/rankdifferencetest |
Coerces a rankdifferencetest object to a data frame.
## S3 method for class 'rankdifferencetest'
as.data.frame(x, ...)

## S3 method for class 'rankdifferencetest'
tidy(x, ...)
x | A rankdifferencetest object. |
... | Unused arguments. |
data.frame
#----------------------------------------------------------------------------
# as.data.frame() and tidy() examples
#----------------------------------------------------------------------------
library(rankdifferencetest)

# Use example data from Kornbrot (1990)
data <- kornbrot_table1

rdt(
  data = data,
  formula = placebo ~ drug,
  alternative = "two.sided",
  distribution = "asymptotic",
  zero.method = "wilcoxon",
  correct = FALSE,
  ci = TRUE
) |> as.data.frame()

rdt(
  data = data,
  formula = placebo ~ drug,
  alternative = "two.sided",
  distribution = "asymptotic",
  zero.method = "wilcoxon",
  correct = FALSE,
  ci = TRUE
) |> tidy()
An example dataset as seen in table 1 from Kornbrot (1990). The time per problem was recorded for each subject under placebo and drug conditions for the purpose of measuring 'alertness'.
kornbrot_table1
A data frame with 13 rows and 3 variables:
subject | Subject identifier |
placebo | The time required to complete a task under the placebo condition |
drug | The time required to complete a task under the drug condition |
The table 1 values appear to be rounded, thus results do not match exactly with further calculations in Kornbrot (1990).
Kornbrot DE (1990). “The rank difference test: A new and meaningful alternative to the Wilcoxon signed ranks test for ordinal data.” British Journal of Mathematical and Statistical Psychology, 43(2), 241–264. ISSN 00071102, doi:10.1111/j.2044-8317.1990.tb00939.x.
Performs Kornbrot's rank difference test, which is a modified Wilcoxon signed-rank test that produces consistent and meaningful results for ordinal or monotonically transformed data.
rdt(
  data,
  formula,
  ci = FALSE,
  ci.level = 0.95,
  alternative = "two.sided",
  mu = 0,
  distribution = NULL,
  correct = TRUE,
  zero.method = "wilcoxon",
  tol.root = 1e-04
)
data | A data frame. |
formula | A formula of form: |
ci | A scalar logical. Whether or not to calculate the pseudomedian and its confidence interval. |
ci.level | A scalar numeric from (0, 1). The confidence level. |
alternative | A string for the alternative hypothesis: |
mu | A scalar numeric from (-Inf, Inf). Under the null hypothesis, |
distribution | A string for the method used to calculate the p-value. If |
correct | A scalar logical. Whether or not to apply a continuity correction for the normal approximation of the p-value. |
zero.method | A string for the method used to handle values equal to zero: |
tol.root | A numeric scalar from (0, Inf). For stats::uniroot |
For paired data, the Wilcoxon signed-rank test results in subtraction of the paired values. However, this subtraction is not meaningful for ordinal scale variables. In addition, any monotone transformation of the data will result in different signed ranks, thus different p-values. However, ranking the original data allows for meaningful addition and subtraction of ranks and preserves ranks over monotonic transformation. Kornbrot developed the rank difference test for these reasons.
Kornbrot recommends that the rank difference test be used in preference to the Wilcoxon signed-rank test in all paired comparison designs where the data are not both of interval scale and of known distribution. The rank difference test preserves good power compared to Wilcoxon's signed-rank test, is more powerful than the sign test, and has the benefit of being a true distribution-free test.
The procedure for Kornbrot's rank difference test is as follows:
1. Combine all paired observations.
2. Order the values from smallest to largest.
3. Assign ranks with average rank for ties.
4. Perform the Wilcoxon signed-rank test using the paired ranks.
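The steps above can be sketched in base R. This is a minimal illustration, not the implementation used by rdt(): the helper name rank_difference_sketch and the data vectors are hypothetical, and stats::wilcox.test() stands in for the signed-rank step.

```r
# Hypothetical paired measurements (not taken from Kornbrot 1990)
x <- c(5.1, 4.8, 6.0, 5.5, 7.2, 6.8, 5.9, 6.1)
y <- c(4.2, 5.0, 4.9, 5.6, 6.0, 5.2, 6.3, 4.4)

# Hypothetical helper: naive rank-transformed Wilcoxon signed-rank test
rank_difference_sketch <- function(x, y, ...) {
  stopifnot(length(x) == length(y))
  n <- length(x)
  # Steps 1-3: combine all 2n observations and rank them jointly,
  # using the average rank for ties
  r <- rank(c(x, y), ties.method = "average")
  rx <- r[seq_len(n)]
  ry <- r[n + seq_len(n)]
  # Step 4: Wilcoxon signed-rank test on the paired ranks
  stats::wilcox.test(rx, ry, paired = TRUE, ...)
}

res <- rank_difference_sketch(x, y, exact = FALSE, correct = FALSE)

# A monotone increasing transformation (here log) leaves the joint ranks,
# and therefore the p-value, unchanged
res_log <- rank_difference_sketch(log(x), log(y), exact = FALSE, correct = FALSE)
all.equal(res$p.value, res_log$p.value)
```

Because only the joint ranks enter the test, any transformation that preserves the ordering of the combined observations gives the same result.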
The test statistic for the rank difference test $(D)$ is not exactly equal to the test statistic of the naive rank-transformed Wilcoxon signed-rank test $(W^+)$, the latter being implemented in rdt(). Using $W^+$ should result in a conservative estimate for $D$, and they approach in distribution as the sample size increases. Kornbrot (1990) discusses methods for calculating $D$ for specific ranges of the sample size.
See srt() for additional details about implementation of Wilcoxon's signed-rank test.
A list with the following elements:
Slot | Subslot | Name | Description |
---|---|---|---|
1 | | statistic | Test statistic. $W^+$ for the exact Wilcoxon signed-rank distribution or $Z$ for the asymptotic normal approximation. |
2 | | p | p-value. |
3 | | alternative | The alternative hypothesis. |
4 | | method | Method used for test results. |
5 | | formula | Model formula. |
6 | | pseudomedian | Measure of centrality for y - x. Not calculated when argument ci = FALSE. |
6 | 1 | estimate | Estimated pseudomedian. |
6 | 2 | lower | Lower bound of confidence interval for the pseudomedian. |
6 | 3 | upper | Upper bound of confidence interval for the pseudomedian. |
6 | 4 | ci.level.requested | The chosen ci.level. |
6 | 5 | ci.level.achieved | For pathological cases, the achievable confidence level. |
6 | 6 | estimate.method | Method used for calculating the pseudomedian. |
6 | 7 | ci.method | Method used for calculating the confidence interval. |
7 | | n | Number of observations. |
7 | 1 | original | The number of observations contained in data. |
7 | 2 | nonmissing | The number of non-missing observations available for analysis. |
7 | 3 | zero.adjusted | The number of non-missing and non-zero values available for analysis, i.e. n$nonmissing - n$zeros. |
7 | 4 | zeros | The number of values that were zero. |
7 | 5 | ties | The number of values that were tied. |
Kornbrot DE (1990). “The rank difference test: A new and meaningful alternative to the Wilcoxon signed ranks test for ordinal data.” British Journal of Mathematical and Statistical Psychology, 43(2), 241–264. ISSN 00071102, doi:10.1111/j.2044-8317.1990.tb00939.x.
#----------------------------------------------------------------------------
# rdt() example
#----------------------------------------------------------------------------
library(rankdifferencetest)

# Use example data from Kornbrot (1990)
data <- kornbrot_table1

# Create long-format data for demonstration purposes
data_long <- reshape(
  data = kornbrot_table1,
  direction = "long",
  varying = c("placebo", "drug"),
  v.names = c("time"),
  idvar = "subject",
  times = c("placebo", "drug"),
  timevar = "treatment",
  new.row.names = seq_len(prod(length(c("placebo", "drug")), nrow(kornbrot_table1)))
)

# Subject and treatment should be factors. The ordering of the treatment factor
# will determine the difference (placebo - drug).
data_long$subject <- factor(data_long$subject)
data_long$treatment <- factor(data_long$treatment, levels = c("drug", "placebo"))

# Recreate analysis and results from section 7.1 in Kornbrot (1990)
## The p-value shown in Kornbrot (1990) was continuity corrected. The calls
## below set correct = FALSE, so no continuity correction is applied and the
## p-value here will be slightly lower. It does match the uncorrected p-value
## shown in the footnote on page 246.
rdt(
  data = data,
  formula = placebo ~ drug,
  alternative = "two.sided",
  distribution = "asymptotic",
  zero.method = "wilcoxon",
  correct = FALSE,
  ci = TRUE
)

rdt(
  data = data_long,
  formula = time ~ treatment | subject,
  alternative = "two.sided",
  distribution = "asymptotic",
  zero.method = "wilcoxon",
  correct = FALSE,
  ci = TRUE
)

# The same outcome is seen after transforming time to rate.
## The rate transformation inverts the rank ordering.
data$placebo_rate <- 60 / data$placebo
data$drug_rate <- 60 / data$drug
data_long$rate <- 60 / data_long$time

rdt(
  data = data,
  formula = placebo_rate ~ drug_rate,
  alternative = "two.sided",
  distribution = "asymptotic",
  zero.method = "wilcoxon",
  correct = FALSE,
  ci = TRUE
)

rdt(
  data = data_long,
  formula = rate ~ treatment | subject,
  alternative = "two.sided",
  distribution = "asymptotic",
  zero.method = "wilcoxon",
  correct = FALSE,
  ci = TRUE
)

# In contrast to the rank difference test, the Wilcoxon signed-rank test
# produces differing results. See table 1 and table 2 in Kornbrot (1990).
wilcox.test(
  x = data$placebo,
  y = data$drug,
  alternative = "two.sided",
  paired = TRUE,
  exact = FALSE,
  correct = FALSE
)$p.value/2

wilcox.test(
  x = data$placebo_rate,
  y = data$drug_rate,
  alternative = "two.sided",
  paired = TRUE,
  exact = FALSE,
  correct = FALSE
)$p.value/2
Performs Wilcoxon's signed-rank test.
srt(
  data,
  formula,
  ci = FALSE,
  ci.level = 0.95,
  alternative = "two.sided",
  mu = 0,
  distribution = NULL,
  correct = TRUE,
  zero.method = "wilcoxon",
  tol.root = 1e-04,
  digits.rank = Inf
)
data | A data frame. |
formula | A formula of form: |
ci | A scalar logical. Whether or not to calculate the pseudomedian and its confidence interval. |
ci.level | A scalar numeric from (0, 1). The confidence level. |
alternative | A string for the alternative hypothesis: |
mu | A scalar numeric from (-Inf, Inf). Under the null hypothesis, |
distribution | A string for the method used to calculate the p-value. If |
correct | A scalar logical. Whether or not to apply a continuity correction for the normal approximation of the p-value. |
zero.method | A string for the method used to handle values equal to zero: |
tol.root | A numeric scalar from (0, Inf). For stats::uniroot |
digits.rank | A numeric scalar from (0, Inf]. If finite, base::rank |
The procedure for Wilcoxon's signed-rank test is as follows:
1. For one-sample data x or paired samples x and y, where mu is the measure of center under the null hypothesis, define the 'values' used for analysis as (x - mu) or (x - y - mu).
2. Define 'zero' values as (x - mu == 0) or (x - y - mu == 0).
   - zero.method = "wilcoxon": Remove values equal to zero.
   - zero.method = "pratt": Keep values equal to zero.
3. Order the absolute values from smallest to largest.
4. Assign ranks $1, \dots, n$ to the ordered absolute values, using mean rank for ties.
   - zero.method = "pratt": Remove values equal to zero and their corresponding ranks.
5. Calculate $W^+$ as the sum of the ranks for positive values. The sum of $W^+$ and $W^-$ is $n(n+1)/2$, so either can be calculated from the other. If the null hypothesis is true, $W^+$ and $W^-$ are expected to be similar in value.
6. Calculate the test statistic.
   - distribution = "exact": Use $W^+$ as the test statistic. $W^+$ takes values between 0 and $n(n+1)/2$. Under the null hypothesis, its expected mean and variance are $E_0(W^+) = \frac{n(n+1)}{4}$ and $Var_0(W^+) = \frac{n(n+1)(2n+1)}{24}$.
   - distribution = "asymptotic": Let $Z = \frac{W^+ - E_0(W^+)}{\sqrt{Var_0(W^+)}}$ be the standardized version of $W^+$, then $Z \sim N(0, 1)$ asymptotically. If there are ties, use the adjusted variance $Var_0(W^+) = \frac{n(n+1)(2n+1)}{24} - \frac{\sum (t^3 - t)}{48}$, where $t$ is the number of ties for each unique ranked absolute 'value'.
     - zero.method = "pratt": With $n_0$ denoting the number of zero values, the expected mean and variance are modified to be $E_0(W^+) = \frac{n(n+1)}{4} - \frac{n_0(n_0+1)}{4}$ and $Var_0(W^+) = \frac{n(n+1)(2n+1) - n_0(n_0+1)(2n_0+1)}{24} - \frac{\sum (t^3 - t)}{48}$.
     - correct = TRUE: For two-sided, greater than, and less than alternatives, define the continuity correction as $c = \mathrm{sign}(W^+ - E_0(W^+)) \cdot 0.5$, $c = 0.5$, and $c = -0.5$, respectively. $Z$ is redefined as $Z = \frac{W^+ - E_0(W^+) - c}{\sqrt{Var_0(W^+)}}$.
7. Calculate the p-value.
   - distribution = "exact": Use the Wilcoxon signed-rank distribution to calculate the probability of being as or more extreme than $W^+$.
     - When zeros or ties are present, use the Shift-Algorithm from Streitberg and Röhmel (1984) as implemented in exactRankTests::pperm().
     - When zeros and ties are absent, use stats::psignrank().
   - distribution = "asymptotic": Use the standard normal distribution to calculate the probability of being as or more extreme than $Z$. See stats::pnorm().
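The asymptotic branch of the procedure can be sketched in base R as follows. This is an illustration under stated assumptions, not the code used by srt(): the function name signed_rank_asymptotic and the data are hypothetical, only zero.method = "wilcoxon" and the two-sided alternative are covered, and the result is checked against stats::wilcox.test().

```r
# Hypothetical helper: two-sided asymptotic signed-rank test with zeros removed
signed_rank_asymptotic <- function(x, y, mu = 0, correct = FALSE) {
  d <- x - y - mu                      # 'values' used for analysis
  d <- d[d != 0]                       # zero.method = "wilcoxon"
  n <- length(d)
  r <- rank(abs(d))                    # mean rank for ties
  w_plus <- sum(r[d > 0])              # sum of ranks for positive values
  e0 <- n * (n + 1) / 4                # null mean
  ties <- table(r)                     # tie counts per unique rank
  v0 <- n * (n + 1) * (2 * n + 1) / 24 - sum(ties^3 - ties) / 48
  cc <- if (correct) sign(w_plus - e0) * 0.5 else 0
  z <- (w_plus - e0 - cc) / sqrt(v0)
  p <- 2 * min(stats::pnorm(z), stats::pnorm(z, lower.tail = FALSE))
  list(w_plus = w_plus, z = z, p = p)
}

# Hypothetical paired data containing one zero difference and several ties
x <- c(12, 15, 9, 20, 17, 11, 14, 18, 16, 13)
y <- c(10, 15, 11, 14, 12, 12, 9, 13, 15, 10)

out <- signed_rank_asymptotic(x, y)
ref <- stats::wilcox.test(x, y, paired = TRUE, exact = FALSE, correct = FALSE)
all.equal(out$p, ref$p.value)
```

Passing correct = TRUE to both the sketch and stats::wilcox.test() applies the continuity correction described above.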
zero.method = "pratt" uses the method by Pratt (1959), which first rank-transforms the absolute values, including zeros, and then removes the ranks corresponding to the zeros. zero.method = "wilcoxon" uses the method by Wilcoxon (1950), which first removes the zeros and then rank-transforms the remaining absolute values. Conover (1973) found that when comparing a discrete uniform distribution to a distribution where probabilities linearly increase from left to right, Pratt's method outperforms Wilcoxon's. When testing a binomial distribution centered at zero to see whether the parameter of each Bernoulli trial is $\frac{1}{2}$, Wilcoxon's method outperforms Pratt's.
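The difference between the two zero-handling methods can be seen on a small example. The sketch below uses base R only and a hypothetical vector of differences d; it shows how the ranks are assigned, not how srt() computes them internally.

```r
# Hypothetical differences with two zeros
d <- c(0, 0, -1, 2, 3, -4, 5)

# Wilcoxon (1950): remove zeros first, then rank the remaining absolute values
d_w <- d[d != 0]
r_w <- rank(abs(d_w))
w_plus_wilcoxon <- sum(r_w[d_w > 0])

# Pratt (1959): rank all absolute values including zeros,
# then drop the ranks that belong to the zeros
r_all <- rank(abs(d))
keep <- d != 0
w_plus_pratt <- sum(r_all[keep & d > 0])

c(wilcoxon = w_plus_wilcoxon, pratt = w_plus_pratt)
```

Under Pratt's method the non-zero values keep the higher ranks they received while the zeros were present, so the two statistics differ and must be compared against different null means and variances.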
When ci = TRUE, a pseudomedian and its confidence interval are returned. For exact tests, the Hodges-Lehmann estimate is used and the confidence bounds are estimated by calculating an appropriate quantile of the pairwise averages (Walsh averages). For asymptotic tests, the pseudomedian is estimated using stats::uniroot() to search for a root of the asymptotic normal approximation of the Wilcoxon signed-rank distribution, with a similar strategy for the confidence bounds.
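As an illustration of the exact-test point estimate, the Hodges-Lehmann estimate is the median of the Walsh averages. The sketch below uses base R and a hypothetical vector d without zeros or ties; it covers only the point estimate, not the confidence bounds, and is not taken from the srt() source.

```r
# Hypothetical differences without zeros or ties
d <- c(1.1, -0.4, 2.3, 0.7, 3.5, -1.2, 1.9)

# Walsh averages: (d_i + d_j) / 2 for all pairs with i <= j
walsh <- outer(d, d, "+") / 2
walsh <- walsh[lower.tri(walsh, diag = TRUE)]

# Hodges-Lehmann estimate of the pseudomedian
hl <- median(walsh)

# Cross-check against the estimate reported by stats::wilcox.test()
ref <- stats::wilcox.test(d, conf.int = TRUE, exact = TRUE)
all.equal(hl, unname(ref$estimate))
```

For n values there are n(n + 1)/2 Walsh averages, so this direct approach is practical only for moderate sample sizes.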
The signed-rank test traditionally assumes the values are independent and identically distributed, with a continuous and symmetric distribution. The hypotheses are stated as:
- Null: (x) or (x - y) is centered at mu.
- Two-sided alternative: (x) or (x - y) is not centered at mu.
- Greater than alternative: (x) or (x - y) is centered at a value greater than mu.
- Less than alternative: (x) or (x - y) is centered at a value less than mu.
However, not all of these assumptions are required (Pratt and Gibbons 1981). The 'identically distributed' assumption is not required: without it, the test still keeps its stated level for the hypotheses above. The symmetry assumption is not required when the one-sided alternative hypotheses are stated as:
- Null: (x) or (x - y) is symmetric and centered at mu.
- Greater than alternative: (x) or (x - y) is stochastically larger than mu.
- Less than alternative: (x) or (x - y) is stochastically smaller than mu.
stats::wilcox.test() is the canonical function for the Wilcoxon signed-rank test. Improvements and updated methods were introduced in exactRankTests::wilcox.exact() and later coin::wilcoxsign_test(). srt() attempts to refactor these functions so that the best features of each are available in a fast and easy-to-use format.
A list with the following elements:
Slot | Subslot | Name | Description |
---|---|---|---|
1 | | statistic | Test statistic. $W^+$ for the exact Wilcoxon signed-rank distribution or $Z$ for the asymptotic normal approximation. |
2 | | p | p-value. |
3 | | alternative | The alternative hypothesis. |
4 | | method | Method used for test results. |
5 | | formula | Model formula. |
6 | | pseudomedian | Measure of centrality for x or y - x. Not calculated when argument ci = FALSE. |
6 | 1 | estimate | Estimated pseudomedian. |
6 | 2 | lower | Lower bound of confidence interval for the pseudomedian. |
6 | 3 | upper | Upper bound of confidence interval for the pseudomedian. |
6 | 4 | ci.level.requested | The chosen ci.level. |
6 | 5 | ci.level.achieved | For pathological cases, the achievable confidence level. |
6 | 6 | estimate.method | Method used for calculating the pseudomedian. |
6 | 7 | ci.method | Method used for calculating the confidence interval. |
7 | | n | Number of observations. |
7 | 1 | original | The number of observations contained in data. |
7 | 2 | nonmissing | The number of non-missing observations available for analysis. |
7 | 3 | zero.adjusted | The number of non-missing and non-zero values available for analysis, i.e. n$nonmissing - n$zeros. |
7 | 4 | zeros | The number of values that were zero. |
7 | 5 | ties | The number of values that were tied. |
Wilcoxon F (1950). “SOME RAPID APPROXIMATE STATISTICAL PROCEDURES.” Annals of the New York Academy of Sciences, 52(6), 808–814. ISSN 00778923, 17496632, doi:10.1111/j.1749-6632.1950.tb53974.x.
Pratt JW, Gibbons JD (1981). Concepts of Nonparametric Theory, Springer Series in Statistics. Springer New York, New York, NY. ISBN 9781461259336 9781461259312, doi:10.1007/978-1-4612-5931-2.
Pratt JW (1959). “Remarks on Zeros and Ties in the Wilcoxon Signed Rank Procedures.” Journal of the American Statistical Association, 54(287), 655–667. ISSN 0162-1459, 1537-274X, doi:10.1080/01621459.1959.10501526.
Cureton EE (1967). “The Normal Approximation to the Signed-Rank Sampling Distribution When Zero Differences are Present.” Journal of the American Statistical Association, 62(319), 1068–1069. ISSN 0162-1459, 1537-274X, doi:10.1080/01621459.1967.10500917.
Conover WJ (1973). “On Methods of Handling Ties in the Wilcoxon Signed-Rank Test.” Journal of the American Statistical Association, 68(344), 985–988. ISSN 0162-1459, 1537-274X, doi:10.1080/01621459.1973.10481460.
Hollander M, Wolfe DA, Chicken E (2014). Nonparametric statistical methods, Third edition. John Wiley & Sons, Inc, Hoboken, New Jersey. ISBN 9780470387375.
Bauer DF (1972). “Constructing Confidence Sets Using Rank Statistics.” Journal of the American Statistical Association, 67(339), 687–690. ISSN 0162-1459, 1537-274X, doi:10.1080/01621459.1972.10481279.
Streitberg B, Röhmel J (1984). “Exact nonparametrics in APL.” SIGAPL APL Quote Quad, 14(4), 313–325. ISSN 0163-6006, doi:10.1145/384283.801115.
Hothorn T (2001). “On exact rank tests in R.” R News, 1(1), 11-12. ISSN 1609-3631, https://journal.r-project.org/articles/RN-2001-002/.
Hothorn T, Hornik K (2002). “Exact Nonparametric Inference in R.” In Härdle W, Rönz B (eds.), Compstat, 355–360. Physica-Verlag HD, Heidelberg. ISBN 9783790815177 9783642574894, doi:10.1007/978-3-642-57489-4_52.
Hothorn T, Lausen B (2003). “On the exact distribution of maximally selected rank statistics.” Computational Statistics & Data Analysis, 43(2), 121–137. ISSN 01679473, doi:10.1016/S0167-9473(02)00225-6.
Hothorn T, Hornik K, Wiel MAVD, Zeileis A (2008). “Implementing a Class of Permutation Tests: The coin Package.” Journal of Statistical Software, 28(8). ISSN 1548-7660, doi:10.18637/jss.v028.i08.
stats::wilcox.test(), coin::wilcoxsign_test(), rdt()
#----------------------------------------------------------------------------
# srt() example
#----------------------------------------------------------------------------
library(rankdifferencetest)

# Use example data from Kornbrot (1990)
data <- kornbrot_table1

# The rate transformation inverts the rank ordering.
data$placebo_rate <- 60 / data$placebo
data$drug_rate <- 60 / data$drug

# In contrast to the rank difference test, the Wilcoxon signed-rank test
# produces differing results. See table 1 and table 2 in Kornbrot (1990).
srt(
  formula = placebo ~ drug,
  data = data,
  alternative = "two.sided",
  distribution = "asymptotic",
  correct = FALSE,
  zero.method = "wilcoxon",
  ci = TRUE
)

srt(
  formula = placebo_rate ~ drug_rate,
  data = data,
  alternative = "two.sided",
  distribution = "asymptotic",
  correct = FALSE,
  zero.method = "wilcoxon",
  ci = TRUE
)