NAVAL POSTGRADUATE SCHOOL
Monterey, California

THESIS

AN APL WORKSPACE FOR CONDUCTING
NONPARAMETRIC STATISTICAL INFERENCE

by

Wayne Franz Vagts

June 1987

Thesis Advisor: T. Jayachandran

Approved for public release; distribution is unlimited
REPORT DOCUMENTATION PAGE

Report Security Classification: UNCLASSIFIED
Distribution/Availability of Report: Approved for public release; distribution is unlimited.
Name of Performing Organization: Naval Postgraduate School, Monterey, California 93943-5000
Name of Monitoring Organization: Naval Postgraduate School, Monterey, California 93943-5000
Title: AN APL WORKSPACE FOR CONDUCTING NONPARAMETRIC STATISTICAL INFERENCE
Personal Author: Vagts, Wayne Franz
Type of Report: Master's Thesis
Subject Terms: Mann-Whitney Test, Kendall Test, Spearman Test, Nonparametric simple regression, Exact Distribution, Asymptotic Approximations
Name of Responsible Individual: T. Jayachandran
Approved for public release; distribution is unlimited.

An APL Workspace for Conducting Nonparametric
Statistical Inference

by

Wayne Franz Vagts
Lieutenant Commander, United States Navy
B.S., University of Notre Dame, 1975

Submitted in partial fulfillment of the
requirements for the degree of

MASTER OF SCIENCE IN OPERATIONS RESEARCH

from the

NAVAL POSTGRADUATE SCHOOL
June 1987

Author: Wayne Franz Vagts

Approved by:
Toke Jayachandran, Thesis Advisor
[name illegible], Second Reader
Peter Purdue, Chairman, Department of Operations Research
Kneale T. Marshall, Dean of Information and Policy Sciences
ABSTRACT
This thesis contains programs written in APL and
documentation for performing certain nonparametric
tests and computing nonparametric confidence intervals.
These methods of inference are particularly useful in
dealing with Department of Defense related problems as
illustrated in the several military examples worked in
Appendix C. The following nonparametric tests are
considered: Sign Test, Wilcoxon Signed-rank Test, Mann-
Whitney Test, Kruskal-Wallis Test, Kendall's B,
Spearman's R, and Nonparametric Linear Regression. The
tests are based on the exact distributions of the
respective test statistics unless a large sample
approximation is determined to provide at least a three
decimal place accuracy. The software consists of two
APL workspaces; one, which is designed for
microcomputers (IBM PC's or compatibles) and is menu-
driven, and the other, without menus, is designed for
the mainframe computer (IBM 3033) at the Naval
Postgraduate School.
TABLE OF CONTENTS
LIST OF TABLES
ACKNOWLEDGMENTS
I. INTRODUCTION
II. WORKSPACE DESIGN ISSUES
III. GENERAL SAMPLE SIZE CONSIDERATIONS AND ASYMPTOTIC APPROXIMATIONS
IV. TESTS FOR LOCATION BASED ON SINGLE AND PAIRED-SAMPLE DATA
A. ORDINARY SIGN TEST
B. WILCOXON SIGNED-RANK TEST
V. TESTS BASED ON TWO OR MORE SAMPLES
A. MANN-WHITNEY TEST
B. KRUSKAL-WALLIS TEST
VI. TESTS FOR ASSOCIATION IN PAIRED-SAMPLES
A. KENDALL'S B
B. SPEARMAN'S R
VII. NONPARAMETRIC SIMPLE LINEAR REGRESSION
A. COMPUTATION OF THE ESTIMATED REGRESSION EQUATION
B. HYPOTHESIS TESTING
C. CONFIDENCE INTERVAL ESTIMATION
VIII. AREAS FOR FURTHER WORK
LIST OF REFERENCES
APPENDIX A: DOCUMENTATION FOR THE MICROCOMPUTER WORKSPACE
APPENDIX B: DOCUMENTATION FOR THE MAINFRAME COMPUTER WORKSPACE
APPENDIX C: WORKSPACE FAMILIARIZATION THROUGH PRACTICAL EXAMPLES
APPENDIX D: MAIN PROGRAM LISTINGS FOR MICROCOMPUTER WORKSPACE
APPENDIX E: MAIN PROGRAM LISTINGS FOR MAINFRAME COMPUTER WORKSPACE
APPENDIX F: LISTINGS OF SUBPROGRAMS BASIC TO BOTH WORKSPACES
APPENDIX G: LISTINGS OF PROGRAMS USED TO GENERATE C.D.F. COMPARISON TABLES
INITIAL DISTRIBUTION LIST
LIST OF TABLES
1. C.D.F. COMPARISONS FOR THE SIGN TEST
2. C.D.F. COMPARISONS FOR THE WILCOXON SIGNED-RANK TEST
3. C.D.F. COMPARISONS FOR THE MANN-WHITNEY TEST
4. C.D.F. COMPARISONS FOR THE KRUSKAL-WALLIS TEST
5. C.D.F. COMPARISONS FOR THE KRUSKAL-WALLIS TEST USING COMPUTER SIMULATION
6. C.D.F. COMPARISONS FOR KENDALL'S B
7. C.D.F. COMPARISONS FOR SPEARMAN'S R
ACKNOWLEDGMENTS
Many thanks to Professor Larson for his APL
workspace STATDIST from which several of the normal
theory based asymptotic approximations are computed.
APL*PLUS/PC System software and APL*PLUS/PC TOOLS are
used in the construction of the microcomputer
workspace.1 IBM's VSAPL is the APL version used for the
mainframe workspace.

1 APL*PLUS is copyrighted software from STSC, Inc., a CONTEL Company, 2115 East Jefferson Street, Rockville, Maryland 20852.
I. INTRODUCTION
Although nonparametric procedures are powerful
tools to the analyst, they are currently underused and
often avoided by potential users. Perhaps one reason
for this is the difficulty in generating the exact
distributions of the test statistics, even for moderate
sample sizes. Consequently, tables of these
distributions are only available for very small sample
sizes and normal theory based approximations must then
be used.
The purpose of this thesis is to make a variety of
nonparametric procedures quick, easy and accurate to
apply using menu driven computer programs in APL.1
These programs use enumeration, recursion, or
combinatorial formulas to generate the exact null
distribution of the various nonparametric test
statistics. This allows hypothesis testing and
confidence interval estimation to be based on exact
distributions without the use of tables. For larger
sample sizes, the normal, F, and T distributions are
used to approximate the distributions of the test
statistics with three decimal place accuracy.

1 APL was chosen because it is an interactive language that is especially powerful at performing calculations dealing with rank order statistics and vector arithmetic. Menus are not included in the workspace designed for the mainframe.

Section II addresses workspace design issues, including
workspace requirements and assumptions
regarding its use. Section III discusses the methods
used to assess the accuracy of different asymptotic
approximations, and the sample sizes required for an
approximation to yield three decimal place accuracy.
Section IV gives background information and discusses
programming methodology for nonparametric tests based
on single and paired sample data. In Section V,
nonparametric tests for two or more independent samples
are considered. Section VI discusses nonparametric
tests for association, and Section VII deals with
nonparametric simple linear regression. Section VIII
recommends other nonparametric tests that may be added
to the workspace and areas for further work.
To show application of nonparametric statistical
methods to Department of Defense problems, several
military examples are worked in Appendix C.
II. WORKSPACE DESIGN ISSUES
This section presents a brief overview of the
design considerations used in developing the APL
workspace for both the mainframe and microcomputer.
A. EQUIPMENT AND SOFTWARE REQUIREMENTS
The microcomputer must be an IBM PC or AT
compatible, equipped with 512 kilobytes of RAM and the
APL*PLUS/PC system software, release 3.0 or later, and
IBM's DOS, version 2.00 or later.1 The 8087 math
coprocessor chip is not required to run this software,
but will increase the computational speed.
B. KNOWLEDGE LEVEL OF THE USER
The user is expected to have had some exposure to
APL and a working knowledge of nonparametric
statistics. Familiarity with microcomputers or the
Naval Postgraduate School mainframe computer is
assumed.
1 The APL system software requires 144 kilobytes of RAM while the NONPAR workspace requires an additional 190 kilobytes.
C. SELECTION OF TESTS
The nonparametric tests chosen for this workspace
are some of the more widely known, and are considered
basic material for any nonparametric statistics course.
More information about the tests can be found in any of
the textbooks that are referred to in this document.
D. MENU DISPLAYS
The microcomputer's workspace is designed around
the use of menus. This was accomplished using the
software package PC TOOLS from STSC. These menus are
designed to guide the user through the selection of the
tests without an excessive amount of prompting. The
main menu displays the choices available in the
workspace, while the test menus give the background
information and options available for each test. Help
menus that provide additional information about the
tests are also available.
E. ORGANIZATION OF WORKSPACE DOCUMENTATION
Separate documentation is included for the
microcomputer's and mainframe computer's workspaces
(see Appendices A and B, respectively). These
appendices explain the organization and operation of
the workspaces. Appendix C, which provides example
problems for each nonparametric test, is applicable to
both workspaces.
III. GENERAL SAMPLE SIZE CONSIDERATIONS AND ASYMPTOTIC APPROXIMATIONS
In this thesis, the term alpha value is used in
a general sense, and refers to the probability of
rejecting a true null hypothesis. The term P-value
refers to the probability that a test statistic will
exceed (or not exceed in the lower-tailed test) the
computed value, when the hypothesis being tested is
true.
For selected values, the exact cumulative
distribution functions (C.D.F.s) of the test statistics
are compared with those obtained from normal based
asymptotic approximations. The results of the compari-
sons are used as a basis for assessing the accuracy of
the approximations. In those cases where more than one
asymptotic approximation has been suggested in the
literature, the accuracy of each approximation is
compared over a range of desired C.D.F. values and
sample sizes. From the results, the most consistently
accurate approximation, and the sample size for which
that approximation provides at least three decimal
place accuracy, are determined.
Once the accuracy comparisons were completed for a
specific nonparametric test, microcomputer capabilities
were considered. In some cases, generation of the
exact distribution up to the desired sample size took
too long or was not possible on the PC. When this
occurred, the mainframe computer was used to generate
the required distributions with the results stored in
numerical matrices for quick recall by the
nonparametric test programs.
IV. TESTS FOR LOCATION BASED ON SINGLE AND PAIRED-SAMPLE DATA
The tests assume that the data consist of a
single set of independent observations Xi or paired
observations (Xi,Yi), i=1,2,...,N, from a continuous
distribution. For the single and paired-sample cases,
the null hypotheses are concerned with the median of
the Xi and the median of the differences Xi - Yi,
respectively. The tests considered are the Ordinary
Sign Test and the Wilcoxon Signed-Rank Test.
A. ORDINARY SIGN TEST
The Sign test can be used to test various
hypotheses about the population median (or the median
of the population of differences). Confidence
intervals for these parameters can also be constructed.
As a final option, nonparametric confidence intervals
for the quantiles of a continuous distribution are
offered.
1. Computation of the Test Statistic
For single-sample data, the test statistic K
is computed as the number of observations Xi greater
than the hypothesized median M0. For the paired-sample
case, K is the number of differences Xi - Yi that exceed
M0. All observations Xi (or Xi - Yi) that are equal to
M0 are ignored and the sample size is decreased
accordingly. As long as the number of such ties is
small relative to the size of the sample, the test
results are not greatly affected (Gibbons [Ref. 2: pp.
108]).
2. The Null and Asymptotic Distribution of K
The null distribution of K is binomial
with p = .5. In Table 1, the exact values of the
C.D.F. are compared with the corresponding approximate
values using a normal approximation with and without
continuity correction.
TABLE 1. C.D.F. COMPARISONS FOR THE SIGN TEST

PROB[K ≤ k]; FOR SAMPLE SIZE EQUAL TO 24.
TEST STAT. VALUE   |    5    |    6    |    7    |    8    |    9    |   10    |   11
EXACT C.D.F.       |  .00331 |  .01133 |  .03196 |  .07579 |  .15373 |  .27063 |  .41941
ERROR; NORMAL      |  .00117 |  .00417 |  .01134 |  .02456 |  .04339 |  .06352 |  .07786
ERROR; NORM. W/CC  | -.00068 | -.00104 | -.00114 | -.00073 |  .00001 |  .00048 |  .00028

PROB[K ≤ k]; FOR SAMPLE SIZE EQUAL TO 25.
TEST STAT. VALUE   |    6    |    7    |    8    |    9    |   10    |   11    |   12
EXACT C.D.F.       |  .00732 |  .02164 |  .05388 |  .11476 |  .21218 |  .34502 |  .50000
ERROR; NORMAL      | [illegible] |  .00774 |  .01795 | [illegible] | [illegible] |  .07077 | [illegible]
ERROR; NORM. W/CC  | -.00088 | -.00111 | -.00092 | -.00031 |  .00032 |  .00044 |  .00000
As can be seen, for sample sizes greater than
25, a normal approximation with continuity correction
is accurate to at least three decimal places.
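The comparison summarized in Table 1 can be sketched in a few lines. The following is a minimal Python illustration, not the thesis's APL program; the function names are hypothetical, and it assumes that the tabulated error is the exact C.D.F. value minus the approximate value.

```python
from math import comb, erf, sqrt


def normal_cdf(z):
    """Standard normal C.D.F. computed from the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def sign_test_cdf(k, n):
    """Exact Pr[K <= k] when K is binomial with n trials and p = .5."""
    return sum(comb(n, j) for j in range(k + 1)) / 2 ** n


def normal_approx_cdf(k, n, continuity=True):
    """Normal approximation to Pr[K <= k], optionally continuity corrected."""
    mean = n / 2.0
    sd = sqrt(n) / 2.0
    shift = 0.5 if continuity else 0.0
    return normal_cdf((k + shift - mean) / sd)


def cdf_error(k, n, continuity=True):
    """Assumed table convention: exact value minus approximate value."""
    return sign_test_cdf(k, n) - normal_approx_cdf(k, n, continuity)


if __name__ == "__main__":
    n = 24
    for k in range(5, 12):
        print(k,
              round(sign_test_cdf(k, n), 5),
              round(cdf_error(k, n, continuity=False), 5),
              round(cdf_error(k, n, continuity=True), 5))
```

Running the script prints one line per test statistic value for a sample size of 24, in the same row order as the table (exact value, error without correction, error with correction).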
3. Hypothesis Testing
P-values are computed for three basic
hypotheses comparing the median of the population or
the median of the population of differences M with a
hypothesized median M0. P-values are taken from the
cumulative distribution of the binomial for the
following tests of hypothesis.
a. One-sided Tests
(1) H0: M = M0 Versus H1: M < M0. The
P-value equals Pr[K ≤ k], where k is the computed
value of the test statistic.
(2) H0: M = M0 Versus H1: M > M0. The
P-value equals Pr[K ≥ k].
b. Two-sided Test
(1) H0: M = M0 Versus H1: M ≠ M0. The
P-value equals twice the smaller value of a(1) or
a(2), but does not exceed the value one.
For sample sizes greater than 25, a normal
approximation with continuity correction is used.
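The three exact P-values described above can be illustrated as follows. This is a minimal Python sketch under stated assumptions, not the workspace code: the function names are hypothetical, the input is the list of observations Xi (or differences Xi - Yi), and only the exact binomial case is covered, not the normal approximation used for larger samples.

```python
from math import comb


def binomial_half_cdf(k, n):
    """Pr[K <= k] for K binomial(n, .5); 0 below the support, 1 above it."""
    if k < 0:
        return 0.0
    return sum(comb(n, j) for j in range(min(k, n) + 1)) / 2 ** n


def sign_test_p_values(values, m0=0.0):
    """Exact sign test P-values for observations (or paired differences)."""
    # Values equal to the hypothesized median are dropped, as in the text.
    kept = [v - m0 for v in values if v != m0]
    n = len(kept)
    k = sum(1 for d in kept if d > 0)            # test statistic K
    lower = binomial_half_cdf(k, n)              # H1: M < M0, Pr[K <= k]
    upper = 1.0 - binomial_half_cdf(k - 1, n)    # H1: M > M0, Pr[K >= k]
    two_sided = min(1.0, 2.0 * min(lower, upper))
    return {"n": n, "k": k, "lower": lower,
            "upper": upper, "two_sided": two_sided}
```

For example, `sign_test_p_values(range(1, 11), 2.5)` counts eight of ten observations above 2.5 and returns the corresponding lower, upper, and two-sided P-values.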
4. Confidence Interval Estimation
Confidence intervals for the population
median are based on the ordered observations in the
sample. For paired-sample data, confidence bounds are
obtained from the ordered differences of the pairs of
data. A 100(1 - α)% confidence interval is determined
in the following manner. Let k be the largest number such that
Pr[K ≤ k] ≤ α/2. Then, the (k+1)th and (N-k)th
order statistics constitute the end points of the
confidence interval (Gibbons [Ref. 1: pp. 104]).
For computing confidence intervals when
sample size N is greater than 25, a normal
approximation with continuity correction is used.
Also included under this test is an option to
generate nonparametric confidence intervals for any
specified quantile given a sample size N from a
continuous distribution. The end points of the
intervals are sample order statistics.
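The order-statistic rule for the median can be sketched as follows. This is an illustrative Python fragment rather than the thesis's APL routine; the function name is hypothetical, k is taken as the largest value whose exact binomial C.D.F. does not exceed α/2, and the large-sample normal approximation and the general quantile option are not covered.

```python
from math import comb


def sign_interval(sample, alpha=0.05):
    """Exact sign-test confidence interval for the median."""
    data = sorted(sample)
    n = len(data)
    k = -1
    cumulative = 0
    for j in range(n + 1):
        cumulative += comb(n, j)
        if cumulative / 2 ** n <= alpha / 2:
            k = j                     # Pr[K <= j] is still within alpha/2
        else:
            break
    if k < 0:
        raise ValueError("sample too small for this confidence level")
    # (k+1)th and (N-k)th order statistics, using 0-based list indices.
    return data[k], data[n - k - 1]
```

With ten observations and α = .05, k equals 1, so the interval runs from the 2nd to the 9th ordered observation.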
B. WILCOXON SIGNED-RANK TEST
The signed-rank test requires the added assumption
that the underlying distribution is symmetric. This
test uses the ranks of the differences Xi- M0 (or Xi-
Yi- M0 ) together with the signs of these differences to
determine the test statistic. Confidence intervals for
the median can also be constructed.
1. Computation of the Test Statistic
For single-sample data, the test statistic W
is computed as follows. Let

$$Z_i = \begin{cases} 1 & \text{if } X_i - M_0 > 0 \\ 0 & \text{if } X_i - M_0 < 0 \end{cases}$$

and let $r_i = \operatorname{rank}(|X_i - M_0|)$. Then,

$$W = \sum_{i=1}^{N} Z_i r_i.$$
For paired sample data, W is calculated in
the same manner, except the differences to be ranked
are the paired-differences minus the hypothesized
median. Zero differences are ignored and the sample
size is decreased accordingly. When ties occur between
ranks, the average value of the ranks involved is
assigned to the tied positions. It has been shown that
a moderate number of ties and zero differences has
little effect on the test results.1
2. The Null and Asymptotic Distribution of W
The exact null distribution of W is given by

$$\Pr[W = w] = u_N(w)/2^N, \quad w = 0, 1, 2, \ldots, N(N+1)/2,$$

where $u_N(w)$ is the number of ways to assign plus and minus
signs to the first N integers such that the sum of the
positive integers equals w. It can be shown (see
Gibbons [Ref. 1: pp. 112]) that $u_N(w)$, for successive
values of N, can be computed using the recursive
relationship

$$u_N(w) = u_{N-1}(w - N) + u_{N-1}(w).$$
1 For more information on the effects that zeros and tied ranks have on the Wilcoxon Signed-Rank Test, see Pratt and Gibbons [Ref. 3].
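The recursive relationship for the number of sign assignments can be sketched as below. This is a minimal Python illustration with hypothetical function names, not the APL implementation; it builds the counts for successive sample sizes, treating counts outside the range 0 to N(N+1)/2 as zero.

```python
def signed_rank_counts(n):
    """Return [u_n(0), u_n(1), ..., u_n(n(n+1)/2)]."""
    counts = [1]                      # sample size 0: only the empty sum
    for size in range(1, n + 1):
        max_w = size * (size + 1) // 2
        new = [0] * (max_w + 1)
        for w in range(max_w + 1):
            if w < len(counts):
                new[w] += counts[w]           # rank "size" gets a minus sign
            if 0 <= w - size < len(counts):
                new[w] += counts[w - size]    # rank "size" gets a plus sign
        counts = new
    return counts


def signed_rank_cdf(w, n):
    """Exact Pr[W <= w] under the null hypothesis."""
    counts = signed_rank_counts(n)
    return sum(counts[: w + 1]) / 2 ** n
```

For a sample size of 9, `signed_rank_cdf(3, 9)` is 5/512, which rounds to .00977.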
Exact C.D.F. values were compared with those
obtained using the following asymptotic approximations:
student's T with (N-1) degrees of freedom (T),
student's T with continuity correction (TC), normal
(Z), normal with a continuity correction (ZC), the
average of T and Z as suggested by Iman [Ref. 4], and
the average of TC and ZC.
As can be seen in Table 2 below, the average
of TC and ZC gives the most consistently accurate
results with three decimal place accuracy when the
sample size exceeds 9.
TABLE 2. C.D.F. COMPARISONS FOR THE WILCOXON SIGNED-RANK TEST

PROB[W ≤ w]; FOR SAMPLE SIZE EQUAL TO 9.
TEST STAT. VALUE   |    3    |    5    |    6    |    8    |    9    |   12    |   14
EXACT C.D.F.       |  .00977 |  .01953 |  .02734 |  .04883 |  .06445 |  .12500 |  .17969
ERROR; NORMAL      | -.00067 |  .00046 |  .00204 |  .00591 |  .00958 |  .01824 |  .02272
ERROR; NORM. W/CC  | -.00243 | -.00247 | -.00167 |  .00023 |  .00269 |  .00693 |  .00806
ERROR; T DIST      |  .00518 |  .00608 |  .00673 |  .00701 |  .00817 |  .00826 |  .00818
ERROR; T W/CC      |  .00355 |  .00276 |  .00233 |  .00012 | -.00010 | -.00437 | -.00725
ERROR; AVE T/Z     |  .00225 |  .00327 |  .00438 |  .00646 |  .00888 |  .01325 |  .01545
ERROR; AVE TC/ZC   |  .00056 | [illegible] |  .00033 |  .00017 | [illegible] | [illegible] | [illegible]

PROB[W ≤ w]; FOR SAMPLE SIZE EQUAL TO 10.
TEST STAT. VALUE   |    5    |    7    |    8    |   10    |   12    |   15    |   17
EXACT C.D.F.       |  .00977 |  .01855 |  .02441 |  .04199 |  .06543 |  .11621 |  .16113
ERROR; NORMAL      | -.00115 |  .00023 |  .00099 |  .00476 |  .00837 |  .01490 |  .01888
ERROR; NORM. W/CC  | -.00270 | -.00219 | -.00198 |  .00043 |  .00229 |  .00558 |  .00710
ERROR; T DIST      |  .00399 |  .00512 |  .00525 |  .00665 |  .00661 |  .00661 |  .00681
ERROR; T W/CC      |  .00249 |  .00243 |  .00182 |  .00151 | -.00052 | -.00376 | -.00571
ERROR; AVE T/Z     |  .00142 |  .00267 |  .00312 |  .00571 |  .00749 |  .01075 |  .01284
ERROR; AVE TC/ZC   | -.00010 |  .00012 | -.00008 |  .00097 |  .00089 |  .00091 |  .00070
3. Hypothesis Testing
P-values are computed for three basic
hypotheses comparing the median of the population or
the median of the population of differences M with a
hypothesized median M0 as shown below.
a. One-sided Tests
(1) H0: M = M0 Versus H1: M < M0. The
P-value equals Pr[W ≤ w], where w is the computed
value of the test statistic W.
(2) H0: M = M0 Versus H1: M > M0. The
P-value equals Pr[W ≥ w].
b. Two-sided Test
(1) H0: M = M0 Versus H1: M ≠ M0. The
P-value equals twice the smaller value of a(1) or
a(2), but not exceeding the value one.
For sample sizes greater than 9, an average
of the normal and student's T approximations, each with
continuity correction, is used. Computations of the P-
value for each of the alternative hypotheses are:
a. H1: M < M0
Let

$$P_{ZC} = \Pr\left[ Z < \frac{w + .5 - \mu_W}{\sigma_W} \right]$$

and let

$$P_{TC} = \Pr\left[ T_{(N-1)} < \frac{|w - \mu_W| - .5}{\sqrt{\dfrac{N\sigma_W^2}{N-1} - \dfrac{(|w - \mu_W| - .5)^2}{N-1}}} \right],$$

where Z is standard normal, $T_{(N-1)}$ has a student's T
distribution with (N-1) degrees of freedom, $\mu_W = N(N+1)/4$
and $\sigma_W^2 = N(N+1)(2N+1)/24$. Then, the
P-value for the test is $(P_{ZC} + (1 - P_{TC}))/2$ if w is
less than $\mu_W$ and $(P_{ZC} + P_{TC})/2$, otherwise. The above
formulas are obtained from those given by Iman [Ref. 4]
after inclusion of a continuity correction.
b. H1: M > M0
The P-value equals $((1 - P_{ZC}) + P_{TC})/2$ if
w is less than $\mu_W$ and $((1 - P_{ZC}) + (1 - P_{TC}))/2$,
otherwise. The computation of $P_{ZC}$ and $P_{TC}$ is similar
to the above except the sign of the continuity
correction is changed.
c. H1: M ≠ M0
The P-value equals twice the smaller
value of a or b above, but not exceeding the value one.
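The lower-tailed case (a) can be sketched numerically as follows. This is a Python illustration under stated assumptions, not the workspace routine: the function names are hypothetical, the T statistic is taken to be the continuity-corrected deviation divided by the square root of (Nσ² minus the squared deviation) over (N-1), and the student's T C.D.F. is obtained by Simpson's rule on the density, a simple stand-in for whatever routine the workspace uses.

```python
from math import erf, gamma, pi, sqrt


def normal_cdf(z):
    """Standard normal C.D.F."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def t_cdf(t, df, steps=2000):
    """Student's T C.D.F. by Simpson's rule (steps must be even)."""
    c = gamma((df + 1) / 2.0) / (sqrt(df * pi) * gamma(df / 2.0))

    def density(x):
        return c * (1.0 + x * x / df) ** (-(df + 1) / 2.0)

    b = abs(t)
    h = b / steps
    total = density(0.0) + density(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    area = total * h / 3.0            # integral of the density from 0 to |t|
    return 0.5 + area if t >= 0 else 0.5 - area


def wilcoxon_lower_tail_approx(w, n):
    """Average of the corrected normal and T approximations to Pr[W <= w]."""
    mu = n * (n + 1) / 4.0
    var = n * (n + 1) * (2 * n + 1) / 24.0
    p_zc = normal_cdf((w + 0.5 - mu) / sqrt(var))
    d = abs(w - mu) - 0.5
    t_stat = d / sqrt((n * var - d * d) / (n - 1))
    p_tc = t_cdf(t_stat, n - 1)
    if w < mu:
        return (p_zc + (1.0 - p_tc)) / 2.0
    return (p_zc + p_tc) / 2.0
```

For a sample size of 10 and w = 10, the sketch gives a value close to .041, which can be compared with the exact C.D.F. value of .04199.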
4. Confidence Interval Estimation
For single-sample data, the confidence
interval for the population median is based on the
ordered averages of all pairs of observations (Xi +
Xj)/2 such that i ≤ j. A 100(1 - α)% confidence
interval is determined in the following manner. Let w
be the largest number such that Pr[W ≤ w] ≤ α/2. Then, the
(w+1)th and (m-w)th order statistics, where m =
N(N+1)/2 or the total number of paired-averages,
constitute the end points of the confidence interval.
A confidence interval for paired-sample data is
computed in the same manner, except the end points are
taken from the paired-averages of the differences Xi -
Yi (Gibbons [Ref. 1: pp. 114-118]).
For computing confidence intervals when
sample sizes are greater than 9, a normal approximation
with continuity correction is used.
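The interval construction from the ordered paired-averages can be sketched as follows. This is an illustrative Python fragment with hypothetical function names, not the APL routine; it covers only the exact case, takes w as the largest value whose exact C.D.F. does not exceed α/2, and includes each observation paired with itself so that there are N(N+1)/2 averages.

```python
def signed_rank_counts(n):
    """Counts u_n(w) for w = 0, ..., n(n+1)/2, built up one rank at a time."""
    counts = [1]
    for r in range(1, n + 1):
        new = counts + [0] * r
        for w, c in enumerate(counts):
            new[w + r] += c           # assignments in which rank r is positive
        counts = new
    return counts


def wilcoxon_interval(sample, alpha=0.05):
    """Exact signed-rank confidence interval for the median."""
    data = sorted(sample)
    n = len(data)
    averages = sorted((data[i] + data[j]) / 2.0
                      for i in range(n) for j in range(i, n))
    m = len(averages)                 # equals n(n+1)/2
    w, cumulative = -1, 0
    for value, c in enumerate(signed_rank_counts(n)):
        cumulative += c
        if cumulative / 2 ** n <= alpha / 2:
            w = value
        else:
            break
    if w < 0:
        raise ValueError("sample too small for this confidence level")
    # (w+1)th and (m-w)th ordered averages, using 0-based list indices.
    return averages[w], averages[m - w - 1]
```

For the observations 1 through 10 and α = .05, w equals 8 and the interval is formed by the 9th and 47th of the 55 ordered averages.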
V. TESTS BASED ON TWO OR MORE SAMPLES
The tests assume that the data consist of
independent random samples from two or more continuous
distributions. The general null hypothesis is that the
samples are drawn from identical populations. The
Mann-Whitney and Kruskal-Wallis tests are considered.
A. MANN-WHITNEY TEST
The Mann-Whitney test is based on the distribution
of the test statistic U, which can be used to compare
the equality of the population medians or variances
for two samples.1 The Mann-Whitney test with a
modified ranking scheme can be used to test for
equality of variances if the population means or
medians are assumed to be equal (Conover [Ref. 5:pp.
229-230]). If the medians differ by a known amount, the
data can be adjusted before applying the test. A
confidence interval for the difference in the medians
of the two populations can also be estimated.
1. Computation of the Test Statistic
For the comparison of population medians, the
test statistic U is computed from the combined ordered
arrangement of observations Xi and Yj, i = 1,2,...,N;
j = 1,2,...,M. Let $r_i = \operatorname{rank}(X_i)$ in the combined
ordered sample and $R_X = \sum_{i=1}^{N} r_i$. Then,

$$U = R_X - N(N+1)/2.$$

1 The test statistic U and the method used to compute it are taken from Gibbons [Ref. 1: pp. 140-141].
For testing the equality of variances, the
computation of U is similar except for the method of
assigning ranks to the ordered sample. This method
ranks the smallest value 1, largest value 2, second
largest value 3, second smallest value 4, and so on
by two's, until the middle of the combined ordered
sample is reached. For either test, tied ranks for the
combined sample are assigned the average value of the
ranks involved. A moderate number of ties has little
effect on the test results.
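Both ranking schemes can be sketched in one routine. This is a minimal Python illustration with hypothetical function names, not the workspace code; it assumes that the quantity subtracted from the rank sum of the X sample is based on the size of the X sample, and it averages ranks over tied values in either scheme.

```python
def alternating_ranks(n):
    """Ranks for sorted positions: smallest 1, largest 2, second largest 3,
    second smallest 4, and so on by two's toward the middle."""
    if n == 0:
        return []
    ranks = [0] * n
    lo, hi = 0, n - 1
    ranks[lo] = 1
    lo += 1
    rank = 2
    from_high = True
    while lo <= hi:
        for _ in range(2):
            if lo > hi:
                break
            if from_high:
                ranks[hi] = rank
                hi -= 1
            else:
                ranks[lo] = rank
                lo += 1
            rank += 1
        from_high = not from_high
    return ranks


def average_over_ties(sorted_values, ranks):
    """Replace the ranks of tied values by their average."""
    out = [float(r) for r in ranks]
    i = 0
    while i < len(sorted_values):
        j = i
        while j + 1 < len(sorted_values) and sorted_values[j + 1] == sorted_values[i]:
            j += 1
        avg = sum(ranks[i:j + 1]) / (j - i + 1)
        for t in range(i, j + 1):
            out[t] = avg
        i = j + 1
    return out


def mann_whitney_u(xs, ys, dispersion=False):
    """U for the X sample; dispersion=True uses the modified ranking."""
    combined = sorted([(v, 0) for v in xs] + [(v, 1) for v in ys])
    values = [v for v, _ in combined]
    n = len(values)
    base = alternating_ranks(n) if dispersion else list(range(1, n + 1))
    ranks = average_over_ties(values, base)
    r_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    return r_x - len(xs) * (len(xs) + 1) / 2.0
```

With the ordinary ranking, `mann_whitney_u([1, 2, 3], [4, 5, 6])` is 0 because every X observation lies below every Y observation.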
2. The Null and Asymptotic Distribution of U
The exact null distribution of U is
determined using a recursion algorithm due to Harding
[Ref. 6].
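For illustration, the exact null distribution of U can also be generated with a simple counting recurrence on the largest observation in the combined sample. The Python sketch below uses that classic recurrence, which is not necessarily the algorithm of Harding [Ref. 6] used in the workspace; the function names are hypothetical.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_u(n, m, u):
    """Number of arrangements of n X's and m Y's for which U equals u."""
    if u < 0:
        return 0
    if n == 0 or m == 0:
        return 1 if u == 0 else 0
    # The largest observation is either an X (it exceeds all m Y's and
    # contributes m to U) or a Y (it contributes nothing).
    return count_u(n - 1, m, u - m) + count_u(n, m - 1, u)


def mann_whitney_cdf(u, n, m):
    """Exact Pr[U <= u] under the null hypothesis."""
    total = comb(n + m, n)
    return sum(count_u(n, m, j) for j in range(u + 1)) / total
```

For N = 9 and M = 9, `mann_whitney_cdf(14, 9, 9)` is 456/48620, which rounds to .00938.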
Exact C.D.F. values were compared with
approximate values obtained from the following
asymptotic distributions: student's T with (n-2) degrees
of freedom where n = N + M, the total number of
observations in both samples (T), student's T with
continuity correction (TC), normal (Z), normal with
continuity correction (ZC), the average of T and Z
(Iman [Ref. 7]), and the average of TC and ZC. The
results for various sample sizes are given in Table 3.
TABLE 3. C.D.F. COMPARISONS FOR THE MANN-WHITNEY TEST

PROB[U ≤ u]; FOR SAMPLE SIZES N EQUAL TO 9 AND M EQUAL TO 9.
TEST STAT. VALUE   |   14    |   17    |   18    |   21    |   23    |   27    |   29
EXACT C.D.F.       |  .00938 |  .01999 |  .02515 |  .04696 |  .06796 |  .12904 |  .17005
ERROR; NORMAL      | -.00026 |  .00100 |  .00168 |  .00441 |  .00682 |  .01243 |  .01511
ERROR; NORM. W/CC  | -.00146 | -.00114 | -.00087 |  .00026 |  .00130 |  .00354 |  .00436
ERROR; T DIST      |  .00237 |  .00337 |  .00372 |  .00464 |  .00525 |  .00676 |  .00779
ERROR; T W/CC      |  .00119 |  .00108 |  .00095 |  .00007 | -.00078 | -.00259 | -.00328
ERROR; AVE T/Z     |  .00105 |  .00219 |  .00270 |  .00453 |  .00603 |  .00959 |  .01145
ERROR; AVE TC/ZC   | -.00014 | -.00003 |  .00004 | [illegible] |  .00026 |  .00048 |  .00054

PROB[U ≤ u]; FOR SAMPLE SIZES N EQUAL TO 7 AND M EQUAL TO 12.
TEST STAT. VALUE   |   14    |   17    |   19    |   21    |   24    |   27    |   30
EXACT C.D.F.       |  .00853 |  .01792 |  .02782 |  .04156 |  .07111 |  .11342 |  .17012
ERROR; NORMAL      | -.00045 |  .00062 |  .00187 |  .00359 |  .00701 |  .01097 |  .01487
ERROR; NORM. W/CC  | -.00152 | -.00128 | -.00079 | -.00003 |  .00154 |  .00322 |  .00458
ERROR; T DIST      |  .00199 |  .00292 |  .00356 |  .00422 |  .00527 |  .00643 |  .00796
ERROR; T W/CC      |  .00094 |  .00092 |  .00068 |  .00026 | -.00066 | -.00178 | -.00262
ERROR; AVE T/Z     |  .00077 |  .00177 |  .00272 |  .00391 |  .00614 |  .00870 |  .01142
ERROR; AVE TC/ZC   | [illegible] | [illegible] | [illegible] | [illegible] |  .00044 |  .00072 | [illegible]

PROB[U ≤ u]; FOR SAMPLE SIZES N EQUAL TO 5 AND M EQUAL TO 17.
TEST STAT. VALUE   |   13    |   16    |   18    |   20    |   23    |   27    |   30
EXACT C.D.F.       |  .00968 |  .01933 |  .02916 |  .04245 |  .07018 |  .12440 |  .17734
ERROR; NORMAL      | [illegible] |  .00039 | [illegible] | [illegible] |  .00689 |  .01210 |  .01364
ERROR; NORM. W/CC  | [illegible] | [illegible] | -.00087 |  .00007 |  .00188 |  .00444 | [illegible]
ERROR; T DIST      | [illegible] | [illegible] | [illegible] |  .00390 |  .00544 |  .00774 | [illegible]
ERROR; T W/CC      | [illegible] |  .00023 |  .00023 |  .00021 |  .00007 | -.00025 | -.00051
ERROR; AVE T/Z     |  .00027 | [illegible] |  .00225 | [illegible] |  .00616 | [illegible] | [illegible]
ERROR; AVE TC/ZC   | [illegible] | -.00063 | -.00032 |  .00014 | [illegible] |  .00209 | [illegible]

PROB[U ≤ u]; FOR SAMPLE SIZES N EQUAL TO 3 AND M EQUAL TO 27.
TEST STAT. VALUE   |    7    |   10    |   12    |   15    |   19    |   23    |   26
EXACT C.D.F.       |  .00764 |  .01650 |  .02512 |  .04286 |  .07734 |  .12660 |  .17488
ERROR; NORMAL      | -.00265 | -.00099 |  .00072 |  .00389 |  .00874 |  .01342 |  .01680
ERROR; NORM. W/CC  | -.00363 | -.00254 | -.00133 |  .00089 |  .00405 |  .00665 |  .00831
ERROR; T DIST      | -.00121 |  .00031 |  .00173 |  .00414 |  .00741 |  .01026 |  .01253
ERROR; T W/CC      | -.00219 | -.00129 | -.00042 |  .00097 |  .00250 |  .00323 |  .00390
ERROR; AVE T/Z     | -.00193 | -.00034 |  .00122 |  .00402 |  .00808 |  .01184 |  .01466
ERROR; AVE TC/ZC   | -.00291 | -.00192 | -.00087 |  .00093 |  .00327 |  .00496 |  .00610
As can be seen from the tables, the average
of ZC and TC gives the most consistently accurate
results. For sample sizes NxM > 80, nearly three
decimal place accuracy is obtained in all cases.
3. Hypothesis Testing
P-values are computed for three basic
hypotheses comparing the medians or variances of the
two populations as shown below.
a. One-sided Tests
(1) H0: MX = MY Versus H1: MX < MY or
H0: VX = VY Versus H1: VX > VY. The P-value equals
Pr[U ≤ u], where u is the observed value of the test
statistic.
(2) H0: MX = MY Versus H1: MX > MY or
H0: VX = VY Versus H1: VX < VY. The P-value equals
Pr[U ≥ u].
b. Two-sided Test
(1) H0: MX = MY Versus H1: MX ≠ MY or
H0: VX = VY Versus H1: VX ≠ VY. The P-value equals
twice the smaller value of a(1) or a(2), but not
exceeding the value one.
For sample sizes NxM greater than 80, the
average of the normal and student's T approximations,
each with continuity correction, is used. Computations
of the P-value for each alternative hypothesis are:
a. H1: MX < MY or VX > VY
Let

$$P_{ZC} = \Pr\left[ Z < \frac{u + .5 - \mu_U}{\sigma_U} \right]$$

and let

$$P_{TC} = \Pr\left[ T_{(n-2)} < \frac{|u - \mu_U| - .5}{\sqrt{\dfrac{(N+M-1)\sigma_U^2}{N+M-2} - \dfrac{(|u - \mu_U| - .5)^2}{N+M-2}}} \right],$$

where Z is standard normal, $T_{(n-2)}$ has a student's T
distribution with (n-2) degrees of freedom, $\mu_U = NM/2$
and $\sigma_U^2 = NM(N+M+1)/12$. Then the P-value for the
test is $(P_{ZC} + (1 - P_{TC}))/2$ for u less than $\mu_U$ and
$(P_{ZC} + P_{TC})/2$, otherwise. The above formulas are
obtained from those given by Iman [Ref. 7] after
inclusion of the continuity correction.
b. H1: MX > MY or VX < VY
The P-value equals $((1 - P_{ZC}) + P_{TC})/2$ if
u is less than $\mu_U$ and $((1 - P_{ZC}) + (1 - P_{TC}))/2$,
otherwise. The computation of $P_{ZC}$ and $P_{TC}$ is similar
to the above except the sign of the continuity
correction is changed.
c. H1: MX ≠ MY or VX ≠ VY
The P-value equals twice the smaller
value of a or b above, but not exceeding the value one.
4. Confidence Interval Estimation
Confidence intervals for the difference in
medians, (MY - MX), are based on the ordered
arrangement of the differences (Yj - Xi), j = 1,2,...,M;
i = 1,2,...,N for all i and j. A 100(1 - α)%
confidence interval is determined in the following
manner. Let u be the largest number such that Pr[U ≤ u] ≤
α/2. Then, the (u+1)th and (m-u)th order
statistics, where m = NxM or the total number of
possible differences, constitute the end points of the
confidence interval.
For computing confidence intervals when
sample sizes NxM are greater than 80, a normal
approximation with continuity correction is used.
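The exact interval for the difference in medians can be sketched as follows. This is an illustrative Python fragment with hypothetical function names, not the APL routine; it covers only the exact case, takes u as the largest value whose exact C.D.F. does not exceed α/2, and obtains the null distribution from a simple counting recurrence.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_u(n, m, u):
    """Number of arrangements of n X's and m Y's for which U equals u."""
    if u < 0:
        return 0
    if n == 0 or m == 0:
        return 1 if u == 0 else 0
    return count_u(n - 1, m, u - m) + count_u(n, m - 1, u)


def mann_whitney_interval(xs, ys, alpha=0.05):
    """Exact confidence interval for the difference in medians (MY - MX)."""
    n, m = len(xs), len(ys)
    differences = sorted(y - x for y in ys for x in xs)
    total = comb(n + m, n)
    u, cumulative = -1, 0
    for value in range(n * m + 1):
        cumulative += count_u(n, m, value)
        if cumulative / total <= alpha / 2:
            u = value
        else:
            break
    if u < 0:
        raise ValueError("samples too small for this confidence level")
    # (u+1)th and (NM-u)th ordered differences, using 0-based list indices.
    return differences[u], differences[n * m - u - 1]
```

With two samples of four observations each and α = .10, u equals 1, so the interval runs from the 2nd to the 15th of the 16 ordered differences.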
B. KRUSKAL-WALLIS TEST
The Kruskal-Wallis test is a nonparametric analog
of the one-way classification analysis of variance test
for equality of several population medians (Gibbons
[Ref. 1: pp. 99]).
1. Computation of the Test Statistic
Calculations of the test statistic H center
around the ordered arrangement of the combined samples
from which the sum of ranks for each sample is derived.
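Let $X_{ij}$, $j = 1,2,\ldots,n_i$ and $i = 1,2,\ldots,k$, be independent
random samples from k populations. Let $r_{ij} = \operatorname{rank}(X_{ij})$,
$R_i = \sum_{j=1}^{n_i} r_{ij}$, and $N = \sum_{i=1}^{k} n_i$. Then,

$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1).$$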
If ties occur in the combined sample, they
are resolved by assigning the average value of the
ranks involved. A correction based on the number of
observations tied at a given rank and the number of
ranks involved, is included in the calculations. A
complete description of the correction factor is given
in Gibbons [Ref. 2:pp. 178-179].
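A minimal Python sketch of the H computation, using average ranks for ties, may help make the steps concrete. The function names are hypothetical, and the tie correction factor described above is deliberately left out, so the result matches the workspace only when no ties occur.

```python
def midranks(values):
    """Rank the values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1           # average of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(samples):
    """H statistic for a list of samples (no tie correction applied)."""
    combined = [v for s in samples for v in s]
    n_total = len(combined)
    ranks = midranks(combined)
    total, start = 0.0, 0
    for s in samples:
        r_i = sum(ranks[start:start + len(s)])   # rank sum of this sample
        total += r_i ** 2 / len(s)
        start += len(s)
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)
```

For the three samples [1, 2], [3, 4], [5, 6] the rank sums are 3, 7, and 11, which gives H = 32/7.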
2. The Null and Asymptotic Distribution of H
The null distribution of H is generated by
enumeration. Each possible permutation of ranks is
listed for the combined sample, and the corresponding H
value computed. The frequency distribution of H is the
total number of occurrences of each distinct H value.
The H values are arranged in increasing order while
maintaining the frequency pairings. The null
distribution is obtained by dividing the cumulative
frequencies by the total number of distinct
arrangements, N!/(n1!n2!...nk!).
Due to computer limitations, generation of
the exact distribution of H was only possible for k = 3
populations with n = 4 observations in each, and 4
populations with 3 observations in each. Most of the
distributions were generated on the mainframe computer
and saved in matrices for quick recall by the Kruskal-
Wallis test program.
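The enumeration can be illustrated with a short Python sketch for small sample sizes. It is not the mainframe procedure itself: the function names are hypothetical, ranks are assumed untied, and each distinct assignment of ranks to samples is counted once.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def h_null_distribution(sizes):
    """Exact null distribution of H for untied ranks, as {h: probability}."""
    n_total = sum(sizes)
    counts = Counter()

    def assign(remaining, group, partial):
        # partial holds the running sum of Ri^2 / ni over the groups filled so far
        if group == len(sizes):
            h = Fraction(12, n_total * (n_total + 1)) * partial - 3 * (n_total + 1)
            counts[h] += 1
            return
        for chosen in combinations(sorted(remaining), sizes[group]):
            r = sum(chosen)
            assign(remaining - set(chosen), group + 1,
                   partial + Fraction(r * r, sizes[group]))

    assign(set(range(1, n_total + 1)), 0, Fraction(0))
    total = sum(counts.values())        # N!/(n1! n2! ... nk!) arrangements
    return {h: Fraction(c, total) for h, c in sorted(counts.items())}


def upper_tail(dist, h_obs):
    """Pr[H >= h_obs] under the enumerated null distribution."""
    return sum(p for h, p in dist.items() if h >= h_obs)
```

With sizes (2, 2, 1) there are 30 arrangements; the largest attainable H is 3.6, reached by 6 of them, so Pr[H ≥ 3.6] = 1/5. The number of arrangements grows quickly with the sample sizes, which is consistent with the limits reported above.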
Exact C.D.F. values were compared with the
corresponding approximate values using the following
distributions: chi-square with (k-1) degrees of freedom
(C), F distribution with (k-1) and (N-k) degrees of
freedom (F), and F with (k-1) and (N-k-1) degrees of
freedom (F1). The chi-square distribution uses the
Kruskal-Wallis H statistic, while the F and F1
distributions use a modified H statistic, H1 =
((N-k)H)/((k-1)((N-1)-H)); see Iman and Davenport [Ref.
8]. As can be seen in Table 4, F1 gives the most
consistently accurate estimates.
TABLE 4. C.D.F. COMPARISONS FOR THE KRUSKAL-WALLIS TEST
(A "?" marks a character or entry that is illegible in the source copy.)

PROB[H ≥ h]; FOR A GROUP OF 3 SAMPLES CONSISTING OF ?, 4, AND 3 OBS.
TEST STAT. VALUE   |  7.1439 |  ?.7121 |  ?.1318 |  5.5985 |    ?    |  ?.2121 |  ?.9585
EXACT C.D.F.       |  .00970 |  .01905 |  .0266? |  .??866 |  .079?0 |  .129?8 |  .1778?
ERROR; CHI-SQUARE  |    ?    | -.01582 | ?.01585 | -.01220 | -.?0814 |  .00746 |  .01241
ERROR; F DIST      |  .00304 |  .00736 |  .00836 |  .01113 |  .01820 |  .01696 |  .00990
ERROR; F1 DIST     |  .00084 |  .00?26 |  .00403 |  .005?4 |  .01138 |  .00895 |  .00172

PROB[H ≥ h]; FOR A GROUP OF 3 SAMPLES CONSISTING OF 4, ?, AND ? OBS.
TEST STAT. VALUE   |  7.6538 |  6.9615 |  6.5000 |  5.6923 |  4.?615 |  4.2692 |  3.5769
EXACT C.D.F.       |  .00762 |  .01939 |  .02996 |  .04866 |  .08000 |  .12190 |  .17299
ERROR; CHI-SQUARE  | -.01416 | -.01139 | -.00882 |    ?    | -.00368 |  .00361 |  .00577
ERROR; F DIST      |  .00290 |  .00839 |  .01204 |  .01100 |  .01272 |  .01225 |  .00263
ERROR; F1 DIST     |  .00149 |  .00600 |  .00890 |  .00647 |  .00708 |  .00592 |  .00384

PROB[H ≥ h]; FOR A GROUP OF 4 SAMPLES CONSISTING OF 3, 3, 2, AND 2 OBS.
TEST STAT. VALUE   |  7.6364 |  7.1818 |  7.0000 |  6.5273 |  6.0182 |  5.3818 |  4.8727
EXACT C.D.F.       |  .01000 |  .01921 |  .02921 |  .04921 |  .07184 |  .12984 |  .17952
ERROR; CHI-SQUARE  | ?.04?16 | ?.04712 | ?.04269 | -.03939 | -.03089 | ?.01604 | ?.00183
ERROR; F DIST      |  .002?? |    ?    |  .0073? |  .00880 |  .01094 |    ?    |  .00902
ERROR; F1 DIST     |    ?    |    ?    |    ?    |    ?    |    ?    |    ?    |    ?

PROB[H ≥ h]; FOR A GROUP OF 4 SAMPLES CONSISTING OF 3, 3, 3, AND 2 OBS.
TEST STAT. VALUE   |  8.0152 |  7.6364 |  7.1515 |  6.7273 |  6.1970 |  5.4697 |  4.9697
EXACT C.D.F.       |  .00961 |  .01831 |  .02974 |  .0?9?8 |  .07805 |  .12740 |  .17571
ERROR; CHI-SQUARE  | ?.03609 | ?.03584 | ?.03?48 |    ?    | ?.02436 | ?.01306 |  .00168
ERROR; F DIST      |  .00215 |  .00481 |  .00?41 |  .00920 |  .01185 |  .01036 |  .01214
ERROR; F1 DIST     |  .00133 |    ?    | ?.00269 |  .00030 |  .00098 | ?.00230 | ?.00091
A final accuracy comparison between the C and F1
approximations was conducted by computer simulation for
5 populations with 8 observations each. Initially,
30,000 permutations of the 40 ranks were randomly
generated (no tie ranks allowed), and the H statistic
calculated for each permutation. Then the empirically
determined percentiles Hp for selected values of p
between .01 and .18 were compared with the
approximations given by the C and F1 distributions.
The results are shown in Table 5. It can be seen that
the F1 approximation compares well with the simulated
results, giving three decimal place accuracy, while
the C approximation is less accurate.
TABLE 5. C.D.F. COMPARISONS FOR THE KRUSKAL-WALLIS TEST USING COMPUTER SIMULATION
(A "?" marks a character or entry that is illegible in the source copy.)

PROB[H ≥ h]; BASED ON 10000 GENERATED H'S FOR 5 SAMPLES OF 8 OBS. EACH.
TEST STAT. VALUE   |  12.2?2 | 11.06?8 |    ?    |  9.2121 |  8.129  |  7.030  |  6.232
C.D.F. VALUE       |  .01000 |  .02000 |  .03000 |  .05000 |  .08000 |  .13000 |  .18000
ERROR; CHI-SQUARE  | -.00573 | -.00584 | -.00646 | -.00601 | -.00696 | -.00432 | -.00246
ERROR; F1 DIST     |  .00081 |  .00231 |  .00259 |  .00350 |  .00161 |  .00109 | -.00103

PROB[H ≥ h]; BASED ON 20000 GENERATED H'S FOR 5 SAMPLES OF 8 OBS. EACH.
TEST STAT. VALUE   |    ?    |    ?    |    ?    |  9.163  |    ?    |  7.034  |  6.220
C.D.F. VALUE       |  .01000 |  .02000 |  .03000 |  .05000 |  .08000 |  .13000 |  .18000
ERROR; CHI-SQUARE  | -.00516 | -.00596 | -.007?5 | -.00716 | -.00696 | -.00413 | -.00334
ERROR; F1 DIST     |  .00?26 |    ?    |  .001?6 |    ?    |  .00161 |    ?    |  .00199

PROB[H ≥ h]; BASED ON ? GENERATED H'S FOR 5 SAMPLES OF 8 OBS. EACH.
TEST STAT. VALUE   |    ?    |    ?    |    ?    |    ?    |    ?    |    ?    |    ?
C.D.F. VALUE       |    ?    |    ?    |    ?    |    ?    |    ?    |    ?    |    ?
ERROR; CHI-SQUARE  | -.00522 | -.00684 | -.00802 | -.00677 | ?.00563 | -.00213 | -.00020
ERROR; F1 DIST     |  .00121 |  .00143 |  .00111 |  .00273 |  .00301 |  .00344 |  .00142
3. Hypothesis Testing
P-values for the test H0: the population
medians are all equal versus H1: at least two
population medians are not equal, are computed as
Pr[H ≥ h], where h is the value of the observed test
statistic.
For three or more populations with at least 4
observations in each, the F1 approximation is used.
VI. TESTS FOR ASSOCIATION IN PAIRED-SAMPLES
The tests described herein assume that the data
consists of independent pairs of observations (Xi,Yi)
from a bivariate distribution. The general null
hypothesis is that of no association between X and Y.
Kendall's B and Spearman's R are considered.
A. KENDALL'S B
1. Computation of the Test Statistic
The test statistic is computed by comparing
each observation (Xi,Yi) with all other observations
(Xj,Yj) in the sample. If the changes in X and Y are
of the same sign, sgn(Xj - Xi) = sgn(Yj - Yi), the pair
(Xi,Yi) and (Xj,Yj) is "concordant" and a +1 is scored.
If the signs are different, the pair is "discordant"
and a -1 is scored. Any tie between either the X's or
the Y's scores a zero for that pair. The sum of all
scores divided by the total number of distinguishable
pairs, (N(N-1))/2, gives B. If zeros are scored, the
denominator is reduced by a correction factor which is
based on the number of observations tied at a given
rank and the number of ranks involved in each of the X
and Y samples. A complete description of the
correction for ties is given in Gibbons [Ref. 2:pp.
289]. The value of B ranges between 1, indicating
perfect concordance, and -1, for perfect discordance
(Gibbons [Ref. 1:pp. 209-225]).
2. The Null and Asymptotic Distribution of B
The null distribution of B is derived from
the following recursive formula given in Gibbons [Ref.
1:pp. 216]:
u(N+1,P) = u(N,P) + u(N,P-1) + u(N,P-2) + ... + u(N,P-N)
where u(N,P) denotes the number of arrangements of N
ranks that contain exactly P concordant pairs. This
formula is used to generate the frequency with which
the possible values of P occur. Division by N! results
in the probability distribution of P. Since
B = (4P/(N(N-1)))-1, the
null distribution of B is easily determined.
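The recursion lends itself to a short Python sketch. The function names are hypothetical and the sketch assumes no ties and N of at least 2; it is meant only to show how the frequencies u(N,P) are built up and converted to the distribution of B.

```python
from fractions import Fraction


def concordant_pair_counts(n):
    """counts[p] = number of arrangements of n ranks with p concordant pairs."""
    counts = [1]                        # one rank: no pairs, one arrangement
    for size in range(1, n):            # grow from `size` ranks to `size + 1`
        new = [0] * (len(counts) + size)
        for p in range(len(new)):
            # u(size+1, p) = u(size, p) + u(size, p-1) + ... + u(size, p-size)
            new[p] = sum(counts[q] for q in range(p - size, p + 1)
                         if 0 <= q < len(counts))
        counts = new
    return counts


def kendall_b_distribution(n):
    """Null distribution of B as {b: probability}, using B = 4P/(N(N-1)) - 1."""
    counts = concordant_pair_counts(n)
    total = sum(counts)                 # equals n!
    pairs = n * (n - 1)
    return {Fraction(4 * p, pairs) - 1: Fraction(c, total)
            for p, c in enumerate(counts)}
```

For N = 4 the frequencies are 1, 3, 5, 6, 5, 3, 1, which sum to 4! = 24. Because each step only needs the previous row of frequencies, the storage required stays small even for moderately large N.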
Exact C.D.F. values were compared with those
obtained using a normal approximation, with and without
a continuity correction factor (CC = 6/(N(N²-1)),
proposed by Pitman [Ref. 11] for the Spearman's R test).
The results for various sample sizes are provided in
Table 6. As can be seen, for sample sizes greater than
13, a normal approximation with continuity correction
provides three decimal place accuracy.
TABLE 6. C.D.F. COMPARISONS FOR KENDALL'S B
(A "?" marks a character or entry that is illegible in the source copy.)

PROB[B ≥ b]; FOR SAMPLE SIZE EQUAL TO 13.
TEST STAT. VALUE   |  0.?28? |  0.4615 |  0.4?03 |  0.3590 |  0.3333 |  0.2564 |  0.2303
EXACT C.D.F.       |  .00748 |    ?    |  .02363 |  .04999 |  .06443 |    ?    |  .1?309
ERROR; NORMAL      |  .00014 |  .00121 |  .00313 |  .00620 |  .00808 |  .01473 |  .01703
ERROR; NORM. W/CC  | ?.00013 |  .00073 |  .00239 |  .00497 |  .00658 |  .01223 |  .014?5

PROB[B ≥ b]; FOR SAMPLE SIZE EQUAL TO 14.
TEST STAT. VALUE   |  0.472? |  0.4236 |  0.4066 |  0.3626 |  0.2967 |  0.2527 |  0.2033
EXACT C.D.F.       |  .00964 |    ?    |    ?    |  .03973 |  .07353 |    ?    |    ?
ERROR; NORMAL      |  .00035 |  .00140 |  .00213 |  .0043? |  .00839 |    ?    |  .?1627
ERROR; NORM. W/CC  |  .00008 |  .?0035 |  .00009 |  .00345 |  .0074? |    ?    |  .?1371
3. Hypothesis Testing
P-values for tests of no association between
X and Y are computed for three types of alternative
hypotheses. Because the distribution of B is symmetric,
all probabilities can be taken from the upper tail
using the absolute value of b, the observed value of
the test statistic. Linear interpolation is used when
b lies between tabulated values. The P-values are
computed as follows.
a. One-Sided Alternatives
The one-sided alternative tested depends
on the sign of b. A positive b will automatically test
for direct association or concordance, while a negative
b will test for indirect association or discordance.
The P-value equals Pr[B ≥ |b|].
b. Two-Sided Alternative
The P-value equals twice the probability
computed for the one-sided hypothesis.
For sample sizes greater than 12, a normal
approximation with continuity correction is used. The
approximate P-values are then
1 - Pr[ Z < ((|b| - CC) - μb)/σb ], where Z
is standard normal, CC is the continuity correction,
μb = 0, and σb² = (4N + 10)/(9N(N-1)), for the one-sided
test, and twice this P-value for the two-sided
test.
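As a rough illustration of the normal approximation just described, the following Python sketch evaluates the one-sided and two-sided P-values. The function name is hypothetical; the continuity correction is assumed to be CC = 6/(N(N²-1)), the factor quoted earlier in this section, and the two-sided value is capped at one.

```python
from math import sqrt
from statistics import NormalDist


def kendall_normal_p_value(b, n, two_sided=False):
    """Approximate P-value for Kendall's B with continuity correction."""
    cc = 6.0 / (n * (n * n - 1))                 # assumed continuity correction
    sigma_b = sqrt((4 * n + 10) / (9.0 * n * (n - 1)))
    z = (abs(b) - cc) / sigma_b                  # the null mean of B is 0
    p = 1.0 - NormalDist().cdf(z)
    return min(1.0, 2 * p) if two_sided else p
```

A call such as kendall_normal_p_value(0.2967, 14) returns the approximate upper-tail probability for an observed b of 0.2967 with 14 pairs.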
B. SPEARMAN'S R
The Spearman's R Test requires the added
assumption that the underlying bivariate distribution
is continuous. The test measures the degree of
correspondence between rankings, instead of the actual
variate values, and can be used as a measure of
association between X and Y. Gibbons [Ref. 1:pp. 226].
1. Computation of the Test Statistic
The test statistic R is computed in the
following manner. Let ri = rank(Xi), si = rank(Yi),
and Di = ri - si. Then,

$$R = 1 - \frac{6 \sum_{i=1}^{N} D_i^2}{N(N^2 - 1)}$$
where N is the size of the sample. If ties occur in X
or Y, they are resolved by assigning the average value
of the ranks involved. A correction factor, based on
the number of observations tied at a given rank and the
number of ranks involved, is included in the
calculations. A complete description of the
correction factor is given in Gibbons [Ref. 2:pp. 279].
The value of R ranges between 1, indicating perfect
direct association, and -1, for perfect indirect
association (Gibbons [Ref. 1:pp. 226-235]).
2. The Null and Asymptotic Distribution of R
The null distribution of R for a given sample
size N is generated by enumeration. The method, as
presented in Kendall [Ref. 9], involves generation of
an N by N array of all possible squared differences
between any two paired ranks of X and Y. All N!
permutations of N ranks are used to index values from
the array. The sum of these indexed values for each
permutation gives the sum of squared differences,
which is then converted to the R statistic. The
frequency distribution of R is the total number of
occurrences of each distinct value of R divided by N!.
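The enumeration method can be sketched in Python as follows. The function name is hypothetical and ties are assumed absent; the sketch builds the N by N array of squared rank differences, indexes it with every permutation, and converts each sum to R.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial


def spearman_r_distribution(n):
    """Exact null distribution of R for n untied pairs, as {r: probability}."""
    # sq[i][j] = squared difference between rank i+1 of X and rank j+1 of Y
    sq = [[(i - j) ** 2 for j in range(n)] for i in range(n)]
    denom = n * (n * n - 1)
    counts = Counter()
    for perm in permutations(range(n)):
        d2 = sum(sq[i][perm[i]] for i in range(n))   # sum of squared differences
        counts[1 - Fraction(6 * d2, denom)] += 1
    total = factorial(n)
    return {r: Fraction(c, total) for r, c in sorted(counts.items())}
```

For N = 3 the six permutations give R = -1, -1/2, 1/2, and 1 with probabilities 1/6, 1/3, 1/3, and 1/6. The N! growth of the loop makes the approach practical only for small samples.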
Due to mainframe computer memory limitations
in the APL environment, generation of the distribution
of R was limited to sample sizes of 7 or less.¹ Using
tables provided by Gibbons [Ref. 2:pp. 417-418] to
supplement computer computations, a numerical matrix,
called PMATSP, was created to store the cumulative
distributions of R for sample sizes less than 11. This
matrix allows for quick recall of cumulative
probabilities by the Spearman's R Test program.
Exact C.D.F. values were compared with those
obtained using a Student's T approximation with (N-2)
degrees of freedom (see Glasser and Winter [Ref. 10]),
and a normal approximation. Both normal and T
approximations were computed with and without a
continuity correction factor, CC = 6/(N(N²-1)) (Pitman
[Ref. 11]). From the results presented in Table 7, the
most consistently accurate approximation is given by
the T distribution with a correction.
¹ The memory capacity of the mainframe computer in the APL environment is limited to 2.5 megabytes.

TABLE 7. C.D.F. COMPARISONS FOR SPEARMAN'S R
(A "?" marks a character or entry that is illegible in the source copy.)

PROB[R ≥ r]; FOR SAMPLE SIZE EQUAL TO 9.
TEST STAT. VALUE   |  0.7333 |  0.7167 |  0.6667 |  0.6000 |  0.5333 |  0.4333 |  0.3500
EXACT C.D.F.       |  .?0361 |  .01343 |    ?    |  .?4040 |  .07376 |  .12496 |  .179?9
ERROR; NORMAL      |    ?    | ?.00290 |  .00022 |  .0??35 |  .00305 |  .0?479 |    ?
ERROR; NORM. W/CC  | ?.00553 |    ?    |  .00095 |  .00123 |  .00493 |  .0?029 |  .0?236
ERROR; T DIST      |  .00235 |  .00352 |  .00451 |  .00459 |  .?0415 |    ?    |  .00?38
ERROR; T W/CC      |  .00153 |  .00203 |  .00251 |  .00175 |  .00042 | -.00212 | ?.00479

PROB[R ≥ r]; FOR SAMPLE SIZE EQUAL TO 10.
TEST STAT. VALUE   |  0.7455 |  0.?727 |  0.6364 |  0.5636 |    ?    |  0.406? |  0.3333
EXACT C.D.F.       |  .00370 |    ?    |  .02722 |  .04314 |    ?    |  .12374 |  .17437
ERROR; NORMAL      |    ?    |    ?    |    ?    |    ?    |  .00700 |    ?    |  .?1673
ERROR; NORM. W/CC  |  .00457 | ?.00327 |  .00210 |  .00095 |  .00451 |  .00867 |  .0??23
ERROR; T DIST      |  .00204 |  .00296 |  .00326 |  .00323 |  .00253 |  .00?59 |  .00107
ERROR; T W/CC      |  .00144 |  .00?35 |  .00?34 |    ?    | -.00035 | ?.00230 | ?.00361

3. Hypothesis Testing
P-values for tests of no association between
X and Y can be computed for three types of alternative
hypotheses. Because the distribution of R is symmetric,
all probabilities are taken from the upper tail using
the absolute value of r, the observed value of the test
statistic. The P-values are computed as follows.
a. One-Sided Alternatives
The one-sided alternative tested depends
on the sign of r. A positive r tests for direct
association, while a negative r tests for indirect
association. The P-value equals Pr[R ≥ |r|].
b. Two-Sided Alternative
The P-value equals twice the probability
computed for the one-sided hypothesis.
For sample sizes greater than 10, an
approximation based on the Student's T distribution
with (N-2) degrees of freedom and continuity
correction is used. The P-values are
1 - Pr[ T(N-2) < ((|r| - CC) - μr)/σr ],
where T(N-2) denotes the T distribution with (N-2)
degrees of freedom, CC is the continuity correction,
μr = 0, and σr² = (1 - (|r| - CC)²)/(N-2), for the one-sided
test, and twice this P-value for the two-sided
test (Gibbons [Ref. 1:pp. 218]).
VII. NONPARAMETRIC SIMPLE LINEAR REGRESSION¹
Nonparametric Linear Regression assumes that the
data consists of independent pairs of observations from
a bivariate distribution and that the regression of Y
on X is linear. The program estimates linear
regression parameters based on the data samples. It
then allows the user to input X values to predict the Y
values. Hypothesis testing and confidence interval
estimation for the slope of the regression equation are
offered. If the estimated slope lies outside the
confidence interval, an alternate regression equation
is offered with an opportunity to input X values to
predict the corresponding Y values.
A. COMPUTATION OF THE ESTIMATED REGRESSION EQUATION
The least squares method is used to estimate A and
B in the regression equation Yi = A + BXi + ei
(i=1,2,...N), where ei (unobservable errors) are
assumed to be independent and identically distributed.
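A and B are computed from the following equations:

$$B = \frac{N \sum_{i=1}^{N} X_i Y_i - \sum_{i=1}^{N} X_i \sum_{i=1}^{N} Y_i}{N \sum_{i=1}^{N} X_i^2 - \left(\sum_{i=1}^{N} X_i\right)^2}$$

$$A = \frac{\sum_{i=1}^{N} Y_i - B \sum_{i=1}^{N} X_i}{N}$$

¹ Except for program design considerations, the information and concepts provided in this section are paraphrased from Conover [Ref. 12:pp. 263-271].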
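The least squares estimates of A and B can be sketched in Python as shown here. The function name is hypothetical, and the sketch assumes the X values are not all equal, so that the denominator of B is nonzero.

```python
def least_squares_line(x, y):
    """Return (A, B) for the fitted line Y = A + B*X."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # slope
    a = (sy - b * sx) / n                            # intercept
    return a, b
```

For the points (1, 3), (2, 5), (3, 7), which lie exactly on Y = 1 + 2X, the function returns A = 1 and B = 2.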
B. HYPOTHESIS TESTING
P-values for testing hypotheses about the slope of
the regression equation are based on the Spearman's
rank correlation coefficient R between the Xi and Ui =
Yi - B0Xi, where B0 is the hypothesized slope. The
appropriate one-sided test of hypothesis, H0: B = B0
versus H1: B < B0 or H1: B > B0, is automatically
chosen based on the sign of the computed test statistic
r (positive r tests H1: B > B0; negative r tests H1:
B < B0). The P-value is computed as Pr[R ≥ |r|].
P-values for two-sided tests, H0: B = B0 versus H1: B ≠
B0, are also presented.
For sample sizes N greater than 10, P-values are
approximated using a T distribution with (N-2) degrees
of freedom and continuity correction.
C. CONFIDENCE INTERVAL ESTIMATION
100(1-α)% confidence bounds for the slope
parameter B are determined as follows. The n possible
slopes, Sij = (Yi-Yj)/(Xi-Xj), are computed for all
pairs of data (Xi,Yi) and (Xj,Yj) such that i < j and
Xi ≠ Xj and rearranged in increasing order to give S(1)
≤ S(2) ≤ ... ≤ S(n). Let w be the (1-α/2) percentile
of the distribution of Kendall's statistic for a sample
of size N.¹ Let d be the largest integer less than or
equal to (n-w)/2 and u the smallest integer greater
than or equal to (n+w)/2 + 1. Then S(d) and S(u) are
the desired lower and upper confidence bounds,
respectively.
For sample sizes larger than 13, a normal
approximation with continuity correction is used to
estimate the confidence intervals.
If the slope of the estimated regression equation
does not lie within the computed confidence interval,
the program automatically calculates a new regression
equation where the slope is the median of the two-point
slopes Sij and the intercept is the difference of the
medians of the X and Y samples, MY - MX.²
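The slope bounds and the median slope can be sketched in Python as follows. The function names are hypothetical, and the percentile w of Kendall's statistic is taken as an input rather than looked up, which is an assumption of this sketch.

```python
from math import ceil, floor
from statistics import median


def pairwise_slopes(x, y):
    """Sorted two-point slopes over all pairs i < j with distinct X values."""
    return sorted((y[j] - y[i]) / (x[j] - x[i])
                  for i in range(len(x)) for j in range(i + 1, len(x))
                  if x[i] != x[j])


def slope_confidence_bounds(x, y, w):
    """Return (S(d), S(u)) for a supplied percentile w of Kendall's statistic."""
    s = pairwise_slopes(x, y)
    n = len(s)                          # number of usable slopes
    d = floor((n - w) / 2)
    u = ceil((n + w) / 2 + 1)
    if d < 1 or u > n:
        raise ValueError("w is too large for this many slopes")
    return s[d - 1], s[u - 1]           # convert 1-based positions to indexes


def median_slope(x, y):
    """Median of the two-point slopes, used for the alternate equation."""
    return median(pairwise_slopes(x, y))
```

With x = [1, 2, 3, 4] and y = [1, 3, 2, 5] the six ordered slopes are -1, 0.5, 1, 4/3, 2, 3. Taking w = 2 gives d = 2 and u = 5, so the bounds are 0.5 and 2, and the median slope is 7/6.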
¹ Kendall's statistic is defined here as Nc - Nd, where Nc is the number of concordant pairs of observations and Nd is the number of discordant pairs. Conover [Ref. 12:p. 256].
² This procedure is recommended by Conover [Ref. 12:pp. 256].
VIII. AREAS FOR FURTHER WORK
To create a more versatile and powerful software
package, the NONPAR workspace could be expanded to
include some or all of the following nonparametric
tests: tests for randomness based on runs, chi-square
and Kolmogorov-Smirnov (K-S) goodness-of-fit tests,
chi-square and K-S general two-sample distribution
tests, the chi-square test for independence, and the
Friedman test for association.
LIST OF REFERENCES
1. Gibbons, J. D., Nonparametric Statistical Inference, McGraw-Hill, Inc., 1971.
2. Gibbons, J. D., Nonparametric Methods for Quantitative Analysis, Holt, Rinehart, and Winston, 1976.
3. Pratt, J. W. and Gibbons, J. D., Concepts of Nonparametric Theory, pp. 160-176, Springer-Verlag, Inc., 1981.
4. Iman, R. L., "Use of a T-statistic as an Approximation to the Exact Distribution of the Wilcoxon Signed-ranks Test Statistic," Communications in Statistics, vol. 3, no. 8, pp. 795-806, 1974.
5. Conover, W. J., Practical Nonparametric Statistics, John Wiley and Sons, Inc., 1971.
6. Harding, E. F., "An Efficient, Minimal-storage Procedure for Calculating the Mann-Whitney U, Generalized U and Similar Distributions," Applied Statistics, vol. 33, no. 1, pp. 1-6, 1984.
7. Iman, R. L., "An Approximation to the Exact Distribution of the Wilcoxon-Mann-Whitney Rank Sum Test Statistic," Communications in Statistics -- Theoretical Methods, A5, no. 7, pp. 587-598, 1976.
8. Iman, R. L. and Davenport, J. M., "New Approximations to the Exact Distribution of the Kruskal-Wallis Test Statistic," Communications in Statistics -- Theoretical Methods, A5, no. 14, pp. 1335-1348, 1976.
9. Kendall, M. G., Kendall, S. F., and Smith, B. B., "The Distribution of the Spearman's Coefficient of Rank Correlation in a Universe in Which All Rankings Occur an Equal Number of Times," Biometrika, vol. 30, pp. 251-273, 1939.
10. Glasser, G. J. and Winter, R. F., "Critical Values of the Coefficient of Rank Correlation for Testing the Hypothesis of Independence," Biometrika, vol. 48, pp. 444-448, 1961.
11. Pitman, E. J. G., "Significance Tests Which May be Applied to Samples From any Populations. II. The Correlation Coefficient Test," Journal of the Royal Statistical Society, Supplement 4, pp. 225-232, 1937.
12. Conover, W. J., Practical Nonparametric Statistics, 2d ed., John Wiley and Sons, Inc., 1980.
APPENDIX A
DOCUMENTATION FOR THE MICROCOMPUTER WORKSPACE
1. General Information
This appendix describes the organization and
operation of the IBM-PC (or compatible) version of the
workspace. Appendix C continues from where this
appendix leaves off, to walk the user through each test
by working practical examples.
Before proceeding any further, the user should
refer to section II (Workspace Design Issues) for
general information about workspace requirements and
assumptions regarding its use.
To get started, enter the APL environment in the
usual manner and load the NONPAR workspace.
2. Workspace Menus
This workspace is designed around the use of
menus. They guide the user through the selection
process of choosing a nonparametric test and a test
option. Three types of menus are used: the main menu,
test menus, and help menus.
a. The Main Menu
Within moments of loading the NONPAR
workspace, the main menu will appear. It is titled
Nonparametric Statistical Tests. This menu presents
general information about the workspace. Its primary
purpose is to list the choices of nonparametric tests
available and provide an option which allows the user
to exit the main menu into APL to copy data into the
workspace or return to DOS. Each test choice is listed
with some information about the test's area of
application. To make a selection from the menu, move
the cursor (using the cursor keys) to highlight the
desired choice, and press Enter. As a reminder to the
user, a footnote at the bottom of the screen describes
the procedure for entering a choice. Once a test has
been selected from the main menu, a sub-menu
appropriate to the test appears. To exit from any menu
back to the main menu, press the Escape key.
b. Test Menus
The title of the test menu is the name of the
nonparametric test chosen. The text portion of the
menu gives a general overview of the test, including
the method used to compute the test statistic and a
description of the various options that may be
exercised. The third section consists of the list of
test options available. These options include
returning to the main menu or choosing the help menu.
Test menus may have options listed in single or
multiple-paged formats. The comment in the final block
of the menu lets the user know if a certain menu is
multiple-paged or not. To make a selection from a
multiple-paged menu, use the page-up or page-down key
to locate the desired option. Proceed with the scroll
keys to highlight the choice, and press enter. Once a
test option is entered, the user is prompted to input
the data required to run the test. When the option for
more information is selected, the help menu is
displayed.
c. Help Menus
The title of the help menu usually begins
with the words "More Information About..." followed by
the title of the nonparametric test. The text portion
of the menu explains the test and its options in
greater detail. No choices are offered in the menu.
To return to the test menu, press any key.
APPENDIX B
DOCUMENTATION FOR THE MAINFRAME COMPUTER WORKSPACE
1. General Information
This appendix describes the organization and
operation of the mainframe computer workspace. To load
a copy of the NONPAR workspace from the APL library,
enter the APL environment and type: )LOAD 9 NONPAR.
Within a few moments the variables LIST and DESCRIBE
are displayed on the screen. These variables provide a
description of the workspace.
2. The NONPAR Workspace
The NONPAR workspace consists of seven
programs which call several subprograms during their
execution. The exact syntax for each test and its
corresponding nonparametric test name is given in the
following format:
SYNTAX: Nonparametric Test and Application.
a. SIGN: Ordinary Sign Test for Location in Single and Paired-sample Data.
b. WILCOX: Wilcoxon Signed-rank Test for Location in Single and Paired-sample Data.
c. MANNWHIT: Mann-Whitney Test for Equal Medians or Variances in Two Independent Samples.
d. KRUSKAL: Kruskal-Wallis Test for Equal Medians in K Independent Samples.
e. KENDALL: Kendall's B; Measure of Association for Paired-sample Data.
f. SPEARMAN: Spearman's R; Measure of Association Between Rankings of Paired Data.
g. NPSLR: Nonparametric Simple Linear Regression; Least Squares.
The list presented above can be displayed at any
time by typing: LIST.
For each test program, there exists a HOW variable
that gives a full description of the test and the
various options that may be exercised. To display any
of the HOW variables, just enter the test program's
name with the suffix HOW appended (e.g., SIGNHOW).
A test is run by entering the program's name. The
user is immediately prompted to input data. Enter
numerical data separated by spaces or as a variable to
which the numbers have been previously assigned.
Several of the tests require a considerable amount of
prompting before all the necessary data has been
entered.
APPENDIX C
WORKSPACE FAMILIARIZATION THROUGH PRACTICAL EXAMPLES
1. General Information
This appendix applies to both the mainframe and
microcomputer workspaces. Its purpose is to acquaint
the user with the organization of the programs and the
type of prompts to be expected.
Extensive error checking has been included in the
programs to ensure that the data is of the proper form.
Should a program become suspended, clear the state
indicator by entering: )RESET, check over the data for
errors, and restart the program. 330 kilobytes of
computer memory are needed to load APL and the NONPAR
workspace; to avoid filling up the remaining workspace
area, the user should minimize data storage in the
NONPAR workspace. To exit a program at any time, press
the Control and Escape keys, simultaneously.
2. Practical Examples
a. Sign Test
(1) Description of Problem 1. A Sinclair mine
is manufactured to have a median explosive weight of
not less than 16 ounces. The explosive weights of 15
mines, randomly selected from the production line, were
recorded as follows: 16.2 15.7 15.9 15.8 15.9 16 16.1
15.8 15.9 16 16.1 15.7 15.8 15.9 15.8.
(a) Is the manufacturing process packing
enough explosives in the mines?
(b) What range of values can be expected
for the median of the explosive weights 90% of the
time?
(2) Solution. To see if the manufacturing
process is meeting the specifications, we test the
hypothesis H0: M = 16 versus H1: M < 16.
(3) Workspace Decision Process.
(a) Microcomputer: Choose the Sign Test
from the main menu, and the option, Single Sample;
Test H0: M = Mo versus H1: M < Mo, from the test menu.
Skip to the Program Interaction section below.
(b) Mainframe: Enter SIGN at the keyboard
and receive the prompt:
DID YOU ENTER THIS PROGRAM FOR THE SOLE
PURPOSE OF GENERATING CONFIDENCE INTERVALS FOR A
SPECIFIED SAMPLE SIZE AND QUANTILE? (Y/N).
Enter N (If Y is entered, the user will
go directly to this last option of the test). The next
prompt is:
THE NULL HYPOTHESIS STATES - THE
POPULATION MEDIAN (M) IS EQUAL TO THE HYPOTHESIZED
MEDIAN (Mo); H0: M = Mo. WHICH ALTERNATIVE DO YOU WISH
TO TEST? ENTER: 1 FOR H1: M < Mo; 2 FOR H1: M > Mo;
3 FOR H1: M ≠ Mo.
Enter 1. The next prompt is:
ENTER: 1 FOR SINGLE-SAMPLE PROBLEM; 2
FOR PAIRED-SAMPLE PROBLEM.
Enter 1.
(4) Program Interaction. The prompt is:
ENTER THE DATA (MORE THAN TWO OBSERVATIONS
ARE REQUIRED).
Enter the data separated by spaces or as a
variable to which the data has been previously
assigned. The next prompt is:
ENTER THE HYPOTHESIZED MEDIAN.
Enter 16. The following is displayed.
COMPUTATIONS ARE BASED ON A SAMPLE SIZE OF:
13.
THE TOTAL NUMBER OF POSITIVE SIGNS IS: 3.
THE P-VALUE FOR H0: M = 16 Versus H1:
M < 16 IS: .0461.
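The displayed values can be checked with a short Python sketch of the lower-tail sign test. The function name is hypothetical, and dropping observations equal to the hypothesized median is an assumption that is consistent with the reported sample size of 13.

```python
from math import comb


def sign_test_lower_p(data, m0):
    """Return (n, k, p): usable sample size, positive signs, Pr[K <= k]."""
    signs = [v - m0 for v in data if v != m0]   # drop observations equal to m0
    n = len(signs)
    k = sum(1 for s in signs if s > 0)
    # Under H0 the number of positive signs is binomial(n, 1/2).
    p = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return n, k, p


weights = [16.2, 15.7, 15.9, 15.8, 15.9, 16, 16.1,
           15.8, 15.9, 16, 16.1, 15.7, 15.8, 15.9, 15.8]
n, k, p = sign_test_lower_p(weights, 16)
print(n, k, round(p, 4))   # 13 3 0.0461
```

The P-value is 378/8192, the binomial probability of three or fewer positive signs in 13 trials.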
Consider a significance level of .05.
Since the P-value of .0461 is less than .05, we reject
H0: M = 16 in favor of H1: M < 16 and conclude that the
manufacturing process is not packing enough explosives
in the Sinclair mine. The next prompt is:
WOULD YOU LIKE A CONFIDENCE INTERVAL FOR
THE MEDIAN? (Y/N).
Enter Y (If N is entered, the program asks
if confidence intervals for a quantile are desired).
The next prompt is:
ENTER THE DESIRED CONFIDENCE COEFFICIENT;
FOR EXAMPLE: ENTER 95, FOR A 95% CONFIDENCE INTERVAL.
Enter 90. The following is displayed.
A 90% CONFIDENCE INTERVAL FOR THE MEDIAN OF
THE POPULATION IS: ( 15.8 < MEDIAN < 16 ).
The next prompt is:
WOULD YOU LIKE CONFIDENCE INTERVALS FOR A
SPECIFIED QUANTILE? (Y/N).
To see the form of the results, we generate
confidence intervals for the 30th quantile. Sample
size is automatically set at the number of data points
entered earlier. Enter Y (If N is entered, the
mainframe program ends; or, the Sign test menu
reappears). The next prompt is:
ENTER DESIRED QUANTILE; FOR EXAMPLE: ENTER
20, FOR THE 20TH QUANTILE.
Enter 30. The following is displayed.
ORDER STATISTICS | COEFFICIENTS
      3   8      |   .82316
      2   9      |   .94949
      [third row illegible in the source copy]
***** THIS TABLE GIVES CONFIDENCE COEFFICIENTS
FOR VARIOUS INTERVALS WITH ORDER STATISTICS AS
END POINTS FOR THE 30TH QUANTILE.
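The confidence coefficients in such a table follow from the binomial distribution, as the Python sketch below illustrates. The function name is hypothetical, the population is assumed continuous, and the sample size of 15 is assumed to be the full number of data points entered for Problem 1.

```python
from math import comb


def interval_coefficient(n, q, r, s):
    """Pr[X(r) < q-quantile < X(s)] for order statistics of a sample of size n."""
    # The quantile lies between X(r) and X(s) exactly when the number of
    # observations below it is at least r and at most s - 1.
    return sum(comb(n, k) * q ** k * (1 - q) ** (n - k) for k in range(r, s))


print(round(interval_coefficient(15, 0.3, 3, 8), 5))   # 0.82316
print(round(interval_coefficient(15, 0.3, 2, 9), 5))   # 0.94949
```

These two values agree with the legible rows of the displayed table for the 3rd and 8th, and the 2nd and 9th, order statistics.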
The mainframe program ends. The menu-
driven microcomputer program pauses for input from the
keyboard by prompting:
PRESS ENTER WHEN READY.
Press Enter and the Sign test menu
reappears.
b. Wilcoxon Signed-rank Test
(1) Description of Problem 2. A special training
program is being considered to replace the regular
training that Radio Telephone Operators receive. In
order to evaluate the effectiveness of the new training
program, proficiency tests were given during the third
week of regular training. Twenty-four trainees were
chosen at random and grouped into twelve pairs based on
proficiency test scores. One member of each pair
received specialized training while the other member
received regular training. Upon graduation, the
proficiency tests were given again with the following
results.
Specially Trained Group (X): 60 50 55 71
43 59 64 49 61 54 47 70
Regularly Trained Group (Y): 40 46 60 53
49 57 51 53 45 59 40 35
(a) Does the special training program
ensure higher scores?
(b) By what range of values can the
scores of the two groups be expected to differ 95% of
the time?
(2) Solution. To test the hypothesis that the
special training program raises proficiency scores, we
test H0: M(X-Y) = 0 versus H1: M(X-Y) > 0.
(3) Workspace Decision Process
(a) Microcomputer: Choose the Wilcoxon
Signed-rank Test from the main menu, and the option,
Paired-sample; Test H0: M = Mo versus H1: M > Mo, from
the test menu. Skip to the Program Interaction section
below.
(b) Mainframe: Enter WILCOX at the key-
board and receive the prompts:
THE NULL HYPOTHESIS STATES - THE
POPULATION MEDIAN (M) IS EQUAL TO THE HYPOTHESIZED
MEDIAN (Mo); H0: M = Mo. WHICH ALTERNATIVE DO YOU WISH
TO TEST? ENTER: 1 FOR H1: M < Mo; 2 FOR H1: M > Mo;
3 FOR H1: M ≠ Mo.
Enter 2. The next prompt is:
ENTER: 1 FOR SINGLE-SAMPLE PROBLEM; 2
FOR PAIRED-SAMPLE PROBLEM.
Enter 2.
(4) Program Interaction. The prompt is:
ENTER X DATA (MORE THAN TWO OBSERVATIONS
ARE REQUIRED).
Enter the X data separated by spaces. The
next prompt is:
ENTER Y DATA (NUMBER OF Y ENTRIES MUST
EQUAL NUMBER OF X ENTRIES).
Enter the Y data. The next prompt is:
ENTER THE HYPOTHESIZED MEDIAN FOR THE
DIFFERENCES OF THE PAIRED DATA.
Enter 0. The following is displayed.
COMPUTATIONS ARE BASED ON A SAMPLE SIZE OF:
12.
THE TOTAL SUM OF POSITIVE RANKS IS: 60.5.
THE P-VALUE FOR COMPARING THE MEDIAN OF THE
POPULATION OF DIFFERENCES TO THE HYPOTHESIZED MEDIAN,
H0: M(X-Y) = 0 versus H1: M(X-Y) > 0, IS: .0505.
Consider a significance level of .05.
Since the P-value of .0505 is greater than .05, we do
not reject the null hypothesis that the two training
courses are equally effective. However, due to the
closeness in values, the choice of rejecting or not
rejecting the null hypothesis is strictly a judgement
call. The next prompt is:
WOULD YOU LIKE A CONFIDENCE INTERVAL FOR
THE MEDIAN? (Y/N).
Enter Y (If N is entered, the mainframe
program ends; or, the Wilcoxon test menu reappears).
The next prompt is:
ENTER THE DESIRED CONFIDENCE COEFFICIENT;
FOR EXAMPLE: ENTER 95, FOR A 95% CONFIDENCE INTERVAL.
Enter 95. The following is displayed.
A 95% CONFIDENCE INTERVAL FOR THE MEDIAN OF
THE POPULATION OF DIFFERENCES IS:
( -1 ≤ MEDIAN(X-Y) ≤ 16.5 ).
The mainframe program ends. The menu-
driven microcomputer program pauses for input from the
keyboard by prompting:
PRESS ENTER WHEN READY.
Press Enter and the Wilcoxon test menu
reappears.
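The statistic reported above, the total sum of positive ranks, can be reproduced by hand: form the differences X - Y less the hypothesized median, drop zeros, rank the absolute differences with mid-ranks for ties, and add the ranks of the positive differences. The Python sketch below illustrates this for the Problem 2 data. It is an illustration only, not the workspace's APL code, and the function names are hypothetical.

```python
def midranks(values):
    """Rank values 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def signed_rank_sum(x, y, m0=0.0):
    """Return (sample size after dropping zeros, sum of positive ranks)."""
    d = [a - b - m0 for a, b in zip(x, y)]
    d = [v for v in d if v != 0]
    ranks = midranks([abs(v) for v in d])
    t_plus = sum(r for v, r in zip(d, ranks) if v > 0)
    return len(d), t_plus

x = [60, 50, 55, 71, 43, 59, 64, 49, 61, 54, 47, 70]  # specially trained
y = [40, 46, 60, 53, 49, 57, 51, 53, 45, 59, 40, 35]  # regularly trained
n, t_plus = signed_rank_sum(x, y)
print(n, t_plus)  # 12 60.5, matching the program output above
```

The P-value itself depends on the null distribution of this sum, which the sketch does not attempt to reproduce.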
c. Mann-Whitney Test for Equality of Medians
(1) Description of Problem 3. A group of Army
and Navy officers were given the Defense Language
Aptitude test. From the results, 14 Army and 17 Navy
officers' scores were randomly selected. These scores
are listed below.
Army (X): 35 30 55 51 28 25 16 63 60 44 20
42 47 38.
Navy (Y): 54 26 41 43 37 34 39 50 46 49 45
33 29 36 38 42 34.
(a) Is there sufficient evidence to claim
that Navy officers score higher on this test than Army
officers?
(b) By what range of values can the
scores between the two groups be expected to differ 90%
of the time?
(2) Solution. To see if Navy officers score
higher on the exam, we test H0: Mx = My versus
H1: Mx < My.
(3) Workspace Decision Process.
(a) Microcomputer: Choose the Mann-
Whitney Test from the main menu, and the option, Test
H0: Mx = My versus H1: Mx < My, from the test menu.
Skip to the Program Interaction section below.
(b) Mainframe: Enter MANNWHIT at the
keyboard and receive the prompts:
DO YOU WISH TO COMPARE THE MEDIANS OR
VARIANCES OF THE POPULATIONS? ENTER: 1 TO COMPARE
MEDIANS; 2 TO COMPARE VARIANCES.
Enter 1. The next prompt is:
THE NULL HYPOTHESIS STATES - THE
MEDIANS OF X AND Y ARE EQUAL; Mx = My. WHICH
ALTERNATIVE DO YOU WISH TO TEST? ENTER:
1 FOR H1: Mx < My; 2 FOR Mx > My; 3 FOR Mx ≠ My.
Enter 1.
(4) Program Interaction. The prompt is:
ENTER X DATA (MORE THAN ONE OBSERVATION IS
REQUIRED).
Enter the X data separated by spaces. The
next prompt is:
ENTER Y DATA.
Enter the Y data. The following is
displayed.
THE SUM OF THE X RANKS IS: 224. THE U
STATISTIC EQUALS: 119.
THE P-VALUE FOR H0: Mx = My versus H1:
Mx < My IS: .5078.
We do not reject the hypothesis of equal
population medians; the data give no evidence that Navy
officers score higher than Army officers. The next prompt is:
WOULD YOU LIKE A CONFIDENCE INTERVAL FOR
THE SHIFT IN LOCATION (My - Mx)? (Y/N).
Enter Y (If N is entered, the mainframe
program ends; or, the Mann-Whitney test menu
reappears). The next prompt is:
ENTER THE DESIRED CONFIDENCE COEFFICIENT;
FOR EXAMPLE: ENTER 95, FOR A 95% CONFIDENCE INTERVAL.
Enter 95. The following is displayed.
A 95% CONFIDENCE INTERVAL FOR THE SHIFT IN
LOCATION BETWEEN POPULATIONS X AND Y IS:
( -10 ≤ My-Mx ≤ 10 ).
The mainframe program ends. The microcom-
puter program pauses for input from the keyboard by
prompting:
PRESS ENTER WHEN READY.
Press Enter and the Mann-Whitney test menu
reappears.
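The two statistics displayed above are related by U = RX - n(n+1)/2, where RX is the sum of the X ranks in the combined sample and n is the number of X observations. The Python sketch below illustrates the calculation on the Problem 3 data, with mid-ranks for ties. It is an illustration only, not the workspace's APL code, and the function names are hypothetical.

```python
def midranks(values):
    """Rank values 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """Return (sum of X ranks in the combined sample, U statistic)."""
    ranks = midranks(list(x) + list(y))
    rx = sum(ranks[:len(x)])
    u = rx - len(x) * (len(x) + 1) / 2
    return rx, u

army = [35, 30, 55, 51, 28, 25, 16, 63, 60, 44, 20, 42, 47, 38]
navy = [54, 26, 41, 43, 37, 34, 39, 50, 46, 49, 45, 33, 29, 36, 38, 42, 34]
rx, u = mann_whitney_u(army, navy)
print(rx, u)  # 224.0 119.0, matching the program output above
```

Here U equals its null mean, (14 x 17)/2 = 119, which is consistent with the reported P-value of about one half.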
d. Mann-Whitney Test for Equality of Variances
(1) Description of Problem. Referring to
Problem 3 in section c(1), is there sufficient
evidence to claim that Army scores vary more than Navy
scores?
(2) Solution. To see if Army scores vary
more, we test H0: Vx = Vy versus H1: Vx > Vy.
(3) Workspace Decision Process.
(a) Microcomputer: Choose the Mann-
Whitney Test from the main menu, and the option, Test
H0: Vx = Vy versus H1: Vx > Vy, from the test menu and
receive the prompt:
ENTER THE DIFFERENCE OF THE MEANS OR
MEDIANS (Mx - My).
Because we believe the population
medians to be approximately equal, we enter 0. Skip to
the Program Interaction section below.
(b) Mainframe: Enter MANNWHIT at the
keyboard and receive the prompts:
DO YOU WISH TO COMPARE THE MEDIANS OR
VARIANCES OF THE POPULATIONS? ENTER: 1 TO COMPARE
MEDIANS; 2 TO COMPARE VARIANCES.
Enter 2. The next prompt is:
THE TEST TO COMPARE VARIANCES, REQUIRES
THE TWO POPULATION MEANS OR MEDIANS TO BE EQUAL. IF
THEY DIFFER BY A KNOWN AMOUNT, THE DATA CAN BE ADJUSTED
BEFORE APPLYING THE TEST. ENTER THE DIFFERENCE OF
MEDIANS (Mx - My) OR 900 TO QUIT.
We enter 0. The next prompt is:
THE NULL HYPOTHESIS STATES - THE
VARIANCES OF X AND Y ARE EQUAL; Vx = Vy. WHICH
ALTERNATIVE DO YOU WISH TO TEST? ENTER:
1 FOR H1: Vx < Vy; 2 FOR Vx > Vy; 3 FOR Vx ≠ Vy.
Enter 2.
(4) Program Interaction. The prompt is:
ENTER X DATA (MORE THAN ONE OBSERVATION IS
REQUIRED).
Enter the X data separated by spaces. The
next prompt is:
ENTER Y DATA.
Enter the Y data. The following is
displayed.
THE SUM OF THE X RANKS IS: 166. THE U
STATISTIC EQUALS: 61.
THE P-VALUE FOR H0: Vx = Vy versus H1:
Vx > Vy IS: .0112.
Consider a significance level of .05.
Since a P-value of .0112 is less than .05, we reject
the null hypothesis of equal variances in favor of
Vx > Vy and conclude that Army scores do vary more than
Navy scores.
The mainframe program ends. The
microcomputer program pauses for input from the key-
board by prompting:
PRESS ENTER WHEN READY.
Press Enter and the Mann-Whitney test menu
reappears.
e. Kruskal-Wallis Test
(1) Description of Problem 4. During a recent
Monster Mash involving four Navy SEAL Teams, one of the
events consisted of the number of pushups a man could
do in 2 minutes. Eight men were chosen randomly from
each Team. The following scores were recorded.
SEAL 1: 90 96 102 85 65 77 88 70.
SEAL 2: 64 79 99 95 87 74 69 97.
SEAL 3: 101 66 93 89 71 60 76 98.
SEAL 4: 72 78 73 81 83 92 94 86.
Are the different SEAL Teams considered to
be equally fit?
(2) Solution. To see if the SEAL Teams are
equally fit, we test the hypothesis that all the
population medians are equal.
(3) Workspace Decision Process.
(a) Microcomputer: Choose the Kruskal-
Wallis Test from the main menu; and, once the test menu
is displayed, press Enter.
(b) Mainframe: Enter KRUSKAL at the
keyboard.
(4) Program Interaction. The prompt is:
ENTER THE NUMBER OF POPULATIONS TO BE
COMPARED (MUST BE GREATER THAN TWO).
Enter 4. The next prompt is:
ENTER YOUR FIRST SAMPLE.
Enter the SEAL 1 data separated by spaces.
The next prompt is:
ENTER YOUR NEXT SAMPLE.
Enter the SEAL 2 data. The next prompt is:
ENTER YOUR NEXT SAMPLE.
Enter the SEAL 3 data. The next prompt is:
ENTER YOUR LAST SAMPLE.
Enter the SEAL 4 data. The following is
displayed.
THE H STATISTIC EQUALS: .1335.
THE P-VALUE FOR H0: THE POPULATION MEDIANS
ARE EQUAL versus H1: AT LEAST TWO POPULATION MEDIANS
ARE NOT EQUAL IS: .98893.
We do not reject the null hypothesis that
the population medians are equal and conclude that the
SEAL Teams are equally fit.
The mainframe program ends. The microcom-
puter program pauses for input from the keyboard by
prompting:
PRESS ENTER WHEN READY.
Press Enter and the Kruskal-Wallis test
menu reappears.
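The H statistic displayed above can be checked directly from the rank sums of the four samples: H = 12/(N(N+1)) times the sum of R_i^2/n_i, less 3(N+1), divided by a correction factor for ties. The Python sketch below illustrates this on the Problem 4 data. It is an illustration only, not the workspace's APL code; the function names are hypothetical, and the P-value lookup is not reproduced.

```python
from collections import Counter

def midranks(values):
    """Rank values 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(samples):
    """Kruskal-Wallis H with the usual correction for ties."""
    pooled = [v for s in samples for v in s]
    n = len(pooled)
    ranks = midranks(pooled)
    total, start = 0.0, 0
    for s in samples:
        r = sum(ranks[start:start + len(s)])  # rank sum of this sample
        total += r * r / len(s)
        start += len(s)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = sum(t**3 - t for t in Counter(pooled).values())
    return h / (1 - ties / (n**3 - n))  # divisor is 1 when there are no ties

seal = [
    [90, 96, 102, 85, 65, 77, 88, 70],
    [64, 79, 99, 95, 87, 74, 69, 97],
    [101, 66, 93, 89, 71, 60, 76, 98],
    [72, 78, 73, 81, 83, 92, 94, 86],
]
h = kruskal_wallis_h(seal)
print(round(h, 4))  # 0.1335, matching the program output above
```

The four rank sums here are 139, 134, 128 and 127, close to their common expected value of 132, which is why H is so small.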
f. Kendall's B
(1) Description of Problem 5. In order to
determine if cold weather affects target marksmanship,
Naval Special Warfare recorded small arms marksmanship
scores and corresponding air temperatures for a period
of one year. Twenty men were chosen at random, and their
scores averaged for different air temperatures. The
average score for each air temperature is shown below.
Air temperature (X): 50 55 20 50 65 55 30
52 40 60.
Average scores (Y): 210 200 165 165 260
215 175 191 180 235.
Can it be said that colder temperatures
have an effect on marksmanship scores? Is that effect
positive or negative?
(2) Solution. We test the null hypothesis
that no association exists between cold temperatures
and marksmanship.
(3) Workspace Decision Process.
(a) Microcomputer: Choose Kendall's B
Test from the main menu; and, once the test menu is
displayed, press Enter.
(b) Mainframe: Enter KENDALL at the
keyboard.
(4) Program Interaction. The prompt is:
ENTER X DATA (MORE THAN TWO OBSERVATIONS
ARE REQUIRED).
Enter the X data separated by spaces. The
next prompt is:
ENTER Y DATA (NUMBER OF Y ENTRIES MUST
EQUAL NUMBER OF X ENTRIES).
Enter the Y data. The following is
displayed.
KENDALL'S B EQUALS: .7817.
THE P-VALUE FOR H0: NO ASSOCIATION EXISTS
versus: H1: DIRECT ASSOCIATION EXISTS IS: .00045.
THE P-VALUE FOR THE TWO-SIDED TEST OF
HYPOTHESIS IS: .0009.
Since the P-value for the one-sided test
equals .00045, we reject the null hypothesis that no
association exists between temperatures and
marksmanship in favor of direct association. We
conclude that colder temperatures tend to cause lower
marksmanship scores.
The mainframe program ends. The microcom-
puter program pauses for input from the keyboard by
prompting:
PRESS ENTER WHEN READY.
Press Enter and Kendall's B test menu
reappears.
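Kendall's B compares every pair of observations: a pair scores +1 if X and Y move in the same direction, -1 if they move in opposite directions, and 0 if either is tied. The sum S is then divided by the square root of the product of the numbers of pairs untied in X and untied in Y. The Python sketch below illustrates this on the Problem 5 data. It is an illustration only, not the workspace's APL code; the function name is hypothetical, and the P-value calculation is not reproduced.

```python
from math import sqrt

def kendall_tau_b(x, y):
    """Kendall's B (tau-b) with the correction for ties in x and in y."""
    n = len(x)
    s = tied_x = tied_y = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx, dy = x[j] - x[i], y[j] - y[i]
            tied_x += dx == 0
            tied_y += dy == 0
            # +1 for a concordant pair, -1 for discordant, 0 for a tie
            s += ((dx > 0) - (dx < 0)) * ((dy > 0) - (dy < 0))
    pairs = n * (n - 1) // 2
    return s / sqrt((pairs - tied_x) * (pairs - tied_y))

temp = [50, 55, 20, 50, 65, 55, 30, 52, 40, 60]
score = [210, 200, 165, 165, 260, 215, 175, 191, 180, 235]
b = kendall_tau_b(temp, score)
print(round(b, 4))  # 0.7817, matching the program output above
```

For these data there are 38 concordant pairs, 4 discordant pairs and 3 tied pairs, so S = 34 and B = 34 / sqrt(43 x 44).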
g. Spearman's R
(1) Description of Problem 6. When fitness
reports are written, officers of the same grade are
ranked against each other based upon their demonstrated
level of performance. Last marking period, the
Commanding and Executive Officers separately ranked 9
Ensigns as shown below.
Ensigns: A B C D E F G H I
CO (X):  6 4 1 5 2 8 3 7 9
XO (Y):  5 6 3 4 1 9 7 2 8
Does any association exist between the two
sets of rankings?
(2) Solution. We test the null hypothesis
that no association exists.
(3) Workspace Decision Process.
(a) Microcomputer: Choose Spearman's R
Test from the main menu; and, once the test menu is
displayed, press Enter.
(b) Mainframe: Enter SPEARMAN at the
keyboard.
(4) Program Interaction. The prompt is:
ENTER X DATA (MORE THAN TWO OBSERVATIONS
ARE REQUIRED).
Enter the X data separated by spaces. The
next prompt is:
ENTER Y DATA (NUMBER OF Y ENTRIES MUST
EQUAL NUMBER OF X ENTRIES).
Enter the Y data. The following is
displayed.
SPEARMAN'S R EQUALS: .5500.
THE P-VALUE FOR H0: NO ASSOCIATION EXISTS
versus: H1: DIRECT ASSOCIATION EXISTS IS: .0664.
THE P-VALUE FOR THE TWO-SIDED TEST OF
HYPOTHESIS IS: .1328.
Consider a significance level of .05.
Since a P-value of .0664 exceeds .05, we do not reject
the null hypothesis that no correspondence exists
between the two sets of rankings.
The mainframe program ends. The microcom-
puter program pauses for input from the keyboard by
prompting:
PRESS ENTER WHEN READY.
Press Enter and the Spearman's R test menu
reappears.
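Spearman's R is the ordinary correlation coefficient applied to the ranks of the two samples. When there are no ties this reduces to R = 1 - 6(sum of squared rank differences)/(n(n^2 - 1)). The Python sketch below illustrates the calculation on the Problem 6 rankings, which are already ranks. It is an illustration only, not the workspace's APL code; the function name is hypothetical, and the P-value calculation is not reproduced.

```python
from math import sqrt

def spearman_r(rank_x, rank_y):
    """Correlation coefficient of two equally long vectors of ranks."""
    n = len(rank_x)
    mean_x = sum(rank_x) / n
    mean_y = sum(rank_y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rank_x, rank_y))
    sxx = sum((a - mean_x) ** 2 for a in rank_x)
    syy = sum((b - mean_y) ** 2 for b in rank_y)
    return sxy / sqrt(sxx * syy)

co = [6, 4, 1, 5, 2, 8, 3, 7, 9]  # Commanding Officer's ranking
xo = [5, 6, 3, 4, 1, 9, 7, 2, 8]  # Executive Officer's ranking
r = spearman_r(co, xo)
print(round(r, 4))  # 0.55, matching the program output above

# The no-ties shortcut gives the same value: squared differences sum to 54.
d2 = sum((a - b) ** 2 for a, b in zip(co, xo))
r_shortcut = 1 - 6 * d2 / (len(co) * (len(co) ** 2 - 1))
```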
h. Nonparametric Simple Linear Regression; Least Squares
(1) Description of Problem 7. Battery-powered
Swimmer Propulsion Units are sometimes used to aid
swimmers during long underwater swims. Recent tests
have shown that a nearly linear relationship exists
between water temperature and battery life for these
units. The following 17 data points were randomly
selected from the test results.
Water temperature (X)   Battery life (Y)
        70                   3
        65                   2.75
        50                   1.8
        40                   1.2
        60                   2.4
        55                   1.9
        52                   1.75
        50                   1.7
        43                   1.6
        40                   1.1
        72                   2.75
        55                   2
        48                   1.5
        35                   .9
        70                   3.3
        68                   3
        57                   2.3
(a) Find the fitted regression equation.
(b) For the following water temperatures,
predict the battery life of the units: 61 52 46 36.
(c) Can we determine with any certainty
if the slope of the regression line equals .05?
(d) What range of values could be used
as the slope of the estimated equation line 90% of the
time?
(2) Solution. To determine the estimated
regression equation, we use nonparametric linear
regression.
(3) Workspace Decision Process.
(a) Microcomputer: Choose Nonparametric
Simple Linear Regression from the main menu; and, once
the test menu is displayed, press Enter.
(b) Mainframe: Enter NPSLR at the
keyboard.
(4) Program Interaction. The prompt is:
ENTER X DATA (MORE THAN TWO OBSERVATIONS
ARE REQUIRED).
Enter the X data separated by spaces. The
next prompt is:
ENTER Y DATA (NUMBER OF Y ENTRIES MUST
EQUAL NUMBER OF X ENTRIES).
Enter the Y data. The following is
displayed.
THE LEAST SQUARES ESTIMATED REGRESSION
EQUATION IS:
Y = -1.263 + .060668X.
The next prompt is:
DO YOU WISH TO ENTER SOME X VALUES TO GET
THE PREDICTED Y'S? (Y/N).
Enter Y (If N is entered, the program skips
to hypothesis testing for the slope).
ENTER X VALUES.
Enter 61 52 46 36. The following is displayed.
THE PREDICTED Y VALUES ARE: 2.44 1.89
1.53 .92.
WOULD YOU LIKE TO RUN SOME MORE X VALUES?
Enter N. The next prompt is:
WOULD YOU LIKE TO TEST HYPOTHESIS ON B, THE
SLOPE OF THE EQUATION? (Y/N).
Enter Y (If N is entered, the program skips
to confidence interval estimation). The next prompt
is:
ENTER THE HYPOTHESIZED SLOPE.
Enter .05. The following is displayed.
SPEARMAN'S R EQUALS: .5756.
THE P-VALUE FOR H0: B = .05 versus H1: B >
.05 IS: .0079.
THE P-VALUE FOR THE TWO-SIDED TEST OF
HYPOTHESIS IS: .0158.
Consider a significance level of .05. Since
a P-value of .0079 is less than .05, we reject the null
hypothesis that B = .05 in favor of B > .05, and
conclude the slope of the regression line is greater
than .05. The next prompt is:
WOULD YOU LIKE A CONFIDENCE INTERVAL FOR
THE SLOPE? (Y/N).
Enter Y (If N is entered, the mainframe
program ends; or, the Nonparametric Regression test
menu reappears). The next prompt is:
ENTER THE DESIRED CONFIDENCE COEFFICIENT;
FOR EXAMPLE: ENTER 95, FOR A 95% CONFIDENCE INTERVAL.
Enter 90. The following is displayed.
A 90% CONFIDENCE INTERVAL FOR B, THE SLOPE
OF THE ESTIMATED REGRESSION LINE, IS:
( .05333 ≤ B ≤ .07 ).
If the estimated slope does not lie within
the confidence interval, the following would be
displayed.
THE LEAST SQUARES ESTIMATOR OF B LIES
OUTSIDE THE CONFIDENCE INTERVAL. DISCARD THE LEAST
SQUARES EQUATION AND USE:
Y = -1.4458 + .060833X.
THIS EQUATION IS BASED ON THE MEDIANS OF
THE X AND Y DATA, AND THE MEDIAN OF THE TWO-POINT
SLOPES CALCULATED FOR THE CONFIDENCE INTERVAL ON B.
The next prompt is:
DO YOU WISH TO ENTER SOME X VALUES TO GET
THE PREDICTED Y'S FROM THE NEW EQUATION? (Y/N).
To compare results, let us input the
temperatures in the new equation. Enter Y (If N is
entered, the mainframe program ends; or, the
Nonparametric Regression test menu reappears). The
next prompt is:
ENTER X VALUES.
Enter 61 52 46 36. The following is
displayed.
THE PREDICTED Y VALUES ARE: 2.265 1.72
1.35 .74.
The next prompt is:
WOULD YOU LIKE TO RUN SOME MORE X VALUES?
Enter N. The next prompt is:
WOULD YOU LIKE TO TEST HYPOTHESIS ON B, THE
SLOPE OF THE EQUATION? (Y/N).
To compare results once again, we enter Y
(If N is entered, the mainframe program ends; or, the
Nonparametric Regression test menu reappears). The
next prompt is:
ENTER THE HYPOTHESIZED SLOPE.
Enter .05. The following is displayed.
SPEARMAN'S R EQUALS: .8287.
THE P-VALUE FOR H0: B = .05 versus H1: B >
.05 IS: .0000.
THE P-VALUE FOR THE TWO-SIDED TEST OF
HYPOTHESIS IS: .0000.
The mainframe program ends. The microcom-
puter program pauses for input from the keyboard by
prompting:
PRESS ENTER WHEN READY.
Press Enter and the Nonparametric
regression test menu reappears.
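Both equations displayed in this example can be checked by hand. The least squares slope is B = (n ΣXY - ΣX ΣY)/(n ΣX² - (ΣX)²) and the intercept is A = Ybar - B Xbar. The alternate equation keeps the same form but uses the medians of X and Y in place of the means. The Python sketch below illustrates both calculations on the Problem 7 data. It is an illustration only, not the workspace's APL code. The median-based slope of .060833 is copied from the program output above rather than recomputed, and all variable names are hypothetical.

```python
from statistics import median

temp = [70, 65, 50, 40, 60, 55, 52, 50, 43, 40, 72, 55, 48, 35, 70, 68, 57]
life = [3, 2.75, 1.8, 1.2, 2.4, 1.9, 1.75, 1.7, 1.6, 1.1, 2.75, 2, 1.5,
        .9, 3.3, 3, 2.3]

# Least squares slope and intercept from the usual sums.
n = len(temp)
sum_x, sum_y = sum(temp), sum(life)
sum_x2 = sum(v * v for v in temp)
sum_xy = sum(u * v for u, v in zip(temp, life))
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * sum_x / n
print(round(a, 3), round(b, 6))  # -1.263 0.060668

# Predicted battery life at the four requested temperatures.
new_temps = [61, 52, 46, 36]
pred = [round(a + b * t, 2) for t in new_temps]
print(pred)  # [2.44, 1.89, 1.53, 0.92]

# Alternate intercept: medians replace means. The slope is taken from the
# program output (assumed to be the median of the two-point slopes).
b_med = 0.060833
a_med = median(life) - b_med * median(temp)
print(round(a_med, 4))  # -1.4458
```

The alternate line passes through the point of medians (55, 1.9) rather than the point of means, so it is less sensitive to a few extreme observations.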
APPENDIX D
MAIN PROGRAM LISTINGS FOR MICROCOMPUTER WORKSPACE
SKENA.AA-B;BX*BY;CCXCY.D;DD;DX;Dr.DXY;S;POS;NEG;XX;Y;N;DEN;NN;NUM;P;PVAL-bU;SV:T'U;V!AT-Z:X;Z:WW;CHA;E:P ' 'I a THIS FUNCTION COMPUTkS THE kENDAfi STATISTIC WHICH IS A MEASURE
2 OF ASSOCIATION BETWEEN SAMPLES. P-VALUES ARE GIVEN FOR TESTING ONE3 AND TWO-SIDED HYPOTHESIS FOR NO ASSOCIATION VERSUS ASSOCIATION.
4 a SUBPROGRAMS CALLED BY THIS FUNCTION INCLUDE: TIES, TIESK, KENDALP.5 ZNTERP, INPUT AND NORMCDF.
7 . E DISPLAY TEST MENU AND INPUT DATAI :MENU IZN l
1 0
12 BI:RrNPUT 2
Xl + (Q+)+RqORDER I IN INCREASING ORDER OF X.ORDER X IN INCREASING ORDER
'-19 a-x4AXJio20 A COMPUTE CURRENT RANKING OF Y
22 I NOW ORDER Y RANKS IN INCREASING ORDER23 D4-A AJ2 A TIES EX.ST IN EITHER X OR Y RANKED VECTOR USE MID-RANK METHOD25 DD-1 TIES D26 XX-l TIES 327 'FIND ORIGINAL RANKING OF Y WITH TIES RESOLVED28 YY-DDEC329 N-pX30 A COMPUTE NUMBER OF DISTINGUISHABLE PAIRS31 NN (Nx N-i))+232 S-pO33 AAu34 a POSITIVE ONES COME FROM A RUNS UP CONDITION- NEGATIVE I FROM RUNS DOWN35 A ZERO IS SCORED FOR TIES. MULTIPLY THE RESUL±S FOR EVERY ELEMENT AN D SUM36 Li: AA+37 EX(XAA(A+X38 CX4 XX AA <(AA+XXx)39 DX'-DX+CX40 BY.(77A (AA+X))'41 74iYY AA JAA.YY ).(-1)42 DY Y+ CY~43 DX!-DXxDY'4u POS (DXY'0)45 NGS ZO )X46 SSLS.47 -(AAcN-o )x L zl48 A SUM FINAL VECTOR TO DETERMINE S49 S.+/S50 A OBTAIN THE NUMBER OF TIES IN EACH VECTOR USING THE TIES! FUNCTION51 U-TIES B52 VTIE K D53 SU-+/ (2U)54 SV.+/ (2V55 A CALCU^ATE THE B STATISTIC INCLUDING THE CORRECTION FOR TIES56 TS( (NN-SU)x(NN-SV))*0.557 AT58 * (N>13)/NORM59 A CALL XENDALP TO CALCULATE THE RIGHT TAIL OF THE CDF OF B60 P-KENDALP N
6 :ALL :NTERP TO CALCULATE ?- 7ALUE 3Y :NTERPOLAT:ON* * PVAL-AT :NTERP P.0, -qPVALz1,)IL364 PVAL-0.5,65 *L366 CALCULA E P VALUE USINC NORMAL APPROX.67 NORM:NUM (3xAT )x((2XNN)*O68: DEN42x k2 "' N5 )*0.5'70 PVAL-1- NORMCDF Z)71 a IF a 1 POSITIVE PRINT OUT DIRECT ASSOCIATION.72 L3:.(T>0) L5.731 CHA 4'INDIRECT'
7u -L775 LS:CHA.'DIRECT'76 L7:PV-o.2xPVAL77 ;(PVsIL878 PV179 L8:'KZNDALL''S B EQUALS: '§ ('&T TCNLs0 '$E P-VALUE FOR H0: NO AliO~lZAION EXISTS rfHUS'al I ill: ' (u!CHA) I ASSOCIATIONEXSTS IS: I (4vPVAL) OTCNL82, 'TE P-VALUE FOR THE T-SIDEb TEST OF HYPOTHESIS IS: ',t4fPV),T6'NL83 'PRESS ENTER WHEN READY.'8'. WW!Ea85 -NI
7
V KRWL*NUMDENOMA;CR-D-K;AA:BBbDDE-F:N-OF;P-PVAL-RSOF:SR;TSOR-CHA:B1 TH±S FeICTIOIJ COAP'Tg T X LSKAL!WALLIS'TEST S±AtZSTC& H wHrct :s2 A MEASURE OF TEE EQUALITY OF K INDEPENDENT SAMPLES.
~31 SUBPROGRAMS CALLED BY THIS FUNCTION INCLUDE: TIES TIESK INDEXPLSAFDISTN. INTERP AND THE VARIABLES PMATKW20, PMATE.V31. PMA±J.33. PMA±"34
Sia?MAXKWUl,* PMATKW42, ANVD PMTKW'.43.~6I71A MENU CHOICES AND ROUTE TO PROPER STATEMET FOR ACTION.8S NI: B-MENU KRWLQgj
I0 MENU MAINQ1AI~11 -01i2 a PP"u1 131 31: 'ENTER THE NUMBER OF POPULATIONS TO BE COMPARED (MUST BE GREATER THAN T
15 .(- '~ (1I)0)E
S17 n INITIALIZE VECTORS E AND F AND VARIABLE C18 E*P'SOFR'p019 C'-020 ATHIS LOOP FACILITATES ENTERING THE SAMPLE VECTORS AND STORING THEM21 CHA" FIRST'
*22 L1;C-C~l23 'ETER TOUR ',(*CHA),' SAMPLE.'
25 ( (0oD -xo)/NEXT26 D-1 D27 a CONCATENATE SAMPLES AS THEY ARE ENTERED AND STORE THEM IN VECTOR E28 NEXT:E-E., D29 A RECORD THE LENGTHS OF THE SAMPLES AS THEY ARE ENTERED30 F*.PD31 CA'ET32 *!;C((XE-1 1)/Ll33 C A-14ASTI34 'C<K)/L135 A RECORD SIZE OF ALL SAMPLES WHEN COMBINED36 N.4/F
38 OF-F[739 A ORDER COMBINED SAMPLE VECTOR TO BE USED BY TIES FUNCTION40 D-E9 E'41 A CALR IND EXPLS TO INCREMENT INDEXES WHEN TIES OCCUR WITHIN ONE SAMPLE42 AA.F* INDEXPLS E43 a CALL TIES TO BREAK TIES BY MIDRANK METHOD414 BB-1 TIES D'45 C-046 a THIS LOOP CALCULATES THE H STATISTIC
* 47 L2:C*C+1*48 A SUM OF RANK S FOR EACH SAMPLE IS CALCULATED
49 SR4-+ BBilFiC)+(AAEChJD50 A CiLCUAT SUM OF 1AK SQUARED DIVIDED BY THE INDIVIDUAL SAMPLE SIZE51 SR. SR*2),FEC52 at STORE EACH CALCULATION53 SOFR.*SOFRSR54 5(< /L2551 A SUM ACROSS ALL SAMPLES*5S TSOR-+ISOFR
57 c3 CALCULATE ?INAL 3 S4TTISTIC
*59 a RECALCULATE H WITH CORRECTION FOR TIES.60 A.TIFSJ E*61 NUM- + (A *3))-(+IA).62 DEPNOM.Nx ((N*2 )1)*63 H*H* (1-(NUM.DENom))*64 A SYSTEM F LOGIC AL STATEMENTS ENSURE PROPER PROD. IS ACCESSED*65 * OF ~ '2 LOUTPUT*66 -jK>u ) IFAPPROX*67 *:3 1I
.6 A/~O 3 333 )v (Ft >)/APO.69 *(A/(F 2 l)o O PUTll)1AP.70 0 ((= )(FL:2 /Pf42
C713 -((A/(Of= 3-1 1 lflV(A/(OPX 3 2 1 1))V(A/(0F= 3 3 1 1))V(A/(OF= 3 3 2 1))) /P'ui
72 .(K=4 /P4373 r*i(01/(O >4 2 1 ,1(A POF 3 1 1)))OUTPUT75 * ((OL±]L4A(OV3j)76 *( / 0O= 3 2 11)(/(f 3 1))V('./(OF= 3 3 2))V(A/(OF= 3 3 3)))/P3377 * A/ 0OF3 2 2 v~OFL o'j1J4)/P3478 J_ AL P RPRI.T VARIABLE ACCESS CDP79 P23:P.-PMLTK20c CN4);;j80 -PM81 P31:?PXATKW31[(N-5);;382 *?M83 P33:P-PLITXU33r(N-5);;J814 -Pm85 P34:P-PI4ATKU34E(N-6);;]86 *PM87 P41 :P-PM.ATKI41[(N-5): ;]88 *PM89 Pu2:P.PMATKNu2E(N-5):;90 -PM91 P43:P-PATKNu3 E(-792 CALL ITRTO)RILCULAZZ P-VALUE BY INTERPOLATION93 PM:PAL-9. :NTERP ?S941 *(PVALa 1IOU.TPUT95] *L596] a CALCULAT P-VALUE USING TH F DIS rT U/ONE LESS D.F.IN DENOM APPROX.97] FAPPROX: F. (N-KI)*(K x (-98] PVAL1I-(( PK1, N -1)FISTN F)199) 'L5.100~ OUTPUT:2VAL-IGREATER THAN .251101 L5:,THZ 3 STATISTIC EQUALS,: ('aH) "TCNLS1021 'THE P-VALUE FOR HO: THE ?6OP&LATIOIJMqEDrANS ARE .7 UAL 7:,P,2 TCNVL.103] 1 31: AT LEAST TWOC ?0PUATON MEDIANS ARE NOP ZQUZ _ , ":PVALv,!_
COTCVL101 PRESS ENTER WHEN READY.'15 AA-0
E0lZ:'I ERROR: ENTER A SINGLE INTEGER VALUE GREATER THAN 2; TRY AGAIN.' ,TCNL108 -91
7 XANW-N-MPV2ABCG*MM*N;RX*U*NM1 PNU;PVAL-NMNUMZ ;NUMZ1 DEN-DENC;,DENCI A T&-NM GGi;L*A1*PVA PkDb* IPXkC';IAtPHA;BB;CC;UFld;2;P1 ;NN1;-NN
i st 'His 'FuNCfIOI U§S THk sUA OF RANkS AOCEDURE TO CALCULATE THE MANN-2: . WHITNEY u STATSI WHICH is USED IN COMPUTING THE P-VALUE FOR THE TEST3 a OF LOCATION AND SCALE. THE C.I. FOR (MX--!Z) THE SHIFT IN LOCATION IS4. ALSO COMPUTED. SUBPROGRAMS CALLED BY THIS FUNbTIoN INCLUDE: TIES. TIES25 *AINDEXPLS, VARMW, MANWP, * NPUT, CONFMW * NORMCDF * AND NORMPTH.
7. N1:MENL' MANWEL8,a MENU CHOICI5AND ROUTE TO PROPER STATEMENTS FOR ACTION.9.N3:D-CHOICEM PAGEDMENU MA!N1PQ
10 D F.011 (D=2,3,4 52/B112 (D=6 7 83/' 313 * D15lk114 MNU MAINQflJ15 Q16 B3:ENTR THE DIFFERENCE OF TE MEANS OR MEDIANS (Mg - I.17 DZ FF-U18 * ((pDIF)')/E319 a ENTER DATA VECTORS20 B1: 'ENTER X DATA (MORE THAN ONE OBSERVATION IS REQUIRED).'21 11.022 -(N)=0 ) /123 'ENTER Y DATA.24 MO.025 A IF CALCULATIONS INVOLVE VARIANCES ADJUST X B! THE DIFFERENCE IN MEANS26 N-.N- 01FF27 a CONCATENATE X AND Y SAMPLE VECTORS28 A.N,M
29 DETERMINE 3:ZZ OF ", IND 21 7ECTCRS AZVD ASSIGN TO .N AND X!M'3 0 NN-oN-1 mm-om32 AOMPUTE SIZE LIMIT- OF LEFT TAIL OF NULL DISTRIBUTION33 NM- (NNxMME )234 NM14L NM35 a ORDER A AND ASSIGN To B36 BA A37 C.N AM NDEXPLS A38 AB ALkLIE FUNCTION TO BREAK TIES USING MIDRANK METHOD .
39 G.1ri~ TIES40 A F FALSE CALCULATE TEST FOR VARIANCES41 :*fD=2 3 4,51/9542 a' cALVARMW TO GENERATE RANKS REQUIRED FOR VARIANCE TEST43 A C-ALW TIESTORCORD TIES IN TEE DATA AND BREAK TIES IN GO
45 0.0G TIES B46 a CLCUATE SUM OF X RANKS
48 A ONVERT TO MWINWEIT U STATISTIC49g t7RX- ((RNx (NN.1 )5+2)50 Ul.U51 A. IF SIZ OFXTMSSZEO GO TO NORMAL APPROX52 .((NMx2 )~0/L253 N4INM54 NN-INM55 AMANWP FUIJCTICN CALCULATES LEFT TAZL CUMULATIVE PROBS. OF U STATISTIC56 P.NN2 MANWP NNI57 .(D=5)/LIO58 ALOGICAL STATEMENT ENSURES ONLY LEFT SIDE OF NULL DIST IS USED59 -b(UigNMI )/L360 A CON VERT U STAT WHEN GREATER THAN LEFT TAIL VALUES61 LT.(NNxMM)-U12 A IJF gjl.ISA FRACTIONAL, rNTERPOLATE P VALUE63 23: 0(i U) NON
65 .(U2>O)/PiS66 PV.1-((PCU2+12 )+2)67 PF368 ± V.-PU> U2i))~69~ P3:PVI*(PEU2+13+PCU2+2j )+2L70 .J .cHzqxL711 NON:-%U>0)/GOL72] PV.173 -P217u GO:P7.1-P[Uj
~76: CHECE:-(U1-JJM1)/L477, PV2'-pv$73 ?V-?VI~79: PVI*P.V2so 80* L4l , I CONFIDENCE INTERVAL DESIRED GO TO LboL82, M2-(D=5)/L10L83 *a COMPUTE TEE NORMAL APPROXZIATION ,V/CORRECTIOY FACTOR
8u NUMZ-(U+0.5)-NM,85 NUMZle(U-.)-y''86 DEN-((MMxNNxCMM+NN+!))+1.2)*0.S87 Z*NUXk+DENso8 Z1.NUMZI+DEN89 RUM. I U-NM)90 DENC-i. (NN4MM-1)x(DEFN*2) ).(NN*MM-2) i-( ((NUM-0.5)*2)+(NN+MM-2) ))0.591 D9NQ1.( (NN+MM 1)c (DEN*2 ). (NN+xM2 )-((NUM+0.5)*2)+(NN 11*)))*0.592 TC. (NU -0.5 ) DENC93 TC1(NUM+0 )+DENCl94 * ( UNM )/SECOND95 PVI14((NORMCDF Z)+( (NN+MM- 2) TDISTN TC )4+296 PV. (1-(NORMCDF Z1)(1-( (NN*MM-2) TDISTN TCI)))+297 uL99 SECO fD:PVI.((NORMCD1 Z )+(1-((NN+MM-2) TDISTN TC)))+2991 Pv.{(1-(ZJORMDj 1 (N2+MM-2) TDISTN TCl))+2100) L4:2FV3,x(~(VPV
102) .103] NS:PVM-(3, 1)p(PVI PV PV3)104) 'TE SUM OF THE I RNKS IS: ',(*RX),'. THE U STATISTIC EQUALS: ',(oU1),
O2TCNLS105) A* POIA TTMENT FOR VARIANCE OUTPUTL106) .D=6 7 I VAR
L107 'THE P-IU'AUE FOR go: mX = MI U52B:M ,MOI[-;3, ZI:1(6 L& VPVMCD-1;13),CTCNL i ~'(LCCC-:), iI:'
ls09 VARPVMe( 3t1 2pjPV.PVI,PV3)110O 'THE P-AO o V1 YIE59i Hi: VZ ',(*LOGIC[D-5;1J),' Vi IS: '
S (6 4 PVM D-kH±N'RADY.111) 'PRESS ENI'R ED112) B!.2113] N1114) L8:'WOULD YOU LIKE A CONFIDENCE INTERVAL FOR THE SHIFT IN LOCATION (Ml1
Mj) (YIN).',O CNL~115 B41
-' ::a~ LPHA- f':- 0 +v11 ROUTE TO NORMAL APPROX. FOR CONF. lNT. OF LARGER SAMPLE SZZES120 .((NMx2)>80 ) L5121 A COMPUTING CONFIDENCE INTERVALS BY EXACT P-VALUE122 CDFe-P123 A INDEX POSITION OF VALUE IN CDF :5 ALPHA124 INDEX-(+/(CDF5ALPHA))
N 125 * (INDEX>0)/L61261 NDEX~l127 .*L6
a,129 a COMPUTING CONFIDENCE INTERVALS USING NORMAL APPROX. W/C.F.129 L5:DEN-( (MMxNNx(MM+NN+1))+ 12)0.5130 UALPHA. (DENx(NORMPTH ALPHA)) +NM-0.51311 A ROUND UALPHA DOWN AND INCREMENT BY ONE
132 INDEXLUALPHA+i1331 L6:IPX.NN INDEX134 CI IPX CbNFMW A135. A ',(CC).' CONFIDENCE INTERVAL FOR THE SHIFT IN138 'LOCATION BETWEEN POPULATIONS X AND Y IS:'.OTCNL.37. ' ( ' (sC[),' S M - M2 I ',(*CIE2)),' )',TCNLi38 PRESS ENTER WHEN READY.'139 BB I'140 -N3.... El:'ERROR: SAMPLE CONTAINS LESS THAN TWO ENTRIES; TRY AGAIN.' I.CNL
.143 E3:'ERROR: YOU HAVE ENTERED MORE THAN ONE VALUE. TRY AGAIN.'.CTCNL144 -. 83
7 NPLR-N-SUMX-SUM-XBARTBARSUMX2SUMXY B A 'WW XX'BB;U-D;ALPA;P;CC;CDF;TALPHA;NN:CI:SLOP-S;RFASR ;DENOM.INDtX:FiX:: ;R: CkAE-PV
PROGRAM CONDUCZ' NONPAIAMERIC LINEAR &ERSSON. Hk L2EAST SQUARES22 a ESTIMATED REGRESSION LINE IS COMPUTED WITH HYPOTHESIS TESTING AND,31 a CONFIDENCE INTERVAL AVAILABLE FOR THE SLOPE B. IF B DOES NOT LIE IN THEX-1 m C.I. AN ALTERNATE 3EGRESSION INE :S PROPOSED. SUBPROGRAMS CALLED ARE5 S A SPMANP, KENDAL?, NORMPTH, INPUT. AND CONFLR.6] QPP*57 P DISPLAY MENU AND INPUT DATA.a NI:E MZNU102 MENU MAINQJL.1, -0
~12 JN2:R-INPUT 2F:3 Q.*,1.R
is Y4(Q) R?
16 A ASSIGN THE SIZE OF X (AND Y) TO NSN X COMPUTE THE SUM OF X'S AND Y'S
19 SUMX-/X20 SUMY +/Y
2 COMPUTE THE MEAN OF X AND22 XBAR-SUMX+N3 YBAR.SUMY+N
24 A COMPUTE THE SUM OF THE X'S SQUARED25 SUMX2-+/(X*2)26 A COMPUTE THE Jam OF X TIMES I27 SUMXY-I(XxY)28 A COMPUTE 'B' THE SLOPE OF THE ESTIMATED LEAST SQUARES REGRESSION LINE29 B-((NxSUMXY)-tSUMXxSUMI)),((NxSUMX2)-(SUMX*2))3 ) A COMPUTE 'A', THE I-INTERCEPT31 A-YBAR-(BxXBAR)32 FF 'N'33 'THE LEAST SQUARES ESTIMATED REGRESSION EQUATION rS:',DTCNL34 '7 ' (5A),' + ' (5mB) 'X.' OTCNL35 'DO 70 WISH TO EN±ER SOAE X U'LUES TO GET THE PREDICTED Y''S? (YIN).'36 WW-*M37 *(WW'N')I/l38 L2:'ENTER X VALUES.'39 XX .40 a CALCULATE PREDICTED Y'S41 YY A+BxXX42 'THE PREDICTED Y VALUES ARE, ' (wYY).TCNL43 WOULD Y%;U LIKE TO RUN SOME MORE X ALUES? (YIN).'44 WW45 .'!rI:Y')/L24 LN)IWOULD T LIKE TO TEST HYPOTHESIS ON B, THE SLOPE OF THE EQUATION? (Y/
8--(W=I'N' /L3498"IENZER THE HYPOTHESIZED SLOPE.'
51 ACOMPUTE52, U*Y-(BsxX) CMUEU'
.4 A CALL SPMANP TO COMPUTE R AND ASSOCIATED P-VALUES.-=Z D X SPMANP 11
37 HA-< 1.5 LiI:PVE2xD[239 0 - PV51IL19RoJ PV Z
S612 L19:,SPEARMAN''S R EQUALS: ' (4D[]) JTCNL62 'THE P-VALUE FOR HO: B = I,20BB), 'VERIUS'
'63 HI: B ',(WCHA[I 2)7.3) ' IS1 ( 2D2J,)TCTCNL,64 'THE P-VALUE FOR THE TWO-SIDED TEST OF HYPOTHESIS IS: I, TCN2xD[2)),TCN6 L-5)' IF USING THE NEW REGRESSION EQUATION BASED ON MEDIANS, EXIT HERE.6O L3--wPF'IN' 1L1867 ~SS ENTER WHEN READY.'
69 -NI70 ACOMPUJTE CONFIDENCE INTERVALS ON B71 L18:'WOULD YOU LIKE A CONFIDENCE INTERVAL FOR THE-SLOPE? (YIN).'72 W.73 *(MW=lN')/N1714 LO:CC*INPUT 575 a CHANGE ENTERED VALUE TO ALPHA76 ALPHAe-(100-CC).20077 a ROUTE TO NORMAL APPROX. FOR CONF. INT. OF LARGER SAMPLE SIZES78 , (N13))L579 a COMPUTING CONFIDENCE INTERVALS BY EXACT P-VALUE80 ?-KENDAL? N81 CDF.PE2-'82 a 1PDEX POSITION OF VALUE IN CDP :9 ALPHA83 INDEX-(+/(CDFSALPHA))SL& *-(INDEX>0)/L685 INDEX 186 -L687 a COMPUTING CONFIDENCE INTERVALS USING NORMAL APPROX. W/C.F.88 L5:DENOM( (Nx(N-ijx((2xN)+5))419)*0.5ao 9 ALPHA-DENOMX 'II (NORMPT3 ALPHA)
91 L6:TALPHA P[3;INDEXI.G21 7ALHA93 LG:C."X CONFLR Z9g4 NN-leCIS95 SLOPES I+CI96 RR+L((NN-TALPHA)+2)
98 *.RRzQV)L209gQ RR.1i0o L20:SR-'-Fl+((NN+TALPH6A+2))101~ SR!021 -&SR:(nSLOPES))IL21i03 SReaSLQPESS104 L21:'A ' (mCC). CONFIDENCE INTERVAL FOR B THE SLOPE OF105 ' TEE ETIMATED REGRESSION LINE, IS:' .,TCNL107 'PES I EDY (SSOS RI, B :51,(5vSLOPESCSR]). ' ,QCV107 'PRESS ENTER HEN READY.T108 WW.3109 m IF 9 OUTSZDE THE C.I. CALCULATE NEW EQUATION BASED ON MEDIANS1o -((B3SLOPES[ZFRA(SLOPSCSR.) /N1111 I ORDER X AND Z112 X X AX)113 ]4- itY1114 A CHECK TO SEE IF THE SIZE OF SS IS EVEN OR ODD FOR FINDING MEDIANS115 *((2 NN)=O)/Si116 A COMPUTE MEDIAN FOR ODD CASE117 B SLOPES[((NN+1)+2))118 -S2,119 a COMPUTE MEDIAN FOR EVEN CASE120 SI:H (SLOPES[(NN 2)]+SLOPESi((NN+2 +2 2121 A VO THE SAME FOR THE X AND Y VECTORS122 S2:W((2 N)=0)/S3123 YBAR L((N+) +2))12U XBARX4((N+1 +2)]125 -OUT.126 S3:BAR-e(IT(N+2)]+YE((N+2).2)J).2127 XBAR (X[(N.2)'+XA'N+2)+2)]2128 A COjMUTE NEW INTERCEPT 'A'129 OUT:A YBAR-(BxXBAR)130 'THE LEAST SQUARES ESTIMATOR OF B LIES OUTSIDE THE CONFIDENCE INTERVAL.'
131 'DISCARD THE LEAST SQUARES EQUATION AND USE:'. TCNL132 ' Y:',(uA).' ' (682.'X',OTCNL.1331' THIS EQUATION IS BASEb ON TAE MEDIANS OF THE X AND Y DATA.AND TEE'C134 'MEDIAN OF THE TWO-POINT SLOPES CALCULATED FOR THE CONFIDENCE INTERVAL 0
N B.' LTCNL[135) A ALLOW USER TO DO SOME ANALYSIS ON NEW EQUATION[136] ' DO YOU WISH TO ENTER SOME X VALUES TO GET PREDICTED 7''S137) 'FROM THE NEW EQUATION? (Y/N).'
[138] FF4-IL139) .(FF=:I')/L2140 3 FF''
SIlGN;A-C-B;D:PVAL-X-MO-N-CDF-ALPHA*CI;Y*AA;BB;CC;DD;PV;PVI;NNN:KPOS;ORDD; XZ.ORDi.kALPHA.Z.Zl. UA.WW.PVM.PV3i.
1 A lHIS PUNCTIOk OSEt T~fOADINIR ri~ TEST TO CALCULATE THE K2 STATISTIC P-VALUE AND CONFIDENCE INTERVAL AS A TEST FOR MEDIANS.
A THE LAST 6PTION WILL DISPLAY A TABLE OF CONFIDENCE INTERVALS OF ORDERED4 a STATISTICS WITH CONFIDENCE COEFFICIENTS.
I SUBPROGRAMS CALLED B! THIS FUNCTION INCLUDE: BINOM, NORMCDF, NORMPTH,*6, INPUT,AND QUANG.
8 NI:MENU SICNGl p9 A MEN -TrICES AND ROUTE FOR PROPER ACTIONS
10] N4:C*CHOICES PAGEDMENU SZGNQIZ
I2: C2,3,4, 5 )/L"2 C=7 8 1 10)/L9
13 11i14 =C11)/L20
is MEU MAINQJ116. .017 aINPUT DATA FOR SINGLE SAMPLE CASEIS' LS:A"e119 X.IrNPUT I20 NNN.QX
22: MO.INPUT 323 D-X-AO24' *L1125 a PAIRED SAMPLE CASE26' Lg:AA4+227: R-INPUT 228~29* X11+ )+30 7.Q+)+
12 NNN-aDD
S3"~ MO-ZNP9TS35 9 ( - )m36 jaCOMPRESS D TO REMOVE ZEROS37 Li1:A.-(D-0)/D38~ A RECORD LENGTH OF A AND ASSIGN TO N3O N.a KEEPING TRACK OF POSITIVE SIGNS
41 ?POS-+/(A > 0)L&2 -(N>25;1NORM
13 PV.4L-31NO Y
*45 -
L47 Pi:PVI-i-PVALEXPOS3L48 P2:PV-PVALE(KPOS+1 ~J
-L6Z0 ? N IS GREATER THAN 30 Uf NORMAL APPROX W1 CONTINUITY CORRECTION
till ?ORM:z1-((KPO$-o.s -( .541),*(0.5x(N*0.5)*52 7-(KPOS+0.5 )-(0.5 XND+(0.5x(N*0.3)).53: PV.NVORMCDFZ.54. PVI.1-(NORMCDF ZI)*55* A IF PAIRED SAMPLE TEST GO TO L17 FOR OUTPUT STATEMENT56 LG:PV3,.2x (L/(PV,P VI))'57' *(V N558a PV3-1.59. N5:PVM-(3 1)o (PV PVI PV3)160, 'CO9PUTA±IONS AhE BASED ON A SAMPLE SIZE OF: '~N'.',1D2CNL
*61 'THE TOTAL NUMBER OF POSITIVE SIGNS IS: ',(fKPciS) , UCNL62. -(C=78,9)/L17 HiN (LOI(C)1), '(
61'THE P-U'LUE FOR HO:. M NO Iqj9:Xl,*OI[Ci;3,6MO),' IS: $.(5PVHEC-15,TCNL
642 *PL1S65 Li 7: 'THE P-VALUE FOR COMPARING THE MEDIAN OF THE POPULATION OF'66 'DIFFERENCES TO "HE HSIE MEDIAN DZCNL1671 'HO: M(X-Y) =I I EFRYPO S M(X-± rLOGIC(C6);13).f ',(fMO),,
69 BB-0f70 * B ~'')L1672 a INPUT SIZE OF CONFIDENCE INTERVAL73 L16:CC*ZNPUT 574 AiPHiA+ -iQ0-C C )200751 . NNN>25 )INORMI76 a COMPUTING CONFIDENCE INTERVALS BY EXACT P-VALUE77 CDF.HINOM NN78 Aj INDEX POSITION OF CDF FOR ALPHA + 2719 9.+/( CDF.SALPHA)80 *(B'O J/SKIP81 B4-182 -SKIP8a a COMPUTING CONFIDENCE INTERVALS BY NORMAL APPROX.uiL PORMI:AALPHA-(fQ.5x(NNN*0.5))x(NORMPTY ALPHA))'(C.SxNN0-0.5gr- 4 .0UrNo -:;HA dOWN :0 .VEAREST :YVTZCZR! V ,v CTzMENTr 3Y )NE
d7 4 ~ :7 5INGLZ SAMPLE CASE Go TO L
89~ LU CADDC22 ,4,5 LCULATE AND PRINT OUT CONF. INT. FOR PAIRED SAMPLE CASE90 L:ORDD.DDEADD)9I 1 78.ORDD933' A[3ORD .(sCC CONFIDENCE INTERVAL FOR TE MEDIAN OF THE9g' 'POPULAIO lFDPFfRENCES IS: ' OTCNL96 *QN 1,(C L13).1 s MEDIAN(X-I) :5 ',(*CIC2J).' )'.DTCNL
98 7:OD.E CALCULATE AND PRINT OUT CONF. INT. FOR ONE SAMPLE CASE99 77.-oORDX
1001 .;,oD 1ACOPENEITERVAL FOR THE MEDIAN oF THE POPULATION Is:
S102J 1 (TN !9~12, MEDIAN~ S (oCr[212') OTCNLL103 2QUANT:'WOULD YCU LIKE C6NFIrDiNCE INTERVALS FOR ASPECIFIAD QUANTILE? (YIN
104 ). 6 OTC[105] *(UW='7')/B11063 .N41072 L20:1ENTE17 TEE SIZE OF THE SAMPLE.'108g NNN.Q110913 i:1ENTER DESIRED QUANTILE; FOR EXAMPLE: ENTER 20, FOR THE 20TH QUANTILZ.
LIII ~ (QUA5C)v (UA>100))/EIL112 k; QP QUANC UAL1132L1143 a* THIS TABLE GIVES CONFIDENCE COEFFICIENTS FOR VARIOUS INTERVALSLilS)05 'WITH ORDER STATISTICS AS END POINTS FOR THE 1.(vQUA), 'TH QUARTILE. ',QTC
SLS1161 'PRESS ENTER WHEN READY.'1172 BB.M1183 -N4
E1:'ERROR: THE QUANTILE VALUE MUST LIE BETWEEN 0 AND 100; TRY AGAIN.'
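The sign-test computation in the SIGN listing can be sketched outside APL as follows. This is a minimal Python illustration, not the thesis code: the function name `sign_test` and the parameter `exact_limit` are assumptions. It follows the steps named in the listing's comments: remove zero differences, count positive signs, then use the exact Binomial(N, 0.5) tails for small N or the normal approximation with continuity correction for larger N.

```python
from math import comb, erf, sqrt

def sign_test(data, m0, exact_limit=25):
    """Return (n, k_pos, p_lower, p_upper, p_two_sided) for H0: median = m0."""
    d = [x - m0 for x in data if x != m0]      # compress out zero differences
    n = len(d)
    k_pos = sum(1 for v in d if v > 0)         # number of positive signs
    if n <= exact_limit:
        # exact Binomial(n, 1/2) tail probabilities
        pmf = [comb(n, k) / 2 ** n for k in range(n + 1)]
        p_lower = sum(pmf[: k_pos + 1])        # P(K <= k_pos)
        p_upper = sum(pmf[k_pos:])             # P(K >= k_pos)
    else:
        # normal approximation with continuity correction
        sd = 0.5 * sqrt(n)
        phi = lambda z: 0.5 * (1 + erf(z / sqrt(2)))
        p_lower = phi((k_pos + 0.5 - 0.5 * n) / sd)
        p_upper = 1 - phi((k_pos - 0.5 - 0.5 * n) / sd)
    p_two = min(1.0, 2 * min(p_lower, p_upper))  # two-sided value capped at 1
    return n, k_pos, p_lower, p_upper, p_two
```

The cutoff of 25 mirrors the branch in this listing; the mainframe version of SIGN appears to switch at 30, so the parameter is left adjustable.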
∇ SPMAN;X;Y;A;Q;R;CHA;BB;PV
⍝ THIS FUNCTION COMPUTES THE SPEARMAN R STATISTIC WHICH MEASURES
⍝ THE DEGREE OF CORRESPONDENCE BETWEEN RANKINGS OF TWO SAMPLES. THE P-
⍝ VALUE IS GIVEN FOR TESTING ONE AND TWO-SIDED HYPOTHESIS OF ASSOCIATION.
⍝ SUBPROGRAMS CALLED BY THIS FUNCTION INCLUDE: TIES, TIESK, SPMANP,
⍝ INPUT, SPAPROX, INTERP, AND THE VARIABLE PMATSP.
⍝ DISPLAY MENU AND INPUT DATA.
8 VI:B-MENU SPMANQgJ
.0; 7iNU MANQ5I1
12 BI:R.INPUT 2
15 +1) +R16 A AL SPMANP TO CALCULATE THE STATISTIC AND ASSOCIATED P-VALUES17, A-X SPMANP 718 *(A E1>0 1Li19 CHA.'IND RECT'20 +L221 L1:CHA*'DZRECT'22 L2:PV4.2xA L2J23 * PV:51)/ 324 P .121 L3:'SPEARMAN''S R EUALS: '§ (4vAjC1),0TCNL26 TE P-VALUE FOR HO NO AS.O IATION EXISTS ZSUlI:27 ' Hi: ' 'wC1iA I ASSOCIATION-EXT ErSA2j IS:283 'THE P-VALUE FOR THE T-SID b TEST OF HYPOTHESIS IS: ',.4v(2xAL2 ji,0T CN
C291 'PRESS ENTER WHEN READY.'L30 B.0!
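The body of SPMANP is not reproduced in this listing, so the following Python sketch only illustrates one standard way to obtain Spearman's R when ties are broken by midranks: compute the ordinary correlation coefficient of the two midrank vectors. The names `midranks` and `spearman_r` are hypothetical and the approach is an assumption, not a transcription of the thesis function.

```python
from collections import Counter

def midranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    first = {}
    for pos, v in enumerate(sorted(values), start=1):
        first.setdefault(v, pos)               # first sorted position of each value
    count = Counter(values)
    return [first[v] + (count[v] - 1) / 2 for v in values]

def spearman_r(x, y):
    """Correlation of the midranks of x and y (assumed form of Spearman's R)."""
    rx, ry = midranks(x), midranks(y)
    mean = (len(x) + 1) / 2                    # mean rank is the same for both samples
    sxy = sum((a - mean) * (b - mean) for a, b in zip(rx, ry))
    sxx = sum((a - mean) ** 2 for a in rx)
    syy = sum((b - mean) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5
```

A positive result corresponds to the 'DIRECT' label printed by the listing and a negative result to 'INDIRECT'.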
v wIsiGABD-EFPV2Z;ZDEN;NUMZ-NUMZi ;PVALXMO;N-TPLUS-CDF-TALPA;ALPHA.H.C±.±-AABbCCNN!DbPV;POS;±PObS;NM;PVI;TkFS;kN;C;'VM;PV3;R;Q;NUM.T .tC1:TAAP:D~kT-bEN±I
⍝ THIS FUNCTION USES THE WILCOXON SIGNED RANK TEST TO CALCULATE THE TPLUS
⍝ STATISTIC, P-VALUE, AND CONFIDENCE INTERVAL AS A TEST FOR MEDIANS.
⍝ SUBPROGRAMS CALLED BY THIS FUNCTION INCLUDE: TIES, WILP, NORMCDF,
⍝ NORMPTH, CONFW, INPUT AND THE VARIABLE ?MATRIX.
:~.MA U WIZEN CHOICES AND ROUTE FOR PROPER ACTIONS.U81 N4 :C*CHOICEW PAGEDMENU WILQ~g
92 * 2,3,4,5j)/L810 = 6 7 8,9 )/L9
CC* MEU INQfL9j13 * 0
14A INPUT DATA FOR SINGLE SAMPLE CASE15~ L8:AA.116. X.-INPUT I17~ NNN*gis~ * (C=5ftLI6191 MO.INFUT 3
S20 D-X-NO21 *Li122 aPAIRED SAMPLE CASE23 L9:AA-2
25 Q~ieR26 .+Qi)R
28 DD-.x-Y29 NNN-oDD30 *(C=9)IL1631 MO..TNPUT 1L32 D-(X-I)-MOL33 1:.(0 COMPRESS D TO REMOVE ZEROSL35 A RECORD LENGTH OF A AND ASSIGN TO N36 N.DA37 A KEEPING TRACK OF POSITIVE SIGNS38 aO.Ao TAKE THE ABSOLUTE VALUE OF A; ASSIGN TO B AND ORDER B
*,^', a EORDER POSITIVE SIGNS TO COINCIDE WITH PROPER POSITIONS IN B
'14*4: Ao.~s ~ CALL FUNCTION TO BREAK TIES45. E~l TIES B46 A CALCULA.'E TPOS BY ADDING ACROSS ALL POSITIVE VALUES OF E47: Tpos.+/(POSxE)l8. TPOS1.*TPOSug" a G;IVES SIZE OF LEFT TAIL OF PROBABILITZ DISTRIBUTIONR?~ NM-(LC(+/iN)+2)fl 11
a (N9I3 GO TO STATEMENTS BASED ON LENGTH OF VECTOR E
S53 a GENERATE NULL DISTRIBUTION FOR TPLUSL54. F.WILP NE55] A IF TPOS FALL IN LEFT HALF OF PROB DIST CALCULATE PVALUE AS NORMAL~56 L1:'.(fPOSs(NM-H)/TREGS57 A OTHERWISE USE THE NEGATIVE T STATISTIC58 TPOS- ( +iN)-TPOS159 a IF TPOS IS FRACTIONAL USE BOTH THE INTEGER ABOVE AND BELOW AS TPLUS~60 TREG:-( C: ITOs)=o )/NON62 fP:PV- (('FLTPLUSJ+F~TPLUS+jj+2)63 PVI.(FCTPLUS+iJ+ EFT LUS+23 )$264 *CHEC651 NON:.(TPOS>0)/GO66 PV.167 -P268 GO:PVi-F[C(TPOS) 2169 P2:PVI-F rTPOS+i 270 CHEC:.TFSl (NM-i) )/L671 PV2-PV72 PV*PVI73 PVI+-PV274 *L675 a COMPUTE NORMAL APPROX. WICONTINUITZ CORRECTION FACTOR r76 L3:TRAP-(Nx(N+li) *477 NUMZ-(TPOS+O.5 )-TRAP781 NUMZI.(TPOS-O.5)-TRAP79 DEN-( (Nx(N+1)?c((2xN)+1)),2'4)*0.5so Z.-NUMZ .DEN81 Zi-NUMZI+DEN82 ACOM UTE STUDENT T APPROXIMATION WITH CONTINUITY CORRECTION FACTOR83 NUM IT FTOS-AP84 DENT. ((N(DEN*2) *(N1) -( NUM-C.5)*2)4(N-1) *0.585 DENTI.+ (Nx (DEN*2 )+(N-1 )~(NUM+0.5)*2),(N-1M3 *0.5
86 T.(NM-0.5 )+DENT87 T1.(UM+0.5)+DENTI88 aCOMUTEAVERAGE OF TC AND ZC
a9 *(PS9( (+/1)+2j))<SECOND90 y.NOMD 11Tp (N-i) TDiITN TI)+
Pv1 m (NORMD I T TISTN TC +J2 T1).92 *opL693 SECOND:PV*((1-NRMCDF Zi) )+(.(N-i) TDI.~TN TC ) )+21L&l PVZ..((NORMCDF Z *(I-((N-1. TcDISTN TC) )i)+2
ga8 N5:PVM.(3 i)p(PvI PV PV3)L9j 9] COMPUTA±ION ARA BASED ON A SAMPLE SIZE OF I(wN) ' '02CNL
102 'THE TOTAL SUM OF POSITIVE RANKS IS* I (OTPOR') '*['QCNLLi01.A IF PAIRED SAMPLE TEST GO TO £17 FOR O/JTPU' STATEMENTL102 J AA=22 L17L1031 'THE, P-VALUE FOR HO: M s AMO),l yj5q Hi: M ',(*LOGXCEC-1;1]).' '(
, MO ' S:,(44PVMCC-1;i2),UCNL
105~ L17:1 THE P-VALUE FOR COMPARING THE MEDIAN OF THE POPULATION OF'L106] 'DIFFEjENCFS TO THE 6POTHESIZEDMED fANt~ OTCNL1107 'Ha: M(X7 I ' l~bC (uQ'~i] : M(I-) *(LOGICEC-513),' '.(fMO),'.
IS:,. UP M[4,
108 L18: 'WOULD YOU LIR2EA'CONFIDENCS INTERVAL FOR THE MEDIAN? (Y/N).'109 BB* zf ')LI110 *(B'7)/6111 090112 L16:CC-INPUT 5113 ALPHA.-(i00-CC)*200114 A ROUTS TO NORMAL APPROX. FOR CONY INT OF LARGE SAMPLE SIZE115 * NNN>9 /L4&16 CF*-WILP NNN117 it INDEX POSITION OF CDP FOR ALPHA + 2118 CDF-(CDFzo)/CDF119 NAPHA.(-I (CDFsALPHA))120 *(TALPHAxO )/JUMP121 TALPHA-1122 .0JUMP123 P COMPUTING CONFIDENCE INTERVALS BY NORMAL APPROX. Ar/C.?,1214 L4:TRAP.(NNNx(NNVN+1 14
15 DEN.l (NN(NN1) (2NNV)+1 ))+24)*0.51126 TS ALPHA)) +TEAP-0.5'27 TALPHAS128 A ROUND TALPHA DOWN ZC IN7TEGER 7ALUE AND .NCREMENT BY ONE129] TALPHA.LTALPHA+l
.130] a IF ONE SAMPLE CASE GO TO L7
L132: a CALCULATE AND PRINT OUT CON?. INT. FOR PAIRED SAMPLE CASEL133, L5:CXeTALPHA CONFW DDL134 ' A I (CC) I CONFIDENCE INTERVAL FOR THE MEDIAN OF THEL135' 'POPULATI6N OF bIFFERENCES IS: ',OTCNL~136 ' ( I (mClIJ),l 59 MEDIAR(X-Y) :9 ',(vCI[23).' )',OTCNL137, 'PRESS ENTER WHEN AEADY.'13 3BW '
139 N4v,luol ACALCULATE AND PRINT OUT CON?. ZNT. FOR ONE SAMPLE CASE
14ij L7:CZ.TALPHA CONFW X142] 'A (*CC),'I CONFIDENCE INTERVAL FOR THE MEDIAN OF THE POPULATION IS:
S143~ C ' (0 CZ C~. I MEDIAN S ',(sC 2]), 0'TCNL14lu] 'PRESS ENTER WHEN AEADZ.
7
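The two central computations of the signed-rank listing, the TPLUS statistic and the exact null distribution that WILP is described as generating, can be illustrated with a short Python sketch. The function names are hypothetical, and the subset-sum recursion used for the null distribution is a common technique assumed here; the thesis does not show how WILP is implemented.

```python
from collections import Counter

def midranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    first = {}
    for pos, v in enumerate(sorted(values), start=1):
        first.setdefault(v, pos)
    count = Counter(values)
    return [first[v] + (count[v] - 1) / 2 for v in values]

def signed_rank_tplus(data, m0=0.0):
    """Return (n, T+) after removing zero differences from data - m0."""
    d = [x - m0 for x in data if x != m0]
    ranks = midranks([abs(v) for v in d])      # rank the absolute differences
    t_plus = sum(r for r, v in zip(ranks, d) if v > 0)
    return len(d), t_plus

def tplus_null_cdf(n):
    """cdf[t] = P(T+ <= t) under H0, for t = 0 .. n(n+1)/2 (no ties)."""
    total = n * (n + 1) // 2
    counts = [0] * (total + 1)
    counts[0] = 1
    for r in range(1, n + 1):                  # add rank r to every earlier subset
        for t in range(total, r - 1, -1):
            counts[t] += counts[t - r]
    cdf, running = [], 0
    for c in counts:
        running += c
        cdf.append(running / 2 ** n)
    return cdf
```

When ties make TPLUS fractional, the listing's comments indicate that the integers above and below are both used; that averaging step is not included in this sketch.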
APPENDIX E
MAIN PROGRAM LISTINGS FOR MAINFRAME COMPUTER WORKSPACE
V KENDALL-A*AA:BBX;BY-C-CXCY;D;DDDDXYDXY;SPS;NEG;XX;ZY;N;DEN;NN;NUM.P.PV PVAL. UTSV'. V. AT Z'X'Y' HA:Q'r
⍝ THIS FUNCTION COMPUTES THE KENDALL B STATISTIC WHICH IS A MEASURE
⍝ OF ASSOCIATION BETWEEN SAMPLES. P-VALUES ARE GIVEN FOR TESTING ONE
⍝ AND TWO-SIDED HYPOTHESIS FOR NO ASSOCIATION VERSUS ASSOCIATION.
⍝ SUBPROGRAMS CALLED BY THIS FUNCTION INCLUDE: TIES, TIESK, KENDALP,
⍝ INTERP, INPUT AND NORMCDF.
R←INPUT 2
l~:Q+i)R10 a Y4. ORDER Y IN INCREASING ORDER OF X
a ORDER X IN INCREASING ORDERS C. "a COMPUTE CURRENT RANKING OF Z
16 A NOW ORDER 7 RANKS ZN INCREASING ORDER18 . Ii TIES EXIST IN EITHER X OR Z RANKED VECTOR USE MID-RANK METHOD
19 DD-1 TIES D20 XX.1 TIESB20 a FIND ORIGINAL RANKIN: OF Y WITH TIES RESOLVED22 '7-DD C23 NoX24 A COMPUTE NUMBER OF DISTINGUISHABLE PAIRS25 NN-(Nx(N-1)) 226 S-pO27 AA4-028 A POSITIVE ONES COME FROM A RUNS UP CONDITION- NEGATIVE I FROM RUNS DOWN29 A ZERO IS SCORED FOR TIES. MULTIPLY THE RESULTS FOR EVERY ELEMENT AND SUM
0 LI:AA-AA+131 BX (XX AA >(AA+XX))32 CX XX AA AA+X))x(-)33 DX4.BX+ X34 BY.,(77 AA1>AA+Y))x(,35 CI. IAA <~ AA+U)x(136 D7.7+CY37 DXY DXxD"38 POS (DX1>039 NEG4,DXI<O x ()
42 A SUM FINAL VECTOR TO DETERMINE S143 Si.+/544 a+OBTAIN THE NUMBER OF TIES IN EACE VECTOR USING THE TIESK FUNCTION45 U*TIESK B46 V TIESK D47 SU +/ (2 U)4 A CALCULATE HE B STATISTIC INCLUDING THE CORRECTION FOR TIES50 T*S+; (NN-SU )x(NN-SV) )*0.51 AT Ij
53 CALL EENDALP TO CALCULATE THE RIGHT TAIL OF THE CDF OF B54 Pi.KENDALP N55 CALL INTERP TO CALCULATE P-VALUE BY INTERPOLATION56 PVAL-AT INTERP P57 .(PVALz1)/L3
.58 PVAL*O.5
1 2:,LCULATT ? 7ALUE 7C:NC NORMAL APPROX.o , NORM:NUM*',xAT ,xt,2I NN)O.z)
62 DEN (2x((2xN)+5))*0.563 Z NUM+DEN64 PVAL1- (NORMCDF Z)65 IF B IS POSITIVE PRINT OUT DIRECT ASSOCIATION.66 L3:*(T>O) /L567 CHA-' INDIRECT'68 *L769 L5: CHA'DIRECT'70 L7:PV.2xPVAL71 (PVSI )IL S ,72 PV. I73 L8:'KZENDALL''S B EQUALS #,(4T)
'THE P-VALUE FOR H0: NO ASSOCIATION EXISTS VERSUS'
' H1: ',(⍕CHA),' ASSOCIATION EXISTS IS: ',(4⍕PVAL)
' '
'THE P-VALUE FOR THE TWO-SIDED TEST OF HYPOTHESIS IS: ',(4⍕PV)
' '
9
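The statistic described in the KENDALL comments (scores of +1, -1, or 0 for each pair, summed, then divided by a denominator corrected for ties) and the normal approximation visible near the end of the listing can be sketched in Python. The names `kendall_tau_b` and `kendall_z` are hypothetical; the sketch assumes the tie correction subtracts the number of tied pairs in each sample, which matches the recoverable expression `((NN-SU)×(NN-SV))*0.5`.

```python
from collections import Counter
from math import comb, sqrt

def kendall_tau_b(x, y):
    """Kendall's B: pair score sum S over a tie-corrected denominator."""
    n = len(x)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = (x[j] > x[i]) - (x[j] < x[i])   # +1 runs up, -1 runs down, 0 tie
            dy = (y[j] > y[i]) - (y[j] < y[i])
            s += dx * dy
    pairs = comb(n, 2)                            # number of distinguishable pairs
    tied_x = sum(comb(c, 2) for c in Counter(x).values())
    tied_y = sum(comb(c, 2) for c in Counter(y).values())
    return s / sqrt((pairs - tied_x) * (pairs - tied_y))

def kendall_z(tau, n):
    """Large-sample normal deviate for |tau| with n observations."""
    return 3 * abs(tau) * sqrt(n * (n - 1)) / sqrt(2 * (2 * n + 5))
```

For small samples the listing interpolates in an exact table via KENDALP and INTERP instead; that path is not reproduced here.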
7 KRUSKAL NUM DENORA C HDK*AA8BbDD-E;F:NOF-P PVAL;R;SOFR ;SR-TSOP;CA~1 A THIS PUNelION C.WHU§'E tgkA K&SkAL-i ALL±S TE.§T'STAT±S±I!C H WH±CH tt2 A A MEASURE OF THE EQUALITY OF K INDEPENDENT SAMPLES.3 a SUBPROGRAMS CALLED BY THIS FUNCTION INCLUDE: TIES TIESE INDEXPLS.
I4 a PDISTN INTERP AND THE VARIABLES PMATKW20 PMATKW31. PMA±KW33,5 a PMATFJ34, PMATfKA41, PMATKWJ42, AND PMATK;4i.
7 BI: 'ENTER THE NUMBER OF POPULA.IONS TO BE COMPARED (MUST BE GREATER THAN T
.91 -((K<3)>l)IE1 )1n
11 pK'INITIALIZE VECTORS E AND F AND VARIABLE C:I E-F-SOFPRooC-014 ATHIS LOOP FACILITATES ENTERING THE SAMPLE VECTORS AND STORING THEM15 CHA,,'FZRST'16 LI:C*C+l
' ENTER YOUR 1.(*cHA),' SAMPLE.'
, -. "9ooD)z01/NEXT
2t Q CONCATENATE SAMPLES AS THEZ ARE ENTERED AND STORE THEM IN VECTOR E2 NEXT:E.E.D23 A RECORD HE LENGTHS OF THE SAMPLES AS THEY ARE ENTEREDL24) FDL A25 H 'NEXTi26 (C(K-))ILL27 CRA-'LAST'28 (C<K)1L129 RECORD SIZE OF ALL SAMPLES WHEN COMBINED30 NV-/F31 A ORDER SAMPLE SIZES LARGEST TO SMALLEST32 OF FC'F-33 ORDER COMBINED SAMPLE VECTOR TO BE USED BY TIES FUNCTION34 DEW*E35 CALL INDEXPLS TO INCREMENT INDEXES WHEN TIES OCCUR WITHIN ONE SAMPLE36 AA-F INDEXPLS E37 a CALL TIES TO BREAK TIES BY MIDRANK METHOD38 BB.1 TIES D39 C-040 a THIS LOOP CALCULATES THE H STATISTIC41 L2:C C+1
42 A SUM OF RANKS FOR EACH SAMPLE IS CALCULATED43 SR-+/BB (F C)+(AA C)
44 LCU A OF AANKi SQUARED DIVIDED B THE INDIVIDUAL SAMPLE SIZE46 i SR*2 S[CiTORE EACH CALCULATION47 SOPR.$OFRSR48 (C' )/L250 S SUM ACROSS ALL SAMPLES50 TSOR .+/SOFR5152 H*KTSORx(12+ NX(N+1)))-(3x(N+l53 a ECALCULATE H WIT CORRECTION FOR TIES
q 54 A-?IESKEZ55 NUM., +/(*S3 ) )- (+/A)56 DE x( (N*2)-1 ,)57 H* 1 -(NUMDENOM))58 A SYSTEM OF LOGICAL STATEMENTS ENSURE PROPER PROB. IS ACCESSED59 OF[ 3 <2 )LOUTPUT60 (0. ) IFAPPROX
1 * f3) IF62 * A A/(OF= 3 3 3 3))v(OFE1)>3))/FAPPROX63 r(OF: 2 1. 1 /OUTPUT
--- ,., U',,05.j (OF: -k :v t,(F: 3 2 .-)v(A/(OF: 3 3 :;)vA,(0F 3 3 2: "k/ F -(A/(OF=
67 "F: (OF[1>4)PAPPROX68 1 (AJ/( OF= 3 1 1)))IOUTPUT69 0 ".%4 (OF3) I IP3170 * / (OF: 3 2~ 1) v (A : ) )v(A/(OF= 3 3 2))v(A/(OF= 3 3 3)))/P33-71 ^ OF= 3 2 2 iL JP372 A AL APPROPRIATE VARIABLE ACCESS CDF73 P23:PPMATKW20C - );;J74 .OPM75 P31:P..PMATXW31C(N-5)1;;76 *1PM77 P3 :P.PMATKW33 (N-5):;)7011
79 P34:P-.PMAfXW34[(N-6); 3]80 PM81 P1i:P-PMATKW i(N-5);;382 -PM83 P42:P-PMATX242[(N-5)•;;84 _PM85 P43:P'PMATXW43 C (N-7) ,86 a CALL INTEAP TO CULATE P-VALUE BY INTERPOLATION87 PM:PVAL-9 ZNTERP P88 *(PVAL- I)/OUTPUT89
iPAL
go a CALCULA-7 ?-VALUE USING THE F DIST WIONE LESS D.F.IN DENOM APPROX.gi FAPPROX:? -Kx ) ((-1(1-))92 PVALi-(((K-),( (N-K)-i)) DISW F) -
→L5
OUTPUT:PVAL←'GREATER THAN .25'
L5:'THE H STATISTIC EQUALS: ',(4⍕H)
' '
'THE P-VALUE FOR H0: THE POPULATION MEDIANS ARE EQUAL VERSUS'
' '
' H1: AT LEAST TWO POPULATION MEDIANS ARE NOT EQUAL IS: ',(⍕PVAL)
→0
E1:'ERROR: YOU MUST ENTER A SINGLE INTEGER VALUE GREATER THAN 2; TRY AGAIN.'
) 1"103] '
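The H statistic and its tie correction, as described by the comments in the KRUSKAL listing, can be written compactly in Python. This is a sketch under stated assumptions: the function names are hypothetical, and the correction factor 1 - Σ(t³ - t) / (N(N² - 1)) is taken from the recoverable NUM and DENOM expressions.

```python
from collections import Counter

def midranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    first = {}
    for pos, v in enumerate(sorted(values), start=1):
        first.setdefault(v, pos)
    count = Counter(values)
    return [first[v] + (count[v] - 1) / 2 for v in values]

def kruskal_wallis_h(samples):
    """H statistic for a list of samples, corrected for ties."""
    pooled = [v for s in samples for v in s]   # concatenate samples in entry order
    n = len(pooled)
    ranks = midranks(pooled)
    total, start = 0.0, 0
    for s in samples:
        r = sum(ranks[start:start + len(s)])   # sum of ranks for this sample
        total += r * r / len(s)                # squared, divided by sample size
        start += len(s)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    ties = Counter(pooled).values()
    correction = 1 - sum(t ** 3 - t for t in ties) / (n * (n * n - 1))
    return h / correction
```

The exact-table lookup and the F approximation that the listing uses to turn H into a P-value are outside the scope of this sketch.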
7 XANNWBIT-N.M-PV2-A;BLCG;MM-NN'RX;U:NM1:P;NU:PVAL:N:NUMZ:NUMZI;DENC :Z-- LPHA : NDF INEX - '.k-i; C ;ALPHA, 3B; CC; U ;U2; PV; NNI; NN2 . 2V ; DZVI; ;AA ;O C
;P N PV3;D- Z2R;DEN;bENCTCTC1 :NUM.] THIS FUJNCTION USE'S TE' SUJM OF RANKS PROCEDURE TO CALCULATE THE2 q MANN-WHITNEY U STATISTIC USED IN COMPUTING THE P-VALUE FOR THE TEST3 A OF LOCATION AND SCALE. THE C.I. FOR 2(Y)-M(X), THE SHIFT IN4 a LOCATION, IS ALSO COMPUTED. SUBPROGRAMS CALLED INCLUDE: TIES, TIES27 AINDEXPLS, VARM , MANWP. INPUT, CONFMW, NORMCDF , AND NORMPTH.6 DIFF-O
'DO YOU WISH TO COMPARE THE MEDIANS OR VARIANCES OF THE POPULATIONS?'
B2:' ENTER: 1 TO COMPARE MEDIANS; 2 TO COMPARE VARIANCES.'
12 ( )A(AA=2))/E213 AA=i/Ni14 ' THE TEST TO COMPARE VARIANCES REQUIRES THE TWO POPULATION MEANS'Is 'OR MEDIANS TO BE EQUAL. IF THEY DIFFER BY A KNOWN AMOUNT,16 'THE DATA CAN BE ADJUSTED BEFORE APPLYING THE TEST.'1710833:' ENTER THE DIFFERENCE OF MEDIANS (M(X) - M(Y)) OR 900 TO QUIT).'19 DFF.Q20 DFF )>1 )/E321 F=900)/0221' HE NULL HYPOTHESIS STATES - THE POPULATION VARIANCES ARE EQUAL; V(X) :V(Y).S23224 ' WHICH ALTERNATIVE DO YOU WISH TO TEST?'25 '261 N2:'EZNTR: 1 FOR Hl: V(X) < V(Y); 2 FOR Hi: V(X) > V(Z); 3 FOR Hl: V(X)*_V(z). '
287 '(D-l)A(D-2)A(D*13))/E4291 ,31301 Ni:' THEs NULL HYPOTHESIS STATES - THE MEDIANS OF X AND Y ARE EQUAL; M(X): M(Y).31232 WHICH ALTERNATIVE DO YOU WISH TO TEST?'3334] B6:'ENTMR: I FOR Hl: M(X) < M(Y); 2 FOR Hi: M(X) > M(Y); 3 FOR Hi: M(X)
[3 M(Yb'l
36 '(?Dsi)A(Dz2)A(D:3,)/Er.37 a LNTR DATA 7CTORS
38 31: ENTER ; DATA (MORE THAN LNE OBSERVATION IS REQUIRED).'L4 N>) =0)/El
41 'ENTER I DATA.'
413 it ICALCULATIONS INVOLVE VARIANCES ADJUST X BY THE DIFFERENCE IN MEANS44 N.-N-DIFFL15 A CONCATENATE X AND I SAMPLE VECTORS46 A'N.M47 A DETERMINE SIZE OF X AND Y VECTORS AND ASSIGN TO NN AND MM148 NN'pNSMM M COMPUTE SIZE LIMIT OF LEFT TAIL OF NULL DISTRIBUTION
51 NM-(NNXMM).2521 NMI*LNM
53 a ORDER A AND ASSIGN TO B514 8.AAA55 C. (NNMM)INDEXPLS A56 A CALL TIES FUNCTION TO BREAK TIES USING MIDRAJK METHOD57 C-1 TIES B58 a IF FALSE CALCULATE TEST FOR VARIANCES59 * (AA1 )/8560 a C Z; VARMW TO GENERATE RANKS REQUIRED FOR VARIANCE TEST61 GG.VAP.MW NN+MM)62 A CALL TIES TO RECORD TIES IN THE DATA AND BREAK TIES IN CG63 G.GG TIES B614 A C'LCULATE SUM OF X RANKS65 BS:RX*+/(G[(C[1;(INN)1166 A CONfVERTTO MANNUHIT U STATISTIC67 U.RX-((NNx(NN+1) 2.2)68 U1.U69 A IFqSIZE OF X TIMES SIZE OF Y > 80; GO TO NORMAL APPROX70 (NMX2 )~0)/L2
7.1 ! IN.LNN,MM721 XN2. INN MM73 amANwp FuIJCTIoN CALCULATES LEFT TAIL CUMULATIVE PROBS. OF U STATISTIC714 P-NN2 MANWP NNI75 A LOtqICAL STATEMENT ENSURES ONLY LEFT SIDE OF NULL DIST IS USED
*76 . UgNMIA/L377 A CONVERT U STAT W'HEN GREATER THAN LEFT TAIL VALUES:79 U*(NXlgjl A FRACTIONAL, INTERPOLATE P VALUE
82 4- 20 P'83 PV.-.-( (PEU2+1,J)42)
:851 PM:V.-( P1U23 +PrU2 +l ; +2)86] ?3:PVI.( U2+1 -'PU2+2 +873 -CHECK881 NON:*(U>0)/GO
91 PV2i-
95J PV:PVIPU+.
962 PVI.PV297 -L14982 A COMPUT THE NORMAL APPROXIMATION U/CORRECTION FACTOR9 L2:NUMZ. (0+0.5)-NM100 NUMZ + U-0.5 -NM101 DEN- (MMxNNX (MM+NN+1))+12)*0.5102 Z+NUMZ+DEN103 Zb.N(/ 1N
*1014 NUM. (U-NM )105 DENC.( ( (NN+MM-)x(DEN*2) +N+M-)- NUM-.5)*2)G(NN.MM-2) *0.5106 DENC1.( (NN+MM- )x (DEN*))(NN+MM2)(NUM+0.5)*2),(NN M2)*0.5107 TC*(NUM-0.5)+DENC108 TC1.(NUM+0.5)+DENCI
109 * IN)SECOND110 PVI- (NORMCDF Z)+( (NX+MM-2) TDISTN TC) )42ill PV* ( (1-(NORMCDF Z1 )+(l-( NN+MM-2) TDISTN TC1)))*2112 PL14113 SECOND:PVI.( NORMCDj' Z)+(1-((NN+MM-2) TDISTN TC)))*2114 PV~( (NOM Z J)+ (GvN+MM 2) TDITN TC1))*2115 L4YPV3x(I (P, V)116 .PV3 1)IN117 PV3.1118 N5:PVM*(3 ,lp PVI PV PV3)119 'THE SUM OF THE Ik s KSis: .,(*RX).,. THE U STATISTIC EQUALS: ',(mU1)120 1121 A.(A LOGICAL STATEMENT FOR VARIANCE OUTPUT122 *(BA2 ) IVAR123 'TE, PVALUE FOR NO0: M(X) M(Y) VERSUS Hl: M(X) ',(mLOGCD;1J),' M(Y)
IS: ',(4*PVMED;13)S1241)2125 *L8~126~'~ VA P-VAlUF PR'PVC'7(X (7:7RSUS 71: 7(X) '(LGC:2'77
:129 -0L1301 LS:' UQULD YOU LIKE A CONFIDENCE INTERVAL FOR THE SHIFT IN LOCATION(M()-
M(.1))? (YIN).'
132 *(BBIN' )I0133 L1O:CC*ZNFUT 5134 ALPHA. (100-CC) *200135 1% ROUTE TO NORMAL APPROX. FOR CONF. INT. OF LARGER SAMPLE SIZES136 .((NMx2)>80) 1L5137 A COMPUTING CONFIDENCE INTERVALS BY EXACT P-VALUE138 CDF.P139 A INDEX POSITION OF VALUE IN CDF S ALPHA140 INDEX*(+I(CDFSALPHA ))141 .(INDEX>0) 1L61142 rNDZX4-1
143 -L6144 COMPUTING CONFIDENCE INTERVALS USING NORMAL APPROX. W/C.F.1'45 L5:UALPHA-(DENOMZx(NORMPTH ALPHA ) NM-0.5146 A ROUND UALPHA DOWN AND INCREMENT 81 ONE147 INDEXeLUALPHA I148 L6:IPX NN INDEX149 CI.IPX CON5MW A150 ' A ' (DCC) ' PERCENT CONFIDENCE INTERVAL FOR TEE SHIFT IN151 ' LOCATION hETWEN POPULATIONS X AND Y IS:'152 '
' ( ',(⍕CI[1]),' ≤ M(Y) - M(X) ≤ ',(⍕CI[2]),' )'
' '
→0
E1:'ERROR: THE SIZE OF YOUR SAMPLE IS LESS THAN TWO; TRY AGAIN.'
' '
→B1
E3:'ERROR: YOU HAVE ENTERED MORE THAN ONE VALUE. TRY AGAIN.'
' '
→B3
E2:'ERROR: YOU HAVE NOT ENTERED A VALUE OF 1 OR 2; TRY AGAIN.'
' '
→B2
E4:'ERROR: YOU HAVE NOT ENTERED A VALUE OF 1, 2, OR 3; TRY AGAIN.'
' '
→(AA=2)/N2
→B6
∇
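The conversion from the sum of X ranks to the Mann-Whitney U statistic, and the lower-tail normal approximation with continuity correction, can be illustrated in Python. The names are hypothetical. The listing appears to average the normal value with a Student t approximation; this sketch, by assumption, shows only the normal part.

```python
from collections import Counter
from math import erf, sqrt

def midranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    first = {}
    for pos, v in enumerate(sorted(values), start=1):
        first.setdefault(v, pos)
    count = Counter(values)
    return [first[v] + (count[v] - 1) / 2 for v in values]

def mann_whitney_u(x, y):
    """Return (rank sum of x, U) from the combined midranks of x and y."""
    ranks = midranks(list(x) + list(y))
    rx = sum(ranks[:len(x)])                   # sum of the X ranks
    u = rx - len(x) * (len(x) + 1) / 2         # convert to the U statistic
    return rx, u

def u_lower_tail_normal(u, nx, ny):
    """Approximate P(U <= u) using a normal curve with continuity correction."""
    mean = nx * ny / 2
    sd = sqrt(nx * ny * (nx + ny + 1) / 12)
    z = (u + 0.5 - mean) / sd
    return 0.5 * (1 + erf(z / sqrt(2)))
```

In the listing the approximation is used only when the product of the two sample sizes exceeds 80; below that, MANWP supplies exact probabilities.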
7 NPSLR-N.SUMXSUMXYXBAR:YBAR;SUMX2:SUMX!:B;A;WW-XX;BB:UOD:ALPHA:P:CC;CDF;TALPkA.NN-CI.SLOPES-RA.SR ....DENOM..NDE: X .... 4.w.... A..V- A PROGRAM CbNDIUCIS NOANPAAAMk'tR±C LINEAR REGRESSION. T LE EAST SQUARES3: 2 ESTIMATED RECRESSION LINE :S COMPUTED hITH HYPOTHESIS TESTINC AND3 CONFIDENCE INTERVAL AVAILABLE FOR THE SLOPE B. IF B DOES NOT LIE IN14 A THE C.I. AN ALTERNATE REGRESSION LINE IS PROPOSED. SUBPROGRAMS CALLED
⍝ ARE: SPMANP, KENDALP, NORMPTH, INPUT, AND CONFLR.
⎕PP←5
⍝ INPUT DATA
R←INPUT 2
10 X~~( 1*11 Q+i12 a ASSIGN TE SIZE OF X (AND 7) TO N3 N pX13 n COMPUTE THE SUM OF X'S AND I'S
SUMX←+/X
SUMY←+/Y
⍝ COMPUTE THE MEAN OF X AND Y
XBAR←SUMX÷N
YBAR←SUMY÷N
⍝ COMPUTE THE SUM OF THE X'S SQUARED
SUMX2←+/(X*2)
⍝ COMPUTE THE SUM OF X TIMES Y
SUMXY←+/(X×Y)
⍝ COMPUTE 'B', THE SLOPE OF THE ESTIMATED LEAST SQUARES REGRESSION LINE
B←((N×SUMXY)-(SUMX×SUMY))÷((N×SUMX2)-(SUMX*2))
⍝ COMPUTE 'A', THE Y-INTERCEPT
A←YBAR-(B×XBAR)
FF←'N'
' THE LEAST SQUARES ESTIMATED REGRESSION EQUATION IS:'
' '
' Y = ',(⍕A),' + ',(⍕B),'X.'
' '
'DO YOU WISH TO ENTER SOME X VALUES TO GET THE PREDICTED Y''S? (Y/N).'
WW←⍞
→(WW='N')/L1
L2:'ENTER X VALUES.'
XX←⎕
⍝ CALCULATE PREDICTED Y'S
YY←A+B×XX
'THE PREDICTED Y VALUES ARE: ',(⍕YY)
' '
' WOULD YOU LIKE TO RUN SOME MORE X VALUES? (Y/N).'
WW←⍞
→(WW='Y')/L2
L1:'WOULD YOU LIKE TO TEST HYPOTHESIS ON B, THE SLOPE OF THE EQUATION? (Y/
IV).46, WW-47 (WW 'N')IL348 'ENTER THE HYPOTHESIZED SLOPE.'49 88.050 A COMPUTE U'S51 UI-(BBxX)52 CHA.'> 153 a CALL SPMANP TO COMPUTE RHO AND ASSOCIATED P-VALUES.54 D.X SPMANP U55 *(D ~l) ')/Lul56 CA "57 Ll:PV 2xD[23
59 aPv-l6 60 L19:' SPEARMAN''S R EQUALS: 1,(4*DE13)61] 1621 'THE P-VALUE FOR HO; B '(B)'VERSUS HI: B ',(vCHAEI 2]).E0B).' IS:
63 1 1('4D[2J64 'THE P-VALUE FOR THE TWO SIDED TEST OF HYPOTHESIS IS: ', (4*PV)65 '66 a IF USING THE NEW REGRESSION EQUATION BASED ON MEDIANS, EXIT HERE.67 L3:.(FF='Y1)/068 A COMPUTE CONFIDENCE INTERVALS ON B69 'WOULD YOU LIKE A CONFIDENCE INTERVAL FOR THE SLOPE? (TIN).'70 W71 (W=fI)/072 L1O:CC-INPUT 573 A CHANGE ENTERED VALUE TO ALPHA714 ALPHAe(100-CC)+20075 4 ROUTE TO NORMAL APPROX. FOR CONI. IN?. OF LARGER SAMPLE SIZES76 *'(N>12)/LS77 A COMPUTING CONFIDENCE INTERVALS BY EXACT P-VALUE[78 P-RENDALP NF7 7 ,) - [ -S80 A I;VDEX POSITION OF VALUE IN CD! :5 ALPHA81 INDEX-(+/ (CDFSALPHA))82 *(INDEX>0)/L683 INDEX-*-84 *L6185 a COMPUTINU CONFIDENCE INTERVALS USING NORMAL APPROX. WIC.F.
* ".86 L5:D9NOM-(N(- )x((2xN)+5))+18)*0.587 TALPHAeDENOMx (I (NORMPTH ALPHA))88 -L989 L6:TALP9A-Pr3. 'NDZX]~90 Lg:CI+X CON'A I[91 NN-+CI[92 SLOPES.1+C1[93 RR-L ((NN-TALPHA)+2)Lg :o *(RR=0)/L2095 RR-2.96 L20 :SR-.rC1.<( NNI+TALPHA )+2))9g7 -<SR:5(pSLOPESJ)/L2I
~98 LSR.:SLOPES9 j21:'A ' (vCC ) I PERCENT CONFIDENCE INTERVAL FOR B, THE SLOPE OF100 z TH ETIAfkD REGRESSION LINE, IS:'101 1
102 C ,(vSLOPE SERR]),' < B < ',(vSLOPESESR]),' ).1104 IF B OUTSIDE TSE(C.I. CALCULATE9 NEW EQUATION BASED ON MEDIANS105 (.BZSLOPES[RRI)A(B5SLOPESJSRk)J106 A ORDER XN1071 X.-X [109 CH HCK TO SEEF IF THE SIZE OF SS IS EVEN OR ODD FOR FINDING MEDIANS110 *((21INN)=0)/S1111 a COXPI7TE MEDIAN FOR ODD CASE
113 -S2114 a COMPUTE MEDIAN FOR EVYEN CA SE115 S1:Be(SLOPESt (N2 ))+SLOPESf ((NN+2)+2) ]942116' 12: DO TH SAME F OR THE X AND Y VECTORS17 A2.( IN )0)/5
118 IBARI. -Y N+1 +2)1 19 XBAR*XL (N+1).21]
* 120 -OUT*121 S3:YBAR+( [ (N42 )J+Z[((N42)*2 ))2
* *~122 X ++XLN2)] +E (N+2 )#2)] 4.123 A COMPUTE NEW I NTE-RCEPT I A'*124 OUr:A-YBAR-(HBcXBAR),25 'THE LEAST SQUARES ESTIMATOR OF B LIES OUTSIDE THE CONFIDENCE INTERVAL.'
1126 'DISCARD THE LEAST SQUARES EQUATION AND USE:'
[128 + ,w)'4 ',(*B),'X[129[130 ' THIS EQUATION IS BASED ON THE MEDIANS OF THE X AND Y DATA AND~1311. THE 'IEDIANi OF THE IWO-POINT SLOPES CALCULATED FOR THE CONFIDENCE INTER?
* ~ALLOW 9SER :0 O 0SOME ;NAL.-S:S IN NVEW EQ UAZ'N .li34J 'DO YOU WISHj TO ENTER SOME X VALUES TO GET PREDICTED Y''S FROM THE NEW E
131QUATION? (Z/N).'
136] 4.(FF:'I')/L2137] FF+'7
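The alternate regression line that NPSLR proposes when the least squares slope falls outside the confidence interval is, per the printed messages, based on the medians of the X and Y data and the median of the two-point slopes. A minimal Python sketch of that construction follows; the function name `median_slope_line` is an assumption.

```python
from statistics import median

def median_slope_line(x, y):
    """Return (a, b) for the line y = a + b*x built from medians.

    b is the median of all two-point slopes with distinct x values;
    a is chosen so the line passes through (median x, median y).
    """
    n = len(x)
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i in range(n) for j in range(i + 1, n)
              if x[j] != x[i]]
    b = median(slopes)
    a = median(y) - b * median(x)
    return a, b
```

For example, `median_slope_line([1, 2, 3, 4, 5], [2, 4, 6, 8, 100])` keeps the slope at 2 despite the single extreme Y value, which illustrates why the median-based line is offered as a replacement.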
7 SIGN;A.CBDPVALXMO.NVCDF;ALPHACIY-AABBCC;DD;PV;PVI;NNN;KPOS;ORD
I a HIS PLINCTIOk OSEb THk CADI----iTS'O ACLT H2 a STATISTIC P-VALUE AND CONFIDENCE INTERVAL AS A TEST FOR MEDIANS.
E3 a THE LAST bPTIORV OILL DISPLAY A TABLE OF CONFIDENCE INTERVALS OF'4] A ORDERED STATISTICS wITH CONFIDENCE COEFFICIENTS.5S. A SUBPROGRAMS CALLED BY THIS FUNCTION INCLUDE: BINOM, NORMCDF, NOEMPTE,~6. A INPUT SAND QUANC.
7 'I
'DID YOU ENTER THIS PROGRAM FOR THE SOLE PURPOSE OF GENERATING CONFIDENCE INTERVALS FOR A SPECIFIED SAMPLE SIZE AND QUANTILE? (Y/N).'
H 3 WW-m101 7' W1Y)/B'4i I.(~= THE NULL HYPOTHESIS STATES - THE POPULATION MEDIAN (M) IS EQUAL T
0 THE HYPOTHESIZED MEDIAN (MO); HO: M =MO.2
' WHICH ALTERNATIVE DO YOU WISH TO TEST?'
' '
B3:' ENTER: 1 FOR H1: M < MO; 2 FOR H1: M > MO; 3 FOR H1: M ≠ MO.'
E B2: 'ENTER: 1 FOR SINGLE-SAMPLE PROBLEM; 2 FOR PAIRED-SAMPLE PROBLEM.'H9 AA-O20 (.q )A (,U=2 ))I E2
S21 * (AA2)ILg22.A INPUT DATA FOR SINGLE SAMPLE CASE23] X.INPUT 124 NNN.OX
25 MO+.INPUT 3.21D-X-MO
2 PAIRED SAMPLE CASE
301 i -e-R
S33 DD-X -73U NNN-oDD
35 MO.INPUT361 D-(X-Y)-MON37 ACOMPRESS D TO REMOVE ZEROS358 LU4:A-(D:O)/D39 A RECORD LENGTH OF A AND ASSIGN TO N'40 N-pAL41 $I KEEPING TRACK OF POSITIVE SIGNS42 K.POS.+I (A0O)43 -(NZ30 )INORM'44 PVAL-BINQM N45 ;(PS>O )IFi
47 -P248 P1:PVI-l-PVAL[KPOS)49 P2:PV-PVALE (K.POS+i150 *L651 a IF N I .GREATER THAN 30 U E NORMAL APPR9X WFI-CONTINUITY CORRECTION
52NR(Zi (P -0 5)-(O 5XN) 4 0.5x(N*0.5)j53 Z (KPO +0.5 -(6.5xN)5.*(0.5x(N*O.5))155 PVIi- (NORMCDF Zi)56 AIF PAIRED SAMPLE TEST GO TO L17 FOR OUTPUT STATEMENT57 LPV2x(/(PVP VI))58 (V3)/559 PV3-160 N5:PVM-(3 1)p PV PVI PV3)61 'COMPUTA±IO S AE BASED ON A SAMPLE SIZE OF: ',(VN)6263, 'THE TOT-AL NUMBER OF POSITIVE SIGNS IS: '.(*KPOS)64 ' '
65 -(AA=2 )IL1766 'THE P-VALUE FOR Ho: M =',(*MO),' VERSUS HI: M ',(*LOGIC[C;12),1 ',(vMO)
L69] L17:' THE P-VALUE FOR COMPARING THE MEDIAN OF THE POPULATION OF1701 'DIFFERENCES TO THE HYPOTHESIZED MEDIAN,'
'90q: M(X-71= ,ZMO).' 7ERSUS 31: M(X-Y) ,, c.L0CZCC*.::: . :SO)
~73 1 ,
74 L18: 'WOULD YOU LIKE A CONFIDENCE INTERVAL FOR THE MEDIAN? (YIN).'75 B.76 (BB'Y)L677 :*UANT78 a INPUT SIZE OF CONFIDENCE INTERVAL79 L16:CC*INPUT 580 A4PHA*(100-CC)*20081 * (NNNZ30)INORM182 aCOMPUTING CONFIDENCE INTERVALS By EXACT P-VALUE83 CDF*BINOM NNNa" aINDEX POSITION OF CDF FOR ALPHA + 285 B.+I( CDF!5ALPHA)86 -(B> ) /SFJP
87 3.1ias *SxxP89 a COMPUTING CONFIDENCE INTERVALS BY NORMAL AJPROX.90 NORM1:KALPHA-((0.5x(NNN*0.5))x(NORMPTH ALPHA) )+(0.5xNNN)-O .591 A BOUND KALPHA DOWN IO NEAREST INTEGER AND INCREMENT BY ONE92' B.LEALPBA+l93 A IF SINGLE SAMPLE CASE GO TO L794 SKIP: (AA:I)/L795 A CALCULATE AND PRINT OUT CONF. INT. FOR PAIRED SAMPLE CASE96 L5:ORDD.DD [,DD]'7 7Z+PORDD91 CI.ORDDBJ ,ORDDr(YY-(B-I))'99 'A ' (uC PERCENT CONFIDENCE INTERVAL FOR THE MEDIAN OF THE100 :POPULATION OF DIFFERENCES IS:'102 ',(ocl[lJ),' s MEDIAN(X-Y) i ',(fCIC23),' )f103] ' '
104] QUANT105 a CALCULATE AND PRINT OUT CONF. INT. FOR ONE SAMPLE CASE,1061 L7:ORDX.-X'X"L10T7 II ORDX1I08] CZ-ORDX[RjORD!HUY-'(8"i))]Mg 091 ,mC.! P CENT :ONF:DENCE N.RVT7AL FOR THE MEDIAN OF THE POPULATIO110]1,111 ' ( '.(wCI12),' f. MEDIAN < ',(VCI[2]).' )'112]113 QqANT: 'WOULD YOU LIKE CONFIDENCE INTERVALS FOR A SPECIFIED QUANTILE? (Y/N
nt. : -WW='Y)I!
,117' N'ENTER DESIRED SAMPLE SIZE (SINGLE INTEGER 7ALJE).'
L120 Ij ((1 (NNN)=o0)/Eu121 1:'ENTER DESIRED QUANTILE; FOR EXAMPLE: ENTER 20, FOR THE 20T QUANTIZ.
R23 (QUA 50,v(QUA>1OO))/Z
NNN QUANC QUA
' '
'***** THIS TABLE GIVES CONFIDENCE COEFFICIENTS FOR VARIOUS INTERVALS WITH'
'ORDER STATISTICS AS THE END POINTS FOR THE ',(⍕QUA),'TH QUANTILE.'
' '
→0
E1:'ERROR: THE QUANTILE VALUE MUST LIE BETWEEN 0 AND 100; TRY AGAIN.'
' '
→B1
E2:'ERROR: YOU HAVE NOT ENTERED A VALUE OF 1 OR 2; TRY AGAIN.'
' '
→B2
E3:'ERROR: YOU HAVE NOT ENTERED A VALUE OF 1, 2, OR 3; TRY AGAIN.'
' '
→B3
E4:'ERROR: YOU HAVE NOT ENTERED A SINGLE, INTEGER VALUE; TRY AGAIN.'
' '
→B4
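The exact confidence interval for the median in SIGN is found by counting how many entries of the Binomial(N, 0.5) CDF do not exceed ALPHA and using that count as an index into the ordered sample. The following Python sketch illustrates that indexing rule; the function name is hypothetical, and it is assumed that the CDF vector starts at zero successes.

```python
from math import comb

def median_confidence_interval(data, confidence=95):
    """Order-statistic interval for the median with at least the stated confidence."""
    x = sorted(data)
    n = len(x)
    alpha = (100 - confidence) / 200           # one-tail probability
    cdf, running = [], 0
    for k in range(n + 1):                     # Binomial(n, 1/2) CDF at k = 0..n
        running += comb(n, k)
        cdf.append(running / 2 ** n)
    d = sum(1 for p in cdf if p <= alpha)      # index position for alpha
    d = max(d, 1)                              # never index below the first value
    return x[d - 1], x[n - d]                  # d-th smallest and d-th largest
```

With ten observations and a 95 percent request, the rule selects the second smallest and second largest values, whose actual coverage is about 97.9 percent.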
∇ SPEARMAN;X;Y;A;Q;R;CHA;PV
⍝ THIS FUNCTION COMPUTES THE SPEARMAN R STATISTIC WHICH MEASURES
⍝ THE DEGREE OF CORRESPONDENCE BETWEEN RANKINGS OF TWO SAMPLES. THE P-
⍝ VALUE IS GIVEN FOR TESTING ONE AND TWO-SIDED HYPOTHESIS OF ASSOCIATION.
⍝ SUBPROGRAMS CALLED BY THIS FUNCTION INCLUDE: TIES, TIESK, SPMANP,
5 A INPUT SPAPROX, INTERP, AND THE VARIABLE PMATS:.,6, R.INPUt 27 M +Rq ll +
7,H'L SPMANP TO CALCV '_' 'F ST4=s-~ Z.r r-.~
,L * :L215 L1:CHA-'DIRfCT'16 L2:PV-2xA 2J17 *(PVSI)IL3Is8 PV-119 L3:'SPEARMAN''S R EQUALS '.(uA:11")20,21 'THE P-VALUE FOR HO: NO ASSC-.. :..221 1 81: ',(*CBA),' ASSCC:4T$ rl -
23 ! '2u 'THE P-VALUE FOR THE TWO-SlDFR TEST YF FFr25 '
,,,, q ~' W~~ii~W'U~i li~~u~z T~aP~~ V~A~'IM MDIAV (N) IS EQUAL r'1 MY.-OF AZrnuLTIVE DO YOU VISE ?0 TSf?E1Y wU: I FORII x NcRa 2 FOR 11: X b X01 3 FOR1III a NO.,
0A(C-2)A CS)E; SZ*' L5W-SAM~rj PRODLEM: 2 F05 P&XIUD-SJJIFL PROBLEM.'
a -Lp * A 2 z ' X:PUr DATA FOR SINCZ9 SAJIPE CASE
Al!:.@T PAIRED SAMPLE CASE
L11:.(Dn)/DCOMPRESS D TO REMOVE ZEROS2 NaA RECORD LENCTOF A AND ASSIGN TO N
'PS(-)ZPN RC OF POSITIV SIGNS:AEE I'll ASOLUTIT 7A.L0Z OF A. ASSIGN ?0 3 AND ORDER 3
pos.voinul; osrrrys SIGNS TO COINCIDE wITE PROPER POSITIOS IN I
s.- Eel CI UCO OALL POSITIVE VAUE 0REA EX2W; BY ADDINGE I ACROSS ALPSfT AUSO
N~~e(L(( Of LI??r TAIL OF PROD AJILIT' DISTRIUTION
(#2,16) /L3 SrArEMEJTS UASED ON LENCTI OF VECTOR 8ao-N GENERATEf NULL DISTRINUTION FOR TPLUS
"' FA LT EAL? 0F PROD DIST CALCULATE PVALUX AS NORMALai E E NE sGATvz r' STATISTIC
S~( USE DOTE YEE INTEGER ABOVE AND BELOW AS TPLOS
So )/GO
.r2
: j Ihhi?1111 VOA PPROCXMATIN:TE! -COTINUITY FCRCINFTO
)(NuM-0.5 *2 )(N-1 ).O.s
"S SC AN b5 S %% -S%.
gS;,~S /N2)$C~
4~~ ~~ :V( TDSTOY5*
r, 3:1
o0 2ry '1N OTAL SON OF POSITIVE RANKS IS: '.(VTPOSl)
1I012 a, F PAIRED SAMPLE TEST GO TO Li7 FOR OUTPUT STATEMENT
8.041 J-J"2 i4 K M q ,(WNO),' VERSUS HI: M ',(vL0GICcC;i3).' ',(G4O;15 ,:I. 6', ~ j
P107 1i7: '"99 -7ALJ9 FOR COMPAR:NG THE VFEDrAN OF THE POPULAZION OF108 : DiFJEENCES TO THE BIPOTESkIZED MEDIAN,f,.... '3C: N(X-Yj2 '690).' 7ERSUS 3i. N(X-I) ,cLCC;:.''.(*NaO.'
Li12 L1 U:'VOULD YOU LIKE A CONFIDENCE INTERVAL FOR TEE MEDIAN? (Y/N).'
.. , ALPSA-':OC-CC)#200iiS. aROUTE TO NORMAL APPROX. FOR CON? IN? OF LARGE SAMPLE SIZZ
119. -(NNN>16VLu~120, a COMPUTING CONFIDENCE INTERVALS BY EXACT P-VALUE121~ CDF.YILP tN122 a INDEX POSITION OF CjPF FOR ALPHA + 2123: TALPIA (+/(CD?!CALPEA))
(TJUMP WIUM
a COMPUTING CONFIDENCE INTERVALS BY NORMAL APPROX. W/C.F.126 L.:UOZ(2xNNN NNli)x((2wNNN)+1)),3 *! .129 TxPE. ( DNM NORMPTB ALPHA)(Nx 5NN1) )4tL) BYON130 a rN MAWI 1A Di131 TALPE*LTALP PEA DOWN TO IrNTEGER V LUE AN -NCRH~T3 N132 TALB LZLAi133 aUP.Az)L IF ONE SAMPLE CASE 0O TO0 L7134 a CALCULATE AND PRINT OUT CONP. INT. FOR PAIRED SAMPLE CASE135 L3:CI. ALPA1136 'A 'jV (!C RETOmNDENMCE INTERVAL FOR TEE MEDIAN OF THE137 'POP LXI OF DFFERENCES IS:'136 ''139 ' '(*C13),, f, NEDIAN(X-7) & ',(vC1E23).' )l140 '141 .001042 aCALCULATE AND PRINT OUT CONY. IN?. FOR ONE SAMPLE CASE143 L7:CI.TALPIA CONFW X144 'A ',(uCC).' PERCENT CONFIDENCE INTERVAL FOR T BE MEDIAN OF THE POPULATIG
N IS:'*145~ 1 1
:146' l '(vCI[I3).I S9 MEDIAN :9 ',(fCIE23),' ),147 1,140 *W0,149 22:'ERROR: YOU HAVE NOT ENTERED A VALUE OF 1 OR 2: TRY AGAIN.'
.151' -821152 E3:'RROR: YOU HAVE NOT ENTERED A VALUE OF 1, 2, OR 3; TRY AGAIN.',153 ' '
.1 sk: .33
APPENDIX F
LISTINGS OF SUBPROGRAMS BASIC TO BOTH WORKSPACES
T .0# N0ON N iN P:X;'CDF
⍝ THIS FUNCTION IS A SUBPROGRAM OF THE SIGN TEST (SIGN). IT CALCULATES
⍝ THE CDF OF THE BINOMIAL WHEN PROBABILITY = .5. N=SAMPLE SIZE.
V O+L CBIN R
2 HIS ?r UNYTION IS A SUBPROGRAM OF CONFIDENCE INTERVAL GENERATOR FORI T153 Q f QUATILE. IT RETURNS THE VALUE OF THE BINOMIAL CDF AT R,WITH N.P=L WHERE N=SAMPLE SIZE AND P=PROBABILITY.
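A binomial CDF of the kind CBIN is described as returning, and its use in attaching a confidence coefficient to an interval between two order statistics, can be sketched in Python. Both function names and the argument order are assumptions; the relation used is the standard one for continuous data, P(X(r) ≤ quantile < X(s)) = P(r ≤ K ≤ s-1) with K distributed Binomial(n, p).

```python
from math import comb

def binom_cdf(r, n, p):
    """P(K <= r) for K ~ Binomial(n, p); returns 0 when r is negative."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(0, r + 1))

def quantile_interval_confidence(n, p, r, s):
    """Confidence coefficient of (X_(r), X_(s)) for the p-quantile, 1 <= r < s <= n."""
    return binom_cdf(s - 1, n, p) - binom_cdf(r - 1, n, p)
```

Evaluating the second function over a range of (r, s) pairs would produce a table of the kind the SIGN function prints for a requested quantile.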
∇ CON←XX CONFLR YY;BB;SS;AA;A;XR;YR;B;C;S
⍝ THIS FUNCTION IS A SUBPROGRAM OF NONPARAMETRIC LINEAR REGRESSION
⍝ (NPSLR). IT CALCULATES THE TWO-POINT SLOPE FOR EACH PAIR OF POINTS
⍝ (XI,YI) AND (XJ,YJ), ALL I<J AND XI≠XJ. ALL SLOPES ARE ORDERED AND
⍝ USED TO FIND THE INTERVAL FOR B, THE SLOPE OF ESTIMATED
⍝ EQUATION. XX= X DATA SAMPLE AND YY= Y DATA SAMPLE.
7 a RECORD THE SIZE OF XX AND INITIALIZE VARIABLES
11 a THIS LOOP CoMPRESSZS ZAND 7 DOWN TO WHERE THE XX < ALL OTER XXS12 L2: A.Ai13 A.(X%iAA.,CXX)14 XA A <X
15 YR.A/YI16 XR17 .(*B=0)/L318, C*O19 a THIS LOOP CALCULATES THE SLOPE OF EACH PAIR OF PAIRED DATA.20 LI: C+C+121 5. (71 CAAJ-YR C) )(XXCAA2-XRCCJ)22 SSTSS S23 (C<B ILI24 L3:(AA<BE )/L225 JON (pSSl, CSS[ 4SS3)
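How the ordered two-point slopes are turned into a confidence interval for B in the large-sample branch of NPSLR can be sketched in Python. The function name is hypothetical, `statistics.NormalDist` stands in for NORMPTH, and the index formulas are read from partly garbled APL, so they should be treated as an assumption rather than a transcription.

```python
from math import ceil, floor, sqrt
from statistics import NormalDist

def slope_confidence_interval(x, y, confidence=95):
    """Large-sample interval for the slope from ordered two-point slopes."""
    n = len(x)
    slopes = sorted((y[j] - y[i]) / (x[j] - x[i])
                    for i in range(n) for j in range(i + 1, n)
                    if x[j] != x[i])
    m = len(slopes)
    alpha = (100 - confidence) / 200
    z = NormalDist().inv_cdf(1 - alpha)        # upper normal percentage point
    t_alpha = z * sqrt(n * (n - 1) * (2 * n + 5) / 18)
    lower = max(floor((m - t_alpha) / 2), 1)   # 1-based index, at least 1
    upper = min(ceil(1 + (m + t_alpha) / 2), m)  # at most the last slope
    return slopes[lower - 1], slopes[upper - 1]
```

The listing applies this approximation only when more than 12 points are supplied and otherwise takes the cutoff from the exact Kendall distribution.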
A'7 V CONFM-AA CONFMW RBA;B;C*D-EF'GHNTI a THIS FUNCTION IS A'SUBPROGA OF THE xANs-WRITNsY TEST (MANW).
⍝ IT COMPUTES CONFIDENCE INTERVAL ENDPOINTS FOR THETA, THE SHIFT IN
⍝ LOCATION BETWEEN X AND Y. AA=INDEX POSITION OF C.I. ENDPOINT.
⍝ BB=COMBINED DATA SAMPLES.
A AqSIGN SIZE OF X VECTOR TO A; INDEX POSITION FOR CONF INT TO B.
Q SH ASIN X 79CTOR TO ;C; V ECTOR TO Di DA + 38
11 a REORDER X AND Y VECTOR VALUES TO ASCENDING ORDER
14 a INITIALIZE VECTOR 5 AND VARIABLE F15 Mis F4-17 mINNER AND OUTER LOOPS CALCULATE ALL POSSIBLE DIFFERENCES; EVERY 718 a ELEMENT MINUS EVERY X ELEMENT. B VECTOR STORES THESE DIIFERENCES.19 12:G-120 Li:E.D (F)-CEG3211 H-0B.E
ob~ (F( VODI sAhri Ir vicTOiR VALUES TO ASCINDING ORDER
L20 UE 08ZTECN INT VAL"E FROM H
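The comments in CONFMW describe forming every Y element minus every X element, ordering the differences, and indexing out the interval endpoints. A minimal Python sketch of that procedure follows. The function name and the separate `x`, `y`, and `k` arguments are assumptions; the thesis function instead receives the combined sample together with the size of X and the index position.

```python
def shift_interval(x, y, k):
    """Return the k-th smallest and k-th largest of all differences y_j - x_i."""
    diffs = sorted(b - a for a in x for b in y)  # every Y minus every X, ordered
    return diffs[k - 1], diffs[len(diffs) - k]
```

Here `k` plays the role of the index position computed in the calling function from the exact distribution of U or from its normal approximation.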
v GONr CINFW B-C;DEZG3N~ FAI S5 '*(fl,9Ar COIN f ~ "" IT RANK TEST
2 . ®r~k C NR A 55 E~ah AS D ON THE3 im E~ Lf.l fA PRID goFIf A Oj 5 RA j1 Z A s X . A x INDEX
aSTART I ACCUMULATION VICTOR OF? MITE ORCINAL VICTOR VALUES7 U00
id' OUTSIDE LOOP INCREMEfNTS D AND RE SETS f TO DiiL2:D-D+i
12 I-D13 j a NSID LOOP CENERATES NEXT SIT OF AVERAGES AND CONCATEPNATES TO ORIGINAL
VCR
*17 aCONTINUE INNER LOOP MNT= I EQUALS THE SZE OF C
19 CONTINUE OUTER LOOP UNTIL D EQUALS THE SZE OF C LESS ONE20 (Da p)I)L:21 ORDER FINAL ACCUMULATED VECTOR H*23 aNDXQFIT VALUES OUT OF H
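The loops in CONFW accumulate the original values together with the averages of every pair, order the result, and index out the interval endpoints. These pairwise averages are commonly called Walsh averages, a name the listing itself does not use. A short Python sketch, with a hypothetical function name, is:

```python
def walsh_interval(data, k):
    """Return the k-th smallest and k-th largest Walsh average of data."""
    n = len(data)
    walsh = sorted((data[i] + data[j]) / 2     # includes i == j, the values themselves
                   for i in range(n) for j in range(i, n))
    return walsh[k - 1], walsh[len(walsh) - k]
```

A sample of size n yields n(n+1)/2 averages, which matches the range of the TPLUS statistic used to choose `k`.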
V FeD? FDI STN X-A:M-NRM-RNLM-LNSM SNM2N21 YETAI FuNcrI6N It~ tus6ROfAI of 'fDkUtAL-FJALLIS TEST IKRWL) AND
2 EYZSU DENT TDIST(DSN.I jXfCA MCLAT PP O UMULATI VE PROBS.
14
6 A xX+Sm.x-lDF5 .14DF7 * (M'>2 )vNp2is a REATl THE 1ST. FOUR SPECIAL CASES.
"I'L' 1 0sUN1 Lll:Pet-IOA*0.3)+04211 -012 L12:P.A*0.g13 -014 L21:P41-(1-A)*0.g15 -0161 L22:P-A17 .0018 a BEGIN THE GENERAL CASES. INITIALIZE THE QUANTITIES.1.g L:R MIN2) ORNe1 I P220 S =0(R :O )xjLM-LM~e *.5xM-221 S4.(RN=0 iRNILN-LN2*0.5xN-222 a R SN CSE N-EVEN FIRST.23 .NODD-i1LM214 11 I~ O2
25 SIe+,/x\(1-A)e.x1+(1+0.5xM)%N226 .M VN27 NODD:eMEPVENx N28 SN. (1-A )*0.;5=29 :.REAT TH PORTION OF SN THAT DOSN'T CHANCE M/ M ODD OR EVEN.30 ,~+eL ) O=LN
'4.31 S e-SNXI+4&/X\(1-A).x142x~M241+2x1LN2 I REAT :%9 q'--;7EN 3 UBC-SE13:-( (242).!=12M),MEVEJV3.-LC
35 efMZVEN~36 a TRAT THE N-ODD SUBCASE37 S JNX2+ol38 eENDE=M39 SN.Nxx/l++*1+2xi0.5.N240 EV9N :*MODDxxU:21M'41 -ENDx%2:N42 SN4.%++/x\[emailprotected] *IND44 a NEXT TREAT TitE $PECIA4 CASES FOR ODD-M.'45 MODD:SMN (0xM:1)4.( A*0 *5)+0+2) x3=M'46 .ENDxt(1:M)v3=
'46 END: .(SNXAH.)+4xRNx.N-RM)+( (RN"RNx loA*0.S)4o+S)-2xRNxSNx(-A)*0.5
a SO I G SDI z LATA ATPLI I CURSN wl
RECORD SIZE OF N TO DETERMINE NUMBER OF SAMPLES IN BAA'.oNa LOCATE LARGEST SAMPLE SIZE
m* ORDER B SMALLEST TO LARGESTI DD-BCAD
aD -(AiET UP MATRIX OF SIZE REQUIRED TO STORE SAMPLE RANKING.Xx1(Al ) X0a CONCATENA!E 0 AND N1 FNINO';RD CUMULATIVE SUM$ OF SAMPLE SIZES.
1 2 HIS LOOP INDEXES OuT ORGINAL SAMPLE VALUES FOR FURTHER CALCULATIONS.2 LI :CC*CC I.3. ,OCTZ. C'.j RI rYAL SAMPLe 7ALUES ZN Bi4 Zl*BC(NCC t "C )
• a 4x1 Os(CIoNS OF ELEMENTS OF X VECTOR IN B AND ASSIGN TO C~ DD X 111
⍝ THIS LOOP DETECTS TIED INDEXED POSITIONS AND INCREMENTS THE INDEXING OF
⍝ EACH SUCCESSIVE TIED POSITION BY ONE
⍝ TEST DTH ELEMENT OF C AGAINST REST OF C FOR TIES
L2:E←(C[D]=D↓C)
⍝ SET F EQUAL TO THE APPROPRIATE SIZE (DTH SIZE) ZERO VECTOR
⍝ CONCATENATE F AND VECTOR OF 0'S AND 1'S (1'S APPEAR WHEN TIE OCCURRED)
⍝ ADD RESULTANT F VECTOR TO C
⍝ CONTINUE FOR ENTIRE C VECTOR
,.
1 1 J DjES OSF TE INPUT PROMPTING AND ERROR CEECKUNG.*(L=TR2 L3P O d PROTDREA TW:BEVAIN'R RQIE)3 X? IS 1 SUBPROGAM OF SIGN, VISIG, MANYI, AEN, SPMAN, AND NPLR.
it -O 2',~ I
'ENTER X DATA (MORE THAN TWO OBSERVATIONS ARE REQUIRED).'
'ENTER Y DATA (NUMBER OF Y ENTRIES MUST EQUAL NUMBER OF X ENTRIES).'
15 r~a
L2:'ENTER THE HYPOTHESIZED MEDIAN.'
2 !IN.U Z49
2 *(pIN))I )/E3 "-
23 .0 2
'ENTER THE HYPOTHESIZED MEDIAN FOR THE DIFFERENCES OF THE PAIRED DATA.'
26 *U(IN)>I)/E3
11-0
L3:'ENTER THE DESIRED CONFIDENCE COEFFICIENT:'
' FOR EXAMPLE: ENTER 95, FOR A 95 PERCENT CONFIDENCE INTERVAL.'
31 I
32 *(IN0 )v (ZN>100 ))/Ei&3 332 *(( 1/IN )2O)/ES
23324 *Ql35 E1:' RR THE SIZE OF OURSME IS LES TH AN THFF REE; TrY AG AI.' DAA.
39 E :'ERROR: SAMPLE SIZES ARE NOT EQUAL; WANT To TRY AGAILN? (I/N).':1 *{B:'Y' )/L2
2 'ENTER RIGHT ARRO 5, TO QUIT.'310
32 Yj0J-lJ~ll0))1a
33 M I X0 /9
34~ ,0
E3:'ERROR: THE HYPOTHESIZED MEDIAN MUST BE A SINGLE VALUE; TRY AGAIN.'
:3: .)/L3
E4:'ERROR: THIS VALUE MUST LIE BETWEEN 0 AND 100; TRY AGAIN.'
E*L5
E5:'ERROR: THIS VALUE MUST BE AN INTEGER; TRY AGAIN.'
.c114 *L S
R ~ D i A~fj UNN I'L.U TEE
£PASS D AS :50 RE GIKENT.
⍝ SEPARATE THE CDF TABLE AND STATS. INTO SINGLE VARIABLES.
a j .C 3;a HERE A FIRST EXCEEDS OR EQUALS ONE OF TEE TABLE VALUES.
'l -(+Cj80STf DOES NOT EQUAL ANY TABLE VALUES SET NTR zz-.Ju nD ~ lcfomoCFrs ocjz2. F AC
⍝ IF INDEXED POSITION EQUALS ONE, INDEX P-VALUE OUT OF CC.
~20) a TEWS ONDUCT INTERPOLATION TO GET PROPER P-VALUE.21 L), ; lsDpifj-E2j4 gL~ D2lj.D_ 1
r s rNT* 0CR-1J.+PL
2~~~~~~ ~ L N.lDAPNB D N (KEN) AND N N-
PARAMETRIC IN R ~ "SlfON ( NlR. IT CALCULWES THE CUMULAIEISexTRIU, W 0O Ba MPL SIZE N.
5 a NN2I ITALXZE FREQUENCIES FOR X FOR SAMPLE SIZE MN.7 a D TERMjNE $IZE OF RIGHT PROD. TAIL VECTOR
91 acoi OUERJ gO INCREMENTS THROUGH TEE N SAMPLE SIZES TILL THE DESIRED ONE IS
10 Jl1:i+0011 C012 f:00z13 NN.tN+114 *.((NNX(NN-1)).2)+215 A41.1N a INNER LOOP GENERATES NN+1 FREQUENCIES FROM TEE VECTOR OF mN FREQUENCIES17 L2:A*A-1Is C4:C4XC (B-A))19 DDC20 AA-d
21 a ISIZE Q D EQALS NN AND INDEXES OF X STILL REMAIN GO TO L422 *(( UN1 (-)F1/L23 aTE S8S CON TINosE To INCREMENT TERU OLD FREQS
25 a WREN LFT HALF OF NN+l VECTOR IS COMPLETE GO TO L5
34q "+XiUB-A)J-XZAAJ
31 E.1.31] a OM TE N VC O REQ D BND ONCTENINGDWT39 L3:9*D
'40 CONTINUE UNTIL SIZE OF SAMPLE N IS REACHED
'42] :NGNEATE VECTOR OF CORRESPONDING P STATS OF PROPER SIZE C
"3 P.0 .1.44 1 1 I~S 4 RO mghP VICTOR
~ '~r4~j'?%~) l*miA±E.z1 FROM Tar FREQUENCY VCTOR
14 C ANC FRZQS TO)Cf C JDALE AND OUTPUT B S'TATS WFIAPPRO. CD! VALUES
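The comments in the listing above describe building the frequencies for sample size NN+1 from the frequency vector for NN, then converting frequencies to CDF values. A minimal Python sketch of that kind of recursion follows. It is not a transcription of the APL: it counts permutations by their number of inversions (discordant pairs), which is one standard way to obtain the null distribution of Kendall's statistic without ties. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def kendall_inversion_counts(n):
    """Return counts[k] = number of permutations of n items with k inversions."""
    counts = [1]  # n = 1: one permutation, zero inversions
    for size in range(2, n + 1):
        new = [0] * (len(counts) + size - 1)
        # Inserting the new largest item into a permutation of size-1 items
        # adds between 0 and size-1 inversions, depending on its position.
        for k, c in enumerate(counts):
            for j in range(size):
                new[k + j] += c
        counts = new
    return counts


def kendall_cdf(n):
    """Cumulative probabilities P(Q <= k) for the number of inversions Q."""
    counts = kendall_inversion_counts(n)
    total = factorial(n)
    running, cdf = 0, []
    for c in counts:
        running += c
        cdf.append(Fraction(running, total))
    return cdf
```

With Q inversions among n untied observations, the usual score is S = n(n-1)/2 - 2Q, so a table indexed by Q can be relabeled in terms of S or of S divided by n(n-1)/2.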
V MAN-N MANWP M-F.Q P-ST;U-VB -NNMUUMM[1r F UNJTfON IS 'A5Bjb)MOP TOE MANN-WHITNEY TEST MXAYW).2 mJ l RAUJ r E STYIir f. FPOR HEU STATSTIC. N LSZ OF
51Qa~ U~NUMB9j)OF TERMS TO BE INCLUDED IN LEFT TAIL DISTRIBUTION LESS 1S M. NM+ +7) a F~,M0 SET F VECTOR EQUAL TO 1 CONCATENATED WITH MM -ZRO'S
⍝ SET P EQUAL TO THE MINIMUM OF N+M OR MM
⍝ SET Q EQUAL TO THE MINIMUM OF N OR MM
Q←⌊/M,MM
⍝ GO TO LINE DENOM IF MM IS LESS THAN N+1 (SIZE OF X+1)
15 IFNmMN+ llENRATE FIRST BLOCK OF RECURSIVE RESULTS USING NUM LOOP
.9 PRIMAR)' ?ORMULA USED IN GENERATION OF FIRST BLOCK OF REcURS:'7F RESULTSt19 prj@?g 7r-"202 As S .V-W' tEC;?MZNTD VALUE TO U! AND TESTS I? Z THIS NEW U
23 aGENERATE FINAL RECURSIVE RESULTS USING DENOM Loop1 2 u4 DENOM:S.1~28a P4:V4-RT OMA USED IN GENERATION OF FINAL RECURSIVE RESULTS
[292 *L4x't(QkS*S+1[30) ot C NR FREQUENCY TABLE TO CDF VALUES FOR FINAL OUTPUT[312 MAN.(+\F )4(NL(N+M)
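The last line of the listing above divides the cumulative frequencies by the number of ways of choosing N positions out of N+M. As a cross-check on such a table, the sketch below computes the same null distribution of the Mann-Whitney U statistic in Python. It uses the classical recurrence on the largest observation rather than the NUM and DENOM loops of the APL function, so it is an alternative technique, not a transcription. All names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def mw_count(n, m, u):
    """Number of orderings of n X's and m Y's (no ties) with statistic U = u."""
    if u < 0 or u > n * m:
        return 0
    if n == 0 or m == 0:
        return 1  # only u = 0 reaches this point
    # If the largest observation is an X it exceeds all m Y's (adds m to U);
    # if it is a Y it adds nothing.
    return mw_count(n - 1, m, u - m) + mw_count(n, m - 1, u)


def mw_cdf(n, m):
    """Cumulative probabilities P(U <= u) for u = 0 .. n*m."""
    total = comb(n + m, n)
    running, cdf = 0, []
    for u in range(n * m + 1):
        running += mw_count(n, m, u)
        cdf.append(Fraction(running, total))
    return cdf
```

For example, `mw_cdf(2, 2)` gives the five cumulative probabilities for U = 0 through 4 with two observations in each sample.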
V Z.NORMCDF X-AB;C-D1pl mEAUA- bO~A . FF 26.2.11 fA AM qN
~.11 1A S.lAo4 SNJS V XDPW ~a~:f 5 &UI~~ POTDT hRM ~;ir t7H CHANE 0
52 1VI FDRiW PROBLEMS WITU x\
/§4)/Z4-X .X)=P.X )13+C
*A.5BJ.*A*2)X2)*05S)X+/X\A*2).12(C0r0r/)+
12 210 B *C 26~ 89790 56295540 52050600 19934640 3690160 341952 15232 256133 D- 2227021(32432400)7567360 6I540480*2162i0 3843840)314440 15360 256
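The NORMCDF header above appears to cite formula 26.2.11, and the body uses a running product and sum. Assuming this refers to the series 26.2.11 in Abramowitz and Stegun, P(x) = 1/2 + Z(x) times the sum over n of x^(2n+1) / (1·3·5···(2n+1)), with Z the standard normal density, a minimal Python sketch of that series is shown below. The rational-coefficient branch of the APL function is not reproduced, and the function name and the fixed term count are assumptions.

```python
import math


def normcdf_series(x, terms=200):
    """Standard normal CDF by the power series 0.5 + Z(x) * sum of odd-power terms."""
    z = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)  # normal density at x
    term = x       # n = 0 term: x / 1
    total = x
    for n in range(1, terms):
        # Each term is the previous one times x^2 / (2n + 1).
        term *= x * x / (2 * n + 1)
        total += term
    return 0.5 + z * total
```

The fixed count of 200 terms is ample for moderate |x|; far into the tails a different approximation would be the safer choice.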
V Z.NORMPTH P ABC-D T.S.RS_ V API DS 12977
11 S (p)T.0.42< L )/Z ~2.L12. A- 2.5066282N884 -18.6150006252 '1.39119773534 -25.4tslO6OL963713 Ba .735109309 23.083 6743743 21.06224101826 3.13082909833
17, Ef. 2.78189113 -2.29796479134 4.85014127135 -2.321212768581is D4- 3. 43?6892476231.637 Q678j1897193 S-(xS X( R.*0.t x ) +.X)1+.(R.(I o*5-)0.5- )..* 1 2)e.XD)
j(0 O.424 IQ 31100-1Sir: 'ONE 05 8 ROSE I EANI Afi QQr Qf sffaE.'
V FePERK NK:X::a THS FN±bI$' CALLED 3Y SPEARP. IT GENERATES ALL POSSIBLE
a PERUTAT IONS OF N RANKS. NSAMPLE SIZE.3 0X.N=P" I1 I pl
5 r
1:1:. Ox N<X*+1+7 (~(m)eX)\Z
9 tiL1 XIN-1),N)p (.P) "Y
V M.N QUANC Q1JL;:I THIS FUNCTION vGkzvs A CHOICE OF NONPARAMETRIC CONFIDENCE INTERVALS2 FOR THE QT QUARTILE OF A CONTINUOUS POPULATION. N=SAMPLE SIZE,3,,~=AxQUATIL8. IT CALLS ER SUBPROGRAM CBIN.4]52 a CENTERS CHOICES AT "THE ORDER STATISTIC NEAREST ESTIMATE.!6 I'L 0 •5+N/xQ 1 00
07 Is i. NX±ODERD DEVIATIONS (APPROX.)
I r-! x 1u1 ±2 K!R N 1; BOT il7 '12 CONF DE COEFFICES13 L-I(2 pK)o(N 0.01xQ) CBIN(CI l)+K-1),(IE2]-K))
I & M- -ORDEA STATIHT1CS I COEFFICIENTS15 a PART o F iOR T -ING OUTp 2t
-7 U( 0 6 0 2,18 U-U, 4( , 1L19 MM,Ll. U, 12 6 c((pE),1)pLr.
∇ SPN←N SPAPROX X;Y
⍝ THIS FUNCTION IS A SUBPROGRAM OF SPEARMAN'S R (SPMANP).
⍝ IT APPROXIMATES THE CUMULATIVE PROB FOR R WHEN PASSED THE SAMPLE
⍝ SIZE IN THE LEFT ARGUMENT AND THE ABSOLUTE VALUE OF R IN THE RIGHT
⍝ ARGUMENT. SUBPROGRAMS OF THIS FUNCTION INCLUDE: TDISTN.
⍝ CALCULATE THE CONTINUITY CORRECTION
Y←6÷N×¯1+N*2
⍝ TRANSFORM THE STATISTIC R INTO ONE THAT CAN BE USED WITH THE STUDENT
⍝ T DIST FUNCTION TO CALCULATE THE P-VALUE
X←(X-Y)×((N-2)÷1-(X-Y)*2)*0.5
SPN←1-(N-2) TDISTN X
∇
∇ SPEAR←SPEARP N;C1;A;B;C;D;E;M;D1;D2;LIM;R;CDF
⍝ THIS FUNCTION IS A SUBPROGRAM OF SPEARMAN'S R (SPMANP). IT
⍝ CALCULATES THE EXACT CUMULATIVE DIST. FOR R FOR THE SAMPLE SIZE
⍝ PASSED AS THE RIGHT ARGUMENT. BECAUSE OF THE LARGE COMPUTER MEMORY
⍝ REQUIREMENTS, N IS LIMITED TO SIX ON THE PC AND 7 ON THE MAINFRAME.
⍝ SUBPROGRAMS CALLED BY THIS FUNCTION INCLUDE: PERM
A←B←0
⍝ INITIALIZE VARIABLES, VECTORS, AND MATRICES.
*4 .- ks.a) 0 0
a .±S .0P ErERATSS AN Nxff ARRAY OF THE POSS:3LE 7ALUES OF DIFFERENCES12 A BETWEEN ANY TWO PAIRED RANKS BETWEEN SAMPLES.13 L1:B..+14 M B; +C-A15 A-A+16 *(B<N)/L117 a NOW CALCULATE THE SQUARES OF ALL POSSIBLE DIFFERENCES.I8 MeMj19 AJIL" PERM TO LIST ALL POSSIBLE PERMUTATIONS OF N NUMBERS20 DPRM N21 CALCULAE SIZE LIMIT OF FINAL VECTOR OF R STATS.22 :LIf-1 + (N_53 N *12)gOZrD231 IITII ZE AL B EFORE INDEXING OUT COMBINATIONS OF ALL POSSIBLE
⍝ SQUARED VALUES.
A←0
8-1 N NO
⍝ THIS LOOP CALCULATES ALL POSSIBLE COMBINATIONS OF THE SQUARED VALUES.
L2:A←A+1
1 ; IM rA;Dt;A3
→(A<N)/L2
⍝ ADD DOWN ALL ROWS FOR EACH COLUMN TO SUM UP SQUARES COMBINATIONS.
D2←+⌿E
⍝ ADD UP NUMBER OF DUPLICATED SUMS OF D-SQUARED VALUES AND COMPRESS
⍝ VECTOR DOWN TO UNIQUE VALUES.
L3:C1←C1,(⌊/D2)
D1←D1,(+/((⌊/D2)=D2))
D2←((⌊/D2)≠D2)/D2
→((⍴C1)<LIM)/L3
⍝ TRANSFORM SUM OF SQUARES VALUES TO SPEARMAN'S R STATISTIC.
R←1-(6×C1)÷(N×((N*2)-1))
⍝ CALCULATE CDF VALUES ASSOCIATED WITH THE R STATISTIC.
CDF←(+\D1)÷!N
⍝ FORM TWO ROW MATRIX FOR OUTPUT OF R STATS AND CDF VALUES.
SPEAR←(2,(⍴C1))⍴(R,CDF)
∇
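The comments in SPEARP describe listing every permutation of the N ranks, summing the squared rank differences for each, collapsing duplicate sums, and converting the counts to cumulative probabilities over N factorial. A minimal Python sketch of that enumeration is given below for comparison. It follows the steps named in the comments rather than the APL code itself, and the function name and return layout are assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations
from math import factorial


def spearman_exact(n):
    """Exact null distribution of Spearman's R for n >= 2 untied pairs.

    Returns (r, cum_prob) pairs ordered by increasing sum of squared rank
    differences, i.e. by decreasing r, so cum_prob is P(R >= r).
    """
    base = range(1, n + 1)
    # Count how many permutations give each sum of squared differences.
    counts = Counter(
        sum((i - p) ** 2 for i, p in zip(base, perm))
        for perm in permutations(base)
    )
    denom = n * (n * n - 1)
    total = factorial(n)
    running, table = 0, []
    for d2 in sorted(counts):
        running += counts[d2]
        r = 1 - Fraction(6 * d2, denom)
        table.append((r, Fraction(running, total)))
    return table
```

Because all N factorial permutations are enumerated, run time and memory grow very quickly with N, which is consistent with the size limits noted in the function header.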
V SPM+X SPMANP Z;C;D;DD;DI;D2;N;DENOMR;XX;YI;NS;NUMR;P;PV;PVAL:SU;SV;RHO;U;V;ARO;l UV1 X YiWW
⍝ THIS FUNCTION IS A SUBPROGRAM OF NONPAR LINEAR REGRESSION (NPLR)
⍝ AND SPEARMAN'S R (SPMAN). IT COMPUTES THE SPEARMAN R STATISTIC
⍝ AND ASSOCIATED P-VALUES. THE LEFT ARGUMENT THAT IS PASSED IS THE X
⍝ SAMPLE; THE RIGHT ARGUMENT IS THE Y SAMPLE.
⍝ SUBPROGRAMS OF THIS FUNCTION INCLUDE: TIES, TIESK, SPEARP, SPAPROX,
⍝ INTERP, AND THE VARIABLE PMATSP.
⍝ ORDER Y IN INCREASING ORDER OF X
⍝ ORDER X IN INCREASING ORDER
U←X[⍋X]
12 COMPUTE CURRENT RANKING OF I13 C+SV1&4 A NOW ORDER Y RANKS IN INCREASING ORDER15 D1.VCAVI16 IF TIES EXIST IN EITHER X OR I RANKED VECTOR USE MID-RANK METHOD17 DD.1 TIES DI18 XX-1 TIES U19 A FIND ORIGINAL RANKING OF I WITH TIES RESOLVED20 YI1DDiC221 A RECORD SIZE OF INPUT VECTOR22 N X23 a CALCULATE DIFFERENCES BETWEEN RANKS OF X AND I VECTORS24 D-XX-YI25 A DETERMINE THE SUN OF SQUARES OF THE DIFFERENCES26 D2+/(D*2)27 A OBTAIN THE NUMBER OF TIES IN EACH VECTOR USING THE TIESE FUNCTION28 UO1TIESK U29 V1.TIESl Dl30 SU ((+ /( U1*3 )) - (+/U1 )).123:1 SV ((+/ V1*3 )( V1 )232 NS+N ((N*2)-1)33 A CALCULAE T56 R S S TC INCLUDING THE CORRECTION FOR TIES34 NUM(N +( 6)xD2)+ ( 6 x(SU+SV))35 DENOMR ((NS-(12xSU )*.5 )x( (NS-(12xSV))*0.5)36 RBO.NQIMR+DENOMR37 AHHO IRHO38 (N>6)/Ll39 A CALL SPEARP TO CALCULATE THE RIGHT TAIL OF THE CDF OF R40 P SPEAAP N41 eL242 Ll:.(N>10 I)/L343 P+PMATSP (N-5) •J44 : CHANG SIZE bP P TO AN MxN MATRIX
⍝ CALL INTERP TO CALCULATE P-VALUE BY INTERPOLATION
L2:PVAL←ARHO INTERP P
*48 -. PVAL= !I/L7? L VAL-O. S
→L7
⍝ CALCULATE P VALUE USING STUDENT T APPROX.
L3:PVAL←N SPAPROX ARHO
L7:SPM←(RHO),PVAL
∇
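The body of SPMANP resolves ties by midranks, sums the squared rank differences, and then applies a correction built from the cubes of the tie counts in each sample. The Python sketch below illustrates that calculation using the standard tie-corrected form, which the legible parts of the listing (the SU, SV, NS, and DENOMR lines) appear to match. It is an illustration under that assumption, and the helper names are hypothetical.

```python
from collections import Counter
from math import sqrt


def midranks(values):
    """Ranks 1..n with tied values replaced by the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mid = (start + end) / 2 + 1  # mean of the 1-based ranks start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = mid
        start = end + 1
    return ranks


def tie_term(values):
    """Sum over tie groups of (t^3 - t) / 12, where t is the group size."""
    return sum(t ** 3 - t for t in Counter(values).values()) / 12


def spearman_rho(x, y):
    """Spearman's R with the usual correction for ties in either sample."""
    n = len(x)
    rx, ry = midranks(x), midranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    su, sv = tie_term(x), tie_term(y)
    ns = n * (n * n - 1)
    num = ns - 6 * d2 - 6 * (su + sv)
    den = sqrt(ns - 12 * su) * sqrt(ns - 12 * sv)
    return num / den
```

When neither sample has ties, both tie terms are zero and the expression reduces to 1 - 6(sum of squared differences) / (N(N²-1)).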
∇ P←K TDISTN X;V
⍝ THIS FUNCTION IS A SUBPROGRAM OF THE CUMULATIVE PROBABILITY GENERATOR
⍝ FOR SPEARMAN'S R (SPEARP). IT CALCULATES THE CDF AT X USING
⍝ THE STUDENT'S T DIST WITH K DEGREES OF FREEDOM.
⍝ THIS FUNCTION CALLS ON THE 'F' DISTRIBUTION FUNCTION (FDISTN).
7 P.0.5k(2,K) FDISTN X*27 *g xtO~vtIs Pv/X *0.5.+V/P.9 -B X!A/VA C-V)/19X .. ..V/
∇ TI←BB TIES B;C;D;I;N;T;Y;Z;K;M;L;PP;NR;MM
⍝ THIS FUNCTION IS A SUBPROGRAM OF KENDALL'S B (KEN), SPEARMAN'S
⍝ R (SPMANP), KRUSKAL-WALLIS (KRWL), MANN-WHITNEY (MANW),
⍝ AND WILCOXON (WISIG). IT CHECKS THE RIGHT ARG. VECTOR FOR TIES AND
⍝ CHANGES THE TIED POSITIONS OF THE LEFT ARG. BY THE MIDRANK METHOD.
7 A' F NO VECTOR OF RANKS IS PASSED; GENERATE ONE
1 L.NpO12 T.1.2 :( CHECKING FOR TIES 3Y :NCREMENTING THRU THE VECTOR
L,. .( COUNT NUMBER OF TIES; IF .VO TIES GO TO L2
16~ D--/C17 -7 *(D0) /L2i af RECORD WHERE TIES STARTED AND HOW MANY RANKS INVOLVED19 II.T. (D+1)20] a INCREMENT NEXT T BY THE NUMBER OF TIES ENCOUNTERED PLUS 121 L2 :T T (D-6122 al? T LESS THAN SIZE OF ORIGINAL VECTOR GO TO L3 AND START AGAIN AT NEW T3 -(T<N)IL3
25 a ASSIGN THE RANKS OF THE LEFT ARG. TO TI26 TI.BB27 a IF NO TIES FOUND QUIT28 *(Y=0)/029 Z-030 A LOCAE THE INDEXED POSITIONS OP TIED RANKS31 L5:PP.( (I1 ZJ)r 1)+((IL1+ZJ+I[2+ZJ 1)32 A FIND THE MIDRANK VALUE OF THESE RANKS33 NR. (+/TICPP) ) 1I2+Z J341 a ST UP VEC OR WITH ZEROS AND ONES; ONES WHERE TIE RANKS INVOLVED35 1.0
37 L :K K+C38 m-(TI=TICPPENK))39 J~.M)40 (K<IE2+Z1)/L441 ASET A VECTOR WITH ZEROS WHERE TIED RANKS OCCUR42 L.,MM43 a TRANSFORM ONES OF MM VECTOR TO MIDRANK VALUE44 MM-NRxMM45 a TRANSFORM ONES OF L VECTOR TO REMAINING UNCHANGED RANKING VECTOR46 L.TIXL47 A FILL IN MIDRANK VALUES48 TI-L+MM49 Z-Z+250 A DO THE SAME FOR ANY OTHER TIES INVOLVED BUT WITH NEWLY COMPUTED TI51 -(Y>Z)IL5
9
V TIE*TIESK AAAAkBLC D;g'Ta R ,SMA?) AND tSPVA1V1) .-;D iRUSXAL-WALL:S ,5 U AM ., • i -PE'ARMAN'
": A .IE AND ZEE :OTAL NUMBER JF 11-3 i; :HE 7ECTOR.
20 Tr p10
1 : V cx1NG FOR TIES BY INCREMEN':NG TBRU THE VECTOR
513, OUNT NUMBER OF TIES. IF NO TIES GO TO L2,11 D.I,/C
1 N SN NEXT T B7 TEE NUMBER OF TIES ENCOUNTERED PLUS I28 2: T-T (D+%)10 T.T LSED~HAN SIZE OF ORIGINAL VECTOR GO TO L3 AND START AGAIN AT NEW T
7#
V
VARV RM B'CD'E:D1;E1a T5S f*ckC±iN IS A SUBPROGRAM OF THE MANN-WHITNEY TEST (MANy).
if) IT N GEEAESSTER SCHEME USED IN CALCULATING THER DIFFERECS IN SCALE (I ASSIGNED SMALLESTL2 ASSIGNED LARGEST 3
L& Zs:, D3~ f~NEXT RESr 4 SECOND SMZLZZST,-:.J T-vs ::LL ~?sE5 SAmPLE SIZE :S 1EACE.r). SAMPLE S:ZE :3 PASSED :N !BE R:,;T ,.C
D-0 .C. FIND FLOOR OF MIDPOINT OF 7ECTOR AND ASSIGN TO C
I SWS+S GENERATE RANKING VALUES LEFT HALF FIRSTE EL2, Dl
13 (X )=C)IL3
I.1' -( .iZ)<C)I"L2a: NOW CENERATE RIGHT HALF
26 19 .: D 120' rl+E lO "
21: -( (E):C)/L5!22 L6 :DIDl l2; &4 .=' o; i)=C)/L5 ?26 E+E DI "27' + (EI)<C)/L628 1", 4 oFZ) VECTOR IS-ODD VALUE CONCATENATE MIDDLE RANK IN BETWEEN HALFS29' L5:-( B B)O)/L730' a IF SIZE I$ EVEN CONCATENATE LEFT HALF WITs THE REVERSE OF THE RIGHT31' VAR E9132, -0 S
331 L7:VAR*E,B.(OE1)v
V WIL.WILP NN-N;A ;PT;NN;W-PPNMI a THIS FUNCTION IS A SUBPROGRAM OF TEX WILCOXON SIGNED RANK TEST2, a (WISIG). IT GENERATES THE CUMULATIVE DIST. FOR THE TEST STATISTIC3 a TEX QENERATQR USES A RECURSIVE FORMULA. NN=SAMPLE SIZE.5: NM.(L((+/INN)+2 +1*l"5 N-26, a SET P EQUAL TO PEROB. DIST. WHEN N EQUALS 2.7, P.1401I. L3:N N+".9,. A1. a SET T VECTOR TO PROPER SIZE OF ZEROS..- T+(+/(%N))POIF12 a IF AIN USE TRUNCATED RELATION TO COMPUTE OCCURRENCES.13, Lu:.(AIN)/LI.- a IF A>N USE FULL FORMULA TO COMPUTE OCCURRENCES.15. -(A>N)/L216. a WHILE (A-N) IS NEGATIVE TRUNCATE FORMULA TO AVOID A NEGATIVE INDEX.17, L1:T[A2.EAI18, A.A+I19. OL'4'. * IF A IS LARGER THAN THE LENGTH OF P GO TO L6.
"2 a INCE 'A-,V 3ECOMES POSITIVE: THE .ECURSZ7E FORMULA :AN 3E "YSE:.
25 a ONCE A IS LARGER THAN THE LENGTH OF P TRUNCATE FUlZTION AGAIN.26 L6:TEA.PC (A-N)327 L7:A-A+l29, A IF A AS AN INDEX HAS NOT EXCEEDED N(N+1)/2 GO AGAIN.29 i(A<I t)))/LL&'
.30 z CONVERT T INTO P AND CONCATENATE 1 FOR USE IN NEXT ITERATION OR OUTPUT.31 L:P.((+/(%N) pI32 'PNN+P33 WIL- +\PP) .(2*NN)34 a CHECK IF LENGTH OF INPUT VECTOR EXCEEDS NUMBER OF N'S GENERATED.35 - (NN>N) /L3
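The WILP comments describe a recursive generator that starts from the distribution for N equal to 2, extends it one sample size at a time, and finally divides the cumulative counts by 2 to the power NN. The Python sketch below shows the standard recursion of that kind for the Wilcoxon signed-rank statistic without ties: each count for size k is the count for size k-1 at the same sum plus the count at the sum reduced by k. It starts from the empty sample rather than from N equal to 2, and the names are hypothetical.

```python
from fractions import Fraction


def signed_rank_counts(n):
    """counts[w] = number of subsets of {1, ..., n} whose elements sum to w."""
    counts = [1]  # n = 0: only the empty subset, with sum 0
    for k in range(1, n + 1):
        new = counts + [0] * k  # sums can now reach k higher than before
        # Rank k either carries a negative sign (sum unchanged, already copied)
        # or a positive sign (sum shifted up by k).
        for w, c in enumerate(counts):
            new[w + k] += c
        counts = new
    return counts


def signed_rank_cdf(n):
    """Cumulative probabilities P(W <= w) for w = 0 .. n(n+1)/2."""
    running, cdf = 0, []
    for c in signed_rank_counts(n):
        running += c
        cdf.append(Fraction(running, 2 ** n))
    return cdf
```

For N equal to 3 the counts are 1, 1, 1, 2, 1, 1, 1 over the sums 0 through 6, which sum to 2 cubed.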
APPENDIX G
LISTINGS OF PROGRAMS USED TO GENERATE C.D.F. COMPARISON TABLES
V EjNTEST:.N-ILPRA.-B:AjC;TAU;P;NUM;DEN; Z:ZZ ;ERRZZ ;M: ZZC;D;AA :PP;NUM4C;I;H;F
.:: 5T:5 ?RhP.m ETS EE F ... :M Rs.' ?OR .XEN.LS 3a SET SAM-OLE S5Zf AND ALPhA 7VALVES.
6AP_ S.-''
t121 PS-gC
'ii'~ - COMPUT? CUMUL:vr7 Z:ST AND ASSOCIATED STATS.:1'P-KENrALP NC. THIS LOOP :ALCVLATS ALPYHA 7ALUES AND APP:RXZMAIONS.
~.
26 AA 6G ,N.
PP←PP,P[2;A]
⍝ COMPUTE NORMAL APPROXIMATION.
NUM←3×(TAU×((N×(N-1))*0.5))
DEN←(2×((2×N)+5))*0.5
Z←NUM÷DEN
ZZ←ZZ,(1-NORMCDF Z)
⍝ COMPUTE NORMAL APPROXIMATION WITH CONTINUITY CORRECTION.
NUMC←3×(TAU-AA)×((N×(N-1))*0.5)
ZC←NUMC÷DEN
ZZC←ZZC,(1-NORMCDF ZC)
137 (B<71)/L238 a OMPUTE ERROR DIFFERENCES.39 ErRRZZ*PP-Z Z40 ERRZZC.PP-ZZC41 aPRINT OUT TABLE OF VALUES.42 1-0.14 3 .- 9544 $-'TEST STAT. VALUE453 )#.M.l PROBED Z 13; FOR SAMPLE SIZE EQUAL TO 1,(2 0 NN),'
UG 46 L3[1.7 L5:J-294~48J L3:M-M5513 DAVE (180197) .J,610 (8p197),J3
N~4 -~j*(D:4 43Lp50 H. 1 I8 pS~511 -(Dxz 6*52 F-'E4G I 9.9999 >' OFMT(l 7 oKX)*53 F- 1 62 p?*54 j -H.55) -L7~56 L6:FS-9057j H-17o5
61 ? l Ll(7 5aPP LCI)62 iSF5
64 L7:M-M,EIj FS65 D-D+l66 1-167 -(NI 2N L3D68 N1:s.f!CC.F69 .1-198701 -L371 N2:S.'ERROR; NORMAL72 PP-S-RRZZ73 -L3
S74] N3:S.'ERROR: NORM. w/CC75 PP*ZRRZZC76] -L377] L4:M
7 KWTsS m .A* PNL AN;;;RPV UEVA~ '~YE; ERPE Rl1CDP;6C ;kk ;l£; J; F F; S;:
⍝ THIS PROGRAM GENERATES TABLES OF C.D.F. COMPARISONS FOR THE KRUSKAL-
⍝ WALLIS TEST.
5 A-4 44 46. CC-27 ALPHA- 0.01 0.02 0.03 0.05 0.08 0.13 0.18
.8: -L2
10ALL UWAL TO GENERATE EXACT DISTRIBUTION FOR SELECTED SAMPLE s::Z.f 2 8-0L13 N+/A2 4 CC*CC+1t16 M-~001171 PP00
.20~ ?F-oC21 PFI-oO
1 22
C F P ;
24 LOWCATE POSITION OF EXACT CDP VALUE f. ALPHA.25 D-(+/ (CDFf.ALPRAEBl))26 -(Dzo)1L327 D-128 DETERMINE CORRESPONDING TEST STATISTIC 7ALUEf.29 z3:s-p::;D]30 KK*.KK 331 A RECOAD EXACT VALUE OF CDP.32 PVALUE-P[2-D]33 PP+.PP PVALIJE34 A COMP.TEk CORRSPONDING P-VALUE USING CHISQ APPROX.35 PVAL.(X I1 CHZSQ B36 PC-PC (N-VAL)37 A COPUtzE CORRESPONDING P-VAL51 USING F APPROXIMATION
39 P filp (pN-K)) FOISTY P40 p K F41 A COMPtITE COR.ESPONDINO P-VALUE USING F APPROX W/I1 LESS D.F. IN DENON42 PVF1. (K-1 3(X1-)-I)) FDISTN F43 PPlL(1-'V144 *(BC7)/L745 D-046 ERRS-.RP-PC47 ERRF._PP P48 ERRF1.PP-PFl49 a PRINT OUT TABLE OF VALUES.50 1-..151 '-52 S.'TEST STAT. VALUE53 M+M, P ROB[H Z G AVC463,:]; FOR A GROUP OF 3 SAMPLES CONSISTTNC OF ~
4, AND '(*cc)., obs.54 -,Ls:55: L5:9--*56 LS:M-M EI(17p'-' ),J,61p(Sp'-' ),J.57: .(D=55 1L9*58* F-pQ
,60~ P-.61 C-70
66 * c7)/L1167 *L1268 L10:.C-+169 P~e [(7 5 *KKEC])70 F.f'71 ill~ 3 Z1072 L12:NMM i F73 D-D+174 1-175 -(NI N2 N3,N4 L5)[D]76 Nl:S-rALPEA VALUE77 KK-PP781 .'+'
*o 1.Pia; CTISQOARE'ill;i 3iON; F C~sf
N' -UpjJoit P N/ DF'
V pq~~fjM~A~~j~;P_'!N -PW ;;;;,7NV~.F.E.....RR.RF
A-! P~?M ~ ~ :C U.: ~ i
1-1 074C : 1R M C C . ? : R L N S h:E s M L~
' SUN sMlEs RAK 4FZC SX-
.- M .-. 23i27 it181.1: : i~L:s
* SM EE APS FO LACEf OAMPLE.S
26 E*a':~
360 C C C1C'1 81 L..ff~a. VAsFRM LE TORSO~lv SMALLESTALVE
PIa LJMINEf CM;PNI VALUE CESPOCIN C APEAVALUE
a COI' OR*ESPONC:NC P-VALUE VSS.NG F APPROX W LESS r.F. :N ZENON
'51~ ? iPF: I FIU47T4W~52p @R-LPEA-P
a PR:N. OUT TABLE CF VALUES.
~2,-
p~ st, . -
.6 C:6:,, Li '(
IC' iS .F.FPJ
L I FNij.d RR /
2~ Mi:S-A PE
-8j j C
[so .f:jVOh1 V/IDl
:13 U ?0 00
'NUM :NNo16TTT 7:NOTLPA r
231 ME5.0 TST22 S PTE UMLATSIE ADIS ANDH 7ALSITDSAS
4 a THIS LOOP CCULATS APA VLE S ADA5O MTZ.S
2 -013 2 :6 U' 029 PA k
15 C-0. a2*0.
137 NUCa(..)( oM)2
24 a COTE LOODPN CACLTE APPIATIONSAD PORIATO
.3S0 a CMPTEERORDIFEENES31 ERRZZNOMA.APRXIATON32 gRR~.P- M +2)C
R3CZC.PP-OR ZZ. 4
37 PJmNTO ARE OF_ VALUES.#
110 ]cUC+% 3'% %fW CNOR~f iSUDIN5) APPROXIMATIONJ.'~ .?'~*.f~ ~ ~ ~ **~ p%~j~p~**
60S-$.TST STAT.' VALUE61I NM'I PROStU sq3: FOR SAMPLE SIZES N EQUAL TO ',(2 0 ONN),' AND
62 k*QUAL TO '(2 0 *MM)*
63 fS:J-19464 L3 ~M55)QV(817.,1(p9)66 8-1 Is OS67 -(DUO) 1668 P. 1 30 ZZ9 O FMT(1 7 OEK)169 F. 1 6 OF70 Ff..71 !f78'73
7" ~S.FS.Hf75 C.076 Lg-c'.+,77 F;.1 ( lPC379 , sPPiC)79 r..(C<7 I' L9Fso L7:M.M Li P81. D.D+1.83 -N V2 .N3 ,N4 N5 N6 .N7 LUS)914 N1:S.- 1EXACT .hp85 J-19886 -L387 N2:S'ErRR~oR; NvORMAL
89 -L390 N3:S.&ERROR; NORM. VICC,
g 2 -L3S93 NW4:S.'ERROR; T DIST914 PP.ERRTT95 .L396 NS:S.'ERROR; T V/CC97 P RTa8 -L3
991 N6:S-'ERROR; AVE T/Z10i6 PP.ERRAVEL101) -L3C102~ N7:S.'ErRROR; AVE TC/ZCC103 .iPP.EAATCZCF1014] -L3105~ L4:M
10 1. (NN 9 )/IL
v sIGNTEsT;N;ALPHA;3:A;C:E;P;Z;ZZ:ERHZZ:N:ZZC;D:PP;I:;F:?FS;S;J;ERRZZC;ZC
⍝ THIS PROGRAM GENERATES TABLES OF C.D.F. COMPARISONS FOR THE SIGN TEST.
⍝ SET SAMPLE SIZE AND ALPHA VALUES.
N←2
ALPHA←0.01 0.03 0.06 0.12 0.22 0.35 0.5
⍝ THIS LOOP INCREMENTS SAMPLE SIZE.
L1:PP←⍳0
11 ZZ~pQ012: ZZC-pOO*13. N-N1+114 D,.015. B.016. A COMPUTE CUMULATIVE DIST AND ASSOCIATED STATS.
17 P-BINON Nisa THIS LOOP CALCULATES ALPHA VALUES AND APPORXIXATIONS.19L2:2*8.1
0 -LPHAERI
23 KK-KK,.T214 PPePP,PCA325 a COMPUTE NORMAL.AP3ROXIMATION.26 Z.(-( Q.N)05x (N*0 527 ZZ.ZZ, (NORMCF Z)28 A C3MPUTE N3N L APPROXIMAT~ION WITH CONTINUITI CORRECTION.30 Z14- C NORMCDFZ531 *<'71/t233 aRZ.PZ SMPUTE ERROR DIFFERENCES..34 EPRRZZC*PP-ZZC35 a PRINT OUT TABLE OF VALUES.36 X.0.1
371 Jr-% 95S39 )J.'TfSZ STAT. VALUE393 ., PRODC EL g J; FOR SAME SIZE EQUAL TO ',(2 0 *I),''40) ;L3413 S:J.194842] L:M.m 0l AVE(l8p197).J.61p(8o187),J3843 .D=845/L4I44 s I IS8LRS456 845 .(D: zz P DMT(1 7 pICK)U.7 P 1 62 pP48 FS.H.F'4 9 -0L750 L6: FS-spO51 8*17PS52 FS-FS.H53 C-054 L9:C.C.551 F' +1 (7 5 OPPEC3)56 ps.FS p'57 -(C'c73/L958 L7:M.M,E1J FS59 ! .D+lE60 1-1E61 -(NI N2 N3 L5) ED)~62 N 'SEXACT'C.D F.
65 N2ii4.'RROR; NORMAL 166 RP-'RZZ
L67 -3- 'ERROR. NORM. 'C'69 PP+ERRZZCS70 -L371L4:M)
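SIGNTEST tabulates the exact C.D.F. of the sign test statistic beside a normal approximation, with and without a continuity correction, and prints the error of each. A minimal Python sketch of that comparison for a single entry is given below. It assumes the usual null model (binomial with success probability one half) and the usual correction of adding 0.5 to the statistic; the exact form of the correction in the APL listing is not legible, so that choice is an assumption, as are all names.

```python
from math import comb, erf, sqrt


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def sign_exact_cdf(k, n):
    """Exact P(K <= k) for K ~ Binomial(n, 1/2)."""
    return sum(comb(n, i) for i in range(k + 1)) / 2 ** n


def sign_normal_cdf(k, n, continuity=False):
    """Normal approximation to P(K <= k), optionally continuity corrected."""
    shift = 0.5 if continuity else 0.0
    z = (k + shift - 0.5 * n) / (0.5 * sqrt(n))
    return normal_cdf(z)


def sign_errors(k, n):
    """Return (exact, error without correction, error with correction)."""
    exact = sign_exact_cdf(k, n)
    return (
        exact,
        exact - sign_normal_cdf(k, n),
        exact - sign_normal_cdf(k, n, continuity=True),
    )
```

Calling `sign_errors(3, 10)` returns the exact probability together with the two error terms, mirroring one row of the kind of comparison table this appendix produces.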
7 SPMT'ST-N-ALPVA;B'A-C;AA-P Z-ZZ:fRRZZ:MZZC-D-PP;I;H;;S;S;;?;;T~C;
2: A THIS PROGRAM GENERATES TABLES OF C.D.F. COMPARISONS FOR SPEARMAN'S R.3.84 A SET SAMPLE SIZE AND ALPEA VALUES.5, N-e6, ALPHA* 0.01 0.02 0.03 0.05 0.08 0.13 0.187. a THIS LOOP INCREMENTS SAMPLE SIZE.8, Ll:PPv.pO
10 X0011 FS-po12 ZZ-pO13 ZZC4p 014 72-
16 N1.1+117 D-0
19 AA-i6+NX1I+N*220 a ACOMPUTE CUMULATIVE DIST AND ASSOCIATED STATS.21 P:P.PMATSP N-5-3)22 PC.(P E2:J0/ ;123 a T HIS ploc1 CALCULATES ALPHA VALUES AND APPORXIMATIONS.24 L2:8.8+125 C.APAB
27 RH P1A]28 KK*.r]C.RIO29 PP-PP, F 2e)30 a OMPUTE NORMAL APPROXIMATION.31 Z*RHOX C(N-i )*o.S)32 ZZ.ZZ, (1-NORMCDF Z)33 q :70MPUJT NORMAL APPROX:MATZrON WIZE CONTIJUIT7 :ORRECTION.~3L4 ZC-(RHO-AA4)x ((N-t )*0.352g Zzc-zZC,(-NO'RMCDF zc)36 aCOMPUTE STUDENVT 2' APPROXIMATION37 TT.TT N SPAPROXI RHO38 a CAMPUTE STUDENT T APPROXIMATION mITE CONTIuIT! CORRECTION.39 T'C.T2.C N SPAPROX RHO40 ILB7 /241 aCOMPUTE ERROR DIFFERENCES.42 ERRZ Z.PP-ZZ843 ERRZ ZC.PP-ZZCL4 iRRTC:-P-:Tk
46 a-. PRINT OUT TABLE OF VALUES.848 .7+195849 5.'TB5T STAT. VALUE I,Sol m.M. ,PROBER k fi): FOR SAMPLE SIZE EQUAL TO '.(2 0 *N),,'
11
53 13:N5M ,CIAVC (ISPI97 ),JSIp (0p197).4)54 .*D 6 RL
59 FS.H.7
61. Li:PS-oQ62 H17.S63 PSFR64 C-065 LO:.e66 F.' L' (7 5 vPPCC3)67 FS-FS )
69 L7:.L1 FS701 D-D+171* 1-!72 -(N N2N3 Nit N5,LS)ED4
75: -L376 N:S.' RROR; NORAL I/C77 PNaEPJZZC78 -L382 Nl&:S' ERROR: OM I ,903 PP.ERRzTC
2 5 N5:S-'ERROR; : D/CT863 PP-9RRTC
87) *L398 L4:Mi89 -ZN410)/Ll
VLiEST NiALPHAjB-A C T-P-N WDEPNZJZiR Z DP;?NrH 1 F Smtirc;Tc;N; icNtil; bEmok; DIJONkC:ERRA VEg; EARTT:; EATC E ER±Z;lC.!ER
⍝ THIS PROGRAM GENERATES TABLES OF C.D.F. COMPARISONS FOR THE WILCOXON
⍝ SIGNED-RANK TEST.
⍝ SET SAMPLE SIZE AND ALPHA VALUES.
7) ALPHA- 0.0180.02 0.03 0.05 0.08 0.1 10jill:PP 0 IS LOOP INCREMENTS SAMLESE
11 X..p12 F4$.po13 70 0~p14 1ZCso15 7.16 TTC-1017 NN+121 D.019 B*020 a COMPUTE CUMULATIVE DIST AND ASSOCIATED STATS.21 P.VILP N22 a THIS LOOP CALCULATES ALPHA VALUES AND APPORximATIONs. p23 L2:B.Iei24 C.AjLPjA EN)
e25 A. +/ PsC)26 A27 WixEKT28 PP.PP.PCA)29 A OMPU5E N3RMAL APPROXIMATION. 4
*0 NUM.T-((Nx(N.1 )e&u
:.-Num+DrN
1Z zZ,:PoRMCmP HAL WITH CONTINUITY CORRECTION.34COMPUTE NOR APRXMIN35 NUMC.(T+0.5)-((Nx(. *436 ZC+-NUMC.DEN v37 ZZC.ZZC,NORM5CDF Z39 OF COPTE SCDTNT T APPROXIMATION39 DErNOM.(t(N (DEN*2 +) (N-1 )- ((NUH*2).(N-1)))*.40 T-NUM4 DENOM41 T.TTz jN-) vIsTNTr42 A 6PUE STUDENT T APPROXIMATION WI1TH CONTINUITY CORRECTION. 4
43 NUMT.(NUJ044 DENQ.HC.(I (NX(DEN*2)).(N-1))-((((NUN)+0.5)*2).(N-1)))*0.545 NC.UM; *D Nom DSC C46 TTC4#-TTc( N-1)TSTTC
0-7 (Ic7)/L2 COMPUTE ERROR DIFFERENCES.
AR14 mI'Ci.PP ZZC C42
55a PRINT OUT TABLE OF VALUES.
57 J-19558 S-'TEST STLT. VALUE I
521 )F~jlPROBCVF 1 1(3; FOR SAMPLE SIZE EQUAL TO ',(2 0NW.,
2 3 (:855i GAVU1S8P197),J,61p(8p197),J364 a- 118 0s6~5 -(DbO)/L666 F*'B0<1 ZZ9 >1 GFMTC1 7 oKK)
~67 F 16 pP
F69 -L770LI:FS4.SO
[731 Cq-0
0754 1 q C. (7l aP 9C)
(C ?'''7
L
078 L7:M-M. P: s
80I1J *1 N2 N3,N4 NS.,6N7 L5)[,D]92~ N1i:NS' I EMAT ,C. b.L83 J-19844 *L385 N72S.''ROR NORMAL I
87 -L388 NjSIRO;NORM. V/CC'99 P-5RZ90 -L391 114:S-'ERROR; r DIST92 PP.-iERRTT
93 *L394 #5 :S.'ERROR; 7 Jv/CC I95 P-ERRTC96 -Z397 N9 :S.'EPRRQR; AVE T/Z Igo PP.ERRA V999 -L3
10 !N7S.'RROR; AVE TC/ZCI101 PPRRTCZC1102 -L3103 JL4:N
INITIAL DISTRIBUTION LIST
No. Copies
1. Defense Technical Information Center                      2
   Cameron Station
   Alexandria, Virginia 22304-6145

2. Library, Code 0142                                        2
   Naval Postgraduate School
   Monterey, California 93943-5002

3. Superintendent, Code 53Jy                                 1
   Attn: Prof. T. Jayachandran
   Naval Postgraduate School
   Monterey, California 93943

4. Superintendent, Code 55Bm                                 1
   Attn: Prof. D. Barr
   Naval Postgraduate School
   Monterey, California 93943

5. Superintendent, Code 55La
   Attn: Prof. H. Larson
   Naval Postgraduate School
   Monterey, California 93943

6. Superintendent, Code 55Re                                 1
   Attn: Prof. R. Read
   Naval Postgraduate School
   Monterey, California 93943

7. Superintendent, Code 55Rh                                 1
   Attn: Prof. R. Richards
   Naval Postgraduate School
   Monterey, California 93943

8. Superintendent, Code 55Wd                                 1
   Attn: Prof. K. Wood
   Naval Postgraduate School
   Monterey, California 93943

9. Commanding Officer                                        2
   Naval Special Warfare Unit - Four
   Box 3400
   FPO Miami, Florida 34051