Tutorials in Quantitative Methods for Psychology
2008, vol. 4(1), p. 13-20.
The Mann-Whitney U: A Test for Assessing Whether Two Independent Samples Come from the Same Distribution
Nadim Nachar
Université de Montréal
It is often difficult, particularly when conducting research in psychology, to have access to large normally distributed samples. Fortunately, there are statistical tests to compare two independent groups that do not require large normally distributed samples. The Mann-Whitney U is one of these tests. In the following work, a summary of this test is presented. The logic underlying this test and its application are explained. Moreover, the strengths and weaknesses of the Mann-Whitney U are mentioned. One major limit of the Mann-Whitney U is that the type I error or alpha ($\alpha$) is amplified in a situation of heteroscedasticity.
It is generally recognized that psychological studies often involve small samples. For example, researchers in clinical psychology often have to deal with small samples that generally include fewer than 15 participants (Kazdin, 2003; Shapiro & Shapiro, 1983; Kraemer, 1981; Kazdin, 1986). Although researchers aim at collecting large normally distributed samples, they rarely have the appropriate amount of resources (time and money) to recruit a sufficient number of participants. It is thus useful, particularly in psychology, to consider tests that have few constraints and allow experimenters to test their hypotheses on small and poorly distributed samples.
Many studies do not provide very good tests of their hypotheses because their samples have too few participants (for a review of the reviews, see Sedlmeier & Gigerenzer, 1989). Even though small samples can be methodologically questionable (e.g. generalization is difficult), they can be useful to infer conclusions about the population if an adequate statistical test is applied.
One can imagine a situation where a scientist has two groups of subjects but has only very few participants in each group (fewer than eight participants). This researcher cannot affirm that his two groups come from a normal distribution because they include too few participants (Mann and Whitney, 1947). In addition to this statistical "constraint", the data of the research conducted by this experimenter are of continuous or ordinal type. This implies that his measurements can be lacking in precision. In such a case, this researcher cannot rely on the parametric test of means using Student's t distribution because it is impossible to check that the two samples are normally distributed. How can one react in such a situation? Initially, a statistical test of nonparametric type imposes itself for this researcher (a nonparametric test is appropriate, for instance, when the distribution is asymmetrical).
Nonparametric tests differ from parametric tests in that the model structure is not specified a priori but determined from the data. The term nonparametric is not meant to imply that such models completely lack parameters but that the number and nature of the parameters are flexible and not fixed in advance. Therefore, nonparametric tests are also called distribution free. The Mann-Whitney U test can be used to answer the questions of the researcher concerning the difference between his groups. This test has the great advantage of being usable with small samples of subjects (five to 20 participants). It can also be used when the measured variables are of ordinal type and were recorded with an arbitrary and not very precise scale.
In the field of behavioural sciences, the Mann-Whitney U test is one of the most commonly used nonparametric statistical tests (Kasuya, 2001). This test was independently worked out by Mann and Whitney (1947) and Wilcoxon (1945). This method is thus often called the Wilcoxon-Mann-Whitney test or the Wilcoxon sum of ranks test.
In the following text, a brief summary of the Mann and Whitney method is presented, covering the underlying logic of this test, an example of its application and the use of SPSS for its calculation. Lastly, some strengths and limits of the test are reported.
1. The Mann-Whitney U Test
1.1. Hypotheses of the Test
The Mann-Whitney U test null hypothesis ($H_0$) stipulates that the two groups come from the same population. In other terms, it stipulates that the two independent groups are homogeneous and have the same distribution. The two variables corresponding to the two groups, represented by two continuous cumulative distributions, are then called stochastically equal.
If a two-sided or two-tailed test is required, the alternative hypothesis ($H_1$) against which the null hypothesis is tested stipulates that the first group data distribution differs from the second group data distribution. In this case, the null hypothesis is rejected for values of the test statistic falling into either tail of its sampling distribution (see Figure 1 for a visual illustration). On the other hand, if a one-sided or one-tailed test is required, the alternative hypothesis suggests that the variable of one group is stochastically larger than the other group, according to the test direction (positive or negative). Here, the null hypothesis is rejected only for values of the test statistic falling into one specified tail of its sampling distribution (see Figure 1 for a visual illustration).
In more specific terms, let one imagine two independent groups that have to be compared. Each group contains a number n of observations. The Mann-Whitney test is based on the comparison of each observation from the first group with each observation from the second group. According to this, the data must be sorted in ascending order. The data from each group are then individually compared together. The highest number of possible paired comparisons is thus $n_x n_y$, where $n_x$ is the number of observations in the first group and $n_y$ the number of observations in the second.
If the two groups come from the same population, as stipulated by the null hypothesis, each datum of the first group will have an equal chance of being larger or smaller than each datum of the second group, that is to say a probability p of one half (1/2). In technical terms,

$$H_0: p(x_i > y_j) = 1/2 \quad \text{and} \quad H_1: p(x_i > y_j) \neq 1/2 \quad \text{(two-tailed test)}$$

where $x_i$ is an observation of the first sample and $y_j$ is an observation of the second.
The null hypothesis is rejected if one group is significantly larger than the other group, without specifying the direction of this difference.
In a one-tailed application of the test, the null hypothesis remains the same. However, the alternative hypothesis changes by specifying the direction of the comparison. This relation can be expressed mathematically:

$$H_0: p(x_i > y_j) = 1/2 \quad \text{and} \quad H_1: p(x_i > y_j) > 1/2.$$

This alternative hypothesis implies that the dependent variable measurements of the first group are significantly larger than those of the second. Note that the groups can be interchanged, in which case the alternative hypothesis corresponds to:

$$H_1: p(x_i > y_j) < 1/2.$$

Figure 1. Normal distributions illustrating one-tailed and two-tailed tests

Table 1. Numbers of social phobia symptoms after the therapy

| Behavioral therapy (B) | Combined therapy (C) |
|---|---|
| 3 | 1 |
| 3 | 1 |
| 4 | 2 |
| 4 | 2 |
| 7 | 5 |
| 7 | 5 |
| 7 | 5 |

The data of the table are fictitious.
The hypotheses previously quoted can also be expressed in terms of medians. The null hypothesis states that the medians of the two respective groups are not different. As for the alternative hypothesis, it affirms that one median is larger than the other or quite simply that the two medians differ. More explicitly, the hypotheses respectively correspond to:

$$H_0: \theta_x = \theta_y, \quad H_1: \theta_x < \theta_y \text{ or } \theta_x > \theta_y \quad \text{(one-tailed test)}$$

$$H_0: \theta_x = \theta_y, \quad H_1: \theta_x \neq \theta_y \quad \text{(two-tailed test)}$$

where $\theta_x$ corresponds to the median of the first group and $\theta_y$ corresponds to the median of the second group.
Therefore, if the null hypothesis is not rejected, it means that the medians of the two groups of observations are similar. On the contrary, if the two medians differ, the null hypothesis is rejected. The two groups are then considered as coming from two different populations.
1.2. Assumptions of the Test
In order to verify the hypotheses, the sample must meet certain conditions. These conditions can be easily respected. They are of three types:
(a) The two investigated groups must be randomly drawn from the target population. The concept of random implies the absence of measurement and sampling errors (Robert et al., 1988). Note that an error of these last types can be involved but must remain small.
(b) Each measurement or observation must correspond to a different participant. In statistical terms, there is independence within groups and mutual independence between groups.
(c) The data measurement scale is of ordinal or continuous type. The observations' values are then of ordinal, relative or absolute scale type.
1.3. The Test
The Mann-Whitney U test first requires the calculation of a U statistic for each group. These statistics have a known distribution under the null hypothesis, identified by Mann and Whitney (1947) (see Tables 3 to 8).
Mathematically, the Mann-Whitney U statistics are defined by the following, for each group:

$$U_x = n_x n_y + \frac{n_x (n_x + 1)}{2} - R_x \qquad (1)$$

$$U_y = n_x n_y + \frac{n_y (n_y + 1)}{2} - R_y \qquad (2)$$

where $n_x$ is the number of observations or participants in the first group, $n_y$ is the number of observations or participants in the second group, $R_x$ is the sum of the ranks assigned to the first group and $R_y$ is the sum of the ranks assigned to the second group.
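As a minimal sketch of equations (1) and (2), the Python functions below compute U from a rank sum and, as a cross-check, by directly counting the pairs in which an observation of one group precedes an observation of the other. The function names and the five data values are invented for illustration and do not come from the article; the sketch assumes continuous data without ties.

```python
def rank_sums(x, y):
    """Return (R_x, R_y): sums of the 1-based ranks of each group in the pooled, sorted data.

    Assumes no tied values (the continuity assumption of the test).
    """
    pooled = sorted([(v, "x") for v in x] + [(v, "y") for v in y])
    r_x = sum(rank for rank, (_, g) in enumerate(pooled, start=1) if g == "x")
    r_y = sum(rank for rank, (_, g) in enumerate(pooled, start=1) if g == "y")
    return r_x, r_y


def u_from_rank_sum(n_own, n_other, r_own):
    """Equations (1) and (2): U = n_x * n_y + n_own * (n_own + 1) / 2 - R_own."""
    return n_own * n_other + n_own * (n_own + 1) // 2 - r_own


def u_by_counting(own, other):
    """Count the pairs in which a value of `own` is smaller than a value of `other`."""
    return sum(1 for a in own for b in other if a < b)


# Hypothetical illustrative data (not from the article).
x = [1.2, 3.4, 5.6]
y = [4.5, 7.3]
r_x, r_y = rank_sums(x, y)
u_x = u_from_rank_sum(len(x), len(y), r_x)
u_y = u_from_rank_sum(len(y), len(x), r_y)
print(u_x, u_y)
```

With this sign convention, the U of a group equals the number of pairs in which one of its observations comes before an observation of the other group, so the two routes should agree whenever there are no ties.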
In other words, both U equations can be understood as the number of times observations in one sample precede or follow observations in the other sample when all the scores are pooled and placed in ascending order (see the Procedure and Application section for further information). In this respect, note that the order in which the data are arranged is unique when the measurement scale is of continuous type. Following the assumption of continuity, two observations cannot take the same value.

Table 3. Probability of Obtaining a U not Larger than that Tabulated in Comparing Two Samples when $n_x = 3$

| U | $n_y = 1$ | $n_y = 2$ | $n_y = 3$ |
|---|---|---|---|
| 0 | .250 | .100 | .050 |
| 1 | .500 | .200 | .100 |
| 2 | .750 | .400 | .200 |
| 3 | | .600 | .350 |
| 4 | | | .500 |
| 5 | | | .650 |

Table 4. Probability of Obtaining a U not Larger than that Tabulated in Comparing Two Samples when $n_x = 4$

| U | $n_y = 1$ | $n_y = 2$ | $n_y = 3$ | $n_y = 4$ |
|---|---|---|---|---|
| 0 | .200 | .067 | .028 | .014 |
| 1 | .400 | .133 | .057 | .029 |
| 2 | .600 | .267 | .114 | .057 |
| 3 | | .400 | .200 | .100 |
| 4 | | .600 | .314 | .171 |
| 5 | | | .429 | .243 |
| 6 | | | .571 | .343 |
| 7 | | | | .443 |
| 8 | | | | .557 |

Table 2. Numbers of social phobia symptoms after the therapy and their ranks

| Numbers of symptoms | 1 | 1 | 2 | 2 | 3 | 3 | 4 | 4 | 5 | 5 | 5 | 7 | 7 | 7 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Behavioral therapy (b) / Combined therapy (c) | c | c | c | c | b | b | b | b | c | c | c | b | b | b |
| Rank | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |

The data of the table are fictitious.
Following the calculation of the U statistics and the determination of an appropriate statistical threshold ($\alpha$), the null hypothesis can be rejected or not. In other words, $H_0$ is rejected if, by consulting the Mann and Whitney tables, the p corresponding to $\min(U_x, U_y)$ (the smallest of both U calculated) is smaller than the predetermined $\alpha$ threshold. In technical terms,

Reject $H_0$ if p of $\min(U_x, U_y) < \alpha$ threshold.
1.4. Normal approximation
If the numbers of observations $n_x$ and $n_y$ are larger than eight, a normal approximation, as shown by Mann and Whitney (1947), can be used, that is to say:

$$\mu_U = \frac{n_x n_y}{2} = \frac{U_x + U_y}{2} \quad \text{and} \quad \sigma_U = \sqrt{\frac{n_x n_y (N + 1)}{12}}$$

where $N = n_x + n_y$, $\mu_U$ corresponds to the average of the U distribution and $\sigma_U$ corresponds to its standard deviation.
If each group includes more than eight observations, the sampling distribution of U gradually approaches a normal distribution. If a normal approximation has to be used, the corresponding equation becomes:

$$z = \frac{U - n_x n_y / 2}{\sigma_U}$$

and the test statistic becomes, in absolute values:

$$|z| = \frac{|U_{x \text{ or } y} - \mu_U|}{\sigma_U}.$$

To test the difference between $U_{x \text{ or } y}$ and $\mu_U$, the reader can refer to the z table. If the absolute value of the calculated z is larger than or equal to the tabulated z value, the null hypothesis is rejected.

Reject $H_0$ if $|z_{\text{calculated}}| \geq |z_{\text{tabulated}}|$.
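The normal approximation above can be sketched in a few lines of Python using only the standard library. The function name and the example values (two groups of 10 participants and a smallest U of 23) are hypothetical and chosen only for illustration; no continuity or tie correction is applied in this sketch.

```python
from math import sqrt
from statistics import NormalDist


def normal_approximation(u, n_x, n_y):
    """Return (z, one_tailed_p) for a U statistic using mu_U and sigma_U as defined above.

    Assumption of this sketch: no continuity correction and no correction for ties.
    """
    n_total = n_x + n_y
    mu_u = n_x * n_y / 2
    sigma_u = sqrt(n_x * n_y * (n_total + 1) / 12)
    z = (u - mu_u) / sigma_u
    one_tailed_p = NormalDist().cdf(-abs(z))
    return z, one_tailed_p


# Hypothetical example: n_x = n_y = 10 and min(U_x, U_y) = 23.
z, p_one = normal_approximation(23, 10, 10)
z_tabulated = NormalDist().inv_cdf(0.975)  # two-tailed threshold for alpha = 0.05
print(round(z, 3), abs(z) >= z_tabulated)
```

Comparing `abs(z)` with the tabulated value mirrors the decision rule stated above; doubling the one-tailed probability gives the corresponding two-tailed p.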
1.5. Ties (equalities)
Following the Mann-Whitney U test assumption of continuity, the arrangement of the data must be unique. This postulate implies that two values cannot be exactly equal (the chances are one out of infinity). However, equal measurements are often observed in the behavioural sciences because measurements are rarely very precise. In the case of equalities, it is necessary to calculate both U by allocating half of the tied ranks (ties) to the first group's values and the other half to the second group's values. It is as if each observation were given the average of the ranks it would have received had no equality existed. Note that when ties occur within a group, this type of equality does not need to be considered in the calculation presented here. Indeed, it is the equalities between the two groups that deserve attention. In short, in the event of ties, assign the rank of half of the observations to the first group and the other half to the second.

Table 5. Probability of Obtaining a U not Larger than that Tabulated in Comparing Two Samples when $n_x = 5$

| U | $n_y = 1$ | $n_y = 2$ | $n_y = 3$ | $n_y = 4$ | $n_y = 5$ |
|---|---|---|---|---|---|
| 0 | .167 | .047 | .018 | .008 | .004 |
| 1 | .333 | .095 | .036 | .016 | .008 |
| 2 | .500 | .190 | .071 | .032 | .016 |
| 3 | .667 | .286 | .125 | .056 | .028 |
| 4 | | .429 | .196 | .095 | .048 |
| 5 | | .571 | .286 | .143 | .075 |
| 6 | | | .393 | .206 | .111 |
| 7 | | | .500 | .278 | .155 |
| 8 | | | .607 | .365 | .210 |
| 9 | | | | .452 | .274 |
| 10 | | | | .548 | .345 |
| 11 | | | | | .421 |
| 12 | | | | | .500 |
| 13 | | | | | .579 |

Table 6. Probability of Obtaining a U not Larger than that Tabulated in Comparing Two Samples when $n_x = 6$

| U | $n_y = 1$ | $n_y = 2$ | $n_y = 3$ | $n_y = 4$ | $n_y = 5$ | $n_y = 6$ |
|---|---|---|---|---|---|---|
| 0 | .143 | .036 | .012 | .005 | .002 | .001 |
| 1 | .286 | .071 | .024 | .010 | .004 | .002 |
| 2 | .428 | .143 | .048 | .019 | .009 | .004 |
| 3 | .571 | .214 | .083 | .033 | .015 | .008 |
| 4 | | .321 | .131 | .057 | .026 | .013 |
| 5 | | .429 | .190 | .086 | .041 | .021 |
| 6 | | .571 | .274 | .129 | .063 | .032 |
| 7 | | | .357 | .176 | .089 | .047 |
| 8 | | | .452 | .238 | .123 | .066 |
| 9 | | | .548 | .305 | .165 | .090 |
| 10 | | | | .381 | .214 | .120 |
| 11 | | | | .457 | .268 | .155 |
| 12 | | | | .545 | .331 | .197 |
| 13 | | | | | .396 | .242 |
| 14 | | | | | .465 | .294 |
| 15 | | | | | .535 | .350 |
| 16 | | | | | | .409 |
| 17 | | | | | | .469 |
| 18 | | | | | | .531 |
In a situation of ties between the groups, the normal approximation must be used with an adjustment to the standard deviation. The standard deviation, or the square root of the variance, becomes:

$$\sigma_U = \sqrt{\frac{n_x n_y}{N (N - 1)} \left( \frac{N^3 - N}{12} - \sum_{j=1}^{g} \frac{t_j^3 - t_j}{12} \right)}$$

where $N = n_x + n_y$, g = number of groups of ties and $t_j$ = number of equal ranks (tied observations) in the j-th group of ties.
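The handling of ties can be sketched as follows: one helper assigns to each observation the average of the ranks it would otherwise occupy, and a second evaluates the adjusted standard deviation. Both function names and the small data set are hypothetical, and $t_j$ is read here as the number of observations sharing the j-th tied value, which is an interpretation of the formula rather than something the article states in these words.

```python
from collections import Counter
from math import sqrt


def midranks(values):
    """Return the rank of each value (in input order), giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def tie_corrected_sigma(n_x, n_y, pooled_values):
    """Standard deviation of U adjusted for ties (assumed reading: t_j = size of each tie group)."""
    n_total = n_x + n_y
    tie_sizes = [t for t in Counter(pooled_values).values() if t > 1]
    correction = sum((t ** 3 - t) / 12 for t in tie_sizes)
    return sqrt(n_x * n_y / (n_total * (n_total - 1)) * ((n_total ** 3 - n_total) / 12 - correction))


# Hypothetical data containing ties within and between the groups.
x = [3, 5, 5, 8]
y = [5, 6, 8, 9]
pooled = x + y
ranks = midranks(pooled)
sigma_ties = tie_corrected_sigma(len(x), len(y), pooled)
sigma_no_ties = sqrt(len(x) * len(y) * (len(pooled) + 1) / 12)
print(ranks, round(sigma_ties, 4), round(sigma_no_ties, 4))
```

When no value is repeated the correction term vanishes and the adjusted formula reduces to the standard deviation of section 1.4, since $N^3 - N = N(N - 1)(N + 1)$.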
2. Procedure and Application
Let us take a fictitious example of application of the U test. An experimenter read that there is an antibiotic, often tested and well documented, that is known to help information storage in memory. This experimenter also knows through scientific reports and guidelines that behavioral therapy has an established efficacy for the treatment of social phobia (APA, 1998; INSERM, 2004; BPSCORE, 2001). In addition, he knows that behavioral therapy requires the learning of new behaviours, which implies information storage.
The number of symptoms of social phobia after two types of therapy was investigated. Two groups of individuals with social phobia were compared. The first group received the behavioral therapy; the second group received the behavioral therapy combined with the antibiotic. After each therapy, both groups showed a decrease in the number of symptoms of social phobia. The number of these symptoms was measured and a test was run to decide whether the combined therapy had more effect on the symptoms than the behavioral therapy alone.
In other terms, the experimenter wishes to compare two random variables having continuous cumulative distribution functions. He wishes to test the hypothesis that his variables are stochastically equal (their distributions are similar) against the alternative that C is stochastically smaller than B. C corresponds to the numbers of symptoms under investigation in the combined therapy group, B corresponds to the numbers of symptoms in the behavioral therapy group.
Unfortunately, the number of subjects with social phobia is small.
Moreover, nothing indicates that the symptomatology is normally distributed amongst the individuals with social phobia. Hence, the Mann-Whitney U test is the appropriate test here. Table 1 shows the results of the experiment.
First, organize the data of both groups in ascending order irrespective of group membership. Be aware that a value of -20 is ordered, on an increasing scale, before a value of -10. See Table 2 for a visual illustration.
Both U statistics can be computed using equations (1) and (2).
Note that the sum of ranks of the two groups is always $R_x + R_y$:

$$R_x + R_y = 1 + 2 + 3 + \dots + N = \frac{N (N + 1)}{2}$$

where $N = n_x + n_y$, so that

$$R_x + R_y = \frac{(n_x + n_y)(n_x + n_y + 1)}{2} = \frac{N (N + 1)}{2} \qquad (3)$$

In this way, one can deduce, starting from equations (1) and (2), that:

$$R_x = n_x n_y + \frac{n_x (n_x + 1)}{2} - U_x$$

$$R_y = n_x n_y + \frac{n_y (n_y + 1)}{2} - U_y$$

Inserting these two preceding equations in equation (3):
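Table 7. Probability of Obtaining a U not Larger than that Tabulated in Comparing Two Samples when $n_x = 7$

| U | $n_y = 1$ | $n_y = 2$ | $n_y = 3$ | $n_y = 4$ | $n_y = 5$ | $n_y = 6$ | $n_y = 7$ |
|---|---|---|---|---|---|---|---|
| 0 | .125 | .028 | .008 | .003 | .001 | .001 | .000 |
| 1 | .250 | .056 | .017 | .006 | .003 | .001 | .001 |
| 2 | .375 | .111 | .033 | .012 | .005 | .002 | .001 |
| 3 | .500 | .167 | .058 | .021 | .009 | .004 | .002 |
| 4 | .625 | .250 | .092 | .036 | .015 | .007 | .003 |
| 5 | | .333 | .133 | .055 | .024 | .011 | .006 |
| 6 | | .444 | .192 | .082 | .037 | .017 | .009 |
| 7 | | .556 | .258 | .115 | .053 | .026 | .013 |
| 8 | | | .333 | .158 | .074 | .037 | .019 |
| 9 | | | .417 | .206 | .101 | .051 | .027 |
| 10 | | | .500 | .264 | .134 | .069 | .036 |
| 11 | | | .583 | .324 | .172 | .090 | .049 |
| 12 | | | | .394 | .216 | .117 | .064 |
| 13 | | | | .464 | .265 | .147 | .082 |
| 14 | | | | .538 | .319 | .183 | .104 |
| 15 | | | | | .378 | .223 | .130 |
| 16 | | | | | .438 | .267 | .159 |
| 17 | | | | | .500 | .314 | .191 |
| 18 | | | | | .562 | .365 | .228 |
| 19 | | | | | | .418 | .267 |
| 20 | | | | | | .473 | .310 |
| 21 | | | | | | .527 | .355 |
| 22 | | | | | | | .402 |
| 23 | | | | | | | .451 |
| 24 | | | | | | | .500 |
| 25 | | | | | | | .549 |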
$$R_x + R_y = n_x n_y + \frac{n_x (n_x + 1)}{2} - U_x + n_x n_y + \frac{n_y (n_y + 1)}{2} - U_y = \frac{(n_x + n_y)(n_x + n_y + 1)}{2}$$

so that

$$n_x n_y + \frac{n_x (n_x + 1)}{2} + n_x n_y + \frac{n_y (n_y + 1)}{2} - \frac{(n_x + n_y)(n_x + n_y + 1)}{2} = U_x + U_y$$

then:

$$U_x + U_y = n_x n_y$$

The sum of $U_x$ and $U_y$ is thus equal to the product of the two sample sizes. Consequently, once one value of U is obtained using equation (1) or (2), the value of the other U is found by subtracting the value of the first U from the product of the two sample sizes. Thus:

$$U_x = n_x n_y - U_y \quad \text{or} \quad U_y = n_x n_y - U_x$$

This last equation can save a considerable amount of time.
Second, the researcher must calculate the U statistic corresponding to each group using equation (1) or (2):

$$U_B = n_B n_C + \frac{n_B (n_B + 1)}{2} - R_B = 7 \times 7 + \frac{7 (7 + 1)}{2} - (5 + 6 + 7 + 8 + 12 + 13 + 14) = 49 + 28 - 65 = 12$$

so that $U_C = 7 \times 7 - 12 = 37$. Alternatively:

$$U_C = n_B n_C + \frac{n_C (n_C + 1)}{2} - R_C = 7 \times 7 + \frac{7 (7 + 1)}{2} - (1 + 2 + 3 + 4 + 9 + 10 + 11) = 49 + 28 - 40 = 37$$

where $n_B$ is the number of participants in the behavioral therapy group, $n_C$ is the number of participants in the combined therapy group, $R_B$ is the sum of the ranks assigned to the behavioral therapy group and $R_C$ is the sum of the ranks assigned to the combined therapy group.
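The arithmetic of this worked example can be checked with a short Python sketch that starts from the raw values of Table 1. The variable names are chosen for illustration only. Sequential ranks are used, as in Table 2, which leaves the rank sums unchanged here because the only ties occur within a group.

```python
# Data of Table 1 (fictitious data from the article).
behavioral = [3, 3, 4, 4, 7, 7, 7]   # group B
combined = [1, 1, 2, 2, 5, 5, 5]     # group C

# Pool both groups, sort in ascending order and assign ranks 1..14 (as in Table 2).
pooled = sorted([(v, "B") for v in behavioral] + [(v, "C") for v in combined])
r_b = sum(rank for rank, (_, g) in enumerate(pooled, start=1) if g == "B")
r_c = sum(rank for rank, (_, g) in enumerate(pooled, start=1) if g == "C")

n_b, n_c = len(behavioral), len(combined)

# Equation (1) for group B, then the shortcut U_C = n_B * n_C - U_B.
u_b = n_b * n_c + n_b * (n_b + 1) // 2 - r_b
u_c = n_b * n_c - u_b

# Equation (2) applied directly to group C, as a cross-check of the shortcut.
u_c_direct = n_b * n_c + n_c * (n_c + 1) // 2 - r_c

print(r_b, r_c, u_b, u_c, min(u_b, u_c))
```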
Based on this method, the experimenter can formulate the null hypothesis differently: $U_B$ does not differ significantly from $U_C$.
Third, compute the global U statistic in this way: $\min(U_x, U_y)$ (choose the smallest value of both U statistics calculated). With the Mann and Whitney tables (1947), the probability of obtaining a U value that is not larger than the one calculated above can be obtained. To find this probability in the Mann and Whitney tables, the following information is required: the value of $\min(U_x, U_y)$, $n_x$ and $n_y$. If the probability of obtaining such a U is smaller than the predetermined alpha ($\alpha$) threshold, the null hypothesis is rejected. If it is a one-tailed test, the value found in the Mann and Whitney tables corresponds to the p value (the probability of obtaining a U this small or smaller when $H_0$ is "true"), which will be compared with the predetermined alpha ($\alpha$) threshold of statistical significance. On the other hand, if it is a two-tailed test, it is necessary to double this probability to obtain the one that will be compared with the predetermined alpha ($\alpha$) threshold of statistical significance.
Referring to Table 7, the smallest U is in this case 12 and corresponds to a p of 0.064 (see also the SPSS section). Thus, this p is not smaller than a predetermined $\alpha$ of, for example, 0.05. The researcher does not reject $H_0$ and concludes that the two groups are not significantly different.
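For small samples, the tabulated probability can also be recovered by brute force. The sketch below, whose function name is hypothetical, enumerates every way in which the $n_x$ ranks of one group can fall among the $N$ pooled ranks and counts the arrangements whose U does not exceed the observed one. It assumes no ties and is only practical for small groups, because the number of arrangements grows quickly.

```python
from itertools import combinations
from math import comb


def exact_lower_tail(u_obs, n_x, n_y):
    """P(U <= u_obs) under H0, by enumerating all equally likely rank assignments (no ties)."""
    n_total = n_x + n_y
    offset = n_x * n_y + n_x * (n_x + 1) // 2   # U_x = offset - R_x, equation (1)
    hits = sum(
        1
        for ranks in combinations(range(1, n_total + 1), n_x)
        if offset - sum(ranks) <= u_obs
    )
    return hits / comb(n_total, n_x)


# Worked example of the article: smallest U = 12 with seven participants per group.
p_one_tailed = exact_lower_tail(12, 7, 7)
p_two_tailed = min(1.0, 2 * p_one_tailed)   # doubling rule for a two-tailed test
print(round(p_one_tailed, 3), round(p_two_tailed, 3))
```

For the worked example this enumeration covers 3432 arrangements and gives the one-tailed probability reported in Table 7.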
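Table 8. Probability of Obtaining a U not Larger than that Tabulated in Comparing Two Samples when $n_x = 8$

| U | $n_y = 1$ | $n_y = 2$ | $n_y = 3$ | $n_y = 4$ | $n_y = 5$ | $n_y = 6$ | $n_y = 7$ | $n_y = 8$ | normal |
|---|---|---|---|---|---|---|---|---|---|
| 0 | .111 | .022 | .006 | .002 | .001 | .000 | .000 | .000 | .001 |
| 1 | .222 | .044 | .012 | .004 | .002 | .001 | .000 | .000 | .001 |
| 2 | .333 | .089 | .024 | .008 | .003 | .001 | .001 | .000 | .001 |
| 3 | .444 | .133 | .042 | .014 | .005 | .002 | .001 | .001 | .001 |
| 4 | .556 | .200 | .067 | .024 | .009 | .004 | .002 | .001 | .002 |
| 5 | | .267 | .097 | .036 | .015 | .006 | .003 | .001 | .003 |
| 6 | | .356 | .139 | .055 | .023 | .010 | .005 | .002 | .004 |
| 7 | | .444 | .188 | .077 | .033 | .015 | .007 | .003 | .005 |
| 8 | | .556 | .248 | .107 | .047 | .021 | .010 | .005 | .007 |
| 9 | | | .315 | .141 | .064 | .030 | .014 | .007 | .009 |
| 10 | | | .387 | .184 | .085 | .041 | .020 | .010 | .012 |
| 11 | | | .461 | .230 | .111 | .054 | .027 | .014 | .016 |
| 12 | | | .539 | .285 | .142 | .071 | .036 | .019 | .020 |
| 13 | | | | .341 | .177 | .091 | .047 | .025 | .026 |
| 14 | | | | .404 | .217 | .114 | .060 | .032 | .033 |
| 15 | | | | .467 | .262 | .141 | .076 | .041 | .041 |
| 16 | | | | .533 | .311 | .172 | .095 | .052 | .052 |
| 17 | | | | | .362 | .207 | .116 | .065 | .064 |
| 18 | | | | | .416 | .245 | .140 | .080 | .078 |
| 19 | | | | | .472 | .286 | .168 | .097 | .094 |
| 20 | | | | | .528 | .331 | .198 | .117 | .113 |
| 21 | | | | | | .377 | .232 | .139 | .135 |
| 22 | | | | | | .426 | .268 | .164 | .159 |
| 23 | | | | | | .475 | .306 | .191 | .185 |
| 24 | | | | | | .525 | .347 | .221 | .215 |
| 25 | | | | | | | .389 | .253 | .247 |
| 26 | | | | | | | .433 | .287 | .282 |
| 27 | | | | | | | .478 | .323 | .318 |
| 28 | | | | | | | .522 | .360 | .356 |
| 29 | | | | | | | | .399 | .396 |
| 30 | | | | | | | | .439 | .437 |
| 31 | | | | | | | | .480 | .481 |
| 32 | | | | | | | | .520 | |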
3. Computing the Mann-Whitney U test using SPSS
First of all, one needs to enter the data in SPSS, not forgetting the golden rule which stipulates that each participant's observation must occupy a line. The numbers of the groups are generally 1 and 2, except whenever it is more practical to use other numbers.
Following the entry of the data, open a new syntax window and enter the following syntax.
NPAR TESTS
/M-W= name of the dependent variable column BY
name of the independent variable column (1 2)
/STATISTICS= DESCRIPTIVES QUARTILES
/MISSING ANALYSIS. or /MISSING LISTWISE.
The last line of the preceding syntax corresponds to two options that manage the missing values. These options are useful when more than one statistical test is specified in the syntax window. The first option is /MISSING ANALYSIS, with which each test is evaluated separately for missing values. On the other hand, with the option /MISSING LISTWISE, each case with an empty cell or missing value, for any variable, is excluded from all analyses. The option that one will choose depends on the other tests that one needs to apply. If only the Mann-Whitney statistical test has to be carried out, the missing values will be managed in the same manner, no matter which of the two options is selected.
Following the syntax execution, the results appear in tables in the Output window. Initially, descriptive data like the group averages, their standard deviations, the minimal and maximal values, the quartiles and the number of participants in each group appear. Thereafter, the test results appear in two distinct tables. In the first table, alongside the Ranks, the Mean Rank and the Sum of Ranks, the N corresponds to the number of observations or participants. In the second table, the test results appear. SPSS automatically provides the Mann-Whitney U, the Wilcoxon W and the Z results. The program also returns the asymptotic significance, or the level of significance based on the normal distribution of the test statistic: Asymp. Sig. (2-tailed). In a general way, a value lower than the statistical threshold is considered significant and the null hypothesis is rejected in favour of the alternative hypothesis.
The asymptotic significance is based on the assumption that the data sample is large. If the data sample is small or badly distributed, the asymptotic significance is not in general a good indication of the significance. In this case, the level of significance based on the exact distribution of the test statistic, or Exact Sig. [2*(1-tailed Sig.)], corresponds to the statistic of decision. Consequently, one should use this value when the sample is small, sparse, contains many ties, is badly balanced or does not seem to be normally distributed. SPSS thus provides the exact value of p (Exact Sig. [2*(1-tailed Sig.)]) and the value of p based on a normal approximation (Asymp. Sig. (2-tailed)). If a normal distribution is adequate for the studied case, the two values should be roughly or exactly equivalent. Note that Asymp. Sig. (2-tailed) and Exact Sig. [2*(1-tailed Sig.)] represent two levels of significance for a two-tailed test. If one uses a one-tailed test, these two levels must be divided by two. Lastly, the mention Not corrected for ties implies that the test did not correct the result appearing in the table for ties or equalities.
In the example previously presented, the researcher will consider the Exact Sig. [2*(1-tailed Sig.)]. Because his test application is of one-tailed type, he will divide this level of significance based on the exact distribution by two to obtain the level of significance that will be compared to his predetermined statistical threshold ($\alpha$). In the example previously presented, the p is 0.064 and is not smaller than the predetermined statistical threshold of 0.05.
4. Discussion
Like any statistical test, the Mann-Whitney U has strengths and weaknesses. In terms of strengths, like any nonparametric test, the Mann-Whitney U does not depend on assumptions about the distribution (i.e. one does not need to postulate the data distribution of the target population). One can also use it when the conditions of normality are neither met nor attainable by transformations. Moreover, one can use it when the sample is small and the data are semi-quantitative or at least ordinal. In short, few constraints apply to this test.
The Mann-Whitney U test is also one of the most powerful nonparametric tests (Landers, 1981), where the statistical power corresponds to the probability of rejecting a false null hypothesis. This test thus has a good probability of providing statistically significant results when the alternative hypothesis applies to the measured reality. Even if it is used on average-size samples (between 10 and 20 observations) or with data that satisfy the constraints of the t-test, the Mann-Whitney has approximately 95% of the Student's t-test statistical power (Landers, 1981). By comparison with the t-test, the Mann-Whitney U is less at risk of giving a wrongfully significant result when one or two extreme values are present in the sample under investigation (Siegel and Castellan, 1988).
Despite this, the Mann and Whitney test (1947) has its limits. With Monte Carlo methods, which calculate a numerical value by using random or probabilistic processes, it was shown that the t-test is most of the time more powerful than the U test. Indeed, this remains true whatever the amplitude of the differences between the averages of the populations under investigation and even if the distributions of these populations do not meet the criteria of normality (Zimmerman, 1985). On the other hand, very little statistical power is lost if the Mann-Whitney U test is used instead of the t-test under statistically controlled conditions (Gibbons and Chakraborti, 1991).
In addition, the Mann-Whitney U test is, in exceptional circumstances, more powerful than the t-test. Indeed, it is more powerful than the t-test over the whole range of possible differences between population averages when the smaller sample is associated with the smaller variance (Zimmerman, 1987). On the other hand, when the sample sizes are similar or when the smaller sample has the larger variance, the t-test is more powerful over the whole range of possible differences (Zimmerman, 1987).

Lastly, Monte Carlo methods showed that the Mann-Whitney U test can give wrongfully significant results, that is to say the erroneous acceptance of the alternative hypothesis (Robert & Casella, 2004). This type of result is at risk of being obtained whenever one's samples are drawn from two populations with the same average but with different variances. In this type of situation, it is far more reliable to use the t-test, which allows for the samples to come from distributions with different variances. The alpha ($\alpha$) or type I error is to reject $H_0$ whereas it is true. This error is thus amplified when the Mann-Whitney U is applied in a situation of heteroscedasticity or distinct variances. Some solutions to this major problem exist (see Kasuya, 2001).

In short, the Mann-Whitney U statistical test is an excellent alternative to parametric tests like the t-test when the assumptions of the latter cannot be respected. With a statistical power similar to that of the t-test, the Mann-Whitney U is the replacement test par excellence. However, as explained above, it is more reliable to use the t-test if its postulates can be met.
References
APA - American Psychological Association (1998). Special section: Empirically supported psychological therapies. Journal of Consulting and Clinical Psychology, 66(1).
BPSCORE - British Psychological Society Centre for Outcomes Research and Effectiveness (2001). Treatment Choice in Psychological Therapies and Counselling: Evidence Based Clinical Practice Guideline. United Kingdom: Department of Health.
Gibbons, J. D., & Chakraborti, S. (1991). Comparisons of the Mann-Whitney, Student's t, and alternate t tests for means of normal distributions. Journal of Experimental Education, 59(3), 258-267.
INSERM - Institut national de la santé et de la recherche médicale (2004). Psychothérapies: trois approches évaluées. Paris: Édition INSERM.
Kasuya, E. (2001). Mann-Whitney U test when variances are unequal. Animal Behavior, 61, 1247-1249.
Kazdin, A. E. (1986). Comparative outcome studies in psychotherapy: Methodological issues and strategies. Journal of Consulting and Clinical Psychology, 54, 95-105.
Kazdin, A. E. (2003). Methodological Issues and Strategies in Clinical Research (3rd edition). Washington, D.C.: American Psychological Association.
Kraemer, H. C. (1981). Coping strategies in psychiatric clinical research. Journal of Consulting and Clinical Psychology, 49, 309-319.
Landers, J. (1981). Quantification in History, Topic 4: Hypothesis Testing II - Differing Central Tendency. Oxford: All Souls College.
Mann, H. B., & Whitney, D. R. (1947). On a test of whether one of 2 random variables is stochastically larger than the other. Annals of Mathematical Statistics, 18, 50-60.
Robert, C. P., & Casella, G. (2004). Monte Carlo Statistical Methods, second edition. New York: Springer-Verlag.
Robert, M. et al. (1988). Fondements et étapes de la recherche scientifique en psychologie. Saint-Hyacinthe: Edisem et Paris: Maloine.
Sedlmeier, P., & Gigerenzer, G. (1989). Do studies of statistical power have an effect on the power of studies? Psychological Bulletin, 105, 309-316.
Shapiro, D. A., & Shapiro, D. (1983). Comparative therapy outcome research: Methodological implications of meta-analysis. Journal of Consulting and Clinical Psychology, 51, 42-53.
Siegel, S., & Castellan, N. J. Jr. (1988). Nonparametric statistics for the behavioral sciences, second edition. United States: McGraw-Hill Book Company.
Wilcoxon, F. (1945). Individual comparisons by ranking methods. Biometrics Bulletin, 1, 80-83.
Zimmerman, D. W. (1985). Power functions of the t test and Mann-Whitney U test under violation of parametric assumptions. Perceptual and Motor Skills, 61, 467-470.
Zimmerman, D. W. (1987). Comparative power of Student t test and Mann-Whitney U test for unequal sample sizes and variances. Journal of Experimental Education, 55, 171-174.
Manuscript received 29 September 2006
Manuscript accepted 1 May 2007