1312 K. XUE AND F. YAO
4. Simulation studies. In the two-sample test for high-dimensional means, frequently used or recently proposed methods include those of [5] (abbreviated as CQ, an $L_2$ norm test), [3] (abbreviated as CL, an $L_\infty$ norm test) and [21] (abbreviated as XL, a test combining $L_2$ and $L_\infty$ norms). We conduct comprehensive simulation studies to
compare our DCF test with these existing methods in terms of size and power under various
settings. The two samples $X^n = \{X_i\}_{i=1}^n$ and $Y^m = \{Y_i\}_{i=1}^m$ have sizes $(n, m)$, while the data dimension is chosen to be $p = 1000$. Without loss of generality, we let $\mu^X = 0 \in \mathbb{R}^p$. The
structure of $\mu^Y \in \mathbb{R}^p$ is controlled by a signal strength parameter $\delta > 0$ and a sparsity level parameter $\beta \in [0, 1]$. To construct $\mu^Y$, in each scenario, we first generate a sequence of i.i.d. random variables $\theta_k \sim U(-\delta, \delta)$ for $k = 1, \ldots, p$ and keep them fixed in the simulation under that scenario. We set $\delta(r) = \{2r \log(p)/(n \vee m)\}^{1/2}$, which gives an appropriate scale of signal strength [3, 5, 28]. We take $\mu^Y = (\theta_1, \ldots, \theta_{\lfloor \beta p \rfloor}, 0_{p - \lfloor \beta p \rfloor}^\top)^\top \in \mathbb{R}^p$, where $\lfloor a \rfloor$ denotes the largest integer not exceeding $a$, and $0_q$ is the $q$-dimensional vector of 0's. Thus the signal
becomes sparser for a smaller value of β, with β = 0 corresponding to the null hypothesis
and β = 1 representing the fully dense alternative. The covariance matrices of the random
vectors are denoted by $\mathrm{cov}(X_i) = \Sigma^{X_i}$ and $\mathrm{cov}(Y_{i'}) = \Sigma^{Y_{i'}}$ for all $i = 1, \ldots, n$, $i' = 1, \ldots, m$. The nominal significance level is $\alpha = 0.05$, and the DCF test is conducted based on the multiplier bootstrap of size $N = 10^4$.
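The construction of $\mu^Y$ described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names `delta_from_r`, `r_from_delta` and `sparse_mean`, the use of NumPy, the seed, and the example value $\beta = 0.5$ are all hypothetical choices made here.

```python
import math
import numpy as np

def delta_from_r(r, p, n, m):
    """Signal strength delta(r) = {2 r log(p) / (n v m)}^{1/2}."""
    return math.sqrt(2.0 * r * math.log(p) / max(n, m))

def r_from_delta(delta, p, n, m):
    """Invert delta(r) to recover the value of r implied by a given delta."""
    return delta ** 2 * max(n, m) / (2.0 * math.log(p))

def sparse_mean(theta, beta):
    """Keep the first floor(beta * p) entries of theta and set the rest to zero."""
    p = theta.shape[0]
    k = math.floor(beta * p)
    mu = np.zeros(p)
    mu[:k] = theta[:k]
    return mu

n, m, p = 200, 300, 1000
delta = 0.1
rng = np.random.default_rng(0)  # hypothetical seed

# theta_k ~ U(-delta, delta) is drawn once and then held fixed for the scenario.
theta = rng.uniform(-delta, delta, size=p)

mu_y_null = sparse_mean(theta, beta=0.0)   # beta = 0: null hypothesis
mu_y_half = sparse_mean(theta, beta=0.5)   # first 500 coordinates carry signal
mu_y_dense = sparse_mean(theta, beta=1.0)  # beta = 1: fully dense alternative

r_implied = r_from_delta(delta, p, n, m)   # about 0.217 for delta = 0.1
```

Drawing `theta` once outside `sparse_mean` mirrors the requirement that the $\theta_k$ stay fixed while $\beta$ varies within a scenario.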
For a comprehensive comparison, we first consider the following six different settings. The first setting is standard with $(n, m, p) = (200, 300, 1000)$, where the elements in each sample are i.i.d. Gaussian, and the two samples share a common covariance matrix $\Sigma = (\Sigma_{jk})_{1 \le j,k \le p}$. The matrix $\Sigma$ is specified by a dependence structure such that $\Sigma_{jk} = (1 + |j - k|)^{-1/4}$. Beginning with $\delta = 0.1$, where the implied value $r = 0.217$ corresponds to a quite weak signal according to [3, 28], we calculate the rejection proportions
of the four tests based on 1000 Monte Carlo runs over a full range of sparsity levels from
β = 0 (corresponding to null hypothesis) to β = 1 (corresponding to fully dense alternative).
Then the signals are gradually strengthened to $\delta = 0.15, 0.2, 0.25, 0.3$. The second setting is similar to the first, except that $\Sigma^{Y_{i'}} = 2\Sigma^{X_i} = 2\Sigma$ for all $i = 1, \ldots, n$, $i' = 1, \ldots, m$, where $\Sigma$ is defined in the first setting. These two settings are denoted by “i.i.d. equal (resp., unequal) covariance setting.”
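As an illustration of the first two settings, the following Python sketch builds $\Sigma$ with entries $(1 + |j - k|)^{-1/4}$ and draws Gaussian samples with covariance $\Sigma$ or $2\Sigma$. It is a sketch under assumptions rather than the authors' implementation: the helper names `polynomial_decay_cov` and `gaussian_sample`, the Cholesky-based sampling, the seed, and the zero placeholder for $\mu^Y$ (the null case $\beta = 0$) are choices made here.

```python
import numpy as np

def polynomial_decay_cov(p):
    """Return the p x p matrix with entries (1 + |j - k|)^(-1/4)."""
    idx = np.arange(p)
    return (1.0 + np.abs(idx[:, None] - idx[None, :])) ** (-0.25)

def gaussian_sample(size, mean, cov, rng):
    """Draw `size` i.i.d. N(mean, cov) vectors as rows, via a Cholesky factor."""
    chol = np.linalg.cholesky(cov)               # cov = chol @ chol.T
    z = rng.standard_normal((size, mean.shape[0]))
    return mean + z @ chol.T

n, m, p = 200, 300, 1000
rng = np.random.default_rng(1)                   # hypothetical seed
sigma = polynomial_decay_cov(p)

mu_x = np.zeros(p)
mu_y = np.zeros(p)                               # placeholder: null case, beta = 0

x = gaussian_sample(n, mu_x, sigma, rng)
y_equal = gaussian_sample(m, mu_y, sigma, rng)          # first setting
y_unequal = gaussian_sample(m, mu_y, 2.0 * sigma, rng)  # second setting
```

Replacing `mu_y` with a sparse mean vector gives the alternatives; the Cholesky route is one convenient way to impose the covariance, and any matrix square root would serve equally well.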
In the third setting, the random vectors in each sample have completely different distribu-
tions and covariance matrices from one another. The procedure to generate the two samples
is as follows. First, a set of parameters $\{\phi_{ij} : i = 1, \ldots, m, j = 1, \ldots, p\}$ are generated from the uniform distribution $U(1, 2)$ independently, and are kept fixed for all Monte Carlo runs. In a similar fashion, $\{\phi^*_{ij} : i = 1, \ldots, m, j = 1, \ldots, p\}$ are generated from $U(1, 3)$ independently. Then, for every $i = 1, \ldots, n$, we define a $p \times p$ matrix $\Omega_i = (\omega_{ijk})_{1 \le j,k \le p}$ with each $\omega_{ijk} = (\phi_{ij}\phi_{ik})^{1/2}(1 + |j - k|)^{-1/4}$. Likewise, for every $i = 1, \ldots, m$, define a $p \times p$ matrix $\Omega^*_i = (\omega^*_{ijk})_{1 \le j,k \le p}$ with each $\omega^*_{ijk} = (\phi^*_{ij}\phi^*_{ik})^{1/2}(1 + |j - k|)^{-1/4}$. Subsequently, we generate a set of i.i.d. random vectors
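$\breve{X}^n = \{\breve{X}_i\}_{i=1}^n$ with each $\breve{X}_i = (\breve{X}_{i1}, \ldots, \breve{X}_{ip})^\top \in \mathbb{R}^p$, such that $\{\breve{X}_{i1}, \ldots, \breve{X}_{i,2p/5}\}$ are i.i.d. standard normal random variables, $\{\breve{X}_{i,2p/5+1}, \ldots, \breve{X}_{i,p}\}$ are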
i.i.d. centered Gamma(16, 1/4) random variables, and they are independent of each other. Accordingly, we construct each $X_i$ by letting $X_i = \mu^X + \Omega_i^{1/2} \breve{X}_i$ for all $i = 1, \ldots, n$. It is worth noting that $\Sigma^{X_i} = \Omega_i$ for all $i = 1, \ldots, n$, that is, the $X_i$'s have different covariance matrices and distributions. The other sample $Y^m = \{Y_i\}_{i=1}^m$ is constructed in the same way with $\Sigma^{Y_i} = \Omega^*_i$ for all $i = 1, \ldots, m$. We then obtain the results for various signal strength levels of $\delta$ over
a full range of sparsity levels of β, and we denote this setting as “completely relaxed.” The
fourth setting is analogous to the third, except that we set $(n, m, p) = (100, 400, 1000)$, where the two sample sizes deviate substantially from each other. Since this setting is concerned with highly unequal sample sizes, it is denoted as “completely relaxed and highly unequal setting.” The fifth setting is similar to the third, except that we replace the standard