Академический Документы
Профессиональный Документы
Культура Документы
Abstract—Massive multiple-input multiple-output (MIMO) more realistic model to characterize the temporal correlation
systems need to support massive connectivity for the application of channels is the first-order Gaussian-Markov process [12].
of the Internet of things (IoT). The overhead of channel state Training techniques and pilot beam design for single user are
information (CSI) acquisition becomes a bottleneck in the system
arXiv:1707.09093v1 [cs.IT] 28 Jul 2017
performance due to the increasing number of users. An intermit- discussed under this model [13] [14]. In [15], [16], the authors
tent estimation scheme is proposed to ease the burden of channel study the multiuser dynamic channel acquisition problem
estimation and maximize the sum capacity. In the scheme, we under the block fading channel.
exploit the temporal correlation of MIMO channels and analyze Our work focuses on the multiuser scenario. To ease the
the influence of the age of CSI on the downlink transmission burden of CSI acquisition, we propose an intermittent estima-
rate using linear precoders. We show the CSI updating interval
should follow a quasi-periodic distribution and reach a trade- tion scheme. Different from the conventional scheme which
off between the accuracy of CSI estimation and the overhead continuously estimates channels of the whole users in every
of CSI acquisition by optimizing the CSI updating frequency of channel block, the proposed scheme exploits the temporal
each user. Numerical results show that the proposed intermittent correlation and requires users to estimate their channels in-
scheme provides significant capacity gains over the conventional termittently among blocks. The CSI updating interval of each
continuous estimation scheme.
user is shown to follow a quasi-periodic distribution and the
I. I NTRODUCTION CSI updating frequency is optimized based on the temporal
correlation coefficient of each user. Hence, the scheme reaches
By applying a large number of antenna elements, the mas- a trade-off between the accuracy of the CSI and the overhead
sive multiple-input multiple-output (MIMO) system exploits of the CSI acquisition to achieve the maximum sum capacity.
the potentials of spatial multiplexing with tremendous degrees- The rest of the paper is organized as follows. Section
of-freedom, and is believed to increase the spectral efficiency II describes the system model adopted in this paper. The
10 times or more as well as improving the radiated energy influence of aged CSI on the downlink transmission rate is
efficiency in the order of 100 times [1]. Exploiting the advan- investigated in Section III. Then we propose the intermittent
tages of massive MIMO technique requires the knowledge of estimation scheme in Section IV. The numerical results are
channel state information (CSI) at the base station (BS) side. presented in Section V and conclusions are drawn in Section
The canonical massive MIMO protocol is to operate in time- VI.
division-duplex (TDD) mode in order to acquire CSI through
channel reciprocity [2]. However, the overhead of channel II. S YSTEM M ODEL
estimation is still unbearable for the application of the Internet We consider a massive MIMO system with M antenna
of things (IoT) [3], where the systems need to serve a vast elements serving K users. The processes of channel estimation
quantity of users within the limited channel coherence time. and data transmission occur periodically in a block of L chan-
Many prior works try to exploit the spatial correlation nel uses. The system operates in the calibrated time-duplex-
of channels to reduce the overhead of channel acquisition. division (TDD) mode so that channel reciprocity is utilized
The spatial correlation matrices of users are used to design for channel estimation. More specifically, in the beginning of
inter-cell and intra-cell pilot reuse schemes [4] [5]. Ref. [6] every block, a number of T channel uses is spent by the base
assumes channel sparsity in spatial domain and estimates station (BS) on estimating uplink channels, afterwards the BS
channels based on the compressive sensing theory. Mutual- obtains downlink channel vectors as the transpose of uplink
information-optimal pilots for minimum mean square error channel vectors, and transmits precoded data in the remaining
(MMSE) estimation are proposed in [7]–[9]. L − T channel uses.
Another aspect of the channel statistics worth exploring is The uplink channel of the k-th user in the i-th block is
the temporal correlation. The correlation between the current p
gk (i) = βk hk (i), (1)
channel and the previous channel becomes weaker with longer √
time elapsed [10]. Under the assumption of the Gilbert-Elliot where βk and hk (i) denotes the large and small scale
channel model, the authors introduce the concept the age of fading coefficient, respectively. The small scale fading co-
CSI and study its impact on a general utility function [11]. A efficient satisfies the complex circularly-symmetric Gaussian
distribution CN (0, IM ). It stays constant in each block and Channel Data
evolves between blocks according to the first-order stationary estimation transmission
Gaussian-Markov process [12]
q
hk (i + 1) = ρk hk (i) + 1 − ρ2k ek (i), i ≥ 0, (2)
Block 1 Block 2 Block 3 Block 4
where ρk is the temporal correlation coefficient of channels
between blocks, and ek (i) is the innovation process distributed Fig. 1. An example of the intermittent estimation scheme.
as CN (0, IM ). We assume the initial channels gk (0) and the
innovation processes ek (i) of each user are independent, so
the channels of different users are independent in each block. where ηkMF is the normalization coefficient. In the massive
If the channel is updated in the i-th block, then the channel MIMO scenario, the value of ηkMF tends to be constant due
after n blocks can be easily derived by iterating (2), which is to the channel hardening effect, which is derived as
q 1
hk (i + n) = ρnk hk (i) + 1 − ρ2nk ek,n (i), n ≥ 1, (3) ηkMF = Etr{ĤkT Ĥk∗ } = M. (8)
K
where ek,n (i) ∼ CN (0, IM ) is the equivalent innovation For zero forcing precoding, the precoding matrix becomes
process in n blocks. The age of CSI [11] is defined as 1
the number of blocks elapsed since last channel estimation VkZF = q Ĥk∗ (ĤkT Ĥk∗ )−1 , (9)
in our work. We focus on the impact of the CSI updating ηkZF
frequency and only consider estimation errors caused by aging where the normalization coefficient is
of CSI. As the age of CSI n increases, the temporal correlation 1 1
between the estimated channel hk (i) and the channel hk (i+n) ηkZF = Etr{(ĤkT Ĥk∗ )−1 } = . (10)
K M −K
of ground truth declines exponentially as ρnk , making the
estimation less accurate. Theorem 1. If the CSI of the k-th user has an age of n blocks,
then the SINR using MF precoding is given by
III. T RANSMISSION R ATE WITH AGED CSI k βk M ρ2n
γkMF (n) = PK
k
, (11)
The BS needs to precode the user data before downlink 1 + i=1 i βi
transmission to support spatial multiplexing. The received
and the SINR using ZF precoding is given by
signal of the k-th user is
k βk (M − K)ρ2n
√
K
√ γkZF (n) = PK
k
. (12)
1 + (1 − ρ2n
X
yk = k gkT vk dk + i giT vi di + nk k ) i=1 i βi
i=1,i6=k Proof. See Appendix A.
K
(4)
p X p
= k βk hTk vk dk + i βi hTi vi di + nk , From the above theorem, we observe the fact that for
i=1,i6=k both precoding schemes, the SINR functions of the k-th user
monotonically decreases as its age of CSI grows, and are not
where k , vk , dk and nk denotes SNR at the transmitter, affected by the ages of CSI of other users. For the convenience
precoding vector, data symbol and normalized downlink noise, of expressions, we will use the unified term γk (n), Rk (n) for
respectively. A lower bound of per user SINR at the receiver the two precoding schemes in the rest of the paper.
can be obtained by using the similar technique as [17, Theorem
1], which is denoted as IV. I NTERMITTENT E STIMATION S CHEME
k βk |EhTk vk |2 A. Scheme Description
γk = . The channel estimation overhead increases linearly with
1+k βk (E|hTk vk |2 −|EhTk vk |2 )+ i βi E|hTk vi |2
P
i6=k the total number of users. When user number increases to
(5) a certain amount, the channel block can no longer support the
And the downlink transmission rate is obtained: estimation overhead of all users. So we propose an intermittent
channel estimation scheme. In every block, the BS chooses a
Rk = log {1 + γk } . (6)
subset of users instead of the whole users to estimate their
We consider two kinds of linear precoding schemes, namely channels. Afterwards, the BS forms the precoder matrix using
matched filter (MF) and zero forcing (ZF). Assume the esti- the latest CSI of each user and then transmits data to all users.
mated channel matrix is Ĥk , then the precoding matrix of The scheme reduces the overhead of channel estimation at the
matched filter is cost of a decline of the estimation accuracy.
An example of the intermittent estimation scheme is illus-
1
VkMF = q Ĥk∗ , (7) trated in Fig. 1. The system provides data service for total
MF
ηk 4 users. The pilot pattern has a period of 4 blocks, and the
system selects 3 users from the whole user set for channel B. Optimizing the Distribution of CSI Updating Interval
estimation in each block. In each period of the pilot pattern, To solve the relaxed problem (15), we firstly consider the
the CSI of blue user is updated twice with an interval of 1 rate performance of a single user, and optimize fk,n under
block and once with an interval of 2 blocks. So its average fixed value of the CSI updating frequency pk . It is formulated
updating interval is 4/3 blocks and CSI updating frequency as the following optimization problem,
equals 3/4 per block. Similarly, the CSI updating frequencies
of orange, green and purple users are 1, 3/4 and 1/2 per block, max E{Rk }
fk,n
respectively.
s.t. 0 ≤ fk,n ≤ 1, k = 1, · · · , K
For a given user set, we try to maximize the average user X
sum rate by choosing the proper user subset for channel esti- fk,n = 1 (16)
mation in every block. The optimization problem is formulated n
as
X 1
nfk,n = .
I−1 K n
pk
1 T XX
max lim 1− Rk,i Assume the k-th user has done Nk times of channel
qk (i),T I→∞ I C i=0
k=1 estimations in a total of N blocks, and there are Fk,n CSI
s.t. qk (i) ∈ {0, 1}, k = 1, · · · , K (13)
updating intervals with block length of n. The total rate in a
X
qk (i) = T, CSI updating interval with block length of n can be calculated
k as
n−1
X
where qk (i) indicates whether the k-th user is chosen to do Gk (n) , Rk (i), (17)
channel estimation in the i-th block. The sum rate is multiplied i=0
by a discounting factor 1 − T /C where T /C represents the where Rk (i) = log(1+SINRk (i)) is the transmission rate with
ratio of estimation overhead. a CSI age of i blocks. Obviously, Rk (i) is a non-increasing
Since the transmission rate of the k-th user is only affected function with respect to i. According to the definition of
by its own age of CSI, rearranging the order of CSI updating pk and fk,n , the limitations limN →∞ Nk /N = pk and
intervals has no influence on the rate performance of the k-th limN →∞ Fk,n /Nk = fk,n both hold. Therefore, the average
user. Denote pk as the CSI updating frequency and fk,n as the transmission rate of the k-th user is
distribution CSI updating interval. The CSI updating frequency 1 X X
is the reciprocal of the average CSI updating interval: E{Rk } = lim Fk,n Gk (n) = pk fk,n Gk (n).
N →∞ N
n n
1 (18)
pk = P . (14) We extend Gk (n) to a piecewise function:
n nfk,n
We consider a new optimization problem as 0P
x=0
x−1
K
X Gk (x) = i=0 Rk (i) x ∈ N+
T
(1−x+bxc)Gk(bxc)+(x−bxc)Gk(bxc+1) x ∈ / N , x≥0
max 1− E{Rk }
T,pk ,fk,n C (19)
k=1
s.t. 0 ≤ fk,n ≤ 1, k = 1, · · · , K The optimal distribution of CSI updating interval is obtained
X (15) according to the following theorem.
fk,n = 1, k = 1, · · · , K
n Theorem 2. The transmission rate of the k-th user achieves
the maximum value
X
pk = T,
k 1
Sk (pk ) = pk Gk ( ), (20)
pk
which is actually a relaxation of the original problem (13).
The second constraint is relaxed from pilot length in every if the distribution of CSI updating interval is
block equaling T to average pilot length equaling T . When
1 1 1
the number of users goes to infinity, the pilot length tends to be 1 − pk + b pk c n = b pk c
the sum of the CSI updating frequencies of all users by the law fk,n = p1k − b p1k c n = b p1k c + 1 (21)
0. else.
of large numbers. Therefore, the relaxation is asymptotically
tight.
Proof. See Appendix B.
The relaxed problem can be solved in three steps. Firstly,
we fix the value of the CSI updating frequency pk and the The theorem requires the CSI updating interval to be quasi-
pilot length T and try to optimize the distribution fk,n of CSI periodic. More specifically, if 1/pk is an integer, the CSI
updating interval for each user. Then pk is optimized under should be updated every 1/pk blocks. Otherwise the CSI
fixed value of T . Finally we find the optimal T by a one- should be updated every b1/pk c or b1/pk c + 1 blocks with
dimensional search. an average updating interval of 1/pk blocks.
C. Optimizing the CSI Updating Frequency than 0 and the last value less than 1 in {pa(i) }, respectively.
After optimizing fk,n , the transmission rate of the k-th user Summing up (28) from i to j gets
is Sk (pk ). We need to allocate the total estimation resource Pj
T − (K − j) − l=i xl
of T channel uses to all users by optimizing the CSI updating ν= . (29)
j−i+1
frequency. The problem is formulated as
We initialize i = 1, j = K and test whether the constraints
K
0 ≤ pa(k) ≤ 1 are met. If not, we either increase i or
T X
max 1− Sk (pk ) decrease j, according to the value of xa(i) +xa(j) . The detailed
pk C
k=1 projection method is described by Algorithm 1.
s.t. 0 ≤ pk ≤ 1, k = 1, · · · , K (22)
Algorithm 1 Projection Algorithm for (27)
X
pk = T.
k 1: Sort {xk } in ascending order as {xa(k) }.
Initialize i = 1, j = K.
Theorem 3. The problem (22) is a concave optimization
2: while i < j do P
problem. T −(K−j)− jl=i xl
3: ν= j−i+1
Proof. See Appendix C. 4: if xi + ν ≥ 0 and xj + ν ≤ 1 then
5: for k = i, i + 1, · · · , j do P
The concavity guarantees the global optimality of any local T −(K−j)− jl=i xl
maximum we find. However, the problem is still hard to 6: pa(k) = xa(k) + j−i+1
solve since the function Gk (x) is non-differentiable. So we 7: end for
consider to approximate Gk (x) by replacing its function value 8: Break.
by a parabola in each interval (n − δ, n + δ), n ∈ N+ . The 9: else
approximation function is denoted by Ĝk (x) as (23). It can 10: if xa(i) + xa(j) ≥ 1 then
be validated that Ĝk (x) is concave and differentiable, whose 11: pa(j) = 1
derived function is denoted by Ĝ0k (x) as (24). 12: j =j−1
Similarly, we approximate the rate function Sk (pk ) by 13: else
Ŝk (pk ) = pk Ĝk (1/pk ). And its derived function is 14: pa(i) = 0
15: i=i+1
1 1 1 end if
Ŝk0 (pk ) = Ĝk ( ) − Ĝ0k ( ). (25) 16:
pk pk pk 17: end if
When δ → 0, the solution to the approximated version of 18: end while
the optimization problem (22) tends to be the real solution.
We use the gradient projection method to solve the ap- D. Optimizing the Pilot Length
proximated problem, which is an iterative method. In the t-th
iteration step, we find a new value for The optimization problem (15) is finally solved by finding
PKCSI updating frequency the maximum sum rate provided by the solution of (22) for
in the direction of the gradient of j=1 Ŝk (pj ):
each value of T ∈ {0, 1, · · · , K}.
Ŝ 0 (p )
(t) When T = 0, the CSI updatingPfrequencies of all users
(t) (t)
xk = pk + a qP k k , (26) K
are 0. As a result, the sum rate is k=1 Sk (pk ) = 0. In this
K 02 (t)
k=1 k Ŝ (p k ) case, the system spends no resource on channel estimation,
therefore the precoding process is meaningless.
where a is the iteration step size.
When T = K, the CSI updating frequencies of all users are
· , xK ) ∈ RK may lie beyond the
The point x = (x1 , x2 , · ·P
K 1, which means the system applies conventional continuous
constraint space P = {p| k=1 pk = T, 0 ≤ pk ≤ 1, k =
estimation scheme and estimates the channels of the whole
1, · · · , K}, So we need to project x to the space P. The
users inPevery block. The sum rate can be calculated as (1 −
projection is equivalent to solving the optimization problem K
K/C) k=1 Rk (0).
max ||p(t+1) − x(t) ||2 V. N UMERICAL R ESULTS
p(t+1) (27)
(t+1) In this section we evaluate the performance of the proposed
s.t. p ∈ P.
estimation scheme.
By deriving the KKT conditions of (27), we get We assume the same path loss coefficient for each user.
pk = xk + ν + λk − µk , (28) The BS splits the total power equally to each user in order to
ensure fairness. So the receive SNR of each user is the same.
where λi ≥ 0, µi ≥ 0, ν are the Lagrange multipliers of (27). The temporal correlation coefficient of each user is uniformly
If we arrange {xi } in ascending order as {xa(i) }, then the distributed in the interval [0.6, 0.9].
elements of its projection vector {pa(i) } are also in ascending Fig. 2 shows the allocation of estimation resource to users
order. We maintain i, j as the indexes of the first value larger with different temporal correlation coefficients. When the
(
R(n)−R(n−1)
4δ (x−n)2 + R(n)+R(n−1)
2 (x−n)+ Gk (n) + δ(R(n)−R(n−1))
4 n−δ < x < n+δ, n ∈ N+
Ĝk (x) = (23)
Gk (x) else
R(0)
0≤x≤1−δ
Ĝ0k (x) = R(n) n + δ ≤ x ≤ n + 1 − δ, n ∈ N+ (24)
R(n)−R(n−1)
R(n)+R(n−1)
2δ (x−n) + 2 n−δ < x < n+δ, n ∈ N+
0.25 100 35
CSI Updating Frequency
MF,T=5
0.2 Sum Rate of IES
ZF,T=5 Sum Rate of CES
0.15 Optimal Pilot Length
80 30
0.1
0.05
Sum Rate
0.65 0.7 0.75 0.8 0.85 0.9 0.95
Temporal Correlation Coefficient
1
CSI Updating Frequency
40 20
MF,T=30
ZF,T=30
0.8
20 15
0.6
0.4 0 10
0.65 0.7 0.75 0.8 0.85 0.9 0.95 20 25 30 35 40 45 50
Temporal Correlation Coefficient User Number
Fig. 2. CSI updating frequency vs. temporal correlation coefficient under Fig. 4. The sum rate and optimal pilot length of the proposed scheme using
different pilot length. M = 64, K = 40, C = 50, per user SNR is 10 dB. ZF precoding. M = 64, C = 50, per user SNR is 10 dB.
80