Best Paper 9

4786 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 62, NO. 18, SEPTEMBER 15, 2014

Hierarchical Interference Mitigation for Massive MIMO Cellular Networks
An Liu, Member, IEEE, and Vincent K. N. Lau, Fellow, IEEE
Abstract: We propose a hierarchical interference mitigation scheme for massive MIMO cellular networks. The MIMO precoder at each base station (BS) is partitioned into an inner precoder and an outer precoder. The inner precoder controls the intra-cell interference and is adaptive to local channel state information (CSI) at each BS (CSIT). The outer precoder controls the inter-cell interference and is adaptive to channel statistics. Such a hierarchical precoding structure reduces the number of pilot symbols required for CSI estimation in massive MIMO downlink and is robust to the backhaul latency. We study joint optimization of the outer precoders, the user selection, and the power allocation to maximize a general concave utility which has no closed-form expression. We first apply random matrix theory to obtain an approximated problem with closed-form objective. Then, using the hidden convexity of the problem, we propose an iterative algorithm to find the optimal solution for the approximated problem. We also obtain a low complexity algorithm with provable convergence. Simulations show that the proposed design has significant gain over various state-of-the-art baselines.

Index Terms: Massive MIMO, hierarchical interference mitigation, statistical user selection.

Manuscript received May 21, 2014; revised July 10, 2014; accepted July 12, 2014. Date of publication July 18, 2014; date of current version August 14, 2014. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Rong-Rong Chen. This work was supported in part by RGC614913, and in part by NSFC Grant No. 61171080.
The authors are with the Department of Electrical and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong (e-mail: eewendaol@ust.hk; eeknlau@ece.ust.hk).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TSP.2014.2340814

I. INTRODUCTION

MASSIVE MIMO is regarded as a promising technology in future wireless networks due to its high spectrum and energy efficiency [1]. The large spatial degree of freedom (DoF) of massive MIMO systems can contribute to (i) spatial multiplexing gains for intra-cell users (MU-MIMO) as well as (ii) inter-cell interference mitigation via linear precoders at the BSs. In [2], zero-forcing (ZF) and regularized zero-forcing (RZF) have been proposed for spatial multiplexing of data streams to intra-cell users. More complicated linear precoding schemes based on duality [3] or semidefinite relaxation (SDR) [4] have also been proposed to achieve a better performance. On the other hand, the inter-cell interference mitigation is more complicated. One commonly adopted approach to mitigate the inter-cell interference is the coordinated MIMO [5], which performs joint precoding among the BSs using the global real-time CSIs shared among the BSs. Alternatively, cooperative MIMO can also be exploited to mitigate inter-cell interference by sharing both real-time CSI and payload data among the concerned BSs [6].

However, these conventional spatial multiplexing and interference mitigation techniques cannot be applied directly to massive MIMO cellular networks due to the following reasons. First, the MU-MIMO precoding requires real-time local CSIT at the BS. However, the amount of pilot symbols for channel estimation is limited by the coherence time of the channel and it is practically infeasible to obtain good CSI quality when each BS is equipped with a massive MIMO array. Second, the existing inter-cell interference mitigation methods such as cooperative and coordinated MIMO require real-time global CSIT, which is difficult to achieve in practice due to the backhaul latency (see footnote 1). Hence, the performance of these schemes is very sensitive to CSIT errors due to outdatedness.

Footnote 1: For example, the X2 interface in LTE systems has a typical latency of 10 ms or more between BSs.

In this paper, we address the above issues by proposing a hierarchical interference mitigation scheme for massive MIMO cellular networks. In the proposed scheme, the MIMO precoder at each BS is partitioned into an inner precoder and an outer precoder as illustrated in Fig. 2. The inner precoder is used to support MU-MIMO at each BS and it is adaptive to real-time local CSIT. The outer precoder can leverage the remaining spatial DoF to mitigate the inter-cell interference by restricting the transmitted signal at each BS into a subspace and is adaptive to long-term channel statistics (see footnote 2). Such a hierarchical precoding structure simultaneously resolves both the aforementioned practical challenges as will be discussed in Remark 2. We consider joint optimization of the outer precoders, the user selection, and the power allocation to maximize a general concave utility function of the average data rates of users. The following first-order challenges need to be addressed.

Footnote 2: Due to local scattering effects [7], the MIMO spatial channels are not isotropic and precoding based on statistical information can be quite effective to control/mitigate the inter-cell interference.

- Lack of Closed-Form Optimization Objective: The average data rate of each user involves stochastic expectation over CSI realizations and it does not have closed form characterization.
- Complex Coupling between User Selection and Outer Precoding: The outer precoder will affect the admissible user set. On the other hand, the optimization of outer precoder also depends on user selection because the outer precoder only needs to suppress the interference to the selected users in other BSs.
- Combinatorial Optimization Problem: The user selection problem with hierarchical precoding in the massive
MIMO cellular networks is combinatorial with exponential complexity w.r.t. the total number of users.

To address the above challenges, we first apply the random matrix theory to obtain an approximated problem with closed-form objective. Then using the hidden convexity of the problem, we propose an iterative algorithm to find the optimal solution for the approximated problem. We also obtain a low complexity algorithm with provable convergence.

Notations: For a set , denotes the cardinality of . The notation denotes the set of all semi-unitary matrices. Let denote the indication function such that if the event is true and otherwise. represents the subspace spanned by the columns of a matrix and represents a set of orthogonal basis of . is the spectral radius of .

II. SYSTEM MODEL

A. Massive MIMO Cellular Network

Consider the downlink of a massive MIMO cellular network with BSs and single-antenna users as illustrated in Fig. 1. Each BS has antennas with much larger than the number of the associated users. The channel between BS and user is modeled as , where has i.i.d. complex entries of zero mean and variance ; and is the spatial correlation matrix between BS and user . As such, the CSI is divided into instantaneous CSI and global statistical information (spatial correlation matrices). If the coverage area of a BS is partitioned into small sub-areas, it is reasonable to assume that any two users collocated in the same sub-area have almost the same spatial correlation matrices. This motivates us to consider the following locally-clustered spatial channel model.

Assumption 1 (Locally-Clustered Spatial Channel): The spatial correlation matrices associated with BS belong to a finite set with the size . Furthermore, due to the local spatial scattering [7], we have
.

The massive MIMO cellular network can be represented by a topology graph.

Definition 1 (Network Topology Graph): For given spatial correlation matrices , define the topology graph of the massive MIMO cellular network as a bipartite graph , where denotes the set of all BS nodes, denotes the set of all user nodes, and is the set of all edges between the BSs and users. An edge between BS node and user node represents a wireless link between them. Each edge is associated with a CSI label . For each BS node , let denote the set of associated users and denote the set of neighbor users. For each user node , let denote the index of its serving BS and denote the set of neighbor BSs.

Fig. 1. An example of massive MIMO cellular network and the corresponding topology graph. (a) A massive MIMO cellular network with 2 BSs and 5 users. (b) The corresponding topology graph , where and .

An example of massive MIMO cellular network and the corresponding topology graph is illustrated in Fig. 1. For BS 1, the set of associated users is , and the set of neighbor users is . For user 2, the index of the serving BS is and the set of neighbor BSs is . For user 3, the index of the serving BS is and the set of neighbor BSs is .

At each time slot, linear precoding is employed at BS to support simultaneous downlink transmissions to a set of scheduled users denoted by . Let denote the set of all the selected users and denote the set of selected users who are neighbors of BS . Note that a user can be potentially interfered by BS because there is a cross link (edge) between BS and a user . For example, consider the massive MIMO cellular network in Fig. 1. Suppose the sets of selected users at the BSs are and . Then, we have and , where . Since user 3 has a cross link with BS 1 as illustrated in Fig. 1(a), it can be potentially interfered by BS 1. Using the above notations, the received signal for a user can be expressed as:

where is the data symbol, is the power allocation and is the precoding vector of user ; is the set of selected users at BS ; is the data symbol vector at BS ; is the power allocation vector at BS ; is the precoding matrix at BS ; and is the AWGN noise.

B. Hierarchical Interference Mitigation

1) Hierarchical Precoding for Intra-Cell and Inter-Cell Interference Mitigation: We propose a hierarchical precoder
for each BS as illustrated in Fig. 2. The outer precoder with (we let if ) is used to eliminate the inter-cell interference and is adaptive to the global statistical information . The inner precoder is used to realize the spatial multiplexing gain at each BS and is adaptive to the local real-time CSI . Define as the set of outer precoders for all BSs. By properly choosing , one can eliminate the inter-cell interference as shown in (2).

Fig. 2. An illustration of hierarchical precoder structure.

Remark 1: Physically, the rank of means the number of data streams for spatial multiplexing at BS . Due to limited spatial scattering [7], a BS with, say, 100 antennas cannot necessarily support spatial multiplexing of 100 data streams. In practice, there are just a few (say 10) significant eigenchannels despite having 100 antennas, and having spatially multiplexed data streams already captures most of the spatial multiplexing advantage. The remaining spatial DoFs can be used for inter-cell interference mitigation.

For a given outer precoder , we consider regularized zero-forcing (RZF) inner precoder with a parameter . The RZF precoder is easy to implement and is asymptotically optimal for [8]. Moreover, we can apply the technique of deterministic equivalent (DE) for RZF precoding in [9] to facilitate the algorithm design. For convenience, define the composite channel from BS to a subset of users as . If the inter-cell interference is completely eliminated by the outer precoders , the RZF inner precoder is given by
(1)
where is a fixed parameter for RZF. Note that is scaled by to ensure that the matrix is well conditioned as .

2) Statistical User Scheduling and Power Allocation: As grows large, the role of multi-user diversity gain (by selecting users based on instantaneous CSIT) becomes less and less effective because of channel hardening [10]. Moreover, the benefits of short timescale power allocation (i.e., the power allocation is adaptive to instantaneous CSIT) become asymptotically negligible as because the data rate of each user converges almost surely to a deterministic function of the power allocation vector as will be shown in Lemma 1. As such, the user selection set and power allocation are assumed to be adaptive to the global statistical information only. Specifically, the user selection and the outer precoder are chosen to satisfy the zero inter-cell interference constraint:
(2)

Remark 2 (Implementation Considerations): The long term controls are implemented at a central node based on the spatial correlation matrices , while the short term control (inner precoding) is implemented locally at each BS based on the local real-time instantaneous CSI knowledge . The proposed hierarchical precoding solution has several unique benefits regarding implementation: (a) it is robust to CSI signaling latency in backhaul, and (b) it resolves the issues of insufficient pilot and feedback overhead for real-time CSIT estimation in massive MIMO systems. For instance, the central node requires spatial correlation matrices to compute the long term controls . The spatial correlation matrices can be estimated via downlink training using some standard covariance matrix estimation technique [11] at the users and then fed back to the BSs. Since changes at a much slower time scale w.r.t. the time slot rate, such a design requires substantially less signaling overhead compared to the coordinated MIMO and is more robust w.r.t. the backhaul latency. On the other hand, BS only needs to know the local CSI for the inner precoder . This can be obtained via downlink channel estimation and channel feedback using conventional CSI signaling mechanisms in modern wireless systems such as LTE [12]. Since is substantially smaller than , the issue of the huge downlink pilot and CSI feedback signaling overhead in massive MIMO is also alleviated by the hierarchical precoding design.

III. OPTIMIZATION FORMULATION FOR HIERARCHICAL INTERFERENCE MITIGATION

We consider joint optimization of the outer precoders , the user selection , and the power allocation ; all of them are adaptive to the global statistical information . Define as a composite control variable. For given that satisfies (2), the instantaneous data rate (treating interference as noise) of user is
(3)
where the precoders with the inner precoder given by (1). The transmit power of BS is

Note that there may not always be enough spatial DoFs to eliminate the inter-cell interference to all the users. Hence, for a fixed composite control variable , it is possible that only part of the
users can be scheduled for transmission. For fairness considerations, we consider randomized control policy which realizes time-sharing between several composite control variables as defined below.

Definition 2 (Randomized Control Policy): A randomized control policy consists of a set of composite control variables with and a probability vector , where the -th composite control variable in is ; and satisfies . At any time slot, the composite control variable is used with probability , i.e., the outer precoders, the user selection set and the power allocation are respectively given by and with probability . Moreover, define the set of feasible control policies under per-BS power constraint as

where is the set of feasible composite control variables, and is the set of admissible composite control variables.

For given control policy and spatial correlation matrices , the conditional average data rate of user is:

The network performance is characterized by a utility function , where is the conditional average rate vector. We make the following assumptions on ( is a simplified notation for ).

Assumption 2 (Assumptions on Utility): The utility function can be expressed as , where is the weight for user ; is assumed to be a twice differentiable, concave and increasing function for all . Moreover, is L-Lipschitz, i.e.,

for some constant .

The above utility function captures a lot of interesting cases:
- Alpha-fair [13]: Alpha-fair can be used to compromise between the fairness to users and the utilization of resources. The utility function is (see footnote 3)
(4)
where is a small number.
- Proportional Fair (PFS) [14]: This is a special case of alpha-fair when .

Footnote 3: In the original alpha-fair utility function in [13], is equal to zero. In this paper, we set so that Assumption 2 can be satisfied. Since is very small, it has negligible effect on the performance. The utility function in (4) is also scaled by to ensure that it is bounded as .

For a given topology graph and per-BS power constraint , the problem of interference mitigation via hierarchical precoding can be formulated as (see footnote 4)

Footnote 4: Note that the set of feasible control policies depends on since the set of neighbor users of BS depends on .

Note that the conditional average rate in the utility function and the conditional average power in the constraint function of do not have closed form expressions. To make the problem tractable, we need to address the following challenge.

Challenge 1 (Closed Form Approximation for ). Find an approximated problem with closed form objective and constraints such that the solution of is asymptotically optimal w.r.t. as grows large.

We resort to random matrix theory to solve the above challenge. Specifically, we first derive deterministic equivalents (DEs) [9] for the conditional average rate and power. Then we obtain an approximated problem by replacing the conditional average rate and power with their DE approximations. Finally, we show that the solution of is an -optimal solution of .

Definition 3 ( -Optimal Solution): A solution is called an -feasible solution of if it satisfies the zero inter-cell interference constraint and the following relaxed per-BS power constraint

It is called an -optimal solution of if it is an -feasible solution and , where is the optimal objective value of .

Throughout the paper, the notation refers to and such that . For technical reasons, we require the following assumptions.

Assumption 3 (Technical Assumptions for DE):
1) All spatial correlation matrices have uniformly bounded spectral norm on , i.e.,
(5)
Moreover, .
2) All the random matrices have uniformly bounded spectral norm on with probability one, i.e.,

3) .

Assumption 3-1) is satisfied by many MIMO channel models such as the angular domain MIMO channel model in [7] and it is a standard assumption in the literature, see e.g., [9], [15]. Under Assumption 3-1), Assumption 3-2)
holds true if , that is, if belongs to a finite family [9]. According to the locally-clustered channel model in Assumption 1, we have and thus Assumption 3-2) holds true. Assumption 3-3) is to ensure that the utility function is bounded as .

Lemma 1 (DE of Rate and Power): Let Assumption 3 hold true and consider composite control variable satisfying: 1) ; 2) the corresponding user selection satisfies . Then we have

for sufficiently small , where
(6)
(7)
are the deterministic equivalent (DE) of user rate and BS transmit power, and form the unique solution of
(8)
with .

Please refer to Appendix A for the proof.

Remark 3: Note that the above DEs are established on the conditional distribution of the channel (conditioned on the statistics ). Given a realization of (the statistics), the control actions are all fixed (because they are adaptive to only). As such, the conditional measure of (conditioned on the given ) will exhibit random matrix theory behavior and the DE convergence in Lemma 1 can be proved using standard techniques in [9]. On the other hand, if were adaptive to the instantaneous CSI (short-term user selection), then conditioned on would be random and hence the DE approximation would fail (due to the random or extreme value effect of the user selection which changes the underlying conditional distribution of the channels ). Similar conclusion has also been made in [16] that the DE of the data rate in massive MIMO system is valid as long as the user selection is independent of the instantaneous CSI .

Based on Lemma 1, we have the following result.

Theorem 1 (Asymptotic -Equivalence of ): Let denote the optimal solution of , where , and . Given Assumption 3 and for sufficiently small , is an -optimal solution of as .

Please refer to Appendix B for the proof. By Theorem 1, the solution of can be approximated by the solution of , and the approximation is -optimal as .

IV. SOLUTION OF

In the rest of the paper, we focus on solving for fixed . We will use and as simplified notations for and when there is no ambiguity. Clearly, the utility function is not a convex function on and thus is a non-convex optimization problem. Moreover, the optimization variables in involve a set of composite control variables with undetermined size and the associated probabilities with undetermined dimension. It is in general very difficult to find the global optimal solution for such a non-convex problem. In this section, we are going to address the following challenge.

Challenge 2 (Design a Global Convergent Algorithm for ). Exploit the specific structure of problem to design an iterative algorithm that converges to the global optimal solution of .

We first study the optimality condition of . Then we propose an iterative algorithm to solve Challenge 2.

A. Global Optimality Condition of

It is difficult to find a simple characterization for the necessary and sufficient global optimality condition of a general non-convex problem. However, problem is not an arbitrary non-convex problem but has some specific hidden convexity structure, which can be exploited to derive the global optimality condition for as shown below.

We first study the hidden convexity of . Define the (deterministic equivalent of) average rate region as:
(9)
where with . Then we have the following Lemma.

Lemma 2 (Convexity of ): , where denotes the convex hull operation and .

Please refer to Appendix C for the proof.

The following lemma shows that problem is equivalent to a convex problem:
(10)

Lemma 3 (Equivalence Between and (10)): If is the global optimal solution of , then is the optimal solution of problem (10); on the other hand, if is the optimal solution of problem (10), then any satisfying is also the global optimal solution of .
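
The role of time-sharing in Lemma 2 and Lemma 3 can be illustrated numerically. The following sketch is not taken from the paper: it assumes two hypothetical composite control variables whose DE rate vectors are the made-up values `r_a` and `r_b`, and a log utility with a small offset `eps` as an illustrative concave choice. It shows that a randomized policy reaches rate vectors in the convex hull of the deterministic ones, and that a point inside the hull can have a higher utility than either endpoint.

```python
import numpy as np

def utility(r, eps=1e-3):
    """Concave, increasing utility of an average-rate vector (illustrative log utility)."""
    return float(np.sum(np.log(np.asarray(r, dtype=float) + eps)))

def time_shared_rate(rates, q):
    """Average rate vector when control variable j is used with probability q[j]."""
    rates = np.asarray(rates, dtype=float)
    q = np.asarray(q, dtype=float)
    assert np.all(q >= 0) and abs(q.sum() - 1.0) < 1e-9
    return q @ rates  # convex combination of the per-variable rate vectors

# Hypothetical DE rate vectors of two composite control variables (2 users).
r_a = np.array([2.0, 0.0])  # only user 1 is scheduled
r_b = np.array([0.0, 2.0])  # only user 2 is scheduled

# Search the time-sharing probability on a grid.
best_q, best_u = None, -np.inf
for p in np.linspace(0.0, 1.0, 101):
    u = utility(time_shared_rate([r_a, r_b], [p, 1.0 - p]))
    if u > best_u:
        best_q, best_u = p, u
```

In this toy instance the best policy splits time evenly between the two control variables, which neither deterministic choice can match under a fairness-oriented utility.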
Please refer to Appendix C for the proof. This hidden convexity of (i.e., the equivalence between and (10)) is the key to derive the global optimality condition of . Note that although problem (10) is convex, the solution is still non-trivial because there is no simple characterization for its feasible set .

Fig. 3. Summary of overall solution and the inter-relationship of the algorithm components for both Algorithm E (with Procedure , composite control variable and output ) and the modified Algorithm E (with Procedure W, composite control variable and output ). The iteration number is omitted for simplicity. Each square represents an algorithm component and the corresponding square bracket explains the function of this algorithm component.

To derive the global optimality condition of , we also need the first order optimality condition of problem (10) as summarized in the following lemma.

Lemma 4 (First Order Optimality Condition of (10)): A solution is optimal for problem (10) if and only if
(11)
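
A first-order condition of this kind can be checked numerically on a toy rate region. The sketch below is only an illustration and makes two assumptions that are not confirmed by the text: that (11) has the standard form for maximizing a concave differentiable utility over a convex set (no feasible direction has a positive directional derivative), and that the rate region is the convex hull of a few made-up vertex rate vectors. Because a linear function attains its maximum over a convex hull at a vertex, the check reduces to a weighted sum-rate maximization over the vertices, with the utility gradient as the weight vector.

```python
import numpy as np

def grad_utility(r, eps=1e-3):
    """Gradient of the illustrative utility U(r) = sum(log(r + eps))."""
    return 1.0 / (np.asarray(r, dtype=float) + eps)

def first_order_gap(r, vertices, eps=1e-3):
    """max_v grad U(r)^T (v - r) over the hull of `vertices`.

    The maximum of a linear function over a convex hull is attained at a vertex.
    For r inside the hull the gap is non-negative, and r maximizes the concave
    utility over the hull exactly when the gap is zero.
    """
    g = grad_utility(r, eps)
    return float(np.max(np.asarray(vertices, dtype=float) @ g) - np.asarray(r) @ g)

# Hypothetical vertex rate vectors (one per composite control variable).
vertices = np.array([[2.0, 0.0], [0.0, 2.0], [0.8, 0.8]])

r_mid = np.array([1.0, 1.0])     # even time-sharing of the first two vertices
r_vertex = np.array([0.8, 0.8])  # the third vertex used alone

gap_mid = first_order_gap(r_mid, vertices)
gap_vertex = first_order_gap(r_vertex, vertices)
```

Here `gap_mid` vanishes, so the time-shared point passes the check, while `gap_vertex` is strictly positive, which signals that moving toward another vertex would increase the utility.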
Finally, from Lemma 3 and Lemma 4, we can obtain the necessary and sufficient global optimality condition for problem as follows.

Theorem 2 (Global Optimality Condition of ): A control policy with is a global optimal solution of if and only if satisfies:
(12)
where and the weight vector .

The detailed proof can be found in Appendix C.

B. Global Optimal Solution of

Just as we can obtain the optimal solution of a convex problem by solving its KKT conditions, we can also obtain the global optimal solution of by solving the global optimality condition in Theorem 2. Specifically, for any given spatial correlation matrices , we propose Algorithm E to achieve the global optimality condition of by iteratively updating the optimization variables and the weight vector in Theorem 2.

Algorithm E (Top level algorithm for solving ):
Initialization: Set and let . Call Procedure with input to obtain a composite control variable and let .
Step 1) (Update probability vector ): Call Procedure Q with input to obtain the updated probability vector . Let and , where . Let
Step 2) (Update composite control variable set ): Let
(13)
Call Procedure with input to obtain a new composite control variable . Update as
(14)
Step 3) : If and , where is a small number, terminate the algorithm. Otherwise, let and return to Step 1.

Fig. 3 summarizes the inter-relationship between the components of Algorithm E. Algorithm E contains two procedures (subroutines) which will be elaborated below.

Remark 4: Algorithm E can be interpreted as the Frank-Wolfe Algorithm (also known as the conditional gradient algorithm) with exact line search [17] applied on the equivalent convex problem in (10). Compared to the conventional Frank-Wolfe Algorithm, the main difference is that the optimization variable in problem is instead of in (10), and the optimization w.r.t. is non-convex.

1) Procedure Q (Optimization of for Fixed ): For given input , Procedure Q essentially solves the optimal probability vector for with fixed , i.e., Procedure Q with input is a standard convex optimization procedure to solve the following optimization problem:
(15)
where is the -th composite control variable in . Hence, Procedure Q can be efficiently implemented by existing convex optimization methods/software. As such, the pseudo code of Procedure Q is omitted here for conciseness.

2) Procedure (Finding a New Composite Control Variable for Given ): The pseudo code of Procedure is summarized in Table I. In Line 2, is the unique solution of (8) with , where . In Line 4, is the (deterministic equivalent of) weighted sum-rate for given user selection . In Line 7, . For convenience, is referred to as the effective channel gain of user and is called the projected spatial correlation matrix of user . For conciseness, is denoted as when there is no ambiguity. To calculate the weighted sum-rate , we need to obtain the effective channel gains associated with
by solving the fixed point equation in (8). The solution of (8) can be obtained using the following fixed point iterations [9]
(16)
with initial point , where .

TABLE I
PROCEDURE (FOR SOLVING CONDITION (17))

TABLE II
PROCEDURE W (FOR SOLVING CONDITION (17))

For given input , Procedure essentially finds a composite control variable which satisfies the global optimality condition in (12) for fixed .

Theorem 3 (Characterization of Procedure ): For given input , the output of Procedure satisfies
(17)

Please refer to Appendix D for the proof.

3) Convergence and Performance of Algorithm E: The update rule in Algorithm E is designed according to the global optimality condition in Theorem 2. As a result, it can be shown that Algorithm E converges to the global optimal solution of using the global optimality condition in Theorem 2 and the property of Algorithm E in the following Lemma.

Lemma 5 (Property of Algorithm E): Let be the control policy in the -th iteration of Algorithm E. We have
(18)
where is given in (13).

Please refer to Appendix E for the proof.

Using Theorem 2 and Lemma 5, we obtain the following global convergence result.

Theorem 4 (Global Optimality of Algorithm E): Algorithm E monotonically increases the utility and , where is the global optimal value of .

Please refer to Appendix E for the proof.

In step 2 of Algorithm E, we need to call Procedure , which involves an exhaustive user selection process where is calculated for all possible user sets (see Line 1 to Line 6 of Procedure ). The complexity of exhaustive user selection is exponential w.r.t. the number of users . In the next subsection, we will propose a low complexity solution, named the modified Algorithm E, for by replacing the exhaustive user selection process with a statistical greedy user selection process.

C. Low Complexity Solution of

The low complexity solution (modified Algorithm E) is obtained by replacing the exact solution of (17) in step 2 (and the initialization step) of Algorithm E with an approximate solution found by a low complexity procedure named Procedure W. In other words, the modified Algorithm E is the same as Algorithm E except that Procedure (which involves exhaustive user selection) is replaced by the low complexity counterpart Procedure W (which is based on statistical greedy user selection).

The pseudo code of Procedure W is summarized in Table II. In Line 3 and 4, the weighted sum-rate for any given can be calculated using the same method as described in Procedure . Clearly, the statistical greedy user selection loop between Line 2 and Line 9 converges to a solution within iterations.

Fig. 3 summarizes the overall low complexity solution and the inter-relationship between the components of the modified Algorithm E. To justify the modified Algorithm E, we need to address the following challenge.

Challenge 3 (Monotone Convergence of the modified Algorithm E). Prove the monotone convergence of the modified Algorithm E as well as characterize the performance loss of the modified Algorithm E w.r.t. the global optimal solution.

The following theorem provides a solution to Challenge 3.

Theorem 5 (Convergence of the Modified Alg. E): The modified Algorithm E monotonically increases the utility and . Moreover, the gap of with the optimal utility of is bounded by

where can be any accumulation point of the iterates generated by the modified Algorithm E, is the output of Procedure with input and is the output of Procedure W with input .

Please refer to Appendix F for the proof. Theorem 5 states that the performance gap between the modified Algorithm E
and (the optimal) Algorithm E is upper bounded by the performance gap (in terms of weighted sum-rate) between Procedure W (statistical user selection) and Procedure (exhaustive user selection).

TABLE III
COMPARISON OF THE PER TIME SLOT MATLAB COMPUTATIONAL TIME AND PER TIME SLOT PER CELL SIGNALING OVERHEAD OF DIFFERENT SCHEMES. ASSUME THAT THE SYSTEM BANDWIDTH IS 1 MHZ, AND THE SPATIAL CHANNEL CORRELATION MATRICES CHANGE EVERY 1000 TIME SLOTS. THE OTHER SIMULATION SETUP IS THE SAME AS FIG. 4. THE REAL-TIME CSI ESTIMATION OVERHEAD INCLUDES THE PILOT SYMBOL OVERHEAD AND THE UPLINK CSI FEEDBACK OVERHEAD. FOR EXAMPLE, THE REAL-TIME CSI ESTIMATION OVERHEAD OF THE PROPOSED SCHEME IS ABOUT 22 PS, 9 , WHICH MEANS THAT ON AVERAGE, THE PROPOSED SCHEME REQUIRES TRANSMITTING 22 INDEPENDENT PILOT SYMBOLS AND FEEDING BACK 9 COMPLEX CHANNEL VECTORS WITH AVERAGE DIMENSION 22 PER TIME SLOT PER CELL

Fig. 4. Throughput comparisons over different schemes. The user speed is 3 km/h.

Fig. 5. Average cell throughput versus the per BS transmit power . The user speed is 3 km/h.

Complexity Analysis for the Modified Algorithm E: For simplicity, we assume . Suppose that the fixed point iterations in (16) converge to the desired accuracy in iterations. Then it can be shown that the overall complexity of Procedure W is upper bounded by matrix multiplications, matrix inversions and Gram-Schmidt processes. This is also the order of the per iteration complexity for the modified Algorithm E because in each iteration of the modified Algorithm E, the computation complexity is dominated by Procedure W.
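
To make the contrast between exhaustive and greedy user selection concrete, the sketch below shows a generic greedy selection loop. It is not the pseudo code of Procedure W from Table II: the function `greedy_select`, the toy objective `toy_weighted_sum_rate`, and the numbers in `GAINS` and `DOF` are all hypothetical and chosen only for illustration. In the paper's setting the objective would be the DE weighted sum-rate of a candidate user set.

```python
import math
from typing import Callable, FrozenSet, Iterable

def greedy_select(users: Iterable[int],
                  objective: Callable[[FrozenSet[int]], float]) -> FrozenSet[int]:
    """Add, one at a time, the user that most increases `objective`.

    Stops as soon as no remaining user improves the objective, so the loop
    runs at most once per user, versus 2^K evaluations for exhaustive search.
    """
    remaining = set(users)
    selected: FrozenSet[int] = frozenset()
    best = objective(selected)
    while remaining:
        cand, cand_val = None, best
        for k in sorted(remaining):
            val = objective(selected | {k})
            if val > cand_val:
                cand, cand_val = k, val
        if cand is None:  # no single addition helps any more
            break
        selected, best = selected | {cand}, cand_val
        remaining.remove(cand)
    return selected

# Toy objective (assumption): every extra scheduled user reduces the gain share
# available to each user, mimicking the loss of spatial DoF.
GAINS = {0: 8.0, 1: 4.0, 2: 1.0, 3: 0.2}
DOF = 4

def toy_weighted_sum_rate(S: FrozenSet[int]) -> float:
    share = 1.0 - (len(S) - 1) / DOF
    return sum(math.log2(1.0 + GAINS[k] * share) for k in S)

chosen = greedy_select(GAINS, toy_weighted_sum_rate)
```

With these toy numbers the loop keeps the two strongest users and stops, because adding a third user lowers the objective.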
V. SIMULATION RESULTS
Consider a cellular network with 19 cells. The inter-site distance is 500 m. In each cell, there are 2 uniformly distributed hotspots with a radius of 50 m. There are 12 users in one cell, 2/3 of whom are clustered around the hotspots, while the others are uniformly distributed within the cell. Each BS is equipped with antennas. The spatial correlation matrices are generated according to , where the path gains are generated using the path loss model (Urban Macro NLOS model) in [18], and the normalized spatial correlation matrices with and are randomly generated. In the simulations, we set the parameter for RZF as . We compare the performance of the proposed algorithm with the following two baselines.
Baseline 1 (FFR): Fractional frequency reuse (FFR) [19] is applied to suppress the inter-cell interference. In each cell, ZF beamforming is used to serve the users on each subband.
Baseline 2 (Clustered CoMP): 3 neighboring BSs form a cluster and employ cooperative ZF [6] to simultaneously serve all the users within the cluster.

A. Performance Evaluation Under PFS Utility
Consider the PFS utility with . The per BS transmit power is dB. In Fig. 4, we compare the average cell throughput of different schemes under different backhaul latencies. For baseline 2, the 3 cooperative BSs need to exchange CSI and payload data, and thus there is CSI delay when the backhaul latency is not zero. When there is CSI delay, the outdated CSI is related to the actual CSI by the autoregressive model in [20]. It can be seen that the cell throughput of the proposed scheme is close to that of baseline 2 with zero backhaul latency and is much larger than that of baseline 1. The worst 10% users also benefit from a large throughput gain over baseline 1. Although the performance of baseline 2 is promising at zero backhaul latency, it quickly degrades at 10 ms backhaul latency. These results demonstrate the superior performance and the robustness of the proposed hierarchical interference mitigation w.r.t. signaling latency in the backhaul. Table III compares the computational complexity (CPU time) and signaling overhead of different schemes. The computational complexity and the backhaul signaling overhead of the proposed scheme are similar to those of FFR, and are much lower than those of CoMP. The real-time CSI estimation overhead of the proposed scheme is lower than that of both FFR and CoMP.

B. Performance Evaluation Under Sum-Rate Utility
Consider the sum-rate utility. In Fig. 5, we plot the average cell throughput of different schemes versus the per BS transmit power . It can be seen that the cell throughput of
the proposed scheme is close to that of baseline 2 with zero backhaul latency and is much larger than that of baseline 1. When there is a backhaul latency of 10 ms, the proposed scheme also has a significant throughput gain over baseline 2. The DE of the cell throughput is also plotted for the proposed scheme. It can be seen that the DE is very accurate.

VI. CONCLUSION
We propose a hierarchical interference mitigation scheme for massive MIMO cellular networks. The MIMO precoder is partitioned into an inner precoder (for intra-cell interference control) and an outer precoder (for inter-cell interference control). We study joint optimization of the outer precoders, the user selection, and the power allocation. The optimization only requires the knowledge of spatial correlation matrices and thus is robust to backhaul latency. We first apply random matrix theory to obtain an approximated problem which is non-convex. Then using the hidden convexity of the problem, we propose Algorithm E to obtain the global optimal solution and a low complexity version of Algorithm E to find a sub-optimal solution. Simulations show that the proposed design achieves significant performance gain over various state-of-the-art baselines.

APPENDIX A
PROOF OF LEMMA 1
Under the zero inter-cell interference constraint in (2), the -th cell can be viewed as a single-cell downlink system with equivalent channels . Following similar analysis as in the proof of ([9], Theorem 2), the following lemma can be proved.
Lemma 6: Let Assumption 3 hold true. As , we have and , where
(19)
(20)
where and are given by
(21)
(22)
with , and given by

Following similar analysis as in the proof of ([9], Theorem 3), it can be shown that and . Then it follows that and . From this and Lemma 6, Lemma 1 follows immediately.

APPENDIX B
PROOF OF THEOREM 1
Let be the optimal solution of Problem . It can be proved by contradiction that the control policies and must satisfy: and . Define two sets

Let denote a control policy that satisfies , and . Let denote a control policy that satisfies , and . It can be shown that as , we have
(23)
for or .
For composite control variable satisfying the conditions in Lemma 1, it can be shown that and are uniformly integrable [21] w.r.t. . Together with Lemma 1, it follows that
(24)
(25)
By definition, we have
(26)
Then it follows from (25) and (26) that
(27)
Similarly, it can be shown that
(28)
We expand as follows
(29)
Note that in (29), we have used as an abbreviation for . From (24), we have
(30)
for . Then it follows from (30) and that
(31)
From (23), (28) and the definition of and , we have
(32)
Then it follows from (23), (29), (31), (32) that

This completes the proof for Theorem 1.

APPENDIX C
PROOFS FOR THE RESULTS IN SECTION IV.A
Proof of Lemma 2: Clearly, . Hence, we only need to prove that any Pareto boundary point of must lie in . First, it is easy to see that can always be expressed as a convex combination of points in , i.e., , where and . Second, must lie in the supporting hyperplane to at the Pareto boundary point . Otherwise, cannot be a Pareto boundary point of . The above two facts imply that can be expressed as a convex combination of points in the set , i.e., , where and . Hence, must lie in .
Proof of Lemma 3: The first part of Lemma 3 follows directly from the definition of problem (10) and . The second part of Lemma 3 can be proved by contradiction. Suppose satisfies but is not the global optimal solution of . Then there exists a control policy such that . Then compared to achieves a larger objective value for problem (10), which contradicts the assumption that is the optimal solution of problem (10).
Proof of Theorem 2: Suppose with satisfies the optimality condition in Theorem 2. It follows from (12) that
(33)
and . Using the above fact and noting that , we have
(34)
Combining (33) and (34), we have
(35)
By Lemma 4, is the optimal solution of problem (10). Then it follows from Lemma 3 that is the global optimal solution of .
On the other hand, suppose is the optimal solution of . By Lemma 3, is the optimal solution of (10). Then by Lemma 4, satisfies (35), from which it can be shown that satisfies the optimality condition in (12).

APPENDIX D
PROOF OF THEOREM 3
It can be seen that the optimal solution of the following WSRM problem satisfies (17)
(36)
Hence, we only need to prove that the output of Procedure is the optimal solution of .
First, we show that is equivalent to a joint user selection and power allocation problem.
Lemma 7 (Equivalence of ): Let denote an optimal solution of
(37)
Then is an optimal solution of , where with ; and .
Proof: Lemma 7 can be proved by contradiction. First, it is easy to see that is a feasible solution of , i.e., . Suppose that is not an optimal solution of . Then there exists such that . Since satisfies the zero inter-cell interference constraint in (2), we must have . Let , where with . It can be shown that and . Let , where with . It is easy to see that satisfies (2) and . Using the fact that , where , it can be shown that , which implies that is a feasible solution of Problem (37). Hence, we have , which contradicts . This completes the proof.
It can be verified that in Line 6 of Procedure is the optimal solution of (37). By Lemma 7, the output of Procedure is the optimal solution of .
APPENDIX E
PROOFS FOR THE RESULTS IN SUBSECTION IV.B3
Proof of Lemma 5: Note that is equal to the optimal value of problem (15) with . If we restrict , problem (15) reduces to problem (18). Hence, must be no less than the optimal value of (18).
Proof of Theorem 4: Using the fact that any Pareto point of a -dimensional convex polytope in can be expressed as a convex combination of no more than vertices, it can be shown that there are at most non-zero elements in in step 1 of Algorithm E. Hence and the solution found by Algorithm E is feasible.
For simplicity of notation, let and . By Lemma 5, we have . Since the objective value is upper bounded, the following lemma holds.
Lemma 8: Let be the iterates generated by Algorithm E. We have for some .
By Assumption 2, is L-Lipschitz, which implies that is also L-Lipschitz with the L constant given by . It is well known that the following lemma holds for an L-Lipschitz function.
Lemma 9: If is L-Lipschitz, i.e.,

for some constant , then

Let and . By definition, we have . With the above two lemmas, we will show that , which implies that is the global optimal value (this is because means that satisfies the global optimality condition in (12)). From Lemma 9, we have

Note that for some constant (this is because and is clearly a bounded region). Then we have
(38)
where is given by
(39)
Clearly, we have
(40)
where the equality holds if and only if . From (38)-(40), we have . By Lemma 5, we have . Hence
(41)
By Lemma 8, we have
(42)
Then it follows from (41) and (42) that
(43)
Combining (43) and the fact that , we have . This completes the proof.

APPENDIX F
PROOF OF THEOREM 5
Using similar analysis as in the proof of Lemma 5, it can be shown that under the modified Algorithm E. Since the objective value is upper bounded, we have for some . Following similar analysis as that for (43), it can be shown that any accumulation point of the iterates generated by the modified Algorithm E satisfies
(44)
Moreover, it follows from that . Let denote the optimal solution of . Since is the gradient of (by definition) and is a concave function, we have

where the last inequality follows from and (44).

REFERENCES
[1] F. Rusek, D. Persson, B. K. Lau, E. Larsson, T. Marzetta, O. Edfors, and F. Tufvesson, "Scaling up MIMO: Opportunities and challenges with very large arrays," IEEE Signal Process. Mag., vol. 30, no. 1, pp. 40-60, Jan. 2013.
[2] C. Peel, B. Hochwald, and A. Swindlehurst, "A vector-perturbation technique for near-capacity multiantenna multiuser communication-part I: Channel inversion and regularization," IEEE Trans. Commun., vol. 53, no. 1, pp. 195-202, Jan. 2005.
[3] M. Schubert and H. Boche, "Iterative multiuser uplink and downlink beamforming under SINR constraints," IEEE Trans. Signal Process., vol. 53, no. 7, pp. 2324-2334, Jul. 2005.
[4] A. Gershman, N. Sidiropoulos, S. Shahbazpanahi, M. Bengtsson, and B. Ottersten, "Convex optimization-based beamforming," IEEE Signal Process. Mag., vol. 27, no. 3, pp. 62-75, 2010.
[5] G. Foschini, K. Karakayali, and R. Valenzuela, "Coordinating multiple antenna cellular networks to achieve enormous spectral efficiency," Proc. Inst. Electr. Eng.-Commun., vol. 153, no. 4, pp. 548-555, Aug. 2006.
[6] O. Somekh, O. Simeone, Y. Bar-Ness, A. Haimovich, and S. Shamai, "Cooperative multicell zero-forcing beamforming in cellular downlink channels," IEEE Trans. Inf. Theory, vol. 55, no. 7, pp. 3206-3219, 2009.
[7] D. Tse and P. Viswanath, Fundamentals of Wireless Communication. Cambridge: Cambridge Univ. Press, 2005.
[8] R. Zakhour and S. Hanly, "Base station cooperation on the downlink: Large system analysis," IEEE Trans. Inf. Theory, vol. 58, no. 4, pp. 2079-2106, Apr. 2012.
[9] S. Wagner, R. Couillet, M. Debbah, and D. T. M. Slock, "Large system analysis of linear precoding in correlated MISO broadcast channels under limited feedback," IEEE Trans. Inf. Theory, vol. 58, no. 7, pp. 4509-4537, Jul. 2012.
[10] A. Tomasoni, G. Caire, M. Ferrari, and S. Bellini, "On the selection of semi-orthogonal users for zero-forcing beamforming," in Proc. IEEE ISIT 2009, 2009, pp. 1100-1104.
[11] X. Mestre, "Improved estimation of eigenvalues and eigenvectors of covariance matrices using their sample estimates," IEEE Trans. Inf. Theory, vol. 54, no. 11, pp. 5113-5129, 2008.
[12] Long Term Evolution of the 3GPP Radio Technology. 3GPP, 2006. [Online]. Available: http://www.3gpp.org/Highlights/LTE/LTE.htm
[13] J. Mo and J. Walrand, "Fair end-to-end window-based congestion control," IEEE/ACM Trans. Netw., vol. 8, no. 5, pp. 556-567, Oct. 2000.
[14] F. Kelly, A. Maulloo, and D. Tan, "Rate control for communication networks: Shadow price proportional fairness and stability," J. Oper. Res. Soc., vol. 49, pp. 237-252, 1998.
[15] J. Hoydis, S. Ten Brink, and M. Debbah, "Massive MIMO in the UL/DL of cellular networks: How many antennas do we need?," IEEE J. Sel. Areas Commun., vol. 31, no. 2, pp. 160-171, Apr. 2013.
[16] A. Adhikary, J. Nam, J. Ahn, and G. Caire, "Joint spatial division and multiplexing: The large-scale array regime," IEEE Trans. Inf. Theory, 2013.
[17] M. Frank and P. Wolfe, "An algorithm for quadratic programming," Naval Res. Logistics Quart., vol. 3, no. 1-2, pp. 95-110, 1956.
[18] Technical Specification Group Radio Access Network; Further Advancements for E-UTRA Physical Layer Aspects. 3GPP TR 36.814. [Online]. Available: http://www.3gpp.org
[19] H. Lei, L. Zhang, X. Zhang, and D. Yang, "A novel multi-cell OFDMA system structure using fractional frequency reuse," in Proc. IEEE Int. Symp. Pers., Indoor Mobile Radio Commun., Sep. 2007, pp. 1-5.
[20] K. Baddour and N. Beaulieu, "Autoregressive modeling for fading channel simulation," IEEE Trans. Wireless Commun., vol. 4, no. 4, pp. 1650-1662, Jul. 2005.
[21] D. Williams, Probability With Martingales. Cambridge, U.K.: Cambridge Univ. Press, 1997.

An Liu (S'07-M'09) received the Ph.D. and the B.S. degree in electrical engineering from Peking University, China, in 2011 and 2004, respectively. From 2008 to 2010, he was a visiting scholar at the Department of ECEE, University of Colorado at Boulder. From 2011 to 2013, he was a Postdoctoral Research Fellow with the Department of ECE, HKUST, and he is currently a Visiting Assistant Professor. His research interests include wireless communication, stochastic optimization and compressive sensing.

Vincent K. N. Lau (SM'04-F'12) received the B.Eng. (Distinction 1st Hons.) from the University of Hong Kong in 1992 and the Ph.D. degree from Cambridge University, Cambridge, U.K., in 1997. He was with HK Telecom (PCCW) as a System Engineer from 1992 to 1995, and with Bell Labs - Lucent Technologies as a member of Technical Staff during 1997-2003. He then joined the Department of ECE, HKUST, and is currently a Professor. His current research interests include the robust and delay-sensitive cross-layer scheduling of MIMO/OFDM wireless systems, cooperative and cognitive communications, dynamic spectrum access, as well as stochastic approximation and Markov decision process.
