
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.



Weighted Sum Rate Optimization for Downlink Multiuser MIMO Coordinated Base Station Systems: Centralized and Distributed Algorithms
Tadilo Endeshaw Bogale, Student Member, IEEE, and Luc Vandendorpe, Fellow, IEEE

Abstract: This paper considers the joint linear transceiver design problem for downlink multiuser multiple-input multiple-output (MIMO) systems with coordinated base stations (BSs). We consider maximization of the weighted sum rate under a per BS antenna power constraint. We propose novel centralized and computationally efficient distributed iterative algorithms that achieve a local optimum of this problem. The algorithms proceed as follows. First, by introducing additional optimization variables, we reformulate the original problem into a new problem. Second, for given precoder matrices of all users, the optimal receivers are computed using the minimum mean-square-error (MMSE) method and the optimal introduced variables are obtained in closed form. Third, keeping the introduced variables and receivers constant, the precoder matrices of all users are optimized using second-order-cone programming (SOCP) and matrix fractional minimization approaches for the centralized and distributed algorithms, respectively. Finally, the second and third steps are repeated until the algorithms converge. We show that the proposed algorithms are guaranteed to converge and that they require less computational cost than the existing linear algorithm. In all simulations, our distributed algorithm achieves the same performance as the centralized algorithm, and the proposed algorithms outperform the existing linear algorithm. In particular, when each user has a single antenna, we have observed that the proposed algorithms achieve the global optimum.

Index Terms: Rate, matrix fractional minimization, MMSE, multiuser MIMO, distributed optimization, convex optimization.

I. INTRODUCTION

Multiple-input multiple-output (MIMO) systems have been proven to enhance the spectral efficiency of wireless systems. This performance improvement is achieved by employing signal processing at the transmitters (precoders) and receivers (decoders). In [1], the achievable sum rate of the broadcast channel (BC) obtained by the dirty paper coding (DPC) technique has been characterized for MIMO systems. The authors of [2] and [3] have shown that DPC achieves the capacity region of the BC. However, due to the non-linear characteristics of DPC, its practical realization has proven to be difficult.
The authors would like to thank the Région Wallonne for the financial support of the project MIMOCOM in the framework of which this work has been achieved. Part of this work has been published in the International Conference on Communications (ICC), Kyoto, Japan, Jun. 2011. Tadilo Endeshaw Bogale and Luc Vandendorpe are with the ICTEAM Institute, Université catholique de Louvain, Place du Levant 2, 1348 Louvain-la-Neuve, Belgium. Email: {tadilo.bogale, luc.vandendorpe}@uclouvain.be, Phone: +3210478071, Fax: +3210472089.

Given the drawbacks of DPC, linear processing is motivated as it exhibits a good performance versus complexity trade-off. However, finding linear processing schemes that achieve the capacity of the BC is still an open issue. In [4], a linear processing method that employs channel block-diagonalization is suggested. This method suffers from noise enhancement and restricts the number of transmit and receive antennas. In [5], the weighted sum rate maximization problem for the downlink multiuser MIMO system is formulated as the problem of minimizing the geometric product of minimum mean-square-errors (MMSE). That paper solves its problem with a per BS antenna power constraint. The same problem has also been examined in [6] with a total BS power constraint. To solve the optimization problem, an iterative approach which uses mean-square-error (MSE) uplink-downlink duality is suggested. Minimizing the product of all users' MMSE matrix determinants is proposed as an equivalent formulation of the sum rate maximization problem for downlink multiuser MIMO systems [7]. This problem is non-convex and is solved by employing sequential quadratic programming. The work of [7] has been extended to the robust case in [8]. The latter paper formulates the robust problem using the worst-case robust design approach, and utilizes the MSE uplink-downlink duality approach to solve the sum rate maximization problem. All of the aforementioned papers examine their problems for conventional downlink systems. In these systems, BSs from different cells communicate with their respective remote terminals independently. Hence, inter-cell interference has to be treated as background noise. Recently, it has been shown that BS coordination is a promising technique to significantly improve the capacity of wireless channels by mitigating (or possibly canceling) inter-cell interference [9]-[11]. BS coordination can be performed by two approaches.
In the first approach, BSs are coordinated at the beamforming (precoder) level. With this kind of BS coordination, the system is termed a multi-cell system [10]. In the second approach, BS coordination takes place at both the signal and beamforming (precoder) levels. When BSs are coordinated in this way, the system is termed a network MIMO system [9], [11]. It is well known that the latter coordination approach has better performance gain compared to the former one [11], [12]. This performance improvement, however, requires additional signal coordination. In the current paper, we focus on the second BS coordination approach. In [13], we examine the joint optimization of the

Copyright (c) 2011 IEEE. Personal use is permitted. For any other purposes, permission must be obtained from the IEEE by emailing pubs-permissions@ieee.org.


precoders to maximize the total sum rate with a per BS antenna power constraint for downlink multiuser systems with coordinated BSs. That paper assumes that the BSs are equipped with multiple antennas and the mobile stations (MSs) are equipped with a single antenna. In [14], four MSE-based linear transceiver optimization problems have been considered for downlink multiuser MIMO systems with coordinated BSs. These problems are examined by assuming that the total power of each BS or the individual power of each BS antenna (group of antennas) is constrained. The problems of [14] are solved as follows. First, by keeping the receivers constant, optimization of the precoder matrices is formulated as a second-order-cone programming (SOCP) problem (SOCP problems are convex and can be solved by using existing convex optimization tools). Second, for the given BS precoders, the receiver of each user is optimized by the MMSE technique. These two steps are repeated iteratively to jointly optimize the transmitters and receivers. In [14], the receiver of each user can be optimized independently and distributively. However, the joint optimization of the precoders in [14] is carried out by a centralized algorithm. When the number of users and/or BSs increases, the computational cost of the joint precoder design also increases [15]. Consequently, solving the precoder optimization problem in a centralized manner, especially for large-scale coordinated networks, is not a computationally efficient approach. This motivated us to develop distributed algorithms to solve MSE-based problems for downlink coordinated BS systems with a per BS antenna power constraint in [16]. That paper solves its optimization problems distributively by applying Lagrangian dual decomposition, modified matrix fractional minimization and an iterative technique. In the current paper, we extend the work of [13] to the case where both the BSs and MSs are equipped with multiple antennas.
For this scenario, we design the transmitters and receivers of all users to maximize the weighted sum rate under a per BS antenna power constraint¹. We propose novel centralized and computationally efficient distributed iterative algorithms that achieve a local optimum of this problem. These algorithms are described as follows. First, by introducing additional optimization variables, we reformulate the original problem into a new problem. Second, for given precoder matrices of all users, the optimal receivers are computed using the MMSE method and the optimal introduced variables are obtained in closed form. Third, keeping the introduced variables and receivers constant, the precoder matrices of all users are optimized using SOCP and matrix fractional minimization approaches for the centralized and distributed algorithms, respectively. Finally, the second and third steps are repeated until the algorithms converge. We show that the proposed algorithms are guaranteed to converge. All simulation results show that our proposed distributed algorithm achieves the same performance as the centralized algorithm. Moreover, the proposed algorithms outperform the existing algorithm. In particular, when each user has a single antenna, we have observed that the proposed algorithms achieve the global optimum. The contributions of this paper are summarized as follows.

1) We propose novel centralized and computationally efficient distributed iterative algorithms to jointly optimize the transceivers of all users to maximize the weighted sum rate under a per BS antenna power constraint. Our proposed algorithms can also be used when the constraint of this problem is modified to a sum power constraint over the whole network or over groups of antennas. As will be clear later, we also show that the proposed algorithms can be applied to the weighted sum rate optimization problem for multi-cell systems.

2) For the aforementioned problem, we demonstrate that the proposed distributed algorithm has the same performance as the centralized algorithm.

3) As will be shown later, our problem has exactly the same mathematical structure as that in [5], where weighted sum rate maximization with a per antenna power constraint is considered for conventional downlink MIMO systems. The latter paper, however, solves the optimization problem under the constraint that the power allocated to each symbol is always strictly positive. In other words, the algorithm proposed in [5] can not handle inactive symbols. Our proposed algorithms have four major advantages compared to the algorithm in [5]. First, the proposed algorithms only constrain the power of each symbol to be non-negative². Second, simulation results show that our algorithms achieve a better weighted sum rate than that of [5]. Third, as will be clear later in Section IV, our centralized and distributed algorithms require less computational cost than that of [5]. Fourth, the proposed algorithms converge faster than that of [5].

4) When each user has a single antenna, the global optimal solution of the weighted sum rate maximization problem can be obtained within the framework of the monotonic global optimization (MGO) algorithm as in [18]. For this case, in all of our simulation results, we have observed that the proposed centralized and distributed algorithms achieve the global optimum.

The remaining part of this paper is organized as follows. We present the downlink multiuser MIMO coordinated BS system model in Section II. The problem formulation is discussed in Section III. The existing centralized algorithm and the proposed centralized and distributed algorithms are presented in Section IV. The extensions of our centralized and distributed algorithms to multi-cell systems are discussed in Section V. In Section VI, computer simulations are used to compare the performance of the centralized and distributed algorithms, and of our proposed algorithms with that of the other existing algorithms. Finally, conclusions are drawn in Section VII.

¹ According to [17], in multi-antenna BS systems, each BS antenna has its own power amplifier and the maximum power of each BS antenna is limited by some value. This motivates us to consider the power constraint of each BS antenna. On the other hand, in some scenarios, a per BS power constraint is of practical interest. As will be clear later, our proposed algorithms can be extended straightforwardly to handle the latter power constraint and the sum power constraint of the whole network or of groups of antennas.

² We would like to mention here that at optimality the powers of some of the symbols can be zero. This scenario happens, especially, for the power constrained total sum rate maximization problems. This shows that our algorithms are more general than that of [5].


Notations: The following notations are used throughout this paper. Upper/lower case boldface letters denote matrices/column vectors. $[\mathbf X]_{i,j}$, $\mathrm{tr}(\mathbf X)$, $\mathbf X^T$, $\mathbf X^H$ and $\mathrm E(\mathbf X)$ denote the $(i,j)$ element, trace, transpose, conjugate transpose and expected value of $\mathbf X$, respectively. $\mathbf I_n$ ($\mathbf I$) is an identity matrix of size $n\times n$ (appropriate size) and $\mathbb C^{M\times M}$ represents the space of $M\times M$ matrices with complex entries. Diagonal and block-diagonal matrices are represented by $\mathrm{diag}(\cdot)$ and $\mathrm{blkdiag}(\cdot)$, respectively. "Subject to" is denoted by s.t. and $(\cdot)^{\star}$ denotes an optimal solution. Vectorization of a matrix is represented by $\mathrm{vec}(\cdot)$ and $\|\mathbf x\|_n$ is the $n$th norm of a vector $\mathbf x$.

Fig. 1. MIMO coordinated base station system model.

II. SYSTEM MODEL

We consider a downlink multiuser MIMO coordinated BS system as shown in Fig. 1, where $L$ BSs are serving $K$ decentralized multiantenna MSs. The $l$th BS and $k$th MS are equipped with $N_l$ and $M_k$ antennas, respectively. The total numbers of BS and MS antennas are thus $N=\sum_{l=1}^{L}N_l$ and $M=\sum_{k=1}^{K}M_k$, respectively. By denoting the symbol intended for the $k$th user as $\mathbf d_k\in\mathbb C^{S_k\times 1}$ and $S=\sum_{k=1}^{K}S_k$, the entire symbol can be written in a data vector $\mathbf d\in\mathbb C^{S\times 1}$ as $\mathbf d=[\mathbf d_1^T,\cdots,\mathbf d_K^T]^T$. The $l$th BS precodes $\mathbf d$ into an $N_l$ length vector by using its overall precoder matrix $\mathbf B_l=[\mathbf b_{l11},\cdots,\mathbf b_{lKS_K}]$, where $\mathbf b_{lki}\in\mathbb C^{N_l\times 1}$ is the precoder vector of the $l$th BS for the $i$th symbol of the $k$th MS. The $i$th symbol of the $k$th MS employs a receiver $\mathbf w_{ki}$ to estimate its symbol $d_{ki}$. We follow the same channel matrix notations as in [19]. The estimate of the $i$th symbol of the $k$th MS ($\hat d_{ki}$) is given by

$$\hat d_{ki}=\mathbf w_{ki}^H\Big(\sum_{l=1}^{L}\mathbf H_{lk}^H\mathbf B_l\mathbf d+\mathbf n_k\Big)=\mathbf w_{ki}^H(\mathbf H_k^H\mathbf B\mathbf d+\mathbf n_k) \tag{1}$$

where $\mathbf H_k^H=[\mathbf H_{1k}^H,\cdots,\mathbf H_{Lk}^H]\in\mathbb C^{M_k\times N}$, $\mathbf B=[\mathbf B_1;\cdots;\mathbf B_L]\in\mathbb C^{N\times S}$, $\mathbf H_{lk}^H\in\mathbb C^{M_k\times N_l}$ is the channel matrix between the $l$th BS and the $k$th MS, and $\mathbf n_k$ is the additive noise at the $k$th MS. As can be seen from (1), the $k$th user decodes its symbol $d_{ki}$ independently with the receiver $\mathbf w_{ki}$. As will be clear later, our paper applies the MMSE approach to design $\mathbf w_{ki}$. On the other hand, the $k$th user can decode its symbol $d_{ki}$ by first canceling known interference (i.e., successive interference cancelation) and then applying an MMSE receiver as in [20]. According to [20], the latter decoding approach achieves a lower symbol-error-probability (SEP) than the former. However, since the latter approach is non-linear [6], the current paper focuses on the former decoding approach, which is linear. It is clearly seen that the last expression of (1) has exactly the same form as the estimate of $d_{ki}$ for a downlink multiuser MIMO system where a BS equipped with $N$ transmit antennas is serving $K$ decentralized multiantenna MSs. Hence, we can interpret the coordinated BS system as one giant downlink system [14], [15]. It is assumed that the entries of $\mathbf n_k$ are independent and identically distributed (i.i.d.) zero-mean circularly symmetric complex Gaussian (ZMCSCG) random variables with variance $\sigma_k^2$, i.e., $\mathbf n_k\sim\mathcal{CN}(\mathbf 0,\sigma_k^2\mathbf I_{M_k})$. We also assume that the symbol $\mathbf d_k$ consists of ZMCSCG random variables with unit variance and is independent of $\{\mathbf d_i\}_{i=1,i\neq k}^{K}$ and the noise $\mathbf n_k$, i.e., $\mathrm E\{\mathbf d_k\mathbf d_k^H\}=\mathbf I_{S_k}$, $\mathrm E\{\mathbf d_k\mathbf d_i^H\}=\mathbf 0,\ \forall i\neq k$, and $\mathrm E\{\mathbf d_k\mathbf n_k^H\}=\mathbf 0$. For this system model, the MSE between $\hat d_{ki}$ and $d_{ki}$ is given by

$$\xi_{ki}=\mathrm E_{\mathbf d}\{(\hat d_{ki}-d_{ki})(\hat d_{ki}-d_{ki})^H\}=\mathbf w_{ki}^H(\mathbf H_k^H\mathbf B\mathbf B^H\mathbf H_k+\sigma_k^2\mathbf I_{M_k})\mathbf w_{ki}-\mathbf w_{ki}^H\mathbf H_k^H\mathbf b_{ki}-\mathbf b_{ki}^H\mathbf H_k\mathbf w_{ki}+1. \tag{2}$$

For notational convenience, we represent $[\xi_{11},\cdots,\xi_{1S_1},\cdots,\xi_{K1},\cdots,\xi_{KS_K}]$ by $[\xi_1,\xi_2,\cdots,\xi_S]$, $[\mathbf w_{11},\cdots,\mathbf w_{1S_1},\cdots,\mathbf w_{K1},\cdots,\mathbf w_{KS_K}]$ by $[\mathbf w_1,\mathbf w_2,\cdots,\mathbf w_S]$, $\mathbf B=[\mathbf b_1,\cdots,\mathbf b_S]$, and the channel matrix and noise variance corresponding to the $s$th symbol are denoted by $\mathbf H_s$ and $\sigma_s^2$, respectively³. By doing so, the MSE of the $s$th symbol is given by

$$\xi_s=\mathbf w_s^H(\mathbf H_s^H\mathbf B\mathbf B^H\mathbf H_s+\sigma_s^2\mathbf I)\mathbf w_s-\mathbf w_s^H\mathbf H_s^H\mathbf b_s-\mathbf b_s^H\mathbf H_s\mathbf w_s+1. \tag{3}$$

When perfect CSI is available at the BSs and MSs, the MMSE receiver of the $s$th symbol is given as

$$\mathbf w_s=(\mathbf H_s^H\mathbf B\mathbf B^H\mathbf H_s+\sigma_s^2\mathbf I)^{-1}\mathbf H_s^H\mathbf b_s. \tag{4}$$

Plugging this equation into (3), we get the MMSE of the $s$th symbol as

$$\bar\xi_s=1-\mathbf b_s^H\mathbf H_s(\mathbf H_s^H\mathbf B\mathbf B^H\mathbf H_s+\sigma_s^2\mathbf I)^{-1}\mathbf H_s^H\mathbf b_s. \tag{5}$$

When each of the symbols ($\{d_s\}_{s=1}^{S}$) is decoded individually and independently of the others using a minimum Euclidean distance decoding rule, the achievable rate of the $s$th symbol can be expressed as [5], [6], [21] $R_s=\log_2(\bar\xi_s)^{-1}$.

³ Note that $\mathbf H_s$ and $\sigma_s^2$ are the same as the channel and noise variance of the MS associated with the $s$th symbol, respectively.
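The relation between the MSE (3), the MMSE receiver (4), the MMSE (5) and the rate can be checked numerically. The following is a minimal sketch, not the authors' code. It assumes single-antenna users (so that $\mathbf H_s$ reduces to a vector `h` and the matrix inverse in (4) becomes a scalar division), and the function names and the list-of-columns layout of `B` are illustrative choices only.

```python
import math

def inner(h, b):
    """Return h^H b for two complex vectors given as Python lists."""
    return sum(hi.conjugate() * bi for hi, bi in zip(h, b))

def _gains(h, B, sigma2):
    """Effective gains g[i] = h^H b_i and total received power plus noise."""
    g = [inner(h, b) for b in B]
    power = sum(abs(gi) ** 2 for gi in g) + sigma2
    return g, power

def mse(w, h, B, s, sigma2):
    """MSE (3) of symbol s for a scalar receiver w (single-antenna user assumed)."""
    g, power = _gains(h, B, sigma2)
    val = abs(w) ** 2 * power - w.conjugate() * g[s] - g[s].conjugate() * w + 1
    return val.real

def mmse_receiver(h, B, s, sigma2):
    """MMSE receiver (4), which is a scalar in the single-antenna case."""
    g, power = _gains(h, B, sigma2)
    return g[s] / power

def mmse(h, B, s, sigma2):
    """MMSE (5): the value of (3) evaluated at the receiver (4)."""
    g, power = _gains(h, B, sigma2)
    return 1.0 - abs(g[s]) ** 2 / power

def rate(xi_bar):
    """Achievable rate R_s = log2(1 / xi_bar)."""
    return math.log2(1.0 / xi_bar)
```

Here `B` is a list of the $S$ precoder columns $\mathbf b_i$, each a list of $N$ complex entries. Evaluating `mse` at the output of `mmse_receiver` reproduces `mmse`, and any other receiver gives a larger MSE.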


III. PROBLEM FORMULATION

Mathematically, the weighted sum rate maximization problem can be formulated as

$$\mathcal P1:\ \max_{\{\mathbf b_s\}_{s=1}^{S}}\ \sum_{s=1}^{S}\omega_s\log_2(\bar\xi_s)^{-1},\quad \text{s.t. } \Big[\sum_{s=1}^{S}\mathbf b_s\mathbf b_s^H\Big]_{n,n}\le p_n,\ \forall n \tag{6}$$

where $p_n$ is the maximum power allocated to the $n$th BS antenna and $\omega_s\ge 0$ is the rate weighting factor of the $s$th symbol. The antenna numbers are assigned from the first antenna of BS$_1$ (which corresponds to antenna 1) to the last antenna of BS$_L$ (which corresponds to antenna $N$). In multimedia communication, different types of information (for example, audio and video information) can be sent to a user simultaneously [22]. In such a case, for successful transmission, more priority could be given to the symbols corresponding to the video information. Consequently, the symbols of a user (all users) can have different priorities. This motivates us to examine the joint transmitter and receiver design for the symbol wise weighted sum rate maximization problem. However, as will be clear later, the proposed algorithms can also be applied to get a suboptimal solution of the user wise weighted (un-weighted) sum rate optimization problem. Without loss of generality, we assume that $\{0<\omega_s<1\}_{s=1}^{S}$. Since $\sum_{s}\omega_s\log_2(\bar\xi_s)^{-1}=-\log_2\prod_{s}\bar\xi_s^{\omega_s}$, problem (6) can be equivalently expressed as

$$\min_{\{\mathbf b_s\}_{s=1}^{S}}\ \prod_{s=1}^{S}\bar\xi_s^{\omega_s},\quad \text{s.t. } \Big[\sum_{s=1}^{S}\mathbf b_s\mathbf b_s^H\Big]_{n,n}\le p_n,\ \forall n. \tag{7}$$

Note that although (6) and (7) are equivalent problems, the optimal (sub-optimal) values of these problems are not necessarily equal. Solving the latter problem in its current form appears to be intractable. Due to this, we introduce the receivers $\{\mathbf w_s\}_{s=1}^{S}$ and then reformulate the above problem as (see Appendix A)

$$\min_{\{\mathbf b_s,\mathbf w_s\}_{s=1}^{S}}\ \prod_{s=1}^{S}\tilde\xi_s,\quad \text{s.t. } \Big[\sum_{s=1}^{S}\mathbf b_s\mathbf b_s^H\Big]_{n,n}\le p_n,\ \forall n \tag{8}$$

where $\tilde\xi_s=\xi_s^{\omega_s}$.

IV. EXISTING AND PROPOSED SOLUTIONS

The above optimization problem is non-convex. Thus, convex optimization techniques can not be applied directly. In this section, we present the existing centralized algorithm, and the proposed centralized and distributed algorithms for (8). This problem has been examined in [5] for the case where the power of each symbol is strictly positive. That paper proposes an iterative algorithm that achieves a local optimum of (8). Assuming $\{M_k=M\}_{k=1}^{K}$, the complexity of each iteration is given by

$$C_e=O\big(\sqrt{N+S}\,(2NS+1)^2(2S^2+2NS+S)\big)+O(KM^{2.376})+C_{GP} \tag{9}$$

where $C_{GP}$ is the complexity of the Geometric Program (GP) problem of [5]. In general, $C_{GP}$ depends on different parameter settings and solution methods (see Appendix B for the details). In the following, we present our novel centralized and distributed iterative algorithms that achieve a local optimum of (8). The proposed algorithms require less computational cost per iteration than the algorithm in [5] (i.e., $C_e$). As will be clear later in Section VI, the proposed algorithms also converge faster than the algorithm in [5]. As a result, our algorithms require less overall computational complexity than the algorithm in [5]. To this end, we consider the following Lemma.

Lemma 1: The optimal/suboptimal $\{\mathbf b_s,\mathbf w_s\}_{s=1}^{S}$ of (8) can be obtained by solving the following problem.

$$\min_{\{\mathbf b_s,\mathbf w_s,\nu_s\}_{s=1}^{S}}\ \Big(\frac{1}{S}\sum_{s=1}^{S}\nu_s\tilde\xi_s\Big)^{S},\quad \text{s.t. } \Big[\sum_{s=1}^{S}\mathbf b_s\mathbf b_s^H\Big]_{n,n}\le p_n,\ \prod_{s=1}^{S}\nu_s=1,\ \nu_s\ge 0,\ \forall s,n. \tag{10}$$

Moreover, for fixed $\{\mathbf b_s\}_{s=1}^{S}$, the optimal $\{\nu_s\}_{s=1}^{S}$ of this problem is given by

$$\nu_s^{\star}=\frac{\big[\prod_{i=1}^{S}\xi_i^{\omega_i}\big]^{1/S}}{\xi_s^{\omega_s}},\ \forall s. \tag{11}$$

Proof: See Appendix C.

From the proof of this Lemma, one can realize that the optimal/suboptimal solution of (10) satisfies $\frac{1}{S}\sum_{s=1}^{S}\nu_s\tilde\xi_s>0$ and $\{\nu_s>0\}_{s=1}^{S}$. Thus, the objective function of (10) can be replaced by $\sum_{s=1}^{S}\nu_s\tilde\xi_s=\sum_{s=1}^{S}\nu_s\xi_s^{\omega_s}$. This is due to the fact that $\min_x (cf(x))^{N}$ is equivalent to $\min_x f(x)$ for any $c>0$, $f(x)>0,\ \forall x$, and positive integer $N$ [23]. Consequently, (10) can be equivalently expressed as

$$\min_{\{\nu_s,\mathbf b_s,\mathbf w_s\}_{s=1}^{S}}\ \sum_{s=1}^{S}\nu_s\xi_s^{\omega_s},\quad \text{s.t. } \Big[\sum_{s=1}^{S}\mathbf b_s\mathbf b_s^H\Big]_{n,n}\le p_n,\ \prod_{s=1}^{S}\nu_s=1,\ \nu_s>0,\ \forall n,s. \tag{12}$$

Due to the $\{\xi_s^{\omega_s}\}_{s=1}^{S}$ terms in the objective function of (12), getting a suboptimal solution of this problem is not trivial. To simplify the latter problem, we present Lemma 2.

Lemma 2: For any strictly positive real numbers $a$ and $b$, and $0<\omega<1$, the following holds true

$$F=\min_{\tau>0}\Big(\frac{\gamma a^{\mu}}{\tau}+b\tau^{\beta}\Big)=ab^{\omega} \tag{13}$$

where $\mu=\frac{1}{1-\omega}$, $\beta=\frac{1}{\omega}-1$ and $\gamma=\omega^{\omega\mu}(1-\omega)$.

Proof: The optimal $\tau$ of $F$ can be obtained by setting the first order derivative of $F$ with respect to $\tau$ to zero:

$$\frac{dF}{d\tau}=-\frac{\gamma a^{\mu}}{\tau^{2}}+\beta b\,\tau^{\frac{1}{\omega}-2}=0\ \Rightarrow\ \tau^{\star}=\Big(\frac{\gamma a^{\mu}}{\beta b}\Big)^{\omega}. \tag{14}$$

Substituting (14) in $F$, and using $1+1/\beta=\mu$ and $\mu(1-\omega)=1$, we get

$$F=\frac{\gamma a^{\mu}}{\tau^{\star}}+b(\tau^{\star})^{\beta}=\mu\gamma^{(1-\omega)}a^{\mu(1-\omega)}(\beta b)^{\omega}=ab^{\omega}. \tag{15}$$
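Both lemmas are scalar identities, so they are easy to sanity-check numerically. The sketch below is illustrative only and uses the standard library. It assumes the parametrization $\mu=1/(1-\omega)$, $\beta=1/\omega-1$, $\gamma=\omega^{\omega\mu}(1-\omega)$ for Lemma 2 and the closed form (11) for Lemma 1, and the function names are hypothetical.

```python
import math

def lemma2_constants(omega):
    """Constants mu, beta, gamma of Lemma 2 for a weight 0 < omega < 1 (assumed form)."""
    mu = 1.0 / (1.0 - omega)
    beta = 1.0 / omega - 1.0
    gamma = omega ** (omega * mu) * (1.0 - omega)
    return mu, beta, gamma

def lemma2_objective(tau, a, b, omega):
    """Function minimized over tau > 0 in (13): gamma * a^mu / tau + b * tau^beta."""
    mu, beta, gamma = lemma2_constants(omega)
    return gamma * a ** mu / tau + b * tau ** beta

def lemma2_minimizer(a, b, omega):
    """Stationary point (14): tau = (gamma * a^mu / (beta * b))^omega."""
    mu, beta, gamma = lemma2_constants(omega)
    return (gamma * a ** mu / (beta * b)) ** omega

def lemma1_nu(xi, omega):
    """Closed form (11): nu_s = (prod_i xi_i^omega_i)^(1/S) / xi_s^omega_s."""
    x = [v ** w for v, w in zip(xi, omega)]
    geometric_mean = math.prod(x) ** (1.0 / len(x))
    return [geometric_mean / v for v in x]
```

With the multipliers returned by `lemma1_nu`, the product of the multipliers is one and the objective of (10) collapses to $\prod_s\xi_s^{\omega_s}$, which is the arithmetic-geometric mean argument behind Lemma 1.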


Following Lemma 2, it can be shown that $\{\mathbf b_s,\mathbf w_s,\nu_s\}_{s=1}^{S}$ of (12) can be optimized by solving

$$\min_{\{\tau_s,\nu_s,\mathbf b_s,\mathbf w_s\}_{s=1}^{S}}\ \sum_{s=1}^{S}\Big[\gamma_s\nu_s^{\mu_s}\frac{1}{\tau_s}+\tau_s^{\beta_s}\xi_s\Big],\quad \text{s.t. } \Big[\sum_{s=1}^{S}\mathbf b_s\mathbf b_s^H\Big]_{n,n}\le p_n,\ \prod_{s=1}^{S}\nu_s=1,\ \nu_s>0,\ \tau_s>0,\ \forall s,n \tag{16}$$

where $\mu_s=\frac{1}{1-\omega_s}$, $\beta_s=\frac{1}{\omega_s}-1$ and $\gamma_s=\omega_s^{\omega_s\mu_s}(1-\omega_s)$. The above problem is non-convex. Thus, convex optimization can not be applied directly. Next, we present our centralized and distributed iterative algorithms that achieve a local optimum of this problem.

A. Proposed centralized algorithm

The key step of this centralized algorithm is the utilization of Lemma 1 and Lemma 2, which help us to transform the intractable problem (8) into the more convenient problem (16). In this subsection, we present our centralized iterative algorithm for the latter problem as follows. First, keeping the precoders of all symbols $\{\mathbf b_s\}_{s=1}^{S}$ constant, the optimal $\{\mathbf w_s\}_{s=1}^{S}$ can be obtained by using the MMSE receiver (4), and $\{\tau_s,\nu_s\}_{s=1}^{S}$ are optimized by solving the following problem

$$\min_{\{\tau_s,\nu_s\}_{s=1}^{S}}\ \sum_{s=1}^{S}\Big[\gamma_s\nu_s^{\mu_s}\frac{1}{\tau_s}+\tau_s^{\beta_s}\xi_s\Big],\quad \text{s.t. } \prod_{s=1}^{S}\nu_s=1,\ \nu_s>0,\ \tau_s>0,\ \forall s. \tag{17}$$

The above optimization problem is a GP for which the global optimal solution can be obtained by existing optimization tools [24]. However, here we provide closed form expressions for the optimal $\{\tau_s,\nu_s\}_{s=1}^{S}$ of this problem. For fixed $\{\nu_s\}_{s=1}^{S}$, the optimal $\{\tau_s\}_{s=1}^{S}$ of (17) can be obtained by setting the first order derivative of the objective of (17) with respect to $\{\tau_i\}_{i=1}^{S}$ to zero, and are given as

$$\tau_s^{\star}=\Big[\frac{\gamma_s\nu_s^{\mu_s}}{\beta_s\xi_s}\Big]^{\frac{1}{1+\beta_s}},\ \forall s. \tag{18}$$

Substituting these $\{\tau_s^{\star}\}_{s=1}^{S}$ back into the objective function of (17), we get

$$\min_{\{\nu_s\}_{s=1}^{S}}\ \sum_{s=1}^{S}\nu_s\xi_s^{\omega_s},\quad \text{s.t. } \prod_{s=1}^{S}\nu_s=1,\ \nu_s>0,\ \forall s. \tag{19}$$

It can be easily seen that the latter problem and (46) have the same optimal solution. Thus, the global optimal $\{\nu_s\}_{s=1}^{S}$ of (19) is given by (11). Second, for fixed $\{\mathbf w_s,\nu_s,\tau_s\}_{s=1}^{S}$, the optimal $\{\mathbf b_s\}_{s=1}^{S}$ of (16) can be obtained by solving the following problem

$$\min_{\{\mathbf b_s\}_{s=1}^{S}}\ \sum_{s=1}^{S}\eta_s\xi_s,\quad \text{s.t. } \Big[\sum_{s=1}^{S}\mathbf b_s\mathbf b_s^H\Big]_{n,n}\le p_n,\ \forall n \tag{20}$$

where $\eta_s=\tau_s^{\beta_s}$. The objective function of the above problem can be expressed as

$$\sum_{s=1}^{S}\eta_s\xi_s=\sum_{s=1}^{S}\Big\{\mathbf b_s^H\Big(\sum_{i=1}^{S}\eta_i\mathbf H_i\mathbf w_i\mathbf w_i^H\mathbf H_i^H\Big)\mathbf b_s-\eta_s\mathbf w_s^H\mathbf H_s^H\mathbf b_s-\eta_s\mathbf b_s^H\mathbf H_s\mathbf w_s+\eta_s\sigma_s^2\mathbf w_s^H\mathbf w_s+\eta_s\Big\}$$
$$=\mathrm{tr}\{(\boldsymbol\eta^{1/2}-\boldsymbol\eta^{1/2}\mathbf W^H\mathbf H^H\mathbf B)^H(\boldsymbol\eta^{1/2}-\boldsymbol\eta^{1/2}\mathbf W^H\mathbf H^H\mathbf B)\}+\mathrm{tr}\{\boldsymbol\eta\mathbf W^H\boldsymbol\sigma^2\mathbf W\} \tag{21}$$

where $\boldsymbol\eta=\mathrm{diag}(\eta_1,\cdots,\eta_S)$, $\boldsymbol\sigma^2=\mathrm{blkdiag}(\sigma_1^2\mathbf I_{M_1},\cdots,\sigma_K^2\mathbf I_{M_K})$, $\mathbf H=[\mathbf H_1,\cdots,\mathbf H_K]$, $\mathbf W_k$ is the decoder matrix of the $k$th MS and $\mathbf W=\mathrm{blkdiag}(\mathbf W_1,\cdots,\mathbf W_K)$. Since the second trace in (21) does not depend on $\mathbf B$, problem (20) can be reexpressed as

$$\min_{\chi,\{\mathbf b_s\}_{s=1}^{S}}\ \chi,\quad \text{s.t. } \big\|\mathrm{vec}(\boldsymbol\eta^{1/2}-\boldsymbol\eta^{1/2}\mathbf W^H\mathbf H^H\mathbf B)\big\|_2\le\chi,\ \ \|\tilde{\mathbf b}_n\|_2\le\sqrt{p_n},\ \forall n \tag{22}$$

where $\tilde{\mathbf b}_n^H$ is the $n$th row of $\mathbf B$. As we can see, (22) is an SOCP problem for which the global optimal solution is obtained by existing convex optimization tools [23]. Finally, the first and second steps are repeated iteratively until convergence is achieved. Our centralized iterative algorithm that achieves a local optimum of (6) is summarized in Algorithm I.

Algorithm I: Centralized iterative algorithm for problem (6)
Initialization: Set $\{\mathbf B_k\}_{k=1}^{K}$ as the first $S_k$ vectors of $\{\mathbf H_k\}_{k=1}^{K}$ and the maximum number of iterations as $i_{max}$. Then, normalize $\{\mathbf B_k\}_{k=1}^{K}$ such that each BS antenna power constraint is satisfied with equality.
repeat
1) With the current $\{\mathbf b_s\}_{s=1}^{S}$, update $\{\mathbf w_s,\nu_s,\tau_s\}_{s=1}^{S}$ using (4), (11) and (18), respectively.
2) With the current $\{\nu_s,\mathbf w_s,\tau_s\}_{s=1}^{S}$, optimize $\{\mathbf b_s\}_{s=1}^{S}$ by solving (22).
3) Compute the objective function of (6).
until convergence.

Convergence: For fixed $\{\mathbf w_s,\nu_s,\tau_s\}_{s=1}^{S}$, the global minimum of (16) can be achieved by optimizing $\{\mathbf b_s\}_{s=1}^{S}$ with (22). Moreover, for fixed $\{\mathbf b_s\}_{s=1}^{S}$, the global minimum of (16) can be achieved by optimizing $\{\mathbf w_s,\nu_s,\tau_s\}_{s=1}^{S}$ with (4), (11) and (18), respectively. As a result, $\Xi_n^{2}\le\Xi_n^{1}$ is satisfied, where $\Xi_n^{i}$ is the objective function of (16) at step $i$ of the $n$th iteration [6], [8]. At the $(n+1)$th iteration, we achieve $\Xi_{(n+1)}^{2}\le\Xi_{(n+1)}^{1}\le\Xi_n^{2}$. These discussions reveal that the objective function of (16) is non-increasing at each step, which implies that the objective function of (6) is non-decreasing. On the other hand, the objective function of the latter problem is upper bounded by a positive finite value. These two facts show that the proposed iterative algorithm is always guaranteed to converge. However, since problem (6) is non-convex, we are not able to show the global optimality of Algorithm I analytically.

Initialization: In general, different initializations affect the convergence speed and the achieved weighted sum rate of Algorithm I. In most of our simulations, we observe faster convergence and a better weighted sum rate when the initialization is performed as in Algorithm I. Nonetheless, finding the initialization that results in the fastest convergence and the best weighted sum rate of Algorithm I is an open research topic.

Computational complexity: The main computational load of Algorithm I arises from solving (4) and (22). Under the assumption discussed in Section IV, (4) can be performed with $O(KM^{2.376})$ operations [25]. It can be shown that problem (22) has $N$ second-order-cone (SOC) constraints where each of them consists of $2S$ real dimensions, one SOC constraint with $2S^2$ real dimensions, and $2NS+1$ real optimization variables. According to [26] (see page 196 of [26]), the number of iterations required to solve the latter problem is upper bounded by $O(\sqrt{N+1})$, where the complexity of each iteration is on the order of $O((2SN+1)^2(2S^2+2SN))$. Thus, the total computational complexity of (22) is given by $O(\sqrt{N+1}\,(2SN+1)^2(2S^2+2SN))$. Therefore, in one iteration, Algorithm I requires $C_c=O(\sqrt{N+1}\,(2SN+1)^2(2S^2+2SN))+O(KM^{2.376})$ operations. This shows that our proposed centralized algorithm requires less computational cost per iteration than the algorithm in [5] (i.e., $C_c<C_e$). However, although $C_c<C_e$, we still believe that for large-scale networks $C_c$ is a very large computational load and hence the centralized algorithm is not suitable for practical realization. This motivates us to develop a distributed algorithm that achieves a local optimum of (6) with less computational cost than our centralized algorithm.

B. Proposed distributed algorithm

We have shown in the previous subsection that (6) can be solved equivalently by using (16). As can be seen from Algorithm I, the optimal $\{\mathbf w_s,\nu_s,\tau_s\}_{s=1}^{S}$ of (16) can be solved independently and distributively.
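The per-symbol updates of step 1 of Algorithm I can be sketched as follows, given the current MSEs $\xi_s$ (for instance, computed from (3) with the receiver (4)). This is a hedged illustration rather than the authors' implementation. It assumes the closed forms $\nu_s=(\prod_i\xi_i^{\omega_i})^{1/S}/\xi_s^{\omega_s}$ for (11) and $\tau_s=[\gamma_s\nu_s^{\mu_s}/(\beta_s\xi_s)]^{1/(1+\beta_s)}$ for (18), with $\mu_s=1/(1-\omega_s)$, $\beta_s=1/\omega_s-1$, $\gamma_s=\omega_s^{\omega_s\mu_s}(1-\omega_s)$, and it returns the weights $\eta_s=\tau_s^{\beta_s}$ that enter (20). All function names are hypothetical.

```python
import math

def constants(w):
    """Per-symbol constants mu_s, beta_s, gamma_s for a weight 0 < w < 1 (assumed form)."""
    mu = 1.0 / (1.0 - w)
    beta = 1.0 / w - 1.0
    gamma = w ** (w * mu) * (1.0 - w)
    return mu, beta, gamma

def update_nu_tau(xi, omega):
    """Closed-form updates (11) and (18) for all symbols, given the MSEs xi."""
    x = [v ** w for v, w in zip(xi, omega)]
    # The geometric mean is the only quantity shared between symbols.
    shared = math.prod(x) ** (1.0 / len(x))
    nu, tau, eta = [], [], []
    for v, w, xs in zip(xi, omega, x):
        mu, beta, gamma = constants(w)
        n = shared / xs                                          # update (11)
        t = (gamma * n ** mu / (beta * v)) ** (1.0 / (1.0 + beta))  # update (18)
        nu.append(n)
        tau.append(t)
        eta.append(t ** beta)                                    # weight used in (20)
    return nu, tau, eta

def objective16(xi, omega, nu, tau):
    """Objective of (16) for fixed receivers: sum of gamma*nu^mu/tau + tau^beta*xi."""
    total = 0.0
    for v, w, n, t in zip(xi, omega, nu, tau):
        mu, beta, gamma = constants(w)
        total += gamma * n ** mu / t + t ** beta * v
    return total
```

In this sketch each symbol needs only its own $\xi_s$ and $\omega_s$ plus one shared scalar, the geometric mean of $\{\xi_i^{\omega_i}\}$, which is what makes the update cheap to distribute. After the update, `objective16` equals $S(\prod_s\xi_s^{\omega_s})^{1/S}$.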
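However, the optimal solution of (20) is computed using a centralized algorithm. In this subsection, we present our distributed algorithm for (20). The Lagrangian dual decomposition technique is applied to solve this problem distributively$^{4}$. To this end, we first express the Lagrangian function associated with (20) as

$$
\begin{aligned}
L(\Lambda, B) &= \sum_{s=1}^{S} \eta_s \xi_s + \sum_{n=1}^{N} \lambda_n \Big( \Big[ \sum_{i=1}^{S} b_i b_i^H \Big]_{n,n} - p_n \Big) \\
&= \sum_{s=1}^{S} \Big\{ b_s^H \Big( \sum_{i=1}^{S} \eta_i H_i w_i w_i^H H_i^H \Big) b_s - \eta_s w_s^H H_s^H b_s - \eta_s b_s^H H_s w_s + \eta_s \sigma_s^2 w_s^H w_s + \eta_s \Big\} + \sum_{n=1}^{N} \lambda_n \Big( \Big[ \sum_{i=1}^{S} b_i b_i^H \Big]_{n,n} - p_n \Big) \\
&= \sum_{s=1}^{S} \Big\{ b_s^H A b_s - \eta_s w_s^H H_s^H b_s - \eta_s b_s^H H_s w_s + \eta_s \sigma_s^2 w_s^H w_s + \eta_s \Big\} - \sum_{n=1}^{N} \lambda_n p_n
\end{aligned}
\tag{23}
$$

where $\Lambda = \mathrm{diag}(\lambda_1, \cdots, \lambda_N)$ are the Lagrangian multipliers corresponding to the constraint sets of (20) and $A = \sum_{i=1}^{S} \eta_i H_i w_i w_i^H H_i^H + \Lambda$. Thus, the dual function of (20) is

$$
\begin{aligned}
g(\Lambda) &= \min_{\{b_s\}_{s=1}^{S}} L(\Lambda, B) \\
&= \min_{\{b_s\}_{s=1}^{S}} \sum_{s=1}^{S} \Big\{ b_s^H A b_s - \eta_s w_s^H H_s^H b_s - \eta_s b_s^H H_s w_s + \eta_s \sigma_s^2 w_s^H w_s + \eta_s \Big\} - \sum_{n=1}^{N} \lambda_n p_n \\
&= \sum_{s=1}^{S} \Big\{ \eta_s (\sigma_s^2 w_s^H w_s + 1) - \eta_s^2 w_s^H H_s^H A^{-1} H_s w_s \Big\} - \sum_{n=1}^{N} \lambda_n p_n
\end{aligned}
\tag{24}
$$

where the third equality is obtained after substituting the optimal $b_s$ of (24), which is given by

$$
b_s^\star = \eta_s A^{-1} H_s w_s, \qquad b_{ls}^\star = \eta_s [A^{-1}]_l H_s w_s, \quad \forall l, s \tag{25}
$$

where $[A^{-1}]_l \in \mathbb{C}^{N_l \times N}$ is obtained by $[A^{-1}]_{(F_l : F_l + N_l - 1,\, :)}$ with $F_l = \sum_{i=0}^{l-1} N_i + 1$ and $N_0 = 0$. As can be seen from (25), for a given $\Lambda$, the precoder vector of each symbol can be optimized independently. The optimal $\Lambda$ of (23) can be obtained by solving the dual optimization problem of (20), which is given by

$$
\max_{\{\lambda_n \geq 0\}_{n=1}^{N}} g(\Lambda) = \max_{\{\lambda_n \geq 0\}_{n=1}^{N}} \sum_{s=1}^{S} \Big\{ \eta_s (\sigma_s^2 w_s^H w_s + 1) - \eta_s^2 w_s^H H_s^H A^{-1} H_s w_s \Big\} - \sum_{n=1}^{N} \lambda_n p_n. \tag{26}
$$

By employing the eigenvalue decompositions $H W \mathrm{diag}(\eta_1^2, \cdots, \eta_S^2) W^H H^H \triangleq V \Gamma V^H$ and $H W \mathrm{diag}(\eta_1, \cdots, \eta_S) W^H H^H \triangleq \tilde{V} \tilde{\Gamma} \tilde{V}^H$, problem (26) can be written as

$$
\min_{\{\lambda_n \geq 0\}_{n=1}^{N}} \mathrm{tr}\big\{ F^H (R R^H + \Lambda)^{-1} F \big\} + \sum_{n=1}^{N} \lambda_n p_n \tag{27}
$$

where $F = V \Gamma^{1/2}$ and $R = \tilde{V} \tilde{\Gamma}^{1/2}$. The above optimization problem can be cast as a semidefinite programming (SDP) problem whose global solution can be found by existing convex optimization tools [23]. The computational complexity of this problem is on the order of $O((2N^2 + N)^2 (4N)^{2.5})$ [26]. However, our aim here is to obtain the optimal values of $\{\lambda_n\}_{n=1}^{N}$ distributively, with less computational load than that of the SDP method. In this regard, we present the following Lemma.

4 Since (20) is convex and Slater's condition (i.e., the existence of strictly feasible points) is satisfied by choosing any $\{b_s\}_{s=1}^{S}$ with $\{[\sum_{s=1}^{S} b_s b_s^H]_{n,n} < p_n\}_{n=1}^{N}$, the duality gap between the primal problem (20) and its dual problem is zero [23].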
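The closed-form minimizer (25) and the resulting per-symbol value in (24) can be checked numerically. The following is a minimal sketch, not the authors' code: the problem sizes, the random channels and receivers, and the names `eta`, `lam`, `term` and `crandn` are all illustrative assumptions. It only verifies that $b_s = \eta_s A^{-1} H_s w_s$ minimizes the part of the Lagrangian that depends on $b_s$.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, S = 4, 2, 3  # hypothetical sizes: BS antennas, MS antennas, symbols


def crandn(*shape):
    """Complex Gaussian samples with unit variance (illustrative helper)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


H = [crandn(N, M) for _ in range(S)]   # channel seen by symbol s (assumed)
w = [crandn(M) for _ in range(S)]      # receiver of symbol s (assumed)
eta = rng.uniform(0.5, 2.0, S)         # positive MSE weights (assumed values)
lam = rng.uniform(0.1, 1.0, N)         # positive Lagrange multipliers (assumed)

v = [H[s] @ w[s] for s in range(S)]    # H_s w_s
A = sum(eta[s] * np.outer(v[s], v[s].conj()) for s in range(S)) + np.diag(lam)


def term(b, s):
    """b^H A b - eta_s w_s^H H_s^H b - eta_s b^H H_s w_s (the b_s-dependent part)."""
    return (b.conj() @ A @ b).real - 2.0 * eta[s] * (v[s].conj() @ b).real


# Closed-form minimizer and minimum value, following (25) and (24)
b_opt = [eta[s] * np.linalg.solve(A, v[s]) for s in range(S)]
closed = [-(eta[s] ** 2) * (v[s].conj() @ np.linalg.solve(A, v[s])).real
          for s in range(S)]
```

Because $A$ is positive definite whenever all $\lambda_n > 0$, any perturbation of `b_opt[s]` can only increase `term`, which is what makes the per-symbol decoupling in (25) possible.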

Copyright (c) 2011 IEEE. Personal use is permitted. For any other purposes, permission must be obtained from the IEEE by emailing pubs-permissions@ieee.org.


Lemma 3: The optimal $\{\lambda_n\}_{n=1}^{N}$ of the above optimization problem can be obtained by solving the following problem

$$
\min_{\{\lambda_n, g_n, t_n\}_{n=1}^{N}} \sum_{n=1}^{N} \big\{ g_n^H \Lambda^{-1} g_n + t_n^H t_n + \lambda_n p_n \big\} \quad \text{s.t. } R t_n + g_n = f_n, \ \forall n \tag{28}
$$

where $f_n$ is the $n$th column of $F$.

Proof: By keeping $\Lambda$ constant, the Lagrangian function of (28) is given by

$$
L = \sum_{n=1}^{N} \big[ g_n^H \Lambda^{-1} g_n + t_n^H t_n + \lambda_n p_n - \psi_n^H (R t_n + g_n - f_n) \big]
$$

where $\psi_n$ is the Lagrangian multiplier associated with the $n$th equality constraint of (28). Differentiation of $L$ with respect to $\{g_i, t_i\}_{i=1}^{N}$ yields $\{g_i^\star = \Lambda \psi_i\}_{i=1}^{N}$ and $\{t_i^\star = R^H \psi_i\}_{i=1}^{N}$. By substituting these $\{g_i^\star, t_i^\star\}_{i=1}^{N}$ in the equality constraint of (28), we get $\{\psi_i = (R R^H + \Lambda)^{-1} f_i\}_{i=1}^{N}$. It follows that

$$
g_i^\star = \Lambda (R R^H + \Lambda)^{-1} f_i, \qquad t_i^\star = R^H (R R^H + \Lambda)^{-1} f_i, \quad \forall i. \tag{29}
$$

Plugging (29) into the objective function of (28) yields

$$
\begin{aligned}
\sum_{i=1}^{N} \big\{ (g_i^\star)^H \Lambda^{-1} g_i^\star + (t_i^\star)^H t_i^\star + \lambda_i p_i \big\}
&= \sum_{i=1}^{N} \big\{ f_i^H (R R^H + \Lambda)^{-1} f_i \big\} + \sum_{i=1}^{N} \lambda_i p_i \\
&= \mathrm{tr}\big\{ F^H (R R^H + \Lambda)^{-1} F \big\} + \sum_{i=1}^{N} \lambda_i p_i.
\end{aligned}
\tag{30}
$$

The above expression is the same as the objective function of the original optimization problem (27). It follows that (27) and (28) are equivalent problems.

Note that Lemma 3 is proved by modifying the idea of matrix fractional minimization (see [16] and [23]). It can be shown that (28) is a convex optimization problem [23]. To develop a distributed algorithm for (28), we reexpress $G = [g_1, \cdots, g_N]$ as $G = [\hat{g}_1^H; \cdots; \hat{g}_N^H]$, where $\hat{g}_i^H$ is the $i$th row of $G$. By doing so, $G^\star = [g_1^\star, \cdots, g_N^\star]$ of (29) can also be written as $G^\star = [(\hat{g}_1^\star)^H; \cdots; (\hat{g}_N^\star)^H]$, where

$$
\hat{g}_i^\star = \lambda_i \tilde{\psi}_i^H, \quad \forall i \tag{31}
$$

and $\tilde{\psi}_i$ is the $i$th row of $\Psi = A^{-1} F$. Now, problem (28) can be solved distributively as follows. First, keeping $\Lambda$ constant, the optimal $\hat{g}_i^r$ can be computed independently using (31), i.e., $\hat{g}_i^r = \lambda_i^{r-1} (\tilde{\psi}_i^{r-1})^H$, where the superscripts $(\cdot)^r$ and $(\cdot)^{r-1}$ represent the current and previous values, respectively. Then, $\lambda_i^r$ is computed by

$$
-\frac{1}{(\lambda_i^r)^2} \gamma_i^r + p_i = 0 \ \Rightarrow\ \lambda_i^r = \sqrt{\gamma_i^r / p_i}, \quad \forall i \tag{32}
$$

where $\gamma_i^r = (\hat{g}_i^r)^H \hat{g}_i^r$. As we can see from the above expression, $\lambda_i^r$ is always non-negative. Furthermore, from (31) and (32), one can observe that $\lambda_i$ can be updated in parallel by using only $\hat{g}_i$. Thus, for our problem, the computation of $\{t_i, g_i\}_{i=1}^{N}$ is not required. To summarize, problem (27)
can be solved iteratively in a distributed manner as shown in Algorithm II.

Algorithm II: Iterative algorithm to solve (27)
1) Initialization: Set $\{\lambda_n = 1\}_{n=1}^{N}$.
Repeat
2) With the current $\{\lambda_n\}_{n=1}^{N}$, compute $\{\hat{g}_n\}_{n=1}^{N}$ using (31) and update $\{\lambda_n\}_{n=1}^{N}$ with (32).
3) Share the latter $\{\lambda_n\}_{n=1}^{N}$ among all BSs/processors.
4) Calculate the objective function of (27).
Until convergence.

Convergence: The convergence of this algorithm can be studied like that of Algorithm I. Although we are not able to show the global optimality of Algorithm II analytically, in all simulation results we observe that the optimal $\Lambda$ of (27) obtained by Algorithm II and by the SDP method are the same.

Computational complexity: The major computational task of Algorithm II arises from matrix inversion, which has a complexity on the order of $O(N^{2.376})$ [25]. Thus, Algorithm II requires $O(N^{2.376})$ per iteration. As will be shown later in Section VI, in all our simulations, Algorithm II converges to an optimal solution in fewer than 10 iterations. This shows that the proposed distributed algorithm significantly reduces the computational complexity of (20). Therefore, for (6), the distributed algorithm requires less overall computational cost than the centralized algorithm.

Using $\{\lambda_n\}_{n=1}^{N}$ of Algorithm II, the suboptimal $\{b_{ls}\}_{s=1}^{S}, \forall l$ of (6) can be computed by (25). With these $\{b_{ls}\}_{s=1}^{S}, \forall l$, the introduced variables $\nu_s$ and $\tau_s$, and the receiver of the $s$th symbol $w_s$ are updated by using (11), (18) and (4), respectively. In summary, the suboptimal solution of (6) can be obtained distributively as shown in Algorithm III.

Algorithm III: Distributed algorithm for problem (6)
Initialization: Set $\{b_s\}_{s=1}^{S}$ as in Algorithm I and the maximum number of iterations as $i_{max}$.
Repeat
1) With the current $\{b_s\}_{s=1}^{S}$, optimize $\{w_s, \nu_s, \tau_s\}_{s=1}^{S}$ using (4), (11) and (18), respectively.
2) Using the latter $\{\nu_s, \tau_s, w_s\}_{s=1}^{S}$, compute the optimal $\{\lambda_n\}_{n=1}^{N}$ with Algorithm II.
3) Solve for $\{b_{ls}\}_{s=1}^{S}, \forall l$, using (25).
4) Compute the objective function of (6).
Until convergence.

Convergence: It can be shown that at each step the weighted sum rate of (6) is non-decreasing. Hence the algorithm is always convergent.

Implementation of Algorithm III: This algorithm can be implemented distributively by two approaches. For ease of explanation, we assume $\{M_k = 1\}_{k=1}^{K}$ and $K = N = L$, i.e., $S = N$.

First approach: In this approach, it is assumed that the problem is examined in a central controller which has as many parallel processors as the number of optimization variables. Algorithm III can be implemented distributively as follows.
Initialization: The $s$th processor sets $b_s$ as in Algorithm III and $\{\lambda_n = 1\}_{n=1}^{N}$.
1) The current $\{b_s\}_{s=1}^{S}$ are shared among all processors. Using these precoders, the $s$th processor computes its $w_s$, $\nu_s$ and $\tau_s$ using (4), (11) and (18), respectively, and then $\{w_s, \eta_s\}_{s=1}^{S}$ are shared with all processors.
2) With the current $\{\lambda_n\}_{n=1}^{N}$ and $\{w_s, \eta_s\}_{s=1}^{S}$, the $n$th processor computes $\hat{g}_n$ using (31) and updates its $\lambda_n$ by (32). Then, $\{\lambda_n\}_{n=1}^{N}$ are shared among all processors. The latter two steps are repeated until $\{\lambda_n\}_{n=1}^{N}$ are found to be optimal.
3) Using $\{\lambda_n\}_{n=1}^{N}$ of step 2, the $s$th processor computes the optimal $b_s$ by (25).
4) Steps (1), (2) and (3) are repeated until Algorithm III converges.
5) The controller finally sends the optimal precoders and decoders to the corresponding BSs and MSs, respectively.
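The update rules (31) and (32) of Algorithm II can be sketched in a few lines. This is a minimal single-process illustration under stated assumptions, not the authors' implementation: `F`, `R` and `p` are taken as given arrays, the function names `objective27` and `algorithm_ii` are hypothetical, a fixed iteration count replaces the convergence check, and the per-processor parallelism is replaced by vectorized row operations.

```python
import numpy as np


def objective27(F, R, lam, p):
    """Objective of (27): tr{F^H (R R^H + Lambda)^{-1} F} + sum_n lam_n p_n."""
    A = R @ R.conj().T + np.diag(lam)
    return np.real(np.trace(F.conj().T @ np.linalg.solve(A, F))) + float(lam @ p)


def algorithm_ii(F, R, p, iters=50):
    """Fixed-point iteration on the multipliers lam, following (31)-(32)."""
    N = F.shape[0]
    lam = np.ones(N)                            # step 1: lam_n = 1
    history = [objective27(F, R, lam, p)]
    for _ in range(iters):
        A = R @ R.conj().T + np.diag(lam)
        Psi = np.linalg.solve(A, F)             # Psi = A^{-1} F, one row per antenna
        G = lam[:, None] * Psi                  # row i is lam_i * (row i of Psi), eq. (31)
        gamma = np.sum(np.abs(G) ** 2, axis=1)  # gamma_i = squared norm of row i
        lam = np.sqrt(gamma / p)                # eq. (32)
        history.append(objective27(F, R, lam, p))
    return lam, history
```

Each pass is one exact minimization of (28) over the auxiliary vectors followed by one exact minimization over the multipliers, so the recorded objective of (27) should not increase from one iteration to the next. Row `i` of the update uses only row `i` of `Psi`, which is what allows each BS or processor to update its own multiplier.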

Second approach: In this approach, we assume that each BS obtains the channels of all users through the feedback channel prior to optimization. Here we do not consider any central controller. This is motivated by the fact that each BS is responsible for designing its precoder matrix independently by exchanging limited information with the other BSs. In our case, each BS is allowed to exchange $\lambda_n$, $w_s$ and $\eta_s$ (three scalars under the aforementioned assumption) with all other BSs to jointly design the transceivers of all users. In such an approach, Algorithm III is implemented distributively as given below.
Initialization: Each BS sets $\{b_s\}_{s=1}^{S}$ as in Algorithm III and $\{\lambda_n = 1\}_{n=1}^{N}$.
1) Using the current precoders, the $s$th BS computes its $w_s$, $\nu_s$ and $\tau_s$ using (4), (11) and (18), respectively, and then $\{w_s, \eta_s\}_{s=1}^{S}$ are shared with all BSs.
2) With the current $\{\lambda_n\}_{n=1}^{N}$ and $\{w_s, \eta_s\}_{s=1}^{S}$, the $n$th BS computes $\hat{g}_n$ using (31) and updates $\lambda_n$ with (32). Then, the latter $\{\lambda_n\}_{n=1}^{N}$ are shared among all BSs. These two steps are repeated until $\{\lambda_n\}_{n=1}^{N}$ are found to be optimal.
3) Using the current $\{\lambda_n\}_{n=1}^{N}$, the $s$th BS computes $\{b_s\}_{s=1}^{S}$ using (25)$^{5}$.
4) Steps (1), (2) and (3) are repeated until Algorithm III converges.
5) Once Algorithm III converges, each BS uses its precoder matrix to precode the data symbols of all users, and also transmits $\{W_k, \forall k\}$ to those users near to this BS$^{6}$.

5 In this equation, since the precoders of all users depend on a common matrix inversion $A^{-1}$, the precoders of all users can be obtained at each BS without significant additional cost.
6 Note that in a practical scenario, the backhaul capacity is accurate and fast enough to exchange $\lambda_n$, $w_s$ and $\eta_s$ between BSs (i.e., three scalars for our example setup since $N = K = S$). Moreover, since users are not expected to design their receivers, the knowledge of $\{b_s\}_{s=1}^{S}$ is not required at the receiver side (this reduces the bandwidth requirement of the downlink channel).

Note: We would like to point out that the un-weighted sum rate optimization problem can be examined with our algorithms either by using (12) with $\{\omega_s = 1\}_{s=1}^{S}$ or by employing (16) with $\{\omega_s = \omega\}_{s=1}^{S}$ and $0 < \omega < 1$. Our centralized and distributed algorithms are able to handle both of these cases. Furthermore, the proposed centralized and distributed algorithms can be extended straightforwardly to the case where the constraint of (6) is modified to a sum power constraint over the whole network or over groups of antennas. The computational complexities per iteration of the proposed centralized and distributed algorithms, and of the algorithm in [5], for problem (6) when $\{M_k = M\}_{k=1}^{K}$ are summarized in Table I.

V. EXTENSION OF THE PROPOSED ALGORITHMS FOR MULTI-CELL SYSTEMS

So far, we have examined the weighted sum rate maximization problem for multiuser MIMO coordinated BS systems where coordination takes place at both the beamforming and signal levels (i.e., network MIMO system) [9], [11]. In this section, we present the extension of our proposed algorithms to the weighted sum rate maximization problem for multiuser MIMO coordinated BS systems where coordination takes place at the beamforming level only (i.e., multi-cell system) [10]. In such a coordination, each MS is served by a subset of BSs. For better exposition of our centralized and distributed algorithms for the latter kind of BS coordination, we consider a multiuser MIMO coordinated BS system with $L = K$ BSs, where the $k$th BS serves the $k$th MS only. For this system, if we apply the same channel matrix and symbol vector notations as in Section II, the estimate of the $k$th MS $i$th symbol ($d_{ki}$) is given by

$$
\hat{d}_{ki} = w_{ki}^H \Big( H_{kk}^H B_k d_k + \sum_{m=1, m \neq k}^{K} H_{mk}^H B_m d_m + n_k \Big) \tag{33}
$$

where $B_k = [b_{k1}, \cdots, b_{kS_k}] \in \mathbb{C}^{N_k \times M_k}$ is the precoder matrix of the $k$th MS and $w_{ki} \in \mathbb{C}^{M_k \times 1}$ is the receiver vector of the $k$th MS $i$th symbol. The MSE between $\hat{d}_{ki}$ and $d_{ki}$ is thus given by

$$
\begin{aligned}
\xi_{ki} &= E_d\{ (\hat{d}_{ki} - d_{ki})(\hat{d}_{ki} - d_{ki})^H \} \\
&= w_{ki}^H \Big( \sum_{m=1}^{K} H_{mk}^H B_m B_m^H H_{mk} + \sigma_k^2 I_{M_k} \Big) w_{ki} - w_{ki}^H H_{kk}^H b_{ki} - b_{ki}^H H_{kk} w_{ki} + 1.
\end{aligned}
\tag{34}
$$

When perfect CSI is available at the BSs and MSs, the MMSE receiver of the $k$th user $i$th symbol is given as

$$
w_{ki} = \Big( \sum_{m=1}^{K} H_{mk}^H B_m B_m^H H_{mk} + \sigma_k^2 I_{M_k} \Big)^{-1} H_{kk}^H b_{ki}. \tag{35}
$$

Substituting this equation into $\xi_{ki}$, we get the MMSE of the $k$th user $i$th symbol as

$$
\xi_{ki} = 1 - b_{ki}^H H_{kk} \Big( \sum_{m=1}^{K} H_{mk}^H B_m B_m^H H_{mk} + \sigma_k^2 I_{M_k} \Big)^{-1} H_{kk}^H b_{ki}. \tag{36}
$$

It follows that the weighted sum rate maximization problem with per BS antenna power constraints for our multi-cell system


Table I: Computational complexities of the proposed algorithms and the algorithm in [5] for (6)

Type of algorithm       Computational complexity per iteration
Proposed centralized    $O(\sqrt{N+1}\,(2SN+1)^2 (2S^2+2SN)) + O(K M^{2.376})$
Proposed distributed    $O(N^{2.376}) + O(K M^{2.376})$
Algorithm in [5]        $O(\sqrt{N+S}\,(2NS+1)^2 (2S^2+2NS+S)) + O(K M^{2.376}) + C_{GP}$
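To get a feel for the orders of magnitude in Table I, the expressions can be evaluated for the system used later in Section VI-A ($N = 8$ transmit antennas in total, $S = 8$ symbols, $K = 4$ MSs, $M = 2$ antennas per MS). This is an illustrative sketch only: big-O constants are ignored, the leading factor of the SOCP terms is read as a square root (an assumption consistent with the SOCP iteration bound of [26]), the GP cost $C_{GP}$ of [5] is left out, and the function names are hypothetical.

```python
import math

OMEGA = 2.376  # matrix-inversion exponent quoted in the text from [25]


def centralized_cost(N, S, K, M):
    """Proposed centralized algorithm: SOCP term plus K receiver inversions."""
    socp = math.sqrt(N + 1) * (2 * S * N + 1) ** 2 * (2 * S ** 2 + 2 * S * N)
    return socp + K * M ** OMEGA


def distributed_cost(N, K, M):
    """Proposed distributed algorithm: one N x N inversion plus K receiver inversions."""
    return N ** OMEGA + K * M ** OMEGA


def existing_cost_without_gp(N, S, K, M):
    """Algorithm in [5], excluding its geometric-programming cost C_GP."""
    socp = math.sqrt(N + S) * (2 * N * S + 1) ** 2 * (2 * S ** 2 + 2 * N * S + S)
    return socp + K * M ** OMEGA


if __name__ == "__main__":
    N, S, K, M = 8, 8, 4, 2
    print("centralized  :", centralized_cost(N, S, K, M))
    print("distributed  :", distributed_cost(N, K, M))
    print("[5] (no GP)  :", existing_cost_without_gp(N, S, K, M))
```

These are per-iteration figures only; the overall cost also depends on how many iterations each algorithm needs.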

can be formulated as P2 : max


K Sk k=1 i=1

objective function as log2 (kiki )1 , (37) HH Bm BH Hmk + mk m m=1 k=1 i=1 k=1 i=1 2 H kk k IMk )wki wki HH bki bH Hkk wki + 1) ki Sm K K H km = tr{BH ( mi Hkm wmi wmi HH )Bk k m=1 i=1 k=1 H H H B BH H W } + Wk kk k k kk k ki (wki ( H = (38)
K k=1 K Sk

{bki i}K k=1

ki ki =

K Sk

s.t [Bk BH ]u,u pku , k, u k

where ki is the rate weighting factor of the kth MS ith symbol and pku is the available power at the kth BS uth antenna. Like in (8), problem P2 can be expressed as min
K Sk k=1 i=1

{bki ,wki ,i}K k=1

kiki ,

H tr{BH Ak Bk Wk HH Bk BH Hkk Wk } k kk k (42) K Sm

s.t

[Bk BH ]u,u k

pku , k, u.

With the help of Lemma 1 and Lemma 2, the above problem can be equivalently reformulated as min
K Sk

[ ki

{ki ,ki ,bki ,wki ,i}K k=1

s.t [Bk BH ]u,u pku , k

k=1 i=1 K Sk k=1 i=1

] 1 ki ki ki + ki ki , ki

where A = H w w H HH , = K k m 2 m=1H i=1 mi km mi mi km S m mj wmj wmj + mj and m=1 j=1 Wk = [k1 wk1 , , kSk wkSk ]. It follows, problem (41) can be reformulated as min
K k=1

ki = 1, (39)
(1 )

{Bk }K k=1

H tr{BH Ak Bk Wk HH Bk BH Hkk Wk }, k kk k (43)

s.t [Bk BH ]u,u pku , k, u. k

ki > 0, ki > 0 k, i, u

1 where ki = 1 ki , ki = 1 1 and ki = ki ki ki . ki k }K , the optimal wki of the Like in (16), for xed {B k=1 above problem is given by (35), and the optimal {ki and ki , i}K are expressed as k=1

Since {Bk }K are not coupled in both the objective and k=1 constraint functions of (43), the above optimization problem is separable [27]. As a result, each of {Bk }K can be optimized k=1 independently by H min tr{BH Ak Bk Wk HH Bk BH Hkk Wk }, k kk k
Bk

[ K Sk
k=1

km m=1 km

1 ]S

s.t [Bk BH ]u,u pku , u. k ,

(44)

ki = ki

[ ki ] 1+1 ki ki = , ki ki

kiki

k, i

(40)

This optimization problem can be examined with our centralized and distributed algorithms like that of (20). The details are omitted for conciseness. Note that the analysis of this section can be extended straightforwardly to the scenario where each MS is served by a subset of two or more BSs.

VI. SIMULATION RESULTS

In this section, we present the simulation results for problem (6) (i.e., P1). All of our simulation results are averaged over 100 randomly chosen channel realizations. The channel between all BSs and each MS consists of ZMCSCG entries with unit variance. It is assumed that the noise variances of all users are the same, i.e., $\{\sigma_k^2 = \sigma^2\}_{k=1}^{K}$. The signal-to-noise ratio (SNR) is defined as $P_{sum}/\sigma^2$ and it is controlled by varying $\sigma^2$, where $P_{sum}$ is the total sum power utilized by all antennas.
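The simulation conventions above (unit-variance ZMCSCG channel entries, SNR defined as $P_{sum}/\sigma^2$ and swept through $\sigma^2$) can be sketched as follows. The helper names `noise_variance` and `zmcscg` are hypothetical and the snippet only illustrates the setup, not the algorithms themselves.

```python
import numpy as np


def noise_variance(snr_db, p_sum):
    """Noise variance sigma^2 that yields SNR = p_sum / sigma^2 at the given SNR in dB."""
    return p_sum / (10.0 ** (snr_db / 10.0))


def zmcscg(rng, *shape):
    """Zero-mean circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p_sum = 8 * 0.125  # eight antennas at 0.125 each, as in Section VI-A
    for snr_db in (0, 5, 10, 15, 20):
        print(snr_db, noise_variance(snr_db, p_sum))
    H = zmcscg(rng, 8, 2)  # one hypothetical 8 x 2 channel realization
    print(H.shape)
```

With eight antennas at $p_n = 0.125$, the total power is 1, so $\sigma^2 = 0.1$ corresponds to an SNR of 10 dB.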

K where S = k=1 Sk . Next, for given {wki , ki and ki , i}K , the optimal {Bk }K of (39) can be obtained by k=1 k=1 solving the following problem min
K Sk k=1 i=1

{Bk }K k=1

ki ki , (41)

s.t [Bk BH ]u,u pku , k, u k

where ki = ki kiki . To employ our centralized and dis tributed algorithms for the above problem, we reexpress its


A. Comparison of our centralized and distributed algorithms, and the algorithm proposed in [5]

For the comparison of these three algorithms, we consider a system with $L = 2$ BSs where each BS has 4 antennas, and $K = 4$ MSs where each MS has 2 antennas. It is assumed that $\{p_n = 0.125\}_{n=1}^{8}$ and $\boldsymbol{\omega}_1 = [0.6\ 0.4\ 0.5\ 0.8\ 0.25\ 0.8\ 0.46\ 0.28]^T$. First, we compare these three algorithms based on the power utilized at each antenna when $\sigma^2 = 0.1$. For this system setup, all three algorithms utilize the maximum available power at each BS antenna$^{7}$. Second, we compare the performance of the aforementioned algorithms based on their total achievable weighted sum rate. Fig. 2 shows that the proposed distributed algorithm achieves the same weighted sum rate as the centralized algorithm. Moreover, our proposed algorithms outperform the algorithm proposed in [5].
Fig. 3. Comparison of the proposed centralized and distributed algorithms, and the algorithm in [5] (weighted sum rate in bps/Hz versus SNR in dB).

B. Convergence characteristics of the proposed algorithms and the algorithm in [5]

In Section IV, the computational complexities of the proposed centralized and distributed algorithms, and of the algorithm in [5], are discussed for a single iteration only. Therefore, to compare the overall computational complexities of our algorithms and the algorithm in [5], the convergence speed of these algorithms should be examined. In this simulation, we examine the convergence speed of our algorithms and the algorithm proposed in [5] for the initialization presented in Algorithm I. We have used the same simulation parameters as in the first paragraph of Section VI-A. As can be seen from Fig. 4, the proposed algorithms converge faster and achieve a higher weighted sum rate than the algorithm proposed in [5].

Fig. 2. Comparison of the proposed centralized and distributed algorithms, and the algorithm proposed in [5] (weighted sum rate in bps/Hz versus SNR in dB).

Next, we compare the performances of the proposed algorithms and the algorithm in [5] for different rate weighting factors. The comparison is based on the total weighted sum rate. For this purpose we use two sets of rate weighting factors, $\boldsymbol{\omega}_2 = [0.9\ 0.2\ 0.5\ 0.95\ 0.1\ 0.9\ 0.2\ 0.05]$ and $\boldsymbol{\omega}_3 = [0.1\ 0.4\ 0.2\ 0.6\ 0.3\ 0.16\ 0.12\ 0.25]$. For these weighting factors, the weighted sum rates of the proposed algorithms and the algorithm in [5] are plotted in Fig. 3. As can be seen from this figure, the proposed algorithms outperform the algorithm in [5]. From Fig. 2 and Fig. 3, we can observe that the performance gap between the proposed algorithms and the algorithm in [5] depends on the weighting factors. Here, we would like to mention that for more than 90% of our channel realizations, the proposed algorithms achieve at least the same weighted sum rate as that of [5].

7 We would like to mention here that for problem (6) all antennas do not necessarily utilize their maximum powers to optimize the total weighted sum rate (see for example [13]).
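The comparison metric used in these figures can be sketched as follows, assuming that the weighted sum rate of problem (6) has the same form as in the multi-cell case, i.e., a weighted sum of $\log_2(1/\xi_s)$ with $\xi_s$ the MMSE of symbol $s$ under the MMSE receiver. The function name `weighted_sum_rate`, the `owner` index list (which MS receives each symbol) and the argument layout are illustrative assumptions, not the authors' code.

```python
import numpy as np


def weighted_sum_rate(H_list, B, owner, omega, sigma2):
    """Sum over symbols of omega_s * log2(1 / xi_s) with MMSE receivers.

    H_list[k] : N x M channel of MS k (received signal uses H_k^H)
    B         : N x S precoder matrix, one column per symbol
    owner[s]  : index of the MS that receives symbol s
    """
    rate = 0.0
    BBh = B @ B.conj().T
    for s in range(B.shape[1]):
        Hs = H_list[owner[s]]
        C = Hs.conj().T @ BBh @ Hs + sigma2 * np.eye(Hs.shape[1])
        v = Hs.conj().T @ B[:, s]
        # MMSE of symbol s: 1 - b_s^H H_s C^{-1} H_s^H b_s, which lies in (0, 1]
        xi = 1.0 - np.real(v.conj() @ np.linalg.solve(C, v))
        rate += omega[s] * np.log2(1.0 / xi)
    return rate
```

A typical call would pass four $8 \times 2$ channel matrices, an $8 \times 8$ precoder whose rows each carry power 0.125, `owner = [0, 0, 1, 1, 2, 2, 3, 3]` and the eight-entry weight vector of Section VI-A.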

Fig. 4. Comparison of the convergence characteristics of our algorithms and the algorithm in [5] when $\sigma^2 = 0.1$ (weighted sum rate in bps/Hz versus number of iterations).


C. Convergence characteristics of Algorithm II

To demonstrate the computational advantage of our distributed algorithm over the centralized algorithm, we examine the convergence characteristics of Algorithm II for both small-scale and large-scale networks.

1) Small-scale network: In this simulation we demonstrate the convergence characteristics of Algorithm II for a system with $L = 2$ coordinated BSs, each with two antennas, and $K = 2$ MSs, each equipped with 2 antennas. For this system, Fig. 5 shows the convergence characteristics of Algorithm II at different iterative stages of Algorithm III (i.e., for different $\{w_s, \eta_s\}_{s=1}^{S}$). As can be seen from this figure, Algorithm II converges to an optimal solution in fewer than 10 iterations.
Fig. 6. Convergence characteristics of Algorithm II at different iterative stages of Algorithm III for large-scale network (objective function of (27) versus number of iterations).
Fig. 5. Convergence characteristics of Algorithm II at different iterative stages of Algorithm III for small-scale network with the set $Z$ as given in (45), where $Z = [Z_1\ Z_2; Z_3\ Z_4]$ with $Z_k = [[W_1; \cdots; W_K]\ \boldsymbol{\eta}]$ and $\boldsymbol{\eta} = [\eta_1, \cdots, \eta_S]^T$ (objective function of (27) versus number of iterations, one curve per $Z_1$ to $Z_4$).

2) Large-scale network: Next we examine the convergence characteristics of Algorithm II for large-scale networks. We consider a system with $L = 25$ coordinated BSs, each with four antennas, and $K = 50$ MSs, each equipped with 2 antennas. For simplicity, we assume that $\{p_n = 0.25\}_{n=1}^{N}$ and $\sigma^2 = 0.1^{8}$. The weighting factor of the $s$th symbol ($\omega_s$) is chosen from a uniform distribution with $\{0 < \omega_s < 1\}_{s=1}^{S}$. For these settings, we examine the convergence characteristics of Algorithm II at different iterative stages of Algorithm III. As can be seen from Fig. 6, Algorithm II converges to an optimal solution within a few iterations.

D. Simulation results for problem (6) when $\{M_k = 1\}_{k=1}^{K}$

When each MS has a single antenna, the global optimal solution of (6) can be obtained with the framework of the MGO algorithm as discussed in [18]. The MGO algorithm requires solving a feasibility problem to get the upper boundary feasible points of a monotonic optimization problem (see also [28] for more details about MGO and upper boundary feasible
8 Similar

points of a monotonic optimization problem). For our case, this feasibility problem (i.e., rate feasibility problem) can be solved by the phase rotation technique of [17]. According to [18], the computational complexity of the MGO algorithm grows quickly with the number of users. Thus, the MGO algorithm serves as a benchmark for suboptimal, less complex algorithms. On the other hand, a simple improved zero-forcing (IZF) solution for (6) with $\{M_k = 1\}_{k=1}^{K}$ can be obtained by the approach of [29] (see Section V.B of [29]). These findings motivate us to compare our proposed algorithms with MGO, IZF and the algorithm of [5] when $\{M_k = 1\}_{k=1}^{K}$. The comparison of these algorithms is based on the total weighted sum rate of all users when $L = K = 3$, $\{N_l = 1\}_{l=1}^{L}$, $\{p_n = 1/N\}_{n=1}^{N}$, the rate weighting factors $\tilde{\boldsymbol{\omega}}_1 = [0.46\ 0.83\ 0.79]^T$ and $\tilde{\boldsymbol{\omega}}_2 = [0.9\ 0.54\ 0.1]^T$, and all the other settings are the same as in the first paragraph of Section VI. For the MGO algorithm, we have used tolerances of 0.001 and 0.01, which are analogous to the pair of tolerance parameters of [28]. Here, we have employed the weighting factors $\{\tilde{\boldsymbol{\omega}}_i\}_{i=1}^{2}$ to differentiate them from the weighting factors $\{\boldsymbol{\omega}_i\}_{i=1}^{3}$ which are used in Sections VI-A to VI-C for $\{M_k = 2\}_{k=1}^{K}$. As can be seen from Fig. 7(a)-(b), the proposed algorithms achieve the global optimum, whereas the algorithms in [5] and [29] do not. As expected, in high SNR regions, the weighted sum rate achieved by the IZF algorithm of [29] approaches the optimal weighted sum rate. However, the exact SNR value at which this happens is not necessarily the same for all rate weighting factors. We would like to mention here that the performance characteristics of our proposed algorithms, the algorithm of [5], the IZF algorithm of [29] and the MGO algorithm of [18] for P2 are like those for P1. For this reason, we omit the simulation results of these algorithms for P2.
VII.
C ONCLUSIONS This paper considers the joint linear transceiver design problem for downlink multiuser MIMO systems with coordi-

behavior is observed for the other values of $\sigma^2$.


Z=

0.562 0.033i 0.007 + 0.110i 0.053 0.081i 0.759 + 0.033i 0.641 0.018i 0.068 + 0.000i 0.141 + 0.043i 0.809 + 0.018i 0.614 + 0.014i 0.0447 + 0.083i 0.132 + 0.284i 0.246 0.003i 0.816 0.016i 0.188 0.000i 0.107 0.093i 1.045 + 0.042i

1.172 0.423 0.678 0.968 4.103 0.188 0.410 1.646

0.620 0.011i 0.055 + 0.113i 0.110 + 0.129i 0.624 + 0.003i 0.764 0.033i 0.134 + 0.036i 0.081 0.042i 0.962 + 0.045i 0.597 + 0.027i 0.012 + 0.026i 0.130 + 0.332i 0.052 0.001i 0.757 + 0.003i 0.253 0.077i 0.290 0.176i 0.995 + 0.024i

2.002 0.276 0.599 1.215 4.701 0.165 0.304 2.375

(45)


receivers constant, the precoder matrices of all users are optimized by using SOCP and matrix fractional minimization approaches for the centralized and distributed algorithms, respectively. Finally, the second and third steps are repeated until these algorithms converge. We have shown that the proposed algorithms require less computational cost than the existing algorithm. Moreover, the proposed algorithms achieve a higher weighted sum rate than the existing linear algorithm. All simulation results show that the proposed distributed algorithm achieves the same performance as the centralized algorithm. In particular, when each of the users has a single antenna, we have observed that the proposed algorithms achieve the global optimum.

APPENDIX A
PROOF OF THE EQUIVALENCE OF (7) AND (8)


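Since the constraint functions of (8) do not depend on $\{w_s\}_{s=1}^{S}$, the receivers $\{w_s\}_{s=1}^{S}$ of (8) can be optimized by applying standard first-order differentiation of $\prod_{s=1}^{S} \xi_s^{\omega_s}$ with respect to $\{w_s^H\}_{s=1}^{S}$. By doing so, we get

$$
\frac{\partial \big( \prod_{i=1}^{S} \xi_i^{\omega_i} \big)}{\partial w_s^H} = \Big( \prod_{i=1, i \neq s}^{S} \xi_i^{\omega_i} \Big) \omega_s \xi_s^{(\omega_s - 1)} \frac{\partial \xi_s}{\partial w_s^H} = 0 \ \Rightarrow\ w_s = (H_s^H B B^H H_s + \sigma_s^2 I)^{-1} H_s^H b_s
$$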

Fig. 7. Comparison of the proposed algorithms, MGO algorithm of [18], IZF algorithm of [29] and the algorithm in [5] (weighted sum rate in bps/Hz versus SNR in dB). (a) For the rate weighting factor $\tilde{\boldsymbol{\omega}}_1$. (b) For the rate weighting factor $\tilde{\boldsymbol{\omega}}_2$.

where the last equality follows from the fact that $\big( \prod_{i=1, i \neq s}^{S} \xi_i^{\omega_i} \big) \omega_s \xi_s^{(\omega_s - 1)}$ is always positive for any $\{b_s, \omega_s\}_{s=1}^{S}$ with $0 < \omega_s < 1$. Now, by substituting the above $w_s$ into (8), we get (7).

APPENDIX B
COMPUTATION OF Ce

Problem (8) has been examined in [5] for the case where the power of each symbol is strictly positive. In [5], the transmitters are decomposed into a product of unity norm filter and square root of power allocation matrices, and the receiver matrix of each user is decomposed as a product of the inverse of the square root of power allocation, unity norm filter and diagonal scaling factor matrices (see Section 2 of [5]). Upon doing so, the weighted sum rate maximization problem is formulated as in (2) of [5]. Then, for (2) of [5], that paper utilizes Algorithm 1 of its Table 1. Here, we summarize the computational cost required to perform one iteration of Algorithm 1 in [5] by using the system model parameter settings as discussed in Section II of our paper. For simplicity, we assume that $\{M_k = M\}_{k=1}^{K}$. The major computational cost

nated BSs. We examine maximization of the total weighted sum rate with per BS antenna power constraint problem. We propose novel centralized and computationally efcient distributed iterative algorithms that achieve local optimum to the latter problem. These algorithms are described as follows. First, by introducing additional optimization variables, we reformulate the original problem into a new problem. Second, for the given precoder matrices of all users, the optimal receivers are computed using MMSE method and the optimal introduced variables are obtained in closed form expressions. Third, by keeping the introduced variables and


of Algorithm 1 of [5] comes from the steps 5, 6, 7 and 8. The steps 5 and 7 of the latter algorithm contain matrix inversion. According to [25], matrix inversion can be performed with a complexity of O(K M 2.376 ). The computational load of step 8 is on the order of O( (N + S)(2N S +1)2 (2S 2 +2N S +S)) [26] (see page 196 of [26] for the details). In general, the computational complexity of step 6 depends on different parameter settings and solution methods. The detail analysis on the computational complexity of GP problems (i.e., step 6) can be found in [23], [30] (see page 36 of [30] for the Barrier method of solving GP problems). Therefore, the computational complexity of Algorithm 1 in [5] per iteration is given by O( (N + S)(2N S +1)2 (2S 2 +2N S +S))+O(K M 2.376 )+ CGP , where CGP is the computational cost of the GP in [5]. A PPENDIX C P ROOF OF Lemma 1 Proof: For xed {bs , ws }S , optimizing {s }S of s=1 s=1 (10) can be expressed as ( )S S S 1 min s s , s.t s = 1, s 0, s. (46) S s=1 {s }S s=1 s=1 The above problem is GP for which global optimality is guaranteed. Clearly, the optimal solution of (46) satisfy {s > 0}S , and the objective and constraint functions of this s=1 problem are continuously differentiable. Moreover, by replacS ing 1 = ( s=2 s )1 , the equality constraint of the latter problem can be removed. These two facts show that the optimal {s }S of the above problem are regular [31], [32]9 . s=1 Thus, the global optimal solution of (46) can be obtained by choosing {s }S that satisfy the Karush-Kuhn-Tucker (KKT) s=1 optimality conditions which are given by [23] ( )S1 S S s 1 i i i s = 0 (47) S i=1
i=1,i=s

Substituting of (51) into (50), and noting that 0 we obtain S 1 i i . s s = S i=1

1 S

i=1 i i

>

(52) S
s=1

Multiplying the S equalities of (52) and utilizing yields 1 s = ( S s=1


S S i=1

s = 1

i i )S .

(53)

The above expression shows that the optimal/suboptimal solution of (8) can be equivalently obtained by solving (10). By employing (52) and (53), it can be shown that the optimal {s }S of (46) can be expressed as in (11) [13], [32]. s=1 R EFERENCES
[1] S. Vishwanath, N. Jindal, and A. Goldsmith, Duality, achievable rates, and sum-rate capacity of Gaussian MIMO broadcast channels, IEEE Tran. Info. Theo., vol. 49, no. 10, pp. 2658 2668, Oct. 2003. [2] M. Costa, Writing on dirty paper, IEEE Tran. Info. Theo., vol. 29, no. 3, pp. 439 441, May 1983. [3] W. Yu and J. M. Ciof, Sum capacity of Gaussian vector broadcast channels, IEEE Tran. Info. Theo., vol. 50, no. 9, pp. 1875 1892, Sep. 2004. [4] Q. H. Spencer, A. L. Swindlehurst, and M. Haardt, Zero-forcing methods for downlink spatial multiplexing in multiuser MIMO channels, IEEE Trans. Sig. Proc., vol. 52, no. 2, pp. 461 471, Feb. 2004. [5] S. Shi, M. Schubert, and H. Boche, Per-antenna power constrained rate optimization for multiuser MIMO systems, in International ITG Workshop on Smart Antennas (WSA), Berlin, Germany, 26 27 Feb. 2008, pp. 270 277. [6] S. Shi, M. Schubert, and H. Boche, Rate optimization for multiuser MIMO systems with linear processing, IEEE Tran. Sig. Proc., vol. 56, no. 8, pp. 4020 4030, Aug. 2008. [7] A. J. Tenenbaum and R. S. Adve, Improved sum-rate optimization in the multiuser MIMO downlink, in 42nd Annual Conference on Information Sciences and Systems (CISS), Princeton, NJ, USA, 19 21 Mar. 2008, pp. 984 989. [8] T. Endeshaw, B. Chalise, and L. Vandendorpe, Robust sum rate optimization for the downlink multiuser MIMO systems: Worst-case design, in Proc. IEEE International Conference on Communications (ICC), Cape Town, South Africa, 23 27 May. 2010, pp. 1 5. [9] K. M. Karakayali, G. J. Foschini, and R. A. Valenzuela, Network coordination for spectrally efcient communications in cellular systems, IEEE. Tran. Wirel. Comm., vol. 13, no. 4, pp. 56 61, Aug. 2006. [10] H. Dahrouj and W. Yu, Coordinated beamforming for the multi-cell multi-antenna wireless system, IEEE Tran. Wirel. Comm., vol. 9, no. 5, pp. 1748 1759, May 2010. [11] E. Bjornson, R. Zakhour, D. Gesbert, and B. 
Ottersten, Cooperative multicell precoding: Rate region characterization and distributed strategies with instantaneous and statistical CSI, IEEE Tran. Sig. Proc., vol. 58, no. 8, pp. 4298 4310, Aug. 2010. [12] E. Bjornson, R. Zakhour, D. Gesbert, and B. Ottersten, Distributed multicell and multiantenna precoding: Characterization and performance evaluation, in Proc. IEEE Global Telecommunications Conference (GLOBECOM), Honolulu, HI, USA, 30 Nov. 4 Dec. 2009, pp. 1 6. [13] T. Endeshaw and L. Vandendorpe, Sum rate optimization for coordinated multi-antenna base station systems, in Proc. IEEE International Conference on Communications (ICC), Kyoto, Japan, 23 27 May 2011. [14] S. Shi, M. Schubert, N. Vucic, and H. Boche, MMSE optimization with per-base-station power constraints for network MIMO systems, in Proc. IEEE International Conference on Communications (ICC), Beijing, China, 19 23 May 2008, pp. 4106 4110. [15] T. Tamaki, K. Seong, and J. M. Ciof, Downlink MIMO systems using cooperation among base stations in a slow fading channel, in Proc. IEEE International Conference on Communications (ICC), Glasgow, UK, 24 28 Jun. 2007, pp. 4728 4733. [16] T. E. Bogale, L. Vandendorpe, and B. K. Chalise, MMSE transceiver design for coordinated base station systems: Distributive algorithm, in 44th Annual Asilomar Conference on Signals, Systems, and Computers, Pacic Grove, CA, USA, 7 10 Nov. 2010.

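$$\tau_s \nu_s = 0 \tag{48}$$

$$\tau_s \geq 0, \quad \forall s \tag{49}$$

where $\lambda$ and $\{\tau_s\}_{s=1}^{S}$ are the Lagrangian multipliers corresponding to the constraints $\prod_{s=1}^{S} \nu_s = 1$ and $\{\nu_s \geq 0\}_{s=1}^{S}$, respectively. Multiplying (47) by $\nu_s$, and employing $\prod_{s=1}^{S} \nu_s = 1$ and (48), results in

$$\left(\frac{1}{S}\sum_{i=1}^{S} \nu_i \xi_i\right)^{S-1} \nu_s \xi_s - \lambda \nu_s \prod_{i=1, i \neq s}^{S} \nu_i - \tau_s \nu_s = 0$$

$$\Rightarrow \left(\frac{1}{S}\sum_{i=1}^{S} \nu_i \xi_i\right)^{S-1} \nu_s \xi_s = \lambda \prod_{i=1}^{S} \nu_i = \lambda. \tag{50}$$

By summing the $S$ equalities of (50), $\lambda$ can be determined by

$$\lambda = \left(\frac{1}{S}\sum_{i=1}^{S} \nu_i \xi_i\right)^{S}. \tag{51}$$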
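As a numerical sanity check of (50) and (51), the following minimal sketch assumes that the objective being minimized is $\left(\frac{1}{S}\sum_{i=1}^{S}\nu_i\xi_i\right)^{S}$ subject to $\prod_{s=1}^{S}\nu_s = 1$, with strictly positive $\xi_s$. Under that assumption, (50) forces $\nu_s\xi_s$ to take the same value for every $s$, which together with the product constraint gives $\nu_s = \left(\prod_{i=1}^{S}\xi_i\right)^{1/S}/\xi_s$. The names `xi`, `nu`, `optimal_nu`, and `objective`, and the sample values, are illustrative and do not come from the paper.

```python
import math

def optimal_nu(xi):
    """Closed-form nu implied by (50): nu_s * xi_s is identical for all s,
    and the constraint prod(nu) = 1 fixes that common value to the
    geometric mean of xi. Assumes every xi_s is strictly positive."""
    S = len(xi)
    geo_mean = math.exp(sum(math.log(x) for x in xi) / S)
    return [geo_mean / x for x in xi]

def objective(nu, xi):
    """Assumed objective: ((1/S) * sum_i nu_i * xi_i) ** S."""
    S = len(xi)
    return (sum(n * x for n, x in zip(nu, xi)) / S) ** S

xi = [0.2, 0.5, 0.9]     # hypothetical positive values of xi_s
nu = optimal_nu(xi)
lam = objective(nu, xi)  # multiplier value given by (51)

# At the stationary point, lam coincides with prod(xi) up to rounding.
print(lam, math.prod(xi))
```

With this choice of $\nu_s$, the value in (51) reduces to $\prod_{s=1}^{S}\xi_s$, and by the arithmetic-geometric mean inequality any other feasible $\nu$ gives an objective value at least as large.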
9 For inequality-constrained optimization problems, a feasible point is said to be regular if all the inequality constraints are inactive at that point [31], [32].

Copyright (c) 2011 IEEE. Personal use is permitted. For any other purposes, permission must be obtained from the IEEE by emailing pubs-permissions@ieee.org.


[17] W. Yu and T. Lan, "Transmitter optimization for the multi-antenna downlink with per-antenna power constraints," IEEE Trans. Sig. Proc., vol. 55, no. 6, pp. 2646-2660, Jun. 2007.
[18] J. Brehmer and W. Utschick, "Utility maximization in the multiuser MISO downlink with linear precoding," in IEEE International Conference on Communications (ICC), Munich, Germany, 14-18 Jun. 2009.
[19] T. Endeshaw, B. K. Chalise, and L. Vandendorpe, "MSE uplink-downlink duality of MIMO systems under imperfect CSI," in 3rd IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), Aruba, 13-16 Dec. 2009, pp. 384-387.
[20] A. Zanella, M. Chiani, and M. Z. Win, "MMSE reception and successive interference cancellation for MIMO systems with high spectral efficiency," IEEE Trans. Wirel. Comm., vol. 4, no. 3, pp. 1244-1253, May 2005.
[21] A. Lapidoth, "Nearest neighbor decoding for additive non-Gaussian noise channels," IEEE Trans. Info. Theo., vol. 42, pp. 1520-1529, Sep. 1996.
[22] H. Sampath, P. Stoica, and A. Paulraj, "Generalized linear precoder and decoder design for MIMO channels using the weighted MMSE criterion," IEEE Trans. Sig. Proc., vol. 49, no. 12, pp. 2198-2206, Dec. 2001.
[23] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, Cambridge, 2004.
[24] A. Mutapcic, K. Koh, S. Kim, and S. Boyd, "GGPLAB: A simple Matlab toolbox for geometric programming," May 2006, http://www.stanford.edu/~boyd/ggplab/.
[25] D. Coppersmith and S. Winograd, "Matrix multiplication via arithmetic progressions," Journal of Symbolic Computation, vol. 9, pp. 251-280, 1990.
[26] M. S. Lobo, L. Vandenberghe, S. Boyd, and H. Lebret, "Applications of second-order cone programming," Linear Algebra and its Applications, vol. 284, pp. 193-228, 1998.
[27] D. P. Palomar and M. Chiang, "A tutorial on decomposition methods for network utility maximization," IEEE Jour. Sel. Commun., vol. 24, no. 8, pp. 1439-1451, Aug. 2006.
[28] A. Rubinov, H. Tuy, and H. Mays, "An algorithm for monotonic global optimization problems," A Journal of Mathematical Programming and Operations Research, vol. 49, no. 3, pp. 205-221, 2001.
[29] A. Wiesel, Y. Eldar, and S. Shamai, "Zero-forcing precoding and generalized inverses," IEEE Trans. Sig. Proc., vol. 56, no. 9, pp. 4409-4418, Sep. 2008.
[30] M. Chiang, Geometric Programs for Communication Systems, Hanover, Princeton University, Princeton, NJ 08544, USA, 2005.
[31] D. Ding and S. D. Blostein, "MIMO minimum total MSE transceiver design with imperfect CSI at both ends," IEEE Trans. Sig. Proc., vol. 57, no. 3, pp. 1141-1150, Mar. 2009.
[32] B. Jaumard, C. Meyer, and H. Tuy, "Generalized convex multiplicative programming via quasiconcave minimization," Journal of Global Optimization, vol. 10, no. 3, pp. 229-256, Apr. 1997.

Luc Vandendorpe (M'93-SM'99-F'06) was born in Mouscron, Belgium, in 1962. He received the Electrical Engineering degree (summa cum laude) and the Ph.D. degree from the Université Catholique de Louvain (UCL), Louvain-la-Neuve, Belgium, in 1985 and 1991, respectively. Since 1985, he has been with the Communications and Remote Sensing Laboratory of UCL, where he first worked in the field of bit rate reduction techniques for video coding. In 1992, he was a Visiting Scientist and Research Fellow at the Telecommunications and Traffic Control Systems Group of the Delft Technical University, The Netherlands, where he worked on spread spectrum techniques for personal communications systems. From October 1992 to August 1997, he was Senior Research Associate of the Belgian NSF at UCL, and invited Assistant Professor. He is currently a Professor and head of the Institute for Information and Communication Technologies, Electronics and Applied Mathematics. His current interest is in digital communication systems and more precisely resource allocation for OFDM(A)-based multicell systems, MIMO and distributed MIMO, sensor networks, turbo-based communications systems, physical layer security and UWB-based positioning. Dr. Vandendorpe was corecipient of the 1990 Biennal Alcatel-Bell Award from the Belgian NSF for a contribution in the field of image coding. In 2000, he was corecipient (with J. Louveaux and F. Deryck) of the Biennal Siemens Award from the Belgian NSF for a contribution about filter-bank-based multicarrier transmission. In 2004, he was co-winner (with J. Czyz) of the Face Authentication Competition, FAC 2004. He is or has been TPC member for numerous IEEE conferences (VTC, Globecom, SPAWC, ICC, PIMRC, WCNC) and for the Turbo Symposium. He was Co-Technical Chair (with P. Duhamel) for the IEEE ICASSP 2006.
He was an Editor for Synchronization and Equalization of the IEEE TRANSACTIONS ON COMMUNICATIONS between 2000 and 2002, Associate Editor of the IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS between 2003 and 2005, and Associate Editor of the IEEE TRANSACTIONS ON SIGNAL PROCESSING between 2004 and 2006. He was Chair of the IEEE Benelux joint chapter on Communications and Vehicular Technology between 1999 and 2003. He was an elected member of the Signal Processing for Communications committee between 2000 and 2005, and an elected member of the Sensor Array and Multichannel Signal Processing committee of the Signal Processing Society between 2006 and 2008. Currently, he is an elected member of the Signal Processing for Communications committee. He is the Editor-in-Chief for the EURASIP Journal on Wireless Communications and Networking. L. Vandendorpe is a Fellow of the IEEE.

Tadilo Endeshaw Bogale (S'09) was born in Gondar, Ethiopia. He received his B.Sc. and M.Sc. degrees in Electrical Engineering from Jimma University, Jimma, Ethiopia, and Karlstad University, Karlstad, Sweden, in 2004 and 2008, respectively. From 2004 to 2007, he worked at the Ethiopian Telecommunications Corporation (now Ethio-Telecom) in the mobile project department. Since 2009, he has been working towards his Ph.D. degree and as an assistant researcher at the ICTEAM institute, Université Catholique de Louvain (UCL), Louvain-la-Neuve, Belgium. His research interests include robust (non-robust) transceiver design for multiuser MIMO systems, centralized and distributed algorithms, and convex optimization techniques for multiuser systems.
