
Troubleshooting Exchange Networking: DNS (Part 1)

Written by on January 5, 2012

Exchange administrators often receive escalated help desk tickets from users complaining that Exchange is slow and demanding resolution. These tickets can be extremely challenging to work: "slow" is at best a relative term, the tickets are never specific about what precisely is slow, and the problem is often intermittent or impossible to reproduce. The Exchange admin can check the server(s) for high CPU utilization, low memory conditions, and disk and network queue lengths exceeding the norm, and, finding nothing, push the ticket back to the desktop support team as a client issue. While it often is a client issue, there are several places between Outlook and a user's mailbox that can cause intermittent slowness and that are fair to call networking bottlenecks.

In this six-part series of articles, we'll look at how Exchange interacts on the network with various other services, to help you identify network issues and troubleshoot them when they occur. In many cases, troubleshooting Exchange network bottlenecks will require a network trace, and may also require performance monitor counters. This series will talk about both of those in general terms; how to use NetMon, Wireshark, and PerfMon is out of scope. In Part 1 of this series, we're going to discuss how Exchange depends upon and interacts with DNS on the network.

DNS
DNS is one of the most important and fundamental services on any TCP/IP network, and the critical role it plays in all aspects of Exchange cannot be overstated. Every single interaction between servers depends on being able to resolve a name to an IP address, and being able to quickly (and correctly) perform name resolution can set the tone for the entire transaction.

Most of you will be using AD integrated DNS, so your DNS servers will be domain controllers. Keep in mind that the default TTL for AD integrated zones is 3600 seconds, so your Exchange servers will cache responses for an hour before trying to resolve the same name again. Using AD integrated zones also means that changes to DNS records must replicate to all domain controllers, and then the TTL must expire, before you can assume that a client or Exchange server is resolving a name to the right IP address.

To ensure that the right IP address is being provided in response to a query, open an administrative command prompt on the Exchange server you are troubleshooting, and use the NSLOOKUP command to query the primary DNS server and then the secondary. Confirm that both provide the same result and that it is correct, and then ping the destination server by name. Compare the IP address in the PING output to what NSLOOKUP returned to be sure that your Exchange server is trying to reach the right address. If it is not, issue the ipconfig /flushdns command to clear the local cache, and try again.

>nslookup exch2.example.com
Server:  dc1.example.com
Address:  192.168.0.2

Name:    exch2.example.com
Address:  192.168.0.6

>ping exch2.example.com
Pinging exch2.example.com [192.168.0.9] with 32 bytes of data:
Reply from 192.168.0.104: Destination host unreachable.
Ping statistics for 192.168.0.9:
    Packets: Sent = 1, Received = 1, Lost = 0 (0% loss),

>ipconfig /flushdns
Windows IP Configuration
Successfully flushed the DNS Resolver Cache.

>ping exch2.example.com
Pinging exch2.example.com [192.168.0.6] with 32 bytes of data:
Reply from 192.168.0.6: bytes=32 time=4ms TTL=128
Ping statistics for 192.168.0.6:
    Packets: Sent = 1, Received = 1, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 2ms, Maximum = 2ms, Average = 2ms

You want to place DNS servers as close to your Exchange servers as possible, configure your Exchange servers to use the closest DNS servers they can, and keep the application response time (ART) for DNS queries as low as possible. If it takes more than 50 milliseconds to resolve a DNS query, performance will suffer. You can use a protocol analyzer like Microsoft's NetMon or Wireshark to measure that, or you can just use the dig command. A Windows port is available for download. The dig command can tell you how long it takes to resolve a name.

>dig @192.168.0.2 -t a exch2.example.com

; <<>> DiG 9.3.2 <<>> @192.168.0.2 -t a exch2.example.com
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 104
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;exch2.example.com.             IN      A

;; ANSWER SECTION:
exch2.example.com.      3600    IN      A       192.168.0.6

;; Query time: 8 msec
;; SERVER: 192.168.0.2#53(192.168.0.2)
;; WHEN: Fri Dec 30 15:29:26 2011
;; MSG SIZE  rcvd: 51

Eight milliseconds is not bad at all. Your internal Exchange servers (CAS, Hub, UM, and Mailbox) should be configured to use local servers for both their primary and secondary DNS. In sites where there is only one DNS server, you really ought to add another, but if you cannot, configure the secondary to be the remote server with the least latency. That won't always be the one on the other side of the connection with the greatest bandwidth; test.

Your Edge Transport servers should be configured to send DNS queries to servers as close to the Internet edge as possible, and these should be able to go straight to the root servers rather than forwarding to your ISP. That way, every MX lookup, SPF lookup, DKIM lookup, and PTR lookup that the Edge must perform when sending or receiving a message can complete as quickly as possible. Configuring the Exchange server to query an internal DNS server, which then must forward to your ISP, which then may forward to another, adds lots of latency to every DNS lookup. Sure, the operating system will cache those lookups, but caches expire and you are exchanging email with hundreds or thousands of domains each day.

Keep in mind that changes beyond your control will be made as other admins move their services to different servers, networks, etc. Changes to DNS records take time to replicate; if you are troubleshooting a connectivity failure to a remote system, don't forget that they may be in the middle of a change and the DNS records you have are simply stale. Time will sort that out for you.

Consider that DNS queries must be resolved before an Exchange server can connect to the Global Catalog server, which it must do for authentication, to expand distribution lists, to look up topology information, and to do practically anything else, and you will understand that you don't want to waste time just trying to resolve a name to an IP address.
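To watch lookup times over a longer period than a single dig run, the 50 millisecond guideline above can be turned into a small check. The following is a minimal Python sketch, not a replacement for dig or a protocol analyzer: the function names are made up for illustration, and because it calls the operating system resolver it measures what an application on that machine experiences (local cache included) rather than the response time of one specific DNS server.

```python
import socket
import time

SLOW_DNS_MS = 50.0  # guideline threshold quoted in the article


def time_lookup(name, resolver=socket.gethostbyname):
    """Resolve `name` once and return (address, elapsed milliseconds).

    `resolver` is injectable so the timing logic can be exercised
    without touching the network; by default the OS resolver is used.
    """
    start = time.perf_counter()
    address = resolver(name)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return address, elapsed_ms


def is_slow(elapsed_ms, threshold_ms=SLOW_DNS_MS):
    """Return True when a lookup took longer than the threshold."""
    return elapsed_ms > threshold_ms
```

For example, calling time_lookup("exch2.example.com") on the Exchange server returns the resolved address and the elapsed time, which is_slow can compare against the threshold. A failed lookup raises an OSError subclass, which a caller would normally catch and log.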

Coming up next
In Part 2, we will look at how Exchange interacts with Active Directory at the network level, where bottlenecks can occur, and how to troubleshoot those problems. Here's a rundown of the six parts in this series. We'll update with live links as each part is published over the next several weeks.

1. Introduction and DNS
2. Active Directory
3. Firewalls
4. NICs
5. RPCs
6. Client side issues

Troubleshooting Exchange Networking: Active Directory (Part 2)


Written by on January 16, 2012

Exchange administrators often receive escalated help desk tickets from users complaining that Exchange is slow and demanding resolution. These tickets can be extremely challenging to work: "slow" is at best a relative term, the tickets are never specific about what precisely is slow, and the problem is often intermittent or impossible to reproduce. The Exchange admin can check the server(s) for high CPU utilization, low memory conditions, and disk and network queue lengths exceeding the norm, and, finding nothing, push the ticket back to the desktop support team as a client issue. While it often is a client issue, there are several places between Outlook and a user's mailbox that can cause intermittent slowness and that are fair to call networking bottlenecks.

In this six-part series of articles, we'll look at how Exchange interacts on the network with various other services, to help you identify network issues and troubleshoot them when they occur. In many cases, troubleshooting Exchange network bottlenecks will require a network trace, and may also require performance monitor counters. This series will talk about both of those in general terms; how to use NetMon, Wireshark, and PerfMon is out of scope. In Part 2 of this series, we're going to discuss how Exchange depends upon and interacts with Active Directory on the network.

Active Directory
There's a ton of network interaction between Exchange servers and Active Directory, which is why you are required to have a Global Catalog server in every site in which you have an Exchange server. An Active Directory site is usually defined as a collection of subnets with sufficient bandwidth to support replication, and that can lead to sites spanning WAN links. While the WAN may have sufficient bandwidth and low enough latency to support Active Directory replication and authentication traffic, any AD client in a site may connect to, and query, any Domain Controller within that site. When the target of those queries is across the WAN, the latency of the WAN link can add up to noticeable delays. Understanding just how much goes on between your Exchange server and your Global Catalog server may be enough to make you change the word "site" to "subnet".

Exchange servers will bind to a randomly selected domain controller and global catalog server in the same site, to minimize WAN traffic. Ensuring that there are redundant servers in each site will keep WAN traffic to a minimum and optimize Exchange performance.

Note: Read-only domain controllers are not usable by Exchange. Exchange must access writable domain controllers.

Configuration information
The configuration partition in Active Directory contains critical data about the forest-wide configuration. Exchange configuration information can be found in a subfolder of the Services container in the Configuration partition. This includes:

1. Address lists
2. Address and display templates
3. Administrative groups
4. Client access settings
5. Connections
6. Messaging records management, mobile, and UM mailbox policies
7. Global settings
8. E-mail address policies
9. System policies
10. Transport settings

All Exchange server roles, except the Edge Transport server, will query AD directly for this information. Here's more specific information on how each role depends upon AD. You can also read more about that here: http://technet.microsoft.com/en-us/library/aa998561.aspx.

Hub Transport Server Role


The Hub Transport server must contact Active Directory to perform message categorization, which is necessary for recipient lookup and routing resolution. This includes the location of the recipient's mailbox and any restrictions or permissions that may apply. It will also use LDAP queries to expand the membership of distribution lists and to determine the membership of dynamic distribution lists. The Hub Transport server will use cached information regarding the AD site topology to determine routing for message delivery between sites. If the Hub Transport server determines that a mailbox is in the same site, it will deliver the message directly to the Mailbox server; otherwise it will route the message to a Hub Transport server in the destination site. The Hub Transport server uses the application partition of Active Directory to store and access configuration information, including transport rules, journal rules, and connectors.

Client Access Server Role


The Client Access server role services clients connecting from the Internet who want to use Outlook Web App, POP3, IMAP4, or ActiveSync. When a connection is received, the Client Access server authenticates the user against AD and then queries AD to determine the appropriate Mailbox server. If the user's mailbox is in the same site, the user is connected directly to their mailbox. If it is in a different site, the connection is redirected to a Client Access server in the remote site.

Unified Messaging Server Role


The Unified Messaging server queries Active Directory to retrieve global configuration information, such as dial plans, IP gateways, and hunt groups. When a message is received by the Unified Messaging server, it matches the telephone number to a recipient address and then looks up the location of the user's mailbox. It can then route the voicemail message to a Hub Transport server for delivery to the mailbox.

Mailbox Server Role


The Mailbox server also stores configuration information in Active Directory, including agent configuration, address lists, and policies. The Mailbox server will use this to enforce mailbox policies and global settings.

Edge Transport Server Role

The Edge Transport server doesn't access Active Directory. It stores its configuration in an instance of Active Directory Lightweight Directory Services (AD LDS). It uses an Edge Subscription to subscribe to a Hub Transport server in an Active Directory site, which will use the Microsoft Exchange EdgeSync service to synchronize Active Directory data to AD LDS.

Site definitions
There are two rules of thumb for Active Directory site design and how it impacts Exchange:

1. Make sure every single subnet that hosts an Exchange server belongs to a site.
2. Don't let any of those sites span the WAN, no matter how much bandwidth you have available.

If an Exchange server cannot determine its AD site because its subnet does not belong to a site, MSExchangeDSA will fail with a 2114 error and MSExchangeSA will fail with a 1005 error. Even the fastest WAN links have higher latency than the slowest LAN links, and that latency has a cumulative, negative impact on Exchange performance whenever the server is waiting on responses from a domain controller on the far side of the WAN.
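The first rule of thumb can be checked offline against an export of your subnet-to-site assignments. The sketch below is a minimal Python illustration under stated assumptions: the site names, subnets, and server addresses are made up, the function names are hypothetical, and the "most specific subnet wins" tie-break is an assumption about how overlapping subnet definitions should be read rather than something this article states.

```python
import ipaddress


def find_site(server_ip, site_subnets):
    """Return the site whose subnet contains server_ip, or None.

    site_subnets maps a site name to a list of CIDR strings. When
    several subnets match, the longest prefix (most specific) wins.
    """
    ip = ipaddress.ip_address(server_ip)
    best = None  # (site name, matching network)
    for site, subnets in site_subnets.items():
        for cidr in subnets:
            net = ipaddress.ip_network(cidr)
            if ip in net and (best is None or net.prefixlen > best[1].prefixlen):
                best = (site, net)
    return best[0] if best else None


def servers_without_site(servers, site_subnets):
    """Return the sorted names of servers whose IP matches no site subnet."""
    return sorted(
        name for name, ip in servers.items()
        if find_site(ip, site_subnets) is None
    )


# Hypothetical inventory, for illustration only.
SITE_SUBNETS = {"HQ": ["192.168.0.0/24"], "Branch": ["10.1.0.0/16"]}
SERVERS = {"exch1": "192.168.0.5", "exch2": "192.168.0.6", "exch3": "172.16.4.10"}
```

With this sample data, servers_without_site(SERVERS, SITE_SUBNETS) reports exch3, whose subnet has not been assigned to any site.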

Troubleshooting Exchange interaction with Active Directory


Knowing how Exchange depends upon Active Directory will help you troubleshoot issues. The four main categories of problem are:

1. Network latency between the Exchange server and GC/DC
2. Firewall rules blocking connection attempts
3. Incorrect site configuration
4. Replication problems within AD

If you suspect Exchange is having a problem accessing Active Directory, first ensure that Exchange can communicate with a domain controller for each domain in the forest that has users with mailboxes, and that there is at least one domain controller in the same site that is a global catalog server. Look for errors including 2114, 1005, and 1722. Test connectivity between Exchange and Active Directory by using the PortQueryUI tool, and check the response times to LDAP queries using LDP.EXE and a protocol analyzer.

And of course, ensure that you have no replication problems with your Active Directory. A domain controller that stops replicating because of DNS islanding or other connectivity issues with the rest of the forest will directly impact Exchange. Changes in AD (like name, group membership, SMTP proxy addresses, etc.) must replicate to all domain controllers that Exchange relies upon before you can be sure that Exchange will pick up on and display the differences.

Performance will be enhanced by redundancy. When possible, ensure that there are multiple global catalog servers in the same site as every Exchange server, and that every domain in the forest with Exchange users is represented. Performance of Exchange will also improve directly with the capabilities of those domain controllers. When the DC is able to cache the entire Active Directory in memory, response to queries from Exchange will be much faster. Look at implementing 64-bit DCs with enough RAM to cache the entire database.

On a domain controller, a quick way to check for replication problems is to run this command in an administrative command prompt:

repadmin /replsummary

Check for failures, servers that are down or unreachable, and long times since the last replication event.

Coming up next

In Part 3, we will look at the connectivity requirements for Exchange as they relate to firewalls, and how to troubleshoot those problems. Here's a rundown of the six parts in this series. We'll update with live links as each part is published over the next several weeks:

1. Introduction and DNS
2. Active Directory
3. Firewalls
4. NICs
5. RPCs
6. Client side issues

Troubleshooting Exchange Networking: Firewalls (Part 3)


Written by on January 26, 2012

Exchange administrators often receive escalated help desk tickets from users complaining that Exchange is slow and demanding resolution. These tickets can be extremely challenging to work: "slow" is at best a relative term, the tickets are never specific about what precisely is slow, and the problem is often intermittent or impossible to reproduce. The Exchange admin can check the server(s) for high CPU utilization, low memory conditions, and disk and network queue lengths exceeding the norm, and, finding nothing, push the ticket back to the desktop support team as a client issue. While it often is a client issue, there are several places between Outlook and a user's mailbox that can cause intermittent slowness and that are fair to call networking bottlenecks.

In this six-part series of articles, we'll look at how Exchange interacts on the network with various other services, to help you identify network issues and troubleshoot them when they occur. In many cases, troubleshooting Exchange network bottlenecks will require a network trace, and may also require performance monitor counters. This series will talk about both of those in general terms; how to use NetMon, Wireshark, and PerfMon is out of scope. In Part 3 of this series, we're going to discuss the connectivity you need to permit through firewalls for Exchange to function properly on the network.

Firewalls
There are at least three places where a firewall can cause problems for Exchange. The most common is at your Internet border, when you are trying to support a protocol and the firewall is not permitting the necessary traffic. The second is between your DMZ and the internal network, which can cause issues for both Edge Transport servers and Client Access servers, depending upon whether you pass traffic into them directly (which is not recommended) or you publish the CAS services using TMG or some other reverse web proxy. The third, which is both the least common and the most problematic, is when there are firewalls between different internal Exchange servers, or between Exchange servers and Active Directory.

Clients on the Internet must connect to the CAS servers for the various protocols they will use. Other Internet mail servers must connect to the Edge Transport server to exchange SMTP messages, and all Exchange server roles except the Edge Transport server must query AD directly for configuration information and to perform LDAP lookups for servers in different sites. They will also need to communicate with Active Directory to authenticate users. Edge Transport servers have to communicate with Hub Transport servers both to update their configuration and to pass SMTP traffic in to the internal network.

Any time a firewall is between two Exchange servers, or between an internal Exchange server and either Active Directory or any other part of the Exchange environment, you must ensure that all required traffic is permitted to pass through the firewall. Firewalls frequently translate IP addresses, a technique called NAT. NAT is okay for some protocols; for others, not so much.

Windows Server 2008 and 2008 R2 servers will source all ephemeral connections from ports between 49152 and 65535. If you have any Exchange servers running on Windows Server 2003 or 2003 R2, you will need to expand that range to 1025-65535. The same can be said for clients. Windows Vista and 7 will source their connections from ports between 49152 and 65535. XP clients will source from 1025 to 65535. Let's look at each of the roles to see more about the required connectivity.

Edge Transport Server Role


Of course, your firewall needs to permit inbound TCP 25 from the Internet (ip any) to enable other Internet mail servers to send it email, and source ports can be anything from 1025 on up. You should also permit TCP port 587, which is commonly used by clients sending SMTP over TLS connections. Older firewalls sometimes attempt to perform a rudimentary form of intrusion protection (fixup, inspect, etc.), which can often cause more problems than it solves, so consider carefully whether to enable that or not.

The Edge Transport server doesn't access Active Directory directly; it stores its configuration in an instance of Active Directory Lightweight Directory Services (AD LDS). It uses an Edge Subscription to subscribe to a Hub Transport server in an Active Directory site, which will use the Microsoft Exchange EdgeSync service to synchronize Active Directory data to AD LDS. The Edge Transport server must be able to communicate with each and every Hub Transport server within the site it is subscribed to over TCP port 50636. That's every Hub Transport server in the site, not just one or two, and it will source its queries from an ephemeral port between 49152 and 65535. If you add a Hub Transport server to the site, you must update your firewall rules to include the new server and update your Edge subscription. You can use NAT both for Internet traffic in to the Edge Transport server and for traffic from the Edge Transport server into the Hub Transport servers in the subscribed site.

Hub Transport Server Role


The Hub Transport server must contact Active Directory to perform message categorization, which is necessary for recipient lookup and routing resolution. This includes the location of the recipient's mailbox and any restrictions or permissions that may apply. It will also use LDAP queries to expand the membership of distribution lists and to determine the membership of dynamic distribution lists. It's best if there is no firewall between a Hub Transport server and the Domain Controllers in the same site, but if you must place a firewall between them, ensure that the Exchange server can reach all Domain Controllers in the site over all the following ports and protocols.

Application protocol                        Protocol    Ports
Global Catalog Server                       TCP         3269
Global Catalog Server                       TCP         3268
LDAP Server                                 TCP         389
LDAP Server                                 UDP         389
LDAP SSL                                    TCP         636
LDAP SSL                                    UDP         636
RPC                                         TCP         135
RPC randomly allocated high TCP ports       TCP         49152-65535

Application protocol                        Protocol    Ports
DCOM                                        TCP + UDP   random port number between 49152-65535
ICMP (ping)                                 ICMP
LDAP                                        TCP         389
SMB                                         TCP         445
RPC                                         TCP         135, random port number between 49152-65535
SMTP                                        TCP         25

NAT is no good here; it can break RPC and DCOM traffic, which is used for some Active Directory functions.

Client Access Server Role


The Client Access server role services clients connecting from the Internet who want to use Outlook Web App, POP3, IMAP4, or ActiveSync. When a connection is received, the Client Access server authenticates the user against AD and then queries AD to determine the appropriate Mailbox server. If the user's mailbox is in the same site, the user is connected directly to their mailbox. If it is in a different site, the connection is redirected to a Client Access server in the remote site. If you are going to provide client connections directly to the CAS server, you must permit the following for the relevant client protocols.

Application protocol                        Protocol    Ports
IMAP                                        TCP         143
IMAP over SSL                               TCP         993
POP3                                        TCP         110
POP3 over SSL                               TCP         995
Randomly allocated high TCP ports           TCP         random port number between 49152-65535
RPC                                         TCP         135
RPC over HTTPS                              TCP         443 or 80
SMTP                                        TCP         25

Unified Messaging Server Role


The Unified Messaging server will need essentially the same connectivity as the Hub Transport server role, plus whatever ports are required for your particular VoIP gateway. Consult your vendor's documentation for those specifics.

Mailbox Server Role


The Mailbox server will also need the same connectivity as detailed for the Hub Transport server role.

Limiting RPC ports


Firewall admins don't like to carve large holes in their walls, and will often request that you limit the port ranges used by RPC connections. This is supported and well documented, but be warned: it is very common to limit RPC connections to too narrow a range of ports. This will manifest as random failures, particularly at peak load times, with tons of 1722 errors. If you must restrict RPC ports, I suggest you start with a range of at least 1000 ports, and carefully monitor clients and servers to ensure that this is enough to support all connections during peak times.
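Before handing a proposed RPC range to the firewall admin, it is worth confirming that it really covers the suggested minimum of 1000 ports. The following is a small Python sketch for that sanity check only. The function names are hypothetical, and the range notation (entries such as "5000-5999" or a single port such as "6005") is assumed for illustration; it does not read or change any server configuration.

```python
def count_ports(ranges):
    """Count the distinct ports covered by entries like '5000-5999' or '6005'."""
    ports = set()
    for entry in ranges:
        low, _, high = entry.partition("-")
        start = int(low)
        end = int(high) if high else start
        if not (0 < start <= end <= 65535):
            raise ValueError(f"invalid port range: {entry!r}")
        # A set keeps overlapping ranges from being counted twice.
        ports.update(range(start, end + 1))
    return len(ports)


def range_is_wide_enough(ranges, minimum=1000):
    """Return True when the ranges cover at least `minimum` distinct ports."""
    return count_ports(ranges) >= minimum
```

For instance, ["5000-5999"] covers exactly 1000 ports and passes, while ["5000-5099"] covers only 100 and is the kind of range that tends to cause failures at peak load.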

Troubleshooting Exchange firewall issues


Knowing the ports Exchange uses will help you troubleshoot issues. If you suspect Exchange is having a problem caused by a firewall, it's best if you can work directly with the firewall administrator, who can monitor the source and/or destination IP addresses to see if rules are blocking traffic. If that is not possible, you can test connectivity between Exchange and Active Directory or other Exchange servers by using the PortQueryUI tool. You can also use PING, the TCPING tool, or even the Windows Telnet client to see whether you can connect to a port or not. PortQueryUI can report specific successes or failures, but you can use PING to make sure you can reach the destination server, and then TCPING or Telnet to confirm whether or not you can make a connection on the specific ports required. If you get timeouts or refusals, and you have confirmed the destination server is up and running, then you are probably dealing with a firewall issue. There's no real workaround here; the firewall admin must permit the required traffic for all services.
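When you need to repeat the Telnet-style test across several ports or servers, a short script saves time. The following is a minimal Python sketch of a TCP connection test, offered as an illustration rather than a substitute for PortQueryUI: the function names and result labels are made up, and it covers TCP only, so the UDP and ICMP entries in the port tables still need other tools.

```python
import socket


def check_tcp_port(host, port, timeout=3.0):
    """Attempt one TCP connection and classify the result.

    Returns 'open' if the connection succeeds, 'refused' if the
    destination actively rejects it, and 'no response' for timeouts
    and other network errors (often a firewall silently dropping traffic).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except OSError:  # includes timeouts and unreachable hosts
        return "no response"


def check_ports(host, ports, timeout=3.0):
    """Check several TCP ports on one host and return {port: result}."""
    return {port: check_tcp_port(host, port, timeout) for port in ports}
```

As a usage example, check_ports("dc1.example.com", [135, 389, 3268, 3269]) tests the RPC endpoint mapper, LDAP, and Global Catalog ports listed earlier against the placeholder domain controller name used in this series.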

Coming up next
In Part 4, we will look at the issues that can cause Exchange problems when NICs are involved, and how to troubleshoot those problems. Here's a rundown of the six parts in this series. We'll update with live links as each part is published over the next several weeks:

1. Introduction and DNS
2. Active Directory
3. Firewalls (this post)
4. NICs
5. RPCs
6. Client side issues

Troubleshooting Exchange Networking: NICs (Part 4)


Written by on February 6, 2012

Exchange administrators often receive escalated help desk tickets from users complaining that Exchange is slow and demanding resolution. These tickets can be extremely challenging to work, since the subjective nature of slowness is often combined with a problem that is intermittent or impossible to reproduce. The Exchange admin can check the server(s) for high CPU utilization, low memory conditions, and disk and network queue lengths exceeding the norm, and, finding nothing, push the ticket back to the desktop support team as a client issue. While it often is a client issue, there are several places between Outlook and a user's mailbox that can cause intermittent slowness and that are fair to call networking bottlenecks.

In this six-part series of articles, we'll look at how Exchange interacts on the network with various other services, to help you identify network issues and troubleshoot them when they occur. In many cases, troubleshooting Exchange network bottlenecks will require a network trace, and may also require performance monitor counters. This series will talk about both of those in general terms; how to use NetMon, Wireshark, and PerfMon is out of scope. In Part 4 of this series, we're going to look at the humble physical layer (DoD, not OSI) and discuss troubleshooting NICs.

NICs
We're now down where the rubber meets the road, that is, where the packets meet the wire. Your Network Interface Cards can be the most important part of the entire network path between client process and server process, and they are also the most commonly overlooked aspect of the entire communications channel. I've seen many a case where Exchange network performance issues came down to problems with the NIC, but days had gone by troubleshooting the problem, or weeks just accepting the poor performance, before anyone thought to look at the NICs. If the NICs aren't happy, ain't nobody happy, so let's make sure those NICs smile.

The differences between the various physical connections are beyond the scope of this article, but the recommendations and troubleshooting suggestions here should apply equally to all types of NIC, whether copper or fibre based, and whether physical or virtual. Let's start with some best practices for connecting up all your servers and clients:

Use quality NICs


There are times to save money, and there are times to spend the extra for the best, and as far as Exchange servers are concerned, you cannot go wrong spending a little extra on higher quality NICs. Single port or multi-port, the specific name brand is not as important, but don't buy cheap one-off NICs or limit yourself to what is built in to your server.

Use good cables

I take pride in my ability to roll my own cables (Ethernet, not fibre-optic), and I also know that name-brand cables can cost a fortune, but here again is where you don't want to take any chances. All of your drop cables should be commercially made, but at the same time, don't assume that because they are, they are faultless. Make it a habit to test all cables early in the troubleshooting process, if not at time of install.

Use quality, managed switches


Inexpensive unmanaged switches are good for home use, or to provide last-minute patches in a meeting room without wireless, but have no place in a datacenter. Make sure all your servers connect directly to managed switches that can provide you with details and statistics about the physical connection.

With that out of the way, we'll now move on to some more best practices, which should also be the second steps you take on the server when troubleshooting connectivity issues, right after reseating all the cables.

Hardware Drivers
Make absolutely certain you are running the latest hardware drivers. Check the vendor site, and read the documentation for any known issues that might correlate to your problem, but unless there is something contraindicated in that documentation, make sure you have the latest supported drivers. If you do though, consider downgrading one rev just in case you have encountered a new bug.

Firmware
Don't just stop at the software drivers for your NICs; make sure you have the latest firmware installed as well.

TCPIP.SYS
Check the Microsoft operating system drivers for your specific platform, and if you are not running the latest TCPIP driver, upgrade immediately. I have personally seen dozens of problems magically disappear just by catching up on patches. Of course, I do recommend staying current on all patches, but this is one that should have no exceptions.

Teaming
More connectivity problems have been solved by breaking the team than by any other single fix in history. If you are having network connectivity problems and are using network teaming, break the team and see if the problem goes away. Do this early on, as it is quick to check, and quick to put back if that is not the problem. Odds are that it is, and in that case, you need to troubleshoot network teaming, not Exchange networking. The solution will usually lie in updating drivers, fixing a problem with your configuration, or changing something on the switch.

Receive Side Scaling and ToE


If your multi-processor Exchange server is slamming one CPU (or core) and the rest are sitting idle, it's a good bet you don't have RSS enabled. RSS lets your server balance NIC interrupts across all the CPUs, which leads to better overall performance. It's on by default in Windows Server 2008 and 2008 R2, but might have been turned off by another admin. If you see high CPU on only one processor, check with this command.

>netsh interface tcp show global

If Receive-Side Scaling State shows as disabled, you've found the culprit. That same command will also show you the status of TCP Chimney Offload, or ToE. With compatible NICs, ToE can provide much better throughput on large file transfers (like database replication for DAGs, mailbox moves, etc.) and reduced CPU utilization. With it off, those operations will take much longer, have lower throughput, and cause higher CPU utilization. Windows Server 2008 disables ToE by default, while 2008 R2 uses an automatic setting. If your NICs support ToE, make sure you are using it by enabling it (if necessary) in the O/S, and then setting the advanced properties of the NIC to use it.
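If you need to run this check across several servers, the output can be parsed rather than read by eye. The following is a minimal Python sketch, not an official tool. The sample text is an assumption modeled on typical Windows Server 2008 R2 output of netsh interface tcp show global, and the function names are made up for illustration; the exact labels may differ by OS version or language.

```python
import re

# Assumed sample of `netsh interface tcp show global` output (2008 R2 style).
# Real output may use different labels depending on OS version and locale.
SAMPLE_OUTPUT = """\
TCP Global Parameters
----------------------------------------------
Receive-Side Scaling State          : disabled
Chimney Offload State               : automatic
"""


def parse_tcp_globals(text):
    """Return a dict mapping lower-cased setting names to lower-cased values."""
    settings = {}
    for line in text.splitlines():
        # Match "Name : value" lines; headers and separators have no colon.
        match = re.match(r"^\s*(.+?)\s*:\s*(\S.*?)\s*$", line)
        if match:
            settings[match.group(1).lower()] = match.group(2).lower()
    return settings


def offload_warnings(text):
    """List the RSS / Chimney Offload settings that are reported as disabled."""
    settings = parse_tcp_globals(text)
    warnings = []
    if settings.get("receive-side scaling state") == "disabled":
        warnings.append("RSS is disabled")
    if settings.get("chimney offload state") == "disabled":
        warnings.append("TCP Chimney Offload is disabled")
    return warnings
```

On a live server, the text passed to offload_warnings could come from subprocess.run(["netsh", "interface", "tcp", "show", "global"], capture_output=True, text=True).stdout instead of the sample string.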

Using Hardware Load Balancers


The biggest challenge to troubleshooting load balanced servers is that the problem usually will manifest itself as intermittent, or isolated to a single client or subnet. If load balancers are in the mix, test from your machine, but test against the VIP and against each physical server one by one. If you cannot reproduce the problem, try the same process from the client. This may be one time where you have to use a HOSTS file to trick the client into connecting to each server one by one. If you don't have admin access to the hardware load balancer, work with that admin while you run your tests so they can view realtime logs to see if anything stands out.
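The "VIP first, then each physical server" test can be scripted so that it is easy to repeat from your machine and then from the client. This is a rough Python sketch that only measures TCP connect time; it does not exercise Exchange itself. The function names are illustrative, and the default port of 443 is an assumption, so substitute whatever port you are balancing.

```python
import socket
import time


def tcp_connect_ms(host, port, timeout=3.0):
    """Return the TCP connect time in milliseconds, or None if it fails."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return None


def probe_targets(targets, port=443):
    """Probe each (label, host) pair in turn and return {label: ms or None}."""
    return {label: tcp_connect_ms(host, port) for label, host in targets}
```

A call might look like probe_targets([("VIP", "mail.example.com"), ("CAS01", "cas01.example.com"), ("CAS02", "cas02.example.com")]), where the host names are placeholders for your own VIP and physical servers. A server that returns None or a much higher time than its peers is the one to look at first.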

The Microsoft Network Load Balancing Service

If you are trying to load balance Exchange servers and are running into problems using software load balancing, my money is on the problem being in your switch configuration, and not with the MS NLB service. The easy test is to move the VIP to one of the servers, validate that everything works, and then move the VIP to the other and validate again. If it works without NLB in the mix, then it is not Exchange you should be looking at. MS NLB works great, though it is limited to IP based affinity and not port based, but there are so many ways the switch and/or router that your server connects to can screw up NLB that I'll frequently recommend against using it unless I can directly manage the switches myself, or I know the person who does and that he or she understands their side of making NLB work. See http://technet.microsoft.com/en-us/library/ff625247.aspx for some more tips on MS NLB, and if you are using VMware to virtualize your servers, see this article for specific settings in VMware: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1007371

Coming up next
In Part 5, we will look at the issues that can cause Exchange problems when making RPC calls, and how to troubleshoot those problems. Here's a rundown of the six parts in this series. We'll update with live links as each part is published over the next several weeks.

1. Introduction and DNS
2. Active Directory
3. Firewalls
4. NICs (this post)
5. RPCs
6. Client side issues

Troubleshooting Exchange Networking: RPCs (Part 5)


Written by on February 17, 2012

Often Exchange administrators will receive escalated help desk tickets from users complaining that Exchange is slow and demanding resolution. These sorts of tickets can be extremely challenging to work on, since the subjective nature of slowness is often combined with an inability to replicate the problem, or the problem is intermittent. The Exchange admin can take a look at the server(s) for high CPU utilization, low memory conditions, disk and network queue lengths exceeding the norm, and finding nothing, shrug it back off to the desktop support team as a client issue. While it is often a client issue, there are several places between Outlook and a user's mailbox that can cause intermittent slowness, and are fair to call networking bottlenecks. In a six part series of articles, we'll look at how Exchange interacts on the network with various other services to help you identify network issues, and troubleshoot them when they occur. In many cases, troubleshooting Exchange network bottlenecks will require a network trace, and may also require performance monitor counters. This series of articles will talk about both of those in general terms; how to use NetMon or Wireshark, and PerfMon are out of scope. In Part 5 of this series, we're going to discuss how Exchange is dependent upon and uses Remote Procedure Calls (RPCs) on the network.

Remote Procedure Calls (RPCs)


The Remote Procedure Call is one of the oldest ways for a process running on a client to request something from a service running on another machine. Generally speaking, services that are running on the server will bind to some dynamically assigned port, and register that port with the RPC Endpoint Mapper service. Usually, that port will stay the same until reboot (or service restart). For a client to make a socket connection to the service, it must first query the RPC Endpoint Mapper for the port that the desired service is using, and once that query is answered, it can establish a connection to the service. In Windows, the RPC Endpoint Mapper service listens on TCP port 135. RPC services in earlier versions of the operating system could bind to any port between 1024 and 65535. RPC services running on Windows Vista, Windows 7, Windows Server 2008 and later versions will typically bind to ports between 49152 and 65535. On any Microsoft operating system, certain RPC services may bind to specific ports above 1024, and often can be configured to listen on a specific port if necessary to facilitate firewall rules.
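When reading a network trace or a firewall log, it helps to sort destination ports quickly into "endpoint mapper", "dynamic RPC range", or "something else". The sketch below encodes only the ranges quoted above; the function name and the "legacy"/"modern" labels are made up for illustration, and services configured with static ports will not follow these ranges.

```python
ENDPOINT_MAPPER_PORT = 135

# Dynamic RPC port ranges as described in the text above.
DYNAMIC_RANGES = {
    "legacy": (1024, 65535),   # earlier versions of Windows
    "modern": (49152, 65535),  # Vista, Windows 7, Server 2008 and later
}


def classify_rpc_port(port, os_family="modern"):
    """Classify a TCP destination port seen in a trace or firewall log."""
    if port == ENDPOINT_MAPPER_PORT:
        return "endpoint mapper"
    low, high = DYNAMIC_RANGES[os_family]
    if low <= port <= high:
        return "dynamic RPC range"
    return "outside dynamic range"
```

For example, classify_rpc_port(6005) reports a port outside the default dynamic range for a Server 2008 host, which is a hint that the service in question uses its own port range or a statically assigned port.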

Understanding RPC Client Access


Exchange 2010 services almost all client requests with a new RPC service called the RPC Client Access Service. The RPC Client Access service runs on each Exchange server installed with the CAS role, and handles all MAPI client connections to mailboxes and to Active Directory that are related to Exchange, like GAL lookups. The only time a MAPI client will go directly to the mailbox server is for Public Folder connectivity.

The MAPI RPC connection point on a CAS server will process all MAPI connections. If the request is for a mailbox or address lookup (Address Book Service), the MAPI RPC service will process the request. If it is for a Public Folder, the service will respond with a referral to a mailbox server. The RPC Client Access Service will use ports between 6005 and 59530 to service clients. Of course, clients must first connect to the RPC Endpoint Mapper on TCP 135 to determine which RPC ports are listening. This is a very large range, and outside the norm for Server 2008 RPC services. Microsoft Exchange 2010 uses this design to improve performance and facilitate failover, especially when using CAS arrays and DAGs. Servers can go down or mailboxes can be dismounted without the client needing to make any change to its connection.

Restricting ports
Administrators can restrict the ports that are used for RPC access, but unless you have a very strong reason to do so, and are willing to take on significantly more administrative overhead, I urge you not to do this. In the microcosm this works well, but six months from now when the company must scale out for increased user counts, or needs to replace a failed server, any gaps in the server documentation can cause hours to days of troubleshooting. See this TechNet article for the specific registry keys involved. Take special note of the warnings to use the same port assignments for all CAS servers in the same site, and make absolutely certain that your documentation is updated to reflect this change. Any new CAS server that uses different ports (including the default) can lead to intermittent client connectivity issues, which will be extremely difficult to diagnose, and firewall rules will have to be maintained precisely when they separate clients from CAS servers.

Latency
The biggest problem with RPC-based communications is that they are very intolerant of latency. Generally speaking, clients making any kind of RPC connection to a server will find that performance degrades quickly when latency begins to exceed 50 milliseconds. When we look specifically at how the CAS server communicates with the Mailbox server, Microsoft recommends that latency between the CAS server and the Mailbox server does not exceed 10 milliseconds. This means that your CAS and Mailbox servers must be on the same LAN. Exchange requires that a CAS server be in the same AD site as a Mailbox server, but AD sites have been known to span WAN links. When AD replication and authentication are the only factors to consider, this usually doesn't present a problem, but when you add Exchange to the mix, never span a WAN with your site definitions. Excessive latency between client and CAS server will usually manifest itself as Outlook becoming unresponsive. This can happen when the client performs GAL lookups, or at peak traffic times. Remember that setting your Outlook client to use cached mode makes it less sensitive to latency. Latency can be measured with a protocol analyzer like Wireshark, but will also show up in logs.

RPC over HTTPS


RPC over HTTPS, or Outlook Anywhere (OA) as it is commonly called, encapsulates RPC communications within an HTTPS connection. This offers several advantages over direct RPC communication. First, it enables OA to function over the Internet, either directly connecting to a CAS server or being proxied through a TMG or other reverse proxy. By tunneling all client connections within an HTTPS connection, only a single port on a firewall must be opened, and reverse proxies can facilitate a secure connection between client and CAS server without requiring a VPN. OA leverages cached mode as well, greatly reducing the issues introduced by highly latent links such as when the client is accessing the CAS server over the Internet.

Troubleshooting

1. Look for Event ID 1722 on the client, or on the CAS server when it tries to connect to the Mailbox server. 1722 indicates that the RPC server was unavailable. Confirm firewall rules between hosts, and that the RPC services are running.
2. If a firewall with IPS is between client and server, check its logs to determine whether or not it is blocking RPC connections. This is very common with some firewall vendors that have an active or smart intrusion prevention mechanism. MAPI clients can make twenty or more simultaneous connections to the CAS server, which may exceed some default thresholds.
3. Check for excessive latency using your protocol analyzer. If latency begins to exceed 50 milliseconds, you may start to see issues. Once it is above 200, you really should only be using Outlook Anywhere connections with cached mode. Remember, between the CAS server and the Mailbox server, and between the CAS server and a global catalog server, latency should be below 10 milliseconds.
4. If you determine that response times from the global catalog server to the CAS server are taking too long, either add another global catalog server to the site, or upgrade the existing GC to 64-bit and add enough RAM to allow it to cache the entire Active Directory database. You will see huge performance boosts when a GC can cache all of AD and not have to rely on disk for lookups.
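The latency thresholds in the list above can be written down as a small helper for annotating measurements taken from a trace. This is only a sketch of the rules of thumb given in this article; the path labels and function name are hypothetical, and the 10 millisecond limit is treated here as "does not exceed".

```python
def assess_latency(path, latency_ms):
    """Map a measured round-trip latency to the article's rule of thumb.

    path is one of "client-cas", "cas-mailbox", or "cas-gc" (made-up labels).
    """
    if path in ("cas-mailbox", "cas-gc"):
        # Server-to-server RPC: latency should not exceed 10 ms.
        if latency_ms <= 10:
            return "ok"
        return "too high for server-to-server RPC"
    if path == "client-cas":
        if latency_ms <= 50:
            return "ok"
        if latency_ms <= 200:
            return "expect intermittent MAPI issues"
        return "use Outlook Anywhere with cached mode"
    raise ValueError("unknown path: %r" % (path,))
```

For instance, assess_latency("client-cas", 120) falls in the band where MAPI clients may start to misbehave, while the same 120 milliseconds between a CAS server and a global catalog would be well outside the recommendation.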

Coming up next
In Part 6, we will look at how clients interact with Exchange at the network level, where bottlenecks can occur, and how to troubleshoot those problems. Here's a rundown of the six parts in this series. We'll update with live links as each part is published over the next several weeks.

1. Introduction and DNS
2. Active Directory
3. Firewalls
4. NICs
5. RPCs (this post)
6. Client side issues

Troubleshooting Exchange Networking: Client Connectivity (Part 6)


Written by Casper Manes on February 21, 2012

Often Exchange administrators will receive escalated help desk tickets from users complaining that Exchange is slow and demanding resolution. These sorts of tickets ("slow" being at best a relative term, and never specific enough about what precisely is considered to be "slow") can be extremely challenging to work, since the subjective nature of slowness is often combined with an inability to replicate the problem, or the problem is intermittent. The Exchange admin can take a look at the server(s) for high CPU utilization, low memory conditions, disk and network queue lengths exceeding the norm, and finding nothing, shrug it back off to the desktop support team as a client issue. While it is often a client issue, there are several places between Outlook and a user's mailbox that can cause intermittent slowness, and are fair to call networking bottlenecks. In a six part series of articles, we'll look at how Exchange interacts on the network with various other services to help you identify network issues, and troubleshoot them when they occur. In many cases, troubleshooting Exchange network bottlenecks will require a network trace, and may also require performance monitor counters. This series of articles will talk about both of those in general terms; how to use NetMon or Wireshark, and PerfMon are out of scope. In Part 6 of this series, we're going to troubleshoot client connectivity issues to Exchange.

Remote Procedure Calls (RPCs)


Yes, we're starting this post out with some of the same content as the last one. The RPC is not only critical for Exchange to Exchange communications, but also for our Outlook client connections to Exchange. Whether you use MAPI connections to an ephemeral port, or the now more popular Outlook Anywhere (RPC over HTTPS), your Outlook client will make constant calls to Exchange. MAPI connections are going to be very sensitive to latency. As much as possible, keep your MAPI clients and their CAS servers local to one another. Short hops across the WAN are okay, but if you are seeing consistent latencies above 200 milliseconds, you should expect to see some client issues when using MAPI connections. These will manifest as Outlook not responding, or warnings in Outlook that connectivity to the Exchange server has been lost. These will usually go away faster than the end user can call the help desk, but will come back again, especially during times of high network utilization. If you are going to have clients on one side of the WAN, and servers on the other, you will find that cached mode and Outlook Anywhere connections provide a much improved user experience.

Connectivity between client and server


When you are using Outlook 2010 and Exchange 2010, all of your client connections (except public folders if you still use those) will be between your client and your CAS server. Ensuring that all the required ports are open between client and server will head off many potential connectivity issues. If you are going to use MAPI connections, see our earlier articles on RPC and firewalls (links below) to learn about considerations when using dynamic ports, and recommendations for the same. Outlook Anywhere, and certain legacy protocols like IMAP and POP3, are much easier to deal with when you are looking at firewalls, or even traffic shaping. Here's a simple table on what you need to permit:

Client protocol    Transport   Source ports                                   Destination port
POP3               TCP         Vista and later 49152-65535 (XP 1025-65535)   110
IMAP               TCP         Vista and later 49152-65535 (XP 1025-65535)   143
Outlook Anywhere   TCP         Vista and later 49152-65535 (XP 1025-65535)   443
IMAP over SSL      TCP         Vista and later 49152-65535 (XP 1025-65535)   993
POP3 over SSL      TCP         Vista and later 49152-65535 (XP 1025-65535)   995

Certificates
Any of the secure protocols will require that the Exchange CAS server has a certificate that can be used for the encryption. Clients must trust that certificate, or they will have to click through warnings repeatedly. If you have an internal Enterprise Public Key Infrastructure, your domain members will trust certificates issued by that infrastructure automatically, but you may need to configure standalone and non-Windows machines by hand. If you want to support non-domain joined machines, and the various smart phones and tablets that are growing in popularity, you really should secure communications using a certificate from a commercial CA. The money you spend on that certificate will be more than repaid by the time you save not having to configure all the executives' iPads and everyone's Droids to trust your internal certificates.

DNS
Your client software will have to make numerous DNS queries in order to connect to the appropriate Exchange server(s). When troubleshooting connectivity, make sure that your client is resolving the correct IP address for the CAS server it must connect to. NSLOOKUP will query your DNS server, but it won't tell you if you have a HOSTS file that has bad information in it. I like to use NSLOOKUP to verify that I remember the server IP address, and then I ping the server by name to make sure that the client tries to connect to the IP address that DNS says it should. A cached entry, or a HOSTS (or even LMHOSTS) file with bad information will show itself very quickly. Remember you can use the ipconfig /flushdns command to quickly clear the resolver cache, or you can edit HOSTS and LMHOSTS in C:\Windows\System32\drivers\etc as long as you have admin rights. Many an admin has put an entry in there as part of a quick fix, and then forgotten to come back to do things the right way.
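Checking a HOSTS file for a forgotten "quick fix" is easy to automate. Below is a minimal Python sketch that parses HOSTS-format text and reports an entry that disagrees with the address DNS returned. The function names, host names, and addresses are illustrative assumptions, and the sketch does not cover LMHOSTS, which uses additional keywords.

```python
def parse_hosts(text):
    """Map each lower-cased host name in HOSTS-format text to its address."""
    entries = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        parts = line.split()
        address, names = parts[0], parts[1:]
        for name in names:
            # Keep the first entry for a name, since the first match is used.
            entries.setdefault(name.lower(), address)
    return entries


def hosts_override(hosts_text, name, dns_address):
    """Return the HOSTS address if it differs from the DNS answer, else None."""
    pinned = parse_hosts(hosts_text).get(name.lower())
    if pinned is not None and pinned != dns_address:
        return pinned
    return None
```

You would feed it the contents of the HOSTS file and the address NSLOOKUP returned, for example hosts_override(text, "cas01.example.com", "192.168.0.10"). A non-empty result means the client is being steered somewhere other than where DNS points.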

Outlook logging
Sometimes, you cannot get a good handle on just what is wrong with the client until you can see some hard log data. By default, Outlook performs no really useful logging, so you have to turn that on. Before you do though, keep in mind that it can be resource intensive, so make sure you turn it off when you are done. Every time you launch Outlook with logging enabled, it will suggest that you turn it off. To enable logging, click File, Options, Advanced, and scroll down to the bottom of the page. Check the box to Enable troubleshooting logging. This will require you to restart Outlook. Once enabled, Outlook will log to %userprofile%\AppData\Local\Temp\Outlook Logging, which will contain logs for calendar, free/busy, reminders, OAB, and MAPI. Check the appropriate log for the issue you are troubleshooting.

Troubleshooting client issues


There are several steps you can take with Outlook to troubleshoot client side issues. Here are the symptoms and how to try to get around them.

Outlook cannot start

Start Outlook in safe mode to verify that no plugins are causing issues. This will also turn off the preview pane to ensure no corrupt message is causing Outlook to fail. Start, Run, outlook.exe /safe

Outlook says it cannot load your profile

Close Outlook, and create a new e-mail profile to test whether or not your profile is corrupt.

Problems viewing Free/Busy information, or Free/Busy seems out of date

Launch Outlook using the cleanfreebusy switch to force a reload of free/busy information. Start, Run, outlook.exe /cleanfreebusy

Problems with IMAP or POP3

This may require you to use the server side logs to troubleshoot. Open the Exchange Management Shell and run the appropriate command. You must restart the service after enabling logging. Files are stored in C:\Program Files\Microsoft\Exchange Server\V14\Logging.

Set-IMAPSettings -Server CAS01 -protocolLogEnabled $true
Set-POPSettings -Server CAS01 -protocolLogEnabled $true

Use $false to disable logging, and restart the service.

Other Outlook switches


Here's a list of several other switches you can use when starting Outlook. These can be very useful when trying to narrow down what is causing Outlook to fail:

Switch                       Description
/a                           Creates an item with the specified file as an attachment. Example: "C:\Program Files\Microsoft Office\Office11\Outlook.exe" /a "C:\My Documents\labels.doc" Note: If no item type is specified, IPM.Note is assumed. Cannot be used with message classes that aren't based on Outlook.
/altvba otmfilename          Opens the VBA program specified in otmfilename, rather than %appdata%\Microsoft\Outlook\VbaProject.OTM.
/c messageclass              Creates a new item of the specified message class (Outlook forms or any other valid MAPI form). Some related examples:
                               /c ipm.activity creates a Journal entry
                               /c ipm.appointment creates an appointment
                               /c ipm.contact creates a contact
                               /c ipm.note creates an e-mail message
                               /c ipm.stickynote creates a note
                               /c ipm.task creates a task
/checkclient                 Prompts for the default manager of e-mail, news, and contacts.
/cleanclientrules            Starts Outlook and deletes client-based rules.
/cleandmrecords              Deletes the logging records saved when a manager or a delegate declines a meeting.
/cleanfinders                Removes Search Folders from the Microsoft Exchange server store.
/cleanfreebusy               Clears and regenerates free/busy information. This switch can only be used when you are able to connect to your Microsoft Exchange server.
/cleanprofile                Removes invalid profile keys and recreates default registry keys where applicable.
/cleanpst                    Launches Outlook with a clean Personal Folders file (.pst).
/cleanreminders              Clears and regenerates reminders.
/cleanrules                  Starts Outlook and deletes client- and server-based rules.
/cleanschedplus              Deletes all Schedule+ data (free/busy, permissions, and .cal file) from the server and enables the free/busy information from the Outlook Calendar to be used and viewed by all Schedule+ 1.0 users.
/cleanserverrules            Starts Outlook and deletes server-based rules.
/cleansniff                  Deletes duplicate reminder messages.
/cleansubscriptions          Deletes the subscription messages and properties for subscription features.
/cleanviews                  Restores default views. All custom views you created are lost.
/designer                    Starts Outlook without figuring out if Outlook should be the default client in the first run.
/embedding                   Opens the specified message file (.msg) as an OLE embedding. Also used without command-line parameters for standard OLE co-create.
/f msgfilename               Opens the specified message file (.msg) or Microsoft Office saved search (.oss).
/firstrun                    Starts Outlook as if it were run for the first time.
/hol holfilename             Opens the specified .hol file.
/ical icsfilename            Opens the specified .ics file.
/importprf prffilename       Launches Outlook and opens/imports the defined MAPI profile (*.prf). If Outlook is already open, queues the profile to be imported on the next clean launch.
/l olkfilename               Opens the specified .olk file.
/launchtraininghelp assetid  Opens a Help window with the Help topic specified in assetid.
/m emailname                 Provides a way for the user to add an e-mail name to the item. Only works in conjunction with the /c command-line parameter. Example: Outlook.exe /c ipm.note /m emailname
/nocustomize                 Starts Outlook without loading outcmd.dat (customized toolbars) and *.fav file.
/noextensions                Starts Outlook with extensions turned off, but listed in the Add-In Manager.
/nopollmail                  Starts Outlook without checking mail at startup.
/nopreview                   Starts Outlook with the Reading Pane off.
/p msgfilename               Prints the specified message (.msg). Does not work with HTML.
/profile profilename         Loads the specified profile. If your profile name contains a space, enclose the profile name in quotation marks (" ").
/profiles                    Opens the Choose Profile dialog box regardless of the Options setting on the Tools menu.
/recycle                     Starts Outlook using an existing Outlook window, if one exists. Used in combination with /explorer or /folder.
/resetfoldernames            Resets default folder names (such as Inbox or Sent Items) to default names in the current Office user interface language. Note: If you first connect to your mailbox in Outlook using a Russian user interface, the Russian default folder names cannot be renamed. To change the default folder names to another language such as Japanese or English, you can use this switch to reset the default folder names after changing the user interface language or installing a different language version of Outlook.
/resetfolders                Restores missing folders for the default delivery location.
/resetnavpane                Clears and regenerates the Navigation Pane for the current profile.
/rpcdiag                     Opens Outlook and displays the remote procedure call (RPC) connection status dialog.
/s filename                  Loads the specified shortcuts file (.fav).
/safe                        Starts Outlook without extensions, Reading Pane, or toolbar customization.
/safe:1                      Starts Outlook with the Reading Pane off.
/safe:2                      Starts Outlook without checking mail at startup.
/safe:3                      Starts Outlook with extensions turned off, but listed in the Add-In Manager.
/safe:4                      Starts Outlook without loading Outcmd.dat (customized toolbars) and *.fav file.
/select foldername           Starts Outlook and opens the specified folder in a new window. For example, to open Outlook and display the default calendar use: "C:\Program Files\Microsoft Office\Office11\Outlook.exe" /select outlook:calendar
/sniff                       Starts Outlook and forces a detection of new meeting requests in the Inbox, and then adds them to the calendar.
/t oftfilename               Opens the specified .oft file.
/v vcffilename               Opens the specified .vcf file.
/vcal vcsfilename            Opens the specified .vcs file.
/x xnkfilename               Opens the specified .xnk file.

Wrapping it all up
We hope you have enjoyed this six part series on troubleshooting Exchange networking issues. We've gone over the critical role DNS plays in Exchange networking, how Exchange interacts with Active Directory on the network, playing nicely with firewalls, getting the most out of your NICs, how RPCs come into play, and finally troubleshooting client issues. Below are links to the full series in case you missed one or are interested in seeing them all.

1. Introduction and DNS
2. Active Directory
3. Firewalls
4. NICs
5. RPCs
6. Client side issues (this post)
