
Status Code: 54

Timed out connecting to client

A Status Code 54 occurs when the server cannot complete the connection to the client: the accept system call or Winsock call timed out after 60 seconds. This problem can occur when a master or media server tries to connect to bpcd on the client machine and the client fails to respond before the software times out after 60 seconds (the default timer setting).

The processes involved in this function are bpcd, bprd, bpbrm, and vnetd (if vnetd is configured for firewall operation).

Table of Contents

1 General Status 54 troubleshooting
1.1 Verify NetBackup Server processes are running:
1.2 Verify the NetBackup Client Daemon (bpcd) is listening.
1.3 Use the telnet command to test the NetBackup daemons
1.4 Check logging on the affected Media Server and Client(s):
1.5 Is a firewall present in the configuration?
1.6 Are there any networking issues?
1.6.1 Name resolution issues:
1.6.2 Network performance issues
1.7 Is there a machine resource issue?
2 Troubleshooting for NetBackup Database Agents
3 Links

1 General Status 54 troubleshooting

The goal here is to isolate the issue to a machine or pair of machines. Because the issue involves socket-to-socket communication, it could be occurring between the master or media server and the client. The key to success is to identify which socket is not being established. Once that is done, further troubleshooting can narrow the failure to a specific function, network configuration, performance issue, machine resource, and so on.

1.1 Verify NetBackup Server processes are running:

On the Windows NT/2000 or UNIX master server, verify the NetBackup Request Manager (bprd), NetBackup Job Daemon (bpjobd), and NetBackup Database Manager (bpdbm) services are running. These daemons must be running on the master server.

Open a command prompt in Windows, or open a shell in UNIX as root, and run the following command:

For Windows run:

% <install_path>\VERITAS\NetBackup\bin\bpps
* 1TIME                                  4/25/05 14:02:30.203
COMMAND    PID     LOAD      TIME     MEM    START
bpdbm      1912    0.000%    0.031    4.4M   4/25/05 13:07:45.890
bprd       2080    0.000%    0.125    4.7M   4/25/05 13:07:47.531
bpjobd     3248    0.000%    0.171    3.4M   4/25/05 13:07:48.703
<Media manager processes would be displayed here>

For UNIX run:

# /usr/openv/netbackup/bin/bpps -a
NB Processes
------------
root 18633     1  0 Apr 22 ?  0:01 /usr/openv/netbackup/bin/bpdbm
root 18620     1  0 Apr 22 ?  0:02 /usr/openv/netbackup/bin/bprd
root 18641 18633  0 Apr 22 ?  0:05 /usr/openv/netbackup/bin/bpjobd
<Media manager processes would be displayed here>
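When bpps output has been saved to a file, the check for the three required master server daemons can be automated. The following is a minimal Python sketch, not a VERITAS tool: the function name missing_daemons is hypothetical, and it assumes only that each running daemon appears somewhere in the listing, either as a bare name (Windows) or as the last component of a path (UNIX).

```python
# Daemons the master server needs, as listed in this section.
REQUIRED_DAEMONS = ("bprd", "bpjobd", "bpdbm")


def missing_daemons(bpps_output, required=REQUIRED_DAEMONS):
    """Return the required daemon names that do not appear in bpps output.

    Hypothetical helper: it only looks for the daemon names in the text and
    makes no assumption about the column layout of the listing.
    """
    found = set()
    for line in bpps_output.splitlines():
        for token in line.split():
            # UNIX lists full paths (/usr/openv/netbackup/bin/bprd);
            # Windows lists the bare name (bprd) in the first column.
            name = token.replace("\\", "/").rsplit("/", 1)[-1]
            if name.lower().endswith(".exe"):
                name = name[:-4]
            found.add(name)
    return [daemon for daemon in required if daemon not in found]
```

An empty list means all three daemons were found; any names returned are the services to start using the steps that follow.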

If these services are not running on the Windows master, start them. On the Windows Desktop:

1. Right-click on My Computer on the desktop, or within the Start Menu and choose "Manage".

2. Expand Services and Applications and highlight Services.

3. Locate the NetBackup services (NetBackup Request Manager, NetBackup Database Manager, NetBackup Client Service, NetBackup Volume Manager, NetBackup Device Manager) and verify they are started.

4. If services are not started then right-click on each service and choose Start.

If these services are not running on the UNIX master, start them.

# /usr/openv/netbackup/bin/goodies/netbackup start

1.2 Verify the NetBackup Client Daemon (bpcd) is listening.

Client daemons such as the NetBackup Client Service (bpcd) are started from bpinetd.exe on Windows or from inetd/xinetd on UNIX/Linux, so they do not appear in the bpps output. Instead, use the netstat command to verify that these daemons are in LISTEN status. On the master and the affected client(s), run the command below to verify that bpcd is listening.

Windows: netstat -a > c:\netstat.txt

UNIX: netstat -a > /tmp/netstat.txt

The netstat.txt file that gets created should list the listening processes that are running (bpcd, vnetd, vopied, bpjava-msvc). Search this file to determine if bpcd is in LISTEN status. The vnetd process should also be in LISTEN status if vnetd is being used for firewalls.

Windows:
TCP    hostname:bpcd    hostname.domain.com:0    LISTENING

UNIX:
*.bpcd    *.*    0    0    49152    LISTEN
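Searching netstat.txt for the bpcd listener can also be scripted. The sketch below is illustrative only: the function name is_listening is hypothetical, and it assumes that the state is the last field on the line and that the local address is the first field containing a ':' or '.' separator, which matches the Windows and UNIX sample lines above but may not hold for every netstat variant.

```python
import re


def is_listening(netstat_output, service="bpcd", port=13782):
    """Return True if a netstat line shows the service or port in LISTEN state.

    Hypothetical helper. The defaults are the bpcd name and port used in
    this document; pass service="vnetd", port=13724 to check vnetd.
    """
    wanted = (service, str(port))
    for line in netstat_output.splitlines():
        fields = line.split()
        # Windows prints LISTENING, UNIX prints LISTEN.
        if not fields or not fields[-1].upper().startswith("LISTEN"):
            continue
        # Local address: first field with a host/port separator,
        # e.g. "hostname:bpcd", "*.bpcd" or "0.0.0.0:13782".
        local = next((f for f in fields if ":" in f or "." in f), "")
        suffix = re.split(r"[:.]", local)[-1]
        if suffix in wanted:
            return True
    return False
```

For example, is_listening(open("/tmp/netstat.txt").read()) checks the UNIX file created by the command above.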

1.3 Use the telnet command to test the NetBackup daemons

Once the problem systems have been identified, another test is to telnet to the NetBackup well-known ports from machine to machine. For example, from the master server, a telnet session can be run to the media server or client, and vice versa:

From Master command line:

# telnet <machine name or machine IP address> bpcd

This will connect to the target machine and display a message similar to the ones below:

For UNIX:

If telnet is successful you will get a message similar to:

# telnet nbclient bpcd

Trying x.x.x.x
Connected to nbclient.domain.com.
Escape character is '^]'.

< If successful no additional messages will be returned >

Press enter to end telnet session.

If telnet is unsuccessful you will get a message similar to:

# telnet nbclient bpcd

Trying x.x.x.x
telnet: Unable to connect to remote host: Connection refused

The telnet session will end automatically and return to the prompt.

For Windows:

If telnet is successful you will get a message similar to:

% telnet nbclient bpcd

< If successful no displayed messages will be returned >

Press enter to end telnet session.

If telnet is unsuccessful you will get a message similar to:

% telnet nbclient bpcd
Connecting to .Could not open a connection to host on port 13782 : Connect failed

The telnet session will end automatically and return to the prompt.

This is also a very good test for firewall issues to see if a path is open through the firewall. This test can be repeated for connection testing to bprd, bpdbm, and vnetd.
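Where telnet is not installed, the same reachability test can be approximated with a short script. This is a minimal Python sketch under stated assumptions: the function name can_connect and the dictionary NETBACKUP_PORTS are hypothetical, the two port numbers are the ones given in this document, and a successful TCP connect shows only that the path is open, not that the daemon will accept the server.

```python
import socket

# Port numbers as given in this document; add others as needed.
NETBACKUP_PORTS = {"bpcd": 13782, "vnetd": 13724}


def can_connect(host, port, timeout=10.0):
    """Try a TCP connect, as telnet would.

    Returns (True, "") on success or (False, reason) on failure.
    Name lookup errors, refusals and timeouts are all OSError subclasses.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, ""
    except OSError as exc:
        return False, str(exc)
```

For example, can_connect("nbclient", NETBACKUP_PORTS["bpcd"]) run from the master or media server mirrors the telnet test above. A refusal usually means nothing is listening on the port, while a timeout is more typical of a firewall silently dropping the packets.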

1.4 Check logging on the affected Media Server and Client(s):

Examine the All Log Entries report for the time of the failure to determine where the failure occurred. Also view the logging information detailed in the previous flow chart for error and failure information. This log information is the best way to isolate where the problem is occurring and what machines are involved in the issue, and will enable you to narrow your focus and concentrate your troubleshooting efforts.

The Media server bpbrm log and the client bpcd log will contain identical logconnections lines:

<2> logconnections: BPCD ACCEPT FROM x.x.x.x.<port> TO y.y.y.y.13782

The x.x.x.x will be the source IP address for the connection. Verify this is using the expected network interface. The client will need to have forward and reverse name lookup information for this IP address.

Example from a UNIX Media server /usr/openv/netbackup/logs/bpbrm/log.<date> file:

<2> bpcr_connect: bpcr_connect timeout during select after 60 seconds on port <port>
<16> bpbrm start_bpcd: timed out trying to connect to <hostname>

This indicates the client did not reply to the server before the 60 second socket timeout. In this case check the client’s bpcd log for additional troubleshooting information.

Example from a UNIX client /usr/openv/netbackup/logs/bpcd/log.<date> file:

<8> bpcd peer_hostname: gethostbyaddr failed: HOST_NOT_FOUND (1)
<16> bpcd peer_hostname: gethostbyaddr failed to return peer host, herrno = 1
<16> bpcd main: Couldn't get peer hostname

Example from a UNIX client /usr/openv/netbackup/logs/bpcd/log.<date> file:

<2> hosts_equal: gethostbyname failed for <hostname>: No such host is known. (0)

This indicates a failure of the forward or reverse name lookup of the master or media server. NetBackup performs a reverse name lookup of the IP address to obtain the name it authenticates against the SERVER entry in the Windows Registry or in /usr/openv/netbackup/bp.conf on UNIX.

After reviewing the log files, a better idea of what machines are involved in the failure should be evident.

For name lookup errors add an entry to the /etc/hosts on UNIX or the C:\WINDOWS\system32\drivers\etc\hosts on Windows and try the operation again.

x.x.x.x master master.domain.com
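The log signatures quoted in this section can be searched for mechanically when the bpbrm and bpcd logs are large. The sketch below is one possible approach, not part of NetBackup: the names SIGNATURES and scan_log are hypothetical, the patterns are taken only from the excerpts above, and the short explanations paraphrase this section.

```python
import re

# (pattern, meaning) pairs built from the log excerpts in this section.
SIGNATURES = [
    (re.compile(r"bpcr_connect timeout during select"),
     "server timed out waiting for the client; check the client bpcd log"),
    (re.compile(r"timed out trying to connect to"),
     "bpbrm could not connect to the client within the timeout"),
    (re.compile(r"gethostbyaddr failed"),
     "reverse name lookup of the connecting server failed"),
    (re.compile(r"Couldn't get peer hostname"),
     "client could not determine the hostname of the connecting server"),
    (re.compile(r"gethostbyname failed for"),
     "forward name lookup failed"),
]


def scan_log(lines):
    """Return (line_number, meaning) for each line matching a known signature."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for pattern, meaning in SIGNATURES:
            if pattern.search(line):
                hits.append((number, meaning))
                break  # report each line once
    return hits
```

A call such as scan_log(open(path)) works on any of the log.<date> files named above, since iterating a file yields its lines.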

1.5 Is a firewall present in the configuration?

If so, are all of the required ports open? Check the NetBackup System Administrator Guide (for UNIX or Windows) for firewall and port information. At a minimum, ports 13782 (bpcd) and 13724 (vnetd) must be open in the firewall for a client backup. The corresponding configuration for the client must also be made on the master before this will work. Additional ports are required for restores or if the client is also a media server.

Example from a UNIX client /usr/openv/netbackup/logs/bpcd/log.<date> file:

<2> bpcd peer_hostname: Connection from host <hostname> (x.x.x.x) port <reserved port>
<2> bpcd main: Peer hostname is <hostname>
<2> nb_bind_on_port_addr: bound to port <reserved port>
<2> bpcd main: Got socket for output 5, lport = <reserved port>

This would indicate the client is using the default of reserved ports for the callback. The nb_bind_on_port_addr: call will display the reserved port number being used for the callback. A firewall will most likely be blocking reserved ports which will cause the backup to abort on the media server with a status 54.

Example from a UNIX client /usr/openv/netbackup/logs/bpcd/log.<date> file:

<4> bpcd valid_server: hostname comparison succeeded
<2> bpcd main: output socket port number = 13782

Note: For NetBackup 5.x there will be a dozen “<2> vnet vnetd_<function>” log entries between these lines.

<2> get_vnetd_socket: connected to vnetd socket 5

This would indicate the client is using vnetd port for callbacks. The nb_bind_on_port_addr: call will not appear in the logs when vnetd is used for callbacks.
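The distinction between the two callback methods can be read directly from the client bpcd log. The following Python sketch illustrates the rule stated above; the function name callback_mode and its return values are hypothetical labels, and it relies only on the two log strings shown in this section.

```python
def callback_mode(bpcd_log_lines):
    """Classify the callback method seen in a client bpcd log.

    Returns "vnetd" if a get_vnetd_socket line is present,
    "reserved-port" if an nb_bind_on_port_addr line is present,
    and "unknown" if neither appears.
    """
    for line in bpcd_log_lines:
        if "get_vnetd_socket: connected to vnetd socket" in line:
            return "vnetd"
        if "nb_bind_on_port_addr: bound to port" in line:
            return "reserved-port"
    return "unknown"
```

A result of "reserved-port" on a client behind a firewall points to the blocked-callback case described above.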

1.6 Are there any networking issues?

1.6.1 Name resolution issues:

Use the bpclntcmd command to test name lookups in both directions. Run it against both the hostname and the IP address of each machine involved so that both forward and reverse name lookups are tested. Review Technote http://support.veritas.com/docs/261393 for details on using the bpclntcmd command.
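As a supplement to bpclntcmd, not a replacement for it, the operating system resolver can be exercised in both directions with a few lines of Python. In this sketch the function name check_name_resolution and the shape of the returned dictionary are hypothetical; the resolver functions are passed as parameters only so that the logic can be exercised without a live DNS, and the short-name comparison is meaningful for hostnames, not for IP addresses.

```python
import socket


def check_name_resolution(name, forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Forward-resolve a hostname, then reverse-resolve the resulting address.

    "consistent" is True when the reverse lookup returns the same short
    hostname that was looked up.
    """
    result = {"name": name, "address": None, "reverse_name": None,
              "consistent": False, "error": None}
    try:
        address = forward(name)
        result["address"] = address
        # gethostbyaddr returns (hostname, aliases, addresses).
        reverse_name = reverse(address)[0]
        result["reverse_name"] = reverse_name
    except OSError as exc:  # covers socket.gaierror and socket.herror
        result["error"] = str(exc)
        return result
    result["consistent"] = (reverse_name.split(".")[0].lower()
                            == name.split(".")[0].lower())
    return result
```

Running check_name_resolution("master") on the client, and the equivalent for the client name on each server, shows whether the operating system resolver agrees in both directions.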

1.6.2 Network performance issues

Duplex issues. Commands to run: "netstat -ian" to check for Ierrs or Oerrs.

Routing issues. Commands to run: "netstat -rn", "traceroute" or "ifconfig -a" to check for routing or subnet mask errors.

Network bottlenecks. Commands to run: "ftp" or "ttcp" to test underlying network performance.

1.7 Is there a machine resource issue?

Verify VERITAS suggested minimum kernel parameters are in place for UNIX machines. Review the following Technote: http://seer.support.veritas.com/docs/238063.htm

2 Troubleshooting for NetBackup Database Agents

Script-based NetBackup database clients such as DB2, Informix, Oracle, SAP, Sybase, SQL-Server, and Teradata require additional troubleshooting to resolve status 54 errors. These clients use comm files in the /usr/openv/netbackup/logs/user_ops directory tree that must be updated by the master and the media server and then read by the client prior to establishing the Name and Data sockets.

First, three connections to the client occur from the master and then the media server. These connections use bpcd on the client, including the server connect-back, to update the comm file with job progress information and eventually the hostname and additional port numbers that the client should use to establish the Name and Data sockets. Troubleshooting a status 54 during this portion of the backup or restore is identical to the steps for a standard backup described in this document.

A second cause for a status 54 on a database client backup or restore occurs when the client fails to receive an expected update from either the master or the media server before the CLIENT_READ_TIMEOUT or other timeout expires on the client. Upon timeout, the database client will exit in error. Eventually the job will become active and bpbrm will bind to ports for the Name and Data sockets, write the port numbers into the comm file, and wait for the database client to connect back. If the connect-back does not occur within 60 seconds of the comm file update, bpbrm will fail the job with a status 54. The bpbrm log on the media server will show the additional ports for the sockets along with the media server hostname that the client should use to complete the connect-back.

<2> bpbrm listen_for_client: HOT_ORACLE_DB_BACKUP
<2> bpbrm listen_for_client: bpbrm.c.19241: listen(2)ing on port: 3826 3826 0x00000ef2
<2> bpbrm listen_for_client: bpbrm.c.19243: listen(2)ing on port: 4941 4941 0x0000134d
<2> bpcr_get_peername_rqst: Server peername length = 8
<2> bpbrm write_msg_to_progress_file: INF - Data socket = sv2n2adm.3826
<2> bpbrm write_msg_to_progress_file: INF - Name socket = sv2n2adm.4941

Please note that the hostname provided in the comm file may differ from the expected hostname for the media server. If the client cannot resolve the provided hostname and complete the socket through the network, then bpbrm will time out after 60 seconds and fail the job with a status 54. Such a mismatch is a third potential cause for a status 54 on a database client backup or restore.

Hence, it is vitally important that the database client log be checked to determine if the database client has already exited, is denied a socket by the network, is unable to bind to a local port, or is otherwise unable to read the comm file. To determine the exact cause, enable logging for the database client per the Troubleshooting instructions in the VERITAS NetBackup ™ for <Database agent> System Administrators Guide.
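To see which hostname and ports the database client was told to use, the Data socket and Name socket lines can be extracted from the bpbrm log. This is a minimal Python sketch: the names SOCKET_LINE and comm_file_sockets are hypothetical, and the pattern assumes the "INF - Data socket = host.port" layout shown in the excerpt above.

```python
import re

# Matches e.g. "INF - Data socket = sv2n2adm.3826"; the port is the
# digits after the final dot, so fully qualified hostnames also work.
SOCKET_LINE = re.compile(r"INF - (Data|Name) socket = (\S+)\.(\d+)\s*$")


def comm_file_sockets(bpbrm_log_lines):
    """Return {"Data": (host, port), "Name": (host, port)} as found in the log."""
    sockets = {}
    for line in bpbrm_log_lines:
        match = SOCKET_LINE.search(line)
        if match:
            kind, host, port = match.groups()
            sockets[kind] = (host, int(port))
    return sockets
```

The hostname returned is the one the client must be able to resolve and reach, so it is the name to check from the client side when the connect-back fails.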

3 Links

Other documents on Status 54 can be found by searching on the following relevant items:

Status Code 54

Timed out connecting to client

Can't open shared library

open failed: No such file or directory