Important Notice
(c) 2010-2014 Cloudera, Inc. All rights reserved.
Cloudera, the Cloudera logo, Cloudera Impala, and any other product or service
names or slogans contained in this document are trademarks of Cloudera and its
suppliers or licensors, and may not be copied, imitated or used, in whole or in part,
without the prior written permission of Cloudera or the applicable trademark holder.
Hadoop and the Hadoop elephant logo are trademarks of the Apache Software
Foundation. All other trademarks, registered trademarks, product names and
company names or logos mentioned in this document are the property of their
respective owners. Reference to any products, services, processes or other
information, by trade name, trademark, manufacturer, supplier or otherwise does
not constitute or imply endorsement, sponsorship or recommendation thereof by
us.
Complying with all applicable copyright laws is the responsibility of the user. Without
limiting the rights under copyright, no part of this document may be reproduced,
stored in or introduced into a retrieval system, or transmitted in any form or by any
means (electronic, mechanical, photocopying, recording, or otherwise), or for any
purpose, without the express written permission of Cloudera.
Cloudera may have patents, patent applications, trademarks, copyrights, or other
intellectual property rights covering subject matter in this document. Except as
expressly provided in any written license agreement from Cloudera, the furnishing
of this document does not give you any license to these patents, trademarks,
copyrights, or other intellectual property.
The information in this document is subject to change without notice. Cloudera
shall not be liable for any damages resulting from technical errors or omissions
which may be present in this document, or from use of this document.
Cloudera, Inc.
1001 Page Mill Road Bldg 2
Palo Alto, CA 94304
info@cloudera.com
US: 1-888-789-1488
Intl: 1-650-362-0488
www.cloudera.com
Release Information
Version: 5.1.x
Date: August 28, 2014
Table of Contents
About this Guide (p. 7)
Managing the Cloudera Manager Server and Agents (p. 9)
  Starting, Stopping, and Restarting the Cloudera Manager Server (p. 9)
  Configuring Cloudera Manager Server Ports (p. 9)
  Moving the Cloudera Manager Server to a New Host (p. 9)
  Starting, Stopping, and Restarting Cloudera Manager Agents (p. 10)
  Configuring Cloudera Manager Agents (p. 11)
  Viewing Cloudera Manager Server and Agent Logs (p. 14)
  Changing Hostnames (p. 14)
  Backing up Databases (p. 16)
    Backing Up PostgreSQL Databases (p. 17)
    Backing Up MySQL Databases (p. 17)
    Backing Up Oracle Databases (p. 17)
    Next Steps (p. 46)
  Kerberos (p. 73)
  Sending Usage and Diagnostic Data to Cloudera (p. 74)
    Configuring a Proxy Server (p. 74)
    Managing Anonymous Usage Data Collection (p. 74)
    Managing Hue Analytics Data Collection (p. 74)
    Diagnostic Data Collection (p. 74)
You can stop (for example, to perform maintenance on its host) or restart the Cloudera Manager Server without
affecting the other services running on your cluster. Statistics data used by activity monitoring and service
monitoring will continue to be collected during the time the server is down.
To stop the Cloudera Manager Server:
$ sudo service cloudera-scm-server stop
Description
Clean Start
$ sudo service cloudera-scm-agent clean_start
The directory /var/run/cloudera-scm-agent is completely cleaned out; all files and subdirectories are
removed, and then the start command is executed. /var/run/cloudera-scm-agent contains on-disk
running Agent state. Some Agent state is left behind in /var/lib/cloudera-scm-agent, but you shouldn't
delete that. For further information, see Server and Client Configuration and Process Management.
Stopping and Restarting Agents
To stop or restart Agents while leaving the managed processes running, use one of the following commands:
10 | Cloudera Manager Administration Guide
Restart
$ sudo service cloudera-scm-agent restart
Hard Restart (unlike a plain restart, this also kills the managed service processes running on the host)
$ sudo service cloudera-scm-agent hard_restart
Set health status to Concerning if the Agent heartbeats fail - The number of missed consecutive heartbeats
after which a Concerning health status is assigned to that Agent. Default: 5.
Set health status to Bad if the Agent heartbeats fail
server_host, server_port, listening_port, listening_hostname, listening_ip - Hostname and ports of the
Cloudera Manager Server and Agent and IP address of the Agent. Also see Configuring Cloudera Manager
Server Ports on page 9 and Ports Used by Cloudera Manager.
The Cloudera Manager Agent configures its hostname automatically. However, if your cluster hosts are
multi-homed (that is, they have more than one hostname), and you want to specify which hostname the
Cloudera Manager Agent uses, you can update the listening_hostname property. If you want to specify
which IP address the Cloudera Manager Agent uses, you can update the listening_ip property in the same
file.
To have a CNAME used throughout instead of the regular hostname, an Agent can be configured to use
listening_hostname=CNAME. In this case, the CNAME should resolve to the same IP address as the IP
address of the hostname on that machine. Users doing this will find that the host inspector will report
problems, but the CNAME will be used in all configurations where that's appropriate. This practice is
particularly useful for users who would like clients to use namenode.mycluster.company.com instead of
machine1234.mycluster.company.com. In this case, namenode.mycluster would be a CNAME for
machine1234.mycluster,
The path to the Agent log file. If the Agent is being started via the init.d
script, /var/log/cloudera-scm-agent/cloudera-scm-agent.out will
also have a small amount of output (from before logging is initialized).
Default: /var/log/cloudera-scm-agent/cloudera-scm-agent.log.
lib_dir
parcel_dir
max_collection_wait_seconds
Maximum time to wait for all metric collectors to finish collecting data.
Default: 10 sec.
metrics_url_timeout_seconds
mgmt_home
Default: /usr/share/cmf.
cloudera_mysql_connector_jar, cloudera_oracle_connector_jar, cloudera_postgresql_jdbc_jar - Location of
JDBC drivers. See Cloudera Manager and Managed Service Databases.
Default:
MySQL - /usr/share/java/mysql-connector-java.jar
Oracle - /usr/share/java/oracle-connector-java.jar
PostgreSQL - /usr/share/cmf/lib/postgresql-version-build.jdbc4.jar
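The listening_hostname setting described in the table above lives in the Agent's config.ini file. The following helper is a minimal sketch of changing one such property from a script. The function name set_agent_property is hypothetical, and the sketch assumes the property already appears in the file on its own line, either active or commented out with #. It edits whichever file you pass, so try it on a copy first.

```shell
# set_agent_property FILE KEY VALUE   (hypothetical helper, not a Cloudera tool)
# Replaces the line that sets KEY, active or commented out with '#', with KEY=VALUE.
# Fails without changing FILE if no such line exists, so that the setting is
# never appended under the wrong [section]. VALUE must not contain '|', '&', or '\'.
set_agent_property() (
    file=$1 key=$2 value=$3
    pattern="^[#[:space:]]*${key}[[:space:]]*="
    if ! grep -Eq "$pattern" "$file"; then
        echo "set_agent_property: no line for '$key' in $file" >&2
        exit 1
    fi
    tmp=$(mktemp) || exit 1
    sed -E "s|${pattern}.*|${key}=${value}|" "$file" > "$tmp" &&
        cat "$tmp" > "$file"    # overwrite in place, keeping owner and mode
    status=$?
    rm -f "$tmp"
    exit "$status"
)
```

For example, set_agent_property /etc/cloudera-scm-agent/config.ini listening_hostname namenode.mycluster.company.com sets the CNAME from the example above. Restart the Agent afterwards so that it picks up the change.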
Changing Hostnames
Required Role:
Important: The process described here requires Cloudera Manager and cluster downtime.
After you have installed Cloudera Manager and created a cluster, you may need to update the names of the
hosts running the Cloudera Manager Server or cluster services. To update a deployment with new hostnames,
follow these steps:
1. Verify whether TLS/SSL certificates have been issued for any of the services, and make sure to create new
TLS/SSL certificates in advance for services protected by TLS/SSL. Review Cloudera Manager and CDH
documentation at Cloudera Documentation.
Tip: Search for SSL and TLS in the documentation.
2. Export the Cloudera Manager configuration using one of the following methods:
Open a browser and go to this URL: http://cm_hostname:7180/api/api_version/cm/deployment.
Save the displayed configuration.
where cm_hostname is the name of the Cloudera Manager host and api_version is the correct version of the
API for the version of Cloudera Manager you are using. For example,
http://tcdn5-1.ent.cloudera.com:7180/api/v6/cm/deployment.
3. Stop all services on the cluster.
4. Stop the Cloudera Management Service.
5. Stop the Cloudera Manager Server.
6. Stop the Cloudera Manager Agents on the hosts that will be having the hostname changed.
7. Back up the Cloudera Manager Server database using mysqldump, pg_dump, or another preferred backup
utility. Store the backup in a safe location.
8. Update names and principals:
a. Update the target hosts using standard per-OS/name service methods (/etc/hosts, DNS,
/etc/sysconfig/network, hostname, and so on). Ensure that you remove the old hostname.
b. If you are changing the hostname of the host running Cloudera Manager Server, do the following:
   a. Change the hostname per step 8.a.
   b. Update the Cloudera Manager hostname in /etc/cloudera-scm-agent/config.ini on all Agents.
c. If the cluster is configured for Kerberos security, do the following:
   a. In the Cloudera Manager database, set the merged_keytab value:
      PostgreSQL
      update roles set merged_keytab=NULL;
      MySQL
      update ROLES set MERGED_KEYTAB=NULL;
   b. Remove old hostname cluster service principals from the KDC database using one of the following:
      Use the delprinc command within the kadmin.local interactive shell.
      From the command line:
      kadmin.local -q "listprincs" | grep -E "(HTTP|hbase|hdfs|hive|httpfs|hue|impala|mapred|solr|oozie|yarn|zookeeper)[^/]*/[^/]*@" > cluster-princ.txt
      Open cluster-princ.txt and remove any non-cluster service principal entries within it. Make
      sure that the default krbtgt principal and any other principals that you created, or that
      Kerberos created by default, are not listed in the file, because the following command deletes
      every principal listed there:
      for i in `cat cluster-princ.txt`; do yes yes | kadmin.local -q "delprinc $i"; done
   c. Start the Cloudera Manager database and Cloudera Manager Server.
   d. Start the Cloudera Manager Agents on the newly renamed hosts. The Agents should show a current
      heartbeat in Cloudera Manager.
   e. Within the Cloudera Manager Admin Console, recreate all the principals based on the new hostnames:
      a. Select Administration > Kerberos.
9. If one of the hosts that was renamed has a NameNode configured with High Availability and automatic
failover enabled, reconfigure the ZooKeeper failover controller znodes to reflect the new hostname.
Warning:
Do not perform this step if you are also running JobTracker in a High Availability configuration,
as clearing the hadoop-ha znode will negatively impact JobTracker HA.
All other services, and most importantly HDFS, should not be running.
a. Start ZooKeeper services.
Note: Make sure the ZooKeeper Failover Controller role is stopped within the HDFS service;
start only the ZooKeeper Server role instances.
b. On one of the hosts that has a ZooKeeper Server role, log in to the ZooKeeper CLI to delete the nameservice
znode:
On a package-based installation, zkCli.sh is found at: /usr/lib/zookeeper/bin/zkCli.sh
On a parcel-based installation, zkCli.sh is found at: /opt/cloudera/parcels/CDH/lib/zookeeper/bin/zkCli.sh
   a. Verify that the HA znode exists: zkCli$ ls /hadoop-ha
   b. Delete the old znode: zkCli$ rmr /hadoop-ha/nameservice1
c. In the Cloudera Manager Admin Console, go to the HDFS service.
d. Click the Instances tab.
e. Select Actions > Initialize High Availability State in ZooKeeper....
10. For each of the Cloudera Management Service roles (Host Monitor, Service Monitor, Reports Manager, Activity
Monitor, Navigator), go to the role's configuration and update the Database Hostname property.
11. Start all cluster services.
12. Start the Cloudera Management Service.
13. Deploy client configurations.
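The export in step 2 can also be scripted rather than done in a browser. The following is a minimal sketch that assumes curl is installed and that the API accepts HTTP basic authentication for a Cloudera Manager user. The function names and the output file are illustrative, not part of the product.

```shell
# deployment_url CM_HOST API_VERSION
# Prints the export URL from step 2 for the given host and API version.
deployment_url() {
    printf 'http://%s:7180/api/%s/cm/deployment\n' "$1" "$2"
}

# export_deployment CM_HOST API_VERSION USER OUTFILE   (hypothetical helper)
# Saves the deployment configuration to OUTFILE. Because only a user name is
# passed to -u, curl prompts for the password instead of taking it as an argument.
export_deployment() {
    curl --fail --silent --show-error -u "$3" -o "$4" "$(deployment_url "$1" "$2")"
}
```

For example, export_deployment tcdn5-1.ent.cloudera.com v6 admin cm-deployment.json would save the same document that the browser displays. Keep the saved file with the database backup from step 7.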
Backing up Databases
Cloudera recommends that you periodically back up the databases that Cloudera Manager uses to store
configuration, monitoring, and reporting data and for managed services that require a database:
Cloudera Manager - Contains all the information about what services you have configured, their role
assignments, all configuration history, commands, users, and running processes. This is a relatively small
database (<100 MB), and is the most important to back up. A monitoring database contains monitoring
information about service and host status. In large clusters, this database can grow large.
Activity Monitor - Contains information about past activities. In large clusters, this database can grow large.
Reports Manager - Keeps track of disk utilization and processing activities over time. Medium-sized.
Hive Metastore - Contains Hive metadata. Relatively small.
Sentry Server - Contains authorization metadata. Relatively small.
Cloudera Navigator Audit Server - Contains auditing information. In large clusters, this database can grow
large.
3. Run the following command as root using the parameters from the preceding step:
# pg_dump -h host -p 7432 -U scm > /tmp/scm_server_db_backup.$(date +%Y%m%d)
4. Enter the password specified for the com.cloudera.cmf.db.password property on the last line of the
db.properties file. If you are using the embedded database, Cloudera Manager generated the password
for you during installation. If you are using an external database, enter the appropriate information for your
database.
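Steps 3 and 4 can be combined in a script so that the password is read from db.properties instead of being typed. This is a sketch under stated assumptions: db.properties is taken to hold plain key=value lines, the helper names are hypothetical, and the file location used in the usage note is an assumption about a default installation. PGPASSWORD is the standard PostgreSQL client environment variable for supplying a password.

```shell
# db_property FILE KEY
# Prints the value of KEY from a key=value properties file. If KEY occurs
# more than once, the last occurrence wins. Prints nothing if KEY is absent.
db_property() {
    awk -v k="$2" 'index($0, k "=") == 1 { v = substr($0, length(k) + 2) }
                   END { if (v != "") print v }' "$1"
}

# backup_scm_db HOST PROPERTIES_FILE   (hypothetical wrapper around the step 3 command)
# Runs the pg_dump command from step 3 with the password taken from PROPERTIES_FILE.
backup_scm_db() {
    PGPASSWORD=$(db_property "$2" com.cloudera.cmf.db.password) \
        pg_dump -h "$1" -p 7432 -U scm > "/tmp/scm_server_db_backup.$(date +%Y%m%d)"
}
```

A call might look like backup_scm_db localhost /etc/cloudera-scm-server/db.properties, run as root as in step 3.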
For example, to back up the Activity Monitor database amon created in Creating Databases for Activity Monitor,
Reports Manager, Hive Metastore, Sentry Server, and Cloudera Navigator Audit Server, on the local host as the
root user, with the password amon_password:
$ mysqldump -pamon_password amon > /tmp/amon-backup.sql
To back up the sample Activity Monitor database amon on remote host myhost.example.com as the root user,
with the password amon_password:
$ mysqldump -hmyhost.example.com -uroot -pamon_password amon > /tmp/amon-backup.sql
2. Click Start to confirm. The Command Details window shows the progress of starting the roles.
3. When Command completed with n/n successful subcommands appears, the task is complete. Click Close.
Stopping the Cloudera Management Service
Required Role:
1. Do one of the following:
   1. Select Clusters > Cloudera Management Service > Cloudera Management Service.
   2. Select Actions > Stop.
2. Click Stop to confirm. The Command Details window shows the progress of stopping the roles.
2. Click Restart to confirm. The Command Details window shows the progress of stopping and then starting
the roles.
3. When Command completed with n/n successful subcommands appears, the task is complete. Click Close.
Configuring Management Service Database Limits
Required Role:
Each Cloudera Management Service role maintains a database for retaining the data it monitors. These databases
(as well as the log files maintained by these services) can grow quite large. For example, the Activity Monitor
maintains data at the service level, the activity level (MapReduce jobs and aggregate activities), and at the task
attempt level. Limits on these data sets are configured when you create the management services, but you can
modify these parameters through the Configuration settings in the Cloudera Manager Admin Console. For
example, the Event Server lets you set a total number of events to store, and Activity Monitor gives you "purge"
settings (also in hours) for the data it stores.
There are also settings for the logs that these various services create. You can throttle how big the logs are
allowed to get and how many previous logs to retain.
1. Do one of the following:
a. Select Clusters > Cloudera Management Service > Cloudera Management Service.
b. On the Status tab of the Home page, in Cloudera Management Service table, click the Cloudera Management
Service link.
2. Click the Configuration tab.
3. In the left-hand column, select the Default role group for the role whose configurations you want to modify.
4. Edit the appropriate properties:
Activity Monitor - the Purge or Expiration period properties are found in the top-level settings for the
role.
Host and Service Monitor - see Monitoring Data Storage.
Log Files - log file size settings will be under the Logs category under the role group.
5. Click Save Changes.
Adding and Starting Cloudera Navigator Roles
Required Role:
1. Do one of the following:
a. Select Clusters > Cloudera Management Service > Cloudera Management Service.
b. Select Actions > Restart.
2. Click the Instances tab.
Matching Hosts
10.1.1.[1-4]
host[1-3].company.com
host[07-10].company.com
IP addresses
Rack name
Click the View By Host button for an overview of the role assignment by hostname ranges.
5. When you are satisfied with the assignments, click Continue. The Database Setup page displays.
6. Configure database settings:
a. Choose the database type:
Leave the default setting of Use Embedded Database to have Cloudera Manager create and configure
required databases. Make a note of the auto-generated passwords.
Select Use Custom Databases to specify external databases.
1. Enter the database host, database type, database name, username, and password for the database
that you created when you set up the database.
b. Click Test Connection to confirm that Cloudera Manager can communicate with the database using the
information you have supplied. If the test succeeds in all cases, click Continue; otherwise check and correct
the information you have provided for the database and then try the test again. (For some servers, if you
are using the embedded database, you will see a message saying the database will be created at a later
step in the installation process.) The Review Changes page displays.
7. Review and accept any configuration changes (typically there are none). Click Accept. This returns you to the
Instances page.
8. Check the checkboxes next to the Navigator Audit Server and Navigator Metadata Server roles.
9. Select Actions for Selected > Start and confirm Start in the pop-up.
10. Click Close.
A user in the Administrator role manages user accounts through the Administration > Users page.
User Authentication
Cloudera Manager provides several mechanisms for authenticating users. You can configure Cloudera Manager
to authenticate users against the Cloudera Manager database or against an external authentication service.
The external authentication service can be an LDAP server (Active Directory or an OpenLDAP compatible directory),
or you can specify another external service. Cloudera Manager also supports using the Security Assertion Markup
Language (SAML) to enable single sign-on.
If you are using LDAP or an external service you can configure Cloudera Manager so that it can use both methods
of authentication (internal database and external service), and you can determine the order in which it performs
these searches. If you select an external authentication mechanism, Administrator users can always authenticate
against the Cloudera Manager database. This is to prevent locking everyone out if the authentication settings
are misconfigured, such as with a bad LDAP URL.
With external authentication, you can restrict login access to members of specific groups, and can specify groups
whose members are automatically given Administrator access to Cloudera Manager.
User accounts in the Cloudera Manager database show Cloudera Manager in the User Type column. User
accounts in an LDAP directory or other external authentication mechanism show External in the User Type
column.
User Roles
A user's role determines the Cloudera Manager features visible to the user and the actions the user can perform.
All the tasks in the Cloudera Manager documentation indicate which role is required to perform the task.
Note: Currently there is no indication in the Cloudera Manager Admin Console of the role a logged-in
user is assigned to. To determine your role, contact a user who has the Administrator role.
A user account can be assigned one of the following roles:
Read-Only - Allows the user to view service and monitoring information, but not to add services or take any
actions that affect the state of the cluster.
Limited Operator - Allows the user to view service and monitoring information and decommission hosts
(except hosts running Cloudera Management Service roles), but not to add services or take any other actions
that affect the state of the cluster.
If you specify groups in these properties, users must also be a member of at least one of the groups specified
in the LDAP User Groups property or they will not be allowed to log in. If these properties are left empty,
users will be assigned to the Read-Only role and any other role assignment must be performed manually
by an Administrator.
Note:
The default password for the cacerts store is changeit.
The alias can be any name (not just the domain name).
3. Configure the LDAP URL property to use ldaps://ldap_server instead of ldap://ldap_server.
0 - Read-Only
1 - Administrator
2 - Limited Operator
3 - Operator
4 - Configurator
Preparing Files
You will need to prepare the following files and information, and provide these to Cloudera Manager:
A Java keystore containing a private key for Cloudera Manager to use to sign/encrypt SAML messages.
The SAML metadata XML file from your IDP. This file must contain the public certificates needed to verify
the sign/encrypt key used by your IDP per the SAML Metadata Interoperability Profile.
The entity ID that should be used to identify the Cloudera Manager instance.
How the user ID is passed in the SAML authentication response:
   As an attribute. If so, what identifier is used.
   As the NameID.
The method by which the Cloudera Manager role will be established:
   From an attribute in the authentication response:
      What identifier will be used for the attribute
      What values will be passed to indicate each role
   From an external script that will be called for each use:
      The script takes user ID as $1
      The script sets an exit code to reflect the assigned role:
         0 - Administrator
         1 - Read-Only
         2 - Limited Operator
         3 - Operator
         4 - Configurator
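The external script option can be sketched as follows. Only the interface comes from the list above: the user ID arrives as $1 and the exit code selects the role. The user names and patterns are made up for illustration, and falling back to Read-Only for unmatched users is an assumption of this sketch, not documented behavior.

```shell
# role_exit_code USER_ID   (hypothetical lookup; replace the patterns with
# a query against your own source of role assignments)
# Returns the exit code for the role assigned to USER_ID.
role_exit_code() {
    case "$1" in
        admin|alice)  return 0 ;;  # Administrator
        helpdesk-*)   return 2 ;;  # Limited Operator
        ops-*)        return 3 ;;  # Operator
        cfg-*)        return 4 ;;  # Configurator
        *)            return 1 ;;  # Read-Only for everyone else (assumption)
    esac
}

# When saved as a standalone script, finish with:
#   role_exit_code "$1"; exit $?
```

Keeping the mapping in one function makes it easy to check each user pattern from a shell before pointing Cloudera Manager at the script.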
The -validity option specifies the certificate lifetime in number of days. If no validity value is specified,
the default value is used. The default varies, but is often 90 days.
The <path-to-keystore> must be a path where you want to save the keystore file and that the
Cloudera Manager Server host can access.
2. When prompted by keytool, create a password for the keystore. Save the password in a safe place.
Path to TLS Keystore File - The full filesystem path to the keystore file.
Enable TLS encryption between the Server and Agents.
Keystore Password
use_tls
2. When prompted by keytool, create a password for the keystore. Save the password in a safe place.
3. When prompted by keytool, answer the questions about you and your company accurately.
The most important answer is the CN value for the question "What is your first and last name?" The CN must
match the fully-qualified domain name (FQDN) or IP address of the host where the Server is running. For
example, cmf.company.com or 192.168.123.101.
Important: For the CN value, be sure to use an FQDN if possible, or a static IP address that will not
change. Do not specify an IP address that will change periodically. When agents connect to the
server using TLS, they check whether the key uses the same name as the one they are using to
connect to the server. If the names do not match, agents do not heartbeat.
4. On the Server host, run the following command to export the server certificate from your keystore in the
binary DER format:
$ keytool -exportcert -keystore <path-to-keystore> -alias jetty -file server.der
5. Convert the binary DER format to a .pem file that can be used on the Agents by using openssl:
$ openssl x509 -out server.pem -in server.der -inform der
Description
verify_cert_file
use_tls
Description
Step 3: Generate the private key for the Agent using openssl.
1. Run the following openssl command on the agent:
$ openssl genrsa -des3 -out agent.key
The key is output in a .pem file. In the preceding example, the optional days argument results in a certificate
that is valid for 365 days.
2. Fill in the answers to the questions about the certificate. Note that the CN must match the hostname or IP
address of the Agent host.
Step 5: Create a file that contains the password for the key.
The Agent reads the password from a text file instead of from a command line. The file allows you to use file
permissions to protect the password. For example, name the file agent.pw.
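One way to create such a file so that only its owner can read it is sketched below. The function name is hypothetical, and the sketch writes the password exactly as received, without adding a trailing newline.

```shell
# write_key_password_file FILE   (hypothetical helper)
# Copies the password from standard input into FILE and restricts FILE to
# read and write access for its owner only.
write_key_password_file() (
    umask 077          # a newly created file gets mode 600
    cat > "$1" || exit 1
    chmod 600 "$1"     # also tighten a file that already existed
)
```

For example, printf '%s' "$AGENT_KEY_PASSWORD" | write_key_password_file agent.pw, where AGENT_KEY_PASSWORD is a placeholder variable holding the password chosen for agent.key.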
Step 6: Configure the Agent with its private key and certificate.
1. On the Agent host, open the /etc/cloudera-scm-agent/config.ini configuration file.
2. Edit the following properties in the /etc/cloudera-scm-agent/config.ini configuration file.
Property
Description
client_key_file
client_keypw_file
client_cert_file
Step 9: Enable Agent authentication and configure the Server to use the new truststore.
Path to Truststore - Specify the full filesystem path to the truststore located on the Cloudera
Manager Server host.
Truststore Password
Step 12: Verify that the Server and Agents are communicating.
In Cloudera Manager Admin Console, open the Hosts page. If the Agents heartbeat successfully, the Server and
Agents are communicating. If they are not, you may get an error in the Server, such as a null CA chain error.
This implies either the truststore doesn't contain the Agent certificate or the Agent isn't presenting the certificate.
Double-check all of your settings. Check the Server's log to verify whether TLS and Agent validation have been
enabled correctly.
The -validity option specifies the certificate lifetime in number of days. If no validity value is specified,
the default value is used. The default varies, but is often 90 days.
The <path-to-keystore> must be a path where you want to save the keystore file and that the
Cloudera Manager Server host can access.
2. When prompted by keytool, create a password for the keystore. Save the password in a safe place.
3. When prompted by keytool, answer the questions about you and your company accurately.
The most important answer is the CN value for the question "What is your first and last name?" The CN must
match the fully-qualified domain name (FQDN) or IP address of the host where the Server is running. For
example, cmf.company.com or 192.168.123.101.
Important: For the CN value, be sure to use an FQDN if possible, or a static IP address that will not
change. Do not specify an IP address that will change periodically. When agents connect to the server
using TLS, they check whether the key uses the same name as the one they are using to connect to
the server. If the names do not match, agents do not heartbeat.
Description
Keystore Password
Log out and then log in to Cloudera Manager to test the certificate. You may see a warning message to accept
the certificate if the root certificate is not installed in your browser.
OR
If no truststore is configured through Cloudera Manager, the default Java truststore (cacerts) will be used to
verify certificates.
The following table shows Cloudera Manager roles that act as HTTPS servers, and the other roles that
communicate with them as HTTPS clients. This table does not depict the entirety of the roles' communication,
just communications over HTTPS.
Table 1: HTTPS Communication Between Cloudera Manager Roles
Roles as HTTPS Servers
Activity Monitor
Host Monitor
Service Monitor
Event Server
Reports Manager
Note: The Cloudera Navigator roles also act as HTTPS clients, but are outside the scope of this
document.
The Cloudera Manager roles behave as follows when communicating using HTTPS:
If the Cloudera Management Service's SSL Client Truststore File Location parameter is configured, the roles
will use this truststore to perform certificate verification on the server certificates. If this parameter is not
set, certificate verification will be performed using the default Java truststore. This means that it is not
possible, without the use of safety valves, to perform certificate verification for some Cloudera Management
Service roles and not for others. Nor is it possible to perform certificate verification for only a subset of the
HTTPS communication by a role.
The Cloudera Management Service roles never participate in mutual TLS authentication with any CDH service
or with the Cloudera Manager Server. Instead, each service has its own authentication scheme. For most
services this is Kerberos authentication, while Impala uses HTTP digest. For the Cloudera Manager Server,
this is session-based authentication.
User Impact
This depends on how you are using certificates:
If you are using a CA-signed certificate, configure the Cloudera Management Service's SSL Client Truststore
File Location parameter to point to a truststore that contains the CA certificate. Adding a new service or
enabling TLS on an existing service will not require any changes to the Cloudera Management Service
configuration.
Understanding Upgrades
The process for upgrading Cloudera Manager varies depending on the starting point. The categories of tasks to
be completed include the following:
Install any databases required for the release. In Cloudera Manager 5, the Host Monitor and Service Monitor
roles use an internal database that provides greater capacity and flexibility for current and future uses. You
no longer need to configure an external database for this purpose. If you are upgrading from Cloudera Manager
4, this transition is handled automatically. If you are upgrading a Free Edition installation and you are running
a MapReduce service, you are asked to configure an additional database for the Activity Monitor that is part
of Cloudera Express.
Upgrade the Cloudera Manager Server.
Upgrade the Cloudera Manager Agent. This can be done via an upgrade wizard that is invoked when you
connect to the Admin Console or by manually installing the Cloudera Manager Agent packages.
Upgrading CDH
Cloudera Manager 5 can manage both CDH 4 and CDH 5, so upgrading existing CDH 4 installations is not required.
However, to get the benefits of the most current CDH features, you may want to upgrade CDH. For more
information on upgrading CDH, see Upgrading CDH and Managed Services.
Back up Databases
Before beginning the upgrade process, shut down the services that are using databases. This includes the
Cloudera Manager Management Service roles, the Hive Metastore server, and Cloudera Navigator, if it is in use.
Cloudera strongly recommends that you then back up all databases; however, backing up the Activity Monitor
database is optional. Backups are especially important if you are upgrading from Cloudera Manager 4 to Cloudera
Manager 5. For information on backing up databases, see Backing up Databases on page 16.
If any additional databases are required as a result of the upgrade, complete any required preparatory work
to install and configure those databases. For example, if you are upgrading from Cloudera Manager Free Edition,
Cloudera Manager 5 with Cloudera Express requires a database for the Activity Monitor. The upgrade instructions
assume all required databases have been prepared. For more information on using databases, see Cloudera
Manager and Managed Service Databases.
4. Review the contents of the exported database for non-standard characters. If you find unexpected characters,
modify these so the database backup file contains the expected data.
5. Import the database backup to the newly created database.
Modifying Oracle to Support UTF-8
Work with your Oracle database administrator to ensure any Oracle databases support UTF-8.
For example, if a host has two databases, you anticipate 250 maximum connections. If you anticipate a maximum
of 250 connections, plan for 280 sessions.
Once you know the number of sessions, you can determine the number of anticipated transactions using the
following formula:
transactions = 1.1 * sessions
Continuing with the previous example, if you anticipate 280 sessions, you can plan for 308 transactions.
Work with your Oracle database administrator to apply these derived values to your system.
Using the sample values above, Oracle attributes would be set as follows:
alter system set processes=250;
alter system set transactions=308;
alter system set sessions=280;
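The arithmetic above can be checked with a short script. This is a minimal sketch, not part of Cloudera Manager: the function name is hypothetical, and the sessions formula (1.1 * connections + 5) is an assumption inferred from the 250-to-280 example, since only the transactions formula is stated in this section. Confirm the final values with your Oracle database administrator.

```python
def oracle_limits(max_connections):
    """Derive Oracle processes, sessions, and transactions settings.

    Integer arithmetic (value * 11 / 10, rounded up) is used instead of
    floating-point 1.1 * value so that the results are exact.
    """
    def times_1_1(value):
        # Ceiling of value * 1.1 without floating-point rounding error.
        return -(-value * 11 // 10)

    processes = max_connections
    # ASSUMPTION: sessions = 1.1 * connections + 5, inferred from the example.
    sessions = times_1_1(max_connections) + 5
    # Stated in the text: transactions = 1.1 * sessions.
    transactions = times_1_1(sessions)
    return {"processes": processes, "sessions": sessions, "transactions": transactions}


if __name__ == "__main__":
    for name, value in oracle_limits(250).items():
        print(f"alter system set {name}={value};")
```

For 250 connections this reproduces the sample values used in the statements above.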
Next Steps
After you have completed any required database preparatory tasks, continue to Upgrading Cloudera Manager
4 to Cloudera Manager 5 on page 53.
Review Warning
Warning: If you have enabled auditing with Cloudera Navigator, auditing is suspended during the
Cloudera Manager 5 upgrade and resumes only when you restart the roles of audited
services.
Procedure
HDFS - NameNode
HBase - Master and RegionServers
Hive - HiveServer2
Hue - Beeswax Server
2. If you are using the embedded PostgreSQL database for Cloudera Manager, stop the database:
$ sudo service cloudera-scm-server-db stop
Important: If you are not running the embedded database service and you attempt to stop it, you
will get a message to the effect that the service cannot be found. If instead you get a message
that the shutdown failed, this means the embedded database is still running, probably due to
services connected to the Hive Metastore. Do not proceed with the installation until you have
stopped all your Metastore-dependent services and the database successfully shuts down (restart
the Cloudera Manager server to shut down services as necessary). If you continue without solving
this, your upgrade will fail and you will be left with a non-functional Cloudera Manager installation.
3. If the Cloudera Manager host is also running the Cloudera Manager Agent, stop the Cloudera Manager Agent:
$ sudo service cloudera-scm-agent stop
(Optional) Upgrade JDK on Cloudera Manager Server Host and Agent Hosts
If you are manually upgrading the Cloudera Manager Agent packages in Upgrade Cloudera Manager Agent
Packages on page 50, and you plan to upgrade to CDH 5, ensure that Oracle JDK 7u55 is installed on the Agent
hosts following the instructions in Java Development Kit Installation.
If you are not running Cloudera Manager Server on the same host as a Cloudera Manager Agent, and you want
all hosts to run the same JDK version, optionally install Oracle JDK 7u55 on that host.
that contains information including the repository's base URL and GPG key. The contents of the
cloudera-manager.repo file might appear as follows:
[cloudera-manager]
# Packages for Cloudera Manager, Version 5, on RedHat or CentOS 6 x86_64
name=Cloudera Manager
baseurl=http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/
gpgkey = http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/RPM-GPG-KEY-cloudera
gpgcheck = 1
For Ubuntu or Debian systems, the repo file can be found by navigating to the appropriate release directory,
for example, http://archive.cloudera.com/cm5/debian/wheezy/amd64/cm. The repo file, in this
case, cloudera.list, may appear as follows:
# Packages for Cloudera Manager, Version 5, on Debian 7.0 x86_64
deb http://archive.cloudera.com/cm5/debian/wheezy/amd64/cm wheezy-cm5 contrib
deb-src http://archive.cloudera.com/cm5/debian/wheezy/amd64/cm wheezy-cm5 contrib
b. Replace the repo file in the configuration location for the package management software for your system.
Operating System
Commands
RHEL
SLES
Ubuntu or Debian
Commands
RHEL
Note:
yum clean all cleans up yum's cache directories, ensuring that you
download and install the latest versions of the packages.
If your system is not up to date and any underlying system components
need to be upgraded before this yum update can succeed, yum tells you
what those are.
SLES
Commands
$ sudo zypper ar -t rpm-md http://myhost.example.com/path_to_cm_repo/cm
$ sudo zypper up -r http://myhost.example.com/path_to_cm_repo
Ubuntu or Debian
Use the following commands to clean cached repository information and update
Cloudera Manager components:
$ sudo apt-get clean
$ sudo apt-get update
$ sudo apt-get dist-upgrade
$ sudo apt-get install cloudera-manager-server cloudera-manager-agent cloudera-manager-daemons
As this process proceeds, you may be prompted concerning your configuration file
version:
Configuration file `/etc/cloudera-scm-agent/config.ini'
==> Modified (by you or by a script) since installation.
==> Package distributor has shipped an updated version.
What would you like to do about it ? Your options are:
Y or I : install the package maintainer's version
N or O : keep your currently-installed version
D : show the differences between the versions
Z : start a shell to examine the situation
The default action is to keep your current version.
At the end of this process you should have the following packages, corresponding to the version of Cloudera
Manager you installed, on the host that will become the Cloudera Manager Server host.
OS
Packages
cloudera-manager-agent-5.1.2-0.cm5.p0.932.el6.x86_64
cloudera-manager-server-5.1.2-0.cm5.p0.932.el6.x86_64
cloudera-manager-daemons-5.1.2-0.cm5.p0.932.el6.x86_64
Ubuntu or Debian
~# dpkg-query -l 'cloudera-manager-*'
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                   Version                Description
+++-======================-======================-============================================================
ii  cloudera-manager-agent 5.1.2-0.cm5.p0.932~sq  The Cloudera Manager Agent
ii  cloudera-manager-daemo 5.1.2-0.cm5.p0.932~sq  Provides daemons for monitoring Hadoop and related tools.
ii  cloudera-manager-serve 5.1.2-0.cm5.p0.932~sq  The Cloudera Manager Server
You may also see an entry for the cloudera-manager-server-db-2 if you are using the embedded database,
and additional packages for plug-ins, depending on what was previously installed on the server host. If the
cloudera-manager-server-db-2 package is installed, and you don't plan to use the embedded database, you
can remove this package.
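One way to confirm that the Cloudera Manager packages all report the same version is to parse the dpkg-query listing. The following is a minimal sketch: the function name and the sample text are illustrative, and it assumes each package appears on a single line in the usual `dpkg-query -l` layout (status flags, name, version, description).

```python
def installed_versions(dpkg_listing):
    """Map package name -> version for fully installed ('ii') packages."""
    versions = {}
    for line in dpkg_listing.splitlines():
        fields = line.split(None, 3)  # flags, name, version, description
        if len(fields) >= 3 and fields[0] == "ii":
            versions[fields[1]] = fields[2]
    return versions


# Sample listing modeled on the output shown in this section.
SAMPLE = """\
||/ Name                   Version                Description
+++-======================-======================-====================
ii  cloudera-manager-agent 5.1.2-0.cm5.p0.932~sq  The Cloudera Manager Agent
ii  cloudera-manager-daemo 5.1.2-0.cm5.p0.932~sq  Provides daemons for monitoring Hadoop and related tools.
ii  cloudera-manager-serve 5.1.2-0.cm5.p0.932~sq  The Cloudera Manager Server
"""

if __name__ == "__main__":
    found = installed_versions(SAMPLE)
    # All Cloudera Manager packages should share one version string.
    print(found, "consistent:", len(set(found.values())) == 1)
```

Header and separator lines are skipped because their first field is not the `ii` status flag.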
OK
If the Cloudera Manager Server does not start, see Troubleshooting Installation and Upgrade Problems.
2. In the Cloudera Admin Console, select No, I would like to skip the agent upgrade now and click Continue.
3. Copy the appropriate repo file as described in Upgrade Cloudera Manager Server Packages on page 56.
4. Run the following commands:
Operating System Commands
RHEL
Note:
yum clean all cleans up yum's cache directories, ensuring that
you download and install the latest versions of the packages.
If your system is not up to date and any underlying system
components need to be upgraded before this yum update can
succeed, yum tells you what those are.
SLES
Ubuntu or Debian
Use the following commands to clean cached repository information and update
Cloudera Manager components:
$ sudo apt-get clean
$ sudo apt-get update
$ sudo apt-get dist-upgrade
$ sudo apt-get install cloudera-manager-agent cloudera-manager-daemons
3. Click Continue. The Host Inspector runs to inspect your managed hosts for correct versions and configurations.
If there are problems, you can make changes and then re-run the inspector. When you are satisfied with the
inspection results, click Finish.
4. Review the configuration changes to be applied. Confirm the settings entered for file system paths. The file
paths required vary based on the services to be installed.
Warning: DataNode data directories should not be placed on NAS devices.
Click Continue. The wizard starts the services.
5. Click Finish.
All services (except for the services you stopped in Stop Selected Services on page 47) should be running.
next to the name of each service you shut down and select Start.
Note:
When you install on EC2 using the Cloud wizard, the wizard creates a security group that by default
opens ports used by Cloudera Manager and CDH components. Before upgrading, you must manually
open these ports:
Upgrades from Cloudera Manager 4.7.2 or earlier - 7185 for the Cloudera Manager Event Server.
Upgrades from Cloudera Manager 5.0.0 beta 2 or earlier - 18080 and 18081 for the Spark master
and worker web UI ports.
If you are upgrading from Cloudera Manager Free Edition 4.5 or earlier, you are upgraded to Cloudera
Express, which includes a number of features that were previously available only with Cloudera
Enterprise. Of those features, activity monitoring requires a database. Thus, upon upgrading to
Cloudera Manager 5, you must specify Activity Monitor database information. You have the option
to use the embedded PostgreSQL database, which Cloudera Manager can set up automatically.
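To confirm that the ports listed in the note above are reachable once the security group has been changed, a quick TCP connection test can help. This is a sketch using only the Python standard library; the function names are hypothetical, and a successful connection only shows that something is listening on the port, not that it is the expected role.

```python
import socket

# Ports named in the note above.
UPGRADE_PORTS = {
    7185: "Cloudera Manager Event Server",
    18080: "Spark master web UI",
    18081: "Spark worker web UI",
}


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ports(host, ports=UPGRADE_PORTS):
    """Return {port: bool} for each port of interest on the given host."""
    return {port: port_is_open(host, port) for port in ports}
```

For example, `check_ports("cm-host.example.com")` returns a dictionary keyed by port number; the host name here is a placeholder for the host that runs the relevant role.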
Procedure
HDFS - NameNode
HBase - Master and RegionServers
Hive - HiveServer2
Hue - Beeswax Server
Important: If you are not running the embedded database service and you attempt to stop it, you
will get a message to the effect that the service cannot be found. If instead you get a message
that the shutdown failed, this means the embedded database is still running, probably due to
services connected to the Hive Metastore. Do not proceed with the installation until you have
stopped all your Metastore-dependent services and the database successfully shuts down (restart
the Cloudera Manager server to shut down services as necessary). If you continue without solving
this, your upgrade will fail and you will be left with a non-functional Cloudera Manager installation.
3. If the Cloudera Manager host is also running the Cloudera Manager Agent, stop the Cloudera Manager Agent:
$ sudo service cloudera-scm-agent stop
(Optional) Upgrade JDK on Cloudera Manager Server Host and Agent Hosts
If you are manually upgrading the Cloudera Manager Agent packages in Upgrade Cloudera Manager Agent
Packages on page 58, and you plan to upgrade to CDH 5, install Oracle JDK 7u55 on the Agent hosts following
the instructions in Java Development Kit Installation.
If you are not running Cloudera Manager Server on the same host as a Cloudera Manager Agent, and you want
all hosts to run the same JDK version, optionally install Oracle JDK 7u55 on that host.
that contains information including the repository's base URL and GPG key. The contents of the
cloudera-manager.repo file might appear as follows:
[cloudera-manager]
# Packages for Cloudera Manager, Version 5, on RedHat or CentOS 6 x86_64
name=Cloudera Manager
baseurl=http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/
gpgkey = http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/RPM-GPG-KEY-cloudera
gpgcheck = 1
For Ubuntu or Debian systems, the repo file can be found by navigating to the appropriate release directory,
for example, http://archive.cloudera.com/cm5/debian/wheezy/amd64/cm. The repo file, in this
case, cloudera.list, may appear as follows:
# Packages for Cloudera Manager, Version 5, on Debian 7.0 x86_64
deb http://archive.cloudera.com/cm5/debian/wheezy/amd64/cm wheezy-cm5 contrib
deb-src http://archive.cloudera.com/cm5/debian/wheezy/amd64/cm wheezy-cm5 contrib
b. Replace the repo file in the configuration location for the package management software for your system.
Commands
RHEL
SLES
Ubuntu or Debian
Commands
RHEL
Note:
yum clean all cleans up yum's cache directories, ensuring that you
download and install the latest versions of the packages.
If your system is not up to date and any underlying system components
need to be upgraded before this yum update can succeed, yum tells you
what those are.
SLES
Ubuntu or Debian
Use the following commands to clean cached repository information and update
Cloudera Manager components:
$ sudo apt-get clean
$ sudo apt-get update
$ sudo apt-get dist-upgrade
$ sudo apt-get install cloudera-manager-server cloudera-manager-agent cloudera-manager-daemons
As this process proceeds, you may be prompted concerning your configuration file
version:
Configuration file `/etc/cloudera-scm-agent/config.ini'
==> Modified (by you or by a script) since installation.
==> Package distributor has shipped an updated version.
What would you like to do about it ? Your options are:
Y or I : install the package maintainer's version
N or O : keep your currently-installed version
D : show the differences between the versions
Z : start a shell to examine the situation
The default action is to keep your current version.
At the end of this process you should have the following packages, corresponding to the version of Cloudera
Manager you installed, on the host that will become the Cloudera Manager Server host.
Packages
cloudera-manager-agent-5.1.2-0.cm5.p0.932.el6.x86_64
cloudera-manager-server-5.1.2-0.cm5.p0.932.el6.x86_64
cloudera-manager-daemons-5.1.2-0.cm5.p0.932.el6.x86_64
Ubuntu or Debian
~# dpkg-query -l 'cloudera-manager-*'
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                   Version                Description
+++-======================-======================-============================================================
ii  cloudera-manager-agent 5.1.2-0.cm5.p0.932~sq  The Cloudera Manager Agent
ii  cloudera-manager-daemo 5.1.2-0.cm5.p0.932~sq  Provides daemons for monitoring Hadoop and related tools.
ii  cloudera-manager-serve 5.1.2-0.cm5.p0.932~sq  The Cloudera Manager Server
You may also see an entry for the cloudera-manager-server-db-2 if you are using the embedded database,
and additional packages for plug-ins, depending on what was previously installed on the server host. If the
cloudera-manager-server-db-2 package is installed, and you don't plan to use the embedded database, you
can remove this package.
OK
If the Cloudera Manager Server does not start, see Troubleshooting Installation and Upgrade Problems.
2. In the Cloudera Admin Console, select No, I would like to skip the agent upgrade now and click Continue.
3. Copy the appropriate repo file as described in Upgrade Cloudera Manager Server Packages on page 56.
4. Run the following commands:
Operating System Commands
RHEL
Note:
yum clean all cleans up yum's cache directories, ensuring that
you download and install the latest versions of the packages.
If your system is not up to date and any underlying system
components need to be upgraded before this yum update can
succeed, yum tells you what those are.
SLES
Ubuntu or Debian
Use the following commands to clean cached repository information and update
Cloudera Manager components:
$ sudo apt-get clean
$ sudo apt-get update
$ sudo apt-get dist-upgrade
$ sudo apt-get install cloudera-manager-agent cloudera-manager-daemons
3. If you are upgrading from a free version of Cloudera Manager prior to 4.6:
a. Click Continue to assign the Cloudera Management Services roles to hosts.
b. If you are upgrading to Cloudera Enterprise, specify required databases:
a. Configure database settings:
a. Choose the database type:
Leave the default setting of Use Embedded Database to have Cloudera Manager create and
configure required databases. Make a note of the auto-generated passwords.
Select Use Custom Databases to specify external databases.
1. Enter the database host, database type, database name, username, and password for the
database that you created when you set up the database.
b. Click Test Connection to confirm that Cloudera Manager can communicate with the database using
the information you have supplied. If the test succeeds in all cases, click Continue; otherwise check
and correct the information you have provided for the database and then try the test again. (For
some servers, if you are using the embedded database, you will see a message saying the database
will be created at a later step in the installation process.) The Review Changes page displays.
4. Review the configuration changes to be applied. Confirm the settings entered for file system paths. The file
paths required vary based on the services to be installed. Click Finish.
5. If you are upgrading from Cloudera Manager prior to 4.5:
6. If you are upgrading from Cloudera Manager prior to 4.8, and have an Impala service, assign the Impala
Catalog Server role to a host.
All services (except for the services you stopped in Stop Selected Services on page 55) should be running.
Upgrade Impala
If your version of Impala was 1.1 or earlier, upgrade to Impala 1.2.1 or later.
Before performing this step, ensure you understand the semantics of the hard_restart command by reading
Hard Stopping and Restarting Agents on page 11.
3. Start all services.
HDFS - NameNode
HBase - Master and RegionServers
Hive - HiveServer2
Hue - Beeswax Server
next to the name of each service you shut down and select Start.
2. Reinstall the same Cloudera Manager Server version that you were previously running. You can reinstall
from the Cloudera repository at http://archive.cloudera.com/cm4/ or, alternatively, you can create your
own repository, as described in Understanding Custom Installation Solutions.
a. Find the Cloudera repo file for your distribution by starting at http://archive.cloudera.com/cm4/
and navigating to the directory that matches your operating system.
For example, for Red Hat or CentOS 6, you would navigate to
http://archive.cloudera.com/cm4/redhat/6/x86_64/cm/. Within that directory, find the repo file
For Ubuntu or Debian systems, the repo file can be found by navigating to the appropriate directory, for
example, http://archive.cloudera.com/cm4/debian/squeeze/amd64/cm. The repo file, in this case,
cloudera.list, may appear as follows:
# Packages for Cloudera's Distribution for Hadoop, Version 4, on Debian 6.0 x86_64
deb http://archive.cloudera.com/cm4/debian/squeeze/amd64/cm squeeze-cm4 contrib
deb-src http://archive.cloudera.com/cm4/debian/squeeze/amd64/cm squeeze-cm4 contrib
You must edit the file if it exists and modify the URL to reflect the exact version of Cloudera Manager you
are using (unless you want the downgrade to also upgrade to the latest version of Cloudera Manager 4).
The possible versions are shown in the directory on archive.cloudera.com.
Setting the URL (an example):
OS
Command
RHEL
Replace baseurl=http://archive.cloudera.com/cm4/redhat/5/x86_64/cm/4/
with
baseurl=http://archive.cloudera.com/cm4/redhat/5/x86_64/cm/4.7.3/
Ubuntu or Debian
b. Copy the repo file to the configuration location for the package management software for your system:
Operating System
Commands
RHEL
SLES
Ubuntu or Debian
Commands
RHEL
SLES
Ubuntu or Debian
There's no action that will downgrade to the version currently in the repository.
Read DowngradeHowto, download the script described therein, run it, and then run
apt-get install for the name=version pairs that it provides for Cloudera
Manager.
At the end of this process you should have the following packages, corresponding to the version of Cloudera
Manager you installed, on the Cloudera Manager Server host. For example, for CentOS,
$ rpm -qa 'cloudera-manager-*'
cloudera-manager-daemons-4.7.3-1.cm473.p0.163.el6.x86_64
cloudera-manager-server-4.7.3-1.cm473.p0.163.el6.x86_64
cloudera-manager-agent-4.7.3-1.cm473.p0.163.el6.x86_64
For Ubuntu or Debian, you should have packages similar to those shown below.
~# dpkg-query -l 'cloudera-manager-*'
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                   Version                 Description
+++-======================-======================-============================================================
ii  cloudera-manager-agent 4.7.3-1.cm473.p0.163~sq The Cloudera Manager Agent
ii  cloudera-manager-daemo 4.7.3-1.cm473.p0.163~sq Provides daemons for monitoring Hadoop and related tools.
ii  cloudera-manager-serve 4.7.3-1.cm473.p0.163~sq The Cloudera Manager Server
You may also see an entry for the cloudera-manager-server-db if you are using the embedded database,
and additional packages for plug-ins, depending on what was previously installed on the server host. If the
commands to update the server complete without errors, you can assume the upgrade has completed as desired.
For additional assurance, you will have the option to check that the server versions have been updated after
you start the server.
OK
Note: If you have problems starting the server, such as database permissions problems, you can
use the server's log /var/log/cloudera-scm-server/cloudera-scm-server.log to
troubleshoot the problem.
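The baseurl edit described in this procedure can also be scripted. The sketch below is illustrative only: the function name is hypothetical, it handles just the yum-style baseurl= line from the RHEL example, and it operates on text rather than on the file in place so that you can review the result before copying it into your package manager's configuration.

```python
import re

# Matches a baseurl line whose URL ends in /cm/<version>/ (trailing slash optional).
_BASEURL = re.compile(
    r"^([ \t]*baseurl[ \t]*=[ \t]*\S*/cm/)[^/\s]+(/?)[ \t]*$",
    re.MULTILINE,
)


def pin_baseurl(repo_text, version):
    """Replace the final version directory of baseurl with an exact version.

    For example, .../cm/4/ becomes .../cm/4.7.3/ when version is "4.7.3".
    """
    pinned, count = _BASEURL.subn(
        lambda m: m.group(1) + version + m.group(2), repo_text
    )
    if count != 1:
        raise ValueError("expected exactly one baseurl line ending in /cm/<version>/")
    return pinned


# Sample repo file text modeled on the RHEL example in this section.
REPO = """\
[cloudera-manager]
name=Cloudera Manager
baseurl=http://archive.cloudera.com/cm4/redhat/5/x86_64/cm/4/
gpgcheck = 1
"""

if __name__ == "__main__":
    print(pin_baseurl(REPO, "4.7.3"))
```

Raising an error when no baseurl line matches is a deliberate choice here: silently returning an unchanged file could lead to installing the latest Cloudera Manager 4 release instead of the intended one.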
Administration Settings
The Settings page provides a number of categories as follows:
Performance - Set the Cloudera Manager Agent heartbeat interval.
Advanced - Enable API debugging and other advanced options.
Monitoring - Set Agent health status parameters. For configuration instructions, see Configuring Cloudera
Manager Agents on page 11.
Security - Set TLS encryption settings to enable TLS encryption between the Cloudera Manager Server, Agents,
and clients. For configuration instructions, see Configuring TLS Security for Cloudera Manager on page 31.
You can also:
Set the realm for Kerberos security and point to a custom keytab retrieval script. For configuration
instructions, see Configuring Hadoop Security with Cloudera Manager.
Specify session timeout and a "Remember Me" option.
Ports and Addresses - Set ports for the Cloudera Manager Admin Console and Server. For configuration
instructions, see Configuring Cloudera Manager Server Ports on page 9.
Other
Enable Cloudera usage data collection. For configuration instructions, see Managing Anonymous Usage
Data Collection on page 74.
Set a custom header color and banner text for the Admin Console.
Set an "Information Assurance Policy" statement. This statement is presented to every user before
they are allowed to access the login dialog. The user must click "I Agree" in order to proceed to the login
dialog.
Disable/enable the auto-search for the Events panel at the bottom of a page.
Support
Configure diagnostic data collection properties. See Diagnostic Data Collection on page 74.
Configure how to access Cloudera Manager help files. By default, when you click the Help link under the
Support menu in the Cloudera Manager Admin console, Help files from the Cloudera web site are opened.
This is because local Help files are not updated after installation. You can configure Cloudera Manager to
open either the latest Help files from the Cloudera web site (this option requires Internet access from the
browser) or locally-installed Help files by configuring the property Open latest Help files from the Cloudera
website.
External Authentication - Specify the configuration to use LDAP, Active Directory, or an external program for
authentication. See Configuring External Authentication on page 25 for instructions.
Parcels - Configure settings for parcels, including the location of remote repositories that should be made
available for download, and other settings such as the frequency with which Cloudera Manager checks
for new parcels and limits on the number of downloads or concurrent distribution uploads. See Parcels for more
information.
Network - Configure proxy server settings.
Custom Service Descriptors - Configure custom service descriptor properties for Add-on Services.
Managing Licenses
Required Role:
When you install Cloudera Manager, you can choose to select Cloudera Express (no license required), a 60-day
Cloudera Enterprise Data Hub Edition trial license, or Cloudera Enterprise (which requires a license). You can
later end a trial license or upgrade your license.
About Trial Licenses
You can use the trial license only once; once the 60-day trial period has expired or you have ended the trial, you
cannot restart it.
When a trial ends, features that require a Cloudera Enterprise license immediately become unavailable. However,
data or configurations associated with the disabled functions are not deleted, and become available again when
you install a Cloudera Enterprise license. Trial expiration or termination has the following effects:
Basic Edition - a cluster running core CDH services: HDFS, Hive, Hue, MapReduce, Oozie, Sqoop, YARN, and
ZooKeeper.
Flex Edition - a cluster running core CDH services plus one of the following: Accumulo, HBase, Impala, Navigator,
Solr, Spark.
Data Hub Edition - a cluster running core CDH services plus any of the following: Accumulo, HBase, Impala,
Navigator, Solr, Spark.
Ending a Cloudera Enterprise Data Hub Edition Trial
If you are using the trial edition, the License page indicates when your license will expire. However, you can end
the trial at any time (prior to expiration) as follows:
1. On the License page, click End Trial.
Matching Hosts
10.1.1.[1-4]
host[1-3].company.com
host[07-10].company.com
IP addresses
Rack name
3. Select a host and click OK.
4. When you are satisfied with the assignments, click Continue. The Database Setup page displays.
5. Configure database settings:
a. Choose the database type:
Leave the default setting of Use Embedded Database to have Cloudera Manager create and configure
required databases. Make a note of the auto-generated passwords.
Select Use Custom Databases to specify external databases.
1. Enter the database host, database type, database name, username, and password for the database
that you created when you set up the database.
b. Click Test Connection to confirm that Cloudera Manager can communicate with the database using the
information you have supplied. If the test succeeds in all cases, click Continue; otherwise check and correct
the information you have provided for the database and then try the test again. (For some servers, if you
are using the embedded database, you will see a message saying the database will be created at a later
step in the installation process.) The Review Changes page displays.
6. Review the configuration changes to be applied. Confirm the settings entered for file system paths. The file
paths required vary based on the services to be installed.
Warning: DataNode data directories should not be placed on NAS devices.
Click Continue. The wizard starts the services.
7. At this point, your installation is upgraded. Click Continue.
8. Restart Cloudera Management Services and audited services to pick up configuration changes. The audited
services will write audit events to a log file, but the events are not transferred to the Cloudera Navigator
Audit Server until you add and start the Cloudera Navigator Audit Server role as described in Adding and
Starting Cloudera Navigator Roles on page 20. For information on Cloudera Navigator, see Cloudera Navigator
documentation.
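The bracketed host range patterns shown in the host selection step, such as host[07-10].company.com, can be read as shorthand for a list of host names. The sketch below expands such a pattern; it is an illustration of the notation rather than Cloudera Manager's own parser, the function name is hypothetical, and preserving zero-padding (so that 07 expands to 07, 08, 09, 10) is an assumption based on the form of the examples.

```python
import re
from itertools import product

_RANGE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_host_pattern(pattern):
    """Expand bracketed numeric ranges, e.g. host[1-3].company.com."""
    # Splitting on a pattern with two groups yields
    # [text, start, end, text, start, end, ..., text]; keep only the text parts.
    literals = _RANGE.split(pattern)[::3]
    choices = []
    for start, end in _RANGE.findall(pattern):
        if int(start) > int(end):
            raise ValueError(f"descending range [{start}-{end}]")
        # ASSUMPTION: a leading zero in the start value means fixed-width numbers.
        width = len(start) if start.startswith("0") else 0
        choices.append([str(n).zfill(width) for n in range(int(start), int(end) + 1)])
    hosts = []
    for combo in product(*choices):
        parts = [literals[0]]
        for value, literal in zip(combo, literals[1:]):
            parts.append(value + literal)
        hosts.append("".join(parts))
    return hosts


if __name__ == "__main__":
    for example in ("10.1.1.[1-4]", "host[1-3].company.com", "host[07-10].company.com"):
        print(example, "->", ", ".join(expand_host_pattern(example)))
```

A pattern without brackets is returned unchanged as a single-element list.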
Matching Hosts
10.1.1.[1-4]
host[1-3].company.com
host[07-10].company.com
IP addresses
Rack name
8. When you are satisfied with the assignments, click Continue. The Database Setup page displays.
9. Configure database settings:
a. Choose the database type:
Leave the default setting of Use Embedded Database to have Cloudera Manager create and configure
required databases. Make a note of the auto-generated passwords.
Select Use Custom Databases to specify external databases.
1. Enter the database host, database type, database name, username, and password for the database
that you created when you set up the database.
b. Click Test Connection to confirm that Cloudera Manager can communicate with the database using the
information you have supplied. If the test succeeds in all cases, click Continue; otherwise check and correct
the information you have provided for the database and then try the test again. (For some servers, if you
are using the embedded database, you will see a message saying the database will be created at a later
step in the installation process.) The Review Changes page displays.
10. Review the configuration changes to be applied. Confirm the settings entered for file system paths. The file
paths required vary based on the services to be installed.
Warning: DataNode data directories should not be placed on NAS devices.
Click Continue. The wizard starts the services.
Managing Alerts
Required Role:
The Administration > Alerts page provides a summary of the settings for alerts in your clusters.
Alert Type - The left column lets you select by alert type (Health, Log, or Activity) and within that by service instance.
In the case of Health alerts, you can look at alerts for Hosts as well. You can select an individual service to see
just the alert settings for that service.
Health/Log/Activity Alert Settings - Depending on your selection in the left column, the right-hand column shows
you the list of alerts that are enabled or disabled for the selected service type.
To change the alert settings for a service, click the icon next to the service name. This takes you to the Monitoring
section of the Configuration tab for the service. From here you can enable or disable alerts and configure
thresholds as needed.
Recipients - You can also view the list of recipients configured for the enabled alerts.
Configuring Alert Delivery
When you install Cloudera Manager you can configure the mail server you will use with the Alert Publisher.
However, if you need to change these settings, you can do so under the Alert Publisher section of the Management
Services configuration tab. Under the Alert Publisher role of the Cloudera Manager Management Service, you
can configure email or SNMP delivery of alert notifications.
Select Clusters > Cloudera Management Service > Cloudera Management Service.
On the Status tab of the Home page, in the Cloudera Management Service table, click the Cloudera
Management Service link.
2. Click the Configuration tab.
3. Select the Alert Publisher Default Group role group.
Kerberos
Required Role:
After enabling and configuring Hadoop security using Kerberos on your cluster, you can view and regenerate
the Kerberos principals for your cluster. If you make a global configuration change in your cluster, such as
changing the encryption type, you would use the Kerberos page to regenerate the principals for your cluster. In
a secure cluster, the Kerberos page lists all the Kerberos principals that are active on your cluster.
Regenerating Kerberos Principals
If you make a global configuration change in your cluster, such as changing the encryption type, you must use
the following instructions to regenerate the principals for your cluster.
Important:
Regenerate principals using the following steps in the Cloudera Manager Admin Console and not
directly using the kadmin shell.
Do not regenerate the principals for your cluster unless you have made a global configuration
change. Before regenerating, be sure to read Appendix B - Set up a Cluster-dedicated MIT KDC
and Default Domain for the Hadoop Cluster to avoid making your existing host keytabs invalid.
If you are using Active Directory, delete the AD accounts with the userPrincipalName (or login
names) that you want to manually regenerate before continuing with the steps below.
To view and regenerate the Kerberos principals for your cluster:
1. Select Administration > Kerberos.
2. The currently configured Kerberos principals are displayed. If you are running HDFS, the hdfs/hostname and
host/hostname principals are listed. If you are running MapReduce, the mapred/hostname and
host/hostname principals are listed. The principals for other running services are also listed.
3. Only if necessary, select the principals you want to regenerate.
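As an illustration of the naming pattern described in step 2, the following Python sketch derives the principals you would expect to see listed for a single host. It is not part of Cloudera Manager: the function name, the mapping table, and the optional realm suffix are assumptions made for this example, and only the HDFS and MapReduce entries are taken from the text above.

```python
# Hypothetical helper, not a Cloudera Manager API. It lists the principals the
# Kerberos page is described as showing for one host. Only the HDFS and
# MapReduce entries come from this guide; extend the table for other services.
SERVICE_PRINCIPALS = {
    "HDFS": ["hdfs", "host"],
    "MAPREDUCE": ["mapred", "host"],
}


def expected_principals(hostname, services, realm=None):
    """Return sorted, de-duplicated principal names for the given services."""
    names = set()
    for service in services:
        for primary in SERVICE_PRINCIPALS[service]:
            principal = f"{primary}/{hostname}"
            if realm:
                # The realm suffix is an assumption; the guide shows names without it.
                principal += f"@{realm}"
            names.add(principal)
    return sorted(names)


if __name__ == "__main__":
    for name in expected_principals("node1.example.com", ["HDFS", "MAPREDUCE"]):
        print(name)
```

Comparing such a list against the Kerberos page can help you confirm which principals belong to a host before you select any for regeneration. The host/hostname principal appears once even when both services run on the host.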
Disabling the Automatic Sending of Diagnostic Data from a Manually Triggered Collection
If you do not want data automatically sent to Cloudera after manually triggering data collection, you can disable
this feature. The data you collect will be saved and can be downloaded for sending to Cloudera Support at a later
time.
1. Select Administration > Settings.
2. Under the Support category, uncheck the box for Send Diagnostic Data to Cloudera Automatically.
3. Click Save Changes to commit the changes.
Note: The Send Diagnostic Data form that displays when you collect data in one of the following
procedures indicates whether the data will be sent automatically.
Manually Triggering Collection and Transfer of Diagnostic Data to Cloudera
1. Optionally change the System Identifier property:
a. Select Administration > Settings.
b. Under the Other category, set the System Identifier property and click Save Changes.
2. Under the Support category, choose Send Diagnostic Data. The Send Diagnostic Data form displays.
3. Fill in or change the information here as appropriate:
Cloudera Manager populates the End Time based on the setting of the Time Range selector. You should
change this to be a few minutes after you observed the problem or condition that you are trying to capture.
The time range is based on the timezone of the host where Cloudera Manager Server is running.
If you have a support ticket open with Cloudera Support, include the support ticket number in the field
provided.
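The End Time guidance above can be sketched in Python as follows. This is only an illustration of the arithmetic, not a Cloudera Manager interface: the function name, the five-minute margin, and the UTC-8 offset for the Cloudera Manager Server host are all assumptions that you would replace with your own values.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the Cloudera Manager Server host runs at UTC-8.
# Substitute the real offset of your server.
SERVER_TZ = timezone(timedelta(hours=-8))


def suggested_end_time(observed, margin_minutes=5, server_tz=SERVER_TZ):
    """Return the observation time plus a margin, in the server's time zone."""
    if observed.tzinfo is None:
        # A naive timestamp cannot be converted between time zones reliably.
        raise ValueError("observed must be timezone-aware")
    return (observed + timedelta(minutes=margin_minutes)).astimezone(server_tz)


if __name__ == "__main__":
    # Problem observed at 18:30 UTC on 1 March 2014 (example value).
    observed = datetime(2014, 3, 1, 18, 30, tzinfo=timezone.utc)
    print(suggested_end_time(observed).isoformat())
```

Converting to the server's time zone before entering the value matters when you observe the problem from a workstation in a different time zone than the Cloudera Manager Server host.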