Telecommunications:
Winning the NCR Teradata Center for
CRM at Duke University Competition
Objective: predict the probability of losing a customer 30-60 days into the
future
Call behavior: statistics describing completed calls, failed calls, voice and data calls,
call forwarding, customer care calls, directory info
Statistics included mean and range for at least 3 months, last 6 months, and
lifetime
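The windowed summaries above can be sketched with a small helper. This is a hedged illustration of the feature style, not the competition's actual feature code; `window_stats` and the call counts are made up for the example:

```python
def window_stats(values, window=None):
    """Mean and range of the most recent `window` observations
    (whole history, i.e. lifetime, when window is None)."""
    recent = values if window is None else values[-window:]
    return sum(recent) / len(recent), max(recent) - min(recent)

# Monthly completed-call counts for one (made-up) customer, oldest first.
calls = [42, 38, 51, 47, 30, 29, 33, 41, 25, 28, 31, 22]

features = {
    "mean_3m":  window_stats(calls, 3)[0],
    "range_3m": window_stats(calls, 3)[1],
    "mean_6m":  window_stats(calls, 6)[0],
    "range_6m": window_stats(calls, 6)[1],
    "mean_life":  window_stats(calls)[0],
    "range_life": window_stats(calls)[1],
}
```

The same helper would be applied per call type (completed, failed, voice, data, customer care, and so on) to build the feature matrix.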
Immune to outliers
Binary classification
Multinomial classification
Least-squares regression
Least-absolute-deviation regression
(insert equation)
Equivalent to fitting data to a constant.
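"Fitting the data to a constant" refers to the boosting starting point: the zeroth model is the single value that minimizes the chosen loss over the training set. The slide's own equation is missing; a hedged reconstruction of the standard formula is:

```latex
F_0(x) = \arg\min_{c} \sum_{i=1}^{N} L(y_i, c)
```

For least-squares loss this constant is the mean of the targets; for least-absolute-deviation loss it is the median, which is one reason the LAD option is immune to outliers.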
Patient Learning Strategy:
Key to TreeNet Success
Do not use all training data in any one iteration
Randomly sample from training data (we used a 50%
sample)
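The sampling idea can be sketched from scratch, assuming least-squares loss and one-feature decision stumps (names such as `fit_stump` and `boost` are illustrative, not TreeNet's API). Each iteration fits its tree to the residuals of a fresh 50% random sample only:

```python
import random

def fit_stump(xs, residuals):
    # Best single split on a 1-D feature, minimizing squared error.
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        if not left or not right:
            continue
        lmean = sum(left) / len(left)
        rmean = sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, lmean, rmean)
    _, t, lmean, rmean = best
    return lambda x: lmean if x <= t else rmean

def boost(xs, ys, n_trees=50, rate=0.1, subsample=0.5, seed=0):
    rng = random.Random(seed)
    f0 = sum(ys) / len(ys)                      # start from a constant fit
    pred = [f0] * len(xs)
    trees = []
    n = len(xs)
    for _ in range(n_trees):
        # Draw a fresh random subsample for this iteration only.
        idx = rng.sample(range(n), max(2, int(subsample * n)))
        resid = [ys[i] - pred[i] for i in idx]
        stump = fit_stump([xs[i] for i in idx], resid)
        trees.append(stump)
        # Take a small (shrunken) step on ALL points.
        pred = [p + rate * stump(x) for p, x in zip(pred, xs)]
    return lambda x: f0 + rate * sum(t(x) for t in trees)
```

Because no single iteration sees all the data, each tree can only commit to patterns that survive the sampling noise, which is the "patient" part of the strategy.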
(insert equation)
(insert equation)
Learning Rates and Step
Size
P is called the learning rate; T is the number of trees grown
For models with the product PT = 6, the smaller the learning rate the better
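With P and T defined as above, the final model is the shrunken sum of the fitted trees. A hedged reconstruction of the standard additive expansion (this copy's equations are missing):

```latex
F_T(x) = F_0(x) + P \sum_{m=1}^{T} T_m(x)
```

Holding the total amount of learning PT fixed while shrinking P forces the model to take more, smaller steps toward the data, trading computation for accuracy.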
Model Results:
Variable Importance
(insert table)
Impact of Hand Set Age and
Time Since Retention Call
(insert graphs)
Effect of Recent Change
in Minutes Usage
(insert graph)
Effect of Tenure:
The One-year Spike
(insert graph)
References
Friedman, J.H. (1999). Stochastic gradient boosting. Stanford: Statistics Department, Stanford University.
Friedman, J.H. (1999). Greedy function approximation: a gradient boosting machine. Stanford: Statistics Department, Stanford University.
Salford Systems (2002). TreeNet 1.0 Stochastic Gradient Boosting. San Diego, CA: Salford Systems.
Steinberg, D., Cardell, N.S., and Golovnya, M. (2003). Stochastic Gradient Boosting and Restrained Learning. Salford Systems discussion paper.