Journal of Periodontology; Copyright 2016

DOI: 10.1902/jop.2016.150554

Statistical Significance Versus Clinical Relevance in Periodontal Research: Implications for Clinical Practice
Leandro Chambrone DDS, MSc, PhD*, and Gary C. Armitage DDS, MS
*Associate Professor, Unit of Basic Oral Investigation (UIBO), School of Dentistry, El
Bosque University, Bogota, Colombia; and School of Dentistry, Ibirapuera University
(Unib), São Paulo, Brazil.

R. Earl Robinson Distinguished Professor, Department of Orofacial Sciences, Division of
Periodontology, School of Dentistry, University of California (UCSF), San Francisco, CA,
Source of support: None.
Conflict of interest: The authors report no conflict of interest related to this commentary.

Statistical analyses are routinely used for the management and interpretation of data from
interventional studies in clinical periodontal research. In situations where two clinical
interventions are compared (i.e., test vs. control) and a statistically significant difference is
found in the outcome variable of interest, the null hypothesis is rejected and it is often
concluded that one treatment procedure is superior to the other.1-3 However, a
statistically significant difference, per se, does not necessarily mean that the difference is
clinically important. In evaluating the data from clinical studies, clinicians must ask
themselves the following question: "Is the statistically significant difference clinically
relevant, and does this finding allow me to advocate and use this therapy as the most
effective and safe way to manage the clinical problem?" The issue of statistical significance
versus clinical relevance is extremely important, and all areas of medicine have been
struggling with it for a long time.1, 2, 4-13 Indeed, several authors have stressed that statistical
findings are often inappropriately considered as clinically relevant.1, 2, 4-13


It has been clearly demonstrated that statistically significant differences (e.g., P-value
< 0.05) are more likely to be detected with large sample sizes than with small ones.1-3, 11
Although the use of P-values provides a direct, fast and simple way to express statistical
differences, it does not provide the magnitude of the difference (i.e., it only provides a
dichotomous answer).1-3, 11, 12 A P-value < 0.05 merely means that, if there were truly no
difference between the treatments, a difference at least as large as the one observed would be
expected to occur by chance less than 5% of the time. It has no direct bearing on whether or not the finding is
clinically important. In addition, P-values are only applicable to the study participants and
are not generalizable to a wider population.2, 3 Calculation of 95% confidence intervals (CIs)
is more useful in judging if a finding is of potential clinical relevance. The CIs can be
calculated when the mean, standard deviation and sample size of a specific outcome variable are known.
A finding that falls within the 95% CIs of a measured outcome is often clinically
important.1-3,11, 12 In addition, mean values that fall within calculated CIs are more likely to
be generalizable to a wider population.2
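The dependence of the P-value on sample size, and the additional information carried by a CI, can be illustrated with a minimal sketch. It assumes a large-sample (z) approximation, equal standard deviations and equal group sizes; the function name `two_sample_summary` and all numbers are hypothetical and are not taken from any cited trial.

```python
import math

def two_sample_summary(mean_diff, sd, n_per_group, z_crit=1.96):
    """Two-sided P-value and 95% CI for a difference between two group means.

    Large-sample normal approximation; assumes equal SD and equal group size.
    """
    se = sd * math.sqrt(2.0 / n_per_group)        # standard error of the difference
    z = mean_diff / se
    p = math.erfc(abs(z) / math.sqrt(2.0))        # two-sided P-value, standard normal
    ci = (mean_diff - z_crit * se, mean_diff + z_crit * se)
    return p, ci

# Hypothetical: the same 0.3 mm difference in PD reduction (SD 1.0 mm), two trial sizes
for n in (20, 500):
    p, (lo, hi) = two_sample_summary(0.3, 1.0, n)
    print(f"n = {n:>3} per group: P = {p:.4f}, 95% CI = ({lo:.2f}, {hi:.2f}) mm")
```

Under these assumptions the identical 0.3 mm difference is not statistically significant with 20 patients per group but is with 500, while the CI shows in both cases how small or large the true difference might plausibly be.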
In evaluating the results of a clinical study it is important to determine if changes in the
primary outcome variable will be large enough to be clinically relevant and applicable to
clinical practice. For example, in a randomized clinical trial (RCT) comparing two
nonsurgical periodontal interventions (i.e., test vs. control therapies), it was found that the
test therapy led to a statistically significantly lower (P = 0.01) mean number of
bleeding sites with periodontal probing depths (PD) ≥ 6 mm than the control therapy (e.g., 5
sites vs. 15 sites). As a result, it might be concluded that the test treatment reduced the
number of sites that were candidates for surgical therapy (e.g., open-flap debridement).


In clinical practice, it can be argued that person-centered outcomes (i.e., those that matter to
the individual) are the most important.14 Clinical attachment level (CAL) and PD are not
considered true end points, but surrogate ones.3, 14 Important person-centered or patient-reported outcomes include comfort, function, aesthetics and tooth retention. In any analysis
dealing with clinical relevance the success of a treatment approach should include patient-centered outcomes.
An often neglected patient-centered outcome that is relevant to clinical practice is how
many teeth return to periodontal health after treatment. The number of teeth that return to
periodontal health after treatment might be a better primary outcome than the reduction in
number of inflamed/bleeding deep pockets. For example, an individual tooth presenting at
baseline with 3-4 deep/bleeding pockets may still have 1 residual deep/bleeding pocket at
the final examination (and remain unhealthy), thus additional therapy (e.g., open-flap
debridement) might be needed to successfully treat that tooth.
Interpretation of the clinical relevance of surrogate outcomes is exceptionally difficult. If
four treatment approaches result in different average reductions in probing depth (e.g., 0.3 mm,
0.5 mm, 0.7 mm, and 1.0 mm) and all are statistically significant (P < 0.05) compared to
baseline values, which reductions (if any) are clinically important? Obviously, this is
difficult to answer, but it should be considered that the percentage of teeth that had a
questionable or hopeless diagnosis before therapy may inflate potential mean differences,
independent of the type of therapy the patient received.15
Evidence is clear that the number of residual deep pockets (PD ≥ 6 mm) with bleeding
on probing (BOP) after periodontal therapy is important in the prediction of disease
progression,16, 17 while the mean changes from baseline are important to establish the degree
of CAL gain. However, these can lead to another question: what percentage of sites
or patients achieved a clinically significant end result (e.g., PD < 5 mm or CAL gain > 2
mm)? Additionally, selecting the primary variable according to the inherent conditions
of the topic of interest seems to be of more clinical relevance, for example: a) the number of sites/patients needed to
be treated (NNT) to have one site/patient with a desired outcome (e.g., CAL gain > 2 mm for
a regenerative procedure); b) the percentage of sites or patients achieving a clinically significant
end result (e.g., CAL gain > 2 mm for a regenerative procedure, 100% or 80% coverage for
a root coverage procedure, PD < 5 mm after a nonsurgical or surgical procedure); and c) a combination of
multiple different outcomes in the evaluation (e.g., CAL gain and radiographic gain for a
regenerative procedure, % coverage and gain in keratinized tissue (KT) for a root coverage
procedure, PD < 5 mm and no BOP after a nonsurgical or surgical procedure).18, 19
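As an illustration of outcomes a) and b), the following sketch computes the proportion of sites reaching a CAL gain > 2 mm in each group and derives the NNT as the reciprocal of the absolute difference between those proportions (rounded up). The function names and the site-level CAL gains are hypothetical, chosen only to make the arithmetic visible.

```python
import math

def responder_rate(cal_gains_mm, threshold_mm=2.0):
    """Proportion of sites whose CAL gain exceeds the threshold."""
    responders = sum(1 for gain in cal_gains_mm if gain > threshold_mm)
    return responders / len(cal_gains_mm)

def number_needed_to_treat(rate_test, rate_control):
    """NNT = 1 / (absolute difference in responder rates), rounded up."""
    diff = rate_test - rate_control
    if diff <= 0:
        raise ValueError("test therapy shows no benefit; NNT is undefined")
    return math.ceil(1.0 / diff)

# Hypothetical CAL gains (mm) at 10 sites per group
test_gains = [3.1, 2.4, 1.8, 2.6, 0.9, 2.2, 3.0, 1.5, 2.8, 2.1]
control_gains = [1.2, 2.3, 0.8, 1.9, 2.5, 1.1, 0.7, 2.1, 1.4, 1.6]

rate_t = responder_rate(test_gains)        # 7 of 10 sites
rate_c = responder_rate(control_gains)     # 3 of 10 sites
print(f"responders: test {rate_t:.0%}, control {rate_c:.0%}")
print("NNT:", number_needed_to_treat(rate_t, rate_c))
```

With these made-up data, 70% of test sites and 30% of control sites reach the threshold, so about three sites would need to be treated with the test therapy to obtain one additional site with the desired outcome.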
Consequently, the number of teeth reaching a status of clinical periodontal health after
treatment (i.e., stable non-inflamed sites, with shallow probing depths) seems to be of greater
clinical relevance than the number of residual pockets for the assessment of effect/success of
treatment (i.e., the primary outcome measure for the clinician's decision-making process). In
other words, it is essential to evaluate/interpret whether the different outcome measures
described in a study are relevant for clinical practice.10

Different definitions can be applied to the terms "clinical relevance" or "clinical significance",
such as "a way to determine the practical value of a treatment, as opposed to the
statistical significance,"4 "the minimally clinically important difference between
groups,"7 or "a returning to normal functioning."6 Jacobson et al.6 have suggested two
criteria that are needed to establish the minimum requirements of a clinically significant
change: (1) the magnitude of the change has to be statistically reliable and (2) by the end
of therapy, clients/patients have to end up in a range that renders them indistinguishable
from well-functioning people.6 In terms of periodontal therapy, the minimum requirements
of clinically significant outcomes include those associated with potential clinical
changes/improvements after therapy (e.g., gains in CAL and reduction in PD, BOP and
mobility) as well as patient-reported outcomes (e.g., comfort, better aesthetics and ability to
To reach these objectives, the following should be considered:
1. Clinician's judgment on whether a new therapy seems applicable in terms of clinical
outcomes (Are the clinical improvements obtained with the tested therapy as effective as the
gold standard established by the periodontal literature?);
2. Patient-centered outcomes and their importance in the implementation of novel
treatment approaches in the future (e.g., adverse effects [such as discomfort, pain],
functional limitations [e.g., limitations on the chewing and deglutition of food], costs, and
patients' preferences);5, 13, 20, 21
3. Long-term impact of treatment (Do the short-term results of therapy remain in the long term?);
4. The interpretation of the evidence base from efficacy and effectiveness trials.
Statistical approaches have also been described to determine clinical relevance.
For instance, Cohen's effect size22 computes the degree of the relationship between
outcome measures as well as the magnitude of the difference between the groups based on two
mean scores (one from each group) and the pooled standard deviation of the groups
[$(\text{mean}_{\text{test group}} - \text{mean}_{\text{control group}})/SD_{\text{pooled}}$]. However, the issue of clinical relevance is more
involved with aspects related to an ample group of conditions that have been studied by
evidence-based medicine/dentistry.
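A minimal sketch of this effect-size calculation is given below. It assumes two independent groups and the usual pooled standard deviation weighted by degrees of freedom; the function name `cohens_d` and the PD reductions are hypothetical.

```python
import math
import statistics

def cohens_d(test, control):
    """Cohen's d: difference in group means divided by the pooled SD."""
    n1, n2 = len(test), len(control)
    var1 = statistics.variance(test)       # sample variance (n - 1 denominator)
    var2 = statistics.variance(control)
    sd_pooled = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (statistics.mean(test) - statistics.mean(control)) / sd_pooled

# Hypothetical PD reductions (mm) in five patients per group
test_reductions = [1.5, 2.0, 2.5, 3.0, 3.5]
control_reductions = [1.0, 1.5, 2.0, 2.5, 3.0]
print(f"Cohen's d = {cohens_d(test_reductions, control_reductions):.2f}")
```

The result expresses the 0.5 mm mean difference in units of the pooled standard deviation, which is independent of sample size but still says nothing, by itself, about whether 0.5 mm matters to the patient.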


For more than a decade, systematic reviews (SRs) of efficacy studies have been designed to
identify the best treatment options in periodontology, but these per se may not be clearly
designed to translate the current evidence into practical decision guidance for common
daily clinical scenarios.21 Efficacy is only one dimension of evidence-based dentistry, and
adoption of a specific technology requires more than this line of evidence. Efficacy
refers to the probability of benefit to individuals in a defined population from an
intervention administered under ideal conditions, whereas effectiveness is the impact in
real-world situations by assessing the benefit of an intervention provided to typical
individuals by the average practitioner under ordinary conditions.23
Systematic reviews of RCTs are important tools that are primarily designed to determine
differences in treatment approaches in interventional studies and provide implications for
future research and practice.18 The calculation of pooled estimates from different trials is
certainly a critical issue due to inherent conditions related to the protocol of the study [i.e.,
adherence to the study design, sample of patients treated, expertise of clinician(s) providing
treatment], and the number of studies included in the analysis.15, 20, 21 Meta-analyses
reporting, or not reporting, statistical differences assist the clinician in rendering a treatment
plan, but they do not certify that one therapy is much better than the other. It should be noted
that factors associated with the periodontal diagnosis, sample size, and the applied inclusion
criteria may make the pooling of data from different trials a critical issue. For instance, the
inclusion of studies with mean PD > 6 mm can increase the impact of outcome change by
promoting greater differences between baseline and follow-up means (i.e., $\text{mean}_{\text{baseline}} - \text{mean}_{\text{follow-up}} = \text{outcome change}$), an element that may impact the estimates of meta-analyses.
Thus, it is expected that studies should be combined into meta-analyses according to their
baseline characteristics in order to generate more accurate evaluations.18, 20, 21 Moreover,
both RCTs and systematic reviews may not answer key questions regarding uncommon
conditions where an adequate sample of patients may not be easily available for analysis, or
for clinical outcomes of interest that could not be reproduced in humans due to ethical
aspects. Thus, information gathered from private-practice studies should assist in answering
some research questions.
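The influence of baseline characteristics on a pooled estimate can be sketched as follows. The commentary does not prescribe a pooling method; this example assumes a simple inverse-variance fixed-effect model, and the function name `pool_fixed_effect`, the subgroup labels and every trial value are hypothetical.

```python
import math

def pool_fixed_effect(studies):
    """Inverse-variance fixed-effect pooled mean difference and its 95% CI.

    `studies` is a list of (mean_difference, standard_error) tuples.
    """
    weights = [1.0 / se ** 2 for _, se in studies]
    pooled = sum(w * md for w, (md, _) in zip(weights, studies)) / sum(weights)
    se_pooled = math.sqrt(1.0 / sum(weights))
    return pooled, (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)

# Hypothetical trials: (mean difference in PD reduction in mm, standard error)
trials_by_baseline = {
    "baseline mean PD <= 6 mm": [(0.2, 0.10), (0.3, 0.10)],
    "baseline mean PD > 6 mm": [(0.8, 0.20), (1.0, 0.20)],
}

for label, trials in trials_by_baseline.items():
    pooled, (lo, hi) = pool_fixed_effect(trials)
    print(f"{label}: {pooled:.2f} mm (95% CI {lo:.2f} to {hi:.2f})")

all_trials = [t for trials in trials_by_baseline.values() for t in trials]
pooled_all, _ = pool_fixed_effect(all_trials)
print(f"all trials combined: {pooled_all:.2f} mm")
```

In this made-up example the combined estimate (0.38 mm) describes neither subgroup well (0.25 mm and 0.90 mm), which is the reason for grouping trials by baseline characteristics before pooling.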
Effectiveness is a very significant question, which has not been explored to the same
extent that efficacy has, and the idea of effectiveness is based on outcome of therapy by
average clinicians in the community and not by research experts.21 Most practitioners are not
used to designing research protocols or conducting/running clinical studies, and only a few
will ever present the results of their work as case reports/series.21 These factors are
important and deserve distinction, because often there is a gap between protocols
investigated by experts and those carried out in clinical practice.21 As a result, other
assessment tools/criteria that take into account the available clinical evidence and expert
opinions might be considered. An example of such an assessment tool is the one defined by
the U.S. Preventive Services Task Force (USPSTF) adapted by the American Dental
Association in 2013,24 with the intent that it may guide the strength of recommendation of a
procedure as follows: a) strong: evidence strongly supports providing this intervention; b)
in favor: evidence favors providing this intervention; c) weak: evidence suggests
implementing this intervention after alternatives have been considered; d) expert opinion for:
evidence is lacking; the level of certainty is low (expert opinion guides this
recommendation); e) expert opinion against: evidence is lacking; the level of certainty is
low (expert opinion suggests not implementing this intervention); and f) against: evidence
suggests not implementing this intervention or discontinuing ineffective procedures.24
In other words, new or alternative procedures should be used when definitive
information is scarce and evidence based on clinical experience must be used to fill the gap
in knowledge until strong evidence becomes available.

The findings of a study should allow for the balanced interpretation of both the statistical
significance and clinical relevance of the data followed by a thoughtful assessment of their
clinical applicability in a practical setting.1 The use of P-values is important, but other useful
estimates/tools such as confidence intervals (CIs) can quantify the size of the differences
between groups. For instance, an honest way to evaluate the balance between significance
and relevance relies on the following question proposed by Sainani11: "Are any of the values
within the 95% CI big enough to care about?" If the answer is no, then the effect is clinically
insignificant and statistical significance is immaterial.
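This question can be expressed as a simple check of a 95% CI against a pre-specified clinical threshold. The sketch below assumes that positive values favor the test therapy; the function name `interpret_ci`, the intervals and the 1.0 mm threshold are hypothetical and would have to be set by clinical judgment.

```python
def interpret_ci(ci_low, ci_high, mcid):
    """Compare a 95% CI for a treatment effect with a clinical threshold.

    Assumes positive values favor the test therapy; `mcid` is a hypothetical
    minimal clinically important difference chosen in advance.
    """
    statistically_significant = ci_low > 0 or ci_high < 0   # CI excludes zero
    possibly_relevant = ci_high >= mcid                     # any value big enough to care about?
    return statistically_significant, possibly_relevant

# Hypothetical 95% CIs for the additional PD reduction (mm), threshold 1.0 mm
print(interpret_ci(0.18, 0.42, mcid=1.0))   # significant, but no value reaches the threshold
print(interpret_ci(0.40, 1.60, mcid=1.0))   # significant, and the CI includes relevant values
```

The first interval corresponds to a statistically significant effect that is too small to matter under the assumed threshold; the second leaves open the possibility of a clinically relevant benefit.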

Without a doubt, the use of statistical analysis is important and necessary for
clinical research. However, statistically significant differences alone between two
procedures may not be sufficient to support the use of a new therapy in private practice. An
evidence-based clinical practice clearly requires the data generated by clinical research, and
statistical analysis of the data provides a basis for the development of treatment strategies
and helps in the decision-making process. However, cost-benefit analyses and magnitude-of-change determinations of any treatment approach depend on how much of the statistical
differences can be translated into clinically useable tools for daily practice.


1. Bhardwaj SS, Camacho F, Derrow A, Fleischer Jr AB, Feldman SR. Statistical significance and clinical
relevance: the importance of power in clinical trials in dermatology. Arch Dermatol 2004; 140: 1520-1523.
2. Fethney J. Statistical and clinical significance, and how to use confidence intervals to help interpret both.
Aust Crit Care 2010; 23: 93-97.
3. Tu Y-K, Gilthorpe MS. Key statistical and analytical issues for evaluating treatment effects in periodontal
research. Periodontol 2000 2012; 59: 75-88.
4. Jacobson NS, Follette WC, Revenstorf D. Psychotherapy outcome research: methods for reporting
variability and evaluating clinical significance. Behav Ther 1984; 15: 336-352.
5. Todd KH. Clinical versus statistical significance in the assessment of pain relief. Ann Emerg Med 1996;
27: 439-441.
6. Jacobson NS, Roberts LJ, Berns SB, McGlinchey JB. Methods for defining and determining the clinical
significance of treatment effects: description, application, and alternatives. J Consult Clin Psychol 1999;
67: 300-307.
7. Beaton DE, Boers M, Wells GA. Many faces of the minimal clinically important difference (MCID): a
literature review and directions for future research. Curr Opin Rheumatol 2002; 14: 109-114.
8. Campbell TC. An introduction to clinical significance: an alternative index of intervention effect for group
experimental designs. J Early Intervent 2005; 27: 210-227.
9. Baicus C, Caraiola S. Effect measure for quantitative endpoints: Statistical versus clinical significance, or
how large the scale is? Eur J Internal Med 2009; 20: e124-e125.

10. Thirthalli J, Rajkumar RP. Statistical versus clinical significance in psychiatric research: An overview for
beginners. Asian J Psychiatr 2009; 2: 74-79.
11. Sainani KL. Clinical versus statistical significance. Phys Med Rehabil 2012; 4: 442-445.
12. Jakobsen JC, Gluud C, Winkel P, Lange T, Wetterslev J. The thresholds for statistical and clinical
significance: a five-step procedure for evaluation of intervention effects in randomised clinical trials.
BMC Medical Res Methodol 2014; 14: 34.
13. Sedgwick P. Clinical significance versus statistical significance. BMJ 2014; 348: g2130.
14. Hujoel PP, Armitage GC, Garcia RI. A perspective on clinical significance. J Periodontol 2000; 71: 1515-1518.
15. Chambrone L, Chambrone D, Lima LA, Chambrone LA. Predictors of tooth loss during long-term
periodontal maintenance: a systematic review of observational studies. J Clin Periodontol 2010; 37: 675
16. Tonetti MS, Claffey N. Advances in the progression of periodontitis and proposal of definitions of a
periodontitis case and disease progression for use in risk factor research. Group C consensus report of the
5th European Workshop in Periodontology. J Clin Periodontol 2005; 32 (Suppl. 6): 210-213.

17. Matuliene G, Pjetursson BE, Salvi GE, Schmidlin K, Brägger U, Zwahlen M, Lang NP. Influence of
residual pockets on progression of periodontitis and tooth loss: Results after 11 years of maintenance. J
Clin Periodontol 2008; 35: 685-695.
18. Esposito M, Grusovin MG, Coulthard P, Worthington HV. Enamel matrix derivative (Emdogain) for
periodontal tissue regeneration in intrabony defects. Cochrane Database Syst Rev 2005; Issue
19. Needleman IG, Worthington HV, Giedrys-Leeper E, Tucker RJ. Guided tissue regeneration for
periodontal infra-bony defects. Cochrane Database Syst Rev 2006; Issue 2:CD001724.
20. Higgins JPT, Green S. Cochrane Handbook for Systematic Reviews of Interventions Version 5.1.0
[updated September 2011]. The Cochrane Collaboration 2011. Available from www.cochrane-handbook.org (accessed on May 15, 2015).
21. Chambrone L, Tatakis DN. Periodontal soft tissue root coverage procedures: a systematic review from the
AAP Regeneration Workshop. J Periodontol 2015; 86(2 Suppl): S8-S51.
22. Cohen J. The concepts of power analysis. In: Cohen J. editor: Statistical power analysis for the behavioral
sciences. Hillsdale, New Jersey: Academic Press, Inc: 1998. p. 1-17.
23. Pihlstrom BL, Curran AE, Voelker HT, Kingman A. Randomized controlled trials: what are they and who
needs them? Periodontol 2000 2012; 59: 14-31.
24. ADA Clinical Practice Guidelines Handbook [updated November 2013]. American Dental Association
available at: http://ebd.ada.org/contentdocs/ADA_Clinical_Practice_Guidelines_Handbook__2013_Update.pdf (accessed and downloaded in 29 January 2014).

Contact : Leandro Chambrone, Rua Antonio Pinto Guedes, 626, Mogi das Cruzes, SP,
08820-430 Brazil e-mail: leandro_chambrone@hotmail.com
Submitted September 10, 2015; accepted for publication December 22, 2015.