
University of Brighton, Faculty of Health School of Health Professions

MSc in Physiotherapy

Assessing and Monitoring Infants and Young Children with Gross Motor Developmental Delay: a systematic literature review on clinical utility and responsiveness of gross motor assessment measures

60 credits Ida Hasni Shaari 09814933 Submitted in part-fulfilment of the University of Brighton degree, MSc in Physiotherapy

September 2010

Word Count: 14,925 words

Declaration of authorship

'I confirm that the work reported in this dissertation and the composition of the written dissertation has been carried out by the author named on the title page. I have read the regulation relating to plagiarism and I confirm that no part of this work has been copied from any other sources unless it is referenced. I give permission for this work to be made available for reference by future cohorts of students.'

COPYRIGHT UiTM

Abstract

Numerous gross motor assessment measures have been developed and used in clinical practice. However, there is a lack of evidence reporting on the clinical utility and responsiveness of these assessment measures. Hence, this study systematically reviewed what the evidence suggests about the clinical utility and responsiveness of assessment measures that are used to assess and monitor gross motor functions in infants and preschool children who have gross motor developmental delay. The articles were searched using AMED [Complementary Medicine: British Nursing Index and CINAHL (Cumulative Index to Nursing and Allied Health Professions)], Science Direct, ISI Web of Science, Wiley Inter Science, PEDro (Physiotherapy Evidence Database) and the Google Scholar search engine. From 244 studies, only 13 articles were included in the final review, and seven assessment measures were identified from these studies. Of these seven, only the Movement Assessment of Infants (MAI), the Test of Infant Motor Performance (TIMP), the Alberta Infants Motor Scale (AIMS) and the Peabody Developmental Motor Scale-Gross Motor (PDMS-GM) have almost complete clinical utility evidence. Meanwhile, only the BSID, the MAI, the TIMP, the PDMS-GM and the GAS have evidence available for responsiveness. The reviewed evidence indicates that the PDMS-GM can be considered a practical and clinically useful gross motor assessment measure for monitoring gross motor functions in infants and preschool children who have gross motor developmental delay.


Acknowledgements

I would like to express my gratitude firstly to my wonderful supervisor, Trish Westran, for her advice, guidance and moral support to ensure that I went through the appropriate process to produce this work. Even though this has been an incredibly difficult process that needed hard work, with her support I enjoyed it and feel that I have benefited from it. I would also like to express my deepest gratitude to Helen Fiddler, MSc Physiotherapy Course Leader, Physiotherapy Division, School of Health Profession, Faculty of Health and Social Science, University of Brighton. I am grateful for her guidance from the initial to the final stage of my MSc process. Thank you to all University of Brighton staff, mostly at Eastbourne Campus, who have assisted me in completing this work. In addition, I would like to thank my colleagues at the University of Brighton and at MARA University of Technology (Malaysia) who have been supportive throughout this MSc process. Last but not least, thank you to my parents, family and Johari Abu Bakar for being incredibly understanding and tolerant of me all the way through this MSc year.

Shaari, Ida Hasni


Table of Contents

Declaration of authorship
Abstract
Acknowledgements
Contents
List of figures
List of tables
1. Introduction
1.1 Background of the study
1.2 Literature review
2. Methodology
3. Findings
4. Discussion
4.1 Clinical utility
4.2 Responsiveness
4.3 Quality assessment
4.4 Strength, limitation of study and recommendation
5. Conclusion
6. References


Appendices

Appendix 1: Search strategy for PubMed (MEDLINE) database
Appendix 2: Search strategy for AMED database
Appendix 3: Search strategy for ISI Web of Science database
Appendix 4: Search strategy for Science Direct database
Appendix 5: Search strategy for Wiley Inter Science database
Appendix 6: Search strategy for PEDro database
Appendix 7: Search strategy for Google Scholar database
Appendix 8: Results from hand search, other internet sources and citation search strategy
Appendix 9: PEDro scale quality assessment form
Appendix 10: Summary for the inclusion and exclusion of the studies
Appendix 11: Descriptions of responsiveness quoted from different studies
Appendix 12: Checklist proposed by Jerosch-Herold (2005) for critically appraising studies related to validity, reliability and responsiveness


List of figures

Figure 3.1: Study selection process

List of tables

Table 2.1: Summary of the inclusion and exclusion criteria
Table 2.2: Summary for data required to identify clinical utility and responsiveness of assessment measures
Table 3.1: Assessment measures identified
Table 3.2: Study characteristics
Table 3.3: Evidence of assessment measures clinical utility
Table 3.4: Summary of PEDro Quality Scale scores


1. Introduction

Gross motor development can be considered an important aspect of development because it simultaneously reflects the maturation of a child's nervous system (Spittle et al., 2008). For this reason, assessing and monitoring a child's gross motor functions has been a main concern of physiotherapists working with disabled children (Palisano et al., 2006; Brenneman et al., 2008; Spittle et al., 2008). Physiotherapists need to ensure that these children are able to achieve optimum gross motor functions that are appropriate to their impairment and age. Once these children are able to perform gross motor functions, they are most likely to gain independent mobility and functional activities (Spittle et al., 2008). Hence it is important for physiotherapists to carry out a comprehensive assessment and close monitoring of these children's gross motor development.

Numerous pediatric assessment measures have been developed and used in clinical practice to assess and monitor a child's gross motor development. However, not every assessment measure has been found to be clinically relevant by physiotherapists (Spittle et al., 2008; Jette et al., 2009; Greenhalgh et al., 1998). Assessment measures are considered clinically relevant by physiotherapists when they have good clinical utility and good evidence of responsiveness (Gilmore et al., 2010; Ketelaar et al., 1998). This is because, besides being reliable and valid, an assessment measure with good clinical utility provides therapists with the theoretical background and clear information on how to administer the scoring according to established standards, which could then enhance therapist competency in carrying out the assessment procedure (Wiart et al., 2001).

Responsiveness could be important because, as well as providing diagnosis and prognosis, the information gathered should reflect patient progression or deterioration before and after attending the physiotherapy programme (Wiart et al., 2001). However, due to a lack of evidence about these criteria, pediatric physiotherapists may find it difficult to choose a clinically relevant assessment measure in their clinical practice (Gilmore et al., 2010). Moreover, currently available evidence is more focused on specific assessment measures that have been validated for certain groups of children with specific impairments and disabilities such as cerebral palsy (CP) and traumatic brain injuries (TBI) (Ketelaar et al., 1998; Thomas-Stonell et al., 2006). This evidence is more concerned with certain domains of child development such as neuromotor, functional and upper limb functions (Spittle et al., 2008; Ketelaar et al., 1998; Gilmore et al., 2010). Currently, there is limited evidence on the clinical utility and responsiveness of assessment measures that are used to assess and monitor a child's gross motor development.

The aim of this study is to conduct a systematic review to identify the available evidence for the clinical utility and responsiveness of assessment measures that are used to assess and monitor child gross motor functions. This review focuses on evidence that involves pediatric participants having or being at risk of gross motor delay. The background of this review introduces the terms gross motor development, gross motor delay, clinical utility and responsiveness. The literature review section incorporates critical reviews on the role of clinical utility and responsiveness of assessment measures.

1.1 Background of the study

Gross motor development can be described as the sequential development of motor behaviour, which includes pre-ambulatory skills such as rolling, crawling and sitting, ambulation, and advanced skills such as running, jumping and hopping (Boyce et al., 1991; Ketelaar et al., 1998). This term, however, has not been used consistently, because some authors use terms such as gross motor capacity, gross motor functions or motor milestones rather than gross motor development (Smits et al., 2010; Ketelaar et al., 1998; Boyce et al., 1991). These skills were found to be associated with the development of the somatosensory, visual and vestibular systems, together with central processing systems that help in organizing the multiple inputs for posture and movement orientation (Ketelaar et al., 1998; Boyce et al., 1991).

For the purpose of this review, gross motor development is described as the ability of a child to perform motor activities such as rolling, sitting and walking, together with the ability to maintain stability, and the development of the adaptive and anticipatory mechanisms that allow the child to maintain and change position in response to task and environment (Boyce et al., 1991; Ketelaar et al., 1998; Smits et al., 2010). Gross motor delay can then be referred to as the inability of infants or children to fully develop sequential motor activities appropriate for their age (Shevell et al., 2008). This can be objectively identified when an infant or a child has gross motor evaluation scores more than 2 standard deviations below the norm value (Shevell et al., 2008).
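As an illustration only, the 2 standard deviation criterion described above could be sketched as follows. The function name, the parameter names and the example scores are hypothetical and are not taken from Shevell et al. (2008) or from any assessment manual; the sketch assumes that "lower than 2 standard deviations" means a score more than 2 SD below the normative mean.

```python
def is_gross_motor_delayed(score, norm_mean, norm_sd, cutoff_sd=2.0):
    """Flag a score that lies more than `cutoff_sd` SDs below the norm mean.

    All names and the default cutoff are illustrative assumptions.
    """
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    # Express the child's score as a z-score relative to the norm group.
    z_score = (score - norm_mean) / norm_sd
    return z_score < -cutoff_sd


# Hypothetical example: a standard score scale with mean 100 and SD 15.
print(is_gross_motor_delayed(65, norm_mean=100, norm_sd=15))
print(is_gross_motor_delayed(75, norm_mean=100, norm_sd=15))
```

In practice the normative mean and standard deviation would come from the manual of the specific assessment measure being used.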

Gross motor development has been the main concern of parents when their child is referred for physiotherapy (Spittle et al., 2008; Boyce et al., 1991). This is because parents are concerned about the ability of their child to walk and run after the appropriate term age (Boyce et al., 1991). Physiotherapists, however, are concerned with the child's gross motor development because this part of development has a direct influence on the child's ability to develop independent function and mobility at later ages (Boyce et al., 1991). Hence, the physiotherapy assessment and monitoring of gross motor development in a child referred for physiotherapy can be considered a vital procedure (Shevell et al., 2008). An accurate and comprehensive assessment will require the physiotherapist to use sound assessment measures (Boyce et al., 1991).

Assessment measures can be described as tools used by physiotherapists to develop a hypothesis on patient conditions through clinical reasoning processes in which physiotherapists need to link theory with practice (Brenneman et al., 2008). These tools could enable physiotherapists to measure certain characteristics of the patient, such as quality of movement and the child's ability, during the initial or pre-treatment assessment session. This information is then used to formulate treatment goals and treatment strategies, and to evaluate the effectiveness of treatment and the patient's condition (Jette et al., 2009; Ketelaar et al., 1998).

Commonly, physiotherapists will focus on two aspects of assessment measures before they decide to use them. The first is the clinical utility of the assessment measures (Spittle et al., 2008; Ketelaar et al., 1998; Horner et al., 2006; Greenhalgh et al., 1998). Available evidence suggests that clinical utility can be referred to as the qualitative aspect of assessment measures, which incorporates the duration required to score the assessment items, the cost related to the purchase of the assessment manual, and the availability of the assessment and of the training that is required to use the assessment measures (Gilmore et al., 2010; Greenhalgh et al., 1998). However, different authors have described clinical utility in different ways. For instance, Tyson et al. (2009) indicate that the clinical utility of assessment measures used to evaluate balance in stroke patients includes the duration required to administer the scores, the cost required to own the assessment measures, the training required for the assessor and the transportation of the assessment measures.

Gilmore et al. (2010) suggest that clinical utility means the characteristics of the assessment measures. These characteristics are the type of population, the age range of the infants or children, the purpose, format and content of the assessment measures, the number of items assessed, the duration required to complete the assessment, the cost of and access to the manual, and the training required for the assessor. For the purpose of this review, the clinical utility of assessment measures covers information related to the target population and their age, the duration required to administer the scores, the assessment procedures used such as observation or direct handling, the domain of assessment, the scoring system, assessor training and the manual or equipment required (Tyson et al., 2009; Gilmore et al., 2010; Greenhalgh et al., 1998). This information is emphasised in this review because it could enable physiotherapists to have a better understanding of the practicality of each assessment measure. At the same time, it could enable physiotherapists to analyse the strengths and weaknesses of each measure (Wiart et al., 2001). Unfortunately, the literature has commonly given incomplete information regarding the clinical utility of assessment measures. Hence, it is hoped that this systematic review can pool this evidence collectively.

The second aspect that concerns physiotherapists is the psychometric properties of the assessment measure, which are its validity, reliability and responsiveness (Gilmore et al., 2010). The validity of an assessment measure is the extent to which the tool is able to measure what it is supposed to measure in a specific population (Ketelaar et al., 1998; Rosenbaum et al., 1990). There are four types of validity: content, criterion, construct and evaluative validity (Pountney et al., 1999; Horner et al., 2006; Spittle et al., 2008). Evaluative validity has also been referred to as the responsiveness of the measure. Responsiveness is usually considered a separate part of the psychometric properties (Horner et al., 2006). However, there is inconsistency in the literature in describing the term responsiveness.

Moreover, a variety of statistical techniques have been suggested and used to measure responsiveness (Husted et al., 2000). Responsiveness of assessment measures can be divided into internal and external responsiveness (Husted et al., 2000; Deyo et al., 1991). Internal responsiveness can be considered as the ability of a measure to detect any change that occurs in patient performance or characteristics, usually in relation to an intervention (Thomas-Stonell et al., 2006; Barak et al., 2006). External responsiveness is considered as the ability of an assessment measure to detect a clinically acceptable change in patient condition (Husted et al., 2000).

The term sensitivity has also been used to describe the ability of measures to show changes in scores which indicate patient improvement (Horner et al., 2006). However, the changes shown by the scores may not be clinically relevant or necessarily meaningful (Iyer et al., 2003). The term sensitivity could also refer to the ability of measures to detect certain conditions, such as gross motor delay, when they are present in a child (Spittle et al., 2008). Therefore, for this systematic review, the term responsiveness refers to the sensitivity of measures to show clinically important changes in patient scores over a period of time (Barak et al., 2006; Horner et al., 2006). This systematic review, however, only focuses on the general aspect of assessment measure responsiveness.

No statistical test has been suggested as a gold standard calculation to measure responsiveness (Thomas-Stonell et al., 2006). The literature suggests different statistical tests to identify the responsiveness of assessment measures. The effect size (ES) is one of the common methods used to evaluate responsiveness (Thomas-Stonell et al., 2006; Barak et al., 2006; Tyson et al., 2009; Beaton et al., 1997). The ES can be calculated using t-test statistics. This statistical test indicates the responsiveness of assessment measures through the statistically significant changes that occur in the assessment measure scores, such as before and after intervention (Husted et al., 2000; Deyo et al., 1991). Highly significant changes (small p values) are taken to indicate high responsiveness of the assessment measures.

Next, ES can be evaluated by taking the change between the baseline and follow-up mean scores and dividing it by the standard deviation of the baseline scores; this is known as the Standardized Effect Size (SES) (Deyo et al., 1991; Beaton et al., 1997). If the change in scores is instead divided by the standard deviation of the change scores, this is known as the Standardized Response Mean (SRM) (Thomas-Stonell et al., 2006). Tyson et al. (2009) conducted a systematic review to identify the psychometric properties and clinical utility of outcome measures used to evaluate balance activity in stroke patients. They used SRM statistics to identify the responsiveness of each outcome measure found. In their review, outcome measures with an effect size of 0.2 to 0.5 were considered to have weak responsiveness, an ES of 0.5 to 0.8 was considered moderately responsive and an outcome measure with an ES greater than 0.8 was considered to have good responsiveness (Thomas-Stonell et al., 2006; Tyson et al., 2009).
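As an illustration only, the two indices described above can be sketched in Python. The function names and the example scores are hypothetical and are not taken from any of the reviewed studies. The bands follow those quoted from Tyson et al. (2009); treating a value of exactly 0.5 as moderate, and labelling values below 0.2 as unclassified, are assumptions made for this sketch.

```python
from statistics import mean, stdev


def standardized_effect_size(baseline, follow_up):
    """SES: change in mean score divided by the SD of the baseline scores."""
    if len(baseline) != len(follow_up) or len(baseline) < 2:
        raise ValueError("need paired scores from at least two patients")
    return (mean(follow_up) - mean(baseline)) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """SRM: mean change score divided by the SD of the change scores."""
    if len(baseline) != len(follow_up) or len(baseline) < 2:
        raise ValueError("need paired scores from at least two patients")
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


def classify_responsiveness(effect_size):
    """Map an effect size onto the bands quoted in the text (assumed edges)."""
    magnitude = abs(effect_size)
    if magnitude > 0.8:
        return "good"
    if magnitude >= 0.5:
        return "moderate"
    if magnitude >= 0.2:
        return "weak"
    return "below 0.2 (not classified in the text)"


# Hypothetical paired scores for five children, before and after intervention.
baseline_scores = [10, 12, 14, 16, 18]
follow_up_scores = [13, 14, 18, 19, 21]

ses = standardized_effect_size(baseline_scores, follow_up_scores)
srm = standardized_response_mean(baseline_scores, follow_up_scores)
print(round(ses, 2), classify_responsiveness(ses))
print(round(srm, 2), classify_responsiveness(srm))
```

The two indices can differ markedly for the same data, because the SES is scaled by the spread of baseline scores whereas the SRM is scaled by the spread of individual change scores.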

A similar scale is used to indicate responsiveness of assessment measures using SES (Locker et al., 2004). This was indicated by Cacchio et al. (2010) who conducted a study to identify psychometric properties of the Italian version of Lower Extremity Functional Scale (LEFS) in patients with lower extremity musculoskeletal dysfunction. The authors interpreted the responsiveness of assessment measures using SES and SRM as 0.2 = small, 0.5=

the other method. This is because the cut off point gives the value which can be used as a scale, differentiating scores between improving and not improving.

Responsiveness can also be determined through correlation statistical tests. Correlation statistics such as the Pearson Product-Moment Correlation coefficient and Spearman's Rank Correlation coefficient can be considered as other simple methods to identify the responsiveness of health outcome measures (Horner et al., 2006). This could possibly be because these tests can be used to evaluate the construct validity of health outcome measures at the same time. Hence, there is a possibility that most researchers are familiar with them (Horner et al., 2006; Barak et al., 2006). The values from the two tests are reported as r for the Pearson Product-Moment Correlation and rs for Spearman's Rank Correlation. Both r and rs can be classified as small = 0.10, medium = 0.30 and large = 0.50 (Horner et al., 2006).
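As a minimal sketch, and not a method taken from any of the reviewed studies, the two coefficients and the classification quoted from Horner et al. (2006) could be computed as follows. The function names and the example data are hypothetical; pairing change scores on a measure with change on an external criterion is an assumption about how the correlation would be applied, and reading the quoted values as lower bounds of each band is also an assumption.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two paired lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two paired lists with at least two values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rank correlation: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def classify_correlation(coefficient):
    """Apply the small/medium/large labels, read as lower bounds (assumed)."""
    magnitude = abs(coefficient)
    if magnitude >= 0.50:
        return "large"
    if magnitude >= 0.30:
        return "medium"
    if magnitude >= 0.10:
        return "small"
    return "below small"


# Hypothetical change scores on a measure and on an external criterion.
measure_change = [1, 2, 3, 4, 5]
criterion_change = [1, 4, 9, 16, 25]
print(classify_correlation(pearson_r(measure_change, criterion_change)))
print(classify_correlation(spearman_rs(measure_change, criterion_change)))
```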

1.2 Literature Review

The clinical utility and responsiveness of assessment measures

Previous literature suggests that physiotherapists generally select assessment measures based on their clinical utility, reliability, validity and responsiveness (Greenhalgh et al., 1998; Spittle et al., 2008; Gilmore et al., 2010). The evidence also indicates that physiotherapists prefer to use assessment measures that require less time to administer and to interpret the scores (Spittle et al., 2008; Jette et al., 2009). This is because time-consuming assessment measures were found to be clinically irrelevant in hectic clinical situations, owing to the limited time allocated for one patient per session (VanSwearingen et al., 2001).

Jette et al. (2009) investigated perceptions and barriers that could influence physiotherapists in using standardized assessment measures in their clinical practice. The data were collected by distributing a Health Status Questionnaire developed for the study. The authors conducted a peer review process to ensure good face and content validity of this questionnaire. The questionnaire consisted of five topics related to perceptions of and barriers to using standardized assessment measures in clinical practice. The authors, however, did not mention in detail how the scores were allocated for each question. The questionnaire was first emailed to 1000 members of the American Physical Therapy Association (APTA), selected randomly by geographical area through a computerized system. After a week, a second questionnaire with a reminder letter was mailed to participants who had not answered via email. Of the 1000 questionnaires distributed, only 456 were returned. Of the participants, 90% agreed that using standardized assessment measures in clinical practice is essential to assist in developing treatment plans and to enhance communication between therapist and patient. Of the returned questionnaires, 75% indicated that some assessment measures had not been used in clinical practice because the items caused confusion, were difficult to administer and were too time consuming for both patient and therapist. This could indicate that some assessment measures require training not only for the therapist but also for the patient. This is because, even though a patient may only use the assessment measure for one session, a good understanding is still required for the patient to complete it and give reliable answers (VanSwearingen et al., 2001). Furthermore, the findings show that assessment measures can be favourable if they are not too time consuming (VanSwearingen et al., 2001).

From the survey, three main reasons why physiotherapists choose to use certain assessment measures in their daily practice were identified: the assessment can be completed in a short time, it is easy for the patient to understand and it has been shown to be valid and reliable. Jette et al.'s (2009) findings show that, beyond validity and reliability, physiotherapists were more concerned with the clinical utility aspect of the assessment measures. Based on this study, less time-consuming assessment measures could be more favourable because the participants feel that there is a high ratio of patients to therapists. Hence, a quick and comprehensive assessment measure can be more practical. Besides clinicians, researchers may also need assessment measures that can be practically administered and completed over a short duration, with follow-up procedures within a specified duration (Spittle et al., 2008).
This could possibly be because physiotherapists were only required to complete one patient session within a limited time, such as 30 minutes or less than 1 hour (Jette et al., 2009; VanSwearingen et al., 2001). For pediatric physiotherapists, practically administered assessment measures could be essential in hectic situations such as follow-up clinics (Spittle et al., 2008). In addition, evidence indicates that researchers and therapists commonly select assessment measures that require minimal training, cost less for the manual and equipment, and are easy to access (Tyson et al., 2009; Spittle et al., 2008; VanSwearingen et al., 2001). This idea, however, could be inaccurate, as sufficient training and good knowledge of the background and theory of the assessment measure could enhance the competency of users (Russell et al., 1994). Furthermore, training before using any assessment measure could help therapists to maintain the reliability of the assessment scores (Spittle et al., 2008; Russell et al., 1994).

The effect of training on the reliability of scores was investigated by Russell et al. (1994) in 79 health care providers, such as physiotherapists and paediatricians, who had many years of working experience in pediatrics. The authors conducted five separate one-day training sessions for potential users of the GMFM-88. The authors, however, did not mention whether the participants had any experience of using the GMFM prior to the training workshop. This matters because any such experience could affect the participants' scoring competency (Russell et al., 1994). In this workshop, participants were required to perform pre- and post-training scoring activities based on a case study presented through video. The same videos were shown for the pre- and post-training sessions, and discussions were carried out after each video session. The videos shown were rated by instructors prior to the workshop. This could possibly be to ensure that the correct criteria were selected for each characteristic shown by the patients. However, the background of these instructors was not mentioned by the authors. This could be an important matter because the competency level of the instructors could also affect the main scores given for each patient (Russell et al., 1994). A paired t-test showed a significant difference (p < 0.0001), and the estimated mean kappa was 0.58 pre-training and 0.82 post-training for the first-day session. This could indicate that scoring improved after training. The same statistical findings were reported for the other 4 sessions. The reliability between pre- and post-training was found to improve from 0.81 to 0.92 in estimated mean kappa. Furthermore, the authors reported that there was no significant relationship between user working experience and score reliability (p > 0.05).

The Russell et al. (1994) study indicates that training would be necessary for physiotherapists to use specific assessment measures. This may not only enhance the competency of the assessor and maintain the reliability of the scores, but may also avoid measurement error (VanSwearingen et al., 2001). This is because scoring mistakes could lead to under- or over-estimation of the child's performance, which might then lead to an inappropriate treatment plan (Spittle et al., 2008). This study does not seem to support the idea of physiotherapists choosing assessment measures simply because they require less training, cost less and are freely and easily accessible (Spittle et al., 2008).

Besides good clinical utility, physiotherapists were most likely to use assessment measures with good validity and reliability (Jette et al., 2009; Horner et al., 2006; Greenhalgh et al., 1998). This, however, might be different for pediatric physiotherapists. This is because they deal with infants and young children with gross motor delay, who may have unstable patterns of gross motor development (Spittle et al., 2008; Hsieh et al., 2009). There can be deterioration or progression of these children's gross motor skills and functions over a period of time. Hence, besides good clinical utility, validity and reliability, the assessment measures must have a stable ability to detect any changes occurring (Husted et al., 2000; Deyo et al., 1991). The evidence of assessment measure responsiveness, however, has been addressed less frequently in the literature than reliability and validity.

For instance, Miller-Loncar et al. (2005) set out to evaluate the gross motor development of 392 compatible infants with a history of cocaine exposure during the prenatal period, with 788 infants recruited as the control group. Miller-Loncar et al. (2005) used the NICU Network Neurobehavioral Scale (NNNS), the Posture and Fine Motor Assessment of Infants (PFMAI), the Bayley Scales of Infant Development-Second Edition (BSID-II) and the Peabody Developmental Motor Scale (PDMS) to evaluate infant neuromotor behaviour and gross motor skills. They probably chose to use different assessment measures for gross motor skills because no single assessment measure is available to continuously assess infant gross motor skills from 1 to 18 months (Spittle et al., 2008). The authors did mention that the assessment measures were valid and reliable. However, no evidence was given for their responsiveness, and no detail was given on the clinical utility aspect of each assessment measure. The infants were examined at 1, 4, 12 and 18 months of age. The reliability of the scores collected at each age was uncertain, as the authors did not conduct intra- and inter-rater reliability tests before the data collection.

Miller-Loncar et al. (2005) reported that infants exposed to cocaine showed poorer gross motor development compared to the control group. Moreover, these infants gradually showed statistically significant improvement in their gross motor skills throughout the study compared to the control group. Even though the findings positively indicate that the assessment measures used are able to detect changes in these infants' gross motor development, it is uncertain whether this reflects the responsiveness of the assessment measures used. This is because the authors converted the raw scores from all measures and used Hierarchical Linear Modelling (HLM) to plot the pattern of gross motor development between and within individuals throughout the study duration. It seems as if the authors had no intention of relating their findings to the responsiveness of the assessment measures.


The responsiveness for the assessment measures they used can be determined if they have not transformed all raw scores into z-scores. This is because the mean scores in between age from the initial and the last follow-up could probably be used to evaluate the responsiveness (Husted et al., 2000). If there is a statistically significant difference between the mean, this then could reflect the responsiveness of assessment measures that they have used. On the other hand, the authors may have ignored this matter possibly because the main aim of this study was to determine the pattern of gross motor development in infants exposed to cocaine without relating the findings with the ability of the assessment measures to detect changes. Overlook on the evidence of responsiveness was also found in a study conducted by Schertz et al. (2008). This study investigated the development of gross motor functions in 101 infants born with CMT. The authors mean to identify whether CMT affects infant gross motor development. Gross motor assessment was carried out using the Alberta Infant Motor Scale (AIMS) during the initial stage of study when the infants were approximately 2 months old, and the second follow-up was when these infants were 18 months old. The AIMS can be considered as one of the most commonly used assessment measures to evaluate gross motor development at infant stage. However, the authors have not clarified the responsiveness and the clinical utility aspect of this assessment measure. For the initial follow-up at age 2.9 months, only 66 of these infants were found to have normal gross motor scores. Thirty-five of them were suspected to have abnormal gross motor functions. The follow-up was done when these infants were aged 12.8, and just 8 of them were suspected to have abnormal gross motor functions. The improvement in the infant gross motor functions in this study might be influenced by the early intervention provided (Lenke, 2003). 
For this reason, it would have been better if the authors had compared the score changes between the initial and follow-up measurements for infants who attended physiotherapy and those who did not. These could then have been used to show the ability of the AIMS to detect changes brought about by natural development or by the intervention (Spittle et al., 2008). However, because no statistical test was carried out to determine the responsiveness of the AIMS, the gross motor changes recorded in this study remain open to question. Hence, it would have been better if the authors had related the study findings to the responsiveness of the AIMS to support the changes recorded. Unlike Miller-Loncar et al. (2005) and Schertz et al. (2008), Giordano et al. (2009) were aware that the responsiveness of assessment
measures can be influenced by the items of the assessment measure or by the patient's condition (Deyo et al., 1991). Giordano et al. (2009) set out to evaluate the internal and external responsiveness of three health-related quality of life (QOL) self-evaluation questionnaires commonly used with multiple sclerosis (MS) patients after an exacerbation period. They recruited 104 MS patients, aged 18 to 55 years, from four main MS centres in Italy, all of whom had received intravenous steroid therapy. The participants were asked to complete the Functional Assessment of MS (FAMS), the 29-item MS Impact Scale (MSIS-29) and the 54-item MS Quality of Life (MSQOL-54). These are self-administered questionnaires that are commonly used to evaluate QOL in MS patients. The same procedures were repeated after eight weeks. Giordano et al. (2009) used the standardized response mean (SRM) to evaluate internal responsiveness and the receiver operating characteristic (ROC) curve to evaluate external responsiveness for the physical and psychological domains of each questionnaire. The findings indicate that only the MSQOL-54 had moderate internal responsiveness for both the physical (SRM = 0.71) and mental health composites (SRM = 0.57). For external responsiveness, only the MSQOL-54 had adequate discriminative value for both the physical (area under the curve, AUC = 0.67) and psychological domains (AUC = 0.70). These findings indicate that the MSQOL-54 has good internal and external responsiveness, because an AUC above 0.60 indicates adequate discriminative ability (Horner et al., 2006). This suggests that the MSQOL-54 is able to detect changes that occur in patients, and that intravenous steroid therapy does improve the patient's condition. However, it would have been better if Giordano et al. (2009) had provided a cut-off point for the MSQOL-54 on the ROC curve to discriminate between the scores of improved and unimproved patients. 
Such a cut-off point could help physicians in clinical practice to identify which scores on this measure are clinically meaningful. An average score indicating clinically meaningful change was reported in a study by Wang et al. (2006a). The main objective of that study, however, was to evaluate the responsiveness of the first version of the Gross Motor Function Measure (GMFM-88) and its second version (GMFM-66) in repeatedly assessing changes in the gross motor skills and functions of children with CP. Earlier studies of both versions of the GMFM report good reliability and validity, but the responsiveness of these measures had not yet been determined. Like Giordano et al. (2009), Wang et al. (2006a) used ROC analysis to evaluate the responsiveness of these measures. The measures were administered by the same
physiotherapist to 65 children with CP at two separate follow-ups. Using the same physiotherapist was possibly an attempt by the authors to maintain the reliability of the scores. However, it was not stated which version of the GMFM was administered first, or whether the participants were separated into two groups. The follow-up procedure was also not clearly described, as no information was given on whether the same or different procedures were used at the follow-up session. A lack of information about the study procedure makes the study difficult to reproduce (Greenhalgh et al., 1998). After the second evaluation, the therapists were required to rate each child as greatly improved, moderately improved or not improved using the Therapist Judgement Measure (TJM), which was developed specifically for this study. This measure was used as an external standard to evaluate how closely the GMFM scores were related to another measure of improvement in the child's gross motor function. However, the authors did not mention whether they conducted a peer review process or a pilot study for this measure. Hence, it is uncertain whether the TJM provides valid and reliable scores. Wang et al. (2006a) reported that, in logistic regression analysis, the GMFM-66 had a significantly larger (p < 0.05) area under the ROC curve (AUC) than the GMFM-88. Furthermore, the GMFM-66 was found to be more significantly related to the TJM. Unlike Hsieh et al. (2009) and Giordano et al. (2009), Wang et al. (2006a) identified the cut-off points that differentiate clinically improved from unimproved patients for both GMFM versions. They reported that an average score change of 1.58 for the GMFM-66 and 1.29 for the GMFM-88 signifies a clinically meaningful improvement in the gross motor functions of a child with CP. 
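The two kinds of statistic discussed above can be illustrated with a short sketch. It is not taken from Giordano et al. (2009) or Wang et al. (2006a): the scores and therapist ratings below are invented for illustration, the function names are hypothetical, and the use of Youden's index to select the cut-off is an assumption, since the selection rule used by Wang et al. (2006a) is not described in this review. The SRM is computed as the mean change divided by the standard deviation of the change, and the AUC as the proportion of improved/unimproved pairs in which the improved child shows the larger change.

```python
from statistics import mean, stdev


def standardized_response_mean(baseline, follow_up):
    """Internal responsiveness: mean change divided by the SD of the change."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


def roc_auc(changes, improved):
    """External responsiveness: share of (improved, unimproved) pairs in which
    the improved child has the larger change score (ties count as half)."""
    pos = [c for c, flag in zip(changes, improved) if flag]
    neg = [c for c, flag in zip(changes, improved) if not flag]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_cutoff(changes, improved):
    """Change score that maximises Youden's J (sensitivity + specificity - 1).
    Assumption: Youden's J is only one of several possible selection rules."""
    pos = [c for c, flag in zip(changes, improved) if flag]
    neg = [c for c, flag in zip(changes, improved) if not flag]
    best_threshold, best_j = None, -1.0
    for threshold in sorted(set(changes)):
        sensitivity = sum(c >= threshold for c in pos) / len(pos)
        specificity = sum(c < threshold for c in neg) / len(neg)
        j = sensitivity + specificity - 1
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Hypothetical scores for eight children (not data from any cited study).
baseline = [40.0, 42.0, 38.0, 45.0, 50.0, 36.0, 44.0, 41.0]
follow_up = [42.5, 42.5, 39.75, 46.25, 53.0, 36.0, 45.0, 42.5]
# Hypothetical external criterion, e.g. a therapist's judgement of improvement.
improved = [True, False, True, False, True, False, True, False]

changes = [after - before for before, after in zip(baseline, follow_up)]
srm = standardized_response_mean(baseline, follow_up)
auc = roc_auc(changes, improved)
cutoff, youden = best_cutoff(changes, improved)

print(f"SRM = {srm:.2f}, AUC = {auc:.3f}, cut-off = {cutoff}, J = {youden:.2f}")
```

With real data, a change score at or above the returned cut-off would be read as a clinically meaningful improvement; the numbers produced by this sketch have no clinical meaning.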
Such cut-off scores could be clinically useful for physiotherapists, as they indicate the size of score change that reflects a patient's progress. Besides good responsiveness, the significantly larger AUC and the more significant relationship with the TJM could show that the revised GMFM-66 has potentially better psychometric properties, in terms of validity and reliability, than the original GMFM-88. Hence, this study could possibly show that the GMFM-66 is more responsive than the GMFM-88 in detecting changes in the gross motor function of children with CP over a period of time. The findings reported by Giordano et al. (2009) and Wang et al. (2006a) indicate that responsiveness is clearly an important property of an assessment measure. This is because it enables physiotherapists and other health care providers to establish a diagnosis and monitor changes in the child's gross motor functions over a period of time (Horner et al., 2006; Husted et al., 2000; Giordano et al., 2009). This in turn suggests that physiotherapists may need to use assessment measures with good responsiveness to
accurately measure changes in gross motor skills and functions in infants and young children with gross motor delay. Besides preterm infants and infants with neurological deficits, infants born with a congenital impairment such as congenital muscular torticollis (CMT) or congenital heart disease, and infants who have suffered an acquired injury such as traumatic brain injury, may also develop gross motor delay (Schertz et al., 2008; Schoenmakers et al., 2004; Dusing et al., 2007; Chen et al., 2004; Linder-Lucht et al., 2007). This again indicates that physiotherapists may need to thoroughly document any progression or deterioration in a child's gross motor development.

Overall, clinical utility and responsiveness can be considered important matters for physiotherapists when choosing assessment measures. However, the lack of information and up-to-date evidence about the clinical utility and responsiveness of gross motor assessment measures limits physiotherapists' ability to choose measures that are clinically relevant to their practice (Tyson et al., 2009; Jette et al., 2009; Spittle et al., 2008; Ketelaar et al., 1998). Therefore, it is hoped that this review can provide useful evidence on the responsiveness and clinical utility of assessment measures that can be used to assess and monitor gross motor development in infants and young children with gross motor delay. For that reason, the research question for this review is: What does the evidence suggest about the responsiveness and clinical utility of assessment measures that are used to monitor gross motor functions in infants and preschool children who have gross motor developmental delay?

2. Methodology

A systematic review was carried out from June to August 2010 by the author (IH) using the following electronic databases: AMED (Allied and Complementary Medicine), the British Nursing Index, CINAHL (Cumulative Index to Nursing and Allied Health Literature), Science Direct, ISI Web of Science, Wiley InterScience, PEDro (Physiotherapy Evidence Database) and the Google Scholar search engine. Other internet sources, such as the Chartered Society of Physiotherapy (CSP) website and the NHS Centre for Reviews and Dissemination website, were also used. Besides electronic searches, this review included articles found by hand searching. The hand search focused on the Journal of Physical & Occupational Therapy in Paediatrics, the Physical Therapy Journal and the New Zealand Physiotherapy Journal, because these journals were found to be most closely related to gross motor and paediatric physiotherapy assessment measures. Finally, a citation search was carried out based on the reference lists of the articles. Details of the electronic database search strategy are reported in Appendices 1 to 7.

Since the aim of this review was to identify what the evidence suggests regarding the responsiveness and clinical utility of assessment measures used to assess and monitor gross motor development in infants and young children, the key words used were as follows: assessment measures, outcome measures, measurement tool, evaluation, responsiveness, sensitivity, reliability, validity, clinical utility, milestone, milestone development, gross motor, developmental, motor functions, gross motor functions, motor performance, neuromotor, developmental delay, physiotherapy, physical therapy, preschool, rehabilitation, preterm and infants. The search was adapted for each search engine by applying truncation and Boolean operators to the key words. This included combining the MeSH (Medical Subject Headings) terms of MEDLINE/PubMed with Boolean operators.

2.1 Inclusion and exclusion criteria

The inclusion and exclusion criteria for the articles are summarized in Table 2.1. This review included three main types of study design: randomized controlled trials (RCTs), quasi-experimental studies and observational studies. The RCT group included randomised cross-over trials and cluster randomised trials. Studies included in the quasi-experimental group were non-randomised controlled studies, repeated measures studies and interrupted time series. The observational study group included cohort studies,
case-control studies and case series studies. Articles reporting a single case study, a systematic review or a non-experimental study were excluded from this review. Articles were selected if they contained at least one key word in the title and abstract. Only articles published in English and available as full text were included; articles not published in English or without an available full text were therefore excluded. Moreover, only studies that used assessment measures to assess or monitor gross motor functions in infants and young children from birth to school age were included. Articles that used assessment measures to evaluate quality of life (QOL), for screening purposes, for classification, or to measure participation, school functions or self care were excluded. Hence, only assessment measures with a gross motor domain were selected for this systematic review.

This systematic review evaluates studies that used assessment measures on infants and preschool children aged 0 to 5 years who have gross motor delay due to congenital neurological impairment, underlying medical conditions or environmental factors. This is because the evidence indicates that gross motor delay due to these factors may not only interfere with the child's movement or functional activities, but could also affect their academic performance (Sullivan et al., 2003). The age range of 0 to 5 years was selected because children at these ages usually demonstrate rapid gross motor development (Huges et al., 1981; Shevell et al., 2003). For this reason, any study that included children over 5 years old was excluded. General gross motor delay was chosen for this review because gross motor function is the main area that requires the physiotherapist's attention when a child is referred to physiotherapy (Lenke, 2003). Furthermore, gross motor skills such as rolling, sitting and crawling are the main concern of parents when their child is referred to physiotherapy (Lenke, 2003).

As a further inclusion criterion, research articles needed to provide evidence that the assessment measures used in the study were valid and reliable. Furthermore, articles were only included if the authors had investigated or described the responsiveness or clinical utility of the assessment measures used.

1. Types of study
Inclusion:
- Randomized control trial: randomised cross-over trials, cluster randomised trials
- Quasi-experimental studies: non-randomised controlled studies, before-and-after study, interrupted time series
- Observational studies: cohort study, case-control study, case series
Exclusion:
- Single case study
- Systematic review
- Non-experimental design

2. Study criteria
Inclusion:
- Full text and published in English.
- At least has one key word in the title and abstract.
- The assessment measures were used to evaluate gross motor skills.
Exclusion:
- Only abstract available and was not published in English.
- The assessment measures were used to evaluate quality of life (QOL), screening purposes, classification, measures participations, school functions and self care.

3. Study population
Inclusion:
- Ages enrolled for study from 0 to 5 years old.
- Children who had suffered from gross motor delay due to congenital neurological impairment, underlying medical problem or environmental factors.
Exclusion:
- Ages above 5 years old.
- Children that have not suffered from gross motor developmental delay.

4. Assessment measures
Inclusion:
- Need to have at least gross motor domain.
- Has published evidence of validity and reliability.
- Has mentioned OR discussed the evidence of responsiveness OR clinical utility for assessment measures used.
Exclusion:
- Does not have gross motor domain.
- Has not mentioned OR discussed the evidence of responsiveness OR clinical utility for the assessment measures used.

Table 2.1: Summary of the inclusion and exclusion criteria
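The criteria in Table 2.1 combine several conditions that must all hold, with one "OR" (responsiveness or clinical utility). The following is a minimal sketch of how such a screening rule could be written down; it is not part of the review's actual procedure, which was carried out by the author reading each article. The `Study` record, its field names and the `exclusion_reasons` function are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass

# Study designs listed under "Inclusion" in Table 2.1.
INCLUDED_DESIGNS = {
    "randomised controlled trial", "randomised cross-over trial",
    "cluster randomised trial", "non-randomised controlled study",
    "before-and-after study", "interrupted time series",
    "cohort study", "case-control study", "case series",
}


@dataclass
class Study:
    """Hypothetical record of what a reviewer noted about one article."""
    design: str
    full_text_in_english: bool
    keyword_in_title_and_abstract: bool
    evaluates_gross_motor_skills: bool
    max_age_years: float
    has_gross_motor_delay: bool
    has_gross_motor_domain: bool
    validity_and_reliability_published: bool
    reports_responsiveness: bool
    reports_clinical_utility: bool


def exclusion_reasons(study):
    """Return the Table 2.1 criteria a study fails; an empty list means eligible."""
    reasons = []
    if study.design not in INCLUDED_DESIGNS:
        reasons.append("type of study")
    if not (study.full_text_in_english and study.keyword_in_title_and_abstract
            and study.evaluates_gross_motor_skills):
        reasons.append("study criteria")
    if study.max_age_years > 5 or not study.has_gross_motor_delay:
        reasons.append("study population")
    # Responsiveness OR clinical utility is enough; both are not required.
    if not (study.has_gross_motor_domain
            and study.validity_and_reliability_published
            and (study.reports_responsiveness or study.reports_clinical_utility)):
        reasons.append("assessment measures")
    return reasons


# Invented example: a cohort study that reports clinical utility only.
example = Study(
    design="cohort study", full_text_in_english=True,
    keyword_in_title_and_abstract=True, evaluates_gross_motor_skills=True,
    max_age_years=4.0, has_gross_motor_delay=True, has_gross_motor_domain=True,
    validity_and_reliability_published=True,
    reports_responsiveness=False, reports_clinical_utility=True,
)
print(exclusion_reasons(example))
```

Returning the list of failed criteria, instead of a single yes/no answer, mirrors the way excluded articles are usually reported with their reason for exclusion.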

Data were extracted and tabulated for ease of comparison and analysis. The study characteristics analysed were type of population, assessment measures, and evidence of clinical utility and responsiveness. For clinical utility, the assessment measures were analysed in terms of seven criteria (Gilmore et al., 2010; Tyson et al., 2009):

1) Target population and age
2) Duration required to administer the measure
3) Assessment procedures such as observation or direct handling
4) Domain area of assessment
5) Scoring system
6) Training for assessor
7) Manual or equipment required

Table 2.2 summarizes the characteristics required to identify the responsiveness and clinical utility of assessment measures. The statistics used to indicate the responsiveness of assessment measures can vary between studies (Thomas-Stonell et al., 2006). For this review, responsiveness was identified through paired t-tests, correlation analysis, regression models, effect size statistics (the standardized effect size (SES), the standardized response mean (SRM) and Guyatt's responsiveness index) and the receiver operating characteristic (ROC) curve (Husted et al., 2000).

2.3 Quality Assessment

Quality assessment was carried out using the adapted PEDro (Physiotherapy Evidence Database) Scale (Appendix 9), created from the Delphi consensus. This tool was chosen to evaluate the quality of each study because it is relevant to physiotherapy practice and has been shown to be a reliable scale for evaluating the quality of a study (Maher et al., 2003). The scale rates a study on eleven separate criteria; each criterion carries one point, giving a maximum total score of 11. The eleven criteria include eligibility criteria of populations, random allocation of subjects, concealment of allocation, baseline prognostic similarity for each subject, blinding of subjects, blinding of the therapist who provides the treatment, blinding of the therapist who administers the measurement scores, the study key