
CAVENDISH UNIVERSITY ZAMBIA

BACHELOR OF BUSINESS ADMINISTRATION

(BBA)

Research Methods (BBA413)


Module

Dedication

To all those who have supported me all along during the enduring and toiling times of pursuing further studies.

Above all, my family will always be appreciated and cherished.

Contents

1 Introduction

2 The Research Problem

3 Review of Literature

4 Hypotheses and Research Questions

5 Populations and Samples

6 Research Design

7 Reliability and Validity of Measuring Instrument

8 Methods of Data Collection

9 Writing Research Report

10 References

Preface.

Most individuals spend a great deal of time reading books, yet have never thought about how such books come into being. At the same time, the world faces many challenges that must be carefully investigated so that possible remedies can be sought. The challenge of producing enough people able to address such problems through research is a continuing one.

The author, Hasen Kasolo, who has been a distinguished student and is now a respected colleague in the promotion of quality education and social research, has done commendable work in putting together this volume.

The readership of this book spans the undergraduate researcher through to the postgraduate researcher, and each group will find the concepts treated to be relevant. It is therefore neither too elementary nor too advanced.

Under such conditions, this book ought to be timely for interested students who see research as a possible career destination.

Professor Duncan Adepoju


Makerere University.

One: Introduction
1.1 The Nature of Social and Educational Research

Social research is systematic inquiry into the social process. Inquiry involves the application of scientific procedures to the study of social problems. The scientific procedures employed include problem definition, hypothesis formulation, data collection, data analysis, and testing of the hypothesis. The objective of social research is to confirm existing knowledge and to discover new facts and general principles for explaining, predicting and controlling events in social situations.

Whereas research in the physical sciences began in the seventeenth century, social research
emerged in the late nineteenth century. It was not possible to conduct scientific research in social
settings until the necessary scientific instruments for measuring variables of interest to society were
developed.
There was a need to measure such variables as poverty and disease, income and its distribution, gender issues, unemployment, families and dependence, attitudes and motivation, personality, anxiety, self-esteem, behaviours, and more. Tools were needed for the systematic observation of social events.

1.2 Justification for Social Research.

Research in the social sciences is justified on the basis of the ever-present problems of the social
process. These problems require solutions based on convincing evidence. Research is the only
credible basis for making rational decisions in the face of several perplexing situations and
conflicting practices that abound in education. These perplexing situations and conflicting practices
are addressed in educational research from theoretical and practical viewpoints.

Issues that have been investigated over the years include effective monitoring, implementation and evaluation methods; materials and media; appropriate leadership styles in social settings; effective social strategies and intervention techniques; validity and reliability of measurement and evaluation instruments and techniques; control of discipline, behaviour and conduct problems in social institutions; learning implications of different philosophical traditions and cultures; and historical, philosophical, psychological, and sociological explanations for observed social phenomena and trends.

Methods of investigation have included direct observation, experiments and quasi-experiments, documentary analysis, surveys and tests. Taking just one of the issues mentioned, namely effective methods of instruction, the accumulated findings illustrate the need for research.

Such findings show clearly that attempts to enhance individuals through formal education and instructional practices have a better chance of success if they are based on sound research evidence rather than on conventional wisdom, fashionable ideologies, or individual insights.

Thus several problems present themselves for solution in the area of effective methods of instruction alone. If this is only one of many social issues that require research, then myriads of problems exist to warrant social research.

1.3 Types of Research

Depending on the goals and methods of investigation, research may be classified into different types.

1.3.1 Classification on the basis of goal.


Based on the goal of the particular investigation and the use to be made of it, research may be
classified into basic and applied research.

a) Basic Research:
This is applicable when the aim is to collect and use empirical data for theory formulation. By
investigating relationships between methods, personal characteristics, environmental variables and
learning efficiency, theories of learning are developed, illustrated, tested and expanded. The utility
of findings is not the concern of basic research. However, the findings may eventually be applied to
practical problems of social value.

b) Applied Research
This aims at solving immediate problems. Its findings help leaders to make rational practical
decisions about specific problems. Applied research may be subdivided into three subcategories,
namely, action research, evaluation research, and research and development.

i) Action Research is undertaken by practitioners to solve their local practical problems.


For example, teachers may undertake action research to examine instructional practices;
principals may undertake action research to examine administrative practices;
supervisors may undertake action research to examine evaluation practices.

ii) Evaluation Research is used to evaluate social programmes in order to improve their
efficiency by making necessary revisions or modifications. For example, systematic
evaluation of the practice of continuous assessment in schools may necessitate
modifications in the requirements of the Ministry of Education. The change to the 6-3-3-4
educational system ought to have been based on systematic evaluation of the preceding
system. Similarly, the change to the 6-5-4 system ought to have been based on systematic
evaluation of the preceding 8-5-2-3 system.

iii) Research and Development is concerned with developing and testing methods and
materials to ensure maximum efficiency of products and practices. Basic and applied
research are interdependent. While theory is applied to solve practical problems, the
findings of applied research may provide the necessary insight for theoretical
formulations, theory testing, and theory refinement.

1.3.2 Classification on the basis of method of investigation

Another classification system for educational research is based on the method of investigation used. On the basis of the methods used to investigate social problems, the following types of research may be distinguished.

a) Experimental Research
In experimental research, independent variables are manipulated and controlled, and their
effects on the dependent variable are observed. It is said that experimental research serves
to determine possible outcomes given certain conditions. Experimental research embraces
most studies in which subjects are randomly assigned to different treatments whose effects
are observed. An example is the random assignment of subjects to instruction in the mother
tongue or instruction in a second language; the effect of the language of instruction on
student learning is then observed.

b) Ex post facto Research

Ex post facto research is an attempt to conduct an experimental study in which the
investigator is unable to directly manipulate the independent variables. Usually, the random
assignment of subjects to different treatments is not feasible. Instead, the subjects may be
grouped on the basis of some naturally occurring characteristics. Non-manipulated variables
include sex, race, intelligence, aptitude, creativity, teacher personality and socio-economic
status. An example of ex post facto research is a study in which intact classes in a school are
taught using different methods and the effect of teaching method on student learning is
measured. The students are not randomly assigned to classes. Ex post facto studies are
sometimes referred to as causal-comparative studies.

c) Descriptive Research

Descriptive studies make no attempt to manipulate variables. Their concern is to describe processes
and trends or to compare variables. Descriptive research may be divided into several
categories such as surveys, documentary analysis, and case studies. If the entire target
population is studied, the survey is called a census. For example, a census of the introduction of
condoms in all secondary schools or a census of the opinions of all university teachers on
cost sharing in tertiary education may be taken. If the entire population is not studied, it is
important to use an appropriate sampling technique to obtain a truly representative sample.
Representativeness of the sample is critical to survey research; otherwise, reliable inferences
about the target population cannot be made from the sample. Other important attributes of
good-quality survey research are the relevance of the information collected to the research
questions, accurate enumeration, appropriate and accurate measuring instruments for the
constructs or variables of interest, and accurate data collection procedures.

Surveys may be cross-sectional or longitudinal. In cross-sectional surveys, data is collected
at one point in time; the duration may be one day or one or more months. In longitudinal
surveys, data is collected at different points in time. Cross-sectional surveys may reveal
relationships that are time-bound, such as between attitudes in urban and rural settings, or
time-ordered, such as between attitudes in successive classes. Longitudinal surveys are
superior in revealing such relationships because they use the same population at different
points in time. The same general population may be sampled as in trend studies; a specific
population may be sampled as in cohort studies; while the same individuals are followed up
at successive periods of time in panel studies.
Unlike the situation in sociology, surveys do not contribute appreciably to the body of
organized knowledge in education.

iv) Documentary analysis is used to examine documents and records for relevant
information. Such information may be obtained from official gazettes, minutes of meetings,
reports of panels and blueprints.

v) Case studies investigate in detail individual cases or aggregations of individual cases
treated as units. Thus, case studies may be carried out by studying a phenomenon in one
school, one association, agency or organization, one student, one teacher, or one
administrator. Case studies are used to solve specific problems through in-depth study. The
issues are usually remedial.

Case study samples are not representative, so their findings are not generalizable.
However, in-depth study may reveal relationships that merit investigation on a wider scale.
For example, the foundations for Jean Piaget's theory of intellectual development were
rooted in case studies of intellectual maturation. The major limitations of case studies are
non-representativeness and subjectivity.

d) Historical research

Historical research involves the location, documentation, evaluation, and interpretation of
available evidence in order to understand past events. The understanding of past events may
lead to greater understanding of present and future events, prevent future pitfalls, or suggest
hypotheses for the solution of current problems. It may focus on social concerns, educational
practice, illiteracy, politics, urbanization and many more.

Historical research relies on evidence from documents, relics, artifacts, records, and oral accounts.
In education, such evidence may be sought from certificates, remains of manuscripts, equipment,
attendance registers, inventories, report cards, and records of news talks. When evidence comes from
a direct source such as original documents, photographs, or records of witnesses, it is referred to as
a primary source.

After obtaining evidence, the historical researcher is faced with the additional task of determining its
authenticity. This process, called historical criticism, may be external or internal. External criticism
is concerned with the authenticity of the source of information. Internal criticism is concerned with
the authenticity or validity of the information provided by the source.

Historical research has more limitations than either descriptive or experimental research. Whereas
experimental research is able to control the independent variables, dependent variables, treatment,
sample and measuring instruments, and descriptive research is able to control the dependent variables,
sample and measuring instruments, historical research has no control over any of the events or objects.
Past events which were not recorded, and for which remnants do not exist, are inaccessible to the
researcher. Of the accessible events, the reports are often colored by personal bias.

A perusal of current newspapers and magazines will reveal that reporters' accounts of events tend to
emphasize facts of interest to the reporters and to ignore other facts. Think, for instance, whether a
government-sponsored document would be likely to highlight the fact that workers' salaries were being
paid four months in arrears, an observation that is needed for a greater understanding of other events.

An overview of these forms of research may be represented thus:

A. Goal-based classification

   Basic research
   Applied research
      Action research
      Evaluation research
      Research and development

B. Method-based classification

   Experimental research
   Ex post facto research
   Descriptive research
      Surveys
      Documentary analysis
      Case studies
   Historical research

Two. The Research Problem
2.1 Definition

There is usually a problem at the root of every research. The problem is established to justify the
research. Investigation is designed to offer solutions to the problem.

What then constitutes a researchable problem? A problem arises when the interplay of two or more
factors results in one of three possible problematic outcomes. The three problematic outcomes are a
perplexing state, an undesirable consequence, or a conflict for which the appropriate course of
action is controversial. A problem solution then involves clarification of the perplexing state,
elimination or alleviation of the undesirable consequence, or resolution of the conflict.

Examples of a perplexing state, an undesirable consequence and a conflict arising from the interplay
of factors include the following:

(a) What socioeconomic index can researchers uniformly apply to classify groups into social
classes? There is difficulty in obtaining a socioeconomic index that can
uniformly be applied to sort any country into social classes. This is a perplexing state
for researchers who wish to use socioeconomic status as a variable. The perplexing state
arises from the interplay of unreliable socioeconomic indices and the need to classify a
population by socioeconomic status. A problem solution is then an attempt to evolve
indices that can uniformly be applied to classify that country by social status. Such
indices would then provide clarification about the component elements and the hierarchy
among the elements of social stratification in that particular country.

(b) Females are underrepresented in most social systems. This leads to a reduction in the
number of females who can possibly occupy desirable positions in government. This
then is an undesirable consequence. A problem solution would seek ways of increasing
female representation in such systems and in government.

(c) Females are traditionally socialized through their role as home-makers rather than
through their careers. A conflict then arises in getting females into careers that
contribute nothing or even negatively to homemaking. The interacting factors are the
socialization pattern for females and the need to get them enrolled into careers
traditionally stereotyped as male domains. A problem solution might then seek ways of
altering the socialization patterns of females.

2.2 Problem Statement

Common errors in problem formulation arise when conditions, objectives, questions, hypotheses,
topics, or an uncomfortable feeling are mistaken for problems. For example, the non-existence of a
reliable socioeconomic index is a condition, not a problem. The difficulty in classifying a country

into social classes as a result of the unavailability of a reliable socioeconomic index is the problem.
A frequently occurring manner of wording faulty problem statements is to begin with the phrase
"the purpose of this study is to …". What is subsequently expressed is an objective, not a problem.
An objective is a step along the way to achieving a problem solution. It cannot therefore be the
problem.

In order to counteract these errors, the essential elements of a problem statement are listed below.

(a) Firstly, a problem statement should establish the existence of factors whose interaction
yields a problematic outcome.
(b) Secondly, it should relate the problem to its educational, scientific, economic, or social
antecedents. In doing this, the origin of the problem is traced to its present context so
that the reader can appreciate the problem. For example, attempts by a principal to
reduce truancy rates through corporal punishment fail, and she considers it worthwhile to
search for alternative solutions. When the problem is related to the established fact that
truancy influences performance significantly, the reader appreciates the need for
spending resources to search for better alternatives to corporal punishment. It is often
the antecedents that make the research worthwhile.
(c) Thirdly, the problem statement qualifies the problem in terms of any special
circumstances prevailing when the investigation is conducted. For example, workers'
strikes may be prevalent at the time when a study is carried out in organisations. This
may have important consequences for the research. If no special circumstance
influenced the study, this third element would be omitted.
(d) Finally, the problem statement should justify the significance of the problem. This
justification makes it evident that scarce resources are not being wasted in the study.
The impact of new knowledge expected from the research should be clear.

2.3 Sources of Problems

How does a researcher locate or identify problems? This is a thorny question for most beginning
researchers. It is useful to suggest that research problems may be generated from any of three main
sources, namely, experience, literature and theory.

2.3.1 Experience

As the beginning researcher examines his/her environment and daily experience as a student in
school, a teacher, an inspector, or another participant, certain problematic situations are observed.
These observations are very valid sources of researchable problems. For example, when a teacher
tries the inquiry approach to teaching and finds that classroom management problems are increased
manifold in contrast to the traditional approach, she is confronted with a conflict, a real problem
about what is most expedient. The inspector might observe that, in spite of in-service programmes,
workers' practices and conduct remain unchanged. A perplexing state arises as to whether in-service
training or some other factor is the needed approach for changing such practices. The usual
observation of poor performances among students at the GCE examinations is an undesirable
consequence resulting from the interplay of certain identified and unidentified but certainly
unameliorated factors. Corporal punishment may not reduce truancy rates. The principal finds
herself in a perplexing state over what actions would serve better purposes than corporal
punishment. An educational administrator may observe that the creation of single-sex schools does
not improve the performance of girls as expected. In each case problems arise which warrant
investigation.

Other avenues for generating problems through experience include brainstorming and discussion
with colleagues and supervisors; participation in team research projects organized by institutes,
departments, or international organizations; and participation in workshops, seminars and conferences.

2.3.2 Literature

Research literature, such as periodicals, dissertations, and conference and workshop reports, would
suggest areas requiring further research.

Apart from suggestions for further research, the reader of research literature encounters
contradictions, inconsistencies and unsatisfactory findings in an area of investigation. The reader
can then design research to fill in the gaps created by the inconsistencies and unsatisfactory
findings. The new design could contribute knowledge through improved methodology or
modifications to existing theory.
Potentially researchable problems can also arise from voracious reading of literature, which usually
generates ideas. Sometimes historical research could spring from the search for a missing link in
the literature.
One of the major contributions of literature as a source of problems is the suggestion of replication
studies. Repetition of an earlier study encountered in the literature increases the generalizability
and validity of its findings. The repetition of a study using different geographical contexts, different
subjects, different educational levels, different time periods, different methodology, or different
instruments serves the purpose of confirming and extending its findings.

2.3.3 Theory

Theories propound general principles whose applicability to social situations requires
research. Theories are thus fertile sources of research problems. From theories, relationships
among variables can be predicted and tested. For example, Piaget's theory of intellectual
development was a springboard for the suggestion that there could be a mismatch between the task
demands of the school curriculum and the level of intellectual development of secondary school
children. The educational antecedent of the interplay between the curriculum task demands and the
level of intellectual development of secondary school students is that poor performance among
students is partly due to the effect of the mismatch between the two factors. Maslow's hierarchy of
needs theory would act as an eye-opener to employers that workers' needs must be fulfilled if they
are to get the best out of them.

2.4 Criteria for Problem Selection.

Faced with several potentially researchable problems, how does a researcher select the most
appropriate one for investigation at a given point in time? This can be a thorny question for beginning
researchers. Sometimes a start is made on one problem, which is abandoned for another one. Trials
may be made on several other problems in succession before settling down to the chosen one. This
causes undue delay in conducting the research. While subjective decision is involved in the
selection of an appropriate problem, a few guidelines would help to prevent some pitfalls. If
research problems are evaluated by the criteria below, one is likely to select the most expedient
problem.

2.4.1 Significance

The ultimate goal of research is to enrich knowledge. It is desirable to select a problem whose
solution would make the most valuable contribution to the body of organized knowledge. The
contribution could be in the area of methodology: methodology which improves on that used in
earlier studies may yield more reliable knowledge. Otherwise, the new contribution could be in the
area of theory, practice, or replication of existing findings. Existing theories and relationships may
be modified, refined or replaced. If the solution of the problem makes no difference to theory or
practice, it should at least be a replication which improves the generalisability of earlier findings. As an
illustration, the relationship between aptitude and achievement has long been well established. A
study which arrives at the same relationship would be considered trivial because no new
knowledge is uncovered. However, later studies modified the relationship by contributing the
knowledge that aptitude could interact with other attributes of the students and the teacher to affect
achievement. This moderating effect is known as 'attribute-treatment interaction'. The proliferation of
attribute-treatment interaction studies led to the discovery that the relative efficacy of different teaching
methods is not universal but depends on different student and teacher characteristics
and different psychosocial contexts.
It is worth mentioning as a note of caution that case studies of local problems are very low on the
criterion of significance. This is because their findings cannot be generalized beyond a particular
institution or group of individuals.

2.4.2 Researchability

Some problems involve questions which cannot be subjected to systematized study. For instance,
one cannot study the influence of the Holy Spirit on the academic performance of priests.
Researchable problems involve variables that can be defined and measured. In general, many
philosophical and ethical problems cannot be studied empirically. The contribution of research in
such cases is to yield useful information that can be used to find answers to philosophical and
ethical questions.

Sometimes unresearchability results from the unavailability of the required measuring instruments and the inability of the researcher to develop and validate novel instruments.

2.4.3 Genesis of Further Research.

A viable problem is one that can be expanded or followed up in further research, rather than a dead
end. As the research question is answered, further questions ought to be generated that require
investigation. In general, descriptive surveys and correlation studies tend to fail on this criterion.

2.4.4 Suitability.

A final screen through which a problem should pass is suitability to the peculiarities of the researcher.

Firstly, the problem ought to be relevant to the professional goal of the researcher. It should
make him more knowledgeable and more proficient in his career.

Secondly, the researcher should find the problem meaningful and interesting. This would
make the researcher enthusiastic enough to investigate the problem thoroughly and to persevere to the
end of the research.

Thirdly, the problem solution should be within the level of competence of the researcher. If
the researcher is not knowledgeable in the use of the relevant instruments and cannot acquire the
expertise within a reasonable time, he lacks the competence to investigate the problem. In addition to
the relevant skills and competencies, the researcher needs to be well acquainted with existing
theories and concepts in the area.

Fourthly, the researcher should consider the availability of the required manpower,
equipment, finance, and other resources. Some problems involve so many variables that only large-
scale studies by a team of researchers having enormous amounts of funding can tackle them.
Individual researchers cannot take cognizance of all relevant variables in order to yield meaningful
findings.

Fifthly, the time required for an appropriate problem solution should be realistic for the
researcher's programme. Research undertaken for a degree usually has time limits for completion.
Such time limits should be considered while choosing an appropriate problem.

Finally, the researcher should consider the accessibility of the subjects and the data. One of
the author's research students was compelled to abandon a project because she could not obtain the
scores of candidates in the different papers offered for the Common Entrance Examinations. Only
aggregate scores were accessible. It is also common knowledge that certain individuals are difficult
to reach. For example, a sample of ministers or governors is not easily accessible to the ordinary
researcher.

Three. Review of Literature
3.1 Definition

Literature review is the systematic study of all existing work that is relevant to the research. It is
concerned with locating, reading, evaluating and citing reports of related research.
The purpose of literature review is to succinctly evaluate the state of the art in the area under
investigation.

3.2 Usefulness of Literature Review

Imagine an ignorant researcher expending a lot of effort and resources only to find out that the
world is round; such a researcher would have profited from reviewing previous work. This
illustration elucidates the usefulness of literature review. By reviewing literature, the researcher
ascertains the extent to which previous research has addressed the problem at hand. Findings already
established are noted. Research questions for which answers have been found are eliminated.
Significant issues are identified, thus preventing redundant, irrelevant, wasteful, or trivial research.
Some topics in research tend to be over-flogged. Research should contribute new
knowledge, but sometimes it is unnecessary duplication and a waste of time and resources.
Another usefulness of literature review is that it helps to avoid areas that do not hold
promise for meaningful findings. An example of educational research which does not hold promise
of meaningful findings but which attracts many students is the search for a superior method of
teaching. Process-product studies in which different teaching methods are compared for relative
efficacy are affected by variables too numerous for individual researchers to contend with. Large-
scale studies show that all methods of teaching are appropriate on different occasions, with different
kinds of students, and at different levels of education. Research which aims at establishing one
method as superior under all conditions is suggested by previous research to be unproductive and futile.

A third usefulness of literature review is that it guides the researcher to the adoption of a different and
better methodology for investigating the problem than that earlier planned. With improved
methodologies, apparent inconsistencies in earlier findings may be resolved or missing links may be
identified. For example, earlier research showed inconsistencies in the effects of different school
types (co-educational, single-sex, private, public) on the achievement of female students. By
employing improved methodology, later research identified a missing link, namely socio-economic
status, which interacted with school type. Improved methodology may apply to the design of the
study and/or the analysis of the data. Literature review thus creates awareness of alternative
procedures, which may help to enrich or refine the research methodology.
In addition to enriching the methodology, literature review enriches and refines the
hypotheses, the instruments for data collection, and the references.
During literature review, the researcher's understanding of the issues involved is broadened.
This places the researcher in a better position to proffer intelligent guesses about the tentative answers
to the research questions (hypotheses). Literature review thus serves to sharpen the focus of the
researcher and to generate plausible hypotheses.

Fifthly, literature review grants the researcher an awareness of the range of appropriate
instruments available for data collection. Such instruments may be indigenized intelligence tests,
vocational inventories, personality measures, or attitude scales. The researcher may then only need to
adopt or adapt instruments rather than create new ones. The use of previously validated instruments
allows for comparison of results across different studies. Opportunity is also provided for instruments
needing further validation to be reassessed.
Sixthly, references that may not have been encountered during library search can be obtained from
the articles reviewed.
By making available to the researcher the input, information, mistakes, problems
encountered, and further suggestions from previous studies, literature review enables the researcher
to save time. Otherwise, he may have to make the same mistakes and spend more time to correct
them. Herein lies the seventh advantage of literature review, that is, saving the researcher time and
preventing possible pitfalls.
An eighth advantage to be gained from literature review is the delimitation of the research: the review
may suggest the choice of a smaller geographical area, a smaller sample size, the administration of fewer
instruments, or the study of fewer variables.
Finally, literature review may help to provide or justify a theoretical framework for the study.
For example, psychological theory provided a theoretical framework for the development of
systems for the analysis of sequence in classroom verbal discourse (Anderson, 1966). Herzberg and
Maslow propounded motivational theories relevant to the workplace environment. Erikson's ego identity
achievement model was the basis of the theoretical model used to explain the pattern of uptake of science
by adolescent boys and girls (Head, 1980). Piaget's theory of intellectual development provided a
theoretical framework for the development of taxonomies for assessing the demand level of curriculum
material (Shayer and Adey, 1981).

3.3 Procedure for Literature Review

The first step in the systematic study of literature is to select the key words which are used to
access relevant works. Research topics such as pornography among the urban youth, disability
discrimination at the workplace and in school, attitudes towards and stigmatization of HIV/AIDS victims,
or examination malpractice in secondary schools, for example, all have key words that may be
carefully selected to arrive at a more researchable statement. In the last example, the key words may include
examination malpractice, examination agencies, senior secondary school examinations, junior
secondary school examinations and secondary schools. If the research focuses on teaching methods,
the key words may include teaching methods, instruction, learning and teacher.

The list of selected key words is used to access relevant references from such preliminary sources of
information as library catalogues (subject and author indexes), education indexes, current indexes of
journals in social issues and education, dissertation abstracts, or computerized referencing services.
Having obtained the list of useful references, the researcher is ready to consult libraries and other
information depositories in order to access the referenced materials. Certain materials may be
located at such distances that the researcher cannot reasonably travel to access them. She can then
apply to have such materials sent on inter-library loan through the local library. A fee may be paid
for the inter-library loan service but this is always cheaper than having to travel.

Before reading any material, the researcher should purchase index cards on which to record the
information. Index cards are of various sizes; for example, some measure eight inches by five
inches or six inches by four inches. The larger-sized cards make it possible to record virtually all
essential information without recourse to other paper material. The author sometimes prefers to
continue the recording of essential information on other paper material. The advantage is that all
essential information recorded from one reference is more easily located in one place.
While writing out references on the index cards, it is useful to choose a referencing style from the
word go and to be consistent in using the chosen style to record all references. Conventional
ways of documenting references are discussed in fuller detail in the chapter on preparing a research
report. The advantage of recording references in the standard form in which they would appear in the
research report is that it saves time and energy. The researcher would not need to write out a list of
references for the typist. She simply arranges the index cards in the order in which the references
would appear on the reference or bibliographical list and makes the cards available to the typist.
Some index cards come in a storage box which has coded demarcations for filing the cards by the
initial letters of the authors' names. This enhances easy access to any entry when needed.

After writing out references on the index cards, the researcher is ready to locate the actual materials.
The assistance of librarians may be needed. Primary sources of information should be sought
since they provide the researcher with comprehensive, unadulterated and unmutilated first-hand
information.

The most recent references should be consulted first. The reason is that the information provided by
recent materials may supersede that in earlier references. At the very least, they would embrace the major
ideas of earlier work. In consulting recent materials first, time is saved because some notes that
would have been taken down from earlier references may become redundant and unnecessary.
Recent materials also provide further references to relevant works which were overlooked earlier.

It is useful to first peruse the abstract and/or summary of any reference consulted to quickly
ascertain the relevance of the material. Time should not be wasted on reading entire works, which
are not particularly relevant to the research. Materials that pass the screening test should be read in
their entirety so as to obtain a correct picture of the ideas portrayed. The relevant materials may
include theoretical discussions, critiques, and empirical studies. All relevant notes are then taken
down using the index cards. Any page numbers that would be cited should be noted.

The final stage in literature review involves the organization and write-up of the insights gained
while reading. The write-up should be a critical appraisal of the state of the art in the area being
investigated. It ought to show that the researcher clearly understands all the related issues. The
final point at which all the discussion converges is the pertinence of the proposed research in the
light of the present state of affairs. A common error among students carrying out research is the
inclusion of several bits of information extracted from literature without any linking chains or
binding ideas and without pointing out the pertinence of the reviewed literature to the proposed
research. Any ideas that sound nice are just put together in an unrelated list of information pieces and
labelled a literature review. A literature review must flow in accordance with a trend or trends of thought.
Different trends of thought may be organized round subthemes. The linking thread that ties them all up
is their bearing on the proposed research.

3.4 Citation

Any ideas lifted from referenced material, whether quoted verbatim or paraphrased, as well as diagrams and
tables, must be duly acknowledged by citing the source. Careless statements of the kind encountered in the
dailies and some magazines are not appropriate in research. The sources are cited in the text in the
manner elaborated below, while full bibliographical details of the references are listed at the end of
the write-up. The appropriate position for citing the source depends on the context.

For a paraphrase, the source may be cited in the middle of a sentence, at the end of a sentence, or at
the beginning of a sentence thus:

Derived scores have been shown to be more useful than raw scores in the identification of students’
potentials (Nana, 1992).

Or

Okpala (1995) sensitized readers on how to articulate the various stages of a research project.

For verbatim quotations, the quotation is enclosed in double inverted commas or quotation marks.
The source is cited in a similar manner to the paraphrase except that the page number from which
the statement was lifted is inserted after the date of publication thus:

According to Nana (1979:23), "The measurements were not absolutely dependable; however,
otherwise all ten results would have been exactly the same."
Or

"Derived scores are more useful than raw scores for identifying the potential of students" (Nana,
1992:54).

Okpala (1995:1) stated that the essence of his paper was "to sensitize readers on how to articulate
the various stages of a research project".

In the above examples, the quotations are lifted from pages 23, 54 and 1 respectively.

Quotations longer than three typewritten lines should be indented, starting on a new line and
centering the quotation between the two margins.
Such quotations are not enclosed in quotation marks but are typed with single spacing. An example
is given below:

In the words of Nkpa (1990:2),

Teachers of young children in nursery and primary schools can help to modify the gender role
socialization received by girls in childhood by providing learning experiences in schools, which
help to foster interests in science and technology among girls.

Verbatim quotations must contain the exact wording, spelling, capitalization and interior
punctuation of the original source. If changes are effected, for example by italicizing or underlining
words for emphasis, the words "italics added" should be enclosed in brackets immediately after the
italicized words, thus:

Using derived scores (italics added), a graphic or pictorial representation of a student’s performance
on different assessments may be produced. This representation is known as a profile. Profiles are
invaluable for detecting a student’s strengths and weaknesses at a glance (italics added) and for
providing the necessary information for career guidance… if such a profile is plotted by guidance
counselors or teachers for different school terms and over a few years, and fluctuations [changes] in
the students performance would readily be observed (Nkpa, 1992:59-61).

In cases of joint authorship of a publication, both names are cited each time the publication is cited
in the text if there are only two authors, for example, Ohuche and Anyanwau (1990). If the
authors are more than two but fewer than six, all authors are cited the first time the reference
appears in the text. Subsequent citations may bear the name of only the first author followed by
"et al." to indicate that there are other joint authors. For example, Okpala, Oncha, and Odeji
(1993:35) defined an achievement test as "a test that measures the extent to which a person has…
acquired certain information." In any subsequent references to the same publication, Okpala et al.
(1993) is used rather than repeating all three names.

If two or more authors bear the same surname, their initials must be included in all citations to avoid
confusion thus: Ingwe S.O. (1989); Ingwe, A. (1990).

In cases where the author is a corporate body, the name of the body is given in full the first time it
appears in the text. If the name is long or cumbersome, or if its abbreviated form is familiar,
subsequent citations bear the abbreviated name. Corporate bodies include associations,
international organizations, governments, and institutions. Examples include the following:

Association:
Association of African Universities (1988) abbreviated as AAU (1988)
Ashby Report (1960)
Sometimes, publications are cited for which the author's name is not inserted in the publication or is
represented by initials or typographical devices such as asterisks. Such publications are treated as
anonymous and cited under the title. For example, Holy Bible (People's Parallel Edition) (1981).
For pseudonymous publications using a fictitious name, the abbreviation “pseudo” is
appended to the pseudonym thus:

Eagle pseudo. (1996)


If the real name is known, it should be added in square brackets thus: Eagle, pseudo. [ASUU]
(1996).

Four. Hypotheses and Research Questions

4.1 Hypotheses

A research hypothesis is a reasoned supposition proposed as a starting point for an investigation. A
hypothesis statement expresses possible relationships among variables. The relationship may be
direct, as in the following statement: "Performance in mock examinations is positively correlated with
performance in the actual examinations." It may be indirect, as in the following statement:
"Unemployed persons do not enroll in sandwich programmes." It may be causal, as in the following
statement: "Student riots are caused by poor boarding conditions." It could be one of co-occurrence,
as in the following statements: "Destruction of public property only occurs in government-owned
properties." "Truancy is most pronounced in times of economic crises." It could be a difference in
which one variable is greater than, less than, or equal to another variable, as in the following
statements: "Students in boarding houses perform better than day students in examinations."
"Students taught in a second language exhibit a lower level of intellectual competence than students
taught in the mother tongue." "Children of non-working mothers are as truant as children of working
mothers." "Children in private primary schools perform better in common entrance examinations
than children in state primary schools."

4.2 Sources of Hypotheses.

The relationships suggested by hypotheses may be based on observation, experience, guesswork,
suggestions from earlier research, or deductions from existing theory. To exemplify these sources of
hypotheses, consider the statement: "Children of teachers do better at school than other children."
The statement relates home background to performance. Such a statement could be prompted by
observation of children from such a home background. It could also be prompted by direct experience
in which the children of teachers find themselves excelling in school. Otherwise, it could be guesswork
arising from personal reflections about the lifestyle of teachers. Reading of earlier research whose
results suggested but could not confirm the existence of such a relationship could also prompt it.

We could consider a second statement: "Children who are at least six years old do better than
younger children in the first year of primary school." This statement expresses a proposition which
relates age with performance. Such a statement may arise from thoughtful deductions from the
Piagetian theory of intellectual development, which contends that children can internalize information
that their age and level of development have made them ready to assimilate.

A third statement of relationships might read thus: “Students who do well in mathematics also do
well in physics.” This statement could be an intelligent guess based on the knowledge that physics
content involves much of mathematics.

4.3 Functions of Hypotheses

Hypotheses propose possible but not certain relationships, which may then be confirmed, proven, or
disproved by research. They provide clarity and direction to research. They guide the researcher to
appropriate methods and relevant data, thus preventing the dissipation of effort and resources in
unproductive activities.

4.4 Qualities of a good hypothesis

Hypotheses are very important in research. However, they can function maximally only when they
are properly stated. A properly stated hypothesis must be clear, relevant, plausible, verifiable,
consistent with known facts, and testable.

A clear hypothesis is one that precisely states the relationship between variables. A defective
hypothesis is herein stated: "The socio-economic status of parents affects the school performance of
their children." Socio-economic status is not a precisely defined term. It may refer to level of
education, make of car, type and size of housing, occupation or, in some communities, number of
wives. It may be clearer and more precise to state several hypotheses rather than one. One such
hypothesis may read thus: "The level of education of parents is related to the school performance of
their children."
A good hypothesis is directly relevant to the research problem. A research project which seeks to ascertain
the issues considered by students when choosing subjects for further study would have the
following as a relevant hypothesis: "Students take their future occupations into account when
choosing subjects." An irrelevant hypothesis reads thus: "Students choose more arts subjects
than science subjects." This second hypothesis relates to the choice made rather than the criteria for
choice.
A plausible hypothesis is informed, reasonable and tenable, not just any wild guess. A proposition
that dark-skinned persons type better than light-complexioned persons is an unreasonable wild guess.
In contrast, the proposition that boarders perform better academically than day students is plausible.

A verifiable hypothesis is one for which appropriate evidence can be supplied. A proposition about
what deaf and dumb children understand but cannot express is not verifiable. It is not possible to
probe and test understanding which is not expressed.
Consistency with known facts is an important quality of a hypothesis. A good proposition should
not run contrary to established facts. It is established, for instance, that people grow taller, not
shorter; that aptitude is related to achievement; and that intelligence influences learning. It would
be fruitless labour to investigate a proposition which violates these facts.

A testable hypothesis is one that is amenable to statistical procedures. Statistical procedures must
therefore be considered when hypotheses are stated. Hypotheses which cannot be subjected to
statistical test can be modified while retaining their essential meaning. In a research project which sought
to relate teachers' qualifications to student performance, a student started with the following
hypothesis: "There is no significant relationship between students' results and the qualifications of
their teachers." The statistical test of this hypothesis involves the correlation of teachers' qualifications
with students' results. However, teachers' qualifications would be measured as a discrete variable
while students' results would be measured as a continuous variable. A correlation of the values of
the two variables is statistically inappropriate. Besides, individual students, not classes, are the units
of analysis; there are therefore not enough teachers to match the students. The data would
therefore not be amenable to statistical correlation. The hypothesis was then modified as follows to
make it testable: "There is no significant difference in performance at the GCE examinations
between students taught by graduate and non-graduate teachers." It becomes possible to compare
performance measured on the same scale between two groups of students using t-tests. The
problem of the small size of the teachers' sample is also eliminated.

4.5 Null and alternative hypotheses

A testable hypothesis may be stated in either of two forms: the null or the alternative hypothesis.

A null hypothesis (Ho) is a hypothesis of no relationship or no difference in the values of the
variables. The alternative hypothesis (H1) is an operational statement of the research hypothesis
which suggests a direction of predicted relationship or difference. Using the earlier stated hypothesis
as an example, the null form reads thus: "There is no significant difference in performance at the GCE
examinations between students taught by graduate and non-graduate teachers." The alternative
hypothesis may read thus: "Students taught by graduate teachers perform significantly better at the
GCE examinations than students taught by non-graduate teachers," or "Students taught by
non-graduate teachers perform significantly better at the GCE examinations than students taught by
graduate teachers."

The implication of testing the null hypothesis must be borne in mind: if Ho is not upheld, the
direction of the difference is not specified. The difference could therefore be the opposite of the
prediction suggested in the research hypothesis. For example, if Ho is tested and rejected, meaning
that a difference exists between the performance of students taught by graduate and non-graduate
teachers, it is possible either that the performance of students taught by non-graduate teachers is better
than that of students taught by graduate teachers, or that the performance of students taught by graduate
teachers is better than that of students taught by non-graduate teachers.

In situations which allow an a priori determination of the direction of difference, H1 is tested. A
one-tailed test is then used for a direct test of the prediction. On the other hand, a two-tailed test is
applicable for testing the null hypothesis. If a statistical test upholds the null hypothesis, it means
that any observed relationship or differences are due to sampling error.
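The difference between a two-tailed test of Ho and a one-tailed test of a directional H1 can be shown in the same hypothetical setting. The sketch below again assumes Python with SciPy (version 1.6 or later, where ttest_ind accepts an alternative argument); the data are invented for illustration only.

    # Sketch only: hypothetical data, assuming SciPy 1.6 or later.
    from scipy import stats

    graduate_taught = [62, 58, 71, 66, 59, 64, 70, 61, 68, 63]
    non_graduate_taught = [55, 60, 52, 58, 49, 57, 61, 54, 56, 53]

    # Two-tailed test of Ho: the direction of any difference is not specified.
    _, p_two = stats.ttest_ind(graduate_taught, non_graduate_taught,
                               alternative="two-sided")

    # One-tailed test of H1: graduate-taught students perform better.
    _, p_one = stats.ttest_ind(graduate_taught, non_graduate_taught,
                               alternative="greater")

    print(f"two-tailed p = {p_two:.3f}, one-tailed p = {p_one:.3f}")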

4.6 Research Questions

While hypotheses are useful in relevant cases, it should be borne in mind that hypotheses are not
always required in research. Fact-finding research and exploratory studies may not require
hypotheses. For example, research may wish to uncover what successful users of the inquiry
method actually do in classrooms and laboratories; to identify the days of the week when truancy is
most pronounced; to identify the reasons for students' choice of subjects; or to ascertain the
textbook preferences of students. In such cases research questions would suffice.

The relevant research questions would then read as follows: What classroom behaviours constitute
inquiry? On what day of the week is truancy most pronounced? What criteria influence the choice
of subjects by students? Why do students prefer such textbooks?

Five. Populations and Samples

5.1 Definition

A population refers to all the elements in a well-defined collection or set of values. A sample is any
subset of values from a population. All the secondary schools in Nigeria make up the population of
secondary schools. One secondary school from each local government area in Nigeria would
constitute a sample of secondary schools.

5.2 Target population

Any research is expected to yield findings. The findings may be applicable to the entire universe,
limited sections of the universe, or certain elements within the universe. Consider an educational
research study whose aim is to ascertain the relative efficacy of instruction in the mother tongue and
instruction in a second language. If the mother tongue is Spanish, the findings are expected to generalize
only to Spanish speakers. If the research is conducted in primary schools, the findings are expected to
generalize only to primary school learners. If it is conducted in rural settings, the findings are
expected to generalize only to rural settings. The population to which the researcher intends to
generalize his/her findings is referred to as the target population.

5.3 Sample

Usually, the target population is too large for the researcher to work with. The geographical area
covered may be too widespread, or the number of persons may be too many for the researcher's
limited resources. For example, the number of Swahili-speaking primary education learners may be
over one million. The researcher is then constrained to limit the investigation to a smaller, controllable
sample. A crucial decision which the researcher must make is how to select samples that will truly
represent the population to which the findings would be generalized. The extent to which the
sample accurately represents the target population is the concern of external validity. If the sample
does not truly represent the population, inferences cannot be drawn about the population
characteristics from the sample, no matter how powerful the statistical inferential techniques used.
No definite conclusions would emanate from the research; therefore, no new knowledge would be
generated and the research would be a worthless expenditure of effort. Two issues to be addressed
in the selection of a representative sample are the sample size and the sampling technique used for
selection.

5.4 Sample Size

As discussed in the section on reliability, increases in sample size increase the likelihood of
accurately estimating the population characteristics from the sample. A researcher should select a
sample that is large enough to improve the likelihood of obtaining results that are similar to what
would be obtained using the entire population. There is no hard and fast rule but, in general, a 50%
sample is recommended for populations that run in hundreds. For populations that run in thousands,
5% to 20% samples may be drawn. If the phenomenon being studied shows a lot of variation
within the population, a higher sampling ratio is recommended. The type of significance test to be
used in the analysis of data may also dictate an appropriate sample size.
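The guidelines above can be expressed as a rough rule of thumb. The short Python sketch below is purely illustrative; the function name and the cut-off between populations "in the hundreds" and "in the thousands" are assumptions based on the percentages quoted in this section, not fixed statistical rules.

def recommended_sample_size(population_size, ratio_for_thousands=0.10):
    # Rule of thumb from this section: about 50% for populations in the
    # hundreds; 5% to 20% (here, a default of 10%) for populations in the
    # thousands. Increase the ratio when the population is highly variable.
    if population_size <= 0:
        raise ValueError("population size must be positive")
    if population_size < 1000:
        return round(0.50 * population_size)
    return round(ratio_for_thousands * population_size)

print(recommended_sample_size(400))    # 200
print(recommended_sample_size(5000))   # 500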

5.5 Sampling Techniques

If several samples of equal size are drawn from a given population, it is unlikely that their
characteristics would be identical. Similarly, the characteristics of any one sample are not expected
to correspond perfectly with those of the population.

The differences between the population parameters and the sample characteristics are said to be due
to sampling error. Sampling error may arise from variations due to the chance selection of different
individuals or from mistakes made in sampling techniques. If the sample is biased, it is not possible to
make an accurate prediction about the population. In contrast, random selection makes it possible to
estimate the magnitude of sampling error due to chance factors. The degree of confidence specifying
the probability of accuracy of the sample estimate can then be calculated. Random selection is critical
to theories of statistical inference, which are based on laws of probability. If random selection is not
strictly followed, there is no justification for using statistical tests of significance to infer population
characteristics from sample estimates. This is because sampling statistics are treated as random
variables. If simple random sampling is either inappropriate or impossible, other probability
sampling techniques should be used in order to validly generalize the findings to the target
population.

A probability sample is one in which chance factors determine which elements from the population
will be included in the sample. It is thus theoretically possible to calculate the probability that any
specific element in the population will be included in the sample. Probability sampling techniques
include simple random sampling, stratified sampling, cluster sampling, stage sampling and systematic
sampling. Non-probability samples are those for which the probability of a member of the population
being selected cannot be calculated. Statistical inference cannot be used to legitimately generalize from
a non-probability sample to the target population. Satisfactorily replicating the investigation in
several contexts may, however, justify generalization from non-probability samples. Non-probability
sampling techniques include purposive sampling, volunteer sampling and captive audiences.

5.5.1 Simple Random Sampling

The critical attribute of simple random sampling is that each member of the target population has an
equal and independent chance of being included in the sample. Independence in this sense means
that the selection of one member does not affect the chances of selection of any other member of the
population. Thus, if there are 500 members in the population, the probability of any member being
included in the sample is 1/500 irrespective of whoever else is selected.

The simple random sampling requirements of independence and equal probability are assumed in most
tests of significance. Using a table of random numbers for selection meets the requirements of
independence and equal probability. A table of random numbers is displayed in the appendix. If
available, other randomization devices may also be used for selection. To use a table of random
numbers for selection, every member of the population is identified with a number. The appropriate
number of columns for reading the table is then chosen. For example, the appropriate number of
columns for a population in the hundreds is three, while the appropriate number of columns for a
population in the thousands is four. As appropriate, the last three or four digits of the numbers in the
table are read, picking out those members of the population whose identity numbers appear in the
table. For populations of 500 and 5,000 respectively, the numbers would range from 001 to 500 and
from 0001 to 5,000. The point at which the researcher starts to read the numbers is determined in a
random manner. From the starting point, the table is then read systematically in any direction. The
corresponding members of the population for all the selected numbers make up the sample.

An essential requirement of the procedure for obtaining a simple random sample is that a complete
list of the population must be available. This poses a problem in cases where the population list
may be unavailable or inaccessible. Besides, such lists tend to be quickly outdated by geographic
mobility, births, and deaths. As discussed in the section on research design, even when a simple
random sample is selected, some members may refuse to co-operate while others may be lost
through attrition. This results in a reduced sample size, so provision should be made for attrition. For
example, the minimum numbers required for significance tests should be exceeded.
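For readers who prefer software to a printed table of random numbers, the following minimal Python sketch draws a simple random sample from a hypothetical population list of 500 members; the member names and sample size are invented for illustration.

import random

# Hypothetical, complete and up-to-date population list of 500 members.
population = ["member_%03d" % i for i in range(1, 501)]

random.seed(42)                            # fixed seed only so the example is reproducible
sample = random.sample(population, k=100)  # equal, independent chances; no member is repeated

print(len(sample), sample[:5])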

5.5.2 Stratified Random Sampling

When homogeneous subgroups can be identified within a population, these subgroups are referred
to as strata. Stratified sampling aims at ensuring proportionate representation of these subgroups in
the sample.

The stratified sampling procedure divides the population into homogeneous subgroups containing
members who share common characteristics. In a population which contains males and females, the
strata would be a subgroup of males and a subgroup of females. Simple random selection of members
from each subgroup is then undertaken in such a way that the sample reflects the proportions of the
subgroups in the population. For example, if the population is made up of four hundred males and
one hundred females, a random sample is selected in the proportion of four males to one female.
Illustrating from our example above, a simple random selection of two hundred males could be
made from the male subgroup and a similar selection procedure would yield fifty females from the
female subgroup. The total sample size would be 250.

Stratified sampling is particularly useful when the study requires a comparison of subgroups or
when membership of a subgroup is likely to influence the level of the dependent variable.

Quota sampling may be regarded as a special case of stratified random sampling in which the
number of members to be selected from each stratum is fixed by a predetermined quota rather than by
proportionate representation. For example, one school may be required from each local government
area irrespective of the relative number of schools in each local government area.
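A minimal Python sketch of proportionate stratified random sampling follows, using the hypothetical 400 males and 100 females from the example above; the stratum labels and the overall sample size of 250 are illustrative only.

import random

random.seed(1)
strata = {
    "male":   ["M%d" % i for i in range(400)],
    "female": ["F%d" % i for i in range(100)],
}
population_size = sum(len(members) for members in strata.values())
sample_size = 250

sample = []
for stratum, members in strata.items():
    n = round(sample_size * len(members) / population_size)  # proportionate allocation
    sample.extend(random.sample(members, n))                  # simple random selection within the stratum

print(len(sample))   # 250 (200 males and 50 females)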

5.5.3 Cluster sampling

Cluster sampling is necessitated when simple random sampling poses administrative problems. The
population may be large and widely dispersed, for example, all primary school children in one
country. It would then be impractical to sample from every school and to work for a six-month
period in all the schools. The cluster sampling procedure would then involve the random selection
of a smaller number of schools. All the children within the selected schools would then be used for
the study. In the above example, the school is the unit of sampling. Similarly, the classroom may be
used as the unit of sampling. Cluster sampling saves time and resources.

5.5.4 Stage sampling

Stage sampling is an extension of cluster sampling. Using the example given under cluster sampling,
stage sampling would involve random selection in stages: firstly, of a number of schools;
secondly, of a number of classes within these schools; and thirdly, of a number of pupils within
these classes.

5.5.5 Systematic sampling

Systematic sampling involves the selection of members from a population list in a systematic
fashion. The technique is used when the members of a defined population are already placed on a
list in random order. Selection then proceeds by dividing the population size by the
required sample size. For example, if a sample of four hundred is required from a population of two
thousand, the division (2,000 ÷ 400) results in the number 5. Every fifth person on the list is then
selected. A random process chooses the starting point for selection.

Independence is not ensured in systematic sampling. Once the first member is selected, all other
members of the sample are automatically determined. For instance, if the member with serial
number 2 is selected, then the members with serial number 7, 12, 17, etc. would automatically be
included in the sample.

5.5.6 Purposive sampling

Purposive sampling is necessitated when the researcher is interested in certain specified
characteristics. Only members with such characteristics are selected. For example, a sample of
effective teachers, judged by the consistent performance of their students in external examinations,
may be required. 'Effective teachers' are then purposively selected. Similarly, for a study of
workers in a given trade union movement, a purposive sampling of trade union leaders may be
appropriate.

5.5.7 Volunteer sampling

Volunteer sampling is necessitated when not every member of the population can be expected to comply
with the demands of the investigation. For example, it is not realistic to compel human beings to take part in
an experiment. The nature of the experiment would determine the kind of individuals who would
be willing to comply with its demands. Only such willing individuals are included in the sample.
These individuals are volunteers. They are willing to co-operate with the researcher.

It is worth noting that volunteers necessarily comprise a biased sample. Volunteers tend to differ
from non-volunteers in their characteristics. The characteristics on which they may differ include level
of education, ethical standards, social status, level of intelligence, sex, and psychological
characteristics such as personality, age, and religion. The findings from research using volunteers
are therefore not easily generalisable to any particular population.

5.5.8 Captive Audience

Captive audiences have been used as samples in social and educational research. Examples include the
clients whom a counselor normally counsels and the students whom a researcher normally teaches. Any
research using a captive audience should generalize its findings only to the particular group used for the
research. Replication of the findings by several researchers in several other contexts may, however,
warrant a generalization to an appropriate population, but not by inference on a statistical basis.

Six. Research Designs

6.1 Introduction

Research design is the structure or plan of a research – what to do and how to do it. It involves the
structuring of variables in a manner that enables their relationship to be determined. Understanding
of design principles enables a reader to evaluate the conclusion of research reports in terms of
whether they follow logically.

Research should be carefully designed and carried out to the planned specifications with respect to
the choice of variables, procedures, controls and assignment of subjects. The design of real studies
is a complex and difficult task.

It may be pertinent to define the terms variable and control at this point. A variable is a
quantitative or qualitative entity, which can take on different values or levels. For example, sex can
be male or female, while age or income can take on several continuous values. The manipulated variable
is the independent variable. Its effects are observed on the dependent variable.

A careful design should take care of extraneous variables, which may be responsible for the
observed levels of the dependent variable. Such variables are referred to as confounding variables since
they invalidate the answer to the research question.

Confounding variables may arise from history, instrument reactivity, instrumentation problems,
attrition of subjects, specialized samples, bias in subject assignment, and Hawthorne effects.

6.2.1 History

Some events in the environment may affect subjects in addition to the independent variables. For
example, in a research study to determine whether boarding enhances academic performance, other
activities of the students, such as private coaching classes, may also affect performance. In
longitudinal studies, maturation over the period of investigation may influence the level of the
dependent variable. The researcher has no control over these extraneous events. The best that can
be done in such cases is to equalize their influence across all the groups. For instance, one could
equalize the ages of the subjects and assume that the maturation effect cancels out. The Solomon
four-group design, discussed later in this chapter, is particularly useful in controlling for the effects
of history.

6.2.2 Instrument reactivity

Instrument reactivity refers to the reaction of instruments with the things they measure. For
example, when a teacher’s lesson is being recorded, the teacher’s behaviour may change because of
the presence of a recording instrument. Similarly, the attitude of respondents may change when
they are required to complete an attitude questionnaire. In either case, the true measurements are
distorted.
The Solomon four-group design makes it possible to determine the extent of possible interaction of
the measuring instrument with the independent variable.

6.2.3 Instrumentation problems

Measuring instruments may become faulty. A tape recorder, for instance, may develop a fault
in-between measurements. An investigator or investigators may use the instruments incorrectly. They may
also interpret the measured values incorrectly. All the above-mentioned situations pose problems of
instrumentation. Instrumentation problems result in measurement error. Where possible, the
sources of error should be eliminated. Otherwise, the magnitude of error should be estimated and
adjusted for.
Attempts to deal with problems of instrumentation include periodic checks on the instruments,
improvement of the instrument, replacement of the instrument, monitoring of the investigators to
ensure that they are operating according to the required specifications, and the use of multiple
measurements. The rationale for using several items in a questionnaire to measure the same
variable or for using large samples is that random measurement error averages out to be quite small
over multiple measurements or large samples. The use of multiple methods and multiple
investigators, as elucidated by Denzin (1978) in the theory of triangulation, is a very good strategy.
However, non-random measurement errors such as bias must be either eliminated or estimated and
adjusted for.

6.2.4 Attrition of Subjects

Some subjects die, transfer to other locations or become unwilling to continue as participants before
a study is concluded. This is referred to as attrition. Attrition may not be random. This implies that
completers may have peculiar characteristics, which limits the generalizability of the results.
Attrition also distorts the equivalence of treatment groups, making it difficult to determine the
contribution of the independent variables to the results of the study.
Strategies to cope with the problem of attrition include shortening the duration of the study and
ensuring continued subject participation through reinforcement or prior commitment.

6.2.5 Specialized Samples

Specialized samples are those chosen from specialized groups such as street kids, prostitutes,
university students, gifted children, and private schools, or from a limited population or even the
original sample only. If generalization is desired, samples must be selected to represent the target
population. Otherwise, the study must be replicated in other contexts to render it externally valid.

6.2.6 Bias in Subject Assignment

Whenever subjects are not assigned randomly to different levels of the independent variable, bias
results. Bias must be eliminated by matching or by controlling, as much as possible, all differences
among the groups.

6.2.7 Hawthorne Effects

Hawthorne effects are named after the Western Electric Studies in industry (Roethlisberger and
Dickson, 1939). Hawthorne effects are distortions in behaviour, which arise from the knowledge by
participants that they are the subjects of a study.

Strategies to cope with the problem of Hawthorne effects include unobtrusive methods in which
subjects are unaware of their participation in the study; provision for an adaptation period so that
behaviour returns to the status quo before measurements are taken; and information given to control
subjects aimed at equalizing the knowledge of participation among control and experimental
groups.

6.3 Experimental and non-experimental designs

The experimental and non-experimental designations represent two ends of a continuum with quasi-
experimental designs in-between. An experimental design is one in which the subjects are
randomly assigned to conditions and manipulated by the investigator who studies the effects of the
manipulations. In contrast, non-experimental studies observe subjects in naturally occurring
situations. The degree of manipulation of subjects and conditions determines whether a study is
designated experimental or non-experimental, as illustrated below.

Increasing manipulation of subjects and conditions:

Non-experimental designs → Quasi-experimental designs → Experimental designs

In a perfect experiment, the variables are manipulated, the subjects are randomly assigned, and all
other variables are held constant.

The critical difference between experimental and non-experimental studies is the amount of
manipulation of subjects and conditions. They do not differ in the type of research questions asked
or the conditions and characteristics studied. Take, for example, the following research question: Is
the mother tongue a more effective medium of instruction than the English language? Professor Lungu
(1994) designed an experiment to answer this research question. He assigned primary school pupils
to two treatment groups. One group was taught in Swahili while the other group was taught in English.
He monitored their performance over six years of primary schooling. Confounding variables which
he had to control for included the level of intelligence of the pupils, their socio-economic background,
their entry behaviour, and out-of-school learning. A non-experimental study designed to answer the
same research question would locate pupils who are normally taught in the mother tongue and those who
are taught in English and compare their performance.
In either case, the same research question applies: the independent variable is the language of instruction
and the dependent variable is the amount of learning.

Experimental and non-experimental designs have their advantages and limitations. In experimental
studies, the investigator exercises greater control over the subjects and conditions, holding variables
constant or varying them systematically.
However, when a variable is held constant, the generalizability of the results applies only to the
chosen level of the variable, for example, a specified level of intelligence. Experimental designs
thus enhance internal validity while compromising external validity. On the other hand,
non-experimental designs compromise internal validity while external validity is assured because the
research conditions are the same as those in the population to which generalization applies.

In general, experimental studies are more powerful in establishing causation and the direction of
causation. However, an experimental study which does not control all confounding variables can
fail to establish causation, while a non-experimental study can determine causation by systematic
observation of several variables over time.

In the sections that follow, selected specific designs are discussed in relation to their capacity to
deal with confounding variables and the appropriate methods of analysis for the type of data
generated. The design structure is denoted using symbols. In the notation, O is used to represent
an observation or measurement; X is used to represent a treatment or independent variable; numeric
subscripts are used to represent different observations and different treatments. The sequence of
events runs from left to right. A design structure represented symbolically as X1 O1 X2 O2 X3 O3
implies that an observation follows each of three sequential treatments. A structure denoted as O1
X1 O2 implies that treatment is applied in-between two observations. A structure denoted as

X1 O1
X2 O2

Implies that two different treatments applied at the same time to two groups are followed by an
observation of each treatment group.

6.4 Non-experimental designs

One-group designs

6.4.1 Pretest-posttest (O1 X O2)

In this design, one group of subjects is observed, subjected to treatment, and then observed a second
time. Comparisons are made on the level of the dependent variable before and after treatment.
Pretest-posttest differences are then attributed to treatment. This design is frequently used to
determine the effectiveness of a particular programme when no control or comparison groups are
available. For example, to ascertain the effect of corporal punishment on the rate of truancy.

There are two main flaws of the one-group pretest-posttest design. Firstly, the researcher has limited
control over the events which occur between the pretest and posttest. Extraneous events may produce
changes. This problem is especially acute when the intervening period between pretest and posttest is
long. If the study is carried out with children, maturation may be responsible for pretest-posttest
differences in the level of the dependent variable. In the example given above, the rate of truancy
might decrease because mothers returned home from a vacation programme and began to monitor
their children.

Secondly, the design is subject to instrument reactivity. The first observation predisposes subjects
to guess what is being expected at the second observation. Therefore, the conditions surrounding the
second observation are different from those of the first. In order to minimize instrument reactivity,
unobtrusive measurements should be made. Otherwise, subjects could be acquainted with the nature
of the study before the pretest in an attempt to hold that knowledge constant from pretest to posttest.

Data generated from pretest-posttest designs are tested for significance using the t-tests.
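As a minimal illustration, and assuming the SciPy library is available, a paired t-test on invented pretest and posttest scores could be run as follows.

from scipy import stats

pretest  = [12, 15, 9, 14, 11, 16, 10, 13]   # invented scores for illustration
posttest = [15, 18, 11, 17, 12, 19, 13, 16]

t_statistic, p_value = stats.ttest_rel(posttest, pretest)
print(t_statistic, p_value)   # a small p-value suggests a reliable pretest-posttest change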

6.4.2 Time series designs (O1 O2 … O50 X O51 … O100)

This design resembles the pretest-posttest except that there are more than two
observations/measurements before and after treatment and equal time intervals between
observations.

Time series designs are used to ascertain trends over time. For example, one may wish to ascertain
whether the rate of examinations malpractice changed after the decree stipulating 21 years
imprisonment as penalty was promulgated or whether teacher behaviour changed after in-service
training. By collecting periodic data over a long time it is possible to determine when the variable
of interest changed. Unlike pretest-posttest studies, time- series designs make it possible to
determine the direction of trends before and after interventions.
Suppose the rate of truancy dropped immediately after punishment. The decrease in truancy rate
may not sustain over time. The time-series design will provide information about the sustained
efficacy of punishment. Such information may not be obtained using the pretest-posttest design.

The flaws of the time-series design are the same as for the pretest-posttest design, namely,
instrument reactivity and the influence of extraneous factors. Instrument reactivity may get
progressively worse over time as fatigue and boredom set in. Furthermore, subjects may engage in
faking in order to appear consistent over time.

When there are few observations, data from time-series studies may be tested for significance using
the analysis of variance (ANOVA), which tests the null hypothesis that the means of several independent
populations are equal. Otherwise, multiple regression and other sophisticated statistical methods
may be used to compare measurements before and after treatment.

6.4.3 Correlation designs

In correlational designs, measurements of the state of affairs are taken using observations,
questionnaires or documents. Correlational designs may be cross-sectional or longitudinal.

a) Cross-sectional designs (O)

Cross-sectional designs take all measurements at the same time. They are popular because of
their simplicity and ease of administration. They are useful in determining whether two or more
variables are related, for example, whether aptitude is related to achievement or whether
performance in theory tests is related to performance in practical tests; whether performance in
mathematics is related to performance in science.

An important advantage of the cross-sectional design is that history cannot pose a threat to internal
validity. The flaws of the design include instrument reactivity and Hawthorne effects. Using a
variety of methods for data collection may minimize instrument reactivity. A Hawthorne effect
refers to the distorting effect on behaviour or responses which arises from the knowledge by
subjects that they are participating in a study. The subjects may alter their behaviour or responses in
order to look good or to conform to the researcher's expectations. Consequently, false information
may be supplied in questionnaires and interviews, giving rise to false relationships among variables.
In order to minimize the Hawthorne effect, subjects who exhibit bias may be eliminated. This
implies that certain questions designed to reveal such biases as faking and social desirability would
be built into the questionnaire.

Appropriate statistics for analyzing data from cross-sectional studies include correlation for two
variables, and multiple regression and factor analysis for more than two variables. The correlation
coefficient indicates the magnitude of the relationship. Multiple regression makes it possible to
predict the dependent variable from a set of independent variables. Factor analysis is used to
ascertain the underlying theoretical factors which account for the relationships among several variables.
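As a minimal sketch, and again assuming SciPy is available, the correlation between two variables measured in a cross-sectional study could be estimated as follows; the aptitude and achievement scores are invented.

from scipy import stats

aptitude    = [45, 52, 60, 38, 70, 55, 48, 65]
achievement = [50, 58, 64, 40, 75, 60, 47, 70]

r, p_value = stats.pearsonr(aptitude, achievement)
print(r, p_value)   # r close to +1 or -1 indicates a strong relationship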

b) Longitudinal designs (O1 O2 …. On)

Longitudinal designs take the measurements of two or more variables over time. The design is
useful in prediction studies. For example, measurement of performance in the Joint Admissions
Examination could be followed with measurement of performance in subsequent undergraduate
examinations in an attempt to predict success in the university from the JAE performance.
Longitudinal designs may be trend studies, cohort studies or panel studies.

Trend studies sample a given general population at each point in time. The samples at different
points in time are made up of different individuals/subjects who represent the same population, for
example, SS1 Students in X State in 1990, SS2 students in X in 1991, SS3 students in X in 1992.
The sample at each point in time may be made up of different individuals.

Cohort studies follow up a specific population over a period of time, for example, JSS 3 students in
the International Secondary School followed up as SS1 Students in the International Secondary
School the next year.

In panel studies, the same sample is studied using successive measurements at different points in
time.

The threats to internal validity of cross-sectional studies, namely instrument reactivity and Hawthorne
effects, also apply to longitudinal studies. In addition, attrition of subjects over time as well as the
historical influence of extraneous events may affect panel studies more than trend and cohort studies,
in which different individuals make up the sample at different points in time.

6.4.4 Single-subject designs (X O)

In single subject designs, the sample consists of one individual or several individuals treated as a
group, for example, a class of students. They are widely used for behaviour modification. For
example, studies could aim at ascertaining the effect of supervision/observation on teaching
behaviour.

Single-subject designs achieve internal validity through a high degree of control. The confounding
effects of extraneous variables are ruled out using the following techniques:

(i) Rigorous observation – Low inference behaviours to be observed are well defined,
observers are carefully trained and periodic checks of observer reliability are made.

(ii) Repeated measurements – Multiple measurements are taken at very short
intervals and thus yield reliable descriptions of behaviour.

(iii) Description of experimental conditions – A precise description of the baseline and
treatment conditions, subject characteristics, and measurement procedures is
provided for replication purposes.

(iv) Baseline and treatment stability – the natural frequency of target behaviour prior to
treatment (baseline) is allowed to stabilize before treatment. Variations of more than
five per cent from the mean over ten observations should not be tolerated.

Similarly, the treatment effects should be interpreted only when treatment has
lasted long enough for a stable interpretable pattern to emerge.

(v) Length of baseline and treatment phases – the length of the baseline and treatment
phases should be equal, except when the need to wait for a stable pattern of
measurements to emerge overrides this condition. Additional exceptions to the rule of
equal baseline and treatment phases are institutional and ethical demands. In such cases, the
problem can be circumvented by the use of several pilot studies to investigate
baseline and treatment parameters prior to the main study. It would then be possible
to create baseline and treatment conditions of equal duration and number of
measurements.

Replication using different investigators, different settings, and different individuals as subjects
increases the external validity of single-subject designs. The analysis of data yielded by single-
subject designs may be carried out using descriptive statistics such as graphs with units of time on
the abscissa and units of target behaviour on the ordinate. On the graph, solid lines connect the data
points while broken lines indicate the transition from one phase to another. Adjacent phases are
compared for changes in the mean level of the target behaviour, changes in slope and changes in
level between the last data point for one phase and the first data point for the next phase.

Occasionally, inferential statistics may be used, for example, t-test to compare the means of baseline
and treatment measurements. The t-test would indicate whether the variations in measurements
between the phases represent true differences or chance fluctuations. Significant differences would
lead to the conclusion of reliable treatment effects.

6.5 Quasi-experimental designs

6.5.1 Static group comparison (XOO)

Two groups are involved in the static group comparison design. One of the groups is exposed to
treatment; the control group is not. A posttest is administered to both groups. Information obtained
from the control group is used to determine the extent to which any posttest difference between the
two groups can be attributed to treatment.

A major problem of the static group comparison is that posttest differences may be due solely or
partially to the characteristics of the groups. Suppose a researcher makes up two groups of students
and exposes one group to private lessons as the experimental treatment. The aim is to ascertain whether
private lessons contribute to improved performance. If the experimental group consists of students
whose characteristics, such as intelligence and socio-economic background, are better than those of the
control group, the differences in performance may be due to the advantage of higher intelligence
and socio-economic background rather than to private lessons. The confounding effects of
intelligence and socio-economic background reduce the internal validity of the experiment.

One way of ensuring that any differences in performance are due to private lessons is to make both
groups equal in all relevant characteristics. The groups are then referred to as equivalent groups.
Another way is to pretest both groups as is done in the non-equivalent control group design.

Another problem of the static group comparison is that mere participation in an experiment may
influence the subjects' posttest behaviour. Differences in posttest behaviour may then be due to the
Hawthorne effect. These problems make the static group comparison a very weak design.

Analysis of static group comparison data is carried out with t-tests. If the data cannot be assumed to
be normally distributed, non-parametric equivalents of t-tests are used, namely, Wilcoxon's T and the
Mann-Whitney U.

6.5.2 Non-equivalent control-group design

O1 X O2
O1     O2

The non-equivalent control-group design is the most widely used of the quasi-experimental designs.
Instead of randomly assigning subjects to experimental and control groups, a pretest is carried out to
ascertain the level of the dependent variable before treatment. Conclusions are drawn on the
differential change from pretest to posttest.

The problem of confounding variables remains unresolved in this design. For instance, if the
experimental group is exposed to private lessons in addition to conventional classroom instruction, a
differential posttest change could still result from individual student characteristics. The analysis of
non-equivalent control-group design data is carried out using t-tests to compare pretest-posttest
change scores. Otherwise, analysis of covariance (ANCOVA) is used to adjust for initial group
differences before the final comparison of posttest scores.

6.5.3 Ex Post Facto design

X1 O1
X2 O1   (matched groups)

In this design, subjects in two or more groups are matched on every variable that is related to the
study. The groups are then regarded as equivalent groups and are exposed to different treatments.
Ex post facto studies are like experimental studies except that random assignment of subjects is not
possible. For example, if the academic performance of sandwich and regular students is to be
compared, regular and sandwich students with similar characteristics would be identified and matched.
The characteristics on which they are matched might be intelligence quotient (IQ), sex and socio-
economic background.

One major problem of the ex post facto design is that one may not be able to match for every
relevant variable. For example, the age and experience of sandwich students is likely to be
inherently different from that of regular students and matching for these two characteristics
becomes impossible.

Another problem is that matching necessarily reduces the sample size that can be used, especially
in cases where matching is required on many variables. Generalization of the findings beyond the
sample is thus difficult.

In spite of these problems, the ex post facto design is an improvement over the static group
comparison. T-tests may be used to analyze data yielded in ex post facto studies.

6.6 Experimental designs

6.6.1 Two-group designs

X1 O1
X2 O1

This is the simplest of the experimental designs. It has one independent variable, which may take the
form of different treatments administered in a laboratory or field setting. Subjects are randomly assigned
to the treatment groups, and the levels of the dependent variable are then measured.

The two-group design can be expanded to a multiple-group posttest design by introducing n values
or levels of the independent variable thus:

X1 O1
X2 O1
X3 O1
...
Xn O1

The design is superior to non-experimental designs because of the control achieved by the random
assignment of subjects. It is important that control is exercised over all factors irrelevant to the
study. The only difference to be experienced by the groups is the treatment.

One problem of this design is non-random subject loss. Non-random subject loss could result from
non-completion of the treatment by some subjects.

Data from the two-group design can be analyzed using t-tests or analysis of variance (ANOVA) to
compare the group means.

6.6.2 Multiple-group pretest-posttest design

O1 X1 O2
O1 X2 O2
...
O1 Xn O2

In this design, groups are pretested for the level of the dependent variable before treatment. The
pretest shows whether the groups were initially equivalent. The design is widely used in programme
evaluation studies to ascertain the efficacy of programmes or interventions. In such studies an
untreated control group is not usually available. An example is the experiment undertaken by Professor
Lungu to ascertain whether instruction in the mother tongue is more effective than instruction in
English among primary school pupils. In the experiment, one group of pupils was taught in the
mother tongue while another group was taught in English throughout the six years of primary
schooling. The performance of the two groups was compared. A problem shared by the multiple-
group pretest-posttest and the one-group pretest-posttest is that a pretest cannot substitute for a
proper control group.

A second problem of the multiple-group pretest-posttest is that the pretest exposes the subjects to
instrument reactivity. The pretest influences their reactions to the treatment.

A third problem is that a statistical artifact could be responsible for the posttest changes.
Regression towards the mean is inherent in retesting. When two groups are tested twice on the same
instrument, any group whose scores were initially more extreme would show more change than the
other merely as a function of regression toward the mean. It becomes difficult to separate real
treatment effects from regression toward the mean.

Data analysis could be by comparison of the groups on changes from pretest to posttest, obtained by
subtracting pretest scores from posttest scores. However, measurement error introduced by the pretest-
posttest procedure renders the difference scores even more unreliable than the measurement scores.

If pretest scores showed the groups to be equivalent initially, the posttest scores could be directly
compared.
If random assignment was strictly adhered to, ANOVA could be used for analysis.

6.6.3 Factorial Designs

Factorial designs make it possible to manipulate several independent variables at the same time and
to observe their effect on the dependent variable. The combined effect of two or more variables due
to interaction can also be investigated. For example, if the efficacy of two teaching methods is
investigated, it is possible that the efficacy of each method differs for males and females and for
students of different intellectual ability. Using factorial designs, the amount of variance due to each
independent variable and to combinations of variables can be determined. The effects of single
independent variables are referred to as main effects while the joint effects due to combinations of
variables are referred to as interaction effects. In factorial designs, several variables may be
included as controls for confounding factors.

The simplest factorial design is the two-by-two (2 x 2) factorial design, in which there are two
independent variables, each of which takes on two levels or values. For example, in a situation in
which teaching method takes on two levels (problem solving (P) and discussion (D)) and
intellectual ability takes on two levels (gifted students (G) and non-gifted students (NG)), a 2 x 2
matrix is formed as follows:

                         Treatment
Intellectual ability     P          D
G
NG

If the total number of subjects were 40, ten students would be randomly assigned to each cell thus

P D
G 10 10
NG 10 10


The main effect for treatment would be obtained by comparing all subjects on the dependent
measure regardless of the level of the other independent variable, in this case, intellectual ability. The
main effect for intellectual ability would be obtained by comparing all students on the dependent measure
regardless of the treatment. Interaction effects can be obtained by comparing the students on one
independent variable for each level of the other, in this case, comparing gifted to non-gifted
students for each treatment. If differences are found between gifted and non-gifted students for
either or both treatments, then there is an interaction of treatment with intellectual ability. When
interactions occur, the main effects are discarded. Suppose that the dependent variable mean scores
are as shown below:

Problem solving Discussion


Gifted students 80.5 70.6
Non gifted students 60.3 45.8

If the differences in the mean scores are significant, then there are main effects for treatment
(problem-solving scores are higher for both gifted and non-gifted students) and for intellectual ability
(gifted students score higher for both treatments).

However, if the mean scores are as shown below, there is an interaction of treatment with
intellectual ability (gifted students do better with problem solving while non-gifted students are
better off with discussion). The effects of the treatment are not the same for students of different
intellectual ability.

Problem solving Discussion


G 80.5 50.3
NG 50.5 80.2

A third possibility is that a main effect exists for only one of the independent variables, as
illustrated below.

Problem solving Discussion


G 80.5 50.3
NG 50.3 50.3
OR

G 80.9 60.5
NG 30.5 50.9

In the two illustrations above, there is a main effect for intellectual ability only (gifted students
were always better at problem solving).
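The logic of main effects and interactions can be illustrated with a short Python sketch that works directly from the cell means of the 2 x 2 example above. This is only a descriptive illustration; significance would still be judged with ANOVA on the raw scores.

cell_means = {
    ("gifted", "problem_solving"):     80.5,
    ("gifted", "discussion"):          50.3,
    ("non_gifted", "problem_solving"): 50.5,
    ("non_gifted", "discussion"):      80.2,
}

def marginal_mean(level, position):
    # Average the cell means for one level of a factor over the other factor.
    values = [m for cell, m in cell_means.items() if cell[position] == level]
    return sum(values) / len(values)

ability_effect   = marginal_mean("gifted", 0) - marginal_mean("non_gifted", 0)
treatment_effect = marginal_mean("problem_solving", 1) - marginal_mean("discussion", 1)

# Interaction: does the treatment difference change across ability levels?
diff_gifted     = cell_means[("gifted", "problem_solving")] - cell_means[("gifted", "discussion")]
diff_non_gifted = cell_means[("non_gifted", "problem_solving")] - cell_means[("non_gifted", "discussion")]
interaction = diff_gifted - diff_non_gifted

print(ability_effect, treatment_effect, interaction)   # tiny main effects, large interaction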

Factorial designs can get very complicated with the introduction of multiple independent variables. For
example, the addition of the sex variable to the example above gives rise to a 2 x 2 x 2 matrix thus:

Problem solving Discussions


Male Female Male Female
G
NG

If two more treatments are added, for example, lecture (L) and demonstration (DM), a 4 x 2 x 2 matrix
results, and so on.

An M x N matrix would be obtained by increasing the number and levels of the independent
variables. ANOVA is used to determine how much of the total score variance can be attributed to
each variable. However, when the value of any particular variable is continuous, for example, age
or income, multiple regression analysis is used to ascertain which main effects and interactions are
significantly related to the dependent variable.

If the independent variables are correlated, a stepwise addition and deletion of the effects of each
independent variable, using a hypothetically deduced order, is carried out.

6.6.4 Solomon Four-Group design

Notation,
O1 X1 O2
O1 X2 O2
X1 O2
X2 O2

This is a combination of the two-group design and the pretest-posttest design. Subjects are placed in a
2 x 2 structure and half of them receive a pretest and posttest for each treatment. Because it compares
subjects who differ only in receiving a pretest, it has the peculiar advantage of allowing a
determination of the extent of instrument reactivity.

The basic logic of the Solomon four-group design can be extended to larger and smaller group
designs. Provided that a sufficient number of subjects is available for each group, half the subjects
could receive a pretest in designs involving more than two groups. For a single-group pretest-posttest
design, instrument reactivity could be determined by omitting the pretest for half the subjects.

Statistical analysis of data generated from the Solomon four-group design is difficult because, if
ANOVA is used, there would be missing data for half the subjects. One way out is to analyze the
data in stages. In the first stage, the pretest scores are compared for both treatment groups to verify that
no pretest differences exist after random assignment of the subjects. If pretest differences exist, it would
be difficult to draw definitive conclusions from the study. In the second stage, the posttest scores
are compared as in a 2 x 2 factorial design with treatment level as one factor and the presence or absence
of the pretest as the other, thus:

X1 X2
Pretest present
Pretest absent

The treatment effects due to instrument reactivity or to interaction between treatment and pretest would
then be determined.

6.7. Hierarchical Design.

A hierarchical design is one in which the subjects are nested in units which are assigned to
treatments. For example, students are encountered in pre-existing groups or units (classes).

These classes may be assigned to treatments as entire groups. For example, some classes are taught
using one strategy while other classes are taught using another strategy. Individuals are not
randomly assigned to groups. Rather, groups (classes) are randomly assigned to treatments.
Conclusions are drawn about the groups and not about the individuals.
The data are analyzed as if the classes are the subjects, using grouped data from individual class
members. A sufficient number of units (classes) must be used for the analysis to be feasible. This
implies that a sufficient number of classes must be selected at the design stage. Otherwise, the data
will not be amenable to statistical analysis.

6.8 Multivariate designs

All designs involving a single dependent variable are considered univariate designs regardless of
the use of multiple independent variables. In reality, social phenomena are complex and not easily
reduced to single measures. For example, the effects of different government strategies on both
academic achievement and attitudes may be sought. Multivariate designs involve multiple
dependent variables. They are used to determine the complex relationships among variables. Any
of the designs discussed in the preceding sections can be expanded to a multivariate version by
collecting data on more than one dependent variable.

The analysis of data generated from multivariate designs is very complex and necessitates the use of
computer packages. The relevant statistical procedures include factor analysis, cluster analysis,
multivariate analysis of variance and multiple regression, among others.

Seven. Reliability and Validity of
Measuring Instruments.
7.1 Measurements and Errors.

An instrument is a measuring device. It could be a questionnaire measuring attitude, a watch
measuring time, or a test measuring intelligence or achievement. If a study is very well designed but uses
faulty instruments, its findings would be completely invalidated. It is therefore important that careful
design is appropriately matched with carefully developed and precise instruments; the aim is to
keep measurement errors at the minimum possible level.

Measurement errors could arise from faults in the instruments, incorrect interpretation of the values
obtained or instability in the behaviours of the respondents. They could be systematic or random.
Systematic error or bias occurs when errors are more frequently made in one direction away
from the true score, for example, a watch that is always losing time. Random errors occur when
measurement values deviate from the true score as frequently in one direction as in the other. In such
a case, the watch would lose time on one occasion and gain time on another. Random errors are attributed
to chance factors. Where possible, the magnitude of the error should be estimated and adjusted for,
or its sources eliminated. For example, the instrument could be improved or replaced.

The extent to which an instrument measures a true score is indicated by two important attributes,
namely, reliability and validity. The degree of random error is inversely related to the degree of
reliability, while the degree of non-random error is inversely related to validity.

The rationale for using several items in a questionnaire to measure the same variable or for using
large samples is that random errors tend to average out over repeated measurements, thereby
improving reliability. The best strategy is to use multiple measures, multiple measurements and
multiple investigators, as elucidated in the triangulation theory of Denzin (1978).

7.2 Reliability.

Reliability refers to the degree of internal consistency of a measuring instrument. A reliable
instrument yields the same result for the same individual regardless of when it is administered.
Thus, an intelligence test which yields the same score for an individual on repeated measurements
is highly reliable. Recall what we do when we count money for transactions. To ensure that our
counting is reliable, we recount. If there is consistency between the counts (repeated
measurements) we assume that we counted correctly. Similarly, we assume that we measured
correctly if the results of repeated measurements are consistent.

Reliability is related to the difference between an individual's true score on a construct and the
measurement value arising from any administration of the measuring instrument. The degree of
reliability is measured by coefficients, which indicate the degree to which a measuring instrument is
free from error variance. Error variance refers to the sum of the effects of chance differences between
individuals which arise from factors associated with particular measurements. Such factors may be
an individual's psychological mood, health and fatigue conditions on the day of administration of
the instrument, the wording of the component items, the ordering of the items or the
substantive content of the instrument. For instance, ambiguous wording of items may lead to
different responses by the same individual on different occasions. Errors in scoring would also lead
to inconsistent scores on different occasions.

The various reasons for inconsistencies in scores require different reliability measures, each
representing a separate error component. Researchers should consider the sources of error likely to
be represented in a study when choosing measures of reliability. Different reliability coefficients
measure different error components. Suitably high values of the coefficients indicate acceptable
levels of reliability. The maximum value is 1 for a perfectly reliable instrument. In practice, no
instrument is perfectly reliable. The actual values to be accepted depend on the nature of the
construct being measured. For example, IQ is expected to be relatively stable and therefore requires
very high reliability coefficient values. On the other hand, attitudes are rarely stable and lower
coefficients may be accepted.
The methods of estimating reliability include the test-retest, alternate-form, split-halves
and internal consistency methods (rational equivalence and coefficients derived from factor
analysis).

7.2.1 The Test-retest method.

Test-retest involves the repeated administration of the instrument to the same individuals on two
occasions. Usually a time interval of between two weeks and one month is recommended between the
first and second administrations of the instrument. The resulting scores are correlated to determine
the coefficient of stability.
This method is appropriate when performance on retesting is not likely to be improved by the
experience provided by the first administration of the instrument and when problems of
instrument reactivity can be assumed to be minimal. It may be used for speeded tests and for
indicating stability over time.
Test-retest correlation coefficients do not reflect the error due to limited sampling of the items
because the same restricted sample of items is administered twice. The internal consistency methods
show up this error.
The time interval between the first and second administrations affects the level of the
correlation obtained. If the time interval is too short, respondents will recall their previous
responses (memory effects). This gives rise to spuriously high coefficients, which overestimate the
true reliability. The alternate-form method minimizes the memory effect. If the time interval is too
long, maturation or learning may alter the respondents' behaviours. True change would then be
interpreted as instability. Low coefficients, which underestimate the true reliability, are
consequently obtained. It is therefore important that researchers use the most appropriate time
interval.

7.2.2 Alternate-form method.

Alternative forms of an instrument are parallel measures designed to meet the same specifications.
For example, the same specification of content and objectives. Parallel forms of an instrument are
expected to measure exactly the same behaviours.

In the alternate-form method of estimating reliability, two parallel forms of an instrument are
administered to the same respondents at a single sitting or with a time interval between the two.
The scores from the parallel forms are then correlated to determine the coefficient of equivalence. The
method is similar to the test-retest method except for two important differences. Firstly, an alternative
form rather than the same instrument is administered. This procedure offers the advantage of scores
being unaffected by the memory effect, a serious problem of the test-retest method. Secondly, there
need not be a time interval between the two administrations of the instrument since the items are not
exactly the same. This rules out differences in maturation, differences in learning, and differences in
psychological mood, health and fatigue as possible sources of error, in contrast to the
test-retest method. The alternate-form method is therefore better than the test-retest when less
stable behaviours and phenomena are being measured. However, if a time interval is allowed
between the administrations of the parallel forms of the measuring instrument, the same limitations
as in the test-retest method are applicable.

The alternate-form method is widely used to estimate the reliability of standardized tests and
intelligence tests.

7.2.3 Split-halves method

This is sometimes categorized as an internal consistency estimate because the items of the
instrument are split into two subsets after a single administration of the instrument. The total set of
items is divided into halves and the scores on the halves are correlated to obtain an estimate of
reliability. The assumption is that the two halves are approximations of alternative forms and are
expected to produce identical results. There are several ways of splitting a set of items into halves.
According to Carmines and Zeller (1979), there are 125 different possible splits for a ten-item scale.
The most popular ways of splitting into halves are placing odd-numbered items in one subset and even-
numbered items in the other; randomly dividing the items into two groups; and separating the items
into first and second halves by serial order. Each split will probably yield a different value of the
correlation coefficient, with the result that different reliability estimates are obtainable even though
the same items are administered to the same respondents at the same time. This is a limitation of
the split-halves method.

The reliability estimate from the split-halves method applies to half the set of items. Since the
number of items influences reliability, the split-half estimates require a statistical correction to yield the
reliability estimate for the total set of items. Such a correction is provided by the Spearman-Brown
prophecy formula derived independently by Spearman (1910) and Brown (1910). The formula is
as follows:

rc = n rs / [1 + (n - 1) rs]

where rc is the corrected reliability coefficient, rs is the split-half correlation, and n is the number
of times longer the full test is than the part correlated. For split halves, the instrument is twice as long as
each of the halves, so the value of n is 2, resulting in a simpler form of the formula:

rc = 2 rs / (1 + rs)

The split-halves method excludes sources of error arising from differences in psychological mood,
fatigue, health and environmental conditions surrounding different administrations of an instrument.
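A minimal Python sketch of the correction is given below; the split-half correlation of 0.60 is an invented value.

def spearman_brown(rs, n=2):
    # Corrected reliability for an instrument n times longer than the part
    # on which the correlation rs was computed (n = 2 for split halves).
    return (n * rs) / (1 + (n - 1) * rs)

print(spearman_brown(0.60))   # 0.75, the estimated reliability of the full instrument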

7.2.4 Internal Consistency methods.

An internal consistency estimate is obtained through an analysis of the individual items following a
single administration of the measuring instrument.

a) The rational equivalence method

The rational equivalence method requires neither the splitting nor the repeating of items. The
method copes with the problem of forming equivalent halves by considering all possible splits. The
reliability coefficients are computed using either the Kuder-Richardson formulas or Cronbach's alpha.

The coefficients of Kuder and Richardson (1937) are used to estimate the reliability of
instruments composed of dichotomously scored items. Dichotomous items are scored one or zero
for the presence or absence of, or for positive and negative responses to, a characteristic under
investigation. Kuder-Richardson estimates represent the expected correlation between an
instrument and its hypothetical alternative containing the same number of items.
The formulas are numbered. The most widely used are numbers 20 and 21. KR21 is simpler and
may be used for instruments developed by individual researchers. KR20 is technically more
satisfactory but more complex and is used to determine the degree of reliability of standardized
tests.
The formula for KR20 is

KR20 = [N / (N - 1)] [1 - (Σ pi qi) / σt²]

where N = the number of dichotomous items
pi = the proportion of positive responses to the ith item
qi = 1 - pi
Σ = summation over the items
σt² = the variance of the total composite

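A minimal Python sketch of the KR20 computation follows; the small matrix of dichotomous (0/1) responses is invented purely for illustration.

scores = [            # rows are respondents, columns are items (1 = positive, 0 = negative)
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 1],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
]
n_items = len(scores[0])
n_respondents = len(scores)

pq_sum = 0.0
for i in range(n_items):
    p = sum(row[i] for row in scores) / n_respondents   # proportion of positive responses
    pq_sum += p * (1 - p)

totals = [sum(row) for row in scores]                    # composite (total) scores
mean_total = sum(totals) / n_respondents
variance_total = sum((t - mean_total) ** 2 for t in totals) / n_respondents

kr20 = (n_items / (n_items - 1)) * (1 - pq_sum / variance_total)
print(round(kr20, 3))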
The reliability estimates obtained from the Kuder-Richardson formulas and Cronbach's alpha represent the average of all possible split-half coefficients for the group of respondents.

Cronbach's alpha (Cronbach, 1951) is a unique estimate of the expected correlation of one instrument with an alternative form composed of the same number of items. It is a generalization of the Kuder-Richardson coefficient which can be used for polychotomous items. The value of alpha depends on the average inter-item correlation and the number of items in the instrument. As the average correlation among items increases and as the number of items increases, the value of alpha increases.

The formula for Cronbach’s alpha is given as:

α = [N / (N - 1)] [1 - Σσ²(yi) / σ²t]

Where N = the number of items
      Σσ²(yi) = the sum of the item variances
      σ²t = the variance of the total composite

Using the correlation matrix, alpha reduces to the expression

α = N r̄ / [1 + r̄ (N - 1)]

Where N = the number of items
      r̄ = the mean inter-item correlation

The mean inter-item correlation is obtained by summing the coefficient values of the correlation matrix and dividing by their number. The correlation matrix for a ten-item instrument may read as follows:

Table 7.1: Correlation matrix for a ten-item instrument (upper triangle)

Item     2     3     4     5     6     7     8     9    10
 1    .050  .469  .577  .189  .060  .070  .080  .081  .090
 2          .369  .230  .221  .521  .631  .181  .082  .192
 3                .338  .233  .421  .081  .182  .081  .234
 4                      .248  .568  .321  .123  .281  .568
 5                            .182  .238  .156  .250  .050
 6                                  .523  .123  .510  .421
 7                                        .850  .520  .615
 8                                              .123  .531
 9                                                    .821

The average inter-item correlation is obtained by summing all 45 correlations and dividing by 45, that is,

.050 + .469 + .369 + .577 + .230 + .338 + .189 + .221 + .233 +

.248 + .060 + .521 + .421 + .568 + .182 + .070 + .631 + .081 +

.321 + .238 + .523 + .080 + .181 + .182 + .123 + .156 + .123 +

.850 + .081 + .082 + .081 + .281 + .250 + .510 + .520 + .123 +

.090 + .192 + .234 + .568 + .050 + .421 + .615 + .531 + .821 =

13.685; and 13.685 ÷ 45 = 0.304
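
Substituting this mean inter-item correlation into the reduced formula gives an alpha of about .81 for the ten-item instrument. A minimal sketch of that arithmetic (Python, using only the figures from Table 7.1) follows.

N = 10          # number of items
r_bar = 0.304   # mean inter-item correlation from Table 7.1

alpha = (N * r_bar) / (1 + r_bar * (N - 1))
print(round(alpha, 3))   # about 0.814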

Unless the alternate form method is adopted, alpha should be used to estimate the reliability of multi-item instruments since it encompasses the Spearman-Brown prophecy formula and Kuder-Richardson 20.

In order to minimize the effects of random measurement errors, the reliability of instruments used for research should not be below .80.

b) Standard error of measurement

The standard error of measurement is an estimate of test reliability obtained from the reliability
coefficient and the standard deviation of test scores. It is inversely related to the reliability
coefficient. As the reliability coefficient increases, the magnitude of the standard error of
measurement decreases. This is because tests of higher reliabilities are subject to smaller error. In
such tests the deviation of an individual's scores from the 'true' score would be quite small. The true score is that which would be obtained from a perfectly reliable instrument. The true score could be obtained hypothetically by testing the individual an infinite number of times and taking the average of these scores.

The standard error of measurement is computed from the formula:

Sm = SD √(1 - r)

Where SD = the standard deviation of the set of scores
      r = the reliability coefficient

Errors of measurement are normally distributed such that about two-thirds of all test scores are expected to lie within plus or minus one standard error of measurement of their true score; about 95 percent will lie within plus or minus two standard errors; and 99 percent will lie within three standard errors. One would then expect to find the true score within a band of scores. If Sm is large, the band of scores will be wide, such that the observed score is not so dependable as a measure of the particular characteristic. For example, if the observed score is 90 and the standard error is 5, there would be 2 chances in 3 that the true score lies between 85 and 95; 95 chances in 100 that it lies between 80 and 100; and 99 chances in 100 that it lies between 75 and 105.

If Sm were smaller, for example 2, then one could be 68% certain that the true score lies between 88 and 92; 95% certain that it lies between 86 and 94; and 99% certain that it lies between 84 and 96. In the latter case, the band of scores within which one expects to find the true score is narrower and so greater confidence can be placed in the observed score as a measure of the particular characteristic.
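
The bands described above follow directly from the formula. The short sketch below (Python, with hypothetical values chosen so as to reproduce the first example) computes the standard error of measurement and the 68%, 95% and 99% bands around an observed score of 90.

import math

sd = 10.0       # standard deviation of the test scores (hypothetical)
r = 0.75        # reliability coefficient (hypothetical)
observed = 90   # an observed score

sem = sd * math.sqrt(1 - r)   # standard error of measurement = 5.0

for k, label in [(1, "68%"), (2, "95%"), (3, "99%")]:
    print(label, observed - k * sem, "to", observed + k * sem)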

c) Factor analysis

The use of factor analysis makes it possible to obtain estimates of reliability that approximate the
true reliability better than all other coefficients.

Coefficient theta (θ), derived from principal components factor analysis, can be used to estimate reliability. Theta is given by the formula:

θ = [N / (N - 1)] [1 - 1/λ]

Where N = the number of items
      λ = the largest or first eigenvalue

Theta may be considered a special case of Cronbach's alpha, that is, a maximum alpha coefficient. The advantages of theta are two-fold. One is that it provides a closer estimate of true reliability than alpha. Another is that it is not as restrictive as alpha. The assumption underlying the computation of alpha is that the items are parallel, thus measuring the particular characteristic equally. If this assumption turns out to be untenable, such that the items measure the characteristic unequally or measure more than one characteristic, factor analysis would identify the clusters of items which measure each characteristic, that is, those items that are more highly correlated with each other than with other items.
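
As an illustration, the sketch below (Python, using a purely hypothetical three-item correlation matrix) extracts the largest eigenvalue with a standard linear-algebra routine and substitutes it into the theta formula.

import numpy as np

# Hypothetical inter-item correlation matrix for three items
R = np.array([
    [1.00, 0.40, 0.35],
    [0.40, 1.00, 0.45],
    [0.35, 0.45, 1.00],
])

N = R.shape[0]
eigenvalues = np.linalg.eigvalsh(R)   # eigenvalues of the symmetric correlation matrix
first = eigenvalues.max()             # largest (first) eigenvalue

theta = (N / (N - 1)) * (1 - 1 / first)
print(round(theta, 3))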

Another coefficient, omega (Ω), derived from factor analysis, provides an estimate of reliability that
is even closer to the true reliability than theta. However, unlike theta, omega does not assess the
reliability of separate scales if all the items happen not to be parallel measures. Omega is derived
from the common factor model of factor analysis.

It is given by the formula:

Ω = 1 - (N - Σhi²) / (N + 2b)

Where N = the number of items
      b = the sum of the correlations among the items
      hi² = the communality of the ith item

If the closest estimate to true reliability is desired, omega should be used. However, if the assumption of parallel items is untenable, theta is more useful than omega.
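
For completeness, a sketch of the omega arithmetic based on the formula above (Python). In practice the communalities would come from a common factor analysis; the matrix and communalities used here are purely illustrative.

import numpy as np

# Hypothetical inter-item correlation matrix for three items
R = np.array([
    [1.00, 0.40, 0.35],
    [0.40, 1.00, 0.45],
    [0.35, 0.45, 1.00],
])

# Hypothetical communalities for the three items (would come from a common factor analysis)
h2 = np.array([0.45, 0.55, 0.50])

N = R.shape[0]
b = R[np.triu_indices(N, k=1)].sum()   # sum of the correlations among the items

omega = 1 - (N - h2.sum()) / (N + 2 * b)
print(round(omega, 3))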

7.3 Validity

A highly reliable measuring instrument is of no practical use if it is not valid. Reliability is a necessary but not sufficient condition for validity.

Validity refers to the degree to which an instrument measures exactly what it purports to measure
and nothing else. Validity is always specific to some particular use. An instrument may be valid for
one purpose but not for another. For example, an instrument of high validity for indicating
reasoning ability may possess a low degree of validity for measuring computational skill. Besides,
the instrument may be valid only in a specific cultural or geographical setting.

Three major types of validity are recognized, namely, content validity, criterion-related validity and
construct validity.

7.3.1 Content validity

Content validity is the extent to which the items of an instrument are representative of the content
and behaviours specified by the theoretical concept being measured.

Content validity is estimated by comparing the sample of items with the universe of content and behaviours they are intended to represent. If the sample of items covers all aspects of the content and behaviour, a high degree of content validity is attained.

For achievement tests, tables of specifications, which systematically specify the content, objectives
and evaluation techniques, are constructed in the process of generating valid test instruments. This
procedure enhances the likelihood of obtaining an adequate sample of test items.

For measures of a construct, several experts may be used to identify all the sub-components of the
construct and to assess how closely related to the construct the items are. In addition, factor
analysis is applied to data generated from trial testing of the instrument. Items whose
correlations/factor loadings suggest that they are measuring a different construct may then be
modified or eliminated.

None of the methods of estimating content validity is completely satisfactory because it is difficult
to specify the entire universe of behaviour/content, which makes up the abstract concepts being
measured. If the entire universe of content/behavior is unknown, there is no way of ascertaining
that it has been adequately sampled.

Face validity

Content validity is sometimes erroneously equated with face validity. Face validity refers to the
subjective judgement of assessors about what an instrument appears to be measuring, that is, on face
value. No systematic procedure is adopted.

7.3.2 Criterion-Related validity

Criterion-related validity indicates the extent to which an instrument yields the same results as a
more widely accepted measure. The degree of criterion-related validity is obtained by correlating
the results obtained from the administration of the two instruments. The resulting correlation
coefficient is an estimate of criterion-related validity. For example, if the results yielded by a newly developed instrument for measuring IQ correlate very highly (.95) with the results of the Stanford-Binet Intelligence Test, the new instrument is said to possess a high degree of criterion-related validity. In this case, the result of the widely acclaimed instrument, the Stanford-Binet test, is the criterion variable. Similarly, the Eysenck Personality Inventory can be used as a criterion for newly developed personality measures.
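
In computational terms, the validity coefficient is simply the correlation between the two sets of scores for the same respondents. A minimal sketch (Python, with hypothetical scores) is shown below.

import numpy as np

# Hypothetical scores for the same respondents on the new instrument and on the criterion measure
new_instrument = np.array([102, 95, 110, 88, 120, 99, 105, 92])
criterion = np.array([100, 97, 112, 85, 118, 101, 103, 90])

validity = np.corrcoef(new_instrument, criterion)[0, 1]
print(round(validity, 3))   # a high value suggests criterion-related validity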

The limitation of criterion-related validity is that relevant criterion variables do not exist for some
constructs. It is especially difficult to find appropriate criteria for assessing instruments measuring
very abstract concepts, that is, those concepts for which corresponding overt behaviours are
unknown.

Criterion-related validity is of two types, namely, concurrent validity and predictive validity.

a) Concurrent Validity

Concurrent validity is applicable when a new instrument is administered at the same point in time as
the well-known instrument. A short time interval may be allowed between the administration of the
old and new instruments.

Concurrent validity is useful when the equivalent form of an instrument is needed. Equivalent
forms may be desirable for several reasons. One reason is to minimize the recall of previous
responses by respondents on a second administration of an instrument. Another reason is to update
the older instrument. A third reason is to obtain an alternative, more easily administered
instrument. Otherwise, a shorter, timesaving, or more economical version of the instruments may
be desired.

Concurrent validity is attained if the correlation between the results of the newly developed instrument and those of a suitable equivalent is sufficiently high.

b) Predictive Validity

Predictive validity is the extent to which the predictions made by an instrument are confirmed by
the later behaviour of respondents. Predictive validity is concerned with the prediction of future
performance, for example, prediction of success at school from scores on intelligence tests, other aptitude tests, and preference scales; or prediction of performance in undergraduate courses from JAMB examination results.

The second instrument is administered after the behaviour that the first instrument attempts to
predict has occurred. An appropriate time interval is required between the administration of the two
instruments. The results yielded by the two instruments (criterion measures) are then correlated. A
sufficiently high correlation index indicates predictive validity.

7.3.3 Construct Validity

Construct validity is the extent to which an instrument measures a hypothetical construct, which it
sets out to measure. A construct is a quality, which we assume to exist in order to explain some
aspects of behaviour. Examples of constructs are anxiety, creativity, attitude, and intelligence.

The process for establishing construct validity varies. One method is to carry out factor analysis after an administration of the instrument using a larger pool of items than required for the final version of the instrument. Items which show low or negative correlations are omitted from the final list.
Another method involves the comparison of the characteristics of individuals who score highly on the instrument and those whose scores are low. An instrument which differentiates between the two groups of persons is likely to be a measure of the proposed construct.

A third method is to compare the results yielded by the instrument with those yielded by a measure
of some other phenomenon to which the proposed construct is positively related as posited in a
theoretical relationship. For example, if aptitude is posited to be related to achievement, then an instrument measuring achievement is expected to yield scores which correlate positively with the scores of an aptitude measure of proven validity and reliability. If the achievement measure is consistently found by different researchers to yield scores which correlate positively with several other indicators that are posited to be positively related to achievement, then greater confidence can be placed in the construct validity of the achievement measure.

7.4 Concluding Note

It is necessary to cross-validate instruments using a different sample before the final estimates of validity and reliability are made. Any one sample would incorporate some idiosyncrasies peculiar to that sample.

Eight. Methods of Data collection
8.1 Overview

In the first chapter, we described research as systematic inquiry. Any systematic inquiry must obtain
evidence on the basis of which inferences and conclusions are drawn. The process of obtaining
evidence is here referred to as data collection.

Data collection in educational research may take many forms. Notable among the many forms are
tests, projective techniques, consultation of records and documents, direct observation, interviews,
and questionnaires. Adequate discussion of all these methods requires the writing of a separate and
voluminous book. Generalized and widely applicable methods such as questionnaires, interviews,
and observation are discussed in this chapter.

Whichever method of data collection is used by a researcher, trial testing of the measuring
instruments should be undertaken using a few subjects whose characteristics are similar to those in
the sample. The multifaceted purpose of trial testing is to ensure a satisfactory level of
functionality, to estimate reliability, to obtain new insights, and to eliminate ambiguities.
It is important that satisfactory measuring instruments are evolved before pilot work of the entire
study is undertaken. Instruments may need to be modified several times.

Otherwise, instrumental problems may be mistaken for genuine difficulties encountered in piloting.
This may lead to unnecessarily expensive repetitions of the entire pilot work. It may also lead to
abandoning the research erroneously since the essence of piloting is to ascertain the feasibility of
the study. In pilot work, every aspect of the research is evaluated, with particular attention paid to procedures for testing the hypotheses.

8.2 Questionnaire

A questionnaire is a carefully designed instrument for collecting data in accordance with the specifications of the research questions and hypotheses. It elicits written responses from the subjects of the research through a series of questions/statements put together with specific aims in mind. The questionnaire may be used to ascertain facts, opinions, beliefs, attitudes, and practices.

The components of a questionnaire are the title, the introduction, the response instructions,
biographical information, the questions/statements, return instructions and gratitude. A letter of
reference from an appropriate authority may accompany a questionnaire to elicit the co-operation of
the respondents.

The title of a questionnaire is an appropriate caption for the substantive content of the questionnaire,
not the topic of the research project, for example, “student’s views about laboratory facilities” or
“Attitude to school questionnaire, ASQ”.

The introduction bears the main objective of the research or questionnaire, whichever is appropriate.
A guarantee of anonymity of the respondents and confidential treatment of the information supplied

should be included in the introduction. By assuring respondents that no information will be traced back to them, trust is established with the respondent and the likelihood of eliciting accurate, frank and comprehensive information is increased.

The response instructions should specify the mode(s) of completion of the questionnaire. Respondents should specifically be instructed to fill in the blanks; underline; or put a cross, circle, or tick
in the appropriate place. If young or inexperienced persons are involved, some examples should be
given. A familiar sample question and answer could be provided and followed with one or two
practice questions which should be answered before proceeding to the main questions. If the
respondents are to underline their chosen response, a sample question and answer would read thus:

“The colour of a ripe banana is green, yellow, black, blue.” A more elaborate and complex example
is provided in the appendix.

Biographical information refers to personal data of the respondents, which is required for analysis,
and interpretation of the data such as type of school, class in school, occupation, sex, income, age,
experience, qualifications, social class, and marital status. Only variables that must be used for
analysis and interpretation should be included. One should be extremely cautious if information on certain sensitive variables must be sought. Respondents are likely to be less relaxed when required to supply such information, and it might be better obtained indirectly. Similarly, if the exact age or length of experience is not strictly required, it is advisable to provide for age ranges, or for over and under a certain age, rather than exact ages, for example, 30-40; under 50; above 21. Questions requiring information on social class, income, occupation, and qualification easily lend themselves to prestige bias. Respondents are generally predisposed to over-claim on such variables, thus introducing response errors. When strictly required, such questions should be phrased in as permissive a manner as is possible.

The actual questions/statements deal with the substantive content of the research. They may
require factual answers, opinions, or evaluations. Each question/statement is supposed to elicit
answers to specific issues addressed by the research. The questions must not be written aimlessly,
haphazardly, or shoddily. They must all be relevant to the hypotheses and research questions.
Nothing is gained in unnecessarily lengthy questionnaires. In contrast, respondents’ motivation
may be adversely affected. That is not to say that the same question may not be asked using different formats, as is done when checking the consistency of responses; in that case, the repeated questions serve an essential function. Similarly, several questions/statements may be used to measure one construct out of a concern for reliability.

Having chosen the appropriate questions to ask, care should be taken to word the questions in such
a manner that required information is obtained with a minimum of distortion. The language should
be simple, clear, and precise. Ambiguous, suggestive, leading, antagonistic, and embarrassing
questions, and questions which invade privacy, must be avoided. A common error in question wording is the phrasing of double-barrelled questions/statements. The answer to one aspect of a double-barrelled
question may be contradictory to the answer when the other aspect of the question is considered.

Consider an apparently simple, clear, and precise question such as: "Do boys and girls fight in the classroom?" Is it asking whether boys fight, whether girls fight, or whether fighting between boys and girls occurs in the classroom? Granted that these three interpretations are possible, the question is ambiguous. The question could also be double-barrelled in the event that boys fight among themselves but girls do not. The respondent is left in a fix because of the double-barrelled nature of the question. A "yes" response would be true of boys but not of girls. A "no" response would be true of girls but not of boys.

The double-barrelled questions/statements encountered in practice in questionnaire items are, on face value, often much worse than the above example. A typical double-barrelled question is: "Do you use class work and tests for continuous assessment?" The teacher might use either class work or tests, but not both.

Questions/statements should be edited several times from different points of view. They should
subsequently be restructured, discarded, or corrected. Finally, they should be assessed during trial testing, checking, in addition to the above criteria, the suitability of the language for the conceptual level of the respondents.

Return instructions direct the respondents to deposit the completed questionnaires at a designated collection point, to mail them to a certain address, or to hand them over to someone who waits to collect them.

The gratitude section ends the questionnaire. It is essential to recognize that the respondent is under
no obligation to complete the questionnaire and is doing the researcher a favour. This is done by
ending with a statement of thanks to the respondent for taking the time and effort to complete the
questionnaire.

Even if the content of the questionnaire is superb, poor production quality could make a mess of a good job. The appearance of a questionnaire should, therefore, be given proper consideration as it can affect the attitude of the respondent. The printing should be good. The layout and spacing should be appealing rather than cluttered. If different response modes are used, all questions requiring one mode of response should be grouped together.

A watchword to keep in mind during questionnaire construction and production is respondent orientation. Conscious effort should be made to see things from the respondent's perspective.

8.2.1 Structured and Unstructured questionnaires

Based on the format for statements/questions and responses, questionnaires may be classified as
structured/closed and unstructured/open.

In structured/closed questionnaires, alternative responses are supplied and the respondent has to
choose from the available alternatives. Respondents do not have the freedom and opportunity to
express their view. In unstructured questionnaires, blanks are provided for the respondent to freely
supply spontaneous responses. None of the two types is fundamentally better than the other. Each
type should be used when it is considered most appropriate.

Structured questionnaires are preferred if the number of possible responses is limited. If, for instance, we wish to find out which methods a teacher uses for the assessment of students, the possible methods are finite. If, however, we wish to ascertain the reasons for the use of these methods, any number of unimagined reasons is possible. Since the reasons are limitless, they may not be comprehensively captured by providing alternative responses. Even when we provide alternative responses, they would be based on the results of a pilot study in which open-ended responses yielded a discrete number of reasons, which are then used as alternative responses.
Secondly, structured questionnaires are preferred if comparison among groups of
respondents is important for the study. This is because all the respondents consider the same
universe of content.
Thirdly, structured questionnaires are preferred if the respondents are more competent in
reading than they are in writing.
The advantages of the structured questionnaire are the gain in speed during administration of
the questionnaire and easy quantification and analysis of the responses especially with the aid of the
computer. Structured questionnaires are easier and quicker to complete because minimum writing
is involved. More questions are answered within a unit time.
Structured questionnaires also have disadvantages. One disadvantage is the loss of
spontaneity and expressiveness in the responses. This may lead to a second disadvantage, namely,
the loss of rapport among respondents when the available alternatives fail to do justice to their ideas
and opinions. Thirdly, there is the possible bias of forcing respondents to choose among given
alternatives and of suggesting ideas that were never on their minds. The information obtained may
be regarded as prompted responses with a consequent reduction in their validity. One possible
negative result of the above circumstances is that respondents may complete the questionnaire
haphazardly checking any alternative without thought.
The circumstances under which unstructured questionnaires would be preferred are
elucidated below. Firstly, in cases where answers that respondents would never have imagined are
suggested by alternative responses, it is better to use unstructured questionnaires. Secondly, for
exploratory studies in which the researcher has limited or no clues regarding the likely responses,
unstructured questionnaires have to be used. Thirdly, if it does not make sense to anticipate the
responses of the target population, alternatives may be too lengthy and/or too many to give the
questionnaire an appeal in terms of appearance. Apart from the aesthetics, such lengthy
questionnaires may be too costly for the researcher’s resources. An unstructured Questionnaire may
then be preferred.
The main advantage of unstructured questionnaires is the freedom and spontaneity of
expression granted to the respondents and the consequent rapport and enhanced validity of
responses.

The disadvantages of unstructured questionnaires include the following. Firstly, quantification of the responses is cumbersome. Secondly, the coding of the different response
categories is time-consuming and may lead to loss of information, as a variety of responses may be
lumped into one category for convenience. Thirdly, it is difficult to make comparison across groups
of respondents. Fourthly, some significant points may be momentarily forgotten or inadequately
verbalized at the time of completing the questionnaire.

A detailed treatment of the quantification of structured and unstructured questionnaire data may be
looked up in Oppenheim (1966) by interested researchers.

8.2.2. Modes of administration of questionnaires

The modes of administering questionnaires to respondents may be categorized into three, namely:
mailing, personal delivery with collection on the spot, and personal delivery with collection after a time interval. Each of these modes is discussed below. Various combinations of these three modes are possible. For example, a questionnaire may be delivered by hand and returned by post.

Mail questionnaire

The mail questionnaire is sent by post to the respondent who is expected to complete and
return it by post to the researcher. The researcher should encourage a high response and return rate
by enclosing stamped self-addressed envelopes and if appropriate tangible inducements, such as
gifts. Suitably worded and non-threatening reminders should be sent to those who do not reply by a
given time. Where possible, arrangements should be made for substitute respondents.

Mail questionnaires are preferred when respondents are separated from the researcher by
enormous distances. They are also preferred when socially unacceptable responses are inevitably
required such that personal contact may increase the likelihood of faking.
The advantages of mail questionnaires are that they can reach respondents who are very far
away; the respondents can afford time to consult sources of information; the researcher conserves
time and traveling expenses. Anonymous mail questionnaires increase the chances of obtaining
valid but socially unacceptable responses.
The disadvantages of mail questionnaires are several. Firstly, mail questionnaires are found to
produce very poor response rates. Low percentage returns reduce the sample size and non-random
subject loss leads to sampling bias. Secondly, since the researcher is not available to clarify any ambiguities and misinterpretations, mail questionnaires are not suitable for use with persons of low
intelligence or low educational background. Mail questionnaires must be piloted many times before
the final administration in order to minimize misinterpretations.
Thirdly, mail questionnaires may be passed on to persons who are more competent than the
selected respondent to answer the questions. This results in distortions to the sample. Fourthly, the
researcher has no control over the time it would take to receive all the responses and, therefore,
cannot plan for when to start analysis of data. A planned chronogram cannot therefore be followed.
Fifthly, especially in countries with poor postal services, delivery and return of the questionnaire by
post cannot be guaranteed.

Personal Administration with on-the-spot collection

The investigator or her research assistants can deliver questionnaires in person. These persons wait for respondents to complete the questionnaires and collect them back. This mode guarantees a hundred per cent delivery and return. Personal administration of questionnaires provides opportunity for clarifying questions to be asked by respondents and for explanations to be given by the

researcher. Ambiguities are thus kept to a minimum. The researcher is also in control of the time
for completing the project. This further saves her some anxiety.

Personal delivery is however, not free from problems. One of its demerits is that the
respondent may not have on-the-spot answers to all the questions; a school principal may need to
access the school board to obtain certain information before some responses can be provided.
Another demerit is that the presence of the investigator may influence the respondents to fake
responses or put them under psychological tension. The validity of the responses is thus reduced.
Lastly, the personality of the researcher may positively or negatively affect the diligent completion
of the questionnaire.

Personal delivery with collection after a time interval

A researcher could deliver questionnaires in person and return to collect them after a period of time.
The advantages of this mode over on-the-spot collection are that respondents are afforded time to look up information and are more relaxed while completing the questionnaire in the absence of the researcher. This mode is preferred when documents and other sources need to be consulted in order to respond appropriately. Its disadvantages are as follows.

Firstly, some of the respondents may not be available when the researcher returns to collect the completed questionnaires; some of them may have left the scene. Secondly, time
and money may be wasted in repeated trips to check on respondents who continually procrastinate
the completion of the questionnaires. Thirdly, mass consultation among respondents in close
proximity to each other may take place in the researcher’s absence. This may give rise to uniform
or stereotyped responses, which reduce the validity of the data.

As discussed above, any mode of questionnaire administration has merits and demerits. The
choice of any mode or a combination of two modes should be dictated by the particular
circumstances and the problem under investigation.

8.2.3 Appraisal of the questionnaire method of data collection

The questionnaire is a very popular method of data collection in education and in the behavioural
sciences in general. This is because of the relative ease and cost-effectiveness with which it is
constructed and administered to large samples when compared to other methods. It is not as time-
consuming or cumbersome as interview or observation. Research assistants can be used with
minimum training. Even postal services can be used for its administration. The major limitation of
the questionnaire is the insistence on written responses by subjects. Other methods must be used to
elicit responses from those who cannot write. These include the handicapped, the illiterate and the
semi-literate. Other methods are discussed in the sections that follow:

8.3 Interview

An interview is a face-to-face interaction in which oral questions are posed by an interviewer to elicit oral responses from the interviewee. It should be realized that an interaction takes place among the interview situation, the interviewer, the interviewee and the interview schedule. For maximum success in an interview, the interview situation should be kept as flexible as possible.

8.3.1 Phases of an interview

Four major phases may arbitrarily be distinguished in an interview. These are the preparation phase,
the phase in which rapport is established, the question-answer phase, and the recording phase. The
phases overlap and interact.

Preparation

Much of the success of an interview will be credited to how well the interviewer has prepared.
During the preparation phase, decision is taken on the mode of recording responses. Any recording
instruments are checked and rechecked for validity and reliability. They are subjected to a trial run to ensure that they are in good working condition. If a tape recorder is used, for instance, new batteries may need to be inserted and tried out. If pictures are to be used for probing, they should be selected and tried out. If a gift is to be used for establishing rapport or expressing gratitude, it
should be selected, for example, a bottle of wine.
The cultural background of the interviewee should be ascertained so that an appropriate salutation may be chosen in advance for the first encounter. Biographical data about the interviewee should be checked out so that he/she is addressed by the appropriate titles during the first encounter. If he/she
has some significant awards and distinctions, these should be noted so they can be mentioned while
establishing rapport.

The questions should be derived from well-defined hypotheses/research questions and edited.
During editing, issues of appropriate length, relevance, ‘palatability’, clarity, simplicity, precision,
language and conceptual levels of questions should be considered. As with questionnaire items, a
respondent-orientation or focus must be maintained. Offensive and derogatory insinuations should
be expunged. The questions are then handed to an external assessor for critique. The external
assessor should be provided with information about the research objectives and the target
population. After critique, a trial testing of the questions is carried out with a sample whose
characteristics are similar to those of the interviewees. Questions are then reviewed thoroughly to
produce the final list. Questions on the final list should be studied, memorized, and rehearsed to
mastery level in the sequence in which they are likely to be asked.

Rapport

A cordial atmosphere is a sine qua non for valid data collection through interviews. No rule about
establishing rapport applies to every situation. A mature investigator should size up the situation
and evolve appropriate strategies. Some guidelines are offered as follows. The researcher should courteously seek permission from an appropriate authority. While doing this, information should be provided with regard to the objectives of the study and the nature of the interview. Notification should then be given to the interviewee and an appointment booked for date, time and venue. The venue agreed on should be prepared for optimum comfort in terms of sitting arrangements, ventilation, lighting, and decoration. As far as possible, it should be noise-proof. The chosen time should be such that neither the interviewer nor the interviewee
would be hurrying from the interview to another engagement. Anxiety to meet up with other
commitments is likely to reduce the comprehensiveness and validity of information obtained from an
interview. The interviewer should give due consideration to his/her appearance in terms of
appropriate inoffensive dressing, neatness, dental and body deodorization and any other aspect that
is likely to irritate the interviewee.

The first contact with the respondent must be friendly, pleasant and courteous, using appropriate
salutation and addressing the interviewee with correct and appropriate titles. Compliments may be
paid even at the risk of flattery. This is likely to keep the interviewee relaxed. The interviewer
should also be relaxed, not tensely or nervously trying to recall the list and sequence of questions.
After appropriate salutation and compliments, the interviewer introduces himself/herself briefly and
modestly and introduces the problem, which is the focus of the interview.

Question-answer

The question and answer phase should be more permissive, flexible, and interactive than a plain question and answer session. The interviewee should be kept interested and responsive till the end of the interview using any appropriate strategy. The interviewer should be pointed and business-like, not aimlessly wandering. Prompting and probing to yield comprehensive information should follow up the starter questions; pictures may be used, or the interviewee may be asked to expand on an earlier statement. While doing this, interviewer bias should be avoided by being as non-directive as possible. All throughout the question-answer phase, the interviewer must be relaxed, not tense or nervous.

Recording

Comprehensive information should be recorded from an interview as unobtrusively as possible. Recording may be done via mental notes, written notes, or taped records.

Mental note

Taking mental notes in the memory removes apprehension on the respondent’s part thus increasing
rapport. However, an interviewer should appropriately assess himself in terms of memory retention
before deciding to use mental notes. The problem of forgetting can be quite acute, as missing information cannot be easily reconstructed.

Written note

Written records are advisable when there are too many questions and responses to keep track of via
mental notes. However, extensive writing is likely to excite or offend respondents thus reducing
rapport and validity of responses. People who are versed in shorthand can use this advantage in
making written notes

Another way out is to use a structured interview schedule in which alternative responses are provided in advance and merely checked off. It is thus essential that a structured schedule should make provision for "others", that is, alternative responses which were unanticipated. Structured schedules are time-consuming to prepare but the fruits of the input are reaped during analysis.

Taped Records

Tape recording of interviews solves the problems of memory loss and extensive writing. It also removes strain from the interviewer. Tapes can be replayed and transcribed at leisure. However, one must be cautious of the fact that audio or video tape recording instruments can go faulty in the process. Whatever can be done during the preparation phase must be done. In addition, problems of instrument reactivity are inherent in tape recording. Respondents could be frightened or excited during recording, thus channelling attention away from the project and distorting respondents' behaviour. As far as possible, recorders should be concealed. Micro recorders are less identifiable than macro recorders. Remote sensors or pick-up buttons are also less detectable and should be used when available.

8.3.2 Relative Advantages of Interviews

Compared to the questionnaire, as a method of data collecting, the interview has the following
advantages.

Firstly, interviews do not require reading or writing skills from the respondents. They can,
therefore, be used with persons who cannot read or write.
Secondly, the information obtained is not as restrictive as in the questionnaire. Responses
are more spontaneous.
Thirdly, face-to-face interaction with probing elicits more information than the
questionnaire. Opportunity is provided for clarification. Questions are allowed on both sides
Fourthly, the researcher can control the order and sequence in which questions are asked and
answered. The respondent does not know what is coming next unlike questionnaires, which may be
perused before responses are made thus introducing bias in the responses.
Fifthly, the researcher has the opportunity to assess the mood of the interviewee and to respond and adjust to it in order to elicit maximum information.
Sixthly, interviews yield a hundred per cent response returns.
Lastly useful relationships may be established between the interviewer and the respondent.

8.3.3 Relative Disadvantages of Interview

Compared to the questionnaire as a method of data collection, the interview has the following
disadvantages.
Firstly, absolute lack of anonymity is inherent in the interview. This implies that socially
unacceptable responses that are encouraged by anonymous mail questionnaires may not be
obtained.
Secondly, sources of bias abound in the interview. Bias may arise from selective recording
of responses, from faulty interpretations of responses, from the interview situation, from expectations expressed facially or by tone of voice, from directive probing, and from the influence of the interviewer's personality and appearance.
Thirdly, information may be lost during recording
Fourthly, the interview is expensive because traveling and subsistence costs are incurred for
each personal interview. When using research assistants to cover larger samples, additional
expenses are incurred in training them. This is because more skill is required for effective
interviewing than for effective questionnaire administration. Interviews are not so suitable for large
samples.
Lastly, the interview is more time-consuming to prepare for and to execute than the
questionnaire.

8.3.4 Appraisal of the interview method of data collection

The interview method is more thorough in investigative approach than the questionnaire. Any
information obtained from the questionnaire can be obtained from an interview but not vice versa.
However, it is not so popular in educational research because of limitations in resources, time and
money. Its main limitation lies in the non-anonymity of respondents. This increases the likelihood
of faking. In addition, bias may arise from many sources including probing, recording, and
interpreting responses.

8.4 Observation

Information provided by respondents in questionnaires and interviews can be inaccurate, prestige-biased or faked. In contrast, observational techniques of data collection make it possible for the
researcher to obtain first-hand information about the objects, events and object-event interactions of
interest. Direct observation may be undertaken using any of the natural senses or technological
gadgets such as tape and video recorders and satellite monitors. Observation is preferred when
studying children, illiterates, traits that cannot be tested with pen and paper, and phenomena, which
must be looked at. It may be used alone or to supplement information collected using other
methods.

The observation method of data collection uses systematic procedures to identify target phenomena,
to categorize, observe, and record them.

8.4.1 Phases of the observation method.

Six phases of systematic observation may arbitrarily be distinguished. These are:

(a) Definition of aims and objectives,


(b) Selection and definition of attributes,
(c) Selection of observation modes and training of observers,
(d) Administrative arrangements,
(e) Observation, and
(f) Quantification of observation

Definition of aims and objectives

Systematic observation starts by defining the focus of observation. It is neither necessary nor
feasible to observe everything in any situation. A selection process by which the researcher decides
what to observe is imperative. This is done by defining the aims and objectives of the observation
as derived from the hypotheses and/or research questions. For example, if the research question is:
"What teaching aids are used in biology lessons?" the researcher must ignore other aspects of lessons
such as teacher-pupil relationship, content structure, verbal interactions, questioning techniques and
focus only on teaching aids. The objectives of observation could be to identify different types of
teaching aids used, to count the numbers of each type of teaching aid used, and to record the
number of times each teaching aid is used. The specification of these objectives guides the
researcher to collect only data that is useful for solving the research problem. In the above example,
for instance, the researcher will not collect data on how well each teaching aid was used.

Selection and definition of attributes

Having defined the objectives of observation, the researcher proceeds to select and define the target
objects and events. For example, he selects the items that would be classified and accepted as
teaching aids such as charts, models, specimens, projectors, microscopes, drawings, pictures et
cetera. He then defines the characteristics of each item that differentiates it from other items. For
example, what differentiates a picture from a model? The term “used” should also be defined. For
example, is a chart being used when it is displayed in the classroom, when it is referred to, or when
it is used to explain phenomena? The essence of this procedure is to prevent a confusion in
categorization among different observers, or by the same observer at different times, in different places, and in different encounters. It enhances the reliability of the counting and recording procedure. Attributes observed should, as much as possible, be externally observable and denotable characteristics, not abstract qualities such as inquiry or honesty. Abstract qualities should be
defined by their denotable characteristics.

Selection of observation modes and training of observers

After selection of the target objects and events, a decision is taken on the mode of observation to
adopt. Observation can be carried out using the natural senses or technological gadgets. The

observer can be a participant in the target situation or a non-participant. What is most appropriate
for a particular situation should be ascertained before the actual observation is undertaken.

The major consideration that should guide the choice of observation modes is minimum interference
with the observed situation. Interference is minimized when using one-way screens, remote sensors, lighting differentials and elevated corridors to keep the observer out of the view of participants. Otherwise, micro-recorders may be used to make the subjects less conscious of their being observed. If the researcher is a participant observer, he or she must not play a leadership role. Having selected the modes of observation, observers should be trained, in practice sessions, to use the chosen modes of observation reliably. Trial testing in similar situations will serve the dual role of training and refinement of procedures. If trial testing is not possible, videotaped recordings of
similar situations may be used to train observers to observe and code attributes accurately. Inter-
rater reliabilities could be computed as the number of agreements between instructor and trainee
divided by the total number of agreements and disagreements.
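
As a simple illustration of this agreement index, the sketch below (Python, with hypothetical category codes) compares the codes assigned by an instructor and a trainee over the same ten observation intervals. The categories and values are invented for illustration only.

# Hypothetical category codes assigned by instructor and trainee over ten intervals
instructor = ["chart", "model", "chart", "specimen", "chart", "model", "chart", "chart", "specimen", "model"]
trainee = ["chart", "model", "chart", "chart", "chart", "model", "model", "chart", "specimen", "model"]

agreements = sum(1 for a, b in zip(instructor, trainee) if a == b)
total = len(instructor)   # agreements plus disagreements

inter_rater_reliability = agreements / total
print(inter_rater_reliability)   # 0.8 in this hypothetical example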

Observers who are not consistent should be replaced. After training, observation should commence
without an appreciable time lapse to prevent loss in observer skills.

Administrative arrangements

It is virtually impossible to carry out systematic observation without making prior arrangements.
The essence of making arrangements is to obtain valid data. If one is to observe school lessons, for
instance, the co-operation of school heads and teachers must be sought; technological gadgets must
be appropriately mounted; strategies must be adopted to ensure minimum distortion of the
phenomena under observation.

Observation

Observation should focus on low-inference, denotable characteristics rather than high-inference
abstract qualities. For example, if observation is carried out to determine teachers’ enthusiasm,
denotable indicators could be punctuality, regularity, extra hours put in after school. Enthusiasm is
then inferred rather than observed. There should be an integrative theoretical or empirical basis for
the inference.

The number of visits needed for reliable observations must be considered. If one is estimating the
frequency of use of different teaching aids, the number of visits required to obtain an adequate sample of lessons for just one teacher may be as many as thirty. To be very generous, most studies cannot afford more than a
dozen observations on a single teacher. Particularly for cognitive variables, this situation is
unsatisfactory for obtaining a trustworthy mean score for one teacher.

Another issue to be taken into consideration during observation is observer effects. It is virtually
impossible to observe without introducing some distortion to the events that would have occurred if
observation were not taking place. Possible techniques for minimizing the distorting influence of
observation are habituation, assessment of effects, and remote presence. Habituation is a situation

in which the observer stays in the setting long enough to be taken for granted prior to the
observation sessions. The observer’s presence is then assumed to be as unnoticed as pieces of
regular furniture in the setting. This technique is not foolproof because some subjects are bound to
remain self-conscious or suspicious. Assessment of observer effects may be made if they cannot be
eliminated. For example, if unacceptable behaviours occur in spite of the observation, one may
assume that the distorting effect of observation was minimal. Remote presence means that the
observer and observation gadgets are placed at a distance that is ignored by the subjects. For
example, the author once used video recorders in a mobile technology vehicle positioned in an adjoining field to observe lessons. Overhead microphones inside the building were connected to a video monitor inside the vehicle. On another occasion, the teachers whose lessons were observed were made to wear tiny microphones with pocketed micro recorders. The observer stayed away from the classroom. Observer effects were assumed to be less distorting than in a situation where the
observer sat in the class ticking off categories on an observation schedule.

Finally, the researcher must consciously guard against halo effects and interpretation bias during
observation

Quantification of observation

The first step in quantifying observation is coding. Multiple coding systems have been evolved.
Three major systems, which may subsume others, are the sign system, the category system, and the
rating system.

The sign system (interval recording) records an event once within a specified time period regardless
of how often it occurs during that period. For example, the Science Teaching Observation
Schedule, STOS (Eggleston et al., 1975), is a sign system which requires the observer to make a
running record of the activities of teachers and pupils (see appendix).

The category system records an event each time it occurs. For example, the Biology Teacher
Behaviour Inventory, BTBI (Evans, 1969; Balzer, 1969), is a category system which makes it
possible to code pedagogically relevant non-verbal behaviours in classrooms.

The rating system estimates the frequency of events only once, usually at the end of an observation
session. Many observation schedules used in teacher education to supervise practical teaching
lessons of pre-service and in-service student teachers belong to this system.

It should be noted that anecdotal records do not usually require quantification.

Arithmetic processes are applied directly to quantitative counts obtained from sign and category
system data to yield relevant interpretations. For example, the STOS differentiates teachers into
three groups. Group 1 teachers are those who hold the initiative for activities but challenge students to
generate hypotheses and solve problems. Group 2 teachers are those who initiate activities that are
predominantly oriented to fact acquisition. Group 3 teachers are those who allow for pupil-centered
work and practical activities. The implicit interpretation is a discovery-didactic axis on which

Group 3 teachers are at the extreme discovery end while group 2 teachers are at the extreme didactic
end with group 1 teachers in between.

In the case of rating systems, the frequency estimate of each event is scaled ordinally. The ordinal
scores of all the events are totaled. The total scores indicate the level of excellence at which each
subject performs.

8.4.2 Problems of observation

Many problems are inherent in observation as a method of data collection. Notable among them are
the following.
Firstly, observer effects are virtually inevitable in the method. They can be minimized by
unobtrusive methods as listed in section 8.4.1.
Secondly, to be properly executed, observation requires expertise. The researcher may not
have the expertise and would have to be trained before he can undertake the study. Additional
assistants often need to be used to conserve time or maintain objectivity and they also have to train.
Thirdly, the number of observations needed to obtain a representative sample of events is often prohibitive. Many researchers, therefore, resort to studying the target phenomena shoddily.
Fourthly, interpretation bias can distort the observed events, the researcher reporting a 'coloured' version rather than the findings per se. This bias can be minimized in a number of ways. One strategy is to train research assistants to undertake the observation objectively; they are trained to record events accurately but are not furnished with information about the objectives of the study or the events that would confirm the researcher's expectations. Another strategy is to use multiple independent observers whose records must agree. Permanent records, which allow for analysis and re-analysis, may also be used.
Fifthly, halo effects (later records of observation being affected by earlier impressions) can
reduce the validity and reliability of information collected through observation.
Lastly, when rating systems are used for recording and quantifying observation, many
shades of meaning may be attached to the same scale point, and between any two contiguous scale
points. Rating errors may occur as a result of ambiguities in the meaning of the scale points. For
example, what is considered as excellent articulation by one observer may be regarded as fair by a different observer. Besides, rating systems are subject to errors of leniency and response sets such as the
tendency to rate subjects towards the middle, rather than at either of two extremes.

8.4.3 Disadvantages of observation

Observation studies require enormous amounts of time, energy, and resources to be properly executed. In
practice, they use small samples, which reduce their internal validity (as a result of unrepresentative
samples) as well as the external validity/generalizability of their findings.

8.4.4 Advantages of observation

When properly executed, observation provides unique insights not attained by other methods. It
yields direct first-hand information, which is therefore more valid than reported information

obtained from questionnaires and interviews. Observation is peculiarly suitable for the study of
young children, handicapped persons, and illiterates.

8.4.5 Appraisal of observation as a method of data collection

Observation is the only way of obtaining direct first-hand information. It provides unique insights.
Additional, unexpected but useful information may be encountered during observation sessions. It
is the method of choice for the study of young children and handicapped persons.

However, even when properly executed, it involves the expenditure of enormous amounts of time, energy and resources. Whereas a questionnaire could be administered to a class of thirty students within twenty minutes, observing the same class may require a minimum of ten sessions of one-hour duration. Technological gadgets may be required and training in observation skills is mandatory.

Because of the cumbersome nature of observation and because some events cannot be subjected to
systematic observation, this invaluable method is poorly patronized in educational research.

Nine. Writing Research Reports

9.1 Introduction

Research is useful for generating findings, which are used for rational decision-making and as a
springboard for further research. Unless the findings and the procedures used to generate them are
disseminated, research cannot serve these purposes. In order to disseminate research findings,
research reports are written for circulation to a wide audience.
The impersonal mode is preferred for communication in writing research reports. Instead of
stating: “I carried out the study in two phases. I identified the behavioural profiles of the
teachers”, the research report would state: “The study was carried out in two phases. The
behavioural profiles of the teachers were identified.”
The arrangement of different parts of the report should make it possible for a reader to easily
locate any section of particular interest to him/her. For example, the reader may wish to peruse the
results, or to examine only the research procedure.
Conventional formats for arranging research reports differ for different types of research
reports. The different forms of presenting research reports include the thesis/dissertation, the
journal article, the conference, seminar or workshop paper, and the book or monograph. The different
forms require different modes of presentation of the completed research. In the sections that follow,
the requirements for theses, journal articles, and conference papers will be discussed.

9.2 Sample format of thesis/dissertation

As a guideline to research students, a conventional format for arranging reports in a
thesis/dissertation is given below. Particular theses need not have all the sections.
Relevant sections should be used in the appropriate order. The format is as follows:

1. Preliminary Pages

(a) Title page


(b) Acceptance page
(c) Dedication
(d) Acknowledgment page
(e) Abstract
(f) Table of contents
(g) List of tables
(h) List of figures
(i) List of appendices

2. Introduction

(a) Background to the problem


(b) Statement of the problem
(c) Significance of the study
(d) Objectives of the study
(e) Scope of the study
(f) Area of study/context of the study
(g) Research questions and hypotheses
(h) Definition of terms

3. Review of Literature

4. Research Methodology

(a) Research design


(b) Population
(c) Sampling techniques and sample
(d) Instrumentation
(e) Data collection
(f) Data analysis techniques
(g) Limitations of the study

5. Results and Discussion

(a) Presentation and analysis of data


(b) Interpretation of findings

6. Summary and Conclusion

(a) Summary of results


(b) General conclusions
(c) Implications of the study and/or recommendations
(d) Suggestions for further study

7. Supplementary Pages

(a) Bibliography
(b) Appendix
(c) Index

The different sections of the format above are described below.

9.2.1 The Preliminary pages

The title page is the first of the preliminary pages. The title should be clear, brief and to the point.
The essential elements to be embodied in the title are the major variables and the target population.
They should be phrased in such a manner as to describe what the study is about. They must not be
phrased in an emotionally laden manner to suggest that a particular point of view is being sold to
the reader. An example of a thesis title is “Sex differences in science enrolment among SSCE
candidates in 1995”, with sex differences in science enrolment as the major variable and SSCE
candidates in 1995 as the target population. In contrast, “A study of the enrolment of the boys and
girls who sat for science in the SSCE in 1995” is superfluous. The second title contains many extra
words which offer no advantage in the identification of the research. The title page also carries the
name of the awarding institution, the date of the award, and the student’s name and matriculation
number where applicable. Sometimes the previous academic qualifications of the student are
appended to his name.

The acceptance page is laid out in a manner specified by the institution to which the thesis is
submitted for a degree. It contains any of the following information: the names and signatures of the
departmental head, the dean and the supervisors, with dates; the name(s) of the student(s); and an
attestation of the originality of the research report. Some institutions require, in addition, the name
and signature of the external examiner.

The dedication permits emotionally laden words in which tribute is paid to persons who are dear to
the author or persons who would particularly be interested in the research findings.

The acknowledgement page expresses gratitude to those who helped the researcher in the process of
conducting the research and preparing the report.

The abstract succinctly summarizes the aim of the investigation, the sample, methods of
investigation, the measuring instruments used, and the findings.

The table of contents lays out in tabular form, the chapters, headings and subheadings of the report
with the page numbers in which various sections of the report may be located. It is sequentially
arranged and numbered from preliminary to supplementary pages.

The list of tables is similar to the table of contents. It serially lists the table numbers and titles and
indicates the page numbers on which each table may be located. Similarly, the list of figures tabulates
the figure numbers, their titles, and the pages where each figure presented in the report may be
located. The list of appendices is also serially arranged in numerical order.

9.2.2 The Introduction

The background to the problem presents reasoned statements to show that it is worthwhile to
expend resources to carry out investigation in the problem area. The reasoning should be clear
and convincing to the reader.

The essential elements of the problem statement were fully discussed in the chapter on “The
Research Problem”.

The significance of the study should be written to show the utility value of the research. The
findings of research are expected to profit some institutions or individuals. These institutions or
individuals, and the benefits expected to accrue to them, ought to be mentioned.

The objectives of the study should state the specific aspects of the problem investigated in the
research and the reasons for focusing on these aspects. The objectives should briefly overview all
the elements that would be investigated. Sometimes this section is interchanged with the purpose of
the study.

The scope of the study should indicate the extent to which the researcher intended to cover the topic,
the geographical area, time period, and variables to be covered. Sometimes this section is
interchanged with delimitation of the study.

In the area of study section, a detailed description is given of the location and peculiar features of the
geographical area covered.

The peculiar features may involve a context; for instance, the context of frequent teacher strikes.

The research questions and hypotheses were treated in greater detail in the relevant chapter.

The definition of terms section is used to educate the reader on the operational meaning of any
coined words, phrases or expressions which cannot otherwise be understood because of their
unconventional usage. Any terms to which appropriate meaning is attached by conventional usage
should not be included in this section. The essence of definition is to ensure that the reader
understands the specific meanings ascribed to the terms by the author.

9.2.3. Research methodology

The research design should be written up to show the extent to which extraneous variables were
controlled or eliminated. Any lapses should be reported as limitations. The research design need
not fall into the neat categories described in the chapter on research design. A combination of
designs may be used. Effective control of extraneous variables may dictate the use of unlabelled
designs. Any plan used should be clearly described even if it cannot be classified under a
conventional label such as a Solomon four-group design or fractional design.

The population description should specify all the necessary parameters to ensure that the constituents
and characteristics of the target population are unambiguous. The population must not be mistaken
for the area of study. For example, the area of study could be Nigeria while the population is
secondary school students. Tabulating the constituent elements and their characteristics often
enhances the population description. For example, the different constituent elements of secondary
school students in Nigeria may be students in federal government schools, students in state
government schools, and students in private schools. They could also be categorized as students in
coeducational schools, single-sex girls’ schools and boys’ schools. They may be further
differentiated as male and female students.

A hypothetical tabulation of these constituent elements could read as follows:

Table 9.1. Population of secondary school students in Nigeria

                            Federal                  State                        Private
                            Male       Female        Male         Female          Male       Female     Total
Girls' schools              -          100,000       -            500,000         -          50,000     650,000
Boys' schools               150,000    -             1,000,000    -               100,000    -          1,150,000
Coeducational schools       100,000    50,000        5,000,000    5,000,000       500,000    200,000    8,850,000
Total                       150,000    150,000       6,000,000    3,500,000       600,000    250,000    10,650,000

The sampling techniques should be described in such a manner as to leave no one in doubt of what
the researcher actually did in selecting the sample. Statements such as: “A simple random sample of 50
subjects was drawn from the population” are inadequate. The specific manner in which the simple
random sample was drawn should be reported; for example, “A table of random numbers was used to select
50 out of 200 subjects”. Similarly, statements such as: “The balloting technique was used” should
be amplified by describing whether pieces of numbered paper were jumbled in a box and members
of the population were called up to pick, et cetera.
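
For illustration only, the selection of 50 out of 200 subjects described above could equally be carried out with a computer in place of a printed table of random numbers. In the sketch below the subject numbers 1 to 200 and the seed are hypothetical; the point is simply that the selection procedure is explicit and repeatable, and can therefore be reported exactly.

    # Illustrative sketch of simple random sampling without replacement,
    # mirroring the use of a table of random numbers or the balloting technique.
    import random

    random.seed(2024)                    # recording the seed makes the draw repeatable
    population = list(range(1, 201))     # hypothetical subjects numbered 1 to 200
    sample = random.sample(population, 50)
    print(sorted(sample))                # the 50 subjects selected for the study

Whatever means is used, the report should state it in enough detail for another researcher to repeat the selection.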

Under the subheading “Instrumentation”, the tool(s) for data collection such as questionnaires,
attitude scales and tests should be fully described to show their essential characteristics. Reliability
indices and validation procedures should be reported. If a standard instrument was used, the report
should give reasons why it was considered appropriate and show that the conditions for its
administration were fulfilled. If new instruments were developed, the procedures followed should be
outlined. The detailed substantive content of the instruments should be placed in the appendix.

The section on data collection should show the method(s) by which the researcher obtained the
data. For example, whether research assistants were used, whether they underwent training,
whether the researcher was present at each location to collect the data or postal systems were used.
If permission was sought before data could be collected, such information should be reported. If
instruments such as recorders went faulty during data collection and steps were taken to correct
them, these should be reported. Such information may help a future researcher to be forearmed.
The inclusion of practical details and problems encountered serves the additional purpose of
confirming that the student actually carried out the investigation and experienced the realities of
research. All steps taken to ensure the collection of valid data should be reported.

The section on data analysis techniques describes the techniques applied to the data and the reasons
for their choice where applicable. Reasons for choice should be related to the research design, the
nature of the sample and the type of data. If the mode of analysis is not widely known, it should be
reported in detail. As far as possible, the simplest, well-known techniques should be used. It is not
necessary to report the formulae and details of computation of popular techniques such as
correlation, t-tests, chi-square tests, and ANOVA.
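
As an aside, the popular techniques mentioned above are nowadays computed with statistical software rather than by hand. The fragment below is a hedged illustration only: the two sets of scores are invented, and the independent-samples t-test from scipy is just one of several tools that could perform such an analysis.

    # Illustrative sketch: an independent-samples t-test comparing two groups'
    # scores (invented data), as might be reported in the data analysis section.
    from scipy import stats

    group_a = [62, 58, 71, 66, 60, 74, 69, 63]
    group_b = [55, 49, 64, 58, 52, 61, 57, 50]

    t_statistic, p_value = stats.ttest_ind(group_a, group_b)
    print(f"t = {t_statistic:.2f}, p = {p_value:.3f}")

Only the test used, the obtained statistic and the significance level need to appear in the report; the computation itself belongs to the researcher’s working papers.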

Limitations of the study should state the desirable conditions which were not met and which are
expected to influence the internal and/or external validity of the research. For example, an
experimental study may inevitably be limited in the generalisability of its findings to the target
population because of restrictive experimental conditions, or it may fail to control for all the relevant
extraneous variables, thus reducing the internal validity of the study.

9.2.4 Results and discussion

Data presentation and analysis should clearly and concisely set out the results using the most
illuminating modes of presentation. Tables and figures should be exploited fully. All tables and
figures should be serially numbered and must have titles/headings. The findings of interest
displayed therein should be highlighted using brief verbal descriptions. They should be directly
related to the hypotheses and/or research questions. Sometimes it is appropriate to organize the data
presentation and analysis around the hypotheses, treating each hypothesis in turn. Other details of
the data may be included in the appendix.

Interpretation of the findings should render the results more meaningful to the reader by discussing
possible explanations for the findings. In doing this, relevant literature should be cited to provide
convincing evidence that the interpretation makes a contribution to existing theory and knowledge
in the area. All the insights obtained while analyzing the data should be made available to the
reader.

9.2.5 Summary and conclusion

The summary should clearly and concisely restate the problem, the hypotheses and/or research
questions, and findings.

The conclusion should be based solely on the findings generated by the research.

Under implications of the study, the student may include personal ideas on the relevance of the
findings to educational theory and practice. However, these ideas should be directly derived from
the study. A common error among students is to use this section for speculative statements about
which the research had no evidence. Nothing is gained from an unnecessarily lengthy implications
section. Instead, the student loses marks for speculating outside the scope of the study.

Genuine suggestions for further study should be made as matters arising from the
research.

9.2.6 Supplementary pages

The bibliography must include all references cited in the report. Related literature which sheds light on
the problem but was not cited may also be included for more exhaustive study by interested readers.
The arrangement of the bibliographical entries depends on the referencing style. Whichever style is
used, the format should be uniform throughout the thesis. Most institutions prefer the author/date
system in which the entries are arranged in alphabetical order of the authors’ names. Where an
author has more than one work in a year, the alphabetical suffixes a, b, c, d, etc. are added to the
year of publication to distinguish between the different publications. They should be serially
arranged within the year. If an author has publications in different years, these should also be
serially arranged with the earlier works entered first. Mistakes often occur in the entries. A student
is therefore advised to be meticulous in crosschecking the details of the entries. Further details are
provided under the section heading “References”.
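
The ordering rules just described (alphabetical by author, earlier years first, and letter suffixes distinguishing works by the same author in the same year) can also be checked mechanically. The sketch below is illustrative only: the entries are invented, and a real bibliography would normally be compiled against the chosen style manual or with reference-management software.

    # Illustrative sketch of the author/date ordering rules: sort by author and
    # year, then attach the suffixes a, b, c, ... to same-author, same-year works.
    from itertools import groupby
    from string import ascii_lowercase

    entries = [
        {"author": "Okafor, P.", "year": 1992, "title": "Second study"},
        {"author": "Abimbola, T.", "year": 1990, "title": "Early work"},
        {"author": "Okafor, P.", "year": 1992, "title": "First study"},
        {"author": "Okafor, P.", "year": 1988, "title": "Older work"},
    ]

    entries.sort(key=lambda e: (e["author"], e["year"], e["title"]))
    for (author, year), group in groupby(entries, key=lambda e: (e["author"], e["year"])):
        group = list(group)
        for i, entry in enumerate(group):
            suffix = ascii_lowercase[i] if len(group) > 1 else ""
            print(f'{author} ({year}{suffix}). {entry["title"]}.')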

The appendix contains extra information which is part of the report that the reader should know about
but is not necessary for inclusion in the main report. Such extra information may be in the form of the
complete substantive content of instruments used in the research, samples of collected data in the
original form such as transcripts of lessons, and full-length results too cumbersome to be included
in the main report.

An index is not essential to a thesis but may be included in rare cases where it is considered
useful.

9.3 Writing journal articles

The writing of journal articles is an art which the uninitiated researcher has to learn and practise
before she can perfect it.

The format for writing journal articles is usually prescribed by specific journals. Before writing a
journal article, the author should consult the section on “Notes to Contributors” usually found on
the inside of a journal’s front cover or on the preliminary pages. Each journal specifies the required
length for articles, the type of articles the journal is interested in, and the style manual that should
be followed. Perusal of articles in earlier volumes of the journal is useful.

Because journals have constraints on space, articles are very concisely written. A study
which is reported in 200 pages of a thesis would usually be reported in not more than 15 pages of a
journal. The problem, hypotheses, methodology and findings are briefly stated; the literature
review is reported in very economical words to present a reasoned case for the article. The
contribution to knowledge must be elucidated for the article to be accepted for publication.

A journal is usually directed at an informed audience, which is expected to understand the succinct
expressions used, verbosity being reduced to the barest minimum. Flamboyant elaborations are not
tolerated. The amateur would benefit from the guidance of experienced authors on initial manuscripts.
The author should check for the most appropriate journals and tailor the article to the specifications
of such journals.

9.4 Conference/seminar papers

Conference papers are written in a similar manner to journal articles but greater lengths are
usually tolerated. They are written in a less formal manner than journal articles, being targeted at
a listening audience.
Many conference papers are eventually published in journals.

9.5 References

All references cited while writing a research report must be listed with full details. The researcher
should ensure that the details of the author, date, place of publication, publisher and page numbers
are accurate. This enables the reader to retrieve the sources cited. Sometimes additional references
not cited are included in the reference list, which is then called a bibliography. Such additional
references draw the reader’s attention to further literature in the area.

If the Turabian or numeric index referencing system is used in the text, the reference details are
listed at the bottom of each page. Numeric superscripts are used to identify each material in the text
and in the footnote. Otherwise, the list comes at the end of the last chapter. Each entry begins with
the author’s surname, and the entries are arranged in alphabetical order of the authors’ surnames. Each entry is
indented. If the author’s work was cited in a secondary source, the author of the secondary source
is entered in the reference list.

In the classic system, the full details of a work earlier listed are replaced by the Latinized
abbreviations ibid., op. cit. and loc. cit. after the author’s name. In the American Psychological
Association (APA) system (APA, 1983), which is followed in this book, full bibliographical details
are always provided. Using the APA system, the modes of entry for books, government
documents, periodicals, technical reports, monographs, conference and workshop proceedings,
dissertations, unpublished manuscripts, personal communications, and minutes of meetings are
discussed below.

9.5.1 Books

A book reference should include the following elements in the sequence in which they are listed.

(a) Surname of author(s) or editor(s) separated from the initials with comma(s)

(b) Initials of the author(s) or editor(s)

(c) Editor or editors specified using the abbreviations Ed. or Eds. in parenthesis.

(d) Year of publication (in parenthesis) separated from the title with a full stop. If the book
has been reprinted, the year of first publication is used. For a revised edition, however,
the date of revision is used.

(e) Title to include sub-titles where applicable (underlined) delimited by a full stop.

(f) Editions other than the first to be specified using such abbreviations as 2nd ed., rev. ed.,
African ed. (in parenthesis). A full stop is used to separate the edition from the volume
number, series number or town of publication, whichever is applicable.

(g) Volume number and/or series number and a full stop.

(h) Town of publication separated from the publisher with a colon. If the towns are more
than one, they should be delimited using commas.

(i) Publisher (followed by a full stop).

Examples of book references are given below:

Biehler, R.F. & Snowman, J. (1982). Psychology Applied to Teaching (4th. Ed.) Boston: Houghton
Mifflin Company.

Bloom, B.S., Engelhardt, M.D., Furst, E.J. Hill, W.H. & Krathwohl, D.R. (1956). Taxonomy of
Educational Objectives. Handbook 1: Cognitive Domain. New York: Longmans Green Company.

9.5.2 Articles or Chapters in books

The essential features of references to articles in books are listed sequentially below.

(a) Author of article or chapter

(b) Year of publication (in parenthesis). The year of publication is separated from the title
with a full stop.

(c) Title of article. The title is NOT underlined. It is delimited with a full stop.

(d) The term “in” is introduced to show that the article is contained in a more
comprehensive work.

(e) Editor(s) of the book. Note that the initials of the editor(s) come before the surname.

(f) Ed. or Eds. (in parenthesis) separated from the title by a full stop.

(g) Title of the book (underlined) delimited by a full stop.

(h) Town of publication separated from the publisher with a colon.

(i) Publisher delimited with a comma.

(j) Page numbers in which the article is found. This is followed by a full stop

Examples of references to articles in books are given below:

Olumu, Namu, V. & Muzeyi, U. (1990). Trends in unpublished higher degree research in
Uganda (1965-1987). In R.O. Ohuche & M. Anyanwu (Eds.), Perspectives in Educational Research
and National Development, Volume 1. Onitsha: Summer Educational Publishers Limited, 63-73.

9.5.3 Government documents

Examples of references, which are classified as government documents are given below:

The Republic of Uganda (1981). National Policy on Education (rev.ed.). Kampala: Government
Press.
Implementation Committee for the National Illiteracy Programme (1978-79). Blueprint. Kampala:
Government Press.
The National Planning Committee (1986). Blueprint on Education for the Gifted and Talented
Persons. Kampala: Special Education Section, Ministry of Education.

9.5.4 Periodicals

Periodicals are issued at regular intervals. They include journals, magazines, bulletins, newsletters,
and newspapers. The essential features to be sequentially included in references to articles in
periodicals are as follows:

(a) Author’s surname, separated from the initials with a comma. An article without an
author is entered under the name of the periodical.

(b) Year of publication enclosed in parenthesis. Most newspapers are daily periodicals. The
specific date may be indicated and separated from the year of publication with a comma.
A full stop is used to separate the date of publication from the title of the article.

(c) Title of the article. Only the first letter of the title is written in capitals. All other words
begin with small letters except for proper nouns. The title of the article is delimited with
a full stop.

(d) Name of the periodical underlined. The name is separated from the volume number with a
comma. The key words of the name of the periodical begin with capital letters.

(e) Volume number underlined. The issue number, enclosed in parenthesis, accompanies the
volume number when pagination is by issue. In such cases, each issue begins on page
one. When pagination is continuous throughout the volume, the issue number may be
omitted. A comma is used to separate the issue and/or volume number from the page numbers.

(f) Page numbers in which the article appears. The page numbers are delimited with a full
stop. If the page numbers are not contiguous, all the pages are listed and separated by
commas. For newspapers and magazines the abbreviations ‘p.’ for one page and ‘pp.’
for more than one page are inserted to indicate the page numbers.

Examples of specific references to periodicals are listed below:

Ali, A. (1988). The status of infrastructural presence for teaching introductory technology in rural
schools. BENSU Journal of Education, 1 (2), 202-213.

The Guardian (1993, July 13). When examination malpractices unsettle scholars’ reliability. The
Guardian, 10(5792), pp.1, 5.

9.5.5 Technical reports and monographs

References for technical reports and monographs are entered in the same manner as books.
However, some monographs or reports may be put together without a publisher. Some examples
are given below:

UNESCO (1981). Bibliography of Major Decisions, Meetings and Documents Related to
UNESCO’s Activities in the Domain of Mother Tongue Instruction and Languages. Paris:
UNESCO (Division of Structures, Content, Methods and Techniques of Education).

9.5.6 Published conferences and workshop proceedings

The references for conference and workshop proceedings are documented in the same manner as for
periodicals except that volume and issue numbers are not usually available. Some examples are
given below:
Okeke, E.A.C. (1986). Remedies for students’ poor performance in science. Proceedings of the 27th
Annual Conference of the Science Teachers Association of Nigeria, 118-129.

Harding, J. (1982). Sex stereotyping: Power and control. In J. Head (Ed.), Science Education for
the Citizen. The proceedings of the UK/USA Science Education Seminar held at Chelsea College,
University of London, 77-78.

Onabamiro, S.D. (1984). Keynote address at the National Workshop on the planning for the senior
management officers. Report of the National Workshop for senior managers, 17-21.

9.5.7 Unpublished conference and workshop papers; unpublished manuscripts

Unpublished works are listed without underlining as follows:

Ivowi, U.M.O. (1985). A critique of the national policy on secondary education. Commissioned
paper read at the National Workshop on the National Policy on Education, Curriculum Organization
of Nigeria, University of Nigeria, Nsukka chapter, 12-16 February.

Balogun, T.A. (1975). Some Current science curriculum developments: level and global rationale.
Unpublished Mimeograph. University of Ibadan.

Nsereko, U. (1980). Perspectives in morality. Unpublished Manuscript.

9.5.8 Published Dissertations

Published dissertations are documented by citing the volume, issue, and page numbers in which their
abstracts appear in the Dissertation Abstracts International and their order number, as shown by the
following examples:

Abimola, I.O. (1984). A study to describe and explain the alternative conceptions related to human
respiration held by selected Nigerian form four students. Dissertation Abstracts International,
45(5), 1357-A, No. DA 8414218.

Nkpa, N. (1984). Clear biology teaching: student and observer perspectives. Dissertation
Abstracts International, 45(9), 2746-A, No. DA 84267:7.

9.5.9 Unpublished dissertation

Unpublished dissertations are listed without underlining. The educational institution awarding the
degree must be mentioned. The number of pages may be included. Examples are listed below:

Nkpa, N. (1980). Kinetic structure of microlessons of science student teachers at the Alvan Ikoku
College of Education. Unpublished M.Ed. Thesis, State University of New York.

Okpala, P.N. (1985). Teacher attitudinal variables instructional and assessment practices as
correlates of learning outcomes in physics. Unpublished doctoral dissertation, University of Ibadan.
261 pp.

9.5.10 Personal communications

Personal communications include letters, memos, and telephone conversations. These are cited
only in the text by the author’s name and date thus:

M.A. Madu (personal communication, June 12, 1994).

Minutes of meetings are cited by the details indicated on the minutes for example:

ASUU (1996). Minutes of the meeting of the Academic Staff Union of Universities, University of
Ilorin Branch, held on June 13, 199

References
Bakare, C.G.M. (1977). Vocational Interest Inventory. Ibadan: Psycho-Educational Research
Productions.

Berelson, B. (1952). Content Analysis in Communication Research. New York: Free Press.

Bloom, B.S., Hastings, J.T., & Madaus, G.F. (1971). Handbook on Formative and Summative
Evaluation of Student Learning. New York: McGraw-Hill Book Company.

Brown, W. (1910). Some experimental results in the correlation of mental abilities. British Journal
of Psychology, 3:296-322.

Carmines, E.G. & Zeller, R.A. (1979). Reliability and Validity Assessment. London: Sage
Publications.

Cronbach, L.J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16: 297-334.

Denzin, N.K. (1978). Sociological Methods: A Sourcebook (2nd ed.). New York: McGraw-Hill.

Kuder, G.F. & Richardson, M.W. (1937). The theory of the estimation of test reliability,
Psychometrika, 2: 151-160.

Medley, D.M. (1979). The effectiveness of teachers. In P.L. Peterson & H.J. Walberg (Eds.),
Research on Teaching: Concepts, Findings, and Implications. Berkeley, California: McCutchan
Publishing Corporation.

Oppenheim, A.N. (1966). Questionnaire Design and Attitude Measurement. London: Heinemann
Educational Books Ltd.

Ornstein, A.C. & Levine, D.U. (1981). Teacher behaviour research: Overview and outlook.
Phi Delta Kappan, 62: 592-596.

Power, C. (1977). A critical review of science classroom interaction studies. Studies in Science
Education, 4: 1-30.

Roethlisberger, F.J. & Dickson, W.J. (1939). Management and the Worker. Cambridge, MA:
Harvard University Press.

Summers, G.F. (1977). Attitude Measurement. London: Kershaw Publishing Company Ltd.

