
UNIT 1 – PART A

Define statistics. Explain the categories of statistics.
• 'Statistics' means the numerical presentation of facts. The word has two senses, plural and singular. In the plural, 'statistics' means a collection of
numerical facts or data, for example price statistics, agricultural statistics, production statistics, etc. In the singular, the word means the statistical methods by which the
collection, analysis and interpretation of data are accomplished.
• Categories of Statistics:
• Descriptive statistics
• It deals with the presentation and collection of data. This is usually the first part of a statistical analysis. It is usually not as simple as it sounds, and the statistician
needs to be aware of designing experiments, choosing the right focus group and avoiding the biases that so easily creep into an experiment.
• Different areas of study require different kinds of analysis using descriptive statistics. For example, a physicist studying turbulence in the laboratory needs the
average quantities that vary over small intervals of time. The nature of this problem requires that physical quantities be averaged from a host of data collected
through the experiment.
• If a business analyst is using data gathered on a group to describe or reach conclusions about that same group, the statistics are called descriptive statistics. For
example, if an instructor produces statistics to summarize a class’s examination effort and uses those statistics to reach conclusions about that class only, the
statistics are descriptive. Many of the statistical data generated by businesses are descriptive. They might include number of employees on vacation during June,
average salary at the Denver office, corporate sales for 2009, average managerial satisfaction score on a company-wide census of employee attitudes, and
average return on investment for the Lofton Company for the years 1990 through 2008.
• Inferential statistics
• Inferential statistics, as the name suggests, involves drawing the right conclusions from the statistical analysis that has been performed using descriptive statistics. In
the end, it is the inferences that make studies important and this aspect is dealt with in inferential statistics.
• Most predictions of the future and generalizations about a population by studying a smaller sample come under the purview of inferential statistics. Most social
sciences experiments deal with studying a small sample population that helps determine how the population in general behaves. By designing the right experiment,
the researcher is able to draw conclusions relevant to his study.
• If a researcher gathers data from a sample and uses the statistics generated to reach conclusions about the population from which the sample was taken, the
statistics are inferential statistics. The data gathered from the sample are used to infer something about a larger group. Inferential statistics are sometimes
referred to as inductive statistics. The use and importance of inferential statistics continue to grow. One application of inferential statistics is in pharmaceutical
research. Some new drugs are expensive to produce, and therefore tests must be limited to small samples of patients. Utilizing inferential statistics, researchers
can design experiments with small randomly selected samples of patients and attempt to reach conclusions and make inferences about the population. Market
researchers use inferential statistics to study the impact of advertising on various market segments. Suppose a soft drink company creates an advertisement
depicting a dispensing machine that talks to the buyer, and market researchers want to measure the impact of the new advertisement on various age groups. The
researcher could stratify the population into age categories ranging from young to old, randomly sample each stratum, and use inferential statistics to determine
the effectiveness of the advertisement for the various age groups in the population. The advantage of using inferential statistics is that they enable the researcher
to study effectively a wide range of phenomena without having to conduct a census.
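The contrast between the two categories can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of scores is simulated, the names describe and mean_confidence_interval are hypothetical, and the interval uses the normal approximation, which is one of several possible inferential procedures.

```python
import math
import random
import statistics

def describe(values):
    """Descriptive step: summarise only the data actually observed."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }

def mean_confidence_interval(sample, confidence=0.95):
    """Inferential step: normal-approximation interval for the population mean."""
    n = len(sample)
    centre = statistics.mean(sample)
    # z-value for the requested two-sided confidence level (about 1.96 for 95%).
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * statistics.stdev(sample) / math.sqrt(n)
    return centre - margin, centre + margin

rng = random.Random(42)                                  # fixed seed for repeatability
population = [rng.gauss(50, 10) for _ in range(10_000)]  # simulated (hypothetical) scores
sample = rng.sample(population, 100)                     # small random sample, no census

summary = describe(sample)                   # says something about the sample only
low, high = mean_confidence_interval(sample) # says something about the population
print(summary)
print(f"95% interval for the population mean: {low:.2f} to {high:.2f}")
```

The summary describes the 100 sampled values and nothing more; the interval is a statement about all 10,000 values, made without examining them all.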
Trace the historical development of statistics.

• The word statistics is derived from the Latin word “status” or the Italian word “statista,” and the meaning of
these words is “political state” or "government." Shakespeare used the word statist in his drama Hamlet
(1602). In the past, statistics was used by rulers. The application of statistics was very limited, but rulers and
kings needed information about land, agriculture, commerce, populations of their states to assess their
military potential, their wealth, taxation and other aspects of government.
• Gottfried Achenwall used the word Statistik at a German university in 1749 to mean the political science of
different countries. In 1771 W. Hooper (an Englishman) used the word statistics in his translation of Elements
of Universal Erudition, written by Baron J. F. von Bielfeld. In that book, statistics was defined as the science that
teaches us the political arrangement of all the modern states of the known world. There is a big gap
between the old statistics and modern statistics, but the old statistics still forms a part of present-day
statistics.
• During the 18th century, English writers began using the word statistics in their works, and the subject has developed
gradually over the last few centuries. A lot of work was done at the end of the nineteenth century.
• At the beginning of the 20th century, William S. Gosset developed methods for decision making based on
small sets of data. During the 20th century, several statisticians were active in developing new methods,
theories and applications of statistics. These days, the availability of electronic computers is certainly a major factor in
the modern development of statistics.

• The history of statistics can be traced back at least to the biblical times in ancient Egypt, Babylon, and Rome.
• EGYPT (3,500 BC).
• Used statistics in the form of
• recording the number of sheep or cattle owned
• the amount of grain produced
• the number of people living in a particular city
• BABYLONIAN GOVERNMENT (3,800 BC)
• Used statistics to measure
• the numbers of men under a king's rule
• the vast territory that the king occupied
• ROMAN EMPIRE (700 BC)
• Used statistics by
• conducting registration to record population for the purpose of collecting taxes.
• MODERN TIMES
• Statistical methods have been used to record and predict such things as:
• birth and death rates
• employment and inflation rates
• sports achievements
• other economic and social trends
• Statistical methods have also been used to assess opinions from polls, to break secret codes, and to analyze games of chance.
• JOHN GRAUNT - English tradesman with whom modern statistics is said to have begun; created the first mortality table, a table that shows how long a person may be expected to live after reaching
a certain age.
• CARL FRIEDRICH GAUSS - Brilliant German mathematician; used statistical methods in making predictions about the positions of the planets in our solar system.
• ADOLPHE QUETELET - Belgian astronomer, known as the Father of Modern Statistics.
• KARL PEARSON - English mathematician; made important links between probability and statistics.
• SIR RONALD AYLMER FISHER - British statistician; developed the F-test. In inferential statistics, the F-test is very useful in testing improvements of production from
agricultural experiments and improvement of precision of results from medical, biological and industrial experimentation.
• GEORGE GALLUP - instrumental in making statistical polling a common tool in political campaigns.
• In this age of INFORMATION TECHNOLOGY, many computer programs such as Microstat, Soritec Sampler, SPSS, etc., are available on diskettes or websites and
go well beyond what manual calculation can do in statistics.
List and brief the phases in statistical study.
I. Identifying the Question
• What is the question? (What are my hypotheses?)
• Is it possible to answer the question with statistics?
• Are the data obtainable? (e.g., birth weight, socioeconomic status, drugs, alcohol)
• Is it ethical to obtain such data?
• If not, is there a reasonable substitute?
• Are the assumptions reasonable?
II. Designing a Study
• Identify the population of interest
• Survey
o Obtain a representative sample of that population
1. Simple Random Sampling
2. Stratified Sampling (M-F, age groups)
3. Systematic Sampling (class roster, census list)
4. Convenience Sampling
o Sources of Bias
1. Selection bias (undercoverage)
2. Non-response bias (e.g., daytime phone calls)
3. Response bias (people lie)
• Observational Studies
o Used when a designed experiment is not ethical
o Subjects are studied over a period of time in a natural setting
o Variables of interest are recorded
o Confounding is a major issue
• Designing an Experiment
o The researcher has control over the subjects or units in the study
o An intervention takes place that otherwise would not occur
o Randomization is used to assign treatments
o Strongest case for causality
• EDA - Exploratory Data Analysis (trends, relationships, differences)
• Pilot Study
III. Collecting Data
• Identify variables
• Identify types of variables
o Qualitative
o Quantitative
• Identify limits of measurement or observation
IV. Analyzing the Data
• Descriptive statistics: present frequency distribution tables and graphs
• Inferential statistics: use proper procedures and techniques to infer population parameters and relationships.
o Check the assumptions behind the procedures and techniques.
V. Making Conclusions and Discussing Limitations
• Report and interpret the results of the analysis and answer the original hypotheses.
• Explain the limitations of the study.
• Identify conclusions that the study could not make.
• Identify new questions that arise from this study.
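The three probability sampling schemes listed under "Designing a Study" can be sketched as follows. This is a minimal illustration, not a survey-grade implementation: the roster is invented, the function names are hypothetical, and the systematic sampler assumes the roster length is at least the sample size.

```python
import random

def simple_random_sample(units, n, rng):
    """Every subset of size n is equally likely to be chosen."""
    return rng.sample(units, n)

def systematic_sample(units, n, rng):
    """Take every k-th unit after a random start, with k = len(units) // n."""
    k = len(units) // n
    start = rng.randrange(k)
    return [units[start + i * k] for i in range(n)]

def stratified_sample(units, stratum_of, n_per_stratum, rng):
    """Split units into strata, then draw a simple random sample from each."""
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for name in sorted(strata):
        sample.extend(rng.sample(strata[name], n_per_stratum))
    return sample

rng = random.Random(0)  # fixed seed so the draw is repeatable
# Hypothetical class roster: 100 students, alternating M and F.
roster = [{"id": i, "sex": "F" if i % 2 else "M"} for i in range(100)]

srs = simple_random_sample(roster, 10, rng)
systematic = systematic_sample(roster, 10, rng)
stratified = stratified_sample(roster, lambda u: u["sex"], 5, rng)
print([u["id"] for u in systematic])
```

Convenience sampling is deliberately left out: it involves no random mechanism, which is exactly why it is prone to selection bias.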
What are the scope and characteristics of statistics?
• Scope of statistics
• The scope of statistics is very extensive. It can be divided into two parts:
• (i) Statistical Methods such as Collection, Classification, Tabulation, Presentation, Analysis, Interpretation and Forecasting.
• (ii) Applied Statistics – It is further divided into three parts:
• a) Descriptive Applied Statistics : Purpose of this analysis is to provide descriptive information.
• b) Scientific Applied Statistics : Data are collected with the purpose of some scientific research and with the help of these data
some particular theory or principle is propounded.
• c) Business Applied Statistics : Under this branch statistical methods are used for the study, analysis and solution of various
problems in the field of business.
• Characteristics of statistics
• a) Aggregate of facts/data
• b) Numerically expressed
• c) Affected by different factors
• d) Collected or estimated
• e) Reasonable standard of accuracy
• f) Predetermined purpose
• g) Comparable
• h) Systematic collection.
Describe the functions and limitations of statistics.
• Functions of statistics
• To provide numerical facts.
• To simplify complex facts.
• To enlarge human knowledge and experience.
• To help in the formulation of policies.
• To provide comparison.
• To establish mutual relations.
• To help in forecasting.
• To test the accuracy of scientific theories.
• Statistics helps in providing a better understanding and exact description of a phenomenon of nature.
• Statistics helps in the proper and efficient planning of a statistical inquiry in any field of study.
• Statistics helps in collecting appropriate quantitative data.
• Statistics helps in presenting complex data in a suitable tabular, diagrammatic and graphic form for easy and clear comprehension of the data.
• Statistics helps in understanding the nature and pattern of variability of a phenomenon through quantitative observations.
• Statistics helps in drawing valid inferences, along with a measure of their reliability about the population parameters from the sample data.

• Limitations of statistics
• Figures are convincing, and therefore people are easily led to believe them.
• Ignorance of limitation of statistics.
• Lack of test of accuracy.
• Contradiction of data from actual circumstances.
• Lack of specific ability to arrive at correct and appropriate results.
• Can easily be manipulated.
• Defective data
• Inappropriate comparison
• Statistical laws are true only on average. Statistics are aggregates of facts, so a single observation is not a statistic. Statistics deals with groups and aggregates only.
• Statistical methods are best applicable to quantitative data.
• Statistics cannot be applied to heterogeneous data.
• If sufficient care is not exercised in collecting, analyzing and interpreting the data, statistical results might be misleading.
• Only a person who has an expert knowledge of statistics can handle statistical data efficiently.
• Some errors are possible in statistical decisions. In particular, inferential statistics involves certain errors. We do not know whether an error has been committed or not.
Explain the applications of statistics in various fields.
Statistics plays a vital role in every field of human activity. Statistics helps in determining the existing position of per capita income, unemployment, population growth
rates, housing, schooling, medical facilities, etc., in a country.
Now statistics holds a central position in almost every field, including industry, commerce, trade, physics, chemistry, economics, mathematics, biology, botany,
psychology, astronomy, etc., so the application of statistics is very wide. Now we shall discuss some important fields in which statistics is commonly applied.
• (1) Business
• Statistics plays an important role in business. A successful businessman must be very quick and accurate in decision making. He knows what his customers want; he
should therefore know what to produce and sell and in what quantities.
• Statistics helps businessmen to plan production according to the taste of the customers, and the quality of the products can also be checked more efficiently by using
statistical methods. Thus, it can be seen that all business activities are based on statistical information. Businessmen can make correct decisions about the location of
business, marketing of the products, financial resources, etc.
• (2) Economics
• Economics largely depends upon statistics. National income accounts are multipurpose indicators for economists and administrators, and statistical methods are
used to prepare these accounts. In economics research, statistical methods are used to collect and analyze the data and test hypotheses. The relationship between
supply and demand is studied by statistical methods; imports and exports, inflation rates, and per capita income are problems which require a good knowledge of
statistics.
• (3) Mathematics
• Statistics plays a central role in almost all natural and social sciences. The methods used in natural sciences are the most reliable but conclusions drawn from them
are only probable because they are based on incomplete evidence.
• Statistics helps in describing these measurements more precisely. Statistics is a branch of applied mathematics. A large number of statistical methods like probability,
averages, dispersions, estimation, etc., are used in mathematics, and different techniques of pure mathematics like integration, differentiation and algebra are used in
statistics.
• (4) Banking
• Statistics plays an important role in banking. Banks make use of statistics for a number of purposes. They work on the principle that everyone who deposits their
money with the banks does not withdraw it at the same time. The bank earns profits out of these deposits by lending it to others on interest. Bankers use statistical
approaches based on probability to estimate the number of deposits and their claims for a certain day.
• (5) State Management (Administration)
• Statistics is essential to a country. Different governmental policies are based on statistics. Statistical data are now widely used in making all administrative
decisions. Suppose the government wants to revise the pay scales of employees in view of an increase in the cost of living; statistical methods will be used to
determine the rise in the cost of living. The preparation of federal and provincial government budgets mainly depends upon statistics because it helps in estimating
the expected expenditures and revenue from different sources. So statistics are the eyes of the administration of the state.
• (6) Accounting and Auditing
• Accounting is impossible without exactness. For decision-making purposes, however, so much precision is not essential; the decision may be made on the basis of
approximation, known as statistics. The correction of the values of current assets is made on the basis of the purchasing power of money or its current value.
In auditing, sampling techniques are commonly used. An auditor determines the sample size to be audited on the basis of error.
• (7) Natural and Social Sciences
• Statistics plays a vital role in almost all the natural and social sciences. Statistical methods are commonly used for analyzing experimental results and testing their
significance in biology, physics, chemistry, mathematics, meteorology, research, chambers of commerce, sociology, business, public administration,
communications and information technology, etc.
• (8) Astronomy
• Astronomy is one of the oldest branches of statistical study; it deals with the measurement of the distances, sizes, masses and densities of heavenly bodies by means
of observations. During these measurements errors are unavoidable, so the most probable measurements are found by using statistical methods.
Example: The distance of the moon from the earth is measured. Throughout history, astronomers have been using statistical methods like the method of least squares to find
the movements of stars.
Brief the various types of data in statistics.
• Data can be classified as
• Qualitative and Quantitative data
• Quantitative data are anything that can be expressed as a number, or quantified. Examples of quantitative data are scores on achievement tests, number of hours of study, or weight
of a subject. These data may be represented by ordinal, interval or ratio scales and lend themselves to most statistical manipulation.
• Qualitative data cannot be expressed as a number. Data that represent nominal scales, such as gender, socioeconomic status or religious preference, are usually considered to be
qualitative data.
• Both types of data are valid types of measurement, and both are used in education journals. Quantitative data lend themselves to a wider range of statistical analysis, and thus more rigorous assessments
of the data are possible.
• Primary and Secondary data
• (1) Primary Data
• Primary data are the first hand information which is collected, compiled and published by organizations for some purpose. They are the most original data in character and have not
undergone any sort of statistical treatment.
Example: Population census reports are primary data because these are collected, compiled and published by the population census organization.
• (2) Secondary Data
• The secondary data are the second hand information which is already collected by an organization for some purpose and are available for the present study. Secondary data are not
pure in character and have undergone some treatment at least once.
Example: An economic survey of England is secondary data because the data are collected by more than one organization like the Bureau of Statistics, Board of Revenue, banks,
etc.
• Nominal, Ordinal and Interval data
• Nominal Data
Nominal data is named data which can be separated into discrete categories which do not overlap. A common example of nominal data is gender; male and female. Other examples
include eye colour and hair colour. An easy way to remember this type of data is that nominal sounds like named, nominal = named.
• Ordinal Data
Ordinal data is data which is placed into some kind of order or scale. (Again, this is easy to remember because ordinal sounds like order). An example of ordinal data is rating
happiness on a scale of 1-10.
• Interval Data
Interval data is data which comes in the form of a numerical value where the difference between points is standardised and meaningful. The most common example of interval data
is temperature, the difference in temperature between 10-20 degrees is the same as the difference in temperature between 20-30 degrees.
• Cross sectional, Temporal and Spatial data
• Cross-sectional data, or a cross section of a study population, in statistics and econometrics is a type of data collected by observing many subjects (such as individuals, firms,
countries, or regions) at the same point of time, or without regard to differences in time.
• Spatial data is the data representing objects in space with identity, well-defined extents, locations, and relationships. Temporal data is the data representing some aspect of time.
• Ungrouped and Grouped data:
• Data which have not been arranged in a systematic order are called raw data or ungrouped data. Data presented in the form of a frequency distribution are called grouped data.
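As a small illustration of the last distinction, the sketch below turns ungrouped (raw) data into grouped data by counting how many values fall in each class interval. The scores, the class limits and the function name frequency_distribution are hypothetical; each class is taken to include its lower limit and exclude its upper limit, which is one common convention.

```python
from collections import Counter

def frequency_distribution(values, lower, width, n_classes):
    """Count the values in each class [lower + i*width, lower + (i+1)*width)."""
    counts = Counter()
    for v in values:
        index = int((v - lower) // width)
        if 0 <= index < n_classes:   # values outside the table are ignored
            counts[index] += 1
    return [
        (lower + i * width, lower + (i + 1) * width, counts[i])
        for i in range(n_classes)
    ]

# Ungrouped data: hypothetical test scores in the order they were recorded.
raw_scores = [12, 25, 37, 18, 29, 41, 33, 22, 15, 38, 27, 44]

# Grouped data: the same scores as a frequency distribution.
table = frequency_distribution(raw_scores, lower=10, width=10, n_classes=4)
for low, high, freq in table:
    print(f"{low}-{high}: {freq}")
```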
How to collect primary and secondary data?
• The first step in any enquiry (investigation) is the collection of data. The data may be collected for the whole population or for a
sample only. They are mostly collected on a sample basis. Collecting data is a very difficult job. The enumerator or investigator is a well-trained
individual who collects the statistical data. The respondents are the persons from whom the information is collected.
• Types of Data
There are two types (sources) for the collection of data:
(1) Primary Data (2) Secondary Data
• (1) Primary Data
• Primary data are the first hand information which is collected, compiled and published by organizations for some purpose. They
are the most original data in character and have not undergone any sort of statistical treatment.
Example: Population census reports are primary data because these are collected, compiled and published by the population
census organization.
• (2) Secondary Data
• The secondary data are the second hand information which is already collected by an organization for some purpose and are
available for the present study. Secondary data are not pure in character and have undergone some treatment at least once.
Example: An economic survey of England is secondary data because the data are collected by more than one organization like the
Bureau of Statistics, Board of Revenue, banks, etc.
• Methods of Collecting Primary Data
• Primary data are collected using the following methods:
• 1. Personal Investigation: The researcher conducts the survey personally and collects the data from it. The data collected in this way are usually accurate
and reliable. This method of collecting data is only applicable in the case of small research projects.
• 2. Through Investigators: Trained investigators are employed to collect the data. These investigators contact the individuals and fill in questionnaires
after asking for the required information. Most organizations utilize this method.
• 3. Collection Through Questionnaire: Researchers get the data from local representatives or agents, who base their answers on their own experience. This
method is quick but gives only a rough estimate.
• 4. Through the Telephone: Researchers get information from individuals over the telephone. This method is quick and gives accurate information.

• Methods of Collecting Secondary Data
• Secondary data are collected by the following methods:
• 1. Official: e.g. publications from the Statistical Division, Ministry of Finance, the Federal Bureaus of Statistics, Ministries of Food, Agriculture, Industry,
Labor, etc.
• 2. Semi-Official: e.g. State Bank, Railway Board, Central Cotton Committee, Boards of Economic Enquiry, etc.
• 3. Publication of Trade Associations, Chambers of Commerce, etc.
• 4. Technical and Trade Journals and Newspapers.
• 5. Research Organizations such as universities and other institutions.
List and explain the measures of central tendency.
• The central tendency of a variable means a typical value around which the other values tend to concentrate; a value representing the central
tendency of the series is called a measure of central tendency, or an average.
• Measures are
• Mean
• Arithmetic mean
• The most popular and widely used measure of representing the entire data by one value is known as arithmetic mean. Its value is obtained by adding together all the items and by dividing
this total by the number of items.
• Essentials of an ideal average:
• Should be easy to understand.
• Clearly and rigidly defined.
• Based on all the observations.
• Simple to compute.
• Least affected by fluctuations.
• Capable of further Algebraic treatment.
• Sampling stability.
• Geometric mean
• Geometric mean is the n-th root of the product of n items or values. Geometric mean is appropriate or useful when ratios or percentages are to be averaged, in determining rates of
increase or decrease, and when the different values differ vastly from one another.
• Harmonic mean
• Harmonic mean of a series is the reciprocal of the arithmetic mean of the reciprocals of the values of its items. Harmonic mean is used in the following cases: for determining average
speed or velocity; for finding an average price; and when the quantity that varies in the question is to be held constant in the answer, or vice versa.
• Median
• Median is that value of the variable which divides the group into two equal parts, one part comprising all values greater than the median and the other all values less than the median.
• Mode
• Mode is the value that appears most frequently in a series i.e. it is the value of the item around which frequencies are most densely concentrated.

• Note: Include formula for grouped and ungrouped data for arithmetic, geometric, harmonic mean, median and mode in semester exams.
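The definitions above can be checked on a small example. The sketch below uses Python's standard statistics module for ungrouped data and, for grouped data, the usual textbook formulas: mean = Σfx / Σf with x the class midpoint, and median = L + ((N/2 - cf) / f) × h, where L is the lower limit of the median class, cf the cumulative frequency before it, f its frequency and h its width. The data values and the helper names grouped_mean and grouped_median are hypothetical.

```python
import statistics

# Ungrouped (raw) data: hypothetical values chosen so the results are easy to check.
data = [2, 4, 4, 8, 16]

arithmetic = statistics.mean(data)            # sum / count = 34 / 5
geometric = statistics.geometric_mean(data)   # 5th root of 2*4*4*8*16 = 4096
harmonic = statistics.harmonic_mean(data)     # 5 / (1/2 + 1/4 + 1/4 + 1/8 + 1/16)
median = statistics.median(data)              # middle value of the sorted data
mode = statistics.mode(data)                  # most frequent value

# Harmonic mean as an average speed over two equal distances (60 and 40 km/h).
average_speed = statistics.harmonic_mean([60, 40])

# Grouped data: (lower limit, upper limit, frequency) for each class.
classes = [(10, 20, 3), (20, 30, 4), (30, 40, 3), (40, 50, 2)]

def grouped_mean(classes):
    """Sum of (class midpoint * frequency) divided by the total frequency."""
    total = sum(f for _, _, f in classes)
    return sum((lo + hi) / 2 * f for lo, hi, f in classes) / total

def grouped_median(classes):
    """L + ((N/2 - cf) / f) * h for the class that contains the N/2-th item."""
    half = sum(f for _, _, f in classes) / 2
    cumulative = 0
    for lo, hi, f in classes:
        if cumulative + f >= half:
            return lo + (half - cumulative) / f * (hi - lo)
        cumulative += f

print(arithmetic, geometric, harmonic, median, mode)
print(average_speed, grouped_mean(classes), grouped_median(classes))
```

For these values the arithmetic mean exceeds the geometric mean, which in turn exceeds the harmonic mean; that ordering holds for any set of positive values that are not all equal.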
Describe the various measures of dispersion.
• Dispersion is a measure of the extent to which the individual items vary from a central value. Dispersion is used in two senses: (i) the difference between the extreme items of the
series and (ii) the average of the deviations of items from the mean.
• Measures are
• Range
• The difference between the value of the smallest item and the value of the largest item of the series is called the range. It is simple and easy to compute and takes minimum time to calculate. It is not necessary to know all the
values; only the smallest and largest values are required. It is helpful in quality control of products.
• Coefficient of Scatter
• Variance
• Standard deviation
• Standard Deviation was introduced by Karl Pearson in 1893. It is the most important and widely used measure of dispersion, as it is free from the defects from which the earlier methods suffer and satisfies most
of the properties of a good measure of dispersion.
• Standard Deviation is the square root of the average of the square deviations from the arithmetic mean of a distribution.
• Mean deviation
• Mean Deviation is also known as average deviation or the first measure of dispersion. It is the average absolute difference between the items in a distribution and the mean, median or mode of that series.
• Quartile deviation
• Quartile Deviation gives the average amount by which the two quartiles differ from the median. Quartile deviation is an absolute measure of dispersion.
• Coefficient of variation
• Coefficient of variation is used for the comparative study of stability or homogeneity in two or more series.
• Skewness
• Skewness refers to the asymmetry or lack of symmetry in the shape of a frequency distribution. In other words, skewness describes the shape of a distribution.
• A distribution is said to be 'skewed' when the mean and the median fall at different points in the distribution, and the centre of gravity is shifted to one side or the other, to the left or to the right.

• Note: Include formula for grouped and ungrouped data for all the above in semester exams.
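The measures above can be computed for a small ungrouped series as follows. This is a sketch under stated assumptions: the data are hypothetical; the standard deviation follows the definition given above (dividing by the number of items, i.e. the population form); quartiles use the default method of statistics.quantiles, and other textbook conventions can give slightly different quartiles; skewness is Pearson's second coefficient, 3 × (mean - median) / standard deviation, which is only one of several skewness measures.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations

mean = statistics.mean(data)
median = statistics.median(data)

# Range: largest item minus smallest item.
value_range = max(data) - min(data)

# Variance and standard deviation, dividing by the number of items.
variance = statistics.pvariance(data)
std_dev = statistics.pstdev(data)

# Mean deviation, taken here about the arithmetic mean.
mean_deviation = sum(abs(x - mean) for x in data) / len(data)

# Quartile deviation: half the distance between the third and first quartiles.
q1, q2, q3 = statistics.quantiles(data, n=4)
quartile_deviation = (q3 - q1) / 2

# Coefficient of variation: standard deviation as a percentage of the mean.
coefficient_of_variation = std_dev / mean * 100

# Pearson's second coefficient of skewness (positive means a longer right tail).
skewness = 3 * (mean - median) / std_dev

print(value_range, variance, std_dev, mean_deviation)
print(quartile_deviation, coefficient_of_variation, skewness)
```

Because the coefficient of variation is unit-free, it is the measure to use when comparing the variability of two series measured in different units or with very different means.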
