
MODUL PRAKTIKUM

INFORMATICS STATISTICS
PRACTICAL LAB
ARRANGED BY:
ANNA HENDRI S.J., S.Kom., M.Cs

AHMAD DAHLAN UNIVERSITY


INFORMATICS ENGINEERING
INDUSTRY TECHNOLOGY FACULTY
AUGUST, 2019
PREFACE

Assalammu’alaikum Wr. Wb

Alhamdulillah, the author gives thanks to Allah SWT, whose blessings have made it possible to
complete this Informatics Statistics practicum manual. This practicum and theory manual is the third
revised edition. In preparing this book, the author had some difficulty aligning it with the semester
lesson plan (RPS) of the theory lecture, because not all lecture material can be practised in the
practicum.
Nevertheless, the author has tried to synchronize the practical material with the theory so that students
are not confused when analysing and processing data. The applications used are SPSS, MS Excel and
WEKA, and their usage does not differ much. Hopefully this Statistics practicum manual will be useful
and will facilitate the practicum activities.
The author welcomes input and criticism from students, practicum assistants and other statistics
lecturers for the further development of this practicum book. Finally, the author wishes you happy
reading and practising.

Wassalammualaikum Wr. Wb.

Yogyakarta, 12 August 2018

Author

Sri Winiarti, S.T, M.Cs

TABLE OF CONTENTS

Preface

1. Practicum for Data Introduction
2. Frequency Distribution Practicum
3. Frequency Distribution Practicum (Graph)
4. Competency Test 1
5. Practicum of Measure of Center
6. Practicum of Measure of Center (Quartiles)
7. Practicum of Measures of Dispersion
8. Probabilistic Statistics Practicum
9. Probabilistic Distribution Practicum
10. Practicum of Hypergeometric Distribution
11. Practicum of One-Sample Hypothesis Test
12. Competency Test 2
13. Practicum of Two-Sample Hypothesis Test
14. ANOVA Test Practicum

PRACTICUM 1
IDENTIFICATION OF DATA TYPES AND SAMPLE TAKING
Time :2.5 Hours

Basic competencies :
After following this practicum, students are able to:
1. Identify the types of data in the real world
2. Explain data classification and how data are processed in the SPSS application

Indicator: After following this practicum, students are able to:

1. Recognize and identify the types of data needed for data analysis, so that they fit the search for
solutions in the field.
2. Obtain samples from the population in a "representative" way, so that sufficient information is
obtained to estimate the population.
3. Operate the SPSS application as a tool for processing data in data analysis.

INTRODUCTION TO SPSS APPLICATION


SPSS is one of the most widely known and widely used statistical software packages. Other statistical
software includes Minitab, Systat, Microstat and many more.
Figure 1.1. Main Display of SPSS Application
Information:

MENU BAR: Collection of basic commands for operating SPSS.

The menus in SPSS are:

a) FILE

Used for operations on SPSS document files, such as creating, editing and printing. There are 5 types
of document in SPSS:

1. Data : SPSS documents in the form of data
2. Syntax : documents containing SPSS syntax files
3. Output : documents containing the results (output) of running SPSS
4. Script : documents containing SPSS scripts

5. Database

♠ NEW : create a new SPSS worksheet


♠ OPEN : open an existing SPSS document
Generally, there are 3 types of file extension in SPSS:
1. *.sav : data file generated in the Data Editor
2. *.spo : text file / object generated by the output sheet
3. *.cht : image / chart object file generated by the chart window
♠ Read Text Data : open a text file (with the txt extension) and convert it into an SPSS data sheet
♠ Save : save the document / work that has been made
♠ Save As : save the document again with a different name / location / document type

♠ Page Setup : set the SPSS work page

♠ Print : print output / data / syntax SPSS sheets


There are 2 options for printing, they are:
- All visible output : print the whole worksheet
- Selection : print only the part that is highlighted / selected

♠ Print Preview : preview the printout before printing

♠ Recently used data : contains a list of data files that have been opened before

♠ Recently used file : contains a list of all files that have been opened before

b) EDIT

Used to edit data in SPSS and to change the settings / options of the overall SPSS configuration.

♠ Undo : cancel the previous command

♠ Redo : reapply a command that was cancelled with Undo

♠ Cut : remove a cell / text / object and keep a copy that can be placed elsewhere with the
Paste command

♠ Paste : insert the cell / text / object produced by the Copy or Cut command

♠ Paste after : repeat the previous paste command

♠ Paste Variable : paste variables, for example when converting to images, Word, etc.

♠ Clear : delete a cell / text / object

♠ Find : looking for text

♠ Options : set the general SPSS sheet display configuration


c) VIEW

Used to set the display of the SPSS work screen and to monitor the processes that are running in
SPSS.

♠ Status Bar : show the status of the ongoing process


♠ Toolbar : set the toolbar appearance

♠ Fonts : to set the type, font size in the SPSS data editor
- Outline size : SPSS output sheet font size
- Outline font : font type of SPSS output sheet
♠ Gridlines : set cell lines in the SPSS editor
♠ Value labels : show the value labels in the editor instead of the raw values

d) DATA

Data menu is used for data processing.

♠ Define Dates : define a time format for variables, including hour, date, year, and so on

♠ Insert Variable : insert a variable column

♠ Insert case : insert a row

♠ Go to case : move the cursor to a particular row

♠ Sort case : sort the data by the values of a variable column

♠ Transpose : transpose the data so that variable columns become rows

♠ Merge files : merge several SPSS document files by combining their columns

♠ Split file : split a file into groups based on a variable column

♠ Select case : select the cases (rows) that satisfy a certain condition

e) TRANSFORM

Transform menu is used to make changes or additions to data.

♠ Compute : arithmetic and logic operations

♠ Count : count how many times a certain value occurs in each row (case)
♠ Recode : replace the values in a variable column, either into the same variable (the values are
overwritten) or into a different, new variable

♠ Categorize variable : convert continuous values into discrete categories

♠ Rank case : rank the data values of a variable

f) ANALYSE

The Analyse menu is used to analyse the data that have been entered into the variables. It is the
most important menu, because all data processing and analysis is done through its submenus, such
as Correlate and Compare Means.

g) GRAPH

The graph menu is used to create graphics, including bar, line, pie, etc.

h) UTILITIES

The utilities menu is used to find out variable information, file information, etc.

i) ADD-ONS

The add-ons menu is used to give commands to SPSS if you want to use additional applications,
for example using the Amos application, SPSS data entry, text analysis, etc.

j) WINDOWS

The windows menu is used to switch from one file window to another.

k) HELP

The help menu is used to help users understand SPSS commands if they encounter difficulties.

TOOL BAR : Collection of frequently used commands, shown as icons.

POINTER : A cursor that indicates the position of the cell that is currently active /
selected.
Practical Work (Post Test) 1

1. Based on the data in Table 1.1, determine the kind of data, the data type and the measurement scale!
Table 1.1. ABC Region Toddler Food Product Data
No | Registration Number | Product Name | Product Brand
1 | md 860513424001 | Advanced Formula for Children 6-12 Months | LACTOGEN 2
2 | md 860513439001 | Advanced Formula for Infants 6-12 Months | NESTLE – Danstart
3 | md 610109028069 | Companion Milky Marie Biscuits for Infants Age 8 | PROMINA
4 | MD 860511086368 | Advanced Formula with Bananas and Carrots for 6-12 Months Old Babies | SGM ANANDA
5 | md 810711021368 | Infant Formula for Special Medical Purposes, Ages 0-12 Months, with Lactose Intolerance | SGM LLM + Presinutri
6 | md 860513343001 | Growth Milk for Children Aged 1-3 Years, Vanilla Flavour (Images of Girls) | Nestle Dancow
7 | md 260710022188 | Complementary Instant Milk Powder Food, Mushroom Chicken Flavour, for Infants and Children Aged 6-24 Months | Promina
8 | md 860512357001 | Vanilla Flavour Growth Milk for Children 1-3 Years Old | SGM Eksplor 1+ Presinutri+
9 | md 860710030188 | Complementary Instant Milk Powder Food, Brown Rice Flavour, for Infants and Children Aged 6-24 Months | SUN
10 | md 860710069188 | Complementary Instant Milk Powder Food, Brown Rice Flavour, for Infants and Children Aged 6-24 Months | SUN
11 | md 860710144188 | Complementary Instant Milk Powder Food, Chocolate and Milk Flavour, for Babies and Children Aged 9-24 Months | SUN
12 | MD 860709352006 | Breast-milk (ASI) Complementary Food, Orange Flavour Biscuits, for Infants and Children Aged 6-24 Months | MILNA
13 | MD 860709357006 | Breast-milk (ASI) Complementary Food, Brown Rice Flavour Biscuits, for Babies and Children Aged 6-24 Months | MILNA
14 | MD 860710011188 | Complementary Instant Milk Powder Food, Brown Rice Flavour, for Infants and Children Aged 6-24 Months | Promina
15 | MD 860710024188 | Complementary Instant Milk Powder Food, Orange, Apple and Banana Flavours, for Babies and Children Aged 6-24 Months | SUN
16 | md 860828029315 | Honey Flavoured Milk Powder for Children Aged 3-5 Years | LACTOGROW 4
17 | MD 860809448040 | Formula with Vanilla Soybean Isolates for Children Aged 1-5 Years | SGM Eksplor Presinutri

2. Look for a problem that occurs in real life (for example in another case). Find and observe
examples of the types of data in a sample that you obtain through the internet!

For example: 30% of UAD Informatics Engineering students have a job waiting period of less than
6 months.

3. Formulate the problem you face, then break it down into the forms of information that must be
presented. Based on the example in step 2, the following problem formulation is obtained:

a. How can the target be achieved that 70% of UAD Informatics Engineering students get a job
immediately?
b. What strategy must the Study Program use to achieve it?
c. Etc.
4. After understanding the scope of the problem at hand, determine the population to be
investigated. For example: the population is the students of the 2011 class year in the Informatics
Engineering Study Program who, in 2016, had a job waiting period of less than 6 months.

5. Arrange population definition and classification.

6. Present the classification in the form of a table that contains these parameters: the kind of data,
the data type and the measurement scale of the data!

7. Copy the Worksheet below.

8. Classification of Populations in tabulations:

9. For example, make a tabulation for the case you observed by following the format in Table 2.1.
Table 2.1. Example of the Data Identification Formulation Format

No | Data Variable | Data Type | Measurement Scale | Sampling Technique
1 | Passenger car | Quantitative | Nominal | Probability Sampling
2 | Bus Car | Quantitative | Nominal | Probability Sampling
3 | Freight cars | Quantitative | Nominal | Probability Sampling
4 | Motorcycle | Quantitative | Nominal | Probability Sampling
5 | Private car | Quantitative | Nominal | Probability Sampling
6 | Fuel type | Descriptive | Ordinal | Probability Sampling


Examples of cases / problems that were studied;

Formulation of the problem specified

Specified population:

Definition of Population:

PRACTICUM 2
FREQUENCY DISTRIBUTION

Time : 2.5 Hours


Basic competencies :
Students are able to explain the distribution of data with SPSS, namely creating tables and
determining lower class boundaries

Indicator :
After following this practicum, students can create a frequency distribution table by calculating the
lower class boundaries and building the table

APPLICATION OF SPSS FOR FREQUENCY DISTRIBUTION

Measures of central tendency, measures of location and measures of deviation (all of which belong
to descriptive statistics) can be analysed with the following procedures:
a. Analyse → Descriptive Statistics → Frequencies
b. Analyse → Descriptive Statistics → Descriptives or Explore

Practical Work (Post Test) 1


The File menu is the first menu in the Data Editor that SPSS users open. The Data Editor in SPSS
has two main parts:
1. Columns, marked by the word var at the top of each column. The columns in SPSS are filled with
variables.
2. Rows, marked by the numbers 1, 2, 3 and so on. The rows in SPSS are filled with data (cases).
Case: The following data on mobile phone users in various countries were taken randomly (prices in rupiah).

Table 2.1 Mobile User Data
No | Handphone Brand | Price | Total User | Region
1 | Samsung | 3-10 Million | 46% | Korea
2 | Sony Ericsson | 1-2 Million | 5% | America
3 | BlackBerry | 2-10 Million | 8% | America
4 | iPhone | 4-20 Million | 30% | Europe
5 | Oppo | 1-5 Million | 10% | Asia
6 | Asus | 1-3 Million | 3% | Taiwan
7 | LG | 1-2 Million | 2% | Taiwan
8 | Zyrex | 500-1 Million | 1% | China
9 | Nokia | 1-3 Million | 2% | America
10 | Avio | 2-5 Million | 3% | China

Data Input Steps:


1. Creating Variables
Click the Variable View tab in the lower left corner, then fill in:
● Variable Name, along with the desired information about the variable.

For example: type, price, region, number of users

Things to note when filling in variable names are:

- Variable names must start with a letter and cannot end with a period.

- The maximum length is 8 characters.

- Names must be unique; uppercase and lowercase letters are not distinguished.
● Variable Type, Width and Decimals
- By default, each new variable is numeric, 8 characters wide with 2 decimal places.
- To change the variable type is done by clicking the selection button in the Type column.
- There are 8 types of variables, namely:
a. Numeric: numbers, signs (+) or (-) in front of numbers, decimal indicators
b. Comma: numbers, signs (+) or (-) in front of numbers, decimal indicators, commas as
separators of thousands
c. Dot: numbers, signs (+) or (-) in front of numbers, decimal indicators, periods as separators of
thousands
d. Scientific notation: same as the numeric type, but uses the symbol E for powers of 10 (e.g.
120000 = 1.20E+5)

e. Date: displays data in a date or time format
f. Dollar: adds a dollar sign ($), with a comma as the thousands separator and a period as the decimal
indicator
g. Custom currency: for other currency formats
h. String: usually letters or other characters

Figure 2.2 SPSS Editor Menu


2. Filling Data
Data are entered into the Data Editor by typing the values to be analysed into the cells (cases) under
the corresponding variable name (column heading).

Figure 2.3. Example of filling data in SPSS

3. Saving Data

After the data has been entered, the data needs to be saved for the next analysis. The data
storage steps are as follows:
Click File Menu → Save Data → (Select storage folder), type File Name → Click OK.

Practical Work (Post Test) 2
PROCEDURE: Analyse → Descriptive Statistics → Frequencies
● Click menu Analyse → Descriptive Statistics → Frequencies

● Highlight the variable to be analysed, then move it to the variable box by clicking the arrow button
● Click Statistics, check all Percentile Values check boxes
(Note: to request specific percentile values such as 10, 25 and so on, tick the percentile check box
and enter the values)
● Click Charts, select Histogram if you want to display one
● Click Format, mark Ascending values under the Order by option to sort the data from the smallest
to the largest value.
● Click OK.

Case Example 1: Statistics mid-term exam (UTS) scores of 15 students in class A:


Table 2.2. Student Mid Year Exam score data
No Name MYE Score
1. Mimi 90
2. Melisa 60
3. Yolin 65
4. Nina 55
5. Parto 70
6. Jerry 71
7. Tom-Tom 72
8. Yusron 80
9. Ableh 76
10. Stefanus 56
11. Chandra 59
12. Roy 77
13. Ardian 85
14. Nita 89
15. Mawan 90
The steps are as follows:

Figure 2.1. Forms of application processing with SPSS

Figure 2.2. Form of frequency processing with SPSS applications

Figure 2.3. Forms of graph processing with SPSS applications

Case Example 2: An online store collects customer data to find out the loyalty of its customers.
Customer data is shown in Table 2.3.
Table 2.3. Customer data
Name | Length as customer | Number of transactions each month | Purchase amount each month
A. Rifai | 29 | 2 | 34000
Anis-Yudi | 29 | 3 | 65000
Anis | 29 | 4 | 75000
Antin | 29 | 6 | 55000
Alfi | 26 | 1 | 90000
Asni | 26 | 1 | 140000
Dewi | 25 | 2 | 67000
Dewi | 23 | 3 | 350000
Dede/ Ahli | 22 | 5 | 165000
Dian | 21 | 6 | 120000
Dedi | 20 | 7 | 67000
Happy | 20 | 3 | 75000
In | 20 | 4 | 60000
Indra | 20 | 2 | 60000
Indra | 20 | 2 | 165000
Ipul | 15 | 3 | 90000
Iqbal | 14 | 4 | 35000
Eki | 5 | 2 | 75000

From table 2.3 make a frequency distribution table and determine the lower and upper edges!

Case example 3.
Table 2.4 is data on 15 arrivals of Trans Yogyakarta buses taken randomly:

Table 2.4. Data on Trans Jogjakarta bus arrival


Bus ID | Shelter | Total Passengers
49 | Terminal Concat | 74
53 | West Sarjito | 21
55 | Santika | 46
63 | Malioboro 3 | 53
65 | Malioboro 3 | 33
69 | Janti 3 | 21
71 | Termconcat | 37
73 | Termconcat | 39
77 | South Papmi | 32
79 | MT Haryono | 22
81 | SD Pujokusuman | 55
83 | SD Pujokusuman | 39
75 | South Papmi | 34
97 | Kehutanan | 44
99 | Kehutanan | 36

Based on Table 2.4:

1. Make the frequency distribution table.
2. Determine the highest frequency value.
3. Determine the relative frequency for data > 21.
4. Make a bar chart!

PRACTICUM 3
FREQUENCY DISTRIBUTION (GRAPH AND RANGE)
Time : 2.5 Hours

Basic competencies :
Students are able to explain the distribution of data with SPSS
Indicator :
After following this practicum, students are able to make a frequency distribution table by
calculating range values.

HOW TO PROCESS DATA WITH SPSS

● Using Frequency Analysis: Analyse → Descriptive Statistics → Frequencies

● Click menu Analyse → Descriptive Statistics → Frequencies
● Highlight the variable to be analysed, then move it to the variable box by clicking the arrow button
● Click Statistics, check all Percentile Values check boxes
(Note: to request specific percentile values such as 10, 25 and so on, tick the percentile check box
and enter the values)
● Click Charts, select Histogram if you want to display one
● Click Format, mark Ascending values under the Order by option to sort the data from the smallest
to the largest value.
● Click OK.

Practical Work (Post Test) 1


1) Consider the following case
Data on the number of households in each RT in Sewon District is as follows:

77 85 75 76 63 72 81 73 67 86
74 53 76 62 78 88 57 73 80 65
75 71 65 76 85 78 97 67 62 79
71 83 79 60 95 75 61 89 78 96
60 68 74 69 77 94 75 82 78 66

2) Make a frequency distribution table. Steps to make a frequency distribution table:
The number of data = 50, the smallest data value = 53 and the largest data value = 97, so the
range = 97 - 53 = 44. If, for example, we set the class width = 7, then the number of classes =
44/7 ≈ 6.3, rounded up to 7. The class intervals are then as presented in Table 3.1.
Table 3.1. Example of a frequency distribution table

Class Interval     fi
52.5 - 59.5        2
59.5 - 66.5        9
66.5 - 73.5        9
73.5 - 80.5        18
80.5 - 87.5        6
87.5 - 94.5        3
94.5 - 101.5       3

From the table make the graph in the form of Pie, Bar and Histogram
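
The manual steps above can also be checked outside SPSS. The following is a minimal Python sketch, not part of the SPSS procedure, that recomputes the range, the number of classes and the class frequencies for the 50 household values. The variable names are illustrative only, and the first lower boundary (smallest value minus 0.5) is taken from Table 3.1.

```python
import math

# Number of households in each RT (the 50 values listed in the case above).
data = [
    77, 85, 75, 76, 63, 72, 81, 73, 67, 86,
    74, 53, 76, 62, 78, 88, 57, 73, 80, 65,
    75, 71, 65, 76, 85, 78, 97, 67, 62, 79,
    71, 83, 79, 60, 95, 75, 61, 89, 78, 96,
    60, 68, 74, 69, 77, 94, 75, 82, 78, 66,
]

class_width = 7                                      # chosen in the text
data_range = max(data) - min(data)                   # largest value - smallest value
num_classes = math.ceil(data_range / class_width)    # 44 / 7, rounded up
first_lower = min(data) - 0.5                        # first lower class boundary

# Build one (lower boundary, upper boundary, frequency) row per class.
table = []
for k in range(num_classes):
    lower = first_lower + k * class_width
    upper = lower + class_width
    frequency = sum(1 for x in data if lower < x < upper)
    table.append((lower, upper, frequency))

print("Range:", data_range, " Number of classes:", num_classes)
for lower, upper, frequency in table:
    print(f"{lower:5.1f} - {upper:5.1f}   {frequency}")
```

Because the class boundaries end in .5 and the data are whole numbers, no value can fall exactly on a boundary, so each value is counted in exactly one class.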

Practical Work (Post Test) 2


1) Make another case with 75 samples. Make a frequency table, calculate the range value and draw
the graphs in the form of Pie, Bar and Histogram!
2) Check your results against those obtained with the SPSS program, and interpret the results that
you have obtained!

Practical Work (Post Test) 3


Based on the case examples in practicum 2 (tables 2.2 to 2.4), make a frequency distribution table and
the histogram graph, then determine range!
PRACTICUM 4
COMPETENCE TEST 1

Time : 2.5 Hours


Basic competencies : after obtaining this material, students can:
1. Understand the concept of statistical data
2. Understand descriptive and inferential statistics
3. Understand population and sample data
4. Mention the types of data
5. Understand the general form of summation notation and the postulates of summation notation
6. Understand interval, frequency, range, midpoint, class limits and class boundaries
7. Understand and determine the number of classes
8. Determine the upper and lower class boundaries
9. Complete case studies for the above material using SPSS

1. Explain the concept of statistical data
2. Explain descriptive and inferential statistics
3. Explain population and sample data
4. Mention the types of data
5. Explain the general form of summation notation and the postulates of summation notation
6. Calculate interval, frequency, range, midpoint, class limits and class boundaries
7. Count and determine the number of classes
8. Calculate and determine the upper and lower class boundaries
9. Calculate and explain relative frequency, cumulative frequency, and "more than" and "less than"
cumulative frequency
10. Complete a case study on frequency distribution using SPSS
11. Present a frequency distribution table in graphical form
12. Complete a case study on cumulative frequency using SPSS
13. Analyse the results of data processing from SPSS

The material covered by the competency test includes:


1. data concept
2. frequency distribution
3. graph
4. cumulative frequency
5. relative frequency

Form of Competency Test: students take written examinations and computer practice exams based on
the competency test questions given during the exam

Timing: in the 5th week of the lecture according to the pre-practice schedule in the Basic Computing
laboratory
PRACTICUM 5
MEASURING THE CENTER (MEAN, MEDIAN AND
MODE)
Time : 2.5 Hours
Basic Competence :
Students can analyse data for measures of central tendency and apply them in Excel
Indicator :
After following this practicum, students can learn and understand measures of central
tendency using the Excel application

APPLICATION OF EXCEL FOR FREQUENCY DISTRIBUTION


Hint: How to calculate the average with the Excel application
Microsoft Excel can be used to determine the average of a group of data. A weighted average differs
from the ordinary average in that it depends on both the value of each item and its weight.

For more convenience, consider the following example:


A first shipment contains 10 items at a shipping fee of $0.20 per item. Because of additional costs, the
second shipment of 40 items costs $0.30 per item, and the third shipment of 60 items costs $0.35 per
item. The ordinary average of the cost per item, ($0.20 + $0.30 + $0.35) / 3 = $0.283, is not an accurate
measure of the average cost per item, because it does not take into account that 60 items were shipped
at $0.35, 40 items at $0.30 and only 10 items at $0.20. Using the weighted average formula, the average
shipping cost per item over the three shipments is $0.318. To find the weighted average in an Excel
worksheet, follow these steps:
1. Create a table as below
1. Create a table as below

Columns A and B contain the data to be processed (column B holds the weights, here the number of
items, because the formulas divide by the sum of column B). Columns D and E will be filled with the
weighted average formulas.
Figure 5.1. Example calculation of the average with Excel

2. Next create a weighted average formula.


a) The formula for finding the weighted average of all shipments
i) Using SUMPRODUCT, in cell D3 type the formula =SUMPRODUCT(A3:A5,B3:B5)/SUM(B3:B5)
ii) Or, as a manual formula, in cell D4 type the formula =((A3*B3)+(A4*B4)+(A5*B5))/SUM(B3:B5)
b) The formula for finding the weighted average of the first and second shipments combined

i) Using SUMPRODUCT, in cell E3 type the formula =SUMPRODUCT(A3:A4,B3:B4)/SUM(B3:B4)

ii) Or, as a manual formula, in cell E4 type the formula =((A3*B3)+(A4*B4))/SUM(B3:B4)
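
As a cross-check of the Excel formulas above, the same calculation can be sketched in Python. This is only an illustration under the assumption that the costs and item counts are those of the shipping example; the helper name weighted_average is hypothetical and mirrors SUMPRODUCT divided by SUM.

```python
# Shipping example from the text: cost per item and number of items per shipment.
costs = [0.20, 0.30, 0.35]
items = [10, 40, 60]


def weighted_average(values, weights):
    """Sum of value * weight divided by the sum of the weights
    (the same idea as SUMPRODUCT(values, weights) / SUM(weights))."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Ordinary average: ignores how many items were in each shipment.
simple_average = sum(costs) / len(costs)

# Weighted average over all three shipments (formula in cell D3).
all_shipments = weighted_average(costs, items)

# Weighted average over the first and second shipments only (formula in cell E3).
first_two_shipments = weighted_average(costs[:2], items[:2])

print(f"Ordinary average:            {simple_average:.3f}")
print(f"Weighted, all shipments:     {all_shipments:.3f}")
print(f"Weighted, first two only:    {first_two_shipments:.3f}")
```

The weighted result is pulled towards $0.35 because the third shipment carries the most items.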

Practical Tasks (Post Test) 1


1. The Informatics Engineering department at a private university experiences unpredictable
increases and decreases in student numbers each year. Every year, practicum modules are
printed for the courses that have a practicum. The modules are printed at the beginning of the
semester, so sometimes too many are printed and sometimes too few. Therefore, it is necessary
to calculate the average cost of producing the modules each year in order to cover the cost of
module production annually. The number of students who registered (KRS) for statistics and the
price of each statistics module are shown in Table 5.1.
Table 5.1 List of statistics module prices

Year | Total Statistics Students | Modules Printed | Price per Module
2005 | 189 | 189 | 7000
2006 | 150 | 150 | 7000
2007 | 167 | 170 | 9000
2008 | 180 | 170 | 9000
2009 | 180 | 180 | 10000
2010 | 300 | 190 | 10000
2011 | 280 | 300 | 10000
2012 | 280 | 300 | 11000
2013 | 150 | 160 | 11000
2014 | 270 | 278 | 11000
2015 | 350 | 360 | 11000
2016 | 230 | 250 | 15000

2. Match the results you get using the Excel program or SPSS.

3. Interpret the results that you have obtained!

Practical Work (Post Test) 2


1. Based on the case examples in practicum 2 (tables 2.2 to 2.4), make a frequency distribution
table and calculate the mean, mode and median values!
2. Match the results that you get using the Microsoft Excel program, interpret the results that you
have obtained!

PRACTICUM 6
Practicum of Measure of Center (Quartiles)

Time : 2.5 Hours


Basic Competence :
Students can analyse data for measures of central tendency and apply them in Excel
Indicator :
After following this practicum, students can learn and understand measures of central
tendency using the Excel application

Statistical measures indicate how a data group is centred and spread. There are three kinds of
descriptive measures: measures of central tendency, measures of variability and measures of the shape
of the data distribution. The measures of central tendency most widely used to describe data are the
mean (arithmetic average), median and mode. The spread of a group of data around its centre is called
dispersion, variation or diversity of the data. The commonly used measures of dispersion are range,
variance and standard deviation.

1. Quartile
A quartile is a value that divides a data group that has been sorted (ascending) into four equal parts.
The quartile values consist of quartile 1, quartile 2 and quartile 3. The value of quartile 2 of a data
group is the same as the median value.

2. Calculate the Quartile


With Excel we can calculate quartiles and percentiles. Whereas percentiles divide the data into 100
equal parts, quartiles divide the data into 4 equal parts. In general, quartiles are written as shown
below.

i. Make a table
Fill in the data in column B, in the range B3:B20 (the data have been sorted from the smallest
value to the largest value).
ii. Type a formula to calculate quartile values
a. In cell D3 type =QUARTILE(B3:B20,0). This formula calculates the minimum value.

b. In cell D4 type =QUARTILE(B3:B20,1). This formula finds the value of Q1, the first quartile.
c. In cell D5 type =QUARTILE(B3:B20,2). This formula finds the value of Q2, the second quartile.
d. In cell D6 type =QUARTILE(B3:B20,3). This formula finds the value of Q3, the third quartile.

Figure 6.1. Example of Quartile Calculation in Excel


In cell C10 type the median formula =MEDIAN(B3:B20). Median value = Q2
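
For comparison with the worksheet formulas above, here is a small Python sketch using the standard library. The nine data values are hypothetical and chosen only so the result is easy to check by hand. The "inclusive" method of statistics.quantiles is assumed here to correspond to the way Excel's QUARTILE treats the smallest and largest values as the 0% and 100% points.

```python
import statistics

# Hypothetical sorted sample, used only for illustration.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Three cut points that divide the sorted data into four equal parts.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Quartile 0 and quartile 4 are simply the minimum and maximum.
minimum, maximum = min(data), max(data)

# The second quartile should agree with the median.
median = statistics.median(data)

print("Min:", minimum, " Q1:", q1, " Q2:", q2, " Q3:", q3, " Max:", maximum)
print("Median equals Q2:", median == q2)
```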

iii. How to calculate percentiles in Excel


In statistics, several terms are used to describe the position of values in a set of data:

1. Median. The median is the middle value of a set of data; it divides the data into 2 equal parts
and is also known as quartile 2 (K2).
2. Quartile. Quartiles are observational values that divide the data into 4 equal parts, called
K1, K2 (median) and K3. Manually, the quartiles can be determined by first determining the
position p = n / 4, from which K1 and K3 are obtained.
3. Decile. Deciles are observational values that divide the data into 10 equal parts.
4. Percentile. Percentiles are observational values that divide the data into 100 equal parts.
iv. How to write the percentile formula in Excel

PERCENTILE(array, k) where:

a. array is the array or range of data.


b. k is the desired percentile, in the range 0 to 1 (0 means 0%, while 1 means 100%).

Some things to note when using the PERCENTILE function in Excel:


a. If the array is empty or contains more than 8,191 data points, PERCENTILE returns the #NUM!
error value.
b. If k is non-numeric (not a number), PERCENTILE returns #VALUE!
c. If k < 0 or k > 1, PERCENTILE returns #NUM!
d. If k is not a multiple of 1 / (n - 1), PERCENTILE interpolates to determine the value of the k-th
percentile.

To make it easier to apply the PERCENTILE function, create a table like the one below:
1. Column A will be filled with data, in this example the data is in the range A3:A13. It should be
noted that the data must be sorted from small to large.
2. Next do the percentile calculation. To calculate the 100th percentile, in cell D3 type the formula
=PERCENTILE(B3:B20,1)
3. To find the 20th percentile, in cell D4 type the formula =PERCENTILE(B3:B20,0.2)
4. To determine the 80th percentile, in cell D5 type the formula =PERCENTILE(B3:B20,0.8)

Figure 6.2. Percentile search formulation in Excel
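
The interpolation mentioned in note d can be made concrete with a short Python sketch. The function below is a hypothetical helper, not Excel's own code: it assumes the k-th percentile sits at position k × (n - 1) in the sorted data and interpolates linearly between the two neighbouring values. The five data values are invented for illustration.

```python
def percentile(values, k):
    """Return the k-th percentile (0 <= k <= 1) using linear interpolation
    at position k * (n - 1) in the sorted data."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= k <= 1:
        raise ValueError("k must be between 0 and 1")

    ordered = sorted(values)
    position = k * (len(ordered) - 1)      # fractional index into the sorted data
    lower_index = int(position)            # index of the value just below
    fraction = position - lower_index      # how far towards the next value

    if lower_index + 1 >= len(ordered):    # k = 1 lands on the largest value
        return float(ordered[-1])

    lower_value = ordered[lower_index]
    upper_value = ordered[lower_index + 1]
    return lower_value + fraction * (upper_value - lower_value)


# Hypothetical data, used only for illustration.
sample = [10, 20, 30, 40, 50]

p100 = percentile(sample, 1)      # largest value
p20 = percentile(sample, 0.2)     # position 0.8, between 10 and 20
p80 = percentile(sample, 0.8)     # position 3.2, between 40 and 50

print(p100, round(p20, 6), round(p80, 6))
```

When k is an exact multiple of 1 / (n - 1), the position is a whole number and the function returns a data value without interpolating.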

Practical Tasks (POST TEST) 1
The number of students who made purchases in the canteen on each of 50 days is as follows:

16 35 37 42 40 38 33 35 30 35
18 25 25 24 30 25 35 27 29 32
34 50 23 20 56 33 26 21 31 29
45 36 45 41 30 19 23 42 33 28
30 63 47 69 22 40 59 42 33 30

1. From the table above, calculate the quartile value


2. Match the results you get using the Excel program or SPSS.
3. Interpret the results that you have obtained!

Practical Work (Post Test) 2

1. Observe the phenomenon in real life and take one type of sample data from two different
populations, each population of 100 samples. Calculate the quartile and percentile values
2. Match the results you get using the Excel program or SPSS.
3. Interpret the results that you have obtained!

Practicum assignments (Post Test) 3

Based on the case examples in Practicum 2 (Tables 2.2 to 2.4) determine the quartile value!
PRACTICUM 7
MEASURES OF DISPERSION (VARIANCE AND
STANDARD DEVIATION)
Time : 2.5 hours
Basic competencies : Students can complete data analysis for measures of dispersion
Indicator : Students can understand measures of data dispersion using the
Excel application

Theory
1. Range
The range (r) of a data group is the difference between the maximum value and the
minimum value. It gives a rough idea of the variation in a data distribution. The range is a
very coarse measure, because it ignores every value other than the two extremes.
2. Variance
Variance is the average of the squared deviations of all data values from the mean.
The sample variance is denoted by $s^2$, while the population variance is denoted by $\sigma^2$:
$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$$
Strictly speaking, the measure of deviation is the standard deviation. The variance is the
square of the standard deviation, so it can also be treated as a measure of spread.
3. Standard Deviation
Standard deviation is the square root of the variance. The average (mean absolute) deviation
already takes all values in the data into account. However, because it is calculated from absolute
values, the direction of the spread cannot be known. The standard deviation overcomes this
weakness by squaring the deviations, so that negative values become positive. This makes the
standard deviation the most accurate of these measures of spread.
The STDEV function gives the same result as the manual calculation with the following formula:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$$
Where:
$x_i$ = the i-th data value
$\bar{x}$ = sample mean value
n = the amount of data
4. Coefficient of variation
The coefficient of variation is a relative measure of spread that can be used to compare data distributions that
have different units. Two variables that have different units cannot be compared through an absolute measure
of spread such as the variance or the standard deviation. The coefficient of variation expresses the standard
deviation as a percentage of the mean: $CV = \frac{s}{\bar{x}} \times 100\%$.
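
The four measures above can be computed together. The following is a minimal Python sketch using the standard `statistics` module, whose `variance` and `stdev` functions divide by n-1 (sample formulas). The function name `dispersion_summary` and the `sample` list are hypothetical and serve only as an illustration.

```python
import statistics


def dispersion_summary(values):
    """Return the mean, range, sample variance, sample standard deviation
    and coefficient of variation (in percent) of a list of numbers."""
    mean = statistics.mean(values)
    variance = statistics.variance(values)  # divides by n - 1
    std_dev = statistics.stdev(values)      # square root of the sample variance
    return {
        "mean": mean,
        "range": max(values) - min(values),
        "variance": variance,
        "std_dev": std_dev,
        "cv_percent": std_dev / mean * 100,
    }


# Hypothetical data, not taken from the practicum tables.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
for name, value in dispersion_summary(sample).items():
    print(f"{name}: {value:.4f}")
```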

Calculating the standard deviation using Excel

One of the statistical functions available in Microsoft Excel is the standard deviation.
Standard deviation is a measure of how widely the values are spread around the mean.

The standard deviation function is written as follows:

STDEV(number1, number2, ...) where number1, number2, ... are 1 to 255 arguments that correspond to a
sample of the population. You can also use a single array or a reference to an array instead of arguments
separated by commas.

Notes
a. STDEV assumes that its arguments are a sample of the population. If your data represent the
entire population, calculate the standard deviation using STDEVP.
b. The standard deviation is calculated using the "n-1" method.
c. Arguments can be numbers, or names, arrays, or references that contain numbers.
d. Logical values and text representations of numbers that you type directly into the argument
list are counted.
e. If an argument is an array or reference, only the numbers in that array or reference are
counted. Blank cells, logical values, text, or error values in the array or reference are ignored.
f. Arguments that are error values, or text that cannot be translated into numbers, cause an error.
g. If you want to include logical values and text representations of numbers in a reference as part of the
calculation, use the STDEVA function.
Practical Tasks (Post Test) 1
1. Create a table that contains data (the data do not need to be sorted from small to large).
2. To calculate the standard deviation, type the following formula in cell C3: =STDEV(A3:A13)

Figure 6.3. Example of processing Standard Deviation in Excel

3. To calculate the variance, type the following formula in cell D3: =VAR(A3:A13)

Figure 6.4. Example processing of Variance in Excel

4. Based on the data below, find the mean, range, variance and standard deviation using Excel:

34 50 23 20 56 33 26 21 31 29
45 36 47 41 30 19 23 42 33 28
36 63 47 69 22 40 59 40 30 30
Record the results in this table.

No  Measure             Result
1   Mean
2   Range
3   Variance
4   Standard deviation

Practical Work (Post Test) 2


1. Find sample data from two different populations, with 30 samples from each population
(per student). Make a frequency table (per group) and calculate the range, variance,
standard deviation and coefficient of variation.
2. Match the results you get using the Excel program or SPSS.
3. Interpret the results that you have obtained!

Practicum assignments (Post Test) 3

1. In 10 tests this semester you scored 91, 79, 86, 80, 75, 100, 87, 93, 90, and 88.
What is the standard deviation of the test scores? Verify it with SPSS or Excel.

2. Create a case example with a sample of 45 data values, then determine the variance and standard deviation!
PRACTICUM 8
PROBABILISTIC STATISTICS
Time : 2.5 hours

Basic competencies :
Students can complete data analysis for probabilistic statistics
Indicator :
Students can understand probabilistic statistics by using the Weka application

1. PROBABILITY
Probability measures how likely it is that an event will occur.

For example, if 3 out of 10 graduates have mastered Cisco, the probability of choosing a graduate who
has mastered Cisco is: p(cisco) = 3/10 = 0.3

2. BAYES THEOREM

$$p(H_i \mid E) = \frac{p(E \mid H_i)\, p(H_i)}{\sum_{k=1}^{n} p(E \mid H_k)\, p(H_k)}$$

with:

p(Hi | E) = probability that hypothesis Hi is true given evidence (fact) E

p(E | Hi) = probability that evidence (fact) E appears if hypothesis Hi is true

p(Hi) = probability of hypothesis Hi (according to previous results) regardless of any evidence

n = number of possible hypotheses

Example:
Asih has spots on her face. The doctor suspects that Asih has smallpox, with:
• probability of spots appearing on the face if Asih has smallpox: p(spots | smallpox) = 0.8
• probability of Asih having smallpox regardless of any symptoms: p(smallpox) = 0.4
• probability of spots appearing on the face if Asih has an allergy: p(spots | allergy) = 0.3
• probability of Asih having an allergy regardless of any symptoms: p(allergy) = 0.7
• probability of spots appearing on the face if Asih has acne: p(spots | acne) = 0.9
• probability of Asih having acne regardless of any symptoms: p(acne) = 0.5

then:

• The probability that Asih has smallpox, given the spots on her face:
$$p(\text{smallpox} \mid \text{spots}) = \frac{0.8 \times 0.4}{0.8 \times 0.4 + 0.3 \times 0.7 + 0.9 \times 0.5} = \frac{0.32}{0.98} \approx 0.327$$

• The probability that Asih has an allergy, given the spots on her face:
$$p(\text{allergy} \mid \text{spots}) = \frac{0.3 \times 0.7}{0.98} = \frac{0.21}{0.98} \approx 0.214$$

• The probability that Asih has acne, given the spots on her face:
$$p(\text{acne} \mid \text{spots}) = \frac{0.9 \times 0.5}{0.98} = \frac{0.45}{0.98} \approx 0.459$$
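
The three probabilities in this example can be checked with a short calculation. The following is a minimal Python sketch that applies Bayes' theorem by dividing each product p(spots | H) × p(H) by the sum of these products over the three hypotheses. The function name `posteriors` and the dictionary keys are hypothetical; the third hypothesis of the example is labelled `acne` in the code. Note that the prior values given in the example do not sum to 1, so the division by the sum is what makes the three results add up to 1.

```python
def posteriors(priors, likelihoods):
    """Return p(H | E) for every hypothesis H.

    priors[h] is p(H) and likelihoods[h] is p(E | H).
    """
    joint = {h: likelihoods[h] * priors[h] for h in priors}
    evidence = sum(joint.values())  # denominator of Bayes' theorem
    return {h: joint[h] / evidence for h in joint}


# Values taken from the example in the text.
priors = {"smallpox": 0.4, "allergy": 0.7, "acne": 0.5}
likelihoods = {"smallpox": 0.8, "allergy": 0.3, "acne": 0.9}

for hypothesis, p in posteriors(priors, likelihoods).items():
    print(f"p({hypothesis} | spots) = {p:.3f}")
```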

If, after a hypothesis has been tested, one or more new pieces of evidence (facts) or observations appear, then:

$$p(H \mid E, e) = p(H \mid E) \times \frac{p(e \mid E, H)}{p(e \mid E)}$$

with:
e = old evidence
E = new evidence or observation
p(H | E, e) = probability that hypothesis H is true if new evidence E emerges in addition to old evidence e
p(H | E) = probability that hypothesis H is true given evidence E
p(e | E, H) = link between e and E if hypothesis H is true
p(e | E) = link between e and E regardless of any hypothesis

Practical Tasks (Post Test) 1


The transportation department has data on the number of passengers boarding buses at bus stops
in Jogja in the morning, at noon, and in the afternoon, evening and night. The data are shown in
Table 7.1.
Table 7.1 Data on the number of Trans Jogja Bus passengers
Halte            Morning  Noon  Afternoon  Evening  Night  Class
Terminal Concat  40       45    44         33       20     many
Sarjito Barat    21       33    43         50       42     average
Santika          46       30    16         36       22     many
Malioboro 3      53       26    26         39       21     many
Janti 3          33       37    39         32       18     many
Papmi Selatan    21       27    35         42       39     few
MT Haryono       37       47    38         32       18     many
SD Pujokusuman   39       38    33         38       25     average

Using WEKA, calculate the probabilities with the Bayes method.


Steps for using WEKA
1. Type the data in Table 7.1 into Excel and save it as a .CSV file.
2. Open the WEKA program. Its initial view is shown in Figure 8.1.

Figure 8.1 Initial View of WEKA


3. Open the file by browsing to the location where it was saved. Note that the Excel file must be
saved as CSV so that it can be read by the WEKA program.

Figure 8.2 Open .CSV file

4. On the top tab bar, select Classify, choose a Bayes classifier, then click Start.

Figure 8.3 Selection of the BAYES method

5. The outcome of the BAYES calculation process in WEKA is shown in Figure 8.4.

Figure 8.4 Process Results with BAYES


6. Analyze the results produced by the WEKA program.
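
As an alternative to typing the table in Excel for step 1, the CSV file can be produced with a few lines of code. The following is a minimal Python sketch, assuming the column layout of Table 7.1 (stop name, five time-of-day counts, class). The column names in `HEADER` and the function name `table_as_csv` are hypothetical choices. The class column is placed last because, as far as is known, WEKA treats the last attribute as the class by default.

```python
import csv
import io

# Hypothetical column names; the values are copied from Table 7.1.
HEADER = ["halte", "morning", "noon", "afternoon", "evening", "night", "class"]
ROWS = [
    ("Terminal Concat", 40, 45, 44, 33, 20, "many"),
    ("Sarjito Barat", 21, 33, 43, 50, 42, "average"),
    ("Santika", 46, 30, 16, 36, 22, "many"),
    ("Malioboro 3", 53, 26, 26, 39, 21, "many"),
    ("Janti 3", 33, 37, 39, 32, 18, "many"),
    ("Papmi Selatan", 21, 27, 35, 42, 39, "few"),
    ("MT Haryono", 37, 47, 38, 32, 18, "many"),
    ("SD Pujokusuman", 39, 38, 33, 38, 25, "average"),
]


def table_as_csv():
    """Return Table 7.1 as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerows(ROWS)
    return buffer.getvalue()


print(table_as_csv())
```

To obtain a file for WEKA, the returned text can be written to a file with a `.csv` extension.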

Practical Work (Post Test) 2


Using the table below, repeat the steps of Post Test 1 and analyze the results using the Weka
application.
Table 8.2 Student Alumni Data
Age  Study period (year)  GPA  TOEFL  First paycheck (Million)  Waiting time for work
22 3.5 3.60 360 1 Fast
23 4 3.25 340 2.5 Fast
23 4 3.10 400 2.7 Long
24 5 2.98 3.50 1.5 Average
23 4.5 3.00 410 2.0 Average
25 6 3.20 300 3.0 Long
23 4 2.75 345 1 Long
23 4.5 2.90 376 0.7 Long
26 6 3.35 435 1.2 Average
23 4 3.10 405 2 Fast

Practical Tasks (Post Test) 3


Table 8.3 contains nutritional data for toddlers with the parameters Weight, Height and Age, which are
commonly used to assess the nutrition of children under five. Make an analysis using the Weka application.
Table 8.3. Toddler Data
Toddler Name Age (month) Height (cm) Weight (kg) Nutritional status
Ashila 12 70 10 Average
Raisa 7 60 5 Good
Arya 8 60 6 Good
Azahra 10 70 7 Enough
Kayla 7 50 5 Good
Nayra 8 55 6 Good
Hilmi 11 75 35 Bad
Hanum 10 72 7 Good
Tamima 5 45 6 Good
Maryam 5 47 6 Good
Daffa 6 47 7 Good
Fatma 7 45 6 Good
Hanifa 8 55 6 Good
Reinan 10 68 7 Good
Annisa 12 70 15 Good
PRACTICUM 9
PROBABILISTIC DISTRIBUTION
Time : 1.5 hours

Basic Competencies :
Students can solve the problem of Normal distribution

Indicator :
Students can use the SPSS application to analyze the Normal probability distribution theory

Theory
In the business world, a lot of data or information is quantitative, in the sense of containing the value of
a number or a certain number, for example: sales of a product in a month, Consumer Price Index,
factory production, etc.

Statistical data in the form of random variables can be divided into 2 types:


1. Discrete random variables (taking a countable number of values)
2. Continuous random variables (taking any value within a certain interval).
According to the type of random variable, there are 2 main forms of probability distribution:
1. Probability distributions of discrete random variables
a. Binomial distribution
used to solve probability problems whose random variable has two possible outcomes
("Yes" or "No", "Fail" or "Success", etc.)
b. Poisson distribution
used to describe the number of events in a certain period, area, or specified quantity,
for example the number of work accidents in a factory or the number of death
claims received each day by an insurance company.

2. Probability distributions of continuous random variables


a. Normal distribution
a bell-shaped distribution that is widely used in business statistics, for example for the
profits or losses on the stock exchange or the weight distribution of fruit
sold in a store.
b. Uniform distribution
like the Poisson distribution, it involves a certain interval of time. For example: the waiting time
for insurance claims.
An understanding of probability and its various distributions forms the basis for decision making
(statistical inference).
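
To make the two discrete distributions concrete, the following is a minimal Python sketch of their probability functions, using the standard formulas P(X = x) = C(n, x) p^x (1-p)^(n-x) for the binomial distribution and P(X = x) = e^(-λ) λ^x / x! for the Poisson distribution. The function names and the example numbers (10 trials with success probability 0.3, an average of 2 accidents per month) are hypothetical.

```python
import math


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials,
    each with success probability p."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """Probability of exactly x events when lam events are expected
    on average in the period or area considered."""
    return math.exp(-lam) * lam**x / math.factorial(x)


# Hypothetical examples.
print(f"P(3 successes in 10 trials, p = 0.3) = {binomial_pmf(3, 10, 0.3):.4f}")
print(f"P(0 accidents | average 2 per month) = {poisson_pmf(0, 2):.4f}")
```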
NORMAL DISTRIBUTION TEST
One probability distribution that plays an important role in inferential statistics is the normal
distribution. Therefore, after a group of data has been given descriptive statistical treatment (its
mean, variance, etc. are known), the data should first be tested to see whether or not they are
normally distributed.

This is very important, because if the data turn out to be far from a normal or near-normal
distribution, the data group cannot be used in hypothesis tests intended for normally distributed
(parametric) data. For groups of data that are not normally distributed, non-parametric statistical tests
are performed instead.

The following is the profile of 40 respondents who were asked for their opinions on Roti DUTA
MAKMUR products.
Table 9.1 Roti DUTA MAKMUR Product Data

ATTITUDE GROUP AGE GENDER INCOME BUY


1.00 2.00 1.00 1.00 300.00 25.00
1.00 3.00 1.00 1.00 250.00 35.00
1.00 3.00 1.00 1.00 289.00 32.00
1.00 3.00 2.00 1.00 298.00 31.00
1.00 1.00 2.00 1.00 569.00 35.00
2.00 1.00 2.00 1.00 800.00 26.00
2.00 1.00 3.00 1.00 645.00 30.00
2.00 1.00 4.00 1.00 990.00 30.00
2.00 3.00 2.00 1.00 132.00 32.00
2.00 3.00 4.00 1.00 254.00 30.00
2.00 1.00 3.00 1.00 862.00 33.00
3.00 3.00 4.00 1.00 220.00 37.00
3.00 2.00 1.00 1.00 482.00 36.00
3.00 3.00 2.00 1.00 265.00 30.00
3.00 3.00 3.00 1.00 296.00 31.00
3.00 3.00 3.00 1.00 296.00 34.00

3.00 3.00 1.00 1.00 286.00 30.00
3.00 2.00 3.00 1.00 620.00 31.00
3.00 3.00 4.00 1.00 288.00 32.00
1.00 3.00 2.00 2.00 275.00 28.00
1.00 2.00 2.00 2.00 452.00 33.00
1.00 2.00 3.00 2.00 415.00 34.00
1.00 2.00 4.00 2.00 478.00 35.00
1.00 1.00 4.00 2.00 893.00 32.00
1.00 1.00 2.00 2.00 952.00 28.00
1.00 1.00 1.00 2.00 689.00 32.00
2.00 3.00 3.00 2.00 75.00 30.00
2.00 3.00 4.00 2.00 159.00 35.00
2.00 3.00 1.00 2.00 200.00 29.00
2.00 1.00 3.00 2.00 900.00 34.00
2.00 1.00 2.00 2.00 850.00 30.00
2.00 1.00 4.00 2.00 845.00 32.00
2.00 2.00 3.00 2.00 315.00 33.00
3.00 3.00 4.00 2.00 219.00 34.00
3.00 3.00 4.00 2.00 129.00 32.00
3.00 3.00 2.00 2.00 275.00 30.00
3.00 1.00 4.00 2.00 875.00 35.00
3.00 1.00 4.00 2.00 833.00 37.00
3.00 2.00 4.00 2.00 621.00 32.00
3.00 1.00 2.00 2.00 655.00 35.00

Data Description:
• Data for the ATTITUDE variable are ordinal data with codes:
1 = Like
2 = Somewhat like
3 = Dislike
• Data for the GROUP variable are nominal data with codes:
1 = Rich
2 = Intermediate
3 = Poor
• Data for the AGE variable are nominal data with codes:
1 = Children
2 = Teenager
3 = Young
4 = Adult
• Data for the GENDER variable are nominal data with codes:
1 = Male
2 = Female
• Data for the INCOME and BUY variables are ratio data.
INCOME is the respondent's income in a month (in thousands of rupiah), either from their own work or
given by their parents, while the BUY variable is the number of bread purchases in a month.
Thus, the first row of the data can be read as a consumer (respondent) who likes DUTA MAKMUR
bread products, is classified as middle income, is a teenager, is male, has an income (pocket money
from parents) of on average Rp. 300,000 per month, and buys on average 25 DUTA MAKMUR bread
products in a month, and so on.

From the consumer profile above, the INCOME variable will be tested to see whether or not it is normally distributed.

Steps in SPSS:
From the Analyze menu, select the Descriptive Statistics submenu, then choose Explore...

Filling in the dialog:
• DEPENDENT LIST: the name of the variable to be tested. In this case, enter the INCOME
variable.
• DISPLAY: the output to be displayed, which can be statistics, plots, or both.
Because only data normality will be tested, select Plots.

Figure 9.1. Example Plot menu

To test data normality, do the following steps:


• Tick the Normality plots with tests option.
• Deactivate the STEM-AND-LEAF option.
• Select None in the BOXPLOTS section.
• Press the CONTINUE button to return to the previous dialog box.
• Ignore the other parts and press OK.
ANALYSIS:

Decision Making Process:


a. Hypothesis
Ho : income data are normally distributed
H1 : income data are not normally distributed

b. Basis for decision making

By looking at the probability value, with the conditions:
- Probability > 0.05, then Ho is accepted
- Probability < 0.05, then Ho is rejected
c. Decision
Based on the probability value:
Because the number in the ASYMP.SIG column is 0.01, which is < 0.05, Ho is rejected, i.e. the
INCOME data do not follow the normal distribution.

When viewed with the plots (graphs), it appears that:


• On the NORMAL Q-Q PLOT OF INCOME graph, the data spread somewhat away from the
straight line, even though they follow its direction towards the upper right.
• In the DETRENDED NORMAL Q-Q PLOT OF INCOME chart, the data form a certain pattern,
first decreasing, then increasing, then decreasing again. Because such a pattern exists, it can be
said that the data distribution is not normal.

Note:
Although normality can be assessed through plots, testing with statistical tools is still recommended
for making the right decision.
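
The idea behind a statistical normality test can be illustrated with a small calculation. The following is a minimal Python sketch that computes a Kolmogorov-Smirnov type statistic D, the largest gap between the empirical distribution of the INCOME values in Table 9.1 and a normal distribution fitted with the sample mean and standard deviation. This is an assumption about one of the statistics behind the SPSS output, not a reproduction of it: the sketch does not compute the probability (Sig.) value, so the decision rule above still has to be applied to the SPSS result. The function name `ks_statistic_normal` is hypothetical.

```python
import statistics


def ks_statistic_normal(values):
    """Largest distance between the empirical CDF of values and the CDF of
    a normal distribution fitted with the sample mean and standard deviation."""
    ordered = sorted(values)
    n = len(ordered)
    fitted = statistics.NormalDist(statistics.mean(ordered), statistics.stdev(ordered))
    d = 0.0
    for i, x in enumerate(ordered, start=1):
        cdf = fitted.cdf(x)
        # Compare the fitted CDF with the empirical CDF just after and just before x.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


# INCOME column of Table 9.1 (thousands of rupiah), in table order.
income = [
    300, 250, 289, 298, 569, 800, 645, 990, 132, 254,
    862, 220, 482, 265, 296, 296, 286, 620, 288, 275,
    452, 415, 478, 893, 952, 689, 75, 159, 200, 900,
    850, 845, 315, 219, 129, 275, 875, 833, 621, 655,
]
print(f"n = {len(income)}, D = {ks_statistic_normal(income):.4f}")
```

A larger D means that the data lie further from the fitted normal curve.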

Practical Tasks (Post Test) 1


1) With reference to the table above, test whether the BUY variable is normally distributed or not?
2) Make an analysis of the test results
Practical Work (Post Test) 2
1) Test the following data whether the Age variable is normally distributed. Before testing it, first
change the data into nominal data
Table 9.2 List of Diseases

Gender Job Salary (Million) Status Credit Decision
Male PNS 2.5 Single Deserved
Female Swasta 2.5 Married Deserved
Male PNS 4 Married Deserved
Female IRT 1 Married Inappropriate
Female Swasta 2.5 Single Deserved
Male Swasta 0.5 Single Inappropriate
Male PNS 2.5 Single Deserved
Female Swasta 3 Married Deserved
Male Swasta 2 Married Deserved
Female IRT 1 Married Inappropriate
Female Swasta 2.5 Single Deserved
Male Swasta 0.5 Married Inappropriate
Female PNS 2.5 Single Deserved
Female Swasta 1 Married Deserved
Male Swasta 4 Married Deserved
Female IRT 1 Married Inappropriate
Female Swasta 0.7 Single Deserved
Male swasta 0.5 Single Inappropriate

2) Make an analysis of the test results


Practical Tasks (Post Test) 3
Table 9.3 is nutritional data for children under five with parameters Weight, Height and Age that are
usually used to calculate nutrition for children under five. Test whether the variable height and weight
are normally distributed.
Table 9.3. Toddler Data
Toddler Name Age (month) Height (cm) Weight (kg) Nutritional status
Ashila 12 70 10 Average
Raisa 7 60 5 Good
Arya 8 60 6 Good
Azahra 10 70 7 Enough
Kayla 7 50 5 Good
Nayra 8 55 6 Good
Hilmi 11 75 35 Bad
Hanum 10 72 7 Good
Tamima 5 45 6 Good
Maryam 5 47 6 Good
Daffa 6 47 7 Good
Fatma 7 45 6 Good
Hanifa 8 55 6 Good
Reinan 10 68 7 Good
Annisa 12 70 15 Good

Reference:

Winiarti, S, 2010, Diktat Statistik


http://www.academia.edu/10442600/Praktikum_Teori_Probabilitas
PRACTICUM 10
HYPERGEOMETRIC DISTRIBUTION
Time: 2.5 hours

Basic Competencies:
Students can solve hypergeometric-distribution problems

Indicator:
Students can use the SPSS application to analyze hypergeometric distribution theory

1. Definition of hypergeometric distribution


The hypergeometric distribution is a probability distribution for sampling without replacement, i.e. every
item that has been drawn and observed is not returned to the original population (Algifari, 2010).

2. Differences in binomial and hypergeometric distribution


The difference between the two lies in whether the samples that have been taken are returned to the
group. In the binomial distribution, the sample that has been taken is returned to the group. In the
hypergeometric distribution, the sample is not returned to the group, or the sample is considered lost.

3. Properties of hypergeometric distribution


1. A random sample of size (n) is taken without replacement from (N) objects.
2. As many as (k) objects can be classified as successes and the remaining (N-k) as failures.

From the definition and properties above, the following formula is obtained:

$$h(x; N, n, k) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}$$

N : population size or sample space
n : random sample size
k : number of objects in the population classified as successes
x : number of successes in the sample (x = 0, 1, 2, 3, 4, ..., k)
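
The hypergeometric probability can be computed directly from combinations: the number of ways to choose x successes out of k, times the number of ways to choose the remaining n-x items out of the N-k failures, divided by the number of ways to choose n items out of N. The following is a minimal Python sketch under that standard definition. The function name `hypergeom_pmf` and the example numbers (N = 10, n = 4, k = 5) are hypothetical; the argument order follows the symbols x, N, n, k used in the text.

```python
from math import comb


def hypergeom_pmf(x, N, n, k):
    """Probability of exactly x successes in a sample of size n drawn without
    replacement from N objects, k of which are successes."""
    if x < 0 or x > n or x > k or n - x > N - k:
        return 0.0  # impossible outcome
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)


# Hypothetical example: N = 10 objects, k = 5 successes, sample of n = 4.
for x in range(5):
    print(f"P(X = {x}) = {hypergeom_pmf(x, 10, 4, 5):.5f}")
```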

Problem & Solution 1


Of 8 motorcycle drivers, 3 drive brand "A" motorcycles, 3 drive brand "B" motorcycles and the rest
drive brand "C" motorcycles. If 4 people are chosen at random, what is the probability of getting 1 person
who drives brand "A", 1 person who drives brand "B" and 2 people who drive brand "C"?
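
This problem involves three groups instead of two, so the hypergeometric formula is extended by multiplying one combination per group. The following is a minimal Python sketch, assuming that the sample of 4 people consists of 1 driver of brand "A" (4 - 1 - 2), 1 of brand "B" and 2 of brand "C", and that 2 of the 8 drivers use brand "C" (8 - 3 - 3). The function name `multivariate_hypergeom` is hypothetical.

```python
from math import comb


def multivariate_hypergeom(group_sizes, drawn):
    """Probability of drawing exactly drawn[i] members from each group i
    when sum(drawn) members are taken without replacement."""
    if len(group_sizes) != len(drawn):
        raise ValueError("group_sizes and drawn must have the same length")
    ways = 1
    for size, taken in zip(group_sizes, drawn):
        ways *= comb(size, taken)  # ways to pick 'taken' members of this group
    return ways / comb(sum(group_sizes), sum(drawn))


# Groups A, B, C with 3, 3 and 2 drivers; sample of 1, 1 and 2 drivers.
probability = multivariate_hypergeom([3, 3, 2], [1, 1, 2])
print(f"P(1 A, 1 B, 2 C) = {probability:.4f}")  # 9/70, about 0.1286
```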

Problem & Solution 2


A bag contains 8 marbles consisting of 2 white marbles, 2 purple marbles and 4 yellow marbles.
Determine the hypergeometric probability function of choosing 1 white marble and 1 purple marble

4. Completion of Hypergeometric Distribution with SPSS


In this practicum, students collect primary data for the hypergeometric distribution of the number of
yellow balls drawn, by drawing 4 balls in each replication and repeating the draw 10 times during the practicum.
The following are the data obtained:

Figure 10.1 Data on the results of 4 ball collection trials

SPSS processing
The following is data processing using SPSS:
1. Activate Variable View
2. Fill in x and PDF in the Name column
3. Fill in the Decimal column with 0 (zero) on x and 5 (five) on PDF
4. Fill in the two Measure columns with Scale
5. Activate Data View
6. Fill in the values x = 0, 1, 2, 3, 4
7. On the Menu Bar click Transform >> Compute Variable and compute PDF with the formula PDF.HYPER(x, N, n, k)

Figure 10.2 Calculation of Hypergeometric Distribution


8. Click OK
9. The result is as follows:
10. The following is a summary of the probabilities of drawing 0 yellow balls up to 4 yellow balls:

Figure 10.3 Results of the hypergeometric distribution calculation

Practical Tasks (Post Test) 1


From the case below do a calculation using the Hypergeometric distribution and analyze the results of
that calculation.
1. A box contains 4 rose-scented bath soaps and 6 jasmine-scented bath soaps. If a
random sample of 3 bath soaps is taken, what is the probability of getting:
a. One rose-scented bath soap
b. Two jasmine-scented bath soaps
c. At most two rose-scented bath soaps
d. At most two jasmine-scented bath soaps

2. PT ekarasa, a company engaged in shipping goods, states that a shipment is considered good if,
out of 50 packages shipped, no more than 4 are defective. If a random sample of 5 packages is taken from
the 50 packages, what is the probability that:
a. There is one defective package between 2 and 3 defective packages
b. Fewer than 2 packages are defective
c. At least 4 packages are not defective.

Practical Work (Post Test) 2

1. Find sample data from two different populations and take one type of sample data from each.
Calculate the hypergeometric probabilities using SPSS
2. Match the results you get using the Excel program or SPSS.
3. Interpret the results that you have obtained!

Reference
https://id.scribd.com/doc/179102232/Modul-3-Distribusi-Probabilitas
PRACTICUM 11
One Sample Hypothesis Test
Time: 1.5 hours

Basic Competencies:
Students can solve problems in the Hypothesis Test

Indicator:
Students can use the SPSS application to solve the Hypothesis Test questions for the average
test of one sample.

Theory
A hypothesis can be interpreted as a conjecture about something, as a temporary answer to a
problem, or as a temporary conclusion about the relationship between a variable and one or more
other variables. According to Prof. Dr. S. Nasution, a hypothesis is a tentative statement, a conjecture
about what we are observing in an effort to understand it.
Function
• To test the truth of a theory
• To give new ideas for developing a theory
• To expand researchers' knowledge about the phenomenon being studied

• Hypothesis test

A good hypothesis always fulfils two requirements, viz:

• It describes the relationship between variables.
• It provides guidance on how to test that relationship.

Therefore the hypothesis needs to be formulated before data collection is carried out. This hypothesis is
called the Alternative Hypothesis (Ha), the Working Hypothesis (Hk) or H1. The working hypothesis H1 is a
temporary conclusion about the relationship between variables that has been derived from theories
related to the problem. To test H1, a comparison is needed, namely the Null Hypothesis (Ho).
Ho is also called the Statistical Hypothesis, because it is used as the basis for testing. The procedure
for deciding whether to accept or reject a Statistical Hypothesis (Ho) is called Hypothesis Testing.
In hypothesis testing, conclusions about the population are drawn from sample information rather than
from the population itself, so the conclusion could be wrong. In Hypothesis Testing there are
two types of error, viz:

Table 10.1. Mistakes in Hypothesis Testing

The conclusion is right if we accept Ho when Ho is indeed true, or reject Ho when Ho is indeed
false. If we reject Ho even though Ho is true, we have made a type I error (α). If we accept Ho even
though Ho is false, we have made a type II error (β). If the value of α is reduced, β becomes larger.
The α value is usually set at 0.05 or 0.01. If α = 0.05, then in about 5 out of every 100 conclusions we
will reject an Ho that should have been accepted. The value (1-β) is called the Power of the Test.

Techniques in hypothesis testing are based on:


a. One-Tailed Test
Ho : μ = μo
H1 : μ > μo
or
H1 : μ < μo

b. Two-Tailed Test
Ho : μ = μo
H1 : μ ≠ μo
• Test of the mean of one sample
Testing the mean of one sample is intended to test whether the population mean μ is equal to a certain
value μo, against the alternative hypothesis that the population mean μ is not equal to
μo. So we will test:
Ho : μ = μo against H1 : μ ≠ μo
Ho is the initial hypothesis.

Experiment

A student conducts research on gallons of pure milk that are stated to contain an average of 10 litres.
A random sample of 10 bottles has been taken and their contents measured, with the
following results: 10.2; 9.7; 10.1; 10.3; 10.1; 9.8; 9.9; 10.4; 10.3; 9.8. With α = 0.01, analyze it
manually:

1. Hypotheses: Ho : μ = 10 against H1 : μ ≠ 10
2. Test statistic: t (because σ is unknown or n < 30).
3. α = 0.01
4. Critical region: t < -t_{α/2}(n-1) or t > t_{α/2}(n-1), where t_{0.005}(9) = 3.250.
5. Calculations, from the data: mean x̄ = 10.06 and sample standard deviation s = 0.2459.

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{10.06 - 10}{0.2459/\sqrt{10}} = 0.772$$

Since t = 0.772 lies between -3.250 and 3.250, we conclude that Ho is accepted. This means that the
statement that the pure milk content averages 10 litres is acceptable.
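
The manual calculation can be reproduced with a short script. The following is a minimal Python sketch that computes the sample mean, the sample standard deviation (n-1 method) and the t statistic for the ten measurements, then compares t with the critical value 3.250 quoted in the text. The function name `one_sample_t` is hypothetical, and the critical value is taken from the text rather than computed.

```python
import math
import statistics


def one_sample_t(values, mu0):
    """Return the sample mean, sample standard deviation and t statistic
    for testing whether the population mean equals mu0."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # n - 1 in the denominator
    t = (mean - mu0) / (s / math.sqrt(n))
    return mean, s, t


contents = [10.2, 9.7, 10.1, 10.3, 10.1, 9.8, 9.9, 10.4, 10.3, 9.8]
mean, s, t = one_sample_t(contents, 10)
critical = 3.250  # two-tailed critical value for alpha = 0.01, df = 9 (from the text)

print(f"mean = {mean:.2f}, s = {s:.4f}, t = {t:.3f}")
print("Ho accepted" if abs(t) < critical else "Ho rejected")
```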

Analysis using SPSS:


1. Enter the data above in Data View, but first specify the name and type of the data in Variable
View.
2. Click Analyze > Compare Means > One-Sample T Test.
3. This produces the following analysis results:
Information on the results of the analysis:
Std. Error = standard error
t = calculated t value
df = degrees of freedom
Sig. (2-tailed) = two-tailed probability (p-value)
Mean Difference = difference between the sample mean and the test value
Ho is accepted if Sig. (2-tailed) > α, and rejected if Sig. (2-tailed) < α.

Practicum Task (Post Test) 1


An entrepreneur believes that the average daily sales of his employees is Rp.1,020.00, against the
alternative that it is not. To test his opinion, the employer interviewed 20 of his employees who
were chosen at random. Using α = 0.05, test the opinion and give your analysis. The results of the
interview are as follows.
Table 10.1. Employee Sample Data

Write down the results of the analysis below. Is Ho accepted?

Practical Tasks (Post Test) 2


A sports equipment company develops a type of synthetic fishing line that is claimed to have an
average strength of 8 kg with a standard deviation of 0.5 kg. A sample of 50 synthetic fishing
lines has an average strength of 7.8 kg. With a significance level of 0.01, test the
hypothesis that the population mean is not equal to 8 kg. Solve it using the SPSS application to verify
the result!

Practical Tasks (Post Test) 3

The average time required per student to enrol in an odd semester at a tertiary institution is 50 minutes
with a standard deviation of 10 minutes. A new registration procedure using a modern machine is being
tried. With this modern machine, 12 students require an average registration time of 42
minutes with a standard deviation of 11.9 minutes. With a significance level of 0.05, test the hypothesis
that the population mean with the modern machine is less than 50 minutes. Assume that the population
of times is normally distributed. Solve it using the SPSS application to verify the result!
PRACTICUM 12
COMPETENCE TEST 2
Time : 1.5 hours
Basic Competencies : Students can solve problems in the Hypothesis Test
Indicator : After getting this material students be able to:

1. Understand the addition rule, conditional probability and the multiplication rule
2. Work through case studies using the addition rule, conditional probability and the
multiplication rule
3. Complete probabilistic case study material using SPSS
4. Analyze the results of data processing from SPSS
5. Understand random variables and theoretical distributions
6. Understand and distinguish discrete and continuous theoretical distributions
7. Understand the discrete theoretical distribution (Uniform)
8. Calculate the probability value of a binomial distribution case
9. Read the binomial table
10. Calculate the probability value of a Normal distribution case
11. Read the Normal table
12. Understand the test of the mean
13. Resolve hypothesis test cases for the mean
14. Understand the definition of the test of proportion
15. Resolve hypothesis test cases for the proportion

Competency Material:

1. Basic Probability (chance of an event)


2. Statistic Probabilistic (Bayes)
3. Theoretical Distributions

Competency Test Time: Week 9 lectures

How to Test Competency: Students answer the questions given both in writing and practice.
PRACTICUM 13
Two Sample Hypothesis Test

Time : 1.5 hours

Basic Competencies :
Students can solve problems in the Hypothesis Test

Indicator :
Students can use the SPSS application to complete the Hypothesis Test questions for the average
test of two samples.

Theory
For testing the means of two samples there are 2 types of data:
1. Two Paired Samples.
This means that the two samples are related to each other and the number of observations
(repetitions) is the same in each sample.
2. Free / Independent Samples.
In the test of the means of two paired samples, the number of observations must be the same (n1 = n2),
whereas for two independent samples the number of observations does not have to be the same.
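
The difference between the two cases shows up in how the t statistic is computed. The following is a minimal Python sketch: the paired test works on the differences within each pair, while the independent test compares the two sample means using a pooled variance. The pooled form assumes that both populations have equal variances, which is an assumption of this sketch rather than a statement from the text. The function names and the small data sets are hypothetical.

```python
import math
import statistics


def paired_t(first, second):
    """t statistic for two paired samples (same length, matched by position)."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same number of observations")
    differences = [b - a for a, b in zip(first, second)]
    n = len(differences)
    return statistics.mean(differences) / (statistics.stdev(differences) / math.sqrt(n))


def independent_t(sample1, sample2):
    """t statistic for two independent samples, assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    pooled_var = (
        (n1 - 1) * statistics.variance(sample1)
        + (n2 - 1) * statistics.variance(sample2)
    ) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(sample1) - statistics.mean(sample2)) / std_error


# Hypothetical data, used only to illustrate the calls.
before = [10, 12, 9, 11]
after = [12, 14, 10, 13]
print(f"paired t = {paired_t(before, after):.3f}")

group1 = [1, 2, 3]
group2 = [2, 4, 6, 8]  # sample sizes may differ for independent samples
print(f"independent t = {independent_t(group1, group2):.3f}")
```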

Practical Steps:
1. Enter the data above in Data View, but first specify the name and data type in Variable View.
2. Click Analyze > Compare Means > Paired-Samples T Test.

The display will appear as follows:


Conclusion: H0 is accepted because p-value / 2 > 0.05

3. For example, we will test at the significance level α = 0.05 whether the wheel rotation of bicycle 1 is
different from that of bicycle 2. The wheel rotation data (minutes) of the two bicycles are:

The completion steps are as follows:

1. Enter the data above in Data View, but first specify the name and data type in Variable View.
2. Click Analyze > Compare Means > Independent-Samples T Test.

The analysis results will then appear; write them below!


Practical Tasks (Post Test) 1
A teacher argues that there is no difference between the average grades of class A students and class B
students, against the alternative that there is a difference. To test this opinion, a study was conducted based
on random sampling with 8 students in class A and 6 students in class B. The results of the study
were as follows:
Class A : 7.5 ; 8.5 ; 7 ; 7.3 ; 8 ; 7.7 ; 8.4 ; 8.5
Class B : 7 ; 6.7 ; 7.3 ; 7.5 ; 6.6
Using α = 5%, test that opinion.

Practicum Task (Post Test) 2


A mathematics lesson is given to 12 students with the usual teaching method. Another class of 10 students
is given the same lesson but with a method that uses pre-programmed material. At the end of the semester
both classes are given the same exam. The first class reaches an average score of 85 with a
standard deviation of 4, while the class that uses the programmed material reaches an average score of 81 with a
standard deviation of 5. With a significance level of 0.1, test the hypothesis that there is no difference between
the two teaching methods.

Practicum Task (Post Test) 3


A shop owner who sells 2 brands of light bulbs, A and B, believes that there is no difference in the
average lifetime of the light bulbs of the two brands, against the alternative that there is a difference. To test his
opinion, an experiment was carried out by lighting 100 brand A light bulbs and 50 brand B light bulbs as
a random sample. It turns out that the brand A bulbs light for an average of 952 hours, while
the brand B bulbs light for an average of 987 hours, with standard deviations of 85 hours and 92 hours
respectively. Using α = 5%, test the opinion.

Reference:
Winiarti, S, 2010, Diktat Statistik dasar
PRACTICUM 14
ANOVA TEST (RELIABILITY TEST AND VALIDITY TEST)
Time : 2.5 hours

Basic Competencies :
Students can process questionnaire data and test the questionnaire items with SPSS.

Indicator :
Proving the validity of an item. An item is said to be valid if it contributes to the value of
the measured variable.

Theory
Whether an item is valid or invalid is decided in two ways: by comparing the calculated rxy value
(SPSS output) with the r table value, and by comparing the SPSS output probability with the significance
level chosen by the researcher (usually 5% for social research and 1% for exact sciences). If
rxy ≥ rtable or the SPSS output probability ≤ 0.05, the item is valid. Conversely, if rxy < rtable
or the probability is greater than 0.05, the item is invalid.

Example:
A study will be conducted on the influence of leadership and work motivation on work performance.
Before the research is conducted, each instrument is tested first to obtain a valid and reliable instrument.
The instrument trial was carried out only once, on 10 respondents.

TABLE 13.1. QUESTIONNAIRE FORM
Respondent's answers per form (item)
Num. Form1 Form2 Form3 Form4 Form5 Form6 Form7 Form8 TOTAL
1 3 7 5 7 6 4 6 2 40
2 5 3 6 4 6 5 5 4 38
3 2 6 4 4 8 6 6 3 39
4 8 5 6 5 4 3 7 2 40
5 4 5 6 7 8 5 1 6 42
6 3 6 6 5 6 3 5 2 37
7 6 4 5 7 3 4 6 6 41
8 5 5 5 8 4 4 6 5 42
9 7 6 4 5 6 5 2 1 36
10 4 6 5 4 7 4 3 4 37

Steps to solve this using SPSS:

1. Define the variables in Variable View.

2. Enter the data in Data View.

3. Open the Analyze menu, select Correlate, then click Bivariate.

4. In the dialog box, move the items from the left box into the Variables box. Under Correlation
Coefficients select Pearson, under Test of Significance select One-tailed, then click
OK.
5. The output will then appear as shown below.

6. To make the table neater and easier to read, right-click or double-click the output table,
select Pivot, Edit, then select Pivoting Trays. In the Pivoting Trays window, move the
box in Column to Layer and the Statistics box from Row to Column.

7. The results now look neater. Remember that the second variable to read is TOTAL.

a. Listwise N=10
The validity test uses a one-tailed test of significance. The results of the calculation are interpreted
as follows:
• The probability between Form (item) 1 and the total is 0.482, which means p > 0.05.
• The probability between Form (item) 2 and the total is 0.243, which means p > 0.05.
• The probability between Form (item) 3 and the total is 0.256, which means p > 0.05.
• The probability between Form (item) 4 and the total is 0.004, which means p < 0.05.
• The probability between Form (item) 5 and the total is 0.205, which means p > 0.05.
• The probability between Form (item) 6 and the total is 0.464, which means p > 0.05.
• The probability between Form (item) 7 and the total is 0.245, which means p > 0.05.
• The probability between Form (item) 8 and the total is 0.017, which means p < 0.05.

A measurement is declared valid if it has a significant correlation with the total, and the correlation
is significant if p < 0.05. From the interpretation above, items 1, 2, 3, 5, 6, and 7 are not
significant because p > 0.05, so questions 1, 2, 3, 5, 6, and 7 are invalid. Items 4 and 8 each
have a significant correlation with the total because p < 0.05, so questions 4 and 8 can be
declared valid.
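
As a cross-check outside SPSS, the item-total correlations can be sketched in plain Python. This is only an illustration under stated assumptions: it uses the item scores and the TOTAL column exactly as printed in Table 13.1, and the critical value `R_TABLE = 0.549` (one-tailed, α = 0.05, df = 8) is an assumed r-table entry, not a number given in this module. The names `ANSWERS`, `TOTAL`, `R_TABLE` and `pearson_r` are hypothetical.

```python
import math

# Item scores from Table 13.1 (rows = respondents 1..10, columns = Form1..Form8).
ANSWERS = [
    [3, 7, 5, 7, 6, 4, 6, 2],
    [5, 3, 6, 4, 6, 5, 5, 4],
    [2, 6, 4, 4, 8, 6, 6, 3],
    [8, 5, 6, 5, 4, 3, 7, 2],
    [4, 5, 6, 7, 8, 5, 1, 6],
    [3, 6, 6, 5, 6, 3, 5, 2],
    [6, 4, 5, 7, 3, 4, 6, 6],
    [5, 5, 5, 8, 4, 4, 6, 5],
    [7, 6, 4, 5, 6, 5, 2, 1],
    [4, 6, 5, 4, 7, 4, 3, 4],
]
# TOTAL column exactly as printed in Table 13.1 (not recomputed).
TOTAL = [40, 38, 39, 40, 42, 37, 41, 42, 36, 37]

R_TABLE = 0.549  # assumed critical r: one-tailed, alpha = 0.05, df = n - 2 = 8

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Correlate every item column with the TOTAL column.
item_r = {}
for j in range(len(ANSWERS[0])):
    scores = [row[j] for row in ANSWERS]
    item_r[j + 1] = pearson_r(scores, TOTAL)

valid_items = [item for item, r in item_r.items() if r >= R_TABLE]

for item, r in item_r.items():
    status = "valid" if r >= R_TABLE else "invalid"
    print(f"Form{item}: r = {r:.3f} ({status})")
print("Valid items:", valid_items)
```

Under these assumptions, the rxy ≥ rtable rule marks the same items as valid as the probability rule used above.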

Reliability
The main purpose of reliability testing is to determine the consistency of an instrument's
measurement results when the instrument is used again as a measurement tool for the same
object or respondent (Triton PB, 2005).
The reliability of a questionnaire is tested with the Cronbach's Alpha method. The standard
generally used to decide whether a research questionnaire is reliable is a comparison between
the calculated r and the r table value at a 95% confidence level (5% significance level). In the
Cronbach's Alpha method, the calculated r is represented by the alpha value.
According to Santoso (2001: 227), a questionnaire can be called reliable if the calculated alpha is
positive and greater than the r table value. Cronbach's Alpha formula:

\[ \alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} s_i^2}{s_t^2}\right) \]

where k is the number of items, s_i^2 is the variance of item i, and s_t^2 is the variance of the
total score.
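
The usual Cronbach's Alpha computation, which compares the sum of the item variances with the variance of the total score, can be sketched in Python as follows. This is a minimal illustration, not the SPSS implementation. The function name `cronbach_alpha` is hypothetical, and `demo_rows` is made-up data for 4 respondents and 3 items, not data from this module.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(rows[0])
    # Sample variance of each item (column).
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical answers: 4 respondents x 3 items (illustration only).
demo_rows = [
    [3, 4, 3],
    [2, 2, 3],
    [4, 5, 5],
    [1, 2, 1],
]

alpha = cronbach_alpha(demo_rows)
print(f"Cronbach's alpha = {alpha:.3f}")
```

Alpha becomes negative when the item variances add up to more than the variance of the total score, which happens when the items tend to be negatively correlated with each other.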

If the reliability coefficient has been calculated, the closeness of the relationship can be
judged with the Guilford (1956) criteria:

1. less than 0.20 : Very small, negligible relationship
2. 0.20 - < 0.40 : Small relationship (not close)
3. 0.40 - < 0.70 : Fairly close relationship
4. 0.70 - < 0.90 : Close relationship (reliable)
5. 0.90 - < 1.00 : Very close relationship (very reliable)
6. 1.00 : Perfect relationship
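
The criteria above amount to a simple lookup on the coefficient. A small Python sketch of that lookup is shown here. The function name `guilford_category` and the label strings are hypothetical, and the thresholds are taken from the list above.

```python
def guilford_category(coefficient):
    """Map a reliability coefficient to the Guilford (1956) category listed above."""
    if coefficient < 0.20:
        return "very small, negligible"
    if coefficient < 0.40:
        return "small (not close)"
    if coefficient < 0.70:
        return "fairly close"
    if coefficient < 0.90:
        return "close (reliable)"
    if coefficient < 1.00:
        return "very close (very reliable)"
    return "perfect"

# Illustrative coefficients (hypothetical values).
for value in (0.15, 0.55, 0.75, 0.95):
    print(f"{value:.2f} -> {guilford_category(value)}")
```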

Example
Based on the data from the validity test:

1. Open the Analyze menu, select Scale, then click Reliability Analysis...
2. A dialog box appears. Move the items into the Items box, then click OK.

3. The SPSS output will be shown as follows:

Ten respondents were examined in the questionnaire trial (N = 10), and no data were
excluded from the analysis. The Cronbach's Alpha value is -0.217 for a total of 8
questions. The r table value for a two-sided test at a 95% confidence level or 5% significance
level (p = 0.05) can be looked up based on the number of respondents. Because the Cronbach's Alpha
value is negative (-0.217), the questionnaire tested is not reliable.
Exercise:
1. Determine the validity and reliability of Work Performance (Y) from the following tabulated data
for the Work Performance variable:

Statement item numbers (01-10) and Total Y per respondent
Res Num. 01 02 03 04 05 06 07 08 09 10 Total Y
01 4 4 5 4 5 4 4 5 4 3 42
02 3 3 3 4 4 3 3 4 4 3 34
03 4 5 5 5 5 4 5 5 5 5 48
04 4 4 4 4 2 4 4 2 4 2 34
05 3 5 4 5 4 3 5 4 3 4 40
06 4 3 4 4 4 4 3 4 3 4 37
07 4 3 4 4 5 5 3 3 4 4 39
08 4 5 5 5 5 4 5 5 4 3 45
09 5 3 4 4 3 5 3 3 4 4 38
10 4 3 4 4 4 4 3 4 4 4 38
11 5 5 5 5 4 5 5 4 4 5 47
12 4 3 2 3 3 4 3 3 3 1 29
13 1 3 5 2 2 1 3 2 3 1 23
14 3 5 4 4 5 3 5 5 1 4 39
15 4 4 4 1 5 4 1 3 3 5 34
16 4 4 5 3 5 5 4 5 5 4 44
17 4 5 5 3 5 5 5 3 4 5 44
18 3 4 4 4 4 3 4 4 4 4 38
19 4 4 2 4 4 2 4 4 2 1 31
20 4 3 4 3 4 3 4 4 4 4 37
2. Determine the validity and reliability of Work Motivation from the following tabulated data for the
Work Motivation variable:

Respondent's answers
Num Respondent Item 1 Item 2 Item 3 Item 4 Item 5 Item 6 Item 7 Total
1 Neviana 3 4 3 4 2 4 2 22
2 Putri 2 2 3 2 2 4 1 16
3 Fita 1 2 3 4 4 2 3 19
4 Hidayatullah 3 2 3 1 2 3 4 18
5 Danar 2 3 4 4 2 3 2 20
6 Ela 4 3 2 4 4 2 2 21
7 Yuni 2 3 2 4 1 2 3 17
8 Bagus 2 2 4 2 2 3 2 17
9 Ardita 4 2 3 2 3 3 4 21
10 Erlind 3 1 1 3 2 4 4 18
11 Ida 2 3 2 3 4 4 4 22
12 Mustofa 2 3 4 5 1 2 5 22
13 Ferdinan 5 2 3 1 2 1 4 18
14 Yunus 2 1 2 3 4 3 4 19
15 Prima 3 2 3 2 5 5 5 25
16 Andy 2 3 3 3 4 2 4 21
17 Arif 4 3 2 3 4 2 4 22
18 Nazar 1 2 3 4 5 4 3 17
19 Irwan 5 4 3 4 2 2 1 21
20 Amsarry 2 4 4 2 2 3 1 18
Post Test 2

Based on Table 13.2, process the data and analyze the results.

Questions | Patient's Answer (Yes / No) | Score (Yes = 1 / No = 0)

1. Have you ever forgotten to take your medicine?
2. Besides forgetting, you may have skipped your medicine for other reasons. In the past 2 weeks,
   was there a time when you did not take your medicine?
3. Have you ever reduced or stopped taking your medicine without your doctor's knowledge because
   you felt the medicine made your condition worse?
4. Have you ever forgotten to bring your medicine when traveling?
5. Did you take your medicine yesterday?
6. When you feel that your symptoms have resolved, do you stop taking your medicine?
7. Taking medicine every day is an inconvenience for some people. Do you feel bothered by having
   to take medicine every day?
8. How often do you forget to take your medicine?
   a. Never
   b. Once a week
   c. Sometimes
   d. Usually
   e. Always

Note:
Always : 7 times a week
Usually : 4-6 times a week
Sometimes : 2-3 times a week
Once in a while : once a week
Never : never forget
