Section: III-21 BSE Biology
Assignment #2: Validity and Reliability
VALIDITY
1. A teacher who is a friend of yours has just developed a test to measure the content in a social studies unit you both teach. His test takes 30 minutes less time to complete than the test you have used in the past, and this is a major advantage. You decide to evaluate the new test by giving it and the old test to the same class of students. Using the following data, determine if the new test has concurrent validity evidence.
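- To check whether the new test has concurrent validity evidence, we first get the means of the students' scores on the old test and the new test:
a. 25, 22, 18, 18, 16, 14, 12, 8, 6, 6 (mean = 14.5)
b. 22, 23, 25, 28, 31, 32, 34, 42, 44, 48 (mean = 32.9)
The means alone do not show concurrent validity. Concurrent validity evidence comes from correlating each student's score on one test with the same student's score on the other. Assuming the two lists are in the same student order, the correlation is about -.97: students who score high on one test score low on the other. The new test therefore does not show concurrent validity evidence with the old test.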
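Concurrent validity evidence is usually judged from the correlation between the two sets of scores rather than from their means. The following is a minimal sketch of that calculation, not a definitive procedure. It assumes the two lists are paired by student in the order given, which the data above does not state explicitly. The names `scores_a`, `scores_b`, and `pearson_r` are illustrative.

```python
from math import sqrt

# Scores copied from the answer to Question 1.
# ASSUMPTION: both lists are in the same student order, so position i
# in each list belongs to the same student.
scores_a = [25, 22, 18, 18, 16, 14, 12, 8, 6, 6]
scores_b = [22, 23, 25, 28, 31, 32, 34, 42, 44, 48]


def mean(values):
    """Arithmetic mean of a list of scores."""
    return sum(values) / len(values)


def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    if len(x) != len(y):
        raise ValueError("score lists must have the same length")
    mx, my = mean(x), mean(y)
    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    # Sums of squared deviations for each list.
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)


r = pearson_r(scores_a, scores_b)
print(f"mean a = {mean(scores_a):.1f}, mean b = {mean(scores_b):.1f}")
print(f"r = {r:.2f}")
```

Under the pairing assumption the coefficient is strongly negative, so the two tests rank the students in nearly opposite order. A coefficient close to +1 would be needed to support concurrent validity.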
2. Indicate how and with what types of tests you would evaluate the predictive and construct validity evidence of the new social studies test in Question 1.
- To evaluate the construct validity evidence of the new social studies test, I would give it to two groups of students: history majors and chemistry majors. If the history majors score relatively high compared to the chemistry majors, I can say that the new test has construct validity evidence, because the test behaves as expected: it favors students who are studying disciplines related to social studies, such as history majors. If the history majors score relatively low compared to the chemistry majors, I can say that the new test does not have construct validity evidence.
On the other hand, to evaluate the predictive validity evidence of the new test, I think it would be better to use an aptitude-type test. Again, I would give the new test to history and chemistry majors, and I would expect the history majors to get the higher scores. If the history majors score relatively high compared to the chemistry majors, I would say that the new test has predictive validity evidence. If it is the other way around, I would say that it does not. Even then, I could not conclude that the history majors are weak in social studies, because other factors, such as the content of the test, might have affected their scores.
3. Examine the validity coefficients of the following tests and, assuming they have content validity evidence, determine which are suitable for use. State your reasons. What is unusual about Test C?
                                  Test A   Test B   Test C
Concurrent Validity Coefficient    .90      .50      .75
Predictive Validity Coefficient    .72      .32      .88
- I think Test A is the most suitable because its coefficients are the closest to +1: both its concurrent (.90) and predictive (.72) validity coefficients are high and positive. Test C can also be used, but there is something unusual about it. According to the first principle of concurrent validity evidence, concurrent validity coefficients are generally higher than predictive validity coefficients. In the table, Test C has a higher predictive validity coefficient (.88) than concurrent validity coefficient (.75), which makes it unusual.
4. Assuming the following tests measure the same content and assuming all other things are equal, rank them in terms of their overall acceptability for predicting behavior in upcoming years.
                                                      Test A   Test B   Test C
Concurrent Validity Coefficient                        .90      .80      .85
Predictive Validity Coefficient (One-month interval)   .50      .65      .60
Predictive Validity Coefficient (Six-month interval)   .40      .10      .55
- In terms of their overall acceptability for predicting behavior in upcoming years, I rank them as follows: Test C, Test A, and Test B. Test C is the most acceptable because it has the highest predictive validity coefficient even after a six-month interval. Although its coefficient dropped by .05 after six months (from .60 to .55), it is still much closer to +1 than those of Test A (.40) and Test B (.10). Test B in particular dropped from .65 to .10, which is not a good sign of predictive validity evidence.
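The ranking for Question 4 can be checked with a short script. This is only a sketch under one stated assumption: that the six-month predictive validity coefficient is the most relevant figure for predicting behavior in upcoming years, so the tests are ranked on it. The dictionary keys (`one_month`, `six_month`) are illustrative names, and the values are copied from the table.

```python
# Coefficients copied from the Question 4 table.
tests = {
    "Test A": {"concurrent": 0.90, "one_month": 0.50, "six_month": 0.40},
    "Test B": {"concurrent": 0.80, "one_month": 0.65, "six_month": 0.10},
    "Test C": {"concurrent": 0.85, "one_month": 0.60, "six_month": 0.55},
}

# ASSUMPTION: the longest interval (six months) is the best available
# indicator of long-range prediction, so rank on it, highest first.
ranking = sorted(tests, key=lambda name: tests[name]["six_month"], reverse=True)

# How much each predictive coefficient fell between one and six months.
drops = {
    name: round(c["one_month"] - c["six_month"], 2)
    for name, c in tests.items()
}

for name in ranking:
    print(f"{name}: six-month r = {tests[name]['six_month']:.2f}, "
          f"drop = {drops[name]:.2f}")
```

Ranking on a different rule, for example the average of the two predictive coefficients, could order the tests differently, so the choice of key should be justified in the written answer.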
5. What type of validity evidence will go with the following procedures?
a. Matching test items with objectives. - Content-related validity evidence
b. Correlating a test of mechanical skills after training with on-the-job performance ratings. - Predictive validity evidence
c. Correlating the short form of an IQ test with the long form. - Concurrent validity evidence
d. Correlating a paper-and-pencil test of musical talent with ratings from a live audition completed after the test. - Predictive validity evidence
e. Correlating a test of reading ability with a test of mathematical ability. - Construct validity evidence
f. Comparing lesson plans with a test publisher's test blueprint. - Concurrent validity evidence and face validity evidence
6. The principal is upset. The results of the four-year follow-up study are in. The correlation between the grades assigned by you to your gifted students and their college GPA is lower than the correlation between grades and college GPA for nongifted students. The principal wants to abandon the gifted program. How would you defend yourself and your program?
RELIABILITY
7. Test A and Test B both claim to be reading achievement tests. Test A reports a test-retest reliability of .88, and Test B reports a split-half reliability of .91. Test B has concurrent validity with Mr. Burns's classroom objectives. Test A has concurrent validity with a recognized reading test but measures several skills not taught by Mr. Burns. If Mr. Burns wishes to evaluate progress over his course of instruction, which test should be used for both his pretest and posttest?
8. Test A and Test B both claim to be reading readiness tests for use in placing children in first-grade reading groups. Neither test reports validity data. Test A reports an internal consistency reliability of .86, and Test B reports a test-retest reliability of .70. If a first-grade teacher came to you for advice on test selection, what would be your recommendation?