The Effectiveness of Multi-Stage Adaptive Testing Using the Delta-Scoring Method
DOI: https://doi.org/10.36473/ujhss.v62i2.2074
Keywords: multistage adaptive testing, Delta-Scoring Method
Abstract
This study aimed to examine the effectiveness of multi-stage adaptive testing using the Delta-Scoring Method. The data came from an administration of the Otis-Lennon Test of General Mental Ability (Form J) to Ibb University students. The calibration sample consisted of 1,600 male and female students, and their data were scored using the Delta program. The adaptive-testing sample consisted of 130 students, whose responses were used in an adaptive test of four stages; the first stage consisted of five items, and the second, third, and fourth stages each consisted of five items as well. Each student's score across the four stages was then calculated cumulatively and compared with the student's estimated score on the full eighty-item written test. The results showed that the correlation coefficient between the score estimated from the adaptive test and the score estimated from the eighty-item written test was 0.878, the mean difference between the estimated scores was 5.728, the standard error of the mean was 0.096, and the root mean square error of the mean difference was 0.053, which indicates the effectiveness of the adaptive test built using the Delta method.
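To make the scoring and comparison procedure concrete, the following sketch, written in Python, shows one way such results could be computed. It assumes, and this is not taken from the study itself, that an examinee's D-score is the difficulty-weighted proportion of correct responses, with item weights delta_j = 1 - p_j derived from the calibration sample, in the spirit of Dimitrov's D-scoring method; the fixed four-stage, five-item structure and the agreement statistics (correlation, mean difference, standard error of the mean, and root mean square error) mirror those reported above. All function and variable names (item_deltas, d_score, adaptive_d_score, agreement_stats) are illustrative and do not come from the study's actual Delta program.

# Hedged sketch: D-scoring of a multi-stage adaptive test versus the full 80-item test.
# Assumption (not from the study): item weights delta_j = 1 - p_j and a
# difficulty-weighted proportion-correct D-score, in the spirit of D-scoring.
import numpy as np

def item_deltas(responses):
    """Item weights delta_j = 1 - p_j, where p_j is the proportion correct
    in the calibration sample (responses: examinees x items matrix of 0/1)."""
    p = responses.mean(axis=0)
    return 1.0 - p

def d_score(x, deltas):
    """D-score of one examinee: sum of weights of correctly answered items,
    divided by the sum of weights of all administered items (range 0..1)."""
    return float(np.dot(x, deltas) / deltas.sum())

def adaptive_d_score(x_full, deltas, stage_items):
    """Cumulative D-score over the items actually administered in the stages.
    stage_items: list of index arrays, one per stage (five items each here)."""
    administered = np.concatenate(stage_items)
    return d_score(x_full[administered], deltas[administered])

def agreement_stats(adaptive_scores, full_scores):
    """Agreement statistics of the kind reported in the abstract."""
    diff = adaptive_scores - full_scores
    n = len(diff)
    return {
        "correlation": float(np.corrcoef(adaptive_scores, full_scores)[0, 1]),
        "mean_difference": float(diff.mean()),
        "sem": float(diff.std(ddof=1) / np.sqrt(n)),   # standard error of the mean difference
        "rmse": float(np.sqrt(np.mean(diff ** 2))),    # root mean square error
    }

# Illustrative usage with simulated 0/1 responses (1,600 calibration and 130 adaptive examinees).
rng = np.random.default_rng(0)
calibration = rng.integers(0, 2, size=(1600, 80))
deltas = item_deltas(calibration)

adaptive_sample = rng.integers(0, 2, size=(130, 80))
# Hypothetical routing: four fixed stages of five items each (a real multi-stage
# test would choose each next stage from the examinee's provisional D-score).
stages = [np.arange(k * 5, k * 5 + 5) for k in range(4)]

adaptive_scores = np.array([adaptive_d_score(x, deltas, stages) for x in adaptive_sample])
full_scores = np.array([d_score(x, deltas) for x in adaptive_sample])
print(agreement_stats(adaptive_scores, full_scores))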