In a couple of weeks, we will witness the largest gathering of humans at the Kumbha Mela at Allahabad, and a few months after that we will witness the greatest ‘Testing Tamasha’ in the world. Millions of young learners will take numerous tests, from engineering and medical entrance examinations to admissions to various university courses. These will bring the joy of success and the disappointment of failure; some will feel that their lives are ruined, and a few will even be driven to suicide.
An important question that arises is: how accurate and reliable are these tests? What are the attributes of a good test? Should there be an element of accountability for test administrators? As things stand, protecting the secrecy of test papers and the confidentiality of the evaluation system is considered most important, and various provisions of the Indian Penal Code are invoked occasionally, but exam and test providers are not required to live up to any standard of relevance, quality or external review. The principles of external verification and of working to specified quality standards have been abandoned entirely. This is true almost all over the world for most tests.
The canonical principle adopted is that if a number of candidates are asked exactly the same questions and required to answer them without any help, in the same prescribed time, under proctored conditions, then those who answer the most to the satisfaction and expectations of the examiners are the best. However, this leaves unanswered the fundamental question of whether the attributes and qualities being tested have any value and, more importantly, how accurate and reliable the scores assigned to individual test takers are.
Almost four decades ago, in a book on evaluation, Edwin Harper summarised the results of some interesting research on how examiners evaluate the scripts assigned to them. He reported a wide variation in the marks given by different examiners to the same script, which led to the conclusion that ‘examiners do not agree with each other’. Even more interesting was the outcome of another experiment in which the same answer script was sent back to the same examiner after several months: there was a significant variation in the scores, leading to the inference that ‘examiners do not agree with themselves’.
We have made much progress in moving towards consumer protection, the right to information and the right of children to a safe learning environment. There is great awareness of the negative effects of junk food, and serious punishments exist for adulteration and fake goods.
Shouldn’t fake tests meet the same fate?
The Fairtest organisation in the US has made the scathing observation that the quality of pet food in the US is better regulated than the tests administered to children.
There is a whole body of knowledge and good practice in assessment, including item analysis on the basis of facility, reliability and discrimination indices, and quality examinations do declare the relevant parameters to the organisations that use their tests.
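For illustration, the facility and discrimination indices mentioned above can be computed in a few lines. This is a minimal sketch using classical test theory conventions (including the customary 27% top/bottom split); the response data below are invented for illustration, not drawn from any real examination.

```python
# A minimal sketch of classical item analysis: the "facility" index
# (proportion of candidates answering an item correctly) and the
# "discrimination" index (difference in facility between the top- and
# bottom-scoring groups). All data here are made up for illustration.

def facility(item_scores):
    """Facility index: fraction of candidates who answered the item correctly."""
    return sum(item_scores) / len(item_scores)

def discrimination(item_scores, total_scores, group_frac=0.27):
    """Discrimination index: facility among the top-scoring group minus
    facility among the bottom-scoring group (a 27% split is conventional)."""
    n = max(1, round(len(total_scores) * group_frac))
    order = sorted(range(len(total_scores)),
                   key=lambda i: total_scores[i], reverse=True)
    top, bottom = order[:n], order[-n:]
    p_top = sum(item_scores[i] for i in top) / n
    p_bottom = sum(item_scores[i] for i in bottom) / n
    return p_top - p_bottom

# One item answered correctly only by the higher scorers:
item = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]      # 1 = correct, 0 = wrong
totals = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]   # candidates' total marks
print(facility(item))                # 0.5 (a middling-difficulty item)
print(discrimination(item, totals))  # 1.0 (a perfectly discriminating item)
```

An item with a discrimination index near zero, or negative, tells the examiner nothing about who the stronger candidates are; quality test programmes weed such items out before scores are finalised.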
But in India we distinguish between students on a one-mark difference, when the error of measurement itself may be much larger.
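To see why a one-mark distinction is statistically dubious, consider the standard error of measurement (SEM) from classical test theory: SEM = SD × √(1 − reliability). The figures below are illustrative assumptions, not published statistics for any Indian examination.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM = sd * sqrt(1 - reliability), from classical test theory."""
    return sd * math.sqrt(1 - reliability)

# Illustrative (assumed) figures: a score spread of 15 marks and a
# test reliability of 0.90 -- quite good as large examinations go.
sem = standard_error_of_measurement(sd=15, reliability=0.90)
print(round(sem, 1))   # 4.7
```

Even at a reliability of 0.90, a candidate's true score lies within roughly ±5 marks of the observed score only about two-thirds of the time, so ranking candidates on a one-mark gap is well inside the measurement noise.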
As we come towards the end of 2012, we have the good fortune that the Vice Chancellor of Delhi University is a distinguished mathematician, the director of NCERT is also a mathematician, and the Chairman of CBSE is an IITian. There can be no better time to acknowledge the importance of educational measurement and to move systematically from random, unreliable assessment to one where we know what we are measuring, and how reliably and accurately we are measuring it.