Internet-Based Assessment 2002 - 2004
Bertil Roos & David Hamilton
This paper focuses on teaching and learning in higher education in Sweden.
The social validity of recent reform initiatives is examined; and their overall
efficiency, effectiveness and validity are considered. The paper concludes with
the suggestion that quality enhancement in higher education might benefit more
from the promotion and recognition of learning than from the control and steering of learning.
In December 2001, the managing director of Statens Järnvägar AB (Swedish Railways) held a press conference. The bad news was that the company expected to make a 100 mn kronor loss in 2001. The good news was that the company had a reform plan for 2002. 'How many jobs will be lost?', journalists inquired. SJ's managing director, Sune Karlsson, did not answer directly. Instead, he merely indicated that 'we shall take control over costs' (Dagens Nyheter, 12 December, 2001).
It was later revealed in the same source (13 January, 2002) that this cost review would include tighter control over student cards. Too many phantom students have been using rebate cards. Swedish Railways feels, therefore, that it is subsidising the expansion of higher education in Sweden. Such students are like Nikolai Gogol's 'dead souls'. Their existence threatens the company's profitability.
During the 1960s Swedish students took to the streets to defeat capitalism. Recently, it seems, they have approached the same goal by taking to the railways. In short, their behaviour has been one of the unintended effects of the expansion of higher education in Sweden.
The reform of teaching and learning in Swedish higher education is troubled by such examples. Their existence provides the basis of this paper. Our general argument, then, is rooted in three assumptions:
1. educational reform is a social practice;
2. social practices have unintended consequences; and
3. these consequences must be taken into account when the validity of reform policies is considered.
We also draw on two other inspirations. First, we link the unintended consequences of reform to Samuel Messick's extended discussion of validity. Although Messick's original analysis discussed psychometric test use, his central argument can be generalised to all forms of practice:
Once it is denied that the intended goals of the proposed policy use are the sole basis for judging worth, the value of the policy must depend on the total set of effects it achieves, whether intended or not (see Messick, 1989, p. 85, who used 'test' rather than 'policy').
Our final inspiration also comes from the world of testing, examinations and assessment. We take the view that these are social practices (cf. Hanson, 1993) and that, accordingly, they should be studied by reference to social as well as intellectual or academic frameworks. In this respect, we follow the pathbreaking work of scholars who have focused on the social distinctions that separate, variously, formative from summative evaluation, high stakes from low stakes testing, and divergent from convergent assessment.
Higher education is a knowledge-producing industry (cf. Gibbons et al., 1994). Like other industries, its production, financial and quality-assurance systems are constantly changing, together with its raw materials, technologies, sites of production and markets. Nevertheless, higher education also has other functions. It is not only a knowledge-producing industry. It also produces human capital - professionals and citizens with socially-valued capabilities. Moreover, these different production tasks, economic and political, are associated with contrasting reform priorities. Economic imperatives link higher education to national productivity and international competition; while political imperatives link higher education to the promotion of equity and social justice. At times, such imperatives overlap; but at other times they pull in different directions. Political compromise is the outcome. Policy trajectories are never linear and straightforward.
During the 1990s, a new wind of change began to blow in Swedish higher education. A period of general economic stringency brought the real and projected costs of higher education under public and political scrutiny. Further expansion was to be accompanied by a reduction of unit costs. A discourse of expansion was replaced by a discourse of efficiency. Between 1989/90 and 1997/98, for instance, the number of students rose by 86% whereas the number of staff increased by only 17%, representing a change in staff:student ratios from 1:10 to 1:15 (Westling, 2000, p. 14).
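The arithmetic behind this comparison can be sketched as follows. This is only an illustrative check, assuming that the 86% and 17% growth figures and the 1:10 baseline can be combined directly; the variable and function names are ours, not Westling's.

```python
# Growth figures quoted in the text for 1989/90 to 1997/98.
STUDENT_GROWTH = 0.86   # students rose by 86%
STAFF_GROWTH = 0.17     # staff rose by 17%
BASELINE_STUDENTS_PER_STAFF = 10.0  # the rounded 1:10 ratio (assumed exact here)


def students_per_staff_after(baseline: float, student_growth: float,
                             staff_growth: float) -> float:
    """Scale a students-per-staff ratio by the two growth rates."""
    return baseline * (1 + student_growth) / (1 + staff_growth)


implied = students_per_staff_after(
    BASELINE_STUDENTS_PER_STAFF, STUDENT_GROWTH, STAFF_GROWTH
)
relative_increase = (1 + STUDENT_GROWTH) / (1 + STAFF_GROWTH) - 1

print(f"implied students per staff member: {implied:.1f}")
print(f"relative increase in the ratio: {relative_increase:.0%}")
```

On these rounded inputs the implied figure comes out slightly above the cited 1:15. The gap may simply reflect rounding of the 1:10 baseline or differences in how staff and students were counted, neither of which is specified in the passage cited.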
Such increased efficiency, however, challenged the advancement of quality (i.e. the kind of education that could be offered). The Swedish Agency for Higher Education (Högskoleverket), for example, investigated whether undergraduate teaching was troubled by 'shrinking quality' (Högskoleverket, 2000, p. 6). The report of this investigation identified changes in the working conditions of university teachers. For example, between 1993 and 1998 the number of teaching hours for first-year undergraduate biology courses at Stockholm University fell from 8.3 to 6.4 hours per week (Högskoleverket, 2000), despite a recommendation of 18 hours per week that had been made by an earlier national investigation (SOU 1992:44). Further, only 40% of this teaching was carried out by university teachers with a doctorate (the so-called gold standard for undergraduate teaching). Högskoleverket's conclusion was that if the tendencies noted in the report are 'valid nationwide' there may be a 'decline in the quality of undergraduate education' (p. 6).
Output measures of quality also became implicated in this general review of reform. Drop-out and course-switching increased during the 1990s, together with the proportion of students who signed up for courses but did not complete the course requirements. The proportion of first-year students who followed these practices increased from 22% in 1990/91 to 24% in 1995/96 (SCB, 1999). And the proportion of students who did not complete their term-long (4-month) courses rose from 15.7% (25240/160763) in 1991/92 to 17.1% (43959/256855) in 2000.
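The two non-completion percentages can be recomputed from the counts given in parentheses. The sketch below assumes that each pair is (students not completing, students registered); that reading, and all the names used, are our own.

```python
# Counts quoted in the text, read as (not completed, registered).
# This interpretation of the two numbers is an assumption.
non_completion = {
    "1991/92": (25240, 160763),
    "2000": (43959, 256855),
}


def rate_percent(not_completed: int, registered: int) -> float:
    """Return the non-completion rate as a percentage."""
    return 100.0 * not_completed / registered


rates = {year: rate_percent(*counts) for year, counts in non_completion.items()}
for year, rate in rates.items():
    print(f"{year}: {rate:.1f}%")

# Difference between the two years, in percentage points.
change_points = rates["2000"] - rates["1991/92"]
print(f"change: {change_points:.1f} percentage points")
```

Both quoted percentages are reproduced to one decimal place, and the rise between the two years is about 1.4 percentage points.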
These trends should be treated with caution - it is always difficult to identify cause and effect. Nevertheless, the Swedish public accountant suggested that there were signs of a decrease in the effectiveness of higher education (Riksdagens Revisorer, 2000, p. 6). By the end of the 1990s, then, a narrowly-conceived efficiency discourse, based on quality assurance, was floundering. The educational merits of the reforms were not obvious. To fill this ideological gap, a new discourse - relevant to this paper - is currently being created.
This new emphasis is being followed in a second round of national audits. Quality and effectiveness - or 'quality enhancement processes' - are to grow from 'mobilising the inner resources' (att mobilisera de inre krafterna) of each institution (Högskoleverket, 1998, pp. 6 and 10). The main emphasis of such work is upon the development of institutional practices that 'best favour the development of activities' that, in turn, lead to the 'best long-term outcomes in teaching and research'. The platform for this development is the gathering of information about the work of the institution and, subsequently, its utilisation to take decisions about appropriate measures (Högskoleverket, 1998, pp. 17, 21). In short, higher education began to incorporate the management school maxim: 'work smarter, not harder'.
This new discourse includes two assumptions that are relevant to this paper:
(1) that efficiency can be improved with new measures (such as the proposed open or net-university); and, perhaps to a lesser extent,
(2) that the curriculum of higher education should be exam-driven (i.e. to assure quality).
Such solutions are proposed, for instance, in a review of undergraduate teaching conducted under the auspices of the Swedish Council for Higher Education (Rådet för Högskoleutbildning) and published in 1999 (Westling, 1999). Among the ten proposals were: 'examinations in higher education must change, so that they are better suited to its educational goals'; and 'Information and communication technology constitutes a pedagogical challenge that must be more consciously addressed' (pp. 5-6).
New technologies are transformative. They are not merely sharper tools for solving old problems. In education, they create new contexts for teaching and learning and, as a result, new demands upon teachers and learners. Information technology raises pedagogic questions about the orientation of teachers and learners, as well as economic questions about the effectiveness of such re-tooling. What is the likely impact of ICT on teaching and learning? And what will be saved, or added, by its adoption? Put another way, the reform of teaching and learning in Swedish higher education is delicately poised between two discourses. The 'intended aims' of higher education may be clear, but the 'effects' of efforts to meet them (e.g. the new regulations for doctoral studies) are, to date, problematic and, in the future, unknown.
From a technocratic perspective, however, the distinction between aim and effects is never problematic. The means assures the end. If ICT is a true technology, its outcomes are assured (i.e. guaranteed). Such a pragmatic (or 'what works') vision was noted long ago. Aristotle defined a techné (the classical Greek word) as a procedure for achieving a desired end - in the sense that sawing is a technology for cutting down trees.
This belief, if accepted, assumes that technologies deliver. It is attractive to professionals, like politicians and teachers, whose work includes delivery. They long for a 'killer application' (Rumble, 2001, p. 230) that will make their work easier. If the invention of the ATM (Bankomat) was the killer application that facilitated flexible spending, and if the Tetra Pak was the killer application that facilitated flexible packaging, is it also possible to reconfigure ICT to facilitate flexible learning? Multi-national multi-media companies certainly believe that such an investment is worth the risk. They are prepared to venture their capital in the search for a tool that embodies knowledge that can be marketed under protected conditions (cf. Bennett, 2001).
As information technology metamorphosed into information and communication technology (ICT), the search was on to replace earlier didactic technologies with on-line instructional technologies that would assure their content, foster their delivery and audit their assessments. This search is fed by a three-pronged cybernetic vision: that assessment can be regarded as a feedback or control technology; that feedback can be used to steer both the content and delivery of instruction; and that assessment is the paramount activity because it can be used to 'drive' the entire instructional endeavour (a view explored in Torrance, 1995).
Insofar as this view is accepted, a new reform agenda comes into view - that higher education should invest in the introduction of systems of steering and control (i.e. quality assurance). Work elsewhere, however, suggests that such social technologies can become self-defeating. Their intentions may be neutralised by side-effects, reducing their practices to rituals, symbolic representations of power without control. Higher education is incorporated into the so-called 'audit society' and, in the process, its members learn the 'strategic necessity of playing the game' (Power, 1999, p. xv).
Social problems associated with the convergence of assessment practice and social auditing have been recognised since the 1960s, when Michael Scriven first made the distinction between formative and summative evaluation (Scriven, 1967, p. 40). And similar social effects have been noted in Bob Linn's discussion of high stakes and low stakes testing (Linn, 2000) and Harry Torrance and John Pryor's analysis of convergent and divergent assessment (e.g. Torrance & Pryor, 2001, p. 617).
To serve as a control or steering technology, assessment emphasises the summative, convergent and high stakes functions of testing, examinations and assessment. Disproportionate attention is given to terminal outcomes; to the differentiation of correct and incorrect answers; and to linking resource allocation to these outcomes. These assumptions are powerful and persuasive. Their translation into practices evokes changes in teaching and learning. But will the actual changes be the same as the intended changes?
The pursuit of this agenda requires that its intentions are not deflected in favour of other goals (or games); and that it is shielded from unintended social influences. All teachers and learners must share this auditing rationale and, with other educational managers, must give close attention to the impact of 'factors jeopardizing internal and external validity' (Campbell & Stanley, 1963, p. 175), more commonly known as 'threats to validity'.
Both of these requirements are reasonable. Deviation always introduces cracks in the fabric of reform. Currently, there is much discussion about the role of such interference in reducing the authenticity and validity of assessment (cf. Shepard, 2000; Linn, 2000). Likewise, doubt can be cast on the peaceful coexistence of the audit society and the 'learning society' (cf. Wiliam, 2001; Black, 2001). Exploration of this last idea, that quality is a 'complex [social] question' (Högskoleverket, 1998, p. 10), is the focus of a later paper.
Without attention to the 'total set of effects' that they achieve, the validity of reform agendas may be undermined. Auditing practices may be successfully developed, and instruction may be controlled. But is that desirable? Indeed, are such practices compatible with the educational goals - or Bildungsideal - of Swedish higher education in the third millennium?
During the 1990s, attempts were made to increase the efficiency of Swedish higher education. Financial reform - in favour of output rather than input measures - was the chosen policy instrument. These innovations, however, became associated with a range of undesirable effects. Arguably, the net result has not only been a reduction in the quality of teaching but also a consequent reduction in the efficiency of the Swedish higher education system.
As teachers and learners, we feel these effects in our daily work. Yet, as educational researchers, we also accept that there is another perspective on testing, examinations and assessment, one that focuses on their educational value. We believe that testing, examinations and assessment can be used for quality promotion as well as quality assurance (and accreditation). On practical, ethical and professional grounds, we prefer to focus on the promotion and recognition of learning rather than the control and steering of learning. To this end, we remember that testing, examinations and assessment can also operate as low stakes, formative and divergent activities. Indeed, this paper can be read as a prologue to some developmental work in this area. We have recently embarked on a three-year, multi-partner exploration of on-line assessment, as part of the European Commission's MINERVA initiative relating to on-line distance learning (ODL).
In our current work, we try to take a both/and, rather than an either/or stance to the political and educational goals of higher education. The production of knowledge, energising the souls of our students, and creating socially-valued citizens need not be incompatible.
Last and by no means least, we also hope that the creation of self-regulation around matters of teaching and learning will not only foster a reflective attitude (reflekterande förhållningssätt) among students but also enable them to find fresh ways to support the social profitability of the Swedish railway network.
Bennett, R. E. (2001). How the internet will help large-scale assessment reinvent itself. Education Policy Analysis Archives, 9(5).
Black, P. (2001). Dreams, strategies and systems: Portraits of assessment past, present and future. Assessment in Education: Principles, Policy & Practice, 8(1), 65-85.
Campbell, D. T., & Stanley, J. C. (1963). Experimental and quasi-experimental designs for research on teaching. In N. Gage (Ed.), Handbook of Research on Teaching (pp. 171-246). Chicago: Rand McNally.
Gibbons, M., Limoges, C., Nowotny, H., Schwartzman, S., Scott, P., & Trow, M. (1994). The New Production of Knowledge: The dynamics of science and research in contemporary societies. London: Sage.
Hanson, F. A. (1993). Testing Testing: Social consequences of the examined life. Berkeley: University of California Press.
Högskoleverket (1998). Fortsatt granskning och bedömning av kvalitetsarbetet vid universitet och högskolor: utgångspunkt samt angrepps- och tillvägagångssätt för Högskoleverkets bedömningsarbete (rapportserie 1998:21 R). Stockholm: Högskoleverket.
Högskoleverket (2000). Är grundutbildningens kvalitet i farozonen (arbetsrapport 2000:12 AR). Stockholm: Högskoleverket.
Linn, R. L. (2000). Assessment and accountability. Educational Researcher, 29(2), 4-16.
Messick, S. (1989). Validity. In R. Linn (Ed.), Educational Measurement (3rd ed., pp. 13-103). New York: Macmillan.
Power, M. (1999). The Audit Society: Rituals of verification. Oxford: Clarendon Press.
Riksdagens Revisorer (2000). Högskoleutbildning i samhällsekonomisk belysning. Rapport, 1999/2000:9.
Rumble, G. (2001). Just how relevant is E-education to global educational needs. Open Learning, 16(3), 223-232.
Shepard, L. (2000). The role of assessment in a learning culture. Educational Researcher, 29(7), 4-14.
SCB (1999:006). Andelen högskolenybörjare med mål att ta ut en examen minskade, (Stockholm: pressmeddelande från SCB, 1999-01-12).
Scriven, M. (1967). The methodology of evaluation. In R. W. Tyler, R. M. Gagné, & M. Scriven (Eds.), Perspectives of Curriculum Evaluation (pp. 39-83). Chicago: Rand McNally.
SOU (1992:44). Resurser för högskolans grundutbildning. Betänkande av Högskoleutredningen.
Torrance, H. (Ed.). (1995). Evaluating Authentic Assessment. Buckingham: Open University Press.
Torrance, H., & Pryor, J. (2001). Developing formative assessment in the classroom: Using action research to explore and modify theory. British Educational Research Journal, 27(5), 615-631.
Westling, H. (Ed.) (2000). Börjar grundbulten rosta? Stockholm: Rådet för Högskoleutbildning.
Wiliam, D. (2001). An overview of the relationship between curriculum and assessment. In D. Scott (Ed.), Curriculum and Assessment (pp. 165-181). Westport, CT: Ablex Publishing.