As part of my on-going project to annoy my fellow authors here on ALPSblog, I was very happy to circulate the draft of my paper for the International Studies Association panel we’re participating in later this month in Toronto. The panel is to help launch the ISA’s new resources for L&T and will be filmed, so we’ll share our fine content with you then.
More usefully for you, I’ll post here the key part of the paper, which addresses one of the more common questions I get asked about simulations, namely how to assess them. To answer this, we actually need answers to two questions: do we need to assess, and how do we assess?
Do we need to assess at all?
The first question in any consideration of assessment strategies is that of why we might assess at all. In essence, the answers in favour of assessing boil down to one of three options. At a pedagogical level, assessment might be desirable if it allows students to access a particular form of learning. This is most evident when thinking about developing student reflection, and the production of a reflective report that gains feedback from a marker is an efficient and effective way of achieving this. At the practical level, assessment serves as a system of valorisation, focusing students’ attention onto a particular aspect of an activity. Thus if we tell students they will have a paper after a simulation, assessing their knowledge of the procedural rules involved, then we would expect students to pay more attention to those rules within the simulation itself. And finally, at an institutional level, we might simply be required to assess. This is rare, given the principle of academic discretion, but in some systems, internal and external quality assurance processes would expect any substantial activity within a degree programme to be evaluated and assessed. In a softer form, the alignment of learning objectives and game play mentioned in the previous section might logically lead to a requirement to assess.
In contrast, assessment might be avoided if it offers only marginal benefits relative to its associated costs, or if the simulation element is only a relatively small one within a course/module. Where such boundaries lie is a matter beyond this post, but it is something that needs to be given a suitable amount of thought in either direction, since the consequences can potentially be quite significant.
How to assess
If a decision to assess is made, then it is necessary to consider what that assessment should look like. Considered in broad terms, the key dimension is that of proximity to the simulation qua simulation. The further one moves from that, the more the options that present themselves fall within conventional assessment approaches, which are more recognisable to new users, but with the cost that they do not access all the pedagogic value that simulations have to offer.
Furthest from the simulation itself, assessment can focus on students’ wider learning from the course/module. The assumption here would be that any simulation was only one element of the teaching package and that assessment was structured to make connections across elements within that package. Thus, a course/module might run for a semester, with one week devoted to a simulation that allows students a different perspective on the given topic: a UN Security Council (UNSC) simulation to let students see how the theoretical discussion about the dynamics of that institution work in practice, for example.
The form of this assessment would look like a conventional piece of coursework or a final exam (“what are the key factors in the operation of the UNSC?” in this example). By integrating the simulation with the rest of the course/module, such assessment promotes more holistic reflection, coupled to a more rounded set of experiences on the part of the student. However, this does come at a price. Because the assessment does not link directly to the simulation, it does not valorise it for students, so they might choose not to engage so fully with it: in the example given, it is possible to answer the question whether or not you attend the simulation. This disconnect from the simulation (and particularly from any of the personal skills development aspects within it) means that there is a low level of alignment to the simulation game play and potentially to the learning objectives. More particularly, it raises the question of whether a simulation is really needed at all.
A second strategy is to focus assessment on the simulation topic itself. Necessarily, this requires that there is enough within the simulation to be meaningfully assessed. That might imply an extended simulation, either in time or in relative importance within the course/module. To use our example, the UNSC simulation might be run over several weeks and act as a means for students to discover its dynamics and connect them to wider reading. As in the previous strategy, either coursework or a final exam could be used to ask the same kinds of questions, the difference being that the simulation is the primary delivery mode for substantive knowledge.
Because the simulation becomes the key vehicle for learning, the assessment more clearly links to the activity and so valorises the simulation in the learning process. At the same time, it is exactly that link that poses the key challenge – one which is also true of the other simulation-focused strategies – namely whether the simulation offers sufficient depth and scope to allow the students to answer the assessment questions. This matters because simulations are intrinsically uncertain in their operation: we should expect there to be variation between iterations. In this case, because the questions relate to the substantive knowledge aspect, much care must be given to designing a simulation that allows and encourages students playing it to find, use and reflect upon that knowledge. Thus, if the UNSC game focuses on states’ positions on a given dossier, that might not help with answering a question about negotiation dynamics. In practice, this type of strategy requires a close dialogue between game design and assessment design, to ensure that the two align properly.
The third strategy moves much closer to the simulation itself. Here, students are evaluated by an external assessor on their performance within the simulation. Again, this requires a simulation of sufficient scope to allow all students a reasonable opportunity to perform: as such, it is most commonly seen in simulations that run over a full day (think here of non-Higher Education events such as Model United Nations that use judging). The appeal here is clear: students know that they are being watched and evaluated and so have a clear and direct incentive to perform to the best of their abilities. Moreover, by keeping the assessment synchronised with the activity, there is scope for very rapid turnaround of assessment.
Despite such attractions, teacher evaluation is highly problematic. While all assessment has a degree of subjectivity, it is much more marked in this instance. This starts with the difficulty of establishing clear criteria: what is to be considered? How do we measure it? How do we weight different elements? Consider two students, one of whom works assiduously throughout the simulation, making repeated and constructive interventions, the other of whom does nothing until the very end when she uses a simple procedural point to secure her objectives: who is the better student?
This problem extends into gathering evidence to support the assessment decision. In practical terms, it is impossible for an individual to observe more than five or six people for any length of time. This in turn implies that either other assessors need to be present (which will heighten the difficulty of evenly applying the assessment criteria) or some form of recording of the simulation (audio or visual) is needed. The difficulty with the latter option is that one risks missing the pertinent aspects of the simulation, such as the conversation in the hallway, or the online traffic between participants. In any large-scale simulation, such a proliferation of communication and negotiation points is a given and must be borne in mind.
To some extent this is a more philosophical question than anything else. Can we assure ourselves that we have sufficient evidence to make an informed decision? To some extent, one could sidestep the issue by assessing on the basis of ‘success’ in the simulation: did the student achieve their aims? The danger there is that it might not be possible for everyone to win and – more importantly – it might be problematic for there to be winners at all: the author recalls a European Parliament simulation with such a mechanism, which certainly encouraged students, but which didn’t give them a very useful insight into how that institution works as a consensual body.
Logically, discussion of teacher evaluation leads to the final assessment strategy: student evaluation. This form of assessment is closest to the simulation itself, since it is generated by a participant and set within a framework of that participant’s own understanding. Crucially, and possibly problematically, it requires that students are able to reflect on their own learning processes and are able to integrate substantive knowledge with performative skills: while this should be a given with Higher Education students, it becomes more problematic when using simulations with those not yet at that stage.
Student evaluation also differs from the other strategies in that its focus is not so much on the substantive knowledge, but rather on the skills of critical reflection and integration of understanding. In practice, this simply enlarges the difficulty noted above, namely that the scope of possible answers to a question on the lines of “what have you learnt from this simulation?” is necessarily very much larger than it is for any of the substance-based questions outlined above. Even if it is framed more narrowly (“show how your experiences in the simulation illustrate the difficulties of finding agreement within the UNSC”), there is still the possibility – indeed, likelihood – that individual students will produce very different accounts.
This intrinsic flexibility of answer must therefore be accommodated within both the framing of the assessment questions and in the range of what is considered acceptable as a response. This can be done more easily in some contexts than in others. The author runs a module on negotiating in politics, where the assessment is a reflective review of the students’ experience of what they have learnt through a series of negotiations, which they are then asked to link back to the academic literature. Because the module is focused on skills development, informed by acquisition of substantive knowledge, rather than the other way around, this assessment strategy works well in reinforcing the central objective of promoting self-criticality.
It is this last point that is perhaps the most important one. None of the assessment strategies outlined above is the ‘correct’ one: each is potentially valid, but only within the terms of the learning objectives. Ultimately, how (or whether) one assesses must be a function of what one aims to achieve: without an understanding of the latter, the former cannot be properly determined. Seen in a more practical light, that requires a repeated interrogation of objectives, game play and assessment throughout the design and development process to ensure that they continue to match up and reinforce one another.