Generative AI changes teaching and learning: how to protect the integrity of assessment

This academic year, the UCL Centre for the Pedagogy of Politics (CPP) is hosting a series of online panel events. Our first event, on 30 October, was on the theme of ‘Using technology to teach politics’. In this guest post, one of the panellists at that event, Simon Sweeney (University of York), offers further reflections on the challenges involved as higher education embraces generative AI, where tools such as ChatGPT call authorship into question and have profound implications for assessment.

A few years ago, we were worrying about students’ use of essay mills, a form of contract cheating that plagiarism detection software struggled to identify. The Covid-19 pandemic and the shift to online delivery coincided with a reported increase in academic dishonesty (AD). The arrival in late 2022 of generative artificial intelligence (GAI) chatbots such as ChatGPT poses a further challenge to the integrity of assessment.

Universities realised that banning chatbots was not feasible, as AI has become an established feature of our lives and of graduate employment. As educators, we need to respond positively to the opportunities AI presents, recognising its benefits and assimilating it into teaching and learning practice.

This means developing strategies that accommodate students’ use of GAI while protecting assessment integrity.

A key challenge concerning GAI is the issue of authorship. ChatGPT responds to prompts and can generate answers, or parts of answers, that students may pass off as their own work. Student websites offer instructions on how to use ChatGPT to compile a pass-standard essay. An imminent deadline, financial pressures, part-time work, or simply the demands of the course mean that many students may find chatbots tempting.

How should we respond? First, universities must train both faculty and students in acceptable and responsible use of GAI. This requires investment to design and deliver face-to-face and online training, with clear guidelines on what is permitted. Second, universities need to address challenges around assessment.

Adapting assessment to chatbot-enabled learning

Generative AI is likely to get better at delivering competent answers to traditional academic essay questions. Currently, ChatGPT struggles with accurate referencing: students may submit fake references or make false attributions. One way to reduce this is to insist on references from the module reading list. This helps to ensure that students demonstrate learning from the module, which is not an unreasonable expectation.

A restricted reading list is not ideal, but it did reduce the incidence of milled essays in a political economy module that I teach. The module has an extensive reading list, and students are invited to use other sources by the authors it includes, which allows wider research. However, tutors should be alert to bogus referencing, including attributing evidence to sources where no such evidence exists.

One egregious example was a reference to two engineers named Evans and Thomas, who were not the (Tony) Evans and (Caroline) Thomas whose text on poverty, hunger, and development had been discussed in a seminar.

How can we address the risk of submissions where a high proportion of the content is derived from chatbots? Less reliance on traditional essays and short-answer responses, which are highly susceptible to chatbot use, should help. Programmes should adopt a range of assessment methods that are less vulnerable to chatbot-derived content. Traditional unseen examinations are one such type, as are the viva voce examinations used to assess doctoral candidates.

We know that viva voce examinations are hardly viable with large undergraduate cohorts, but this is a cost and human-resourcing issue, so it might not be insurmountable in all cases. Other forms of ‘live’ or real-time assessment, such as presentations, simulations, and role-play, are less susceptible to ChatGPT-derived content.

Portfolio-type assessment can work well, as can personal reflective statements that require learners to demonstrate their response to module content. This kind of assessment lends itself to idiosyncratic and creative content, and work that shows interpretive engagement and critical analysis should be rewarded with higher marks.

Sensemaking framework

The sensemaking framework (Daft and Weick, 1984; Preuss, et al., 2023) can align effectively with marking criteria and learning outcomes.

Sensemaking is a three-level taxonomy. The lowest level, Level 1, represents scanning for specific information and an ability to demonstrate the knowledge gained; this is at best descriptive, indicating a low level of engagement. Level 2 is interpretive and represents an ability to explain and paraphrase information, revealing a deeper understanding and some critical perspectives on themes and subject content.

The highest level, Level 3, involves analytical and critical thinking and, moreover, clear responses in the form of action-oriented commitment to, for example, behavioural change and lifelong learning. Level 3 projects knowledge and understanding onto challenges that require solutions, originality, and a contribution to the public good.

Some studies suggest that few students achieve the highest level, but our task as HE educators should be to deliver graduates minded towards promoting societal improvement through action-oriented commitment that addresses the ‘wicked problems’ facing humanity.

Marking criteria may be mapped onto the sensemaking taxonomy. First-class and distinction-level work fits with Sensemaking Level 3, while a 60+ pass or postgraduate merit standard equates to Level 2. Degree-level accomplishment (40–59 undergraduate, 50–59 postgraduate) accords with Sensemaking Level 1. Specifics may vary from institution to institution, and other factors will be considered, such as competence in the skills that constitute ‘graduateness’ (Glover et al., 2002): IT capabilities, oral and written communication, presentation and design, and English language proficiency.

Individual reflective statements as an assessment mode may sit within a portfolio-type submission comprising short answers to set questions, visual and diagrammatic material, coursework, and notes, together with critical commentary on how internet and generative AI technologies have been used.

So, if students use generative AI tools, that use should be discussed and critiqued: showing exactly how the tools were used, debating their strengths and weaknesses, and commenting on where they were helpful and where they were potentially misleading or wrong.

We know that ChatGPT, being non-human, lacks judgement and can be inaccurate or deliver outright falsehoods. Generative AI lacks emotion and moral principles. In time, it may provide a better ‘representation’ of moral judgement, or it may simply get better at faking it. In essence, we should work to develop students’ critical engagement and reward the capacity for deep analysis of internet-derived content.

In all these respects, the reflective statement, as an assessment type, perhaps within a portfolio, can be more individual, more creative, and more idiosyncratic than more traditional assessment methods such as the formal academic essay (Tomlinson, 2022). It can also accommodate generative AI while providing scope for critical engagement and analytical rigour in reporting the use of such technologies.

References

Daft, R. L., & Weick, K. E. (1984). Toward a model of organizations as interpretation systems. Academy of Management Review, 9(2), 284–295.

Glover, D., Law, S., & Youngman, A. (2002). Graduateness and employability: Student perceptions of the personal outcomes of university education. Research in Post-Compulsory Education, 7(3), 293–306. https://doi.org/10.1080/13596740200200132

Preuss, L., Fletcher, I., & Luiz, J.M. (2023). Using sensemaking as a lens to assess student learning on corporate social responsibility and sustainability. Higher Education Quarterly. https://doi.org/10.1111/hequ.12429

Tomlinson, R. C. (2022, September 27). Contract cheating: The place of the essay in the age of the essay mill. Council for the Defence of British Universities. https://cdbu.org.uk/contracting-cheating-the-place-of-the-essay-in-the-age-of-the-essay-mill/

Dr Simon Sweeney is a Reader in International Political Economy in the School for Business and Society at the University of York. He has a longstanding interest in assessment, student engagement, and the internationalisation of higher education. He served as a UK Bologna Expert from 2006 to 2013 and is a Senior Fellow of Advance HE. He is the author of European Union in the Global Context (Routledge, 2024).