Reflection on generative AI as teaching assistants: implications and policy recommendations.

 


The emergence of generative AI (GenAI) signals a shift in the professional landscape of education, decentralising some traditional instructional roles and prompting a rethinking of what it means to teach. Historically, expertise in education has been concentrated in instructors and (to a lesser extent) human teaching assistants (TAs), who provide guidance, feedback, and assessment. However, as GenAI systems become increasingly capable of tutoring, scaffolding learning, and responding to student needs in real time, the boundaries of these roles are being redrawn.

Rather than replacing human expertise, AI TAs built using GenAI demand that we rethink how instructional work is distributed, not just between instructors and human TAs, but across AI-enhanced systems. This shift mirrors what Shaffer, Nash, and Ruis describe as the reconfiguration of professional expertise in response to new tools and technologies. As they argue, professionalisation is not static; it evolves when new ways of knowing and working emerge. In the case of education, AI's growing presence means that instructors must develop new competencies: not only in teaching content, but in orchestrating AI-enhanced learning environments, interpreting AI-generated insights, and ensuring alignment between AI feedback and pedagogical goals. As such, policy should emphasise training instructors and human TAs to work effectively with AI.

It is also critical to ensure that GenAI integration does not drive pedagogy but rather supports it in meaningful ways. The Technological Pedagogical Content Knowledge (TPACK) framework provides a useful lens for understanding this challenge. Educators must consider how GenAI interacts with both content knowledge (what is being taught) and pedagogical strategies (how it is taught). Without thoughtful integration, there is a risk that AI could push education further toward efficiency-based models, in which rapid feedback and automated assessments replace deeper engagement with complex ideas, rather than toward an optimal combination of deeper learning and automated activities where each is appropriate. Prioritising efficiency and engagement over meaningful understanding may improve the quality of student work and the student experience in the short term, but may not benefit the student in the longer term.

Mishra and colleagues' work also highlights the need to move beyond mere adoption of AI tools to meaningful integration into learning experiences. The presence of AI TAs does not inherently improve education; their effectiveness depends on how well they are aligned with broader learning goals. Educators must take an active role in shaping AI's function within courses, ensuring it complements and enhances human-centred teaching practices rather than supplanting them. We therefore recommend against designing AI Teaching Assistants in ways that replace humans and fully automate all learning activities, and against cost-cutting measures that create pressure to eliminate human TAs. Beyond reducing the quality of instruction, reduced funding for Teaching Assistants would also decrease opportunities for economically disadvantaged individuals who rely on Teaching Assistant positions as a pathway into academia, ultimately shrinking the pipeline of talented scholars into research and scholarship.

Henriksen and Mishra's work on practical wisdom further reinforces this perspective, emphasising that experienced educators bring a form of professional knowledge that AI cannot replicate, one rooted in ethical decision-making, contextual understanding, and reflective practice. As AI transforms the nature of knowledge in education, teachers must ensure that human judgment, adaptability, and social-emotional insight remain at the core of teaching. This highlights the need for educators to approach GenAI critically, leveraging its strengths while maintaining the core humanistic elements of teaching and mentorship.
Reprofessionalisation in this context is therefore not just about preparing educators for an AI-integrated classroom; it is about ensuring that humans and AI systems work in sync to advance student learning experiences and outcomes. Just as previous technological shifts reshaped the teaching profession, GenAI requires a reimagining of teacher preparation, assessment design, and professional collaboration. The goal is not merely to integrate AI, but to define new models of expertise in an AI-augmented educational ecosystem, one in which human and AI agents work together to support meaningful learning experiences while maintaining a commitment to equity and effectiveness.




Assessment practices in higher education serve multiple purposes: they inform students about their progress, provide instructors with actionable insights to guide instruction, and certify learners' competence. The emergence of AI Teaching Assistants (AI TAs) such as JeepyTA adds new opportunities for formative assessment, while creating possibilities for summative assessment that merit some caution. In both cases, careful design and policy guardrails are necessary to ensure that AI TAs enhance rather than undermine the educational process.

A chief benefit of AI TAs is in supporting formative feedback, where fast, specific, and individualised guidance can promote deeper learning. This feedback can be provided immediately and around the clock. Students who work late at night, study remotely, or juggle other responsibilities often cannot attend regular office hours or wait for TAs to become available. An AI TA can fill this gap by providing immediate, round-the-clock feedback, easing the pressure on human TAs and making support more equitable.

Historically, automated assessments (e.g. quizzes, short-answer grading) have helped identify student misconceptions and encouraged targeted practice. Generative AI now expands these possibilities by supporting more complex tasks, from essay drafts to coding projects. In general, an AI TA can assess a broader range of competencies than a typical human TA, supporting shifts in assessment from what students know toward their conceptual understanding and their process of solving problems. Importantly, using AI for assessment also opens the possibility of assessing student learning through a broader range of artifacts, including annotations, reflections, peer feedback, conversations, and other forms of evidence, enabling a more holistic and nuanced view of learning that extends beyond traditional measures.

AI TAs can also provide sophisticated, multi-dimensional assessment. As seen in the JeepyTA platform discussed earlier, AI TAs can offer feedback on essay organisation, argument clarity, and conceptual rigour, referencing course rubrics and standards to align with the instructors' goals. This kind of timely, actionable commentary can help learners iterate more quickly, moving from basic correctness checks toward higher-order thinking and reflection. If these systems are carefully designed to align with course-specific materials and rubrics (as JeepyTA has been primed to do), the risk of feedback that the instructor will disagree with can be reduced, although it still happens, just as a human TA can also provide feedback that an instructor disagrees with. Furthermore, the use of personas (as discussed above) or carefully designed prompting can offer students feedback from different perspectives, which is highly relevant in some disciplinary areas but difficult for a single human TA or instructor to provide (a rough sketch of how such rubric- and persona-aligned prompting might be assembled appears at the end of this discussion).

With AI TAs taking on time-consuming tasks such as answering routine questions, reviewing initial drafts, or providing rubric-aligned suggestions, human TAs and instructors can reallocate their time toward more pedagogically meaningful and relationship-centred activities. These include leading in-depth discussions that challenge students to think critically, working directly with individuals or small groups to support their academic progress, meeting one-on-one with students to support their academic and professional growth, and developing activities that promote academic agency and collaboration. Academic staff can also invest more energy in synthesising performance patterns across student submissions, identifying emerging misconceptions, and making ongoing improvements to assignments or assessments based on observed trends and student needs. These higher-order instructional practices remain difficult for AI systems to replicate, though AI can support them in various ways. Rather than displacing humans, then, AI TAs can create space for them to focus on tasks that require interpreting student thinking in context, applying subject-matter expertise, and exercising instructional judgment.

Another possible concern with the provision of formative feedback by AI TAs is over-scaffolding, where learners rely so heavily on AI-generated suggestions that their final products no longer represent independent work. Particularly if students can seek several rounds of feedback, or obtain lower-level writing suggestions, the sophistication of today's generative AI can obscure the boundary between a student's own efforts and AI-provided content. Avoiding issues of this nature requires careful design, which can be supported by policy that funds both the establishment of guidelines for how much scaffolding is appropriate in different situations and research on methods for producing the benefits of scaffolding while avoiding over-scaffolding.
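To make this concrete, below is a minimal sketch of how rubric-aligned, persona-based prompting might be assembled, with a cap on feedback rounds as one simple guard against the over-scaffolding concern above. This is not JeepyTA's actual implementation; the data structures and the call_llm() stand-in are assumptions for illustration only.

from dataclasses import dataclass

@dataclass
class RubricCriterion:
    name: str
    description: str

def build_feedback_prompt(rubric, persona, submission, rounds_used, max_rounds=3):
    # Cap feedback rounds as one simple guard against over-scaffolding.
    if rounds_used >= max_rounds:
        return None
    criteria = "\n".join(f"- {c.name}: {c.description}" for c in rubric)
    return (
        f"You are a teaching assistant. Adopt this persona: {persona}.\n"
        "Give formative feedback ONLY against the course rubric below.\n"
        "Point to strengths and gaps; do not rewrite the student's text.\n"
        f"Rubric:\n{criteria}\n\nStudent submission:\n{submission}"
    )

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for whatever LLM client an institution has approved.
    raise NotImplementedError("plug in the institution's LLM client here")

rubric = [
    RubricCriterion("Argument clarity", "The thesis is explicit and supported by evidence"),
    RubricCriterion("Organisation", "Ideas flow logically between sections"),
]
prompt = build_feedback_prompt(rubric, "a supportive writing tutor",
                               "...student essay text...", rounds_used=1)

Grounding the prompt in the course's own rubric, rather than a generic request for feedback, is what makes it possible to reduce (though not eliminate) feedback the instructor would disagree with.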
Nonetheless, on the whole, the benefits of providing formative feedback through an AI TA, if carefully designed, seem to outweigh the risks, and policy should encourage higher education to move forward in using AI TAs for formative feedback in ways that can improve student learning.

Greater concerns are present for summative assessment, such as assigning final marks on assignments. Automated essay scoring has a history dating back decades, offering consistency and quick turnaround but often criticised for focusing on superficial textual features. Still, automated essay scoring offers many lessons in how automated grading can be psychometrically validated and used appropriately, in ways that support higher education institutions in offering high-quality, consistent grading at lower human cost. More advanced, generative AI-based approaches can analyse content in greater detail than most earlier methods, but can be prone to systematic biases, a lack of transparency in scoring, treating inaccurate but widely believed misconceptions as true, and, in some cases, even "hallucinating" information not in the original student work. For high-stakes decisions such as course grades, even small error rates can have consequences for student outcomes and perceptions of fairness.

Moreover, incorporating AI into summative grading can amplify existing equity concerns. If an AI TA's underlying model has been trained on data reflecting cultural or linguistic biases, students from underrepresented backgrounds may be inadvertently penalised. The presence of generative AI in summative grading therefore necessitates robust validation across diverse student populations and consideration of whose perspectives are centred within assessments, and even in the evaluation of the fairness of assessments. Policymakers and institutional leaders must establish policies ensuring that any summative use of AI-based scoring is supported by transparent procedures, documented reliability metrics (including evidence of limited or no algorithmic bias), and the ability for students (and instructors) to question and appeal automated scores.
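To make "documented reliability metrics" concrete, here is a minimal sketch, with hypothetical scores, of two checks an institution could run before permitting summative use: agreement between AI and human grades via quadratic weighted kappa (a standard statistic in the automated essay scoring literature, available as scikit-learn's cohen_kappa_score), and a simple per-group probe for systematic over- or under-scoring.

# Minimal sketch of validating AI-assigned scores against human grades.
# Scores and group labels below are hypothetical placeholders, on a 0-5 rubric scale.
from sklearn.metrics import cohen_kappa_score

human = [4, 3, 5, 2, 4, 3, 1, 5]
ai    = [4, 3, 4, 2, 5, 3, 2, 5]
group = ["A", "A", "B", "B", "A", "B", "A", "B"]  # e.g. language background

# Quadratic weighted kappa: chance-corrected agreement on ordinal scores.
qwk = cohen_kappa_score(human, ai, weights="quadratic")
print(f"quadratic weighted kappa: {qwk:.2f}")

# Simple bias probe: does the AI systematically over- or under-score any group?
for g in sorted(set(group)):
    diffs = [a - h for a, h, gg in zip(ai, human, group) if gg == g]
    print(f"group {g}: mean AI-human score gap = {sum(diffs)/len(diffs):+.2f}")

Real validation would of course need far larger samples, multiple human raters, and richer fairness analyses; the point is that these metrics are computable and should be documented and published, not asserted.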
Decisions about when AI assistance is acceptable, and how much AI-driven contribution is too much, will differ by course context and disciplinary standards, as well as by how far along students are in their development of expertise. In fields like computer science or business, where collaborative problem-solving with tools is central, it may be appropriate to evaluate students in more advanced classes based on how effectively they leverage AI to reach correct solutions. By contrast, in courses emphasising individual mastery of foundational skills, unmediated AI assistance could undermine the competencies being assessed. As such, it may be appropriate to develop guidelines, either at a disciplinary level or in the context of specific standard courses, for which tasks should incorporate AI support and what types of formative assessment and support are warranted. Overall, there is considerable potential for generative AI-based Teaching Assistants to support formative assessment, and some possibilities in summative assessment as well, if approached with sufficient caution and if human oversight is retained. Well-designed policy to encourage appropriate use has potential benefits for both students and instructors.
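To show what such guidelines could look like operationally, here is a sketch of a per-task policy configuration an institution or department might attach to its AI TA deployment; the field names and levels are invented for illustration and are not an existing standard.

# Hypothetical per-task AI-use policy configuration for an AI TA deployment.
AI_SUPPORT_POLICIES = {
    "cs401_capstone": {                  # advanced course: tool-assisted work is the point
        "ai_feedback_allowed": True,
        "max_feedback_rounds": 5,
        "scaffolding_level": "full",        # hints, examples, and draft review
        "summative_ai_scoring": "advisory", # AI suggests, human assigns the mark
    },
    "writing101_essay1": {               # foundational skills: keep AI help minimal
        "ai_feedback_allowed": True,
        "max_feedback_rounds": 1,
        "scaffolding_level": "conceptual",  # no sentence-level rewriting
        "summative_ai_scoring": "none",
    },
}

def support_allowed(task_id: str, rounds_used: int) -> bool:
    # The AI TA consults the policy before offering any feedback.
    policy = AI_SUPPORT_POLICIES.get(task_id)
    return bool(policy and policy["ai_feedback_allowed"]
                and rounds_used < policy["max_feedback_rounds"])

Making these decisions explicit and machine-readable, rather than leaving them implicit in each deployment, is one way to keep disciplinary and course-level judgment in control of the technology.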




The simple existence of AI TAs can play an important role in supporting equity. Many students struggle to get academic help when they need it, not because they lack motivation but because structural barriers make access difficult. Some students have jobs or caregiving responsibilities that prevent them from attending office hours. Others study in different time zones and cannot reach instructors or TAs during regular class hours. AI TAs can straightforwardly remove some of these barriers by providing immediate responses to course-related questions at any time of day, making academic support more accessible to students who might otherwise struggle to get assistance when they need it.

If designed correctly (and if such design effort is made part of the criteria in requests for proposals and other instruments encouraging the development and use of AI TAs), AI TAs can also be more effective at avoiding unconscious bias, where historically underrepresented students may receive different and lower-quality feedback than other students. When AI TAs are built on large language models that are multilingual, it also becomes possible to provide content in multiple languages, increasing the accessibility of feedback and support for some international students.

Similarly, AI TAs can relieve some equity issues impacting human TAs. Large courses create significant demands on Teaching Assistants, who must divide their time between grading, responding to student questions, and assisting with course administration. Many Teaching Assistants take on these responsibilities while managing their own coursework, research, and professional development. The time constraints they face often affect not only their own studies but also the level of detail they can provide in feedback and the number of students they can support individually. AI TAs can reduce some of these pressures by handling routine inquiries and generating structured feedback on assignments. Therefore, policy to adopt AI TAs where appropriate can have fairly rapid and sizable benefits for equity, for both students and Teaching Assistants.

However, the adoption of AI TAs can also create equity concerns. Holstein and Doroudi's research suggests that educational technologies often reinforce existing inequities, benefiting students who already possess strong academic skills while leaving others behind. That said, recent studies on large language models indicate a more complex dynamic: LLMs may provide greater relative benefits to less knowledgeable users, a pattern sometimes referred to as the GPS effect. Just as GPS systems support those unfamiliar with a route more than experienced drivers, LLMs can scaffold novices more effectively than they assist experts. This suggests that, once access is secured, AI TAs could disproportionately benefit those with less prior knowledge, potentially narrowing achievement gaps rather than widening them. Of course, access itself remains a critical barrier, particularly in terms of digital literacy, language fluency, and reliable connectivity. Moreover, the risk remains that AI systems might encode cultural, linguistic, or epistemic biases that privilege dominant norms. The effort to incorporate specific cultural perspectives and funds of knowledge through personas, discussed above, is our first attempt to address this concern within JeepyTA. It is also important to ensure that AI TAs do not primarily benefit students with higher digital literacy and stronger academic foundations.
Ultimately, then, it is important that policy requires evidence that AI TAs are fair for all learners: not just development practices intended to support fairness, but validation of fairness across learner groups as part of evaluation. One way to approach such validation is sketched below.
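As an illustration, here is a minimal sketch, with entirely hypothetical data, of one such validation: checking whether learners with different levels of prior knowledge benefit comparably from an AI TA, using the normalised learning gain measure common in education research (and directly relevant to the GPS effect discussed above).

# Sketch of a differential-benefit check: do learners with different prior
# knowledge gain similarly from using the AI TA? All data here is hypothetical.
from statistics import mean

# (prior_knowledge_group, pre_test, post_test) per student, scores out of 100
records = [
    ("low",  35, 62), ("low",  40, 70), ("low",  30, 55),
    ("high", 75, 88), ("high", 80, 90), ("high", 70, 85),
]

for g in ("low", "high"):
    gains = [(post - pre) / (100 - pre)   # normalised learning gain
             for gg, pre, post in records if gg == g]
    print(f"{g} prior knowledge: mean normalised gain = {mean(gains):.2f}")

A markedly larger gain for one group than another would flag that the AI TA is not serving all learners equally and needs redesign or targeted support. Checks like these, run across groups, courses, and institutions, are the kind of evidence the policy recommended here should require.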
