Reflection on generative AI as teaching assistants: implications and policy recommendations.
The emergence of generative AI (GenAI) in education signals a shift in the professional landscape of education, decentralising some traditional instructional roles and prompting a rethinking of what it means to teach. Historically, expertise in education has been concentrated in instructors and (to a lesser extent) human TAs, who provide guidance, feedback, and assessment. However, as GenAI systems become increasingly capable of tutoring, scaffolding learning, and responding to student needs in real time, the boundaries of these roles are being redrawn.
Rather than replacing human expertise, AI TAs built using GenAI demand that we rethink how instructional work is
distributed–not just between instructors and human TAs, but across AI-enhanced systems. This shift mirrors what
Shaffer, Nash, and Ruis describe as the reconfiguration of professional expertise in response to new tools
and technologies. As they argue, professionalisation is not static; it evolves when new ways of knowing and working
emerge. In the case of education, AI’s growing presence means that instructors must develop new competencies
– not only in teaching content, but in orchestrating AI-enhanced learning environments, interpreting AI-generated
insights, and ensuring alignment between AI feedback and pedagogical goals. As such, policy should emphasise
training of instructors and human TAs to work effectively with AI.
It is also critical to ensure that GenAI integration does not drive pedagogy but rather supports it in meaningful ways.
The Technological Pedagogical Content Knowledge (TPACK) framework provides a useful lens for
understanding this challenge. Educators must consider how GenAI interacts with both content knowledge (what is
being taught) and pedagogical strategies (how it is taught). Without thoughtful integration, there is a risk that AI could
push education further toward efficiency-based models, in which rapid feedback and automated assessments replace
deeper engagement with complex ideas, rather than toward an optimal combination of deeper learning and more
automated activities where each is appropriate. Prioritising efficiency and engagement
over meaningful understanding may improve the quality of student work and the student experience in the short term,
but may not benefit students in the longer term.
Mishra and colleagues’ work also highlights the need to move beyond mere adoption of AI tools to meaningful
integration into learning experiences. The presence of AI TAs does not inherently improve education; their
effectiveness depends on how they are aligned with broader learning goals. Educators must take an active role in
shaping AI’s function within courses, ensuring it complements and enhances human-centred teaching practices
rather than supplanting them. Therefore, we recommend against designing AI Teaching Assistants in ways that
replace humans and fully automate all learning activities; cost-cutting measures that create pressure to eliminate
human TAs should likewise be avoided. Beyond reducing the quality of instruction, reduced funding
for Teaching Assistants would also decrease opportunities for economically disadvantaged individuals who rely on
Teaching Assistant positions as a pathway into academia, ultimately reducing the pipeline of talented scholars into
research and scholarship.
Henriksen and Mishra’s work on practical wisdom further reinforces this perspective, emphasising that
experienced educators bring a form of professional knowledge that AI cannot replicate – one rooted in ethical
decision-making, contextual understanding, and reflective practice. As AI transforms the nature of knowledge in
education, teachers must ensure that human judgment, adaptability, and social-emotional insights remain at the
core of teaching. This highlights the need for educators to approach GenAI critically, leveraging its strengths while
maintaining the core humanistic elements of teaching and mentorship.
Reprofessionalisation in this context is therefore not just about preparing educators for an AI-integrated classroom–it
is about ensuring that humans and AI systems work in sync to advance student learning experiences and outcomes.
Just as previous technological shifts reshaped the teaching profession, GenAI requires a reimagining of teacher
preparation, assessment design, and professional collaboration. The goal is not merely to integrate AI, but to define
new models of expertise in an AI-augmented educational ecosystem–one in which human and AI agents work
together to support meaningful learning experiences while maintaining a commitment to equity and effectiveness.
Assessment practices in higher education serve multiple purposes: they inform students about their progress,
provide instructors with actionable insights to guide instruction, and certify learners’ competence. The emergence
of AI Teaching Assistants (AI TAs) such as JeepyTA creates new opportunities for formative assessment, along with
possibilities for summative assessment that merit caution. In both cases, careful design and policy guardrails
are necessary to ensure that AI TAs enhance rather than undermine the educational process.
A chief benefit of AI TAs lies in supporting formative feedback, where fast, specific, and individualised guidance can
promote deeper learning. Students who work late at night,
study remotely, or juggle other responsibilities often cannot attend regular office hours or wait for TAs to become
available. An AI TA can fill this gap by providing immediate, round-the-clock feedback, easing the pressure on human
TAs and making support more equitable.
Historically, automated assessments (e.g. quizzes, short-answer grading) have helped identify student misconceptions
and encouraged targeted practice. Generative AI now expands these possibilities by supporting more complex tasks–
from essay drafts to coding projects. In general, an AI TA can assess a broader range of competencies than a typical
human TA, supporting a shift from assessing what students know to assessing their conceptual understanding and
their problem-solving processes. Importantly, using AI for assessment also opens the possibility of assessing student
learning through a broader range of artifacts – including annotations, reflections, peer feedback, conversations, and
other forms of evidence – enabling a more holistic and nuanced view of learning that extends beyond traditional
measures.
AI TAs can also provide sophisticated, multi-dimensional assessment. As seen in the JeepyTA platform discussed earlier,
AI TAs can offer feedback on essay organisation, argument clarity, and conceptual rigor, referencing course rubrics
and standards to align with the instructors’ goals. This kind of timely, actionable commentary can help learners
iterate more quickly, moving from basic correctness checks toward higher-order thinking and reflection. If these systems are carefully designed to align with course-specific
materials and rubrics (as JeepyTA has been primed to do), the risk of misleading feedback that the instructor would
disagree with can be reduced – although such mismatches still occur, just as a human TA can give feedback that an
instructor disagrees with. Furthermore, the use of personas (as discussed above) or carefully designed prompting
can offer students feedback from different perspectives, highly relevant in some disciplinary areas but difficult for a
single human TA or instructor to provide.
With AI TAs taking on time-consuming tasks such as answering routine questions, reviewing initial drafts, or providing
rubric-aligned suggestions, human TAs and instructors are increasingly able to reallocate their time toward more
pedagogically meaningful and relationship-centred activities. These include leading in-depth discussions that challenge
students to think critically, working directly with individuals or small groups to support their academic progress,
meeting individually with students to support their academic and professional growth, and developing activities that
promote academic agency and collaboration. Academic staff can now invest more energy in synthesising performance
patterns across student submissions, identifying emerging misconceptions, and making ongoing improvements
to assignments or assessments based on observed trends and student needs. These higher-order instructional
practices remain difficult for AI systems to replicate, though such systems can support them in various ways. Rather
than displacing humans, then, AI TAs can create space for them to focus on tasks that require interpreting student
thinking in context, applying subject-matter expertise, and exercising instructional judgment.
Another possible concern with the provision of formative feedback by AI TAs is over-scaffolding, where
learners rely so heavily on AI-generated suggestions that their final products no longer represent independent
work. Particularly if students can seek several rounds of feedback, or obtain lower-level writing suggestions,
the sophistication of today’s generative AI can obscure the boundary between a student’s own efforts and
AI-provided content. Avoiding issues of this nature requires careful design, which policy can support by funding both
the establishment of guidelines on how much scaffolding is appropriate in different situations and research on
methods that deliver the benefits of scaffolding while avoiding over-scaffolding. Nonetheless, on the
whole, the benefits of providing formative feedback through a carefully designed AI TA seem to outweigh the
risks, and policy should encourage higher education to move forward in using AI TAs for formative feedback in ways
that can improve student learning.
Greater concerns arise for summative assessment, such as assigning final marks on assignments.
Automated essay scoring has a history dating back decades; it offers consistency
and quick turnaround, but has often been criticised for focusing on superficial textual features. Still, the use of automated essay
scoring offers many lessons in how automated grading can be psychometrically validated and used appropriately
in ways that support higher education institutions in offering high-quality consistent grading at lower human cost.
More advanced, generative AI-based approaches can analyse content in greater detail than most earlier methods,
but can be prone to systematic biases, a lack of transparency in scoring, treating inaccurate but widely believed
misconceptions as true, and even, in some cases, “hallucinating” information not present in the original student
work. For high-stakes decisions such as course grades, even small error rates can have consequences for student
outcomes and perceptions of fairness. Moreover, incorporating AI into summative grading can amplify existing equity
concerns. If an AI TA’s underlying model has been trained on data reflecting cultural or linguistic biases, students
from underrepresented backgrounds may be inadvertently penalised. The presence of generative AI in summative
grading therefore necessitates robust validation across diverse student populations and consideration of whose
perspectives are centred both within assessments and in the evaluation of their fairness. Policymakers and institutional leaders must establish policies ensuring that any summative use
of AI-based scoring is supported by transparent procedures, documented reliability metrics (including evidence of
limited or no algorithmic bias), and the ability for students (and instructors)
to question and appeal automated scores.
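To make the idea of documented reliability metrics concrete, here is a minimal sketch of two such checks. It uses hypothetical scores on an assumed 1–5 rubric scale; the function names, data, and group labels are illustrative, not drawn from any system discussed above. Quadratic weighted kappa is a standard psychometric measure of agreement between AI-assigned and human-assigned grades, and a per-group mean comparison offers a crude first look for score gaps that would warrant deeper bias analysis:

```python
def quadratic_weighted_kappa(human, ai, min_score=1, max_score=5):
    """Agreement between two raters on an ordinal scale, penalising larger
    disagreements quadratically (1.0 = perfect agreement, 0.0 = chance)."""
    n = max_score - min_score + 1
    # Observed frequency of each (human score, AI score) pair.
    observed = [[0.0] * n for _ in range(n)]
    for h, a in zip(human, ai):
        observed[h - min_score][a - min_score] += 1
    total = len(human)
    hist_h = [sum(row) for row in observed]                      # human score counts
    hist_a = [sum(observed[i][j] for i in range(n)) for j in range(n)]  # AI score counts
    num = den = 0.0
    for i in range(n):
        for j in range(n):
            w = ((i - j) ** 2) / ((n - 1) ** 2)        # quadratic disagreement weight
            expected = hist_h[i] * hist_a[j] / total    # expected count if independent
            num += w * observed[i][j]
            den += w * expected
    return 1.0 - num / den

def group_mean_scores(scores, groups):
    """Mean AI-assigned score per (hypothetical) demographic group;
    large gaps flag a need for closer bias investigation."""
    sums, counts = {}, {}
    for s, g in zip(scores, groups):
        sums[g] = sums.get(g, 0) + s
        counts[g] = counts.get(g, 0) + 1
    return {g: sums[g] / counts[g] for g in sums}

# Illustrative use with made-up grades:
human = [3, 4, 2, 5, 3, 4]
ai    = [3, 4, 3, 5, 2, 4]
print(round(quadratic_weighted_kappa(human, ai), 3))  # → 0.818
```

A real validation would of course use far larger samples, report confidence intervals, and examine error patterns (not just means) across groups; this sketch only illustrates the kind of evidence a transparency policy might require.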
Decisions about when AI assistance is acceptable – and how much AI-driven contribution is too much – will differ by
course context and disciplinary standards, as well as how far along students are in their development of expertise.
In fields like computer science or business, where collaborative problem-solving with tools is central, it may be
appropriate to evaluate students in more advanced classes based on how effectively they leverage AI to reach correct
solutions. By contrast, in courses emphasising individual mastery of foundational skills, unmediated AI assistance
could undermine the competencies being assessed. As such, it may be appropriate to develop guidelines, either at
a disciplinary level or in the context of specific standard courses, for which tasks should incorporate AI support, and
what types of formative assessment and support are warranted.
Overall, there is considerable potential for generative AI-based Teaching Assistants to support formative
assessment, and some possibilities in summative assessment as well, provided these are approached with sufficient
caution and human oversight is retained. Policy designed to encourage appropriate use has potential benefits for both
students and instructors.