Generative AI as a teaching assistant.

This section examines how AI Teaching Assistants driven by generative AI can expand and support the traditional roles played by human Teaching Assistants (TAs) and instructors in higher education. AI TAs are designed to automate routine administrative tasks, deliver real-time student support, and provide timely formative feedback at scale in ways that complement and extend human capabilities. Rather than treating AI TAs as replacements for human TAs, we argue for a model of augmentation: AI TAs provide support and feedback that is otherwise infeasible for human instructors and TAs alone, namely support and feedback that is immediate, personalised, detailed, and available around the clock. The presentation and discussion of JeepyTA, a generative AI system, helps ground these ideas. We also discuss key design and implementation considerations.

Much of the recent focus on generative artificial intelligence (GenAI) in education has considered it as a tool used individually either by a student or an instructor, through a commercial off-the-shelf chatbot designed to be an assistant. This role has considerable value, but GenAI can be used in several other fashions, as shown in this report. This chapter considers the role of GenAI in supporting instructors and Teaching Assistants (TAs). TAs serve as a foundational support structure within colleges and universities, bridging the gap between students and academic staff and helping to sustain the quality of instruction in a wide range of disciplines. By leading discussion sections, grading assignments, answering questions, and offering individualised guidance, TAs play a vital role in shaping how students experience their courses. This dual role as both an intermediary and a mentor underscores the significance of the work that TAs do, not just in managing course logistics but in advancing student learning, engagement, and success.
As higher education evolves – facing increasing enrolment pressures, budget constraints, and shifts toward online or hybrid delivery – the role of TAs is likely to expand. At the same time, TAs are themselves students, balancing these teaching responsibilities with their own scholarly activities, which can lead to time conflicts, uneven support for learners, and burnout. These challenges raise important questions about how universities can sustainably leverage the benefits TAs provide while addressing the real human limitations of time, expertise, and scalability. Recent advances in GenAI offer a promising avenue to complement TAs’ efforts, while retaining the human element that underpins great teaching. GenAI-driven “teaching assistants,” powered by large language models (LLMs) and other advanced technologies, have the potential to handle repetitive administrative tasks, deliver targeted learning support, and provide immediate feedback to students in a way that human TAs alone simply cannot. By examining the evolving role of TAs and exploring how AI can enrich and extend their capabilities, this section seeks to highlight both current practices and new horizons for more equitable, accessible, and impactful teaching support. We conclude with a discussion of implications, including for policy.

A Teaching Assistant (TA) in higher education (in countries where this role exists) is typically a graduate (Master’s and PhD) or advanced undergraduate (Bachelor’s) student who supports the primary instructor in delivering course content and assisting students. A TA’s responsibilities may vary by institution and discipline, but they generally include facilitating small-group discussions, answering student questions, and supplementing the main instructor’s efforts to create an effective learning environment. In recent years, as more courses have tended to partly or fully move online, TAs also play a large role in managing discussion forums, answering student questions and supporting discussions there. By handling a portion of the teaching and administrative workload, TAs play a critical role in making large or complex courses more manageable, thus enhancing the overall educational experience for both academic staff and students. The origins of Teaching Assistants can be traced back to the late nineteenth century, when growing student enrolments and expanding research expectations prompted universities to look for ways to extend instructional capacity. In these early stages, TAs often served as informal helpers to more senior academic staff, assisting with tasks like grading or lab supervision as part of their own apprenticeship in academia. Over time, and particularly after World War II when higher education systems expanded rapidly, the role of TAs became increasingly formalised. Universities began creating structured programs that provided clearer job responsibilities, training, and professional development opportunities, reflecting the recognition that TAs could significantly enhance both teaching and learning. This evolution laid the foundation for the modern TA role, in which graduate and advanced undergraduate students are systematically integrated into the educational process (Park, 2004[3]). 
Teaching Assistants carry out a range of tasks that collectively support both the instructor and students in higher education settings. Their responsibilities can be broadly categorised into instructional support and student engagement, though these two areas naturally overlap (Park, 2004[3]). By taking on these roles, TAs help to foster an environment that promotes understanding, participation, and continuous feedback – key components of effective learning (Hattie and Timperley, 2007[6]; Chi and Wylie, 2014[7]). In terms of instructional support, TAs frequently lead discussion sections, tutorials, or laboratory sessions, serving as facilitators who bridge theory and practice (Park, 2004[3]). In these smaller and often more interactive settings, TAs clarify course material, demonstrate practical techniques, and encourage student participation. By adapting teaching methods to the needs of specific groups of students, TAs help maintain a dynamic and inclusive classroom atmosphere. Another critical element of a TA’s role involves assessing student work. TAs often grade assignments, quizzes, and exams under the supervision of the lead instructor (Marshman et al., 2018[8]). This process typically includes reviewing submissions, providing constructive feedback, and highlighting areas for improvement – ideally, guiding students toward developing and demonstrating deeper understanding. Grading student work not only eases faculty workload but can also offer TAs valuable, instructor-scaffolded experience in evaluating academic performance, helping TAs to develop a deeper understanding of student thinking. As for student engagement, in many modern courses – particularly those with hybrid or fully online components – TAs serve as key points of contact on discussion forums (Wadams and Schick-Makaroff, 2022[4]).
By responding to questions, facilitating conversation, and sharing clarifications from the instructor, they help maintain an active and supportive online learning community. This work often extends to moderating peer-to-peer exchanges, ensuring that discussions remain on topic and respectful. In addition, TAs often hold regular office hours and meet with students to allow students to seek in-depth explanations, review feedback, or discuss academic challenges. These mechanisms often provide learning support beyond what instructors can offer just-in-time and on-demand, particularly for research-active senior academic staff or large courses. Teaching Assistants provide a range of benefits to higher education institutions, more senior academic staff, and students. Universities often find that using TAs is a cost-effective means of managing large course enrolments while still providing individualised support to students, a topic of constant interest to administrators when university budgets are under pressure. For more senior academic staff, TAs offer substantial advantages by relieving some of the workload associated with teaching, grading, and administrative duties. By delegating tasks such as discussion facilitation, assignment feedback, and routine course management, faculty members can devote more time to developing innovative curricula, advancing their research agendas, and mentoring students (including the TAs) at higher levels. In addition, TAs often introduce diverse perspectives or novel approaches to instruction, encouraging
a collaborative environment in which both senior academic staff and TAs refine teaching strategies (Begley et al., 2019[10]). Finally, students also benefit significantly from the involvement of TAs. In many cases, TAs are more available to answer questions outside of regular class times, and their support on online discussion forums can be accessed asynchronously, providing a flexible option for students who need extra help. TAs’ relative proximity to the student experience - whether by age, academic journey, or shared disciplines - can also result in a peer-mentorship-like atmosphere during office hours and informal interactions. As such, TAs’ greater relatability and students’ perception that they are more understanding (Kendall and Schussler, 2012[11]) can ease anxieties and foster a sense of community, ultimately enhancing the overall learning experience. However, several challenges have been noted for current practices involving Teaching Assistants. Regarding instructional support, many TAs may for example lack pedagogical training or skill. This lack of formal preparation can undermine the quality of their instruction, as they may be uncertain about how to present information clearly. Moreover, TAs sometimes adopt a surface-level approach to grading, focusing on relatively simplistic aspects of correctness rather than attempting to provide feedback that guides students toward deeper conceptual understanding. Compounding these issues is the fact that TAs typically possess less subject-specific expertise than full academic staff, which can limit their ability to answer complex questions or provide advanced guidance. Furthermore, TAs often face challenges around workload and time constraints. Many TAs must balance teaching responsibilities with personal academic obligations, such as coursework, research projects, and preparation for required examinations. 
Some TAs may find it difficult to invest the necessary time in class preparation, grading, or providing substantive feedback to students. This overload can also lead to high stress and exhaustion, reducing their effectiveness as a TA while also impairing their other work and personal success. This is exacerbated by the uneven training and faculty support provided to TAs. Some instructors involve TAs extensively in designing lesson plans, assessment rubrics, or instructional materials, while others may provide only minimal training and mentorship. Furthermore, many TAs do not have access to teaching mentors other than the instructor. This lack of support can leave TAs uncertain about expectations or best practices, making it harder for them to support students without spending large amounts of their time. Hence, while TAs fulfil crucial roles in supporting learning and engagement, and help bridge the gap between senior academic staff and students, there are several respects in which current practices are not optimal for either TAs or students.

Could generative AI (GenAI) technologies improve things for both TAs and students? The emergence of GenAI models has created a great deal of enthusiasm about a wide range of potential educational benefits. One area of rising interest has been the creation of AI Teaching Assistants – tools that extend the capabilities of human TAs. AI Teaching Assistants (AI TAs) use computational methods – in many recent cases GenAI, but before that machine learning and previous-generation natural language processing (NLP) – to perform tasks that were typically in the purview of human TAs, although in many cases beyond what was feasible for human TAs. Their scope has included streamlining routine administrative tasks, providing targeted learning support or rapid formative assessment, and empowering human TAs and instructors with information and insights about their students, enhancing the overall effectiveness of instructional delivery. While they sometimes take over what used to be human tasks, these systems are not intended to replace human educators, but instead to provide support 24/7 and free up valuable time for TAs and instructors to focus on more complex, critical, and high-impact aspects of teaching. Over the past few decades, the use of AI in educational technology has evolved from simple automation tools – like basic quiz generators and grading scripts – to advanced AI systems that can process language and consider context, allowing them to respond to students’ questions and needs in real time. The previous generation of intelligent tutoring systems and question answering systems could offer sophisticated support, but was highly expensive to author, often necessitating intense focus on only a single aspect of adaptivity. The contemporary use of GenAI, sometimes combined with previous-generation machine learning, creates the potential for a qualitative leap forward in functionality and sophistication, at much lower development cost.
These technologies, when used to complement instructors, can take on repetitive tasks – such as answering common questions and some parts of the assessment of student work – thereby providing immediate, round-the-clock support to learners and eliminating the bottleneck that often occurs when TAs or instructors are not available – for instance, for an online learner working from a different time zone than the instructors. At the same time, human TAs and instructors can allocate their expertise to higher-order pedagogical activities, such as facilitating in-depth discussions, offering mentorship, and providing customised feedback for unusual cases and learning challenges. This synergy ultimately helps institutions maintain quality education at scale, addresses the labour-intensive aspects of teaching, and ultimately supports instructors in finding time for high-value personal interaction with learners. In the following section, we will discuss some of the ways that AI TAs can support learners, human TAs, and instructors.

Administrative and logistical support is an area of responsibility for current human TAs that is relatively easy and uncontroversial to automate. When more administrative processes, such as course enrolment or the monitoring of assignment completion, are automated, instructors can devote more time to pedagogical planning and personalised student engagement. AI TAs can also manage course communications by sending out timely reminders for assignments, examinations, and events, ensuring students remain informed and minimising the risk of missed deadlines. Streamlining these tasks can allow human instructors and TAs to focus on higher-level teaching responsibilities, such as curriculum development and individualised feedback. Beyond routine administrative tasks, AI systems can support course logistics by monitoring student participation and promptly alerting instructors to potential engagement and performance issues, and by distilling insights from discussions on course forums for instructors. Another potential area of application for AI TAs is instructional support. AI Teaching Assistants can provide supplementary explanations or resources tailored to students’ individual needs. For example, if a student expresses confusion about a specific topic, these systems can supply targeted materials, such as a textual explanation, brief video tutorials, interactive modules, or suggested readings (Sajja et al., 2024[14]; Essel et al., 2022[27]; Yetişensoy and Karaduman, 2024[15]). Such a system can devote more attention to customising learning for a given student than would be feasible for even the most dedicated human Teaching Assistant. By engaging in follow-up discussion, an AI TA based on a chatbot can support a student in ways that would be infeasible with a static resource. In addition, just as current GenAI can recommend resources to a learner, it can also assist educators with content curation and lesson planning, suggesting how to communicate topics more effectively.
Furthermore, AI TAs can support instructors in evaluating students’ progress, particularly when it comes to formative assessment. Summative assessment through AI has been used in some applications but still needs to meet a higher bar for reliability and fairness. Formative assessments to inform instructors or support learners can be used more safely because the stakes are lower. There is a long history of using automated assessments: decades of work assessed learners with automated quizzes and multiple-choice items, and a previous generation of NLP afforded short-answer grading and automated essay scoring. Previous work also enabled the generation of new items, through procedural templates for instance. However, the advent of generative AI has made it possible both to generate new items in sophisticated, tailored ways and to offer much more detailed, rich feedback on complex artifacts created by students. Automated announcement tools can then deliver personalised updates to students, supporting both performance and self-regulation. When detailed feedback can be offered in a much timelier fashion, students are more likely to revise their understanding and adapt in ways that align with the course goals. AI Teaching Assistants embedded in course platforms can effectively serve as a first point of contact for students, promptly addressing frequently asked questions about the syllabus, assignment deadlines, and other logistical concerns. By referencing a structured knowledge base, these systems can also respond to content-related inquiries, offering supplementary explanations or clarifications. Questions can be asked in an external platform, within the course discussion forum, in a learning management system (LMS), or in the context of a learning activity itself. These tools can save considerable time for instructors and human TAs.
More importantly, the immediate, round-the-clock availability of these tools supports learners who may need help outside of conventional office hours; for example, one study of an AI TA embedded into a discussion forum found that students received responses significantly more often on weekends with the AI TA than during the previous semester, which had only human TA support. While such a system cannot respond to all student queries, it can provide support in many cases. In other cases, where a query requires more nuanced interpretation or context – such as complex conceptual misunderstandings or unique personal circumstances – AI TAs can escalate the matter to human TAs, thus ensuring students receive appropriate and thorough support. This triaging function can help manage the flow of incoming questions, reducing the volume of simpler queries that human TAs and instructors must handle. As a result, educators are freed to spend more time providing personalised feedback, guiding higher-level discussions, and engaging students in meaningful academic interactions. While this type of question answering functionality was available even before the advent of LLMs, it required considerable engineering compared to the relative ease of deployment now possible.
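The triaging function described above can be illustrated with a minimal Python sketch. It is not drawn from any deployed system: the function name triage, the threshold value, the SENSITIVE_TERMS list, and the idea of passing in a single best-match relevance score are all hypothetical choices made for illustration, and a real AI TA would need a far more careful way of recognising personal circumstances.

```python
from dataclasses import dataclass

# Hypothetical markers of personal circumstances (an assumption for this
# sketch; keyword matching alone would be too crude in practice).
SENSITIVE_TERMS = ("illness", "emergency", "bereavement")


@dataclass
class Routing:
    handler: str  # "ai_ta" or "human_ta"
    reason: str


def triage(question: str, best_match_score: float, threshold: float = 0.5) -> Routing:
    """Route a student question to the AI TA or escalate it to a human TA.

    best_match_score is assumed to be a 0..1 relevance score between the
    question and the best-matching course material.
    """
    lowered = question.lower()
    # Questions touching on personal circumstances always go to a human.
    if any(term in lowered for term in SENSITIVE_TERMS):
        return Routing("human_ta", "possible personal circumstances")
    # Without sufficiently relevant course material, escalate rather than guess.
    if best_match_score < threshold:
        return Routing("human_ta", "no sufficiently relevant course material")
    return Routing("ai_ta", "covered by course materials")
```

Under these assumptions, a routine question with a strong match stays with the AI TA, while a weak match or a mention of an emergency is passed to a human TA.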

One of the key steps to moving these types of advancements from one-off research projects to scalable solutions benefitting a large number of learners is ensuring they integrate seamlessly with existing educational infrastructures. Many AI-based tools of this type so far require learners and instructors to use separate platforms rather than being integrated directly into their primary learning management systems (LMS) or discussion forums. This lack of integration or interoperability can create a fragmented user experience, requiring additional sign-ins, duplicating data entry, and making it harder to track student progress across multiple systems. In contrast, compatibility with widely used LMSs (e.g. Canvas, Moodle, Blackboard) and discussion forum platforms (Piazza, Discourse, phpBB, vBulletin, Flarum) would allow AI TAs to seamlessly access course materials, participation records, and student performance data. Such interoperability not only streamlines the user experience but also supports richer analytics and more effective, personalised interventions, ultimately strengthening the teaching and learning process. Another key step for making these systems usable at scale will be efforts to engineer the human-computer interactions of these systems to facilitate their use by busy human TAs and instructors. Currently, the process of integrating course resources varies in complexity between tools, and the degree of uptake can vary considerably between instructors. There are several ways to accomplish this, including shared folders, access to learning management systems as a simulated student, or tools for uploading resources - but whichever approach is chosen, it must be low-effort for human beings. It should also be easy to continually update these resources, as changes to course materials and syllabi will often occur within a semester and across semesters for courses that are offered on a regular basis. 
In addition, onboarding and even training is needed for the human TAs and instructors who will collaborate with an AI TA. They will need to understand enough about how the system works, what it can do, and what its limitations are, to ensure that they implement it effectively in their courses. By clearly communicating which queries or tasks the AI TA should handle versus those that call for human expertise, institutions can maintain quality control while maximising efficiency. Over time, incorporating continuous improvement and feedback loops can further refine these boundaries. For instance, platforms should support human TAs and instructors in regularly reviewing the AI TA’s responses to student inquiries. Student and instructor feedback collected through short surveys or mining forum discussions can also highlight areas where the AI TA might be underperforming or producing confusing or inaccurate information. Supporting instructors in checking and refining the system will help to ensure that content remains accurate, relevant, and aligned with educational objectives, reduce instructor frustration, and increase the likelihood of long-term sustained use.

JeepyTA is an example of a course-specific, AI-driven Teaching Assistant designed to integrate with existing classroom and online practices. Developed by the Penn Center for Learning Analytics at the University of Pennsylvania (UPenn) and launched in Fall 2023 (Liu et al., forthcoming[16]), JeepyTA utilises a multi-turn conversational architecture of large language models (LLMs) and is not bound to a specific model – it can be configured to run on many LLMs (e.g. GPT, Llama, or DeepSeek). In courses where JeepyTA has been used, a recent OpenAI GPT model has been used (starting with GPT-3.5 Turbo, moving to GPT-4, GPT-4-Turbo and GPT-4o). JeepyTA has been used in various scenarios: to deliver responses to logistics questions, contextually respond in discussions based on provided course materials, provide targeted feedback to written assignments and coding problems, and to serve as a brainstorming partner. As of Spring 2025, JeepyTA has been deployed across 16 sessions of 14 courses at three higher education institutions in the USA (with deployments also running later in 2025 in Singapore and Colombia). This widespread adoption reflects the growing interest in AI systems that can free human instructors and TAs from repetitive logistical duties, while still delivering responsive, round-the-clock support for learners.

To ensure course alignment, JeepyTA is primed with instructors’ chosen reference materials, including syllabi, textbooks, readings, and past instructor feedback. These resources are embedded in the system’s knowledge base through a retrieval-augmented generation (RAG) workflow: newly uploaded documents are converted into embedding vectors, enabling JeepyTA to retrieve contextually relevant information via semantic search. In doing so, JeepyTA can address administrative queries – such as answering date-related questions from the syllabus – while simultaneously leveraging contextual readings to stimulate in-depth discussions on course-specific topics. Additionally, through collaboration with instructors, in several cases, JeepyTA’s prompts have been iteratively refined to better address specific learning objectives. Separate models have been employed to automate decisions on whether responses appear immediately or await instructor approval, providing finer control over JeepyTA’s engagement in forum discussions. Finally, JeepyTA’s behavior can be customised by category of tasks, giving instructors the flexibility to choose which topics or discussion forum categories it responds to and with what level of human supervision (human-in-the-loop). Across courses, JeepyTA has been integrated into the open-source Flarum platform, appearing as a forum user distinctly marked as an AI Teaching Assistant. Through a Progressive Web App, the forum is accessible on mobile devices, allowing students and instructors to stay engaged on-the-go. In addition to traditional email notifications, users of a mobile app can receive push alerts – such as when JeepyTA responds or specifically mentions them – ensuring timely updates and facilitating faster interaction within the discussion forum.
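The retrieval step of a retrieval-augmented generation workflow of this kind can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not JeepyTA's implementation: the embed function below is a toy bag-of-words stand-in for a real embedding model, and the names CourseIndex, retrieve, and build_prompt, the paragraph-level chunking, and the prompt wording are all hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class CourseIndex:
    """Stores course material chunks alongside their vectors."""

    def __init__(self) -> None:
        self.chunks: list[tuple[str, Counter]] = []

    def add_document(self, text: str) -> None:
        # Assumption: one chunk per blank-line-separated paragraph.
        for chunk in text.split("\n\n"):
            chunk = chunk.strip()
            if chunk:
                self.chunks.append((chunk, embed(chunk)))

    def retrieve(self, query: str, k: int = 2) -> list[tuple[float, str]]:
        """Return up to k (score, chunk) pairs, most similar first."""
        q = embed(query)
        scored = sorted(((cosine(q, v), t) for t, v in self.chunks), reverse=True)
        return [(s, t) for s, t in scored[:k] if s > 0]


def build_prompt(query: str, index: CourseIndex) -> str:
    """Assemble the text that would be sent to the LLM (hypothetical wording)."""
    context = "\n".join(f"- {t}" for _, t in index.retrieve(query))
    return (
        "Answer using only the course materials below.\n"
        f"{context}\n\nStudent question: {query}"
    )
```

For example, after indexing a syllabus containing the paragraphs "Assignment 1 is due on 15 September." and "Office hours are held on Tuesdays at 3pm.", the query "When is assignment 1 due?" retrieves only the first paragraph, which is then placed in the prompt ahead of the student's question. A production system would replace embed with a learned embedding model and the list scan with a vector store.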

One of JeepyTA’s primary functions is answering logistics questions about the course. At the beginning of the semester, JeepyTA can handle enrollment-related inquiries, including prerequisites, add/drop deadlines, and options for changing course registration. When students need accommodations, JeepyTA directs them to official university guidelines and relevant support services. It also provides information on class schedules, classroom locations, and changes due to holidays or special events. When a course has multiple sections, JeepyTA helps students confirm where they need to attend. To support coursework, JeepyTA clarifies submission guidelines for assignments, specifying required file formats, submission portals, and deadlines. JeepyTA also assists with technical aspects of online learning platforms when required by the course. It helps students log into external platforms used by instructors (for instance, for video discussions) and provides information like login codes, platform access links, and usage instructions. If students experience submission errors or other technical problems, JeepyTA offers guidance in many cases without needing to involve the instructor. JeepyTA helps students understand grading policies by explaining how grades are calculated based on rubrics, weighted components, and participation requirements. It also assists in interpreting feedback from instructors and TAs and guiding students on resubmissions, appeals, or grade disputes (see Figure 9.1). When students need access to course materials, JeepyTA provides links to lecture slides, reading repositories, and virtual meeting links, ensuring they have the necessary resources. With recent updates, JeepyTA can remember instructor responses and announcements on recurring topics. If students ask about schedule changes, assignment deadlines, or policy updates, JeepyTA provides the latest information. 
This reduces confusion and keeps students informed without requiring instructors to repeat themselves. It is worth noting that JeepyTA’s ability to answer logistics questions depends on the information instructors choose to provide. It does not generate responses based on general knowledge but instead pulls from course-specific details that instructors input. If a detail was not provided, JeepyTA directs students to the human TA (if available for the course) and instructor or official course documents rather than guessing or giving incomplete information. JeepyTA’s performance in answering logistics questions is not always perfect, as some student inquiries may go beyond what is covered in the course materials. Therefore, instructors can choose to edit JeepyTA’s responses at any time instead of only choosing between fully accepting or discarding them. This option is especially useful when combined with the feature that allows instructors to review JeepyTA’s response before it becomes visible to other students (as explained below). This flexibility allows instructors to keep useful parts, make quick edits, and provide students with accurate information while reducing effort.

In several courses, JeepyTA provides feedback on student essays based on the grading rubric specified for the assignment. This consists of both higher-level conceptual elements and aspects of writing. In terms of higher-level conceptual elements, JeepyTA evaluates essays on the goals of the assignment, such as whether (for example) the student has appropriately discussed the needs of stakeholder groups, whether the student has made arguments in terms of theories discussed in class, or whether the limitations of a proposed solution have been concretely detailed (see Figure 9.2). When students submit drafts, JeepyTA acknowledges what they do well and highlights their strengths. JeepyTA also offers feedback on more mechanical aspects of writing such as argument clarity, evidence use, structure, and writing quality (see Figure 9.2). In addition, JeepyTA comments on lower-level details such as language use, unclear phrasing, grammar mistakes, and wordiness. In these cases, it suggests revisions that can preserve the student’s original intent. GPT models are trained to provide generic responses that apply across many scenarios, which can make their default feedback vague or overly general. To prevent this, JeepyTA was instructed to “provide actionable insights rather than shallow suggestions”. This small detail within prompt engineering makes a difference in helping students receive concrete guidance that improves their revisions. If students need clarification, they can ask follow-up questions, and JeepyTA refines its guidance based on those questions. Instructors can also adjust JeepyTA’s feedback settings to focus on specific aspects of writing or emphasise areas where students generally struggle the most.

Before JeepyTA is asked to provide feedback on essays, its responses are first tested on a set of sample essays and the output is reviewed with instructors. This step helps confirm that the feedback aligns with the pedagogical goals of the course. When necessary, the prompts are refined based on the instructors’ suggestions during the review process. This process helps JeepyTA provide comments that are clear, relevant to the assignment, and focused on the aspects instructors consider most important. It also creates an opportunity to catch cases where the LLM’s default knowledge base produces inaccurate information, such as cases where much of the content on the web reflects an incorrect understanding of a specific technical point. In some cases, the prompt specifies a particular tone to shape the feedback style. For example, JeepyTA can be instructed to provide concise and direct feedback or to take a more encouraging and supportive tone. This allows the feedback to align with the way instructors and TAs typically communicate with students about their writing. Additionally, past feedback from previous course offerings, together with the matching de-identified student essays, is included in some cases as a reference for JeepyTA. JeepyTA does not use the content of past essays as a source for feedback but instead looks at these examples to follow the structure, level of detail, and key focus areas that instructors and TAs have emphasised. This helps make the feedback more useful to students by reflecting the expectations and priorities set in previous iterations of the course.

JeepyTA is also capable of responding to student reflections and questions on the course readings and lectures, offering additional clarification, prompting further thinking, and connecting ideas across course materials. When students share reflections, JeepyTA acknowledges their contributions by reinforcing key ideas from the readings or connecting their insights to broader course themes. If a student raises a question about a concept, theory, or method, JeepyTA provides a response by summarising relevant arguments, explaining terms, or pointing to sections of the readings that address the issue (see Figure 9.3). When a reflection introduces an interesting perspective or critique, JeepyTA may pose follow-up questions to encourage further discussion. To maintain consistency between JeepyTA’s responses and the course content, JeepyTA was specifically instructed to first reference the course materials, with specific materials selected based on their similarity score to the student’s query, while replying. JeepyTA was also prompted to use course-specific language as defined by instructors before the start of the semester in its responses. If a student’s question is not closely related to the course, JeepyTA may be instructed to rely on its knowledge base to respond. Instructors or TAs can modify the visibility settings of JeepyTA’s responses at any time during the semester. If preferred, JeepyTA’s responses can be flagged for instructor review before being shared with the students. This option can be turned on or off at any time during implementation. It is also available for selected categories, such as only for answering logistics questions. This helps prevent the provision of incorrect or misleading information, a particular issue in subject areas where misconceptions are highly present on the web and therefore also in the LLM knowledge base. Instructors or TAs can review flagged responses. If a response is inaccurate, they can discard it and reply directly. 
If the response is mostly correct but needs refinement, they can edit it before posting. When a response is accurate and well-structured, the instructors or TAs can approve it as is. This additional layer allows JeepyTA to provide timely support while making sure students receive information that is accurate, relevant, and aligned with the course objectives.
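The retrieval step described above, in which course materials are selected by their similarity score to the student's query, can be sketched roughly as follows. This is a minimal illustration and not JeepyTA's actual implementation: the bag-of-words vectors, the `rank_materials` helper, and the sample passages are all assumptions made for the example, and a production system would more likely compare learned text embeddings.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_materials(query, materials, top_k=2):
    """Return the top_k (title, score) pairs most similar to the query."""
    query_vec = Counter(tokenize(query))
    scored = [
        (title, cosine_similarity(query_vec, Counter(tokenize(text))))
        for title, text in materials.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical course materials, invented for this example.
materials = {
    "Syllabus": "Assignment deadlines, late policy, and grading weights for the course.",
    "Week 3 reading": "Knowledge tracing models estimate student mastery of skills over time.",
    "Week 5 reading": "Detectors of student engagement are validated with student-level cross-validation.",
}

top = rank_materials("How does knowledge tracing estimate mastery?", materials)
```

The highest-scoring passages would then be placed in the prompt ahead of the student's question, so that the model answers from course content first.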

For courses involving programming (but not focused on learning to program), JeepyTA also provides debugging support to students working on assignments by analysing their code and identifying potential errors. When students submit code snippets or describe issues they encounter, JeepyTA reviews the logic, syntax, and structure to pinpoint common mistakes. It then suggests corrections or improvements to resolve the errors (see Figure 9.4). JeepyTA is generally encouraged to use its existing programming knowledge to solve coding issues while following course-specific conventions or practices. In an Educational Data Mining course, for example, student-level cross-validation is required because this method evaluates how well a model generalises to unseen students. A general LLM chatbot may default to recommending a simple train-test split, a technique that would be acceptable in other contexts but is not the method needed in this course. To prevent this, JeepyTA is instructed to prioritise debugging support based on course materials, assignment requirements, and instructor guidelines rather than relying on broadly used techniques that may not be appropriate in the course context. When addressing programming errors, it refers to the course’s preferred approaches and explains why they are used and how they differ from other methods. In cases where code produces unexpected output, JeepyTA offers strategies for troubleshooting, such as adding print statements, checking variable values, or breaking down complex functions into smaller, testable parts. If students describe the problem rather than submitting code, JeepyTA suggests debugging techniques based on the nature of the issue and guides them through potential causes and solutions. If students describe the problem vaguely or provide incomplete context, JeepyTA asks follow-up questions to clarify the issue before offering suggestions. 
For example, if a student says, “The code isn’t working,” without specifying the error message or expected output, JeepyTA prompts them to provide more details, such as the error message received, the intended function of the code, or the steps they have already tried. When providing programming code support, JeepyTA’s prompts are designed to avoid simply providing the correct code but instead focus on helping students understand how to diagnose and fix errors themselves. The prompt design encourages students to learn from their own debugging process, which creates opportunities for them to build confidence and capacity to read errors, trace code, and solve problems independently rather than rely on being given the exact fix. JeepyTA has provided debugging support for two courses across two semesters, but its effectiveness in identifying errors has been limited in some cases. One possible reason is that it did not have access to the datasets students were working on in either implementation, which makes it difficult for JeepyTA to verify data structures, variable values, or dataset-specific errors. On the other hand, in other cases, it has caught unusual mistakes (such as a student using the symbol \ instead of |) and typos which can be difficult for instructors and human TAs to see in a lengthy program (see Figure 9.4). Even when JeepyTA has been unable to pinpoint the exact cause of issues, it has helped students clarify their problems and has suggested general debugging strategies. This still reduces the time instructors or TAs need to spend guiding students through the initial steps of troubleshooting.
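To make the distinction between student-level cross-validation and a simple train-test split concrete, the following sketch assigns rows to folds by student, so that no student contributes data to both the training and the test side of a split. It is an illustration only: the function names and the sample student IDs are hypothetical, and the round-robin fold assignment is a simplifying assumption rather than the course's prescribed procedure.

```python
from collections import defaultdict


def student_level_folds(student_ids, n_folds=3):
    """Assign each row to a fold so that all rows from one student share a fold."""
    unique_students = sorted(set(student_ids))
    # Round-robin assignment of students (not rows) to folds.
    fold_of_student = {s: i % n_folds for i, s in enumerate(unique_students)}
    folds = defaultdict(list)
    for row_index, student in enumerate(student_ids):
        folds[fold_of_student[student]].append(row_index)
    return [folds[k] for k in range(n_folds)]


def train_test_splits(student_ids, n_folds=3):
    """Return a list of (train_rows, test_rows) pairs, one per fold."""
    folds = student_level_folds(student_ids, n_folds)
    splits = []
    for k, test_rows in enumerate(folds):
        train_rows = [r for j, fold in enumerate(folds) if j != k for r in fold]
        splits.append((train_rows, test_rows))
    return splits


# Hypothetical log data: one student ID per logged action.
student_ids = ["s1", "s1", "s2", "s3", "s3", "s3", "s4", "s5", "s5", "s6"]
splits = train_test_splits(student_ids)
```

A row-level random split would typically scatter each student's actions across both sides, which tends to overstate how well a model generalises to unseen students. In practice, scikit-learn's `GroupKFold` offers the same grouping behaviour.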

JeepyTA has also been applied to generate summaries of discussion forum conversations. When the use case was first introduced in Spring 2024, summaries were provided only to instructors and TAs to give them an overview of the student discussions. As of Spring 2025, in some courses, these summaries are accessible to all students on the forum. After each weekly discussion, JeepyTA summarises key points, groups discussions into meaningful themes and recurring arguments, and identifies important questions from the discussion forum. In doing so, JeepyTA credits students who introduced specific points in order to give the instructor and TAs a sense of specific student participation as well as overall trends (see Figure 9.5). The purpose of these summaries is not to replace reading or participating in discussions but to provide an additional layer of support in organising and reflecting on what was discussed. Students are still expected to engage in the full conversation, but the summary can help identify patterns, highlight areas of agreement and disagreement, and uncover questions that might require further discussion. In other words, instead of replacing direct engagement, the summary function can serve as a tool to make the overall direction of conversations more accessible.

In a Games and Learning course offered in Spring 2024, JeepyTA was used as a brainstorming partner in two play journal assignments, where students consulted with JeepyTA to propose educational uses for classic games and Minecraft. A play journal is a structured reflective assignment in which students document and analyse their gameplay experiences to critically engage with classic and contemporary video games by examining their design, narrative, and educational potential. These journals help students develop knowledge of a game through both direct and vicarious experiences, providing a baseline understanding that enables them to propose meaningful educational applications. This structured reflection also prepares students with foundational knowledge of the game, which they can draw upon when consulting with JeepyTA – thus streamlining the labour-intensive process of game analysis and educational integration. While recent initiatives have aimed to make the repurposing of commercial entertainment games for education more accessible, the process still requires significant human effort. This poses challenges for students new to game-based learning and for instructors with limited resources to support student experimentation through direct experience (e.g. playing the game, learning through trial and error) or vicarious experience (e.g. watching YouTube videos, learning from colleagues). JeepyTA offers a solution by enabling users with varying levels of technological, pedagogical, and content knowledge to generate ideas, helping them explore how a game can be adapted for specific educational contexts and learning goals. Students can be guided in developing familiarity with a game and using it in dialogue with JeepyTA (See Figure 9.6).

In a course on cultural foundations for teaching and learning offered in Fall 2024, JeepyTA interacted with students through personas, discussing scenarios and stories related to their cultural experiences with teaching and learning. The persona prompts for JeepyTA consisted of three main components: persona description, situation, and instruction. The persona description section defines the persona’s name and role (e.g. Felipe, a teacher educator), the context in which the persona operates (e.g. teaching elementary education majors), and the personal and cultural background that reveals their funds of knowledge, such as home language, family activities, cultural rituals, and hobbies. The situation section contains the persona’s role in the interaction (e.g. providing feedback, answering questions, mentoring), the task or topic being discussed (e.g. reviewing drafts on a specific subject), and the participants in the conversation (e.g. the persona interacting with a preservice teacher). Finally, the instruction section provides information regarding the tone of voice (e.g. formal, informal, supportive, critical), the level of detail required in responses (e.g. detailed feedback with personal experiences), and specific elements to include, such as relevant examples. For each persona defined on the forum, a separate sub-forum category was created using the persona's name. In these categories, JeepyTA responded while acting as the corresponding persona. Students were informed about the personas and instructed that JeepyTA would post there as the persona indicated by the category name. Since the main goal of this course is to situate novice preservice teachers in culturally relevant and sustaining teaching pedagogies, the persona descriptions intentionally emphasise the cultural and linguistic backgrounds to avoid generating general responses that do not centre specific groups of learners. 
Thus, the personas were developed based on the lived experiences of four individuals who were members of historically underrepresented groups whom the team interviewed (e.g. Mexican American, Hmong American). These individuals also reviewed the draft persona descriptions to ensure that their identities were accurately portrayed and to avoid racial essentialisation. For instance, one persona, ‘Claire,’ who identified as Hmong-American, shared personal stories related to food, spirituality, and family history, providing preservice teachers with concrete examples that fostered a deeper understanding of Hmong learners. Additionally, the personas guided preservice teachers in understanding concrete ways to incorporate funds of knowledge into classroom settings. For example, the persona ‘Felipe’ suggested specific strategies for adapting class materials to align with Mexican-American families, such as incorporating family tree activities and introducing home craft projects (see Figure 9.7). These approaches offered practical and culturally responsive methods for connecting multicultural perspectives to classroom instruction.
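The three-part persona prompt described above (persona description, situation, instruction) could be assembled along the following lines. This is a sketch under stated assumptions: the `PersonaPrompt` class, the section labels, and the bracketed placeholder are hypothetical, and the wording only paraphrases the examples given in the text rather than reproducing the actual prompts used in the course.

```python
from dataclasses import dataclass


@dataclass
class PersonaPrompt:
    """Holds the three components of a persona prompt."""
    description: str   # name, role, context, personal and cultural background
    situation: str     # persona's role in the interaction, topic, participants
    instruction: str   # tone, level of detail, elements to include

    def render(self):
        """Join the three sections into a single system prompt."""
        return "\n\n".join([
            "PERSONA DESCRIPTION\n" + self.description,
            "SITUATION\n" + self.situation,
            "INSTRUCTION\n" + self.instruction,
        ])


# Placeholder content based on the examples in the text; the background
# details would come from the reviewed persona descriptions.
felipe = PersonaPrompt(
    description=(
        "You are Felipe, a teacher educator who teaches elementary "
        "education majors. [Reviewed personal and cultural background here.]"
    ),
    situation=(
        "You are providing feedback on a draft written by a preservice teacher."
    ),
    instruction=(
        "Use a supportive tone, give detailed feedback that draws on personal "
        "experiences, and include relevant examples."
    ),
)

system_prompt = felipe.render()
```

Keeping the components as separate fields makes it easier for the individuals behind each persona to review the description section on its own, as the team did.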

One use case being piloted in Spring 2025 is the generation of discussion prompts to start weekly discussions based on the assigned readings. The goal is to provide a foundation for meaningful discussions while maintaining relevance to the course objectives. When generating the discussion prompt, JeepyTA extracts core arguments, methodologies, and debates from the week’s readings. It identifies recurring themes, unresolved questions, or contrasting viewpoints and frames them in a way that encourages meaningful discussion. If students are expected to relate the reading material to their own research, JeepyTA includes questions that prompt reflection on personal experiences or future applications. If the goal is to explore methodological issues, it focuses on the strengths, limitations, and assumptions underlying the methods presented in the readings (see Figure 9.8). Some prompts JeepyTA suggests may not be immediately useful. For example, it has generated questions that are overly broad and more at the level of the entire course than a specific week’s content. Other discussion questions generated may be too complex and require extensive background knowledge or additional explanation before students can engage with them. A highly detailed methodological critique, for example, may be difficult to address within the scope of a discussion forum. Therefore, instructors and TAs have reviewed all of the discussion prompts before making them visible to students. JeepyTA’s suggestions have provided a starting point that allows instructors and TAs to refine the wording, adjust the focus, or simplify overly technical questions to improve understandability. As such, JeepyTA does not replace instructor or TA expertise in orchestrating discussions, but rather, it streamlines the process by offering an initial draft that helps structure each week’s forum.

JeepyTA was first implemented in Fall 2023 and covered use cases described above. To understand how students viewed the virtual teaching assistant, we distributed a voluntary end-of-semester survey, approved as exempt by the university’s Institutional Review Board, and clearly stated that participation was optional and would not affect grades. Students provided informed consent before answering 13 multiple-choice questions. The survey asked about specific aspects of JeepyTA’s performance, including how quickly and accurately it responded to questions, how clearly and professionally it communicated, and how well it supported student learning, development, and motivation, compared to a human TA. Response options ranged from 1 ("Human TA is significantly better") to 5 ("AI TA is significantly better"). We used one-sample t-tests to check whether the average score for each question differed from the neutral midpoint of 3. This allowed us to measure whether students viewed JeepyTA as better or worse than a human TA. Results showed that students rated JeepyTA as comparable to a human TA in several areas, including the speed and clarity of its replies, the accuracy and professionalism of its responses, its ability to support learning without giving away answers, and the overall usefulness and quality of its feedback. However, students rated it lower in three areas: offering useful ideas, supporting student development, and motivating students. We also evaluated JeepyTA’s impact on when responses were available to students. In two identical courses offered one year apart, taught by the same professor at the same institution and involving comparable student groups, a difference emerged in response times to students’ queries. During the earlier term, when JeepyTA was not in use, the instructor and the TAs posted 153 responses, with a median response time of 7.09 hours. 
In the following term of Fall 2023, after JeepyTA was introduced, course staff posted 136 responses, and the median response time dropped to 2.23 hours, a statistically significantly lower time.
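Comparing the mean rating for a survey question against the fixed neutral midpoint of 3 amounts to a one-sample t statistic. The sketch below shows the calculation; the `one_sample_t` helper and the ratings are hypothetical and are not data from the study.

```python
import math
import statistics


def one_sample_t(ratings, midpoint=3.0):
    """Return (t statistic, degrees of freedom) for H0: mean == midpoint."""
    n = len(ratings)
    mean = statistics.mean(ratings)
    sd = statistics.stdev(ratings)  # sample standard deviation (n - 1 denominator)
    t = (mean - midpoint) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical ratings on the 1-5 scale for one survey question.
ratings = [2, 3, 3, 4, 2, 3, 2, 1, 3, 2]
t_stat, df = one_sample_t(ratings)
```

A negative statistic indicates a mean below the midpoint (leaning towards the human TA) and a positive one a mean above it. The p-value would then be read from a t distribution with `df` degrees of freedom, for example with `scipy.stats.ttest_1samp`.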

Among 89 posts where JeepyTA attempted to answer student inquiries, 22 responses received approval. The AI generated replies in approximately 40 seconds, and course staff approved them within an average of 38 minutes. Because JeepyTA was able to handle these queries, even manually written responses were faster. In the JeepyTA semester, the median human response time was 4.14 hours, statistically significantly faster than the 7.09-hour median time observed in the prior semester. This suggests that JeepyTA improved the efficiency of even fully-human responses, likely because the instructor and TAs had more time to address tasks that deserved more of their attention. Another of JeepyTA’s intended purposes was to support instructors and TAs outside standard working hours. In the term prior to its introduction, course staff posted 62% of replies outside regular U.S. business hours (after 5 p.m. and before 9 a.m.). In the term when JeepyTA was available, this proportion was 60%, not statistically significantly different. However, there appeared to be a difference in the proportion of responses posted during weekends. In the prior semester, instructors posted 10% of all replies on weekends. After JeepyTA was introduced, this number increased to 29% of the total number of replies (including JeepyTA posts approved or edited by the instructors). This increase was statistically significant, suggesting that course staff were better able to focus their work time even on weekends, following the introduction of JeepyTA. Work is currently ongoing to study the impact of JeepyTA’s feedback on student essays (3b). In our initial work, we have found that semester-on-semester, students receive statistically significantly higher assignment grades (according to an independent grader) after receiving JeepyTA’s feedback – going from an average of 64% of students receiving an A or an A+ on their final submission to 95% of students receiving an A or A+. 
In follow-up work, we are investigating whether students specifically fix the issues identified by JeepyTA in that same essay, and whether they make the same mistakes in subsequent essays (including in a different class where JeepyTA is also offered). A study on JeepyTA’s brainstorming support functionality (3f) revealed that JeepyTA helped students generate a higher volume of ideas (averaging 2.78 per student compared to 1.7 for student-led ideation) and increased the production of fully formed, detailed concepts. However, this came with important trade-offs: JeepyTA-driven ideas often showed thematic overlap, with common suggestions like "teamwork and collaboration" appearing across multiple students' work, while student-driven ideas exhibited greater diversity. In analysing students’ descriptions of their process of using JeepyTA, the researchers identified five distinct patterns: 36% of students found the suggestions insightful and aligned with their own ideas, 18% acknowledged JeepyTA's role in idea generation but did not say if it was actually helpful in doing so, 18% referenced using JeepyTA but did not say how/if its use was connected to their proposed ideas, 18% made no reference to JeepyTA, and 13% critically evaluated and built upon JeepyTA’s recommendations. Notably, 68% of students proposed multiple educational applications spanning knowledge types essential for 21st-century learning: metaknowledge (collaboration, problem-solving), foundational knowledge (mathematics, history, computer science), and humanistic knowledge (digital citizenship, ethical awareness). Especially novel recommendations emerged when students integrated insights from readings, game experiences, and their domain expertise, suggesting that JeepyTA works best as a brainstorming tool when students already possess foundational knowledge they can build upon.
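The timing measures reported earlier in this passage (median response time, share of replies outside 9 a.m. to 5 p.m., and share of replies on weekends) can be derived from forum timestamps roughly as follows. The log entries and helper names are hypothetical, and the sketch assumes all timestamps are already in the course's local time zone.

```python
from datetime import datetime
from statistics import median


def hours_between(asked, answered):
    """Elapsed time between a question and its reply, in hours."""
    return (answered - asked).total_seconds() / 3600


def is_outside_business_hours(ts):
    """True if the reply was posted before 9 a.m. or at/after 5 p.m."""
    return ts.hour < 9 or ts.hour >= 17


def is_weekend(ts):
    """True for Saturday (weekday 5) or Sunday (weekday 6)."""
    return ts.weekday() >= 5


# Hypothetical (question_time, reply_time) pairs; NOT data from the study.
log = [
    (datetime(2023, 10, 2, 10, 0), datetime(2023, 10, 2, 12, 30)),  # Monday
    (datetime(2023, 10, 3, 16, 0), datetime(2023, 10, 3, 20, 0)),   # Tuesday evening
    (datetime(2023, 10, 6, 22, 0), datetime(2023, 10, 7, 11, 0)),   # answered Saturday
    (datetime(2023, 10, 8, 9, 0), datetime(2023, 10, 8, 9, 45)),    # Sunday
]

median_hours = median(hours_between(q, r) for q, r in log)
outside_share = sum(is_outside_business_hours(r) for _, r in log) / len(log)
weekend_share = sum(is_weekend(r) for _, r in log) / len(log)
```

The median is used rather than the mean because a few very slow replies would otherwise dominate the summary.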

At the Indian Institute of Technology Kanpur, for instance, researchers have piloted AI-augmented TAs for an introductory computer science course, where student-to-instructor ratios are often too high for human instructors or TAs to provide individual guidance at scale (Ahmed, 2025[55]). In response to this situation, the team integrated an AI agent into Prutor, a web-based programming platform used by students to submit solutions for C programming assignments. When a student’s program failed to pass the instructor-defined test cases, they could request assistance by clicking a “Get Help” button within the platform. This action triggered a feedback request that compiled four key inputs and sent them to GPT-4 Turbo via an internal API: the problem description, the student’s buggy code, the test case results, and an optional message written by the student to describe their confusion or ask a specific question. Using this information, GPT-4 Turbo generated targeted feedback linked to specific lines in the student's code, highlighting the exact locations of potential errors and explaining what may have gone wrong. The output was routed to a centralised dashboard, where human TAs could review the AI’s draft response, make edits, add notes, or reject it entirely before sending the final feedback back to the student through the same interface. The AI agent was evaluated against two other conditions: one in which human TAs provided all feedback without any AI support, and another in which feedback generated by GPT-4 Turbo was sent directly to students without human review. Researchers examined how these feedback methods affected feedback quality (measured through expert evaluations), TA efficiency (measured through response times recorded in system logs), and student performance (measured by whether final code submissions passed all instructor-defined test cases). In addition, students rated each piece of feedback based on helpfulness, clarity, and timeliness using built-in rating tools on the platform. Although AI-generated feedback was often rated favourably by students – particularly for its detailed explanations – these positive perceptions did not consistently lead to improved performance. Students receiving AI-assisted or fully automated feedback were not significantly more likely to complete the assignments successfully, and in many cases, manual TA feedback led to faster and more effective problem resolution. One issue observed by the researchers was that some TAs using AI-generated feedback forwarded it without making necessary corrections, even when the output contained inaccuracies or hallucinations. This appeared to reflect a tendency among certain TAs to rely too heavily on the AI drafts instead of critically evaluating their quality. In contrast, TAs working without AI support often provided responses which highlighted the immediate next step or pinpointed the specific source of the error. As a result, students in the manual TA group were, in some cases, able to resolve issues more efficiently, despite often receiving shorter and less detailed feedback.
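The “Get Help” flow described above bundles four inputs into one request and then passes the model's draft through TA review. The sketch below illustrates that shape only: the field names, the `build_payload` and `review` helpers, and the sample values are assumptions for illustration, not the actual Prutor or internal API schema.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class FeedbackRequest:
    """The four inputs compiled when a student asks for help."""
    problem_description: str
    student_code: str
    test_results: List[dict]
    student_message: Optional[str] = None  # optional note from the student


def build_payload(request):
    """Serialise the request as JSON, omitting the message if none was given."""
    data = asdict(request)
    if data["student_message"] is None:
        del data["student_message"]
    return json.dumps(data, indent=2)


def review(draft, action, edited_text=None):
    """Apply a TA decision to an AI draft; returns the text to send, or None."""
    if action == "approve":
        return draft
    if action == "edit":
        return edited_text
    if action == "reject":
        return None
    raise ValueError(f"unknown review action: {action}")


# Hypothetical request for a C assignment.
request = FeedbackRequest(
    problem_description="Print the sum of two integers read from input.",
    student_code='int main() { int a, b; scanf("%d %d", &a, &b); printf("%d", a - b); }',
    test_results=[{"input": "2 3", "expected": "5", "actual": "-1", "passed": False}],
)
payload = build_payload(request)
sent = review("Check the operator on the printf line.", "approve")
```

Making rejection and editing explicit outcomes reflects the study's finding that drafts forwarded without correction were a source of inaccurate feedback.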

Another example comes from Czechia, where a GPT-3-based conversational chatbot named Alex was deployed in a university-level English course (Polakova and Klimova, 2024[56]). Alex is a web-based application that combines multiple AI models: GPT-3 is used to generate natural language responses, while Gramformer and T5 are applied to detect and correct grammatical errors. At the beginning of each chat session, GPT-3 generates a topic-specific opening question based on preselected weekly themes. Students then respond freely in English. Their input is processed by Gramformer and T5 to identify grammatical mistakes. When an error is detected, the chatbot enters a correction phase, during which GPT-3 provides a corrected version of the sentence along with an explanation in natural language. The chatbot also allows users to rate the feedback as either "good" or "bad". Over a four-week period, students engaged with Alex in simulated dialogue sessions and received real-time feedback. Though the topic of each session starts with a fixed opening, users can take the conversation in any direction. To constrain the types of interactions, users were limited to one session per day and three per week, with each session requiring a minimum of 1 000 characters of typed input. After completing the four-week programme, students filled out a questionnaire survey about their experience. Analysis of student feedback revealed that learners responded positively to several aspects of the chatbot. Students noted that the chatbot asked clear, easy-to-understand questions and responded quickly, which helped keep the conversation at a natural pace. Many also appreciated the flexibility to practice outside the classroom. The option to access conversation practice on their own helped them gain confidence. According to the survey, 88% of students reported they did not feel stressed using the chatbot, and several commented that the experience felt like chatting with a real person. 
In terms of learning gains, pre- and post-tests focusing on grammar and vocabulary showed measurable improvements. Upper-intermediate students improved their test scores from about 59% to 75%, while advanced students increased from 80% to 90%. At the same time, students also reported several limitations of the chatbot that affected their overall experience. Some participants encountered technical problems, such as system lags and incomplete replies from the chatbot, which disrupted the flow of conversation. Others pointed out that certain responses felt repetitive or too limited in variation, which reduced the usefulness of later sessions. Survey results also showed that the chatbot failed to increase motivation or encourage continued use: 74% of students did not feel more motivated to learn English as a result of using the chatbot, and 79% said they would rather use other tools like Duolingo or talk to native speakers instead. Though different in design and pedagogical goals, these examples share JeepyTA’s aim of delivering course-aligned, scalable support. Even though these AI-powered teaching assistants differ from JeepyTA in how they function, the courses they support, the tasks they were assigned, and in the regional infrastructure of the learning environment, similar benefits and challenges can be noted.
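The usage constraints placed on Alex sessions (one session per day, three per week, and at least 1 000 characters of typed input per session) can be expressed as simple checks. This is a sketch only: the function names are hypothetical, and treating a "week" as a calendar week starting on Monday is an assumption, since the study description does not specify how weeks were counted.

```python
from datetime import date, timedelta

MIN_CHARS = 1000    # minimum typed input per session
MAX_PER_DAY = 1     # sessions allowed per day
MAX_PER_WEEK = 3    # sessions allowed per week


def week_start(d):
    """Monday of the calendar week containing d (assumed week boundary)."""
    return d - timedelta(days=d.weekday())


def can_start_session(today, past_sessions):
    """True if a new session today stays within the daily and weekly limits."""
    same_day = sum(1 for d in past_sessions if d == today)
    same_week = sum(1 for d in past_sessions if week_start(d) == week_start(today))
    return same_day < MAX_PER_DAY and same_week < MAX_PER_WEEK


def session_complete(typed_text):
    """True once the student has typed the minimum number of characters."""
    return len(typed_text) >= MIN_CHARS


# Hypothetical history: sessions on a Monday and a Tuesday.
history = [date(2024, 3, 4), date(2024, 3, 5)]
allowed_same_day = can_start_session(date(2024, 3, 5), history)
allowed_next_day = can_start_session(date(2024, 3, 6), history)
```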

This section has examined how AI Teaching Assistants, particularly those driven by generative AI, can expand and support the traditional roles played by human TAs and instructors in higher education. We began by considering the foundational role that human TAs play, along with the logistical and pedagogical challenges they often encounter, such as balancing workload with their own academic commitments. Against this backdrop, we reviewed the emergence of AI TAs – tools that are designed to automate routine administrative tasks, deliver real-time student support, and provide timely formative feedback at scale in ways that complement and extend human capabilities. Rather than viewing AI TAs as replacements for human TAs, we argue for a model of augmentation – one where AI tools provide support and feedback that is otherwise infeasible for human instructors and TAs alone: support and feedback that is immediate, personalised, detailed, and available around the clock. To ground these ideas, we discussed JeepyTA – a generative AI system developed by the Penn Center for Learning Analytics at the University of Pennsylvania. Deployed across multiple graduate-level courses and multiple institutions and integrated into course discussion forums, JeepyTA supports a variety of instructional needs. These include answering logistical queries, providing formative feedback on essays, assisting students with debugging their code, stimulating creative ideation, summarising discussion posts, and suggesting new discussion prompts for deeper engagement. By embedding course materials and rubrics into an LLM-based system, JeepyTA demonstrates how AI TAs can deliver context-aware and curriculum-aligned responses at scale. Throughout the chapter, we also discussed key design and implementation considerations – such as prompt design, the need for human oversight and review, consideration of ethics and bias, and alignment with policy. 
These factors are essential for ensuring that AI TAs operate responsibly, transparently, and in service of equitable learning outcomes. Ultimately, the experiences documented here suggest that well-designed AI TAs can ease pressure on human TAs and academic staff, enhance student engagement, and potentially improve the quality of learning experiences. Importantly, their effectiveness depends not only on the technical sophistication of generative AI, but on careful integration into pedagogical practices. When thoughtfully deployed, AI Teaching Assistants can help institutions create more scalable, responsive, and personalised educational ecosystems – supporting students, TAs, and instructors in new and meaningful ways.
