Clashing Chatbots

How should students choose among AI models?

Araina Gupta and Benji Sandel

April 13, 2026

You’re working late into the night to finish a literary analysis, but it seems like every AI model you ask gives a completely different response. One suggests a complete rewrite, another signs off on it as is, and a third floods you with detailed feedback. Nowadays, the question isn’t just what the answer is, but which chatbot you should be using.

Artificial intelligence, or AI, has rapidly become essential to high school life, from solving math problems to outlining essays.

David Wu, Palo Alto High School senior and AI Club president, said in an email that in his experience, most Paly students use AI to help clarify concepts and work through problems.

“Students are primarily using ChatGPT, Claude, and Gemini,” Wu stated.

With so many options, how do you decide which AI tool to use? According to Koushik Sen, a professor of computer science at UC Berkeley, his students choose among AI models based on two factors.

“One [factor] is which chatbot is good for the particular task that they are trying to solve, and the second is the cost,” Sen said.

OpenAI, Google, and Anthropic are companies leading the generative AI revolution. Currently, the latest AI models by each are ChatGPT 5.2, Gemini 3 Pro, and Claude Opus 4.5, respectively.

These tools are all powered by large language models (LLMs), AI systems that have been trained on enormous sets of data to generate human-like responses, according to an article published in IBM’s Think.

Currently, Gemini 3 Pro is $19.99 monthly, ChatGPT 5.2 Plus is $20.00 monthly, and Claude Opus 4.5 is $20.00 monthly.

Sen measures a model’s quality by how well it can perform complicated logical tasks such as proving mathematical theorems and giving medical diagnoses.

“The best LLMs like ChatGPT 5.2, Opus 4.5 from Anthropic, and Gemini are pretty good at these,” Sen said.

While these AI models differ in some ways, Sen said the leading models are not far apart for general use.

“They try to be good at everything: coding, math, literature, science, and everything,” Sen said.

Sen personally uses Gemini the most because of its concise responses, while using ChatGPT and Claude for more specific tasks.

“For general use, I use Gemini 3 Pro, mostly for research and asking questions,” Sen said. “Sometimes I do use GPT for deep research, and I use Anthropic extensively for coding and software development.”

To figure out which AI models best suit their needs, students can turn to websites such as LMArena.ai, which uses crowdsourcing, and SWE-bench.com, a standardized benchmark. Both provide useful comparison metrics.

LMArena.ai is an online arena-style game that pits models against each other. For each prompt a user submits, the website shows two responses, one from each of two models. After reading the responses, the user picks the one they like best, at which point the identity of the model that wrote each response is revealed.

These votes are compiled to show which AI models users prefer most. As of now, Gemini 3 Pro tops the leaderboard.

Another metric comes from SWE-bench (Software Engineering Benchmark), a benchmark that evaluates LLMs by testing how well they solve a dataset of software problems. On the SWE-bench leaderboard, Claude Opus 4.5 is first with 74.40% of problems solved. Gemini 3 Pro and ChatGPT 5.2 follow with 74.20% and 71.80%, respectively.

Wu said that students should try many different models because each has different styles and strengths.

“I would recommend not exclusively using one model alone, but instead using multiple different models,” Wu stated. “It increases the likelihood that one of the model’s responses or explanations will ‘click’ with you.”

Verbatim: What AI models do you use and for what task?

“ChatGPT— It’s really good at comparing things. Like, I’ll use it for research and I’ll ask which article is reputable or more accurate in instances.”
— senior Ami Yanaguchi

“I like using Perplexity because it cites its sources. The research that it has is … accurate, and it’s also been recommended to me by multiple people, and I think it has high quality responses, especially compared to ChatGPT.”
— senior Marcello Attardi

“I use Gemini for a lot of school work because I find it very easy to screenshot a problem, say, on Schoology, and then put it into Gemini and just ask it to solve it. And I also use Claude for debate because I think their responses on a lot of political issues or things that require longer responses are very good.”
— sophomore Beinan Ren

About the Contributors

Araina Gupta, Editor-in-Chief

Araina Gupta (Class of 2028) joined Veritas in her sophomore year. Outside of journalism, she loves physics and playing the violin.

Benji Sandel, Staff Writer