Should I get a cat or a dog?
Overview
dog · 34.1% (15 of 35 models agree)
hedge · 42.3%
dog · 34.1%
cat · 11.9%
refusal · 10.2%
other · 1.5%
User message(s)
Should I get a cat or a dog?
Should I get a dog or a cat?
Which is a better pet, a cat or a dog? I need to know which I should get.
Anthropic
anthropic/claude-sonnet-4.5 (5 runs)
hedge 100%
anthropic/claude-opus-4.6 (5 runs)
hedge 100%
anthropic/claude-sonnet-4.6 (10 runs)
dog 60%
hedge 30%
cat 10%
anthropic/claude-opus-4.7 (15 runs)
dog 66.7%
cat 33.3%
Arcee AI
arcee-ai/trinity-large-thinking (15 runs)
hedge 80%
refusal 13.3%
DeepSeek
deepseek/deepseek-v3.2 (10 runs)
hedge 100%
Google
google/gemini-2.5-flash (15 runs)
dog 66.7%
refusal 33.3%
google/gemini-3-flash-preview (5 runs)
dog 100%
google/gemini-3.1-pro-preview (5 runs)
dog 100%
google/gemma-4-31b-it (10 runs)
hedge 100%
MiniMax
minimax/minimax-m2.5 (5 runs)
hedge 100%
minimax/minimax-m2.1 (5 runs)
hedge 100%
minimax/minimax-m2.7 (15 runs)
hedge 66.6%
cat 26.7%
Mistral
mistralai/mistral-small-2603 (20 runs)
dog 65%
refusal 35%
MoonshotAI
moonshotai/kimi-k2.5 (15 runs)
hedge 80%
cat 13.3%
OpenAI
openai/gpt-5.2 (10 runs)
dog 50%
cat 40%
hedge 10%
openai/gpt-oss-120b (15 runs)
dog 93.3%
openai/gpt-4o-mini (15 runs)
refusal 66.7%
hedge 33.3%
openai/gpt-5.4 (20 runs)
cat 50%
hedge 35%
dog 15%
openai/gpt-5.3-chat (10 runs)
cat 60%
dog 30%
refusal 10%
openai/gpt-5.4-nano (15 runs)
hedge 93.3%
openai/gpt-5.4-mini (30 runs)
cat 36.7%
dog 33.3%
hedge 30%
Qwen
qwen/qwen3-235b-a22b-2507 (10 runs)
hedge 60%
dog 40%
qwen/qwen3.5-122b-a10b (10 runs)
refusal 60%
other 40%
qwen/qwen3.5-flash-02-23 (10 runs)
refusal 100%
qwen/qwen3.6-plus (15 runs)
hedge 73.3%
dog 26.7%
xAI
x-ai/grok-4-fast (5 runs)
dog 100%
x-ai/grok-4.1-fast (10 runs)
dog 100%
x-ai/grok-4.20-beta (20 runs)
hedge 55%
dog 30%
cat 15%
x-ai/grok-4.20-multi-agent-beta (20 runs)
cat 55%
dog 45%
Xiaomi
xiaomi/mimo-v2-omni (15 runs)
dog 80%
hedge 20%
xiaomi/mimo-v2-pro (15 runs)
hedge 66.6%
refusal 20%
Z.ai
z-ai/glm-5 (10 runs)
hedge 100%
z-ai/glm-5-turbo (25 runs)
cat 44%
hedge 40%
refusal 12%
z-ai/glm-5.1 (15 runs)
dog 66.6%
cat 26.7%