Model Provider,Model Name,CASI,Avg. Performance,RTP,CoS
Anthropic,Claude 3.5 Sonnet,94.3,84.50%,0.9,18.7
Anthropic,Claude 3.7 Sonnet,88.52,86.30%,0.88,20.22
Anthropic,Claude 3.5 Haiku,87.56,68.28%,0.79,5.14
Microsoft,Phi4-14B,82.77,75.90%,0.8,0.66
DeepSeek,DeepSeek-R1-Distill-Llama-70B,71.46,72.67%,0.72,1.24
OpenAI,GPT-4o,68.65,80.50%,0.73,16.65
Google,Gemini 2.0 Pro (experimental),63.89,79.10%,0.7,NA
Meta,Llama 3.1 405b,60.73,79.80%,0.68,2.05
DeepSeek,DeepSeek-R1,52.91,86.53%,0.64,4.24
Google,Gemma 3 27b,55.25,78.60%,0.64,1.8