Entries with tag comparison

Title | URL | Pop
List of free coding LLMs | https://github.com/vava-nessa/free-coding-models | 1
Fuel Prices in EU | https://www.fuel-prices.eu | 1
Why isn't AMD's MI300X competitive? | https://newsletter.semianalysis.com/p/mi300x-vs-h100-vs-h200-benchmark-part-1-training | 1
Can I Run AI locally? | https://www.canirun.ai | 1
Self-Hosted LLM Leaderboard | https://onyx.app/self-hosted-llm-leaderboard | 1
Show HN: A real-time strategy game that AI agents can play | https://llmskirmish.com | 1
AI Benchy - private AI benchmark | https://aibenchy.com | 1
"Car Wash" test with 53 models | https://opper.ai/blog/car-wash-test | 1
Epoch.ai - tracking AI enhancements | https://epoch.ai | 1
Metabenchmark of the LLMs | https://metabench.organisons.com | 1
Openrouter model rankings | https://openrouter.ai/rankings | 1
SWE-rebench: A Continuously Evolving and Decontaminated Benchmark for Software Engineering LLMs | https://swe-rebench.com | 1
Poker Tournament for LLMs | https://pokerbattle.ai/event | 1
I built the same app 10 times: Evaluating frameworks for mobile performance | https://www.lorenstew.art/blog/10-kanban-boards | 1
Show HN: I tracked the adoption of AI coding extensions in VS Code since 2022 | https://bloomberry.com/coding-tools.html | 1
Altindex - Alternative financial data | https://altindex.com | 1
Koyfin - financial data | https://www.koyfin.com | 1
GuruFocus - financial data | https://www.gurufocus.com | 1
Stockanalysis - financial data | https://stockanalysis.com | 1
Finviz - financial data | https://finviz.com | 1
Finbox - financial data | https://finbox.com | 1
EQ-Bench - AI writing benchmarks | https://eqbench.com | 1
Rust template engine comparisons by Askama | https://github.com/askama-rs/template-benchmark | 1
LLM comparison in Register-Transfer Level generation for hardware design | https://huggingface.co/spaces/HPAI-BSC/TuRTLe-Leaderboard | 1
Artificial Analysis LLM Leaderboard | https://artificialanalysis.ai/leaderboards/models | 1
@techfren Coding LLM Benchmarks | https://leaderboard.techfren.net | 0
CadEval - CAD performance of the LLMs | https://willpatrick.xyz/cadevalresults_20250422_095709 | 0
LiveSWEBench - A Challenging, Contamination-Free Benchmark for AI Software Engineers | https://liveswebench.ai | 1
MathArena: Evaluating LLMs on Uncontaminated Math Competitions | https://matharena.ai | 1
Vellum LLM Leaderboard | https://www.vellum.ai/llm-leaderboard | 1
ProLLM Leaderboards | https://www.prollm.ai/leaderboard | 1
Humanity's Last Exam - AI benchmark | https://lastexam.ai | 0
BigCodeBench Leaderboard - Evaluates LLMs with practical and challenging programming tasks | https://bigcode-bench.github.io | 0
Open LLM Leaderboard | https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard | 1
LLM Explorer - A Curated Large Language Model Directory and Analytics | https://llm.extractum.io | 0
Shadeform - compare GPUs on demand services | https://www.shadeform.ai | 1
EVKX - Electric Vehicles information site | https://evkx.net | 0
Tranco - A Research-Oriented Top Sites Ranking Hardened Against Manipulation | https://tranco-list.eu | 1
Claude 3.5 Sonnet vs GPT-4o: Does Claude outperform GPT-4o? | https://blog.getbind.co/2024/06/21/claude-3-5-sonnet-does-it-outperform-gpt-4o?ref=rc | 0
SWE-bench - Can Language Models Resolve Real-World GitHub Issues? | https://www.swebench.com | 1
NYT Connections LLM Benchmark | https://github.com/lechmazur/nyt-connections | 1
StackUnseen AI benchmark | https://prollm.toqan.ai/leaderboard/stack-unseen | 0
Database-like ops benchmark | https://h2oai.github.io/db-benchmark | 0
LiveCodeBench - AI Benchmark | https://livecodebench.github.io/leaderboard.html | 1
Artificial Analysis - AI comparison | https://artificialanalysis.ai | 1
SciCode AI benchmark | https://github.com/scicode-bench/SciCode | 1
SEAL LLM Leaderboards | https://scale.com/leaderboard | 1
DB Performance tests | https://github.com/MaibornWolff/database-performance-comparison | 0
ARC Prize for AGI | https://arcprize.org/blog | 1
Aider AI leaderboard | https://aider.chat/docs/leaderboards | 2
LLM benchmark | https://dubesor.de/benchtable | 1
OpenRouter - AI router | https://openrouter.ai | 2
List of certain products | https://www.productchart.com/ | 0
LiveBench - A Challenging, Contamination-Free LLM Benchmark | https://livebench.ai | 1
LMSYS Chatbot Arena | https://lmarena.ai | 1