Thing | Tags | Pop | Created At
Poker Tournament for LLMs
1202510284218Creator100000
I built the same app 10 times: Evaluating frameworks for mobile performance
1202510284206Creator100000
Show HN: I tracked the adoption of AI coding extensions in VS Code since 2022
1202510142260Creator100000
Altindex - Alternative financial data
1202510122092Creator100000
Koyfin - financial data
1202510122086Creator100000
GuruFocus - financial data
1202510122085Creator100000
Stockanalysis - financial data
1202510122084Creator100000
Finviz - financial data
1202510122083Creator100000
Finbox - financial data
1202510122082Creator100000
EQ-Bench - AI writing benchmarks
120250907561Creator100000
Rust template engine comparisons by Askama
120250817543Creator100000
LLM comparison in Register-Transfer Level generation for hardware design
120250817541Creator100000
Artificial Analysis LLM Leaderboard
120250809536Creator100000
@techfren Coding LLM Benchmarks
020250425470Creator100000
CadEval - CAD performance of the LLMs
020250424468Creator100000
LiveSWEBench - A Challenging, Contamination-Free Benchmark for AI Software Engineers
120250402453Creator100000
MathArena: Evaluating LLMs on Uncontaminated Math Competitions
120250402452Creator100000
Vellum LLM Leaderboard
120250206397Creator100000
ProLLM Leaderboards
120250206396Creator100000
Humanity's Last Exam - AI benchmark
020250203391Creator100000
BigCodeBench Leaderboard - Evaluates LLMs with practical and challenging programming tasks
020250126387Creator100000
Open LLM Leaderboard
120250124376Creator100000
LLM Explorer - A Curated Large Language Model Directory and Analytics
020250123370Creator100000
Shadeform - compare GPUs on demand services
120250118363Creator100000
EVKX - Electric Vehicle information site
020241110258Creator100000
Tranco - A Research-Oriented Top Sites Ranking Hardened Against Manipulation
120241105256Creator100000
Claude 3.5 Sonnet vs GPT-4o: Does Claude outperform GPT-4o?
020241103254Creator10000
SWE-bench - Can Language Models Resolve Real-World GitHub Issues?
120241031233Creator100000
NYT Connections LLM Benchmark
120241030232Creator100000
StackUnseen AI benchmark
020241027231Creator100000
Database-like ops benchmark
020241024220Creator100000
LiveCodeBench - AI Benchmark
120241024219Creator100000
Artificial Analysis - AI comparison
120241024218Creator100000
SciCode AI benchmark
120241008191No Creator100
SEAL LLM Leaderboards
120241008190No Creator000
DB Performance tests
020241006179Creator100000
ARC Prize for AGI
12024092892Creator10000
Aider AI leaderboard
22024092780Creator10000
LLM benchmark
12024091543Creator10000
OpenRouter AI router
22024090162Creator10000
List of certain products
02024090118Creator10000
LiveBench - A Challenging, Contamination-Free LLM Benchmark
12024090113Creator10000
LMSYS Chatbot Arena
12024090160Creator10000