Pinned post

As inference splits into prefill and decode, Nvidia's Groq deal could enable a "Rubin SRAM" variant optimized for ultra-low latency agentic reasoning workloads (Gavin Baker/@gavinsbaker)

19 March 2025

OpenAI launches o1-pro, which uses more compute than o1 for "consistently better responses", to some devs for $150/1M input tokens and $600/1M output tokens (Kyle Wiggers/TechCrunch)

Kyle Wiggers / TechCrunch:
OpenAI has launched a more powerful version of its o1 "reasoning" AI model, o1-pro, in its developer API.

