Simon Willison / Simon Willison's Weblog:
DeepMind: video models like Veo 3 could become general-purpose foundation models for vision, like LLMs for text, using zero-shot “chain-of-frames” reasoning — LLMs took the ability to predict the next token and turned it into general-purpose foundation models for all manner …