Multimodal AI and Autonomous Agents: The Next Frontier in Publishing
Industry experts from FT Strategies discuss the transformative potential of multimodal AI and AI agents for publishers in 2024.

In a recent episode of the “Newsroom Robots” podcast, hosted by Nikita Roy, a data scientist and media entrepreneur, the conversation turned toward the future of artificial intelligence in journalism. The discussion featured insights from Alia Itskowitz and Sam Gold of FT Strategies, who delved into the burgeoning field of multimodal AI and the role of autonomous AI agents in revolutionizing the publishing industry.
Multimodal AI, which refers to AI systems capable of understanding and generating content across different modes such as text, images and audio, is poised to become a game-changer for publishers.
Prompted by advances such as GPT vision and by articles on GPT-4V’s capabilities, Gold highlighted the untapped potential of multimodal AI to interpret images and enrich storytelling. He suggested that while the industry is still grappling with text-to-text models, image-to-text applications offer a fresh frontier for innovation.
For instance, publishers could use these models to automatically generate captions for images or translate text within images to reach diverse audiences.
“I think it always comes back to the same thing in our industry, which is are you offering people content that’s truly unique and differentiated or not?” Gold said. “And that’s always at the heart of most of the problems that publications are facing, whether it’s how do I monetize, how do I transform, how do I survive?”
The podcast also touched on how multimodal AI could lead to differentiated user interfaces in news platforms. Gold emphasized the importance of publications becoming unique destinations that offer personalized experiences. He mentioned resources like Google’s AI UX lab (PAIR), which are developing guidelines for responsible AI use in crafting new user experiences.
Itskowitz underscored the enduring importance of quality journalism amidst the integration of AI. She argued that while AI can enhance the distribution of journalism, it cannot replace the human element essential for breaking stories and creating unique content.
“I think we can’t overemphasize this point that although we’re talking a lot about kind of the UX and the format and how AI supports that,” said Itskowitz, “there’s still this massive, overwhelming amount of value that’s created by the human journalists at the end of the day, and this is like Alia says, about breaking stories, about having conversations with people, being on the ground and understanding those emerging stories.”
The conversation then shifted to AI agents—intelligent systems designed to perform autonomous actions to achieve specific goals. Gold explored their potential impact on product evolution within the publishing industry. He cited Bill Gates’ view that agents could be how everyone interacts with software in the future, suggesting a paradigm shift from traditional websites to personalized AI agents as user destinations.
“I think we are going to see more and more of these kind of agent-based systems where you can give really complex tasks like go and scrape some prices,” he said. “Or other examples would be go and build me a website or go and build me this app. And it goes and breaks down that task into different steps, and then goes and uses different tools to try and work through that task.
“It’s promised to be really kind of powerful tools, obviously quite complex, but I think, like with other generative AI, it’s only going to get easier and easier and easier to work with.”
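The loop Gold describes, in which an agent breaks a task into steps and then calls a tool for each step, can be sketched roughly as follows. This is a minimal illustration, not any vendor's implementation: the tool names, the sample prices and the hard-coded `plan` function are all hypothetical stand-ins, and a real agent would ask a language model to produce the plan.

```python
from typing import Any, Callable, Dict, List, Tuple


def fetch_prices(product: str) -> List[float]:
    """Hypothetical tool: pretends to scrape prices, returns fixed sample data."""
    sample = {"newspaper subscription": [9.99, 12.50, 8.00]}
    return sample.get(product, [])


def cheapest(prices: List[float]) -> float:
    """Hypothetical tool: picks the lowest price from a list."""
    return min(prices)


# Registry mapping tool names to callables the agent is allowed to use.
TOOLS: Dict[str, Callable[[Any], Any]] = {
    "fetch_prices": fetch_prices,
    "cheapest": cheapest,
}


def plan(task: str) -> List[str]:
    """Stand-in planner: a real agent would ask an LLM to decompose the task."""
    return ["fetch_prices", "cheapest"]


def run_agent(task: str, product: str) -> List[Tuple[str, Any]]:
    """Run each planned step, feeding one tool's output into the next."""
    result: Any = product
    log: List[Tuple[str, Any]] = []
    for step in plan(task):
        result = TOOLS[step](result)
        log.append((step, result))
    return log


if __name__ == "__main__":
    for step, output in run_agent("find the cheapest price", "newspaper subscription"):
        print(step, output)
```

The planner here is fixed so the example stays self-contained; the design point is only that the agent keeps a list of permitted tools and chains their outputs step by step.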
To stay abreast of rapid advancements in AI, Itskowitz and Gold shared their strategies for keeping informed. They follow thought leaders on platforms like LinkedIn and Medium, attend conferences and engage with peers. They also leverage social media algorithms to curate feeds with high-quality content relevant to their interests.
Gold suggested that it might be time for the industry to start considering these tools more seriously: “So maybe it’s too soon for the industry to start thinking about it, but maybe not. I mean, other people will start to think about these tools, and so it’s potentially something to start putting on your research radar.”
In personal anecdotes, Gold shared his use of AI for coding assistance and an experimental project that sends him daily positive messages generated by an LLM (large language model). Itskowitz discussed her use of generative AI for creating poems and fun projects like an “AI Santa” email graphic. She reflected on the core of journalism, saying, “And at the end of the day, it’s about the quality of the scoops, it’s about finding the news and finding an angle, and that’s really what’s going to keep people coming.”