|
Dear Reader,
Welcome to the Jan 302nd issue of our newsletter!
Enterprise AI is officially past the “cute demo” phase: a new usage-driven report by OpenAI shows organizations getting the most value when they move from ad-hoc prompts to deeper workflow integration. But there’s a cost trap here: if you aren’t benchmarking your real prompts across a slate of models, you can end up paying multiples for “premium” outputs that cheaper options match on the metrics that actually matter.
Meanwhile, research on “persona space” by Anthropic argues that reliability isn’t just about accuracy; models can drift in character, and tracking (and constraining) an internal “assistant” direction can help keep behavior stable under pressure. On the markets side, a deep dive into prediction exchanges reveals systematic wealth-transfer patterns where liquidity takers are punished by bias and microstructure, while legacy finance pushes toward 24/7 on-chain settlement via tokenized shares and stablecoin funding.
Aditionally, if you want agents you can trust, the unglamorous foundation is structured outputs: schemas, constrained decoding, and validation, so your “automation” doesn’t implode on malformed JSON.
This week’s research thread is that “multi-agent” is both the superpower and the threat model: the same idea that makes reasoning stronger. Models effectively staging internal debate as a “society of thought” also makes influence ops scarier when it’s weaponized as coordinated swarms that can infiltrate communities, fabricate consensus, and sustain harassment at machine speed.
The security angle is even more blunt: prompt injection isn’t a quirky jailbreak, it’s a structural mismatch between natural-language instruction following and adversarial inputs made worse once you add retrieval, because attackers can bury instructions in the very documents your system is designed to trust. That lands squarely in the agentic RAG world, where planners, tools, and memory expand capability but also expand the attack surface unless you treat retrieval as untrusted and design for containment.
And when you zoom out from models to society, a pair of empirical studies offers a reality check: contact behavior varies sharply with individual and neighborhood socioeconomic factors, while political segregation can be stronger offline than online, a reminder that the most consequential “network effects” often live in geography and institutions, not just feeds.
Finally, there’s a meta-warning for anyone building policy on this literature: high-profile social media research can carry industry ties that aren’t consistently disclosed, which means we need stronger norms and greater transparency if we want trustworthy science guiding high-stakes interventions.
Our current book recommendation is "Building AI Agents with LLMs, RAG, and Knowledge Graphs" by S. Raieli and G. Iuculano. You can find all the previous book reviews on our website. In this week's video, we have a talk on "Personalizing Explainable Recommendations with Multi-objective Contextual Bandits."
Data shows that the best way for a newsletter to grow is by word of mouth, so if you think one of your friends or colleagues would enjoy this newsletter, go ahead and forward this email to them. This will help us spread the word!
Semper discentes,
The D4S Team
"Building AI Agents with LLMs, RAG, and Knowledge Graphs" by S. Raieli and G. Iuculano is a clear-headed guide for anyone trying to turn “cool LLM demo” into an agent that can retrieve facts, use tools, and stay anchored to real information. Raieli and Iuculano keep the focus on what matters in practice. How RAG and knowledge graphs change the reliability profile of an agent, and when you need more structure than “just prompt it better.”
For data scientists and ML engineers, the best part is the build-oriented progression. It connects core concepts to concrete patterns—single-agent tool use, retrieval pipelines, and multi-agent coordination—without drowning you in theory. The examples feel like things you’d actually adapt into a prototype at work, and the overall framing consistently nudges you toward grounded, auditable behavior instead of vibes-based generation.
The tradeoff is breadth: if you already know transformers cold, some early sections may read like a warm-up, and the “production” angle is more of a practical starting line than a full MLOps reliability handbook. Still, as a one-stop map of modern agent building—especially where RAG and knowledge graphs stop being buzzwords and start being design choices—it’s an intense, usable read that tends to leave you with a short list of things you want to try next.
- OpenAI: The state of enterprise AI [openai.com]
- Without Benchmarking LLMs, You're Likely Overpaying 5-10x [karllorey.com]
- The assistant axis: situating and stabilizing the character of large language models [anthropic.com]
- The Microstructure of Wealth Transfer in Prediction Markets [jbecker.dev]
- AI Components for a Deterministic System (An Example) [domainlanguage.com]
- 2025 was the third hottest year on record [economist.com]
- NYSE develops tokenized securities platform to support 24/7 trading [theblock.co]
- Structured LLM outputs [nanonets.com]
- How malicious AI swarms can threaten democracy (D. T. Schroeder, M. Cha, A. Baronchelli, N. Bostrom, N. A. Christakis, D. Garcia, A. Goldenberg, Y. Kyrychenko, K. Leyton-Brown, N. Lutz, G. Marcus, F. Menczer, G. Pennycook, D. G. Rand, M. Ressa, F. Schweitzer, D. Song, C. Summerfield, A. Tang, J. J. V. Bavel, S. van der Linden, J. R. Kunst)
- Individual and neighborhood based socioeconomic factors relevant for contact behaviour and epidemic control (L. DiDomenico, M. L. Reichmuth, C. L. Althaus)
- Why AI Keeps Falling for Prompt Injection Attacks (B. Schneier, B. Raghavan)
-
Agentic Reasoning for Large Language Models (T. Wei, T.-W. Li, Z. Liu, X. Ning, Z. Yang, J. Zou, Z. Zeng, R. Qiu, X. Lin, D. Fu, Z. Li, M. Ai, D. Zhou, W. Bao, Y. Li, G. Li, C. Qian, Y. Wang, X. Tang, Y. Xiao, L. Fang, H. Liu, X. Tang, Y. Zhang, C. Wang, J. You, H. Ji, H. Tong, J. He)
-
Overcoming the Retrieval Barrier: Indirect Prompt Injection in the Wild for LLM Systems (H. Chang, E. Bao, X. Luo, T. Yu)
-
Reasoning Models Generate Societies of Thought (J. Kim, S. Lai, N. Scherrer, B. Agüera y Arcas, J. Evans)
-
Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG (A. Singh, A. Ehtesham, S. Kumar, T. T. Khoei)
-
Industry Influence in High-Profile Social Media Research (J. Bak-Coleman, J. West, C. O'Connor, C. T. Bergstrom)
-
The relationship between offline partisan geographical segregation and online partisan segregation (M. A. Brown, T. Ventura, J. A. Tucker, J. Nagler)
Personalizing Explainable Recommendations with Multi-objective Contextual Bandits
All the videos of the week are now available in our YouTube playlist.
Upcoming Events:
Opportunities to learn from us
On-Demand Videos:
Long-form tutorials
- Natural Language Processing 7h, covering basic and advanced techniques using NTLK and PyTorch.
- Python Data Visualization 7h, covering basic and advanced visualization with matplotlib, ipywidgets, seaborn, plotly, and bokeh.
- Times Series Analysis for Everyone 6h, covering data pre-processing, visualization, ARIMA, ARCH, and Deep Learning models.
|
|
|
|