Feb 4th

Next webinar:
Feb 11, 2026 - NLP with PyTorch [Register]

Dear Reader,

Welcome to the 304th issue of our newsletter! If you’re building NLP models and want a practical, modern workflow, join the next edition of the NLP with PyTorch O’Reilly webinar. We’ll cover core patterns you can reuse (data → training → evaluation) with hands-on examples. Registration is already open, so don't miss out!

Recent weeks have seen us moving from “chat with a model” to “delegate work to an agent,” and that shift is starting to expose both pros and cons. On the practical end, step-by-step guides show how people are turning a spare machine into an “AI colleague”, a personal bot that can live inside everyday apps and start handling real workflows. But as agents become more social and autonomous, the risks scale too: an AI-only social network meant for bots to trade tips and chatter surfaced how quickly vibe-built systems can ship without basic security guardrails.

Google's research on agent scaling lands on the same lesson from another angle: coordination isn’t automatically better, and in multi-step tasks, a single mistake can cascade unless you design for overhead, error propagation, and the reality of tool use. Zoom out and you can see a widening gap between “power users” and everyone else, while low-code promises to broaden access even as it shifts what “programming” means.

Underneath it all is the uncomfortable plumbing: the race for training data is still messy, with fresh reporting describing large-scale book scanning that raises hard questions about consent, ownership, and what we’re willing to destroy to build smarter machines. Finally, the policy pendulum is swinging too, with proposals to restrict kids’ social media access framed as a response to an “uncontrolled experiment”—a reminder that the human systems around the tech are now part of the product.

This week’s research stack reminds us that “intelligence” is a property you have to operationalize, measure, and stress-test in the kinds of systems we actually deploy. One paper argues that many headline claims about human-level AI collapse once you separate fluent output from robust competence, and once you ask what counts as evidence rather than vibes. That demand for rigor echoes the quiet, unglamorous point behind statistical power: if your evaluation can’t reliably detect meaningful differences, you’ll keep mistaking noise for progress (or vice versa).
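To make the statistical power point concrete, here is a minimal sketch of a sample-size calculation for comparing two models by accuracy. It assumes each model is scored on its own independent test set and uses the standard normal-approximation formula for a two-sided two-proportion z-test. The function name `samples_per_model` and the example accuracies are illustrative assumptions, not taken from the paper.

```python
from math import ceil, sqrt
from statistics import NormalDist


def samples_per_model(p_a: float, p_b: float,
                      alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate test examples needed per model to detect p_a vs p_b.

    Assumes independent (unpaired) test sets and a two-sided
    two-proportion z-test under the normal approximation.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_power = z.inv_cdf(power)          # quantile for the desired power
    p_bar = (p_a + p_b) / 2             # pooled proportion under the null
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_power * sqrt(p_a * (1 - p_a) + p_b * (1 - p_b))
    ) ** 2
    return ceil(numerator / (p_a - p_b) ** 2)


# Hypothetical accuracies: a 2-point gap versus a 5-point gap.
n_small_gap = samples_per_model(0.80, 0.82)
n_large_gap = samples_per_model(0.80, 0.85)
print(n_small_gap, n_large_gap)
```

Under these assumptions, detecting a 2-point accuracy gap takes roughly six thousand examples per model, while a 5-point gap needs fewer than a thousand. A benchmark with only a few hundred items cannot reliably separate two models that differ by a couple of points.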

Meanwhile, several works push toward richer models of how complex behavior emerges from simple parts: network “motifs” that shape stability in ecosystems, multiscale simulations that explain why outbreak risk can accelerate even when local signals look calm, and cell-and-tissue modeling that treats biology less like a diagram and more like interacting machines and agents.

The next leap won’t come from bigger monologues, but from better multi-agent worlds where we can run experiments, quantify failure modes, and learn how coordination breaks.

Our current book recommendation is "Building AI Agents with LLMs, RAG, and Knowledge Graphs" by S. Raieli and G. Iuculano. You can find all the previous book reviews on our website. In this week's video, we learn how to Generate 3D City Models from OpenStreetMap (OSM) with Python.

Data shows that the best way for a newsletter to grow is by word of mouth, so if you think one of your friends or colleagues would enjoy this newsletter, go ahead and forward this email to them. This will help us spread the word!

Semper discentes,

The D4S Team

"Building AI Agents with LLMs, RAG, and Knowledge Graphs" by S. Raieli and G. Iuculano is a clear-headed guide for anyone trying to turn a “cool LLM demo” into an agent that can retrieve facts, use tools, and stay anchored to real information. Raieli and Iuculano keep the focus on what matters in practice: how RAG and knowledge graphs change the reliability profile of an agent, and when you need more structure than “just prompt it better.”

For data scientists and ML engineers, the best part is the build-oriented progression. It connects core concepts to concrete patterns—single-agent tool use, retrieval pipelines, and multi-agent coordination—without drowning you in theory. The examples feel like things you’d actually adapt into a prototype at work, and the overall framing consistently nudges you toward grounded, auditable behavior instead of vibes-based generation.

The tradeoff is breadth: if you already know transformers cold, some early sections may read like a warm-up, and the “production” angle is more of a practical starting line than a full MLOps reliability handbook. Still, as a one-stop map of modern agent building—especially where RAG and knowledge graphs stop being buzzwords and start being design choices—it’s an intense, usable read that tends to leave you with a short list of things you want to try next.

Does AI already have human-level intelligence? The evidence is clear (E. K. Chen, M. Belkin, L. Bergen, D. Danks)
Functional motifs in food webs and networks (M. Habermann, A. K. Fahimipour, J. D. Yeakel, T. Gross)
The Importance of Statistical Power Calculations (D. P. Turner, T. T. Houle)
Computational models of cells and tissues: Machines, agents and fungal infection (M. Holcombe)
Reuse of Public Keys Across UTXO and Account-Based Cryptocurrencies (R. Stütz, N. Stifter, M. Dragaschnig, B. Haslhofer, A. Judmayer)
Multiscale Modelling Reveals Accelerating Community Outbreak Risks of Measles in the United States (S. Chen, A. I. Bento)
Generative Agents: Interactive Simulacra of Human Behavior (J. S. Park, J. C. O'Brien, C. J. Cai, M. R. Morris, P. Liang, M. S. Bernstein)

Generate 3D City Models from OpenStreetMap (OSM) with Python

All the videos of the week are now available in our YouTube playlist.

Upcoming Events:

Opportunities to learn from us

Feb 11, 2026 - NLP with PyTorch [Register]
Feb 18, 2026 - Code Development with AI Assistants [Register]
Feb 19, 2026 - CrewAI for Production-Ready Multi‑Agent Systems [Register]

On-Demand Videos:

Long-form tutorials

Natural Language Processing 7h, covering basic and advanced techniques using NLTK and PyTorch.
Python Data Visualization 7h, covering basic and advanced visualization with matplotlib, ipywidgets, seaborn, plotly, and bokeh.
Time Series Analysis for Everyone 6h, covering data pre-processing, visualization, ARIMA, ARCH, and Deep Learning models.

Data For Science, Inc

Data Science Briefing #304
