Data Science Briefing #304



Feb 4th

Next webinar:
Feb 11, 2026 - NLP with PyTorch [Register]

Dear Reader,

Welcome to the 304th issue of our newsletter! If you’re building NLP models and want a practical, modern workflow, join the next edition of the NLP with PyTorch O’Reilly webinar. We’ll cover core patterns you can reuse (data → training → evaluation) with hands-on examples. Registration is already open, so don't miss out!

Recent weeks have seen us moving from “chat with a model” to “delegate work to an agent,” and that shift is starting to expose both pros and cons. On the practical end, step-by-step guides show how people are turning a spare machine into an “AI colleague”, a personal bot that can live inside everyday apps and start handling real workflows. But as agents become more social and autonomous, the risks scale too: an AI-only social network meant for bots to trade tips and chatter surfaced how quickly vibe-built systems can ship without basic security guardrails.

Google's research on agent scaling lands on the same lesson from another angle: coordination isn’t automatically better, and in multi-step tasks, a single mistake can cascade unless you design for overhead, error propagation, and the reality of tool use. Zoom out and you can see a widening gap between “power users” and everyone else, while low-code promises to broaden access even as it shifts what “programming” means.
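To make the cascading-mistake point concrete, here is a back-of-the-envelope sketch in Python. It is not the model used in the Google research. It assumes, for illustration only, that every step succeeds independently with the same probability and that nothing downstream recovers from an earlier failure. The function name `chain_success` is hypothetical.

```python
def chain_success(p_step: float, n_steps: int) -> float:
    """Probability that all n_steps succeed when each step succeeds
    independently with probability p_step (a simplifying assumption)."""
    if not 0.0 <= p_step <= 1.0:
        raise ValueError("p_step must be between 0 and 1")
    if n_steps < 0:
        raise ValueError("n_steps must be non-negative")
    return p_step ** n_steps


if __name__ == "__main__":
    # Compare a few chain lengths at a fixed per-step reliability.
    for n in (1, 5, 10, 20):
        print(f"{n:>2} steps at 95% per step: {chain_success(0.95, n):.1%} end to end")
```

Under these assumptions, a step that is right 95% of the time yields a ten-step chain that finishes cleanly less than 60% of the time, which is why verification and recovery steps can matter more than adding agents.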

Underneath it all is the uncomfortable plumbing: the race for training data is still messy, with fresh reporting describing large-scale book scanning that raises hard questions about consent, ownership, and what we’re willing to destroy to build smarter machines. Finally, the policy pendulum is swinging too, with proposals to restrict kids’ social media access framed as a response to an “uncontrolled experiment”—a reminder that the human systems around the tech are now part of the product.

This week’s research stack reminds us that “intelligence” is a property you have to operationalize, measure, and stress-test in the kinds of systems we actually deploy. One paper argues that many headline claims about human-level AI collapse once you separate fluent output from robust competence, and once you ask what counts as evidence rather than vibes. That demand for rigor echoes the quiet, unglamorous point behind statistical power: if your evaluation can’t reliably detect meaningful differences, you’ll keep mistaking noise for progress (or vice versa).
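The statistical power point can be illustrated with a small Python sketch. It approximates the power of a two-sided z-test comparing the accuracies of two models, each scored on its own independent test set of `n` examples. The function name, the normal approximation, and the accuracy values in the usage lines are all assumptions chosen for illustration, not figures from the papers mentioned above.

```python
from math import sqrt
from statistics import NormalDist


def power_two_accuracies(p_a: float, p_b: float, n: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided z-test for a difference between two
    accuracies, each estimated on n independent examples.

    Uses the normal approximation with an unpooled standard error.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Standard error of the difference between the two observed accuracies.
    se = sqrt(p_a * (1 - p_a) / n + p_b * (1 - p_b) / n)
    effect = abs(p_a - p_b) / se
    # Probability that the test statistic falls beyond either critical value.
    return std_normal.cdf(effect - z_crit) + std_normal.cdf(-effect - z_crit)


if __name__ == "__main__":
    # Hypothetical scenario: true accuracies of 80% and 82%.
    for n in (500, 2_000, 10_000):
        print(f"n={n:>6}: power = {power_two_accuracies(0.80, 0.82, n):.2f}")
```

With these assumed numbers, a 500-example benchmark detects a genuine two-point improvement well under 20% of the time, while 10,000 examples push the power above 90%. Small test sets will often report "no difference" or a spurious winner purely by chance.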

Meanwhile, several works push toward richer models of how complex behavior emerges from simple parts: network “motifs” that shape stability in ecosystems, multiscale simulations that explain why outbreak risk can accelerate even when local signals look calm, and cell-and-tissue modeling that treats biology less like a diagram and more like interacting machines and agents.

The next leap won’t come from bigger monologues, but from better multi-agent worlds where we can run experiments, quantify failure modes, and learn how coordination breaks.

Our current book recommendation is "Building AI Agents with LLMs, RAG, and Knowledge Graphs" by S. Raieli and G. Iuculano. You can find all the previous book reviews on our website. In this week's video, we learn how to Generate 3D City Models from OpenStreetMap (OSM) with Python.

Data shows that the best way for a newsletter to grow is by word of mouth, so if you think one of your friends or colleagues would enjoy this newsletter, go ahead and forward this email to them. This will help us spread the word!

Semper discentes,

The D4S Team


"Building AI Agents with LLMs, RAG, and Knowledge Graphs" by S. Raieli and G. Iuculano is a clear-headed guide for anyone trying to turn a “cool LLM demo” into an agent that can retrieve facts, use tools, and stay anchored to real information. Raieli and Iuculano keep the focus on what matters in practice: how RAG and knowledge graphs change the reliability profile of an agent, and when you need more structure than “just prompt it better.”

For data scientists and ML engineers, the best part is the build-oriented progression. It connects core concepts to concrete patterns—single-agent tool use, retrieval pipelines, and multi-agent coordination—without drowning you in theory. The examples feel like things you’d actually adapt into a prototype at work, and the overall framing consistently nudges you toward grounded, auditable behavior instead of vibes-based generation.

The tradeoff is breadth: if you already know transformers cold, some early sections may read like a warm-up, and the “production” angle is more of a practical starting line than a full MLOps reliability handbook. Still, as a one-stop map of modern agent building—especially where RAG and knowledge graphs stop being buzzwords and start being design choices—it’s an intense, usable read that tends to leave you with a short list of things you want to try next.


  1. How to Set Up Openclaw — Step by Step guide to setup a personal bot [medium.com/modelmind]
  2. Moltbook: When AI Agents Get Their Own Social Network [gradientflow.com]
  3. Low-Code and the Democratization of Programming [oreilly.com]
  4. Two kinds of AI users are emerging. The gap between them is astonishing. [martinalderson.com]
  5. Anthropic ‘destructively’ scanned millions of books to build Claude [www.washingtonpost.com]
  6. Towards a science of scaling agent systems: When and why agent systems work [research.google]
  7. Finland looks to end "uncontrolled human experiment" with Australia-style ban on social media [yle.fi]


Generate 3D City Models from OpenStreetMap (OSM) with Python


All the videos of the week are now available in our YouTube playlist.

Upcoming Events:

Opportunities to learn from us

On-Demand Videos:

Long-form tutorials

Data For Science, Inc

I'm a maker and blogger who loves to talk about technology. Subscribe and join over 3,000 newsletter readers every week!
