Data Science Briefing #303



Jan 28th

Next webinar:
Feb 11, 2026 - NLP with PyTorch [Register]

Dear Reader,

Welcome to the Jan 28th issue of our newsletter! If you’re building NLP models and want a practical, modern workflow, join the next edition of the NLP with PyTorch O’Reilly webinar. We’ll cover core patterns you can reuse (data → training → evaluation) with hands-on examples. Registration is already open, so don't miss out!

Across orgs, enterprise AI is quickly shifting from “pilot projects” to a daily work habit, but the gains aren’t evenly distributed: the most advanced users generate ~6× more activity than their median colleagues, and the biggest reported time savings show up when people apply AI across more kinds of work (not just one narrow use case). Netflix’s engineering write-up shows what “deep integration” can look like in practice: knowledge-graph search that used to require structured queries now feels conversational, because natural-language questions are turned into validated graph queries, so federated data becomes searchable without everyone learning a new query language.

On the research frontier, a genomics model now reads up to a million DNA letters at once and predicts regulatory behavior, helping score how specific variants might change gene control in the vast non-coding regions where many disease signals hide. Deployments stumble when incentives, feedback loops, and evaluation are hand-waved, and agentic systems become outright risky when they combine private data + untrusted inputs + outbound communication, a setup that can be exploited to siphon sensitive information. Zooming out, one recent long-form argument frames this moment as technology’s adolescence, warning that much more capable systems could plausibly arrive within a few years and stress-test governance at every level.

Across ecology, epidemiology, and AI, a common theme is that structure quietly sets the rules of the game. In plant communities, the outcome of competition depends on where individuals end up: spatial clustering can soften or intensify competition, reshaping diversity by changing which neighbors interact and how often. In social systems, that same “who-meets-whom” logic governs whether groups converge or fragment. Methodologically, this is pushing modelers beyond vanilla graphs: work on hypergraph percolation shows that when interactions involve groups (not just pairs), directionality and heterogeneous nodes can shift the tipping points of system-wide change, while temporal community detection warns that what looks like a clean pattern may be a statistical mirage, because small and large communities are not inferred equally well as networks evolve.

Meanwhile, on the prediction side, physics-informed graph neural networks are showing how to model complex dynamics without “cheating” the laws of motion—by hardwiring conservation of linear and angular momentum, they learn more stable, transferable representations of interacting parts. And the stakes aren’t abstract: projections of climate-driven shifts in malaria risk across Africa underscore how environmental change reroutes disease through coupled human–vector–climate networks, while connectome research suggests that general intelligence may reflect an architecture that efficiently integrates and routes information across specialized modules.

Our current book recommendation is "Building AI Agents with LLMs, RAG, and Knowledge Graphs" by S. Raieli and G. Iuculano. You can find all the previous book reviews on our website. In this week's video, we discuss whether agents will replace search teams.

Data shows that the best way for a newsletter to grow is by word of mouth, so if you think one of your friends or colleagues would enjoy this newsletter, go ahead and forward this email to them. This will help us spread the word!

Semper discentes,

The D4S Team


"Building AI Agents with LLMs, RAG, and Knowledge Graphs" by S. Raieli and G. Iuculano is a clear-headed guide for anyone trying to turn a “cool LLM demo” into an agent that can retrieve facts, use tools, and stay anchored to real information. Raieli and Iuculano keep the focus on what matters in practice: how RAG and knowledge graphs change the reliability profile of an agent, and when you need more structure than “just prompt it better.”

For data scientists and ML engineers, the best part is the build-oriented progression. It connects core concepts to concrete patterns—single-agent tool use, retrieval pipelines, and multi-agent coordination—without drowning you in theory. The examples feel like things you’d actually adapt into a prototype at work, and the overall framing consistently nudges you toward grounded, auditable behavior instead of vibes-based generation.

The tradeoff is breadth: if you already know transformers cold, some early sections may read like a warm-up, and the “production” angle is more of a practical starting line than a full MLOps reliability handbook. Still, as a one-stop map of modern agent building—especially where RAG and knowledge graphs stop being buzzwords and start being design choices—it’s an intense, usable read that tends to leave you with a short list of things you want to try next.


  1. OpenAI: The state of enterprise AI [openai.com]
  2. The AI Evolution of Graph Search [netflixtechblog.com]
  3. AlphaGenome: AI for better understanding the genome [deepmind.google]
  4. How to avoid common AI pitfalls in the workplace [economist.com]
  5. Anthropic Economic Index report: Economic primitives [anthropic.com]
  6. The lethal trifecta for AI agents: private data, untrusted content, and external communication [simonwillison.net]
  7. Dario Amodei — The Adolescence of Technology [www.darioamodei.com]
  8. Adoption of electric vehicles tied to real-world reductions in air pollution, study finds [keck.usc.edu]


Will agents replace search teams?


All the videos of the week are now available in our YouTube playlist.

Upcoming Events:

Opportunities to learn from us

On-Demand Videos:

Long-form tutorials

Data For Science, Inc

I'm a maker and blogger who loves to talk about technology. Subscribe and join 3,000+ newsletter readers every week!
