Data Science Briefing #303



Jan 28th

Next webinar:
Feb 11, 2026 - NLP with PyTorch [Register]
Count down to 2026-02-11T18:00:00.000Z

Dear Reader,

Welcome to the Jan 28th issue of our newsletter! If you’re building NLP models and want a practical, modern workflow, join the next edition of the NLP with PyTorch O’Reilly webinar. We’ll cover core patterns you can reuse (data → training → evaluation) with hands-on examples. Registration is already open, so don't miss out!

Across orgs, enterprise AI is quickly shifting from “pilot projects” to a daily work habit, but the gains aren’t evenly distributed: the most advanced users are generating ~6× more activity than median colleagues, and the biggest reported time savings show up when people apply AI across more kinds of work (not just one narrow use case). Netflix’s engineering write-up shows what “deep integration” can look like in practice: taking knowledge-graph search that used to require structured queries and making it feel conversational by turning natural-language questions into validated graph queries so federated data becomes searchable without everyone learning a new language.

On the research frontier, a genomics model now reads up to a million DNA letters at once and predicts regulatory behavior, helping score how specific variants might change gene control in the vast non-coding regions where many disease signals hide. Deployments stumble when incentives, feedback loops, and evaluation are hand-waved, and agentic systems become outright risky when they combine private data + untrusted inputs + outbound communication, a setup that can be exploited to siphon sensitive information. Zooming out, one recent long-form argument frames this moment as technology’s adolescence, warning that much more capable systems could plausibly arrive within a few years and stress-test governance at every level.
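
The three-part risk condition for agents can be written as a simple configuration check. This is a minimal sketch: the class and field names are hypothetical, not taken from any agent framework, and a real review would need to look at each tool the agent can call rather than three hand-set flags.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCapabilities:
    """Hypothetical summary of what an agent deployment is allowed to do."""
    reads_private_data: bool          # e.g. inbox, internal documents
    processes_untrusted_input: bool   # e.g. web pages, inbound email
    can_communicate_externally: bool  # e.g. HTTP requests, sending messages


def has_risky_combination(caps: AgentCapabilities) -> bool:
    """Return True only when all three capabilities are present together."""
    return (
        caps.reads_private_data
        and caps.processes_untrusted_input
        and caps.can_communicate_externally
    )


def risk_legs(caps: AgentCapabilities) -> list[str]:
    """List which of the three capabilities are enabled, for reporting."""
    legs = []
    if caps.reads_private_data:
        legs.append("private data")
    if caps.processes_untrusted_input:
        legs.append("untrusted input")
    if caps.can_communicate_externally:
        legs.append("external communication")
    return legs
```

Removing any one of the three capabilities makes `has_risky_combination` return False, which matches the framing above: the danger comes from the combination, not from any single capability.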

Across ecology, epidemiology, and AI, a common theme is that structure quietly sets the rules of the game. In plant communities, outcomes depend on where individuals end up: spatial clustering can soften or intensify competition, reshaping diversity by changing which neighbors interact and how often. In social systems, that same “who-meets-whom” logic governs whether groups converge or fragment. Methodologically, this is pushing modelers beyond vanilla graphs: work on hypergraph percolation shows that when interactions involve groups (not just pairs), directionality and heterogeneous nodes can shift the tipping points of system-wide change, while temporal community detection warns that what looks like a clean pattern may be a statistical mirage, because small and large communities are not inferred equally well as networks evolve.

Meanwhile, on the prediction side, physics-informed graph neural networks are showing how to model complex dynamics without “cheating” the laws of motion—by hardwiring conservation of linear and angular momentum, they learn more stable, transferable representations of interacting parts. And the stakes aren’t abstract: projections of climate-driven shifts in malaria risk across Africa underscore how environmental change reroutes disease through coupled human–vector–climate networks, while connectome research suggests that general intelligence may reflect an architecture that efficiently integrates and routes information across specialized modules.
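
To make "hardwiring" momentum conservation concrete, here is a small sketch of one common way to do it. It is not the architecture from the work mentioned above. It assumes 2D particles and a hypothetical `strength` function standing in for a learned edge model. Each pair of particles exchanges equal and opposite forces along the line joining them, so total force and total torque cancel by construction, whatever `strength` returns.

```python
def pairwise_forces(positions, strength):
    """Net force on each 2D particle from equal-and-opposite central pair forces.

    positions: list of (x, y) tuples, all distinct.
    strength:  callable (i, j, dist) -> scalar magnitude; a hypothetical
               stand-in for a learned edge function.
    """
    n = len(positions)
    forces = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            dist = (dx * dx + dy * dy) ** 0.5
            mag = strength(i, j, dist)
            # Force on i points along the line joining the pair...
            fx, fy = mag * dx / dist, mag * dy / dist
            forces[i][0] += fx
            forces[i][1] += fy
            # ...and j receives the exact opposite (Newton's third law).
            forces[j][0] -= fx
            forces[j][1] -= fy
    return forces


def net_force(forces):
    """Sum of all forces; zero means linear momentum is conserved."""
    return (sum(f[0] for f in forces), sum(f[1] for f in forces))


def net_torque(positions, forces):
    """Sum of 2D torques x*fy - y*fx; zero means angular momentum is conserved."""
    return sum(p[0] * f[1] - p[1] * f[0] for p, f in zip(positions, forces))


# Arbitrary example values, chosen only for illustration.
positions = [(0.0, 0.0), (1.0, 0.0), (0.5, 2.0)]
forces = pairwise_forces(positions, lambda i, j, dist: (i + j + 1) / (1.0 + dist))
```

Because the cancellation comes from the structure of the update and not from the data, a model built this way cannot drift away from conservation during training, up to floating-point rounding.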

Our current book recommendation is "Building AI Agents with LLMs, RAG, and Knowledge Graphs" by S. Raieli and G. Iuculano. You can find all the previous book reviews on our website. In this week's video, we discuss whether agents will replace search teams.

Data shows that the best way for a newsletter to grow is by word of mouth, so if you think one of your friends or colleagues would enjoy this newsletter, go ahead and forward this email to them. This will help us spread the word!

Semper discentes,

The D4S Team


"Building AI Agents with LLMs, RAG, and Knowledge Graphs" by S. Raieli and G. Iuculano is a clear-headed guide for anyone trying to turn “cool LLM demo” into an agent that can retrieve facts, use tools, and stay anchored to real information. Raieli and Iuculano keep the focus on what matters in practice. How RAG and knowledge graphs change the reliability profile of an agent, and when you need more structure than “just prompt it better.”

For data scientists and ML engineers, the best part is the build-oriented progression. It connects core concepts to concrete patterns—single-agent tool use, retrieval pipelines, and multi-agent coordination—without drowning you in theory. The examples feel like things you’d actually adapt into a prototype at work, and the overall framing consistently nudges you toward grounded, auditable behavior instead of vibes-based generation.

The tradeoff is breadth: if you already know transformers cold, some early sections may read like a warm-up, and the “production” angle is more of a practical starting line than a full MLOps reliability handbook. Still, as a one-stop map of modern agent building—especially where RAG and knowledge graphs stop being buzzwords and start being design choices—it’s an intense, usable read that tends to leave you with a short list of things you want to try next.


  1. OpenAI: The state of enterprise AI [openai.com]
  2. The AI Evolution of Graph Search [netflixtechblog.com]
  3. AlphaGenome: AI for better understanding the genome [deepmind.google]
  4. How to avoid common AI pitfalls in the workplace [economist.com]
  5. Anthropic Economic Index report: Economic primitives [anthropic.com]
  6. The lethal trifecta for AI agents: private data, untrusted content, and external communication [simonwillison.net]
  7. Dario Amodei — The Adolescence of Technology [www.darioamodei.com]
  8. Adoption of electric vehicles tied to real-world reductions in air pollution, study finds [keck.usc.edu]


Will agents replace search teams?

All the videos of the week are now available in our YouTube playlist.

Upcoming Events:

Opportunities to learn from us

On-Demand Videos:

Long-form tutorials

Data For Science, Inc
