Data Science Briefing #307


(view in browser)

Feb 25th

Next webinar:
Mar 18, 2026 - Gemini API with VertexAI for Developers

Dear Reader,

Welcome to the Feb 25th issue of our newsletter!

Announcements

The first edition of CrewAI for Production-Ready Multi‑Agent Systems was a great success, and we're already planning the next edition. Meanwhile, if you missed the live session, I've put together a package on Gumroad so you can work through it at your own pace.

It's five Jupyter notebooks walking through everything from the basics of CrewAI agents to production-grade patterns: configuration-driven agents, memory across sessions, human approval checkpoints, multi-LLM routing, and structured logging. Each module is self-contained, runs against real APIs, and was tested during a live training session.

Details here!

This week’s links trace the steady migration from “AI as a chat box” to “AI as a practical engineering substrate.” One piece sits at the frontier: it explores why math is such a brutal benchmark for generative models, and why even imperfect systems can still be transformative as collaborators that search, check, and remix ideas at scale. On the builder side, a hands-on walkthrough demystifies reinforcement learning from human feedback by spelling out the moving parts, so that “alignment” stops being magic and becomes code you can read.
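To give a taste of those moving parts, the heart of reward modeling in RLHF is a pairwise preference loss. Here's a minimal sketch in plain Python (the scalar rewards are stand-ins for a reward model's outputs, not code from the linked walkthrough):

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry pairwise loss used to train RLHF reward models:
    -log(sigmoid(r_chosen - r_rejected)). The loss shrinks as the
    model scores the human-preferred response higher."""
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A correctly ordered pair is cheap; a reversed pair is expensive.
good = preference_loss(2.0, -1.0)   # chosen response scored higher
bad = preference_loss(-1.0, 2.0)    # rejected response scored higher
print(good, bad)
```

Minimizing this loss over many human-labeled pairs is what turns raw preferences into a reward signal the policy can then be optimized against.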

Another guide frames retrieval-augmented generation as a maturity ladder: you start with the toy version, then quickly discover that real reliability comes from unglamorous decisions like chunking strategy, hybrid retrieval, and reranking (plus evaluation that punishes confident nonsense). There’s even a delightfully concrete detour into graph pruning for game maps, showing how simple constraints (stay connected, avoid degenerate paths) and a better search strategy can turn “design intuition” into an algorithm. And finally, the hardware story keeps accelerating: a “thinking” model small enough to run fully on-device under 1GB hints at a near future where private, offline reasoning becomes a default product feature instead of a luxury add-on.
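To make those "unglamorous decisions" concrete, here's a toy hybrid-retrieval sketch in plain Python. The scoring functions are deliberately simplified stand-ins (a production system would use BM25 and learned embeddings), but the blend-then-rerank shape is the same:

```python
from collections import Counter
import math

DOCS = [
    "Chunking strategy determines what the retriever can actually find.",
    "Hybrid retrieval combines keyword search with embedding similarity.",
    "Rerankers reorder candidates so the best passage lands on top.",
]

def keyword_score(query: str, doc: str) -> float:
    # Sparse signal: fraction of query terms present in the document.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q), 1)

def vector_score(query: str, doc: str) -> float:
    # Dense-signal stand-in: cosine similarity over character bigrams.
    def bigrams(s: str) -> Counter:
        s = s.lower()
        return Counter(s[i:i + 2] for i in range(len(s) - 1))
    a, b = bigrams(query), bigrams(doc)
    dot = sum(a[g] * b[g] for g in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def hybrid_search(query: str, alpha: float = 0.5) -> list[str]:
    # Blend sparse and dense scores, then rank by the combined score.
    scored = [(alpha * keyword_score(query, d)
               + (1 - alpha) * vector_score(query, d), d) for d in DOCS]
    return [d for _, d in sorted(scored, reverse=True)]

print(hybrid_search("hybrid keyword retrieval")[0])
```

Swapping in real chunking, BM25, embeddings, and a cross-encoder reranker is exactly the maturity ladder the guide describes.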

This week’s papers sketch a sobering picture of “smart” systems operating inside messy social worlds: even when a model looks like it’s reasoning, its performance can collapse as problems get more complex, suggesting we should treat chain-of-thought fluency as a surface signal, not a guarantee of reliable cognition. That brittleness shows up in the wild too: sustained, multi-turn interaction can quietly derail a model’s internal state, turning helpful assistants into confident wanderers unless we build guardrails, memory, and evaluation suites that stress-test long-horizon coherence. Layer on top the question of moral competence, and the challenge becomes less “did it answer?” and more “did it answer responsibly, consistently, and for the right reasons across contexts?”

Meanwhile, platform dynamics remind us that outputs don’t land in a vacuum: ranking algorithms can reshape political exposure at scale, and coordinated “chaos agents” can exploit attention systems, narrative incentives, and model weaknesses to push confusion as a strategy. Even our favorite fallback, crowdsourcing, isn’t immune: network topology can systematically distort collective perception, amplifying local consensus into global certainty. The emerging picture is clear: if we want trustworthy AI, we need measurement that spans reasoning difficulty, conversational durability, ethical judgment, privacy-preserving data practices (like federated approaches with local differential privacy), and the surrounding information ecosystem that ultimately determines what people see, share, and believe.

Our current book recommendation is "Visualizing Generative AI: How AI Paints, Writes, and Assists" by P. Vergadia and V. Lakshmanan. You can find all the previous book reviews on our website. This week's video is an overview of Hamming codes, the origin of error correction.
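As a companion to the video, here's a minimal Hamming(7,4) sketch in Python (my own illustrative implementation, not code from the video): four data bits gain three parity bits, and the parity checks let you locate and correct any single flipped bit.

```python
def hamming_encode(d: list[int]) -> list[int]:
    """Encode 4 data bits [d1, d2, d3, d4] into 7 bits, with parity
    bits at positions 1, 2, and 4 (classic Hamming(7,4) layout)."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4          # covers positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers positions 3, 6, 7
    p3 = d2 ^ d3 ^ d4          # covers positions 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming_correct(code: list[int]) -> list[int]:
    """Recompute the parity checks; their combined binary value (the
    syndrome) is the 1-based position of the error, 0 meaning none."""
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1   # flip the corrupted bit back
    return [c[2], c[4], c[5], c[6]]   # extract the data bits
```

Flip any one of the seven transmitted bits and `hamming_correct` still recovers the original four data bits, which is exactly the trick the video unpacks.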

Data shows that the best way for a newsletter to grow is by word of mouth, so if you think one of your friends or colleagues would enjoy this newsletter, go ahead and forward this email to them. This will help us spread the word!

Semper discentes,

The D4S Team


"Visualizing Generative AI: How AI Paints, Writes, and Assists" by P. Vergadia and V. Lakshmanan is a concept-first, diagram-rich guide that makes modern GenAI feel legible. Priyanka Vergadia’s visual explanations are the star: clean mental models for tokens, embeddings, transformers, and “why the model says what it says,” without burying you in math. It’s the kind of book that quickly helps you hold the whole system in your head.

For data scientists and ML engineers, the best value is the shared vocabulary it builds for real-world conversations: architecture tradeoffs, where GenAI fits in products, and what it’s actually good at today (assistive workflows, automation, and augmentation more than magic). It also doesn’t dodge the sharp edges, such as hallucinations, security concerns, and practical limitations, so you’re not left with a glossy, hype-only view.

The main drawback is depth: if you want rigorous internals, training dynamics, evaluation deep dives, or extensive code and end-to-end implementation details, this isn’t the book for you. But as a quick, sticky mental map, something you can read in a weekend and keep referencing when you’re designing, reviewing, or educating stakeholders, it’s a very strong pick, and likely to earn a spot on your “worth recommending” shelf.


  1. The Edge of Mathematics [theatlantic.com]
  2. RLHF From Scratch [github.com/ashworks1706]
  3. Graph Topology and Battle Royale Mechanics [blog.lukesalamone.com]
  4. RAG Systems in 5 Levels of Difficulty [medium.com/data-science-collective]
  5. Claude Code in Action [anthropic.skilljar.com]
  6. LFM2.5-1.2B-Thinking: On-Device Reasoning Under 1GB [liquid.ai]


But what are Hamming codes? The origin of error correction


All the videos of the week are now available in our YouTube playlist.

Upcoming Events:

Opportunities to learn from us

On-Demand Videos:

Long-form tutorials

Data For Science, Inc
