Data Science Briefing #308


(view in browser)

Mar 5th

Next webinar:
Mar 18, 2026 - Gemini API with VertexAI for Developers

Dear Reader,

Welcome to the 308th issue of our newsletter!

Announcements

The first edition of CrewAI for Production-Ready Multi‑Agent Systems was a great success, and we're already planning the next edition. Meanwhile, if you missed the live session, I've put together a package on Gumroad so you can work through the material at your own pace.

It's five Jupyter notebooks walking through everything from the basics of CrewAI agents to production-grade patterns: configuration-driven agents, memory across sessions, human approval checkpoints, multi-LLM routing, and structured logging. Each module is self-contained, runs against real APIs, and was tested during a live training session.
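To give a flavor of one of those patterns, here's a framework-agnostic sketch of multi-LLM routing: cheap, simple tasks go to a small model and harder ones to a larger model. The model names and the length-based complexity heuristic are illustrative assumptions, not the course material itself.

```python
# Hypothetical multi-LLM router. Model names are placeholders; in a real
# crew you would map them to actual LLM clients.

def route_task(task: str, threshold: int = 50) -> str:
    """Pick a model based on a crude complexity heuristic:
    longer task descriptions go to the bigger model."""
    return "large-model" if len(task) > threshold else "small-model"

def run_task(task: str) -> dict:
    model = route_task(task)
    # This is where you'd actually call the chosen LLM.
    return {"task": task, "model": model}

print(run_task("Summarize this email."))
# Short description, so it's routed to the small model.
```

In practice the heuristic would be something smarter (a classifier, token count, or task type), but the shape of the pattern is the same: a routing function that decides, per task, which backend to pay for.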

Details here!

This week’s newsletter connects “how it works” and “why it matters.” On the delightfully nerdy end, one piece tackles the question of whether data has weight, walking through why solid-state drives can (in theory) get infinitesimally heavier as you write to them, even if it’s the kind of “measurement” that mostly lives in physics-trivia land.
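As a back-of-envelope illustration of why this lives in physics-trivia land (my own numbers, not the article's): if writing data changed the stored energy of a drive's flash cells by, say, a picojoule, the mass equivalent via m = E/c² would be hopelessly below anything a scale can detect.

```python
# Back-of-envelope: mass equivalent of stored energy via E = m * c^2.
# The energy figure below is an illustrative assumption, not a measured value.

C = 299_792_458  # speed of light, m/s

def mass_equivalent(energy_joules: float) -> float:
    """Return the mass (kg) corresponding to a given energy, m = E / c^2."""
    return energy_joules / C**2

# Suppose a full write changes the stored electron energy by ~1e-12 J:
delta_m = mass_equivalent(1e-12)
print(f"{delta_m:.3e} kg")  # on the order of 1e-29 kg, far below any scale
```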

On the sobering end, a deep dive into tens of thousands of kernel issues asks a more human question: who tends to introduce bugs, when they slip in, and what review dynamics seem to help catch them faster. If you’re building with coding agents, another guide usefully reframes “prompting” as engineering: break work into checkable steps, force explicit plans, and design tight feedback loops so the agent can self-correct instead of confidently wandering. For hands-on learning, there are two excellent interactive explainers: one that makes decision trees feel concrete (splits, entropy, overfitting) and another that visually demystifies diffusion by turning noisy static into an image you can see emerge.
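For the decision-tree explainer, the quantity behind "good splits" is entropy: a pure node has zero entropy, a maximally mixed one has one bit, and a split is good if it lowers the weighted entropy of the children. A minimal sketch (the label lists are made up for illustration):

```python
from collections import Counter
from math import log2

def entropy(labels) -> float:
    """Shannon entropy (in bits) of a sequence of class labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in counts.values())

# A 50/50 node is maximally mixed; a lopsided node is closer to pure.
print(entropy(["a", "a", "b", "b"]))  # 1.0
print(entropy(["a", "b", "b", "b"]))  # ~0.811
```

A tree-building algorithm evaluates candidate splits by exactly this kind of computation, picking the feature threshold that reduces entropy the most, and overfitting is what happens when it keeps splitting until every leaf is trivially pure.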

And finally, a timely reminder that capability cuts both ways: new research shows how LLM-driven pipelines can scale deanonymization by stitching together identity signals scattered across platforms.

On the academic front, the thread running through this week’s papers is that “systems” don’t just enact behaviors, but rather compute, coordinate, and leave fingerprints you can measure. One study uses cause-specific excess mortality to reveal how the pandemic’s toll in rural India didn’t arrive as a single wave, but as a shifting mix of direct COVID deaths and knock-on effects that standard reporting often misses.

Zooming out, another paper asks what it even means for a system to compute, pushing past the metaphor of “everything is information” toward a more careful accounting of who is doing the computing, what counts as input/output, and where the boundaries of inference actually sit. That same boundary-setting shows up in AI: work on discovering multi-agent learning algorithms with large language models treats LLMs less as oracles and more as search engines over strategy space.

But once agents start interacting at scale, their society becomes an object of study too: early social-network analysis of agent communities reveals emergent roles, clustering, and influence patterns that look uncannily familiar, except that the “people” are programs. In parallel, research on the statistical signature of LLMs suggests that generated text carries detectable regularities, which matters for everything from provenance to governance: if synthetic content is measurable, so is its diffusion through a network. And hovering over the whole stack are papers on economic complexity and on bitcoin as a quasi-religious techno-libertarian project, both of which point to a shared lesson: incentives and beliefs are part of the system dynamics, shaping what gets built, what gets adopted, and what narratives become “common sense” long before the metrics catch up.

Our current book recommendation is "Visualizing Generative AI: How AI Paints, Writes, and Assists" by P. Vergadia and V. Lakshmanan. You can find all the previous book reviews on our website. In this week's video, we compare AI Agent frameworks: AutoGen, CrewAI, and LangGraph.

Data shows that the best way for a newsletter to grow is by word of mouth, so if you think one of your friends or colleagues would enjoy this newsletter, go ahead and forward this email to them. This will help us spread the word!

Semper discentes,

The D4S Team


"Visualizing Generative AI: How AI Paints, Writes, and Assists" by P. Vergadia and V. Lakshmanan is a concept-first, diagram-rich guide that makes modern GenAI feel legible. Priyanka Vergadia’s visual explanations are the star: clean mental models for tokens, embeddings, transformers, and “why the model says what it says,” without burying you in math. It’s the kind of book that helps you keep the whole system in your head fast.

For data scientists and ML engineers, the best value is the shared vocabulary it builds for real-world conversations: architecture tradeoffs, where GenAI fits in products, and what it’s actually good at today (assistive workflows, automation, and augmentation more than magic). It also doesn’t dodge the sharp edges, such as hallucinations, security concerns, and practical limitations, so you’re not left with a glossy, hype-only view.

The main drawback is depth: if you want rigorous internals, training dynamics, evaluation deep dives, or extensive code and end-to-end implementation details, this isn’t the book for you. But as a quick, sticky mental map, something you can read in a weekend and keep referencing when you’re designing, reviewing, or educating stakeholders, it’s a very strong pick, and likely to earn a spot on your “worth recommending” shelf.


  1. The largest open-source humanized voice library [github.com/jaymunshi]
  2. Does Data Really Have Weight? [cubiclenate.com]
  3. Who Writes the Bugs? A Deeper Look at 125,000 Kernel Vulnerabilities [pebblebed.com]
  4. Agentic Engineering Patterns [simonwillison.net]
  5. Decision Trees [mlu-explain.github.io]
  6. From Noise to Image - Interactive guide to diffusion [lighthousesoftware.co.uk]
  7. The Complete Guide to Building Skills for Claude [resources.anthropic.com]
  8. An interactive intro to quadtrees [growingswe.com]
  9. Agent of Empires: A terminal session manager for AI coding agents on Linux and macOS. [github.com/njbrake]
  10. Large-Scale Online Deanonymization with LLMs [simonlermen.substack.com]


AutoGen vs CrewAI vs LangGraph


All the videos of the week are now available in our YouTube playlist.

Upcoming Events:

Opportunities to learn from us

On-Demand Videos:

Long-form tutorials

Data For Science, Inc

I'm a maker and blogger who loves to talk about technology. Subscribe and join over 3,000 newsletter readers every week!

Read more from Data For Science, Inc
