Data Science Briefing #299


(view in browser)

Dec 17th

Next webinar:
Jan 21, 2026 - LangChain for Generative AI Pipelines

Dear Reader,

Welcome to the Dec 17th edition of the Data Science Briefing!

This week, we're proud to announce three new webinars coming up next year. On Feb 11, 2026, we return to our long-running series NLP with PyTorch, followed by two brand new webinar series: Code Development with AI Assistants will take place on Feb 18, followed immediately by CrewAI for Production-Ready Multi‑Agent Systems. Registrations for all three events are already open, so go ahead and Sign Up!

A quiet throughline in this week’s links is how much modern “intelligence” (human, machine, or market) depends on systems that make messy reality replayable.

One deep dive shows exchanges forcing fairness by funneling chaotic, geographically distributed order flow through a sequencer that assigns monotonically increasing IDs, turning the order book into an append-only event log that can be deterministically replayed (and snapshotted) for recovery, audit, and downstream analytics. That same obsession with determinism shows up in the way a tiny embedded database earns its reputation: multiple independent test harnesses, differential testing against other engines, fault-injection for out-of-memory and I/O failures, crash simulation, and fuzzing at enormous scale—all in service of high coverage and “fail safely” behavior.
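If you want to poke at that idea yourself, here is a minimal Python sketch of the pattern the exchange piece describes: a single sequencer stamps every order with a monotonically increasing ID, the append-only log (not the book) is the source of truth, and state is rebuilt by deterministic replay. The class and field names are ours for illustration, not taken from the article.

```python
import itertools
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    seq: int          # monotonically increasing ID assigned by the sequencer
    side: str         # "buy" or "sell"
    price: float
    qty: int

class Sequencer:
    """Funnels all incoming orders through one point that imposes a total order."""
    def __init__(self):
        self._counter = itertools.count(1)
        self.log: list[Event] = []   # append-only event log

    def submit(self, side: str, price: float, qty: int) -> Event:
        event = Event(next(self._counter), side, price, qty)
        self.log.append(event)       # the log, not the book, is the source of truth
        return event

def replay(log: list[Event]) -> dict:
    """Deterministically rebuild book state (here: net quantity per price level)."""
    book: dict[float, int] = {}
    for e in log:
        delta = e.qty if e.side == "buy" else -e.qty
        book[e.price] = book.get(e.price, 0) + delta
    return book

seq = Sequencer()
seq.submit("buy", 101.5, 10)
seq.submit("sell", 101.5, 4)
snapshot = replay(seq.log)                # recovery/audit: same log -> same state
assert snapshot == replay(list(seq.log))  # replay is deterministic
```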

Read through that lens, the AI pieces land as a valuable corrective to vibe-based forecasting: one argues that “AGI” narratives routinely ignore the physical limits and exponential resource costs that govern computation, pushing attention back toward practical diffusion and incremental gains. Another warns that capability isn’t the same as humanness (similar outputs can mask fundamentally different constraints in data, time, and algorithms), which matters for evaluation and safety.

Even the neuroscience angle echoes it: language can look like thought from the outside, yet function more like a specialized parser that interfaces with meaning systems elsewhere. And for builders who care less about philosophy and more about shipping, Google's new Gemini 3 Flash release bets hard on the speed/quality frontier, pushing it far enough that reasoning doesn't automatically mean high latency.

On a similar note, one way to read our latest academic finds is as a tour of trust under constraints: in careers, in money, in networks, and in the tools we use to write about all three. A short, bracing essay reframes academic life as work rather than identity, arguing that de-romanticizing the job can reduce pressure while creating clearer boundaries and structural solutions. That same “treat it like a system, not a soul” lens shows up in a deep primer on stablecoins, which breaks the category into mechanisms (how pegs are maintained), plumbing (issuance/redemption), and the failure modes that matter when a token is asked to behave like cash.
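To make the stablecoin decomposition concrete, here is a toy Python model of one mechanism family: a collateral-backed peg with mint/redeem at $1 and a minimum collateral ratio. The class, parameters, and numbers are illustrative assumptions of ours, not the primer's; real issuers layer in oracles, fees, and redemption queues.

```python
class CollateralizedStablecoin:
    """Toy collateral-backed peg: mint/redeem at $1 against USD reserves."""

    def __init__(self, collateral_usd: float, min_ratio: float = 1.5):
        self.collateral_usd = collateral_usd
        self.supply = 0.0
        self.min_ratio = min_ratio  # required collateral per token outstanding

    def collateral_ratio(self) -> float:
        return float("inf") if self.supply == 0 else self.collateral_usd / self.supply

    def mint(self, tokens: float) -> None:
        # Issuance: only allowed while the position stays over-collateralized.
        if self.collateral_usd / (self.supply + tokens) < self.min_ratio:
            raise ValueError("mint would breach the collateral ratio")
        self.supply += tokens

    def redeem(self, tokens: float) -> float:
        # Redemption: burn tokens, pay out $1 each from reserves.
        tokens = min(tokens, self.supply)
        self.supply -= tokens
        self.collateral_usd -= tokens
        return tokens  # USD paid out

    def mark_collateral(self, new_value_usd: float) -> bool:
        # Failure mode: collateral value drops and the ratio falls below the floor.
        self.collateral_usd = new_value_usd
        return self.collateral_ratio() >= self.min_ratio

coin = CollateralizedStablecoin(collateral_usd=150.0)
coin.mint(100.0)                       # 1.5x collateralized
healthy = coin.mark_collateral(120.0)  # ratio now 1.2 -> peg at risk, returns False
```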

On the security side, a network-science study models deception-based cyber threats spreading over time-varying social graphs and tests targeted interventions, reminding us that timing, heterogeneity, and interaction structure can dominate naive “just moderate harder” instincts. Meanwhile, some LLM-agent papers push toward a more physics-flavored view of autonomy: one proposes evolving “contexts as playbooks” to avoid drift and collapse during self-improvement, while another finds evidence of detailed-balance-like structure in agent state transitions, hinting that agent behavior may be analysable with the same macro tools we use for other complex systems.
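For the curious, here is what a detailed-balance check on logged agent behavior might look like in practice: estimate a transition matrix from a state sequence, compute its stationary distribution, and test whether π_i P_ij ≈ π_j P_ji. This is our own sketch of the general technique, not code or notation from the paper.

```python
import numpy as np

def transition_matrix(states: list[int], n: int) -> np.ndarray:
    """Estimate P[i, j] = Pr(next state j | current state i) from a state sequence."""
    counts = np.zeros((n, n))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    counts += 1e-9                       # avoid division by zero for unvisited states
    return counts / counts.sum(axis=1, keepdims=True)

def stationary(P: np.ndarray) -> np.ndarray:
    """Stationary distribution: left eigenvector of P with eigenvalue 1."""
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
    return pi / pi.sum()

def detailed_balance_residual(P: np.ndarray) -> float:
    """Max |pi_i P_ij - pi_j P_ji|; near zero suggests reversible, equilibrium-like dynamics."""
    pi = stationary(P)
    flow = pi[:, None] * P
    return float(np.max(np.abs(flow - flow.T)))

rng = np.random.default_rng(0)
trace = list(rng.integers(0, 3, size=5_000))   # stand-in for logged agent states
P = transition_matrix(trace, n=3)
print(detailed_balance_residual(P))            # small residual ≈ detailed balance
```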

And if you want this all to ship, not just sparkle, an in-editor multi-agent writing system and a RAG evaluation framework underlie the practical thesis: good agents need tight tooling, versioned context, and metrics, not vibes.
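"Metrics, not vibes" can be as small as this: a minimal retrieval hit-rate@k evaluation in Python. The names (EvalCase, toy_retriever) are placeholders we made up, not the framework mentioned above; any retriever returning (doc_id, score) pairs would slot in.

```python
from dataclasses import dataclass

@dataclass
class EvalCase:
    question: str
    relevant_ids: set[str]   # gold document IDs that should be retrieved

def hit_rate_at_k(cases: list[EvalCase], retrieve, k: int = 5) -> float:
    """Fraction of questions where at least one gold document appears in the top-k."""
    hits = 0
    for case in cases:
        retrieved_ids = [doc_id for doc_id, _score in retrieve(case.question)[:k]]
        hits += bool(case.relevant_ids & set(retrieved_ids))
    return hits / len(cases)

# A stand-in retriever: crude keyword overlap over a three-document corpus.
def toy_retriever(question: str):
    corpus = {"d1": "vector databases", "d2": "knowledge graphs", "d3": "prompting"}
    scores = [(doc_id, sum(w in text for w in question.lower().split()))
              for doc_id, text in corpus.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

cases = [EvalCase("how do knowledge graphs help?", {"d2"})]
print(hit_rate_at_k(cases, toy_retriever, k=2))   # 1.0 if d2 lands in the top 2
```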

Our current book recommendation is "Building AI Agents with LLMs, RAG, and Knowledge Graphs" by S. Raieli and G. Iuculano. You can find all the previous book reviews on our website. This week's video compares GPU vs. CPU parallel computing in a beginner-friendly way.

Data shows that the best way for a newsletter to grow is by word of mouth, so if you think one of your friends or colleagues would enjoy this newsletter, go ahead and forward this email to them. This will help us spread the word!

Semper discentes,

The D4S Team


"Building AI Agents with LLMs, RAG, and Knowledge Graphs" by S. Raieli and G. Iuculano is a clear-headed guide for anyone trying to turn “cool LLM demo” into an agent that can retrieve facts, use tools, and stay anchored to real information. Raieli and Iuculano keep the focus on what matters in practice. How RAG and knowledge graphs change the reliability profile of an agent, and when you need more structure than “just prompt it better.”

For data scientists and ML engineers, the best part is the build-oriented progression. It connects core concepts to concrete patterns—single-agent tool use, retrieval pipelines, and multi-agent coordination—without drowning you in theory. The examples feel like things you’d actually adapt into a prototype at work, and the overall framing consistently nudges you toward grounded, auditable behavior instead of vibes-based generation.

The tradeoff is breadth: if you already know transformers cold, some early sections may read like a warm-up, and the “production” angle is more of a practical starting line than a full MLOps reliability handbook. Still, as a one-stop map of modern agent building—especially where RAG and knowledge graphs stop being buzzwords and start being design choices—it’s an intense, usable read that tends to leave you with a short list of things you want to try next.
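If single-agent tool use is on your "try next" list, here is roughly what the skeleton of that pattern looks like in Python. This is our own minimal sketch, not code from the book: call_llm is a placeholder for a real model call, and the tool registry is a stub.

```python
# Registry of tools the agent may call; a real system adds schemas, auth, and retries.
TOOLS = {
    "lookup_fact": lambda query: f"(stub) top retrieved passage for: {query}",
}

def call_llm(messages: list[dict]) -> dict:
    """Placeholder for a model call: returns either a tool request or a final answer."""
    last = messages[-1]["content"]
    if last.startswith("TOOL_RESULT"):
        return {"type": "final", "content": f"Grounded answer based on {last!r}"}
    return {"type": "tool", "name": "lookup_fact", "arguments": {"query": last}}

def run_agent(question: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = call_llm(messages)
        if reply["type"] == "final":
            return reply["content"]
        # Execute the requested tool and feed the result back, keeping an auditable trace.
        result = TOOLS[reply["name"]](**reply["arguments"])
        messages.append({"role": "tool", "content": f"TOOL_RESULT: {result}"})
    return "No answer within the step budget."

print(run_agent("How do knowledge graphs improve agent reliability?"))
```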


  1. How Exchanges Turn Order Books into Distributed Logs [quant.engineering]
  2. Why AGI Will Not Happen [timdettmers.com]
  3. We’ve finally cracked how to make truly random numbers [newscientist.com]
  4. The Polyglot Neuroscientist Resolving How the Brain Parses Language [quantamagazine.org]
  5. How SQLite Is Tested [sqlite.org]
  6. Gemini 3 Flash: frontier intelligence built for speed [blog.google]
  7. AI Capability isn't Humanness [research.roundtable.ai]


GPU vs CPU Parallel Computing for Beginners


All the videos of the week are now available in our YouTube playlist.

Upcoming Events:

Opportunities to learn from us

On-Demand Videos:

Long-form tutorials

Data For Science, Inc

I'm a maker and blogger who loves to talk about technology. Subscribe and join over 3,000 newsletter readers every week!
