Data Science Briefing #299


(view in browser)

Dec 17th

Next webinar:
Jan 21, 2026 - LangChain for Generative AI Pipelines

Dear Reader,

Welcome to the Dec 17th edition of the Data Science Briefing!

This week, we're proud to announce three new webinars coming up next year. On Feb 11, 2026, we return to our long-running series NLP with PyTorch, followed by two brand new webinar series: Code Development with AI Assistants will take place on Feb 18, followed immediately by CrewAI for Production-Ready Multi‑Agent Systems. Registrations for all three events are already open, so go ahead and Sign Up!

A quiet throughline in this week’s links is how much modern “intelligence” (human, machine, or market) depends on systems that make messy reality replayable.

One deep dive shows exchanges forcing fairness by funneling chaotic, geographically distributed order flow through a sequencer that assigns monotonically increasing IDs, turning the order book into an append-only event log that can be deterministically replayed (and snapshotted) for recovery, audit, and downstream analytics. That same obsession with determinism shows up in the way a tiny embedded database earns its reputation: multiple independent test harnesses, differential testing against other engines, fault-injection for out-of-memory and I/O failures, crash simulation, and fuzzing at enormous scale—all in service of high coverage and “fail safely” behavior.
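If you want to poke at that idea yourself, here is a minimal Python sketch of the pattern the exchange piece describes: a single sequencer stamps every order with a monotonically increasing ID, the append-only log (not the book) is the source of truth, and state is rebuilt by deterministic replay. The class and field names are ours for illustration, not taken from the article.

```python
import itertools
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    seq: int          # monotonically increasing ID assigned by the sequencer
    side: str         # "buy" or "sell"
    price: float
    qty: int

class Sequencer:
    """Funnels all incoming orders through one point that imposes a total order."""
    def __init__(self):
        self._counter = itertools.count(1)
        self.log: list[Event] = []   # append-only event log

    def submit(self, side: str, price: float, qty: int) -> Event:
        event = Event(next(self._counter), side, price, qty)
        self.log.append(event)       # the log, not the book, is the source of truth
        return event

def replay(log: list[Event]) -> dict:
    """Deterministically rebuild book state (here: net quantity per price level)."""
    book: dict[float, int] = {}
    for e in log:
        delta = e.qty if e.side == "buy" else -e.qty
        book[e.price] = book.get(e.price, 0) + delta
    return book

seq = Sequencer()
seq.submit("buy", 101.5, 10)
seq.submit("sell", 101.5, 4)
snapshot = replay(seq.log)                # recovery/audit: same log -> same state
assert snapshot == replay(list(seq.log))  # replay is deterministic
```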

Read through that lens, the AI pieces land as a valuable corrective to vibe-based forecasting: one argues that “AGI” narratives routinely ignore the physical limits and exponential resource costs that govern computation, pushing attention back toward practical diffusion and incremental gains. Another warns that capability isn’t the same as humanness (similar outputs can mask fundamentally different constraints in data, time, and algorithms), which matters for evaluation and safety.

Even the neuroscience angle echoes it: language can look like thought from the outside, yet function more like a specialized parser that interfaces with meaning systems elsewhere. And for builders who care less about philosophy and more about shipping, Google's new Gemini 3 Flash release bets hard on the speed/quality frontier, pushing it far enough that reasoning doesn't automatically mean high latency.

On a similar note, one way to read our latest academic finds is as a tour of trust under constraints: in careers, in money, in networks, and in the tools we use to write about all three. A short, bracing essay reframes academic life as work rather than identity, arguing that de-romanticizing the job can reduce pressure while creating clearer boundaries and structural solutions. That same “treat it like a system, not a soul” lens shows up in a deep primer on stablecoins, which breaks the category into mechanisms (how pegs are maintained), plumbing (issuance/redemption), and the failure modes that matter when a token is asked to behave like cash.
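To make the stablecoin decomposition concrete, here is a toy Python model of one mechanism family: a collateral-backed peg with mint/redeem at $1 and a minimum collateral ratio. The class, parameters, and numbers are illustrative assumptions of ours, not the primer's; real issuers layer in oracles, fees, and redemption queues.

```python
class CollateralizedStablecoin:
    """Toy collateral-backed peg: mint/redeem at $1 against USD reserves."""

    def __init__(self, collateral_usd: float, min_ratio: float = 1.5):
        self.collateral_usd = collateral_usd
        self.supply = 0.0
        self.min_ratio = min_ratio  # required collateral per token outstanding

    def collateral_ratio(self) -> float:
        return float("inf") if self.supply == 0 else self.collateral_usd / self.supply

    def mint(self, tokens: float) -> None:
        # Issuance: only allowed while the position stays over-collateralized.
        if self.collateral_usd / (self.supply + tokens) < self.min_ratio:
            raise ValueError("mint would breach the collateral ratio")
        self.supply += tokens

    def redeem(self, tokens: float) -> float:
        # Redemption: burn tokens, pay out $1 each from reserves.
        tokens = min(tokens, self.supply)
        self.supply -= tokens
        self.collateral_usd -= tokens
        return tokens  # USD paid out

    def mark_collateral(self, new_value_usd: float) -> bool:
        # Failure mode: collateral value drops and the ratio falls below the floor.
        self.collateral_usd = new_value_usd
        return self.collateral_ratio() >= self.min_ratio

coin = CollateralizedStablecoin(collateral_usd=150.0)
coin.mint(100.0)                       # 1.5x collateralized
healthy = coin.mark_collateral(120.0)  # ratio now 1.2 -> peg at risk, returns False
```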

On the security side, a network-science study models deception-based cyber threats spreading over time-varying social graphs and tests targeted interventions, reminding us that timing, heterogeneity, and interaction structure can dominate naive “just moderate harder” instincts. Meanwhile, some LLM-agent papers push toward a more physics-flavored view of autonomy: one proposes evolving “contexts as playbooks” to avoid drift and collapse during self-improvement, while another finds evidence of detailed-balance-like structure in agent state transitions, hinting that agent behavior may be analysable with the same macro tools we use for other complex systems.
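For the curious, here is what a detailed-balance check on logged agent behavior might look like in practice: estimate a transition matrix from a state sequence, compute its stationary distribution, and test whether π_i P_ij ≈ π_j P_ji. This is our own sketch of the general technique, not code or notation from the paper.

```python
import numpy as np

def transition_matrix(states: list[int], n: int) -> np.ndarray:
    """Estimate P[i, j] = Pr(next state j | current state i) from a state sequence."""
    counts = np.zeros((n, n))
    for a, b in zip(states[:-1], states[1:]):
        counts[a, b] += 1
    counts += 1e-9                       # avoid division by zero for unvisited states
    return counts / counts.sum(axis=1, keepdims=True)

def stationary(P: np.ndarray) -> np.ndarray:
    """Stationary distribution: left eigenvector of P with eigenvalue 1."""
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
    return pi / pi.sum()

def detailed_balance_residual(P: np.ndarray) -> float:
    """Max |pi_i P_ij - pi_j P_ji|; near zero suggests reversible, equilibrium-like dynamics."""
    pi = stationary(P)
    flow = pi[:, None] * P
    return float(np.max(np.abs(flow - flow.T)))

rng = np.random.default_rng(0)
trace = list(rng.integers(0, 3, size=5_000))   # stand-in for logged agent states
P = transition_matrix(trace, n=3)
print(detailed_balance_residual(P))            # small residual ≈ detailed balance
```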

And if you want this all to ship, not just sparkle, an in-editor multi-agent writing system and a RAG evaluation framework underlie the practical thesis: good agents need tight tooling, versioned context, and metrics, not vibes.
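"Metrics, not vibes" can be as small as this: a minimal retrieval hit-rate@k evaluation in Python. The names (EvalCase, toy_retriever) are placeholders we made up, not the framework mentioned above; any retriever returning (doc_id, score) pairs would slot in.

```python
from dataclasses import dataclass

@dataclass
class EvalCase:
    question: str
    relevant_ids: set[str]   # gold document IDs that should be retrieved

def hit_rate_at_k(cases: list[EvalCase], retrieve, k: int = 5) -> float:
    """Fraction of questions where at least one gold document appears in the top-k."""
    hits = 0
    for case in cases:
        retrieved_ids = [doc_id for doc_id, _score in retrieve(case.question)[:k]]
        hits += bool(case.relevant_ids & set(retrieved_ids))
    return hits / len(cases)

# A stand-in retriever: crude keyword overlap over a three-document corpus.
def toy_retriever(question: str):
    corpus = {"d1": "vector databases", "d2": "knowledge graphs", "d3": "prompting"}
    scores = [(doc_id, sum(w in text for w in question.lower().split()))
              for doc_id, text in corpus.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

cases = [EvalCase("how do knowledge graphs help?", {"d2"})]
print(hit_rate_at_k(cases, toy_retriever, k=2))   # 1.0 if d2 lands in the top 2
```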

Our current book recommendation is "Building AI Agents with LLMs, RAG, and Knowledge Graphs" by S. Raieli and G. Iuculano. You can find all the previous book reviews on our website. This week's video compares GPU vs. CPU parallel computing in a beginner-friendly way.

Data shows that the best way for a newsletter to grow is by word of mouth, so if you think one of your friends or colleagues would enjoy this newsletter, go ahead and forward this email to them. This will help us spread the word!

Semper discentes,

The D4S Team


"Building AI Agents with LLMs, RAG, and Knowledge Graphs" by S. Raieli and G. Iuculano is a clear-headed guide for anyone trying to turn “cool LLM demo” into an agent that can retrieve facts, use tools, and stay anchored to real information. Raieli and Iuculano keep the focus on what matters in practice. How RAG and knowledge graphs change the reliability profile of an agent, and when you need more structure than “just prompt it better.”

For data scientists and ML engineers, the best part is the build-oriented progression. It connects core concepts to concrete patterns—single-agent tool use, retrieval pipelines, and multi-agent coordination—without drowning you in theory. The examples feel like things you’d actually adapt into a prototype at work, and the overall framing consistently nudges you toward grounded, auditable behavior instead of vibes-based generation.

The tradeoff is breadth: if you already know transformers cold, some early sections may read like a warm-up, and the “production” angle is more of a practical starting line than a full MLOps reliability handbook. Still, as a one-stop map of modern agent building—especially where RAG and knowledge graphs stop being buzzwords and start being design choices—it’s an intense, usable read that tends to leave you with a short list of things you want to try next.
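If single-agent tool use is on your "try next" list, here is roughly what the skeleton of that pattern looks like in Python. This is our own minimal sketch, not code from the book: call_llm is a placeholder for a real model call, and the tool registry is a stub.

```python
# Registry of tools the agent may call; a real system adds schemas, auth, and retries.
TOOLS = {
    "lookup_fact": lambda query: f"(stub) top retrieved passage for: {query}",
}

def call_llm(messages: list[dict]) -> dict:
    """Placeholder for a model call: returns either a tool request or a final answer."""
    last = messages[-1]["content"]
    if last.startswith("TOOL_RESULT"):
        return {"type": "final", "content": f"Grounded answer based on {last!r}"}
    return {"type": "tool", "name": "lookup_fact", "arguments": {"query": last}}

def run_agent(question: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(max_steps):
        reply = call_llm(messages)
        if reply["type"] == "final":
            return reply["content"]
        # Execute the requested tool and feed the result back, keeping an auditable trace.
        result = TOOLS[reply["name"]](**reply["arguments"])
        messages.append({"role": "tool", "content": f"TOOL_RESULT: {result}"})
    return "No answer within the step budget."

print(run_agent("How do knowledge graphs improve agent reliability?"))
```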


  1. How Exchanges Turn Order Books into Distributed Logs [quant.engineering]
  2. Why AGI Will Not Happen [timdettmers.com]
  3. We’ve finally cracked how to make truly random numbers [newscientist.com]
  4. The Polyglot Neuroscientist Resolving How the Brain Parses Language [quantamagazine.org]
  5. How SQLite Is Tested [sqlite.org]
  6. Gemini 3 Flash: frontier intelligence built for speed [blog.google]
  7. AI Capability isn't Humanness [research.roundtable.ai]


GPU vs CPU Parallel Computing for Beginners


All the videos of the week are now available in our YouTube playlist.

Upcoming Events:

Opportunities to learn from us

On-Demand Videos:

Long-form tutorials

Data For Science, Inc

I'm a maker and blogger who loves to talk about technology. Subscribe and join over 3,000 newsletter readers every week!
