Welcome to the Sept 4th edition of the Data Science Briefing!
We're proud to announce that a brand new Data Visualization with Python on-demand video is now available on the O'Reilly website: Python Data Visualization: Create impactful visuals, animations and dashboards. This in-depth tutorial is almost 7 hours long and covers fundamental and advanced usage of matplotlib, seaborn, plotly and bokeh, as well as tips on how to use Jupyter widgets. Check it out!
The latest blog post in the Epidemiology series is also out:
Demographic Processes. In this post we explore how to include birth and death rates in your epidemik models. Check it out!
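For readers who want a feel for what adding demography does, here is a minimal sketch in plain Python. It is not the epidemik API and not the code from the blog post: the function name, the parameter values, and the choice of a simple Euler integrator are all assumptions made for illustration. It uses the textbook SIR model with a single per-capita rate mu acting as both birth and death rate, so the total population stays constant.

```python
def sir_with_demography(beta, gamma, mu, s0, i0, r0, days, dt=0.1):
    """Euler-integrate an SIR model with equal per-capita birth and death rate mu.

    Hypothetical helper for illustration only.
    beta: transmission rate, gamma: recovery rate, mu: birth = death rate (all per day).
    Returns a list of (time, S, I, R) tuples.
    """
    s, i, r = float(s0), float(i0), float(r0)
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        n = s + i + r
        births = mu * n                  # newborns all enter the susceptible class
        infections = beta * s * i / n    # frequency-dependent transmission
        recoveries = gamma * i
        ds = births - infections - mu * s
        di = infections - recoveries - mu * i
        dr = recoveries - mu * r
        s, i, r = s + ds * dt, i + di * dt, r + dr * dt
        history.append((step * dt, s, i, r))
    return history


# Illustrative (made-up) parameters: 1,000 people, 10 initially infected.
history = sir_with_demography(beta=0.3, gamma=0.1, mu=0.01,
                              s0=990, i0=10, r0=0, days=100)
t, s, i, r = history[-1]
print(f"day {t:.0f}: S={s:.1f}, I={i:.1f}, R={r:.1f}")
```

Because births keep replenishing the susceptible pool, the infection can persist instead of burning out, which is the main qualitative change demography brings to the basic SIR picture.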
From nuts-and-bolts scaling to policy turbulence, this week’s picks favor results over rhetoric: “How To Scale Your Model” turns TPU/GPU lore into actionable guidance on parallelism and communication bottlenecks for training and inference at scale, a must-read if you’re pushing beyond a single node.
Graph Transformers get a clean, industry-grounded primer arguing that attention over graphs is fast becoming the default for structured data in finance, bio, and recsys. On the ground, a staff engineer’s six-week sprint with Claude Code provides a pragmatic playbook for embracing the “95% garbage” first pass, then iterating with tighter prompts, specifications, and tests.
On the research front, we explore the through-line separating true novelty from comfortable echoes. One team quantifies how LLMs fall into “plot templates,” proposing metrics and mitigations for story diversity, which is nicely complemented by work that measures what models actually memorize and how deduplication and extraction risk shape deployments.
On the learning side, “active reading” frames factual acquisition as targeted retrieval and synthesis rather than brute-force scale. At the same time, a sweeping survey of physical neural networks argues that pushing compute into photonics and other analog substrates can buy massive efficiency.
Outside the lab, a clever mobility study teases apart geography from human choice, showing how much of our movement is constrained by the map versus our willingness to explore; a 65-year analysis of U.S. charts similarly suggests the road to cultural “hits” is steeper in a long-tail attention economy. And in the background, a multi-country longitudinal study on education and brain aging nudges us toward nuance: schooling seems to raise the baseline of cognitive performance more than it slows the slope of decline—an uncomfortable but clarifying distinction for anyone betting on interventions to move the needle.
Data shows that the best way for a newsletter to grow is by word of mouth, so if you think one of your friends or colleagues would enjoy this newsletter, go ahead and forward this email to them. This will help us spread the word!
Semper discentes,
The D4S Team
Michael Lanham's book, "AI Agents in Action", is a practical guide for developers who want to build autonomous AI agents using large language models (LLMs) and open-source frameworks. The book focuses on real-world engineering rather than abstract theory, offering a step-by-step approach to building agent architectures, managing multi-agent systems, and using LLMs to solve business problems. It's written for developers and technical professionals who have the necessary foundational skills in Python and want to move from theoretical knowledge to hands-on development.
The book's strength lies in its gradual layering of complexity, starting with basic concepts and moving to advanced topics like multi-agent orchestration and prompt engineering. Lanham uses open-source tools like CrewAI, AutoGen, and Nexus, and includes annotated code examples to help readers follow along. This approach effectively bridges the gap between academic theory and practical development, making it a valuable toolkit for machine learning engineers who want to create production-ready solutions for tasks like workflow automation and customer service bots. The book also provides insightful commentary on integrating key components like memory and feedback loops into agent-based systems.
However, the book has some notable limitations. A major critique is its optimistic portrayal of the tools and techniques, often overlooking critical discussions about their limitations, trade-offs, and performance at scale. It focuses on illustrative projects rather than addressing issues of robustness and reliability, which are crucial for high-stakes, enterprise-grade deployments. Another drawback is the lack of extended use cases or full-scale system integration examples, which would provide a more complete understanding of an agent system's lifecycle, maintenance, and long-term performance in a real-world business environment.
Training of physical neural networks (A. Momeni, B. Rahmani, B. Scellier, L. G. Wright, P. L. McMahon, C. C. Wanjura, Y. Li, A. Skalli, N. G. Berloff, T. Onodera, I. Oguz, F. Morichetti, P. del Hougne, M. Le Gallo, A. Sebastian, A. Mirhoseini, C. Zhang, D. Marković, D. Brunner, C. Moser, S. Gigan, F. Marquardt, A. Ozcan, J. Grollier, A. J. Liu, D. Psaltis, A. Alù, R. Fleury)