Welcome to the August 8th edition of the Data Science Briefing!
We're proud to announce that a brand new Data Visualization with Python on-demand video is now available on the O'Reilly website: Python Data Visualization: Create impactful visuals, animations and dashboards. This in-depth tutorial is almost 7 hours long and covers fundamental and advanced usage of matplotlib, seaborn, plotly and bokeh, as well as tips on how to use Jupyter widgets. Check it out!
The latest blog post in the Epidemiology series is also out:
Demographic Processes. In this post we explore how to include birth and death rates in your epidemik models. Check it out!
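For readers who want a feel for what adding demography means in practice, here is a minimal sketch in plain Python. It does not use the epidemik package's API and is not taken from the blog post. It assumes a standard SIR model in which births and deaths occur at the same per-capita rate `mu`, so the total population stays constant, and it integrates the equations with a simple Euler step. The function name, parameter names, and values are all hypothetical choices made for illustration.

```python
def simulate_sir_demography(beta, gamma, mu, s0, i0, r0, days, dt=0.1):
    """Euler integration of an SIR model with equal birth and death rates.

    beta  : transmission rate (per day)
    gamma : recovery rate (per day)
    mu    : per-capita birth rate, assumed equal to the death rate (per day)
    Returns a list of (time, S, I, R) tuples.
    """
    s, i, r = float(s0), float(i0), float(r0)
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))

    for step in range(1, steps + 1):
        n = s + i + r
        new_infections = beta * s * i / n
        births = mu * n  # all newborns enter the susceptible compartment

        # Every compartment loses individuals to death at rate mu.
        ds = births - new_infections - mu * s
        di = new_infections - gamma * i - mu * i
        dr = gamma * i - mu * r

        s += ds * dt
        i += di * dt
        r += dr * dt
        history.append((step * dt, s, i, r))

    return history


if __name__ == "__main__":
    # Illustrative parameters only, not fitted to any real disease.
    with_births = simulate_sir_demography(0.3, 0.1, 0.001, 999, 1, 0, days=1000)
    without_births = simulate_sir_demography(0.3, 0.1, 0.0, 999, 1, 0, days=1000)
    print("Infected at day 1000 with demography:   ", with_births[-1][2])
    print("Infected at day 1000 without demography:", without_births[-1][2])
```

With `mu` set to zero the model reduces to the classic SIR model and the infection dies out. With a positive `mu`, newborns keep replenishing the susceptible pool, so the infection can persist instead of disappearing.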
This week’s newsletter features game-changing advances and fresh debates from the heart of AI research. Anthropic’s new work on persona vectors offers precise ways to monitor and control the character traits of large language models, helping developers intervene before harmful personality shifts occur and enabling proactive alignment to human values.
Meanwhile, Google’s MLE-STAR agent sets a new bar for machine learning automation, outperforming rivals by orchestrating web search with fine-grained code refinement to tackle diverse ML tasks and win medals in 63% of Kaggle competitions. DeepMind unveils AlphaEarth Foundations, an AI-powered virtual satellite that integrates petabytes of Earth observation data, mapping land and sea changes across the globe on demand and providing an unmatched resource for climate, agriculture, and development analysis.
On the data front, researchers warn that a major AI training set contains millions of images with sensitive personal details, reigniting privacy concerns over web-scraped data and the urgent need for responsible dataset curation. Also, organizations are rapidly adopting Google’s Gemini Embedding model, harnessing context engineering and retrieval-augmented generation to achieve notable gains in accuracy, efficiency, and multilingual support for enterprise AI applications.
This week’s academic round-up spans insightful breakthroughs across psychology, neurology, epidemiology, and artificial intelligence. “The psychophysics of style” recasts style perception as an active parsing of form from content, revealing new psychophysical phenomena that shape how we experience and categorize visual aesthetics. In global aging research, Fjell et al. overturn a long-held tenet by demonstrating that higher levels of formal education do not slow cognitive decline or brain aging. While education confers initial advantages in memory and brain structure, its effects do not buffer the trajectory of age-related decline, underscoring the nuanced role of early-life factors in lifelong cognitive health.
Human mobility emerges as central in epidemic modeling, as Lu and co-authors synthesize cutting-edge data-driven methodologies that integrate real-world movement into predictions, leading to transformative improvements in risk assessment, contact tracing, and public health strategy. Finally, “Your Brain on ChatGPT” warns that habitual outsourcing of cognitive work to AI assistants can foster cognitive debt, where users gain fast results but suffer diminished neural engagement, memory, and creativity, reframing the convenience of AI as potentially costly over the long term.
Data shows that the best way for a newsletter to grow is by word of mouth, so if you think one of your friends or colleagues would enjoy this newsletter, go ahead and forward this email to them. This will help us spread the word!
Semper discentes,
The D4S Team
Michael Lanham's book, "AI Agents in Action", is a practical guide for developers who want to build autonomous AI agents using large language models (LLMs) and open-source frameworks. The book focuses on real-world engineering rather than abstract theory, offering a step-by-step approach to building agent architectures, managing multi-agent systems, and using LLMs to solve business problems. It's written for developers and technical professionals who have the necessary foundational skills in Python and want to move from theoretical knowledge to hands-on development.
The book's strength lies in its gradual layering of complexity, starting with basic concepts and moving to advanced topics like multi-agent orchestration and prompt engineering. Lanham uses open-source tools like CrewAI, AutoGen, and Nexus, and includes annotated code examples to help readers follow along. This approach effectively bridges the gap between academic theory and practical development, making it a valuable toolkit for machine learning engineers who want to create production-ready solutions for tasks like workflow automation and customer service bots. The book also provides insightful commentary on integrating key components like memory and feedback loops into agent-based systems.
However, the book has some notable limitations. A major critique is its optimistic portrayal of the tools and techniques, often overlooking critical discussions about their limitations, trade-offs, and performance at scale. It focuses on illustrative projects rather than addressing issues of robustness and reliability, which are crucial for high-stakes, enterprise-grade deployments. Another drawback is the lack of extended use cases or full-scale system integration examples, which would provide a more complete understanding of an agent system's lifecycle, maintenance, and long-term performance in a real-world business environment.
Human Mobility in Epidemic Modeling (X. Lu, J. Feng, S. Lai, P. Holme, S. Liu, Z. Du, X. Yuan, S. Wang, Y. Li, X. Zhang, Y. Bai, X. Duan, W. Mei, H. Yu, S. Tan, F. Liljeros)