Welcome to the 280th issue of the Data Science Briefing! We're happy to be back after some much-needed time off.
We're proud to announce that a brand new Data Visualization with Python on-demand video is now available on the O'Reilly website: Python Data Visualization: Create impactful visuals, animations and dashboards. This in-depth tutorial is almost 7 hours long and covers fundamental and advanced usage of matplotlib, seaborn, plotly and bokeh, as well as tips on how to use Jupyter widgets. Check it out!
The latest blog post on the Epidemiology series is also out:
Demographic Processes. In this post we explore how to include birth and death rates in your epidemik models. Check it out!
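For readers who want a feel for the idea before reading the post, here is a minimal sketch of an SIR model with birth and death rates. It is written in plain Python with a simple Euler step and does not use the epidemik API; the function name, parameter names and values are all hypothetical and chosen only for illustration.

```python
def sir_with_demography(beta, gamma, birth_rate, death_rate,
                        S0, I0, R0, days, dt=0.01):
    """Integrate an SIR model with births and deaths using Euler steps.

    Assumptions (not from the blog post): newborns enter the susceptible
    compartment, and every compartment dies at the same per-capita rate.
    """
    S, I, R = float(S0), float(I0), float(R0)
    for _ in range(round(days / dt)):
        N = S + I + R
        new_infections = beta * S * I / N
        dS = birth_rate * N - new_infections - death_rate * S
        dI = new_infections - gamma * I - death_rate * I
        dR = gamma * I - death_rate * R
        S += dS * dt
        I += dI * dt
        R += dR * dt
    return S, I, R


# Equal birth and death rates keep the total population constant.
S, I, R = sir_with_demography(beta=0.3, gamma=0.1,
                              birth_rate=0.01, death_rate=0.01,
                              S0=990, I0=10, R0=0, days=100)
print(f"S={S:.1f} I={I:.1f} R={R:.1f} total={S + I + R:.1f}")
```

When the birth rate exceeds the death rate, the same function lets the total population grow, which continually replenishes the susceptible compartment.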
From trust in classrooms to the nuts and bolts of next-gen text models, this week’s links trace a through-line: AI only earns its keep when we surface its blind spots. A data-rich report on Nigerian classrooms shows an adaptive-tutoring pilot compressing two years of learning into six weeks. A forensic tour of Anthropic’s newly published Claude 4 system prompt pulls back the curtain on vendor self-regulation: the prompt reads like a catalogue of past model misbehaviors, capped with concrete tips on safe prompting and a standing order not to be a sycophant. On the technical frontier, we explore how diffusion language models can reduce latency by generating blocks of tokens in parallel, yet still struggle with long contexts and chain-of-thought reasoning.
Data shows that the best way for a newsletter to grow is by word of mouth, so if you think one of your friends or colleagues would enjoy this newsletter, go ahead and forward this email to them. This will help us spread the word!
Semper discentes,
The D4S Team
"Prompt Engineering for LLMs" by J. Berryman and A. Ziegler is an essential resource for anyone working with large language models. The authors expertly position prompt engineering not merely as writing effective prompts but as a crucial component throughout the entire application development lifecycle. By balancing technical depth with practical accessibility, they create a guide that serves both newcomers and experienced practitioners in the rapidly evolving AI landscape.
The book's greatest strength lies in its practical techniques, which go beyond basic prompt crafting. Readers will discover innovative approaches, such as using log probabilities to quantitatively assess completion quality, generating multiple outputs at varying temperatures, and structuring prompts with multiple roles to enhance focus and relevance. Particularly valuable is the "Little Red Riding Hood Principle," which emphasizes aligning prompts with a model's training patterns to achieve optimal responses.
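As a rough illustration of the log-probability idea, the sketch below scores candidate completions by their average per-token log probability and ranks them. It assumes you have already obtained token log probabilities for each completion from your model provider; the function names, candidate labels and numbers are hypothetical and are not taken from the book.

```python
import math


def mean_logprob(token_logprobs):
    """Average log probability per token (closer to 0 means more confident)."""
    return sum(token_logprobs) / len(token_logprobs)


def perplexity(token_logprobs):
    """Perplexity is the exponential of the negative mean log probability."""
    return math.exp(-mean_logprob(token_logprobs))


def rank_completions(candidates):
    """Return candidate names sorted from highest to lowest mean log probability."""
    return sorted(candidates,
                  key=lambda name: mean_logprob(candidates[name]),
                  reverse=True)


# Hypothetical log probabilities for completions sampled at three temperatures.
candidates = {
    "temp_0.2": [-0.1, -0.2, -0.1],
    "temp_0.7": [-0.5, -0.9, -0.4],
    "temp_1.2": [-1.5, -2.0, -1.1],
}
ranking = rank_completions(candidates)
for name in ranking:
    print(name, round(perplexity(candidates[name]), 3))
```

Averaging per token rather than summing avoids penalizing longer completions simply for being longer, though other normalizations are possible.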
Beyond techniques, Berryman and Ziegler offer crucial insights into real-world application strategies, including how teams like GitHub Copilot incorporate user feedback for continuous improvement. The authors skillfully explain complex concepts like tokenization and auto-regressive generation while maintaining accessibility for developers who might otherwise struggle with the non-human communication style of LLMs. This balanced approach makes the book an indispensable guide for anyone aiming to build robust, efficient LLM-powered applications in today's AI-driven technological environment.