Data Science Briefing #293


Sept 18th

Next webinar:
Oct 1, 2025, 17:00 UTC - LLMs for Data Science [Register]

Dear Reader,

Welcome to the 293rd edition of the Data Science Briefing!

We're proud to announce that a brand new Data Visualization with Python on-demand video is now available on the O'Reilly website: Python Data Visualization: Create impactful visuals, animations and dashboards. This in-depth tutorial is almost 7 hours long and covers fundamental and advanced usage of matplotlib, seaborn, plotly and bokeh, as well as tips on how to use Jupyter widgets. Check it out!

The latest blog post in the Epidemiology series is also out: Demographic Processes. In this post we explore how to include birth and death rates in your epidemik models. Check it out!

This week’s picks show AI maturing from flashy demos to hard choices about governance, infrastructure, and pedagogy: funders are now piloting algorithms to triage grant proposals, promising speed, but raising fresh worries about bias and opacity in how “promising science” gets defined. In the classroom, Google’s “Learn Your Way” turns static textbooks into adaptive, interactive study paths and reports improved learning outcomes in early tests by Google Research.

On the policy front, Meredith Whittaker warns that agentic AI could erode privacy and competition by normalizing cross-app data access and automation without robust safeguards. Meanwhile, the era of unchecked scraping looks to be ending as publishers and infrastructure providers push licensing regimes and emerging standards, tightening access to training data.

Inside academia, Nature flags a new tool spotting undisclosed LLM-generated text in manuscripts and peer reviews—evidence that disclosure norms still lag practice. And under the hood, engineers are tackling inference nondeterminism itself, tracing variance to concurrency and floating-point quirks and proposing ways to make outputs reproducible at scale.
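As a minimal illustration of the floating-point side of this (our own sketch, not code from the linked article): floating-point addition is not associative, so summing the same values in a different order can give a slightly different result. The two groupings below are an assumed stand-in for the different reduction orders that parallel execution can produce.

```python
# Floating-point addition is not associative: regrouping the same
# three values changes the rounded result.
a, b, c = 0.1, 0.2, 0.3

left_first = (a + b) + c    # one possible reduction order
right_first = a + (b + c)   # another order over the same values

print(left_first == right_first)      # False
print(abs(left_first - right_first))  # tiny, but not zero
```

The gap is on the order of 1e-16 here, but small differences like this can compound across the many accumulations in a model's forward pass.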

This week’s research traces a provocative arc from “making models think” to managing the messy worlds they enter: reinforcement learning is being retooled to reward multi-step reasoning rather than short-horizon shortcuts, while a comprehensive survey maps the emerging toolbox for Large Reasoning Models to push reliability beyond prompt tricks.

On the foundations side, new bounds on optimal time estimation in stochastic processes suggest hard limits for any system attempting to maintain stable clocks, particularly for inference scheduling and reproducible evaluation. Biology presents both promise and peril, as genome language models can now co-design viable bacteriophages, foreshadowing programmable therapeutics and raising thorny biosecurity questions.

Socially, the line between utility and intimacy blurs as large-scale analyses of “AI companionship” communities reveal attachment patterns that product teams and policymakers can’t ignore. Meanwhile, adversaries exploit the same cognitive hooks: pig-butchering scams follow a predictable lifecycle ripe for automated detection, and “LLM hacking” exposes how seemingly benign annotation workflows can be subverted by prompt injection and data poisoning.

The overall takeaway: better reasoning isn’t enough; we need robust timing, safety, and governance to match the new capabilities.

Our current book recommendation is Mark Carrigan’s "Generative AI for Academics". You can find all the previous book reviews on our website. This week's video is "Machine Learning vs Human Learning: They’re Not Alike".

Data shows that the best way for a newsletter to grow is by word of mouth, so if you think one of your friends or colleagues would enjoy this newsletter, go ahead and forward this email to them. This will help us spread the word!

Semper discentes,

The D4S Team


Mark Carrigan’s "Generative AI for Academics" is a brisk, sensible map for using LLMs in scholarly life. It avoids both hype and doom, treating generative AI as a set of tools that demand judgment, not blind adoption. The tone is practical and reflective—ideal for faculty, PIs, and grad students who need shared language and guardrails.

The book shines in how it organizes academic work (Thinking, Collaborating, Communicating, Engaging), then pairs each with concrete practices (rubber-ducking, draft refinement, critical oversight). It isn’t a prompt cookbook or a windy manifesto; it’s a clear framework for responsible use, culture-setting, and policy discussions in departments and labs.

Data scientists and ML engineers will find valuable takeaways for literature synthesis, design reviews, code docs, and stakeholder comms. But if you want model internals, rigorous eval protocols, threat modeling, or MLOps patterns, the book skims the surface. Bottom line: keep it close for norms, ethics, and mentoring; pair it with technical playbooks when you need depth.


  1. AI enters the grant game, picking winners [science.org]
  2. Learn Your Way: Reimagining textbooks with generative AI [research.google]
  3. What Meta learned from Galactica, the doomed model launched two weeks before ChatGPT [venturebeat.com]
  4. AI agents are coming for your privacy, warns Meredith Whittaker [economist.com]
  5. AI-Scraping Free-for-All by OpenAI, Google, and Meta Is Over [nymag.com]
  6. AI tool detects LLM-generated text in research papers and peer reviews [nature.com]
  7. Defeating Nondeterminism in LLM Inference [thinkingmachines.ai]


Machine Learning vs Human Learning: They’re Not Alike

All the videos of the week are now available in our YouTube playlist.

Upcoming Events:

Opportunities to learn from us

On-Demand Videos:

Long-form tutorials

Data For Science, Inc
