Jul 4th

Next webinar:
Jul 9, 2025 - LangChain for Generative AI Pipelines

Dear Reader,

Welcome to the 4th of July edition of the Data Science Briefing! This month, we're celebrating the 6th anniversary of this humble newsletter, and as usual, we have a few surprises in store for you throughout July!

We're proud to announce that a brand new Data Visualization with Python on-demand video is now available on the O'Reilly website: Python Data Visualization: Create impactful visuals, animations and dashboards. This in-depth tutorial is almost 7 hours long and covers fundamental and advanced usage of matplotlib, seaborn, plotly, and bokeh, as well as tips on how to use Jupyter widgets. Check it out!

The latest blog post in the Epidemiology series is also out: Demographic Processes. In this post, we explore how to include birth and death rates in your epidemik models. Check it out!

Before we dig in, we're proud to announce that the next edition of the "Generative Artificial Intelligence with the OpenAI API" webinar has been scheduled for September 3rd, and you can already register! Also, don't forget that we'll be discussing LangChain for Generative AI Pipelines next Wednesday, July 9th. There are only a few spots left! Register now so you don't miss out!

This week’s newsletter explores the evolving landscape of AI and software engineering, challenging some of the field’s most persistent assumptions. In “Writing Code Was Never The Bottleneck,” the author argues that the true hurdles in software development aren’t in producing code, but in the complex processes of code review, testing, and team communication—bottlenecks that have only become more pronounced as large language models accelerate code generation. Meanwhile, “There Are No New Ideas in AI… Only New Datasets” contends that the most significant leaps in AI have come not from novel algorithms, but from access to new and richer datasets, suggesting that future breakthroughs will depend on how we curate and utilize data rather than on conceptual innovation. On the technical frontier, “Context Engineering for Agents” highlights the emerging discipline of strategically managing the information fed to AI agents, likening it to optimizing a computer’s RAM to ensure agents have just the right context for each decision—a crucial skill as agents tackle increasingly complex, multi-step tasks. Finally, Anthropic’s deep dive into building a multi-agent research system reveals how orchestrating specialized AI agents in parallel can yield more comprehensive and reliable answers, but also introduces new challenges in coordination, cost, and evaluation.

From academia, we have a comprehensive empirical study revealing that computer vision research is deeply intertwined with the development of surveillance technologies: the majority of published work and patents focus on extracting data about humans and normalize surveillance as a core application of the field. Meanwhile, the quest for efficiency and order is exemplified by a breakthrough algorithm for organizing bookshelves, which approaches the theoretical limits of sorting and has far-reaching implications for data structures and file management in computing. On the architectural front, new work demonstrates that transformers can be rigorously understood as a form of graph neural network, unifying two of the most influential paradigms in deep learning and highlighting the hardware-driven advantages of dense attention mechanisms.

As the AI landscape shifts toward agentic systems, a new paper makes a compelling case for small language models (SLMs) as the future of agentic AI, arguing that SLMs are not only sufficiently powerful for specialized tasks but also more economical and operationally suitable than their larger counterparts, especially in heterogeneous agentic systems. Complementing these technical advances, a comprehensive survey of AI4Research maps the expanding role of AI across the scientific workflow, from literature comprehension and hypothesis generation to writing and peer review, and calls for greater interdisciplinarity, transparency, and explainability in research automation.

This week's book is "Behavioral Network Science: Language, Mind, and Society" by T. T. Hills. You can find all the previous book recommendations on our website. In this week's video, we feature a deep dive into AI prompt engineering.

Data shows that the best way for a newsletter to grow is by word of mouth, so if you think one of your friends or colleagues would enjoy this newsletter, go ahead and forward this email to them. This will help us spread the word!

Semper discentes,

The D4S Team

"Behavioral Network Science: Language, Mind, and Society" by T. T. Hills successfully bridges two distinct scientific domains, demonstrating how network analysis can reveal hidden patterns in human behavior. The book tackles an impressive scope of topics, from language evolution and childhood learning to cognitive aging, creativity, and social dynamics, while maintaining remarkable coherence throughout. What sets this work apart is Hills' commitment to practical application, equipping readers with concrete tools, including an introductory guide to network science and accompanying R code that enables hands-on analysis.

This practical approach makes the book uniquely valuable to a diverse audience. Behavioral scientists unfamiliar with network methods will find an accessible entry point, while data scientists can discover rich applications in behavioral research. Hills demonstrates particular skill in addressing contemporary social issues through a network lens, offering fresh perspectives on polarization, echo chambers, and conspiracy theories. The interdisciplinary framework proves especially powerful when examining how individual cognitive processes scale up to shape collective behavior and social structures.

The book's most significant achievement lies in its clarity without oversimplification. Hills effectively conveys complex concepts with precision while maintaining an engaging and accessible tone. This balance makes "Behavioral Network Science" essential reading for anyone seeking to understand how network structures influence human behavior across scales—from individual minds to entire societies.

Computer-vision research powers surveillance technology (P. R. Kalluri, W. Agnew, M. Cheng, K. Owens, L. Soldaini, A. Birhane)
An Algorithm for a Better Bookshelf (E. Klarreich)
Transformers are Graph Neural Networks (C. K. Joshi)
Universal pre-training by iterated random computation (P. Bloem)
Small Language Models are the Future of Agentic AI (P. Belcak, G. Heinrich, S. Diao, Y. Fu, X. Dong, S. Muralidharan, Y. C. Lin, P. Molchanov)
AI4Research: A Survey of Artificial Intelligence for Scientific Research (Q. Chen, M. Yang, L. Qin, J. Liu, Z. Yan, J. Guan, D. Peng, Y. Ji, H. Li, M. Hu, Y. Zhang, Y. Liang, Y. Zhou, J. Wang, Z. Chen, W. Che)
Patterns and Dynamics of Netflix TV Show Popularity (N. Lee, J. Lim, H.-C. Jeong)

AI prompt engineering: A deep dive

All the videos of the week are now available in our YouTube playlist.

Upcoming Events:

Opportunities to learn from us

Jul 9, 2025 - LangChain for Generative AI Pipelines [Register]
Sep 3, 2025 - Generative Artificial Intelligence with the OpenAI API for Developers [Register]

On-Demand Videos:

Long-form tutorials

Natural Language Processing 7h, covering basic and advanced techniques using NLTK and PyTorch.
Python Data Visualization 7h, covering basic and advanced visualization with matplotlib, ipywidgets, seaborn, plotly, and bokeh.
Time Series Analysis for Everyone 6h, covering data pre-processing, visualization, ARIMA, ARCH, and Deep Learning models.


Data For Science, Inc

🥳🥂🥳 Data Science Briefing #284 🥳🥂🥳
