From the book cover:

“Data science happens in code. The ability to write reproducible, robust, scalable code is key to a data science project’s success—and is absolutely essential for those working with production code. This practical book bridges the gap between data science and software engineering, and clearly explains how to apply the best practices from software engineering to data science.

Examples are provided in Python, drawn from popular packages such as NumPy and pandas. If you want to write better data science code, this guide covers the essential topics that are often missing from introductory data science or coding classes, including how to:

  • Understand data structures and object-oriented programming
  • Clearly and skillfully document your code
  • Package and share your code
  • Integrate data science code with a larger code base
  • Write APIs
  • Create secure code
  • Apply best practices to common tasks such as testing, error handling, and logging
  • Work more effectively with software engineers
  • Write more efficient, maintainable, and robust code in Python
  • Put your data science projects into production
  • And more…”

What experts say

"This book is the missing link data scientists have long sought, masterfully bridging the gap between data science and software engineering. It offers a clear, actionable guide that fills the crucial skill gap many data scientists face in software engineering, elevating their coding practices to new heights. Truly, this is the book we've been waiting for."

— Gabriela de Queiroz, Director of AI, Microsoft; Startup Advisor and Angel Investor

"Catherine's book demystifies how to scale your individual work to production capacity. Whether you are a data scientist, developer, or executive, she makes data services at scale accessible. From startup to massive corporate data, following her best practices will set your data projects up for success."

— Carol Willing, Core Developer of Python; 2017 ACM Software System Award recipient for Jupyter's lasting influence

"I love this book! It's the missing piece on every data scientist's shelf. For years, bootcamps, universities, and industry managers have been trying to get skilled scientists to function more like software engineers. No book bridges that gap, until this one."

— Shawn Ling Ramirez, CEO, eloraHQ

"Software Engineering for Data Scientists is a must read if you want to take your data science skills from ideas to fully implemented systems. It's a terrific guide to help you through the most important engineering aspects of coding. I wish I'd had this book years ago, it would have saved me countless hours! I thoroughly recommend it."

— Laurence Moroney, AI Advocacy Lead, Google

"Since its beginnings, data scientists have come from a wide variety of backgrounds in education and experience. While in many ways this has been a strength of the field, often data scientists lack the software engineering skills to work closely with peers from more traditional software development backgrounds. In this book, Catherine Nelson provides a much-needed bridge between the two disciplines, giving data scientists the knowledge to level up their own work and impact."

— Chris Albon, Director of Machine Learning, The Wikimedia Foundation