Talks
Speaking engagements on data science, software engineering, and machine learning.
2026
PyCon US · with Robert Masson
A hands-on tutorial for anyone who wants to turn Jupyter Notebook code into something robust and deployable. Covers refactoring notebook code into clean Python scripts, writing and running tests with pytest, and building an API with FastAPI — including how to use AI effectively as part of the process.
Tutorial Python Jupyter Notebooks Production Code FastAPI Best Practices
Women in Data Science Puget Sound
AI comes with a significant environmental cost: training and serving models consume vast amounts of electricity, water, and raw materials, and AI already accounts for around 15% of data center energy usage. Covers the latest data on AI's environmental impact from OpenAI, Google, and Anthropic, what providers aren't disclosing, and practical techniques, including prompt optimization, model selection, and caching, to reduce both environmental footprint and cost.
Talk AI Sustainability Data Science LLMs Ethics
University of Washington (Invited Seminar)
An invited seminar covering why software engineering best practices matter in data science: what makes code "good", how to write readable and reproducible code, how to structure a project effectively, how to share code for others to use, and how automating workflows fits in with modern AI coding tools.
Talk Data Science Software Engineering Python Best Practices
2025
PyData Seattle · with Robert Masson
A practical introduction to refactoring notebooks into production-ready Python scripts. We covered the differences between notebook and production code, principles of writing modular code, and how to write simple unit tests.
Tutorial Data Science Jupyter Notebooks Production Code Best Practices
Talk Python Podcast
Techniques and tools to move your data science game from local notebooks to full-on production workflows — covering what changes when you go from exploratory work to systems that run reliably in production.
Podcast Python Production Code Jupyter Notebooks
PyCon US
Strategies, tools, and skills for refactoring notebook code into production-ready systems. Jupyter Notebooks make it easy to start exploring data, but eventually you'll want to turn your exploratory code into something more robust and automated. Covers practical techniques for moving from exploration to scalable, reproducible code.
Talk Python Jupyter Notebooks Production Code Best Practices
2024
Software Engineering Radio
The collaboration between data scientists and software engineers — covering the role of a data scientist, the difference between notebook experiments and automated production pipelines, and the role of software engineering in data science.
Podcast Machine Learning Software Engineering Data Science
MLOps Podcast
As data science projects grow and increasingly include deploying ML models, software engineering fundamentals (modularity, readability, reproducibility) matter more than ever. Covers how to apply software engineering principles even in exploratory, open-ended data science work.
Podcast MLOps Software Engineering Best Practices
2023
Women in Data Science Puget Sound
Data scientists write a lot of code, but most of us didn't start as software engineers. Covers software engineering principles — efficiency, readability, modularity, simplicity, and robustness — and how to apply them to data science work, illustrated with examples from pandas, NumPy, and scikit-learn.
Talk Data Science Software Engineering Python Best Practices
People of AI Podcast
Career journey from geophysicist to principal data scientist: pivoting into ML, setting the standard for building machine learning pipelines, and why it all starts with how you prepare your data and train your models.
Podcast Career Machine Learning Data Science
2020
PyData Global · with Hannes Hapke
Deploying a machine learning model to a production system is a critical moment: the model starts to interact with real people. That makes it the right time to check that the model's predictions aren't showing any harmful biases. Covers how to build a production pipeline using the TensorFlow ecosystem that includes ways to identify and reduce harmful impacts.
Talk ML Engineering Ethics Fairness TensorFlow
Scale By The Bay · with Hannes Hapke
Production machine learning systems fail in unexpected ways — data shifts, preprocessing mismatches, and misleading metrics. Covers automated ML pipelines using TensorFlow Extended and Kubeflow Pipelines, including data validation, preprocessing, training, in-depth model analysis, and automated orchestration to produce consistent, fair models.
Talk ML Engineering TensorFlow Kubeflow MLOps Fairness
TWIML AI Podcast
A live debate in which advocates for several popular programming languages make their best arguments. Represented Python, covering how different languages shape thinking and the strengths and weaknesses of different approaches.
Podcast Python Machine Learning
PyCon US
A tour of the landscape of Python tools for privacy-preserving machine learning, from federated learning (where data stays on users' devices) to training on encrypted data. Reviews what works, what doesn't, and how tools like TensorFlow Privacy, TensorFlow Encrypted, and PySyft fit into real-world ML pipelines, covering the tradeoffs of each approach and touching on the ethics of using personal data.
Talk Machine Learning Data Privacy TensorFlow Ethics
2018
Grace Hopper Conference
An exploration of data anonymization techniques from a data scientist's perspective, covering practical approaches to protecting privacy while preserving data utility.
Talk Data Science Data Privacy Ethics