Data of an Unusual Size: A practical guide to analysis and interactive visualization of massive datasets 1

A hands-on tutorial on the fundamentals of big data computation from a practical perspective. It covers distributed computing with Dask and interactive visualization and dashboards with hvPlot and Panel, all on the cloud with a real ~70 GB dataset.

This tutorial is co-authored by Dharhas Pothina, my colleague at Quansight. :)

Community-first open source: An action plan! 1

A step-wise guide for creating community-driven projects, including repository management, contributor pathways, and governance principles, with real examples from our journey (at Quansight) transitioning a company-backed OSS project, Nebari, to be more community-oriented.

My colleague at Quansight, Tania Allard, helped prepare the talk and championed several of the initiatives it discusses.

Collaboration Infrastructure In Data Science: Tools, Challenges, And Best Practices

PyLadiesCon 2023 · 2nd December

A talk sharing tools, principles, and best practices for collaboration while using PyData libraries, with a focus on infrastructure like Jupyter and conda tools, and a discussion about some collaboration-related gaps and potential solutions in our ecosystem.

PyLadiesCon is a fully online conference, and I presented this talk in the APAC track. :)

🔗 Recording

Ensuring runtime reproducibility in the Python ecosystem

PyData Global 2023 · 8th December

A talk about how to proactively think about reproducibility while working on Python projects. It discusses general best practices and dives into a tool, conda-store, built around ensuring reproducibility.

My colleague at Quansight, Jaime Rodríguez-Guerra, co-authored this talk.

PyData Global is a fully online conference, and I presented this talk in the General track.

🔗 Recording

  1. I couldn’t travel to some of the conferences listed here due to visa-related issues, so my colleagues at Quansight graciously helped present these talks. 🌻 ↩︎ ↩︎