Taking Machine Learning to Production with New Features in MLflow
Matei Zaharia
Assistant Professor of Computer Science; Original Creator of Apache Spark & MLflow, Databricks
Deploying and operating machine learning applications is challenging because they are highly dependent on input data and can fail in complex ways. Problems such as training/inference differences in data format, data skew, and misconfigured software environments can easily sneak into a production application and impact its quality. To address these types of problems, organizations are adopting ML Platform software and MLOps practices specifically for managing machine learning applications.
In this talk, I’ll present some of the latest functionality added for productionizing machine learning in MLflow, the popular open source machine learning platform started by Databricks in 2018. These features include built-in support for model management and review using the Model Registry, APIs for automatic Continuous Integration and Delivery (CI/CD), model schemas to catch differences in a model’s expected data format, and integration with model explainability tools. I’ll also talk about other work happening in the open source MLflow community, including deep integration with PyTorch and its growing ecosystem of model productionization tools.
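To make the idea of a model schema concrete, here is a minimal sketch in plain Python. It is not MLflow's API: the `Column` class, the `schema_violations` helper, and the example column names are all hypothetical, chosen only to illustrate how recording the expected input format at training time lets a mismatch be caught before inference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """One expected input column (hypothetical helper, not an MLflow class)."""
    name: str
    dtype: type


def schema_violations(schema, row):
    """Return a list of readable problems; an empty list means the row conforms."""
    problems = []
    # Every column recorded at training time must be present with the right type.
    for col in schema:
        if col.name not in row:
            problems.append(f"missing column: {col.name}")
        elif not isinstance(row[col.name], col.dtype):
            problems.append(
                f"column {col.name}: expected {col.dtype.__name__}, "
                f"got {type(row[col.name]).__name__}"
            )
    # Columns the model never saw during training are also reported.
    expected = {col.name for col in schema}
    for name in row:
        if name not in expected:
            problems.append(f"unexpected column: {name}")
    return problems


# Assumed example schema, captured alongside the model at training time.
TRAINING_SCHEMA = [Column("age", int), Column("income", float)]

good = {"age": 42, "income": 55000.0}
bad = {"age": "42", "zip": "94105"}  # wrong type, missing and extra columns

print(schema_violations(TRAINING_SCHEMA, good))
print(schema_violations(TRAINING_SCHEMA, bad))
```

A serving layer could run a check like this on each request and reject or log nonconforming inputs rather than letting them silently degrade predictions.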
Demo: CI/CD and MLOps with MLflow
Kasey Uhlenhuth
Sr Product Manager, Machine Learning, Databricks
PyTorch and MLflow, from Research to Production
Lin Qiao
Engineering Director, PyTorch, Facebook
Lin Qiao, engineering director on the Facebook AI team, talks about bringing machine learning to production at scale, including the PyTorch integration with MLflow. She covers the guiding principles for PyTorch and its goals, from those set in 2016 during initial development through the present day, with a focus on ecosystem compatibility.
Lin reviews the PyTorch production ecosystem and discusses how MLflow and PyTorch are integrated for tracking, models and model serving.
About:
Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.
Read more here: databricks.com/product/unifie...
Connect with us:
Website: databricks.com
Facebook: facebook.com/databricksinc
Twitter: twitter.com/databricks
LinkedIn: linkedin.com/company/data...
Instagram: instagram.com/databricksinc/
- Taking Machine Learning to Production with New Features in MLflow | Keynote Data + AI Summit EU 2020