AWS re:Invent 2021 - Large-scale distributed training of media ML models with Amazon FSx

In this session, learn about the challenges Netflix faces in scaling distributed training of media machine learning models on multi-GPU nodes, and how Amazon FSx is used to resolve the data loader performance bottlenecks in the training system. See the resulting performance and throughput improvements on multi-node GPUs, and how well Amazon FSx scales.

Learn more about re:Invent 2021 at bit.ly/3IvOLtK


  • AWS re:Invent 2021 - Large-scale distributed training of media ML models with Amazon FSx
  • AWS re:Invent 2020: AWS infrastructure for large-scale distributed ML training
  • AWS re:Invent 2021 - Simplify your file-based workloads with Amazon FSx
  • AWS re:Invent 2021 - Deep dive on Amazon FSx for Lustre | AWS Events
  • MLOps talks I want to attend at re:Invent 2021
  • AWS re:Invent 2022 - Deep dive on accelerating HPC and ML with Amazon FSx (STG343)
  • AWS re:Invent 2021 - From game worlds to real worlds: Large-scale simulation with AWS
  • AWS re:Invent 2022 - Train ML models at scale with Amazon SageMaker, featuring AI21 Labs (AIM301)
  • AWS re:Invent 2021 - Building on 15 years of compute innovation
  • AWS re:Invent 2021 - {New Launch} Introducing AWS Trainium-based Amazon EC2 Trn1 instances
  • SDC2020: Amazon FSx For Lustre Deep Dive and its importance in Machine Learning
  • How to get the most out of your data with Intelligent Search - AWS Online Tech Talks
  • AWS re:Invent 2023 - Accelerate ML and HPC with high performance file storage (STG340)
  • AWS re:Invent 2021 - Introducing Amazon Kinesis Data Streams On-Demand Mode
  • AWS re:Invent 2022 - Choosing the right accelerator for training and inference (CMP207)