Enabling Hybrid Cloud Analytics and AI with Data Orchestration
Date: 11th August 2020 14:38
The data ecosystem has evolved substantially over the past two decades. There has been an explosion of data-driven frameworks: Presto, Spark, and Hive to run analytics and ETL queries, and TensorFlow and PyTorch to train and serve models. On the storage side, the approach to managing and storing data has shifted from HDFS to cheaper, more scalable services that are separated from compute, typified by cloud stores like AWS S3. As a result, data engineering has become increasingly complex, inefficient, and difficult, particularly in hybrid and cloud environments.
As the amount of data analyzed and stored continues to grow exponentially, fixed on-premises infrastructure like Apache Hadoop data lakes becomes costly. Add the need to support newer, popular frameworks on an already busy data lake, and it is common to see Hadoop-based data lakes running beyond 100% utilization, with processing split between physical and cloud infrastructure. As a result, companies are looking to leverage the flexibility and cost savings of the cloud.
Adit Madan and Parviz Peiravi offer an overview of the Alluxio data orchestration layer, which provides unified data access for hybrid and multi-cloud deployments and leverages Intel® Optane™ Persistent Memory for higher-performance caching at reduced cost. This data access layer enables distributed compute engines like Presto, TensorFlow, and PyTorch to transparently access data from various storage systems (including S3, HDFS, and Azure) while using a multi-tier cache to accelerate data access.
- Challenges with migration of an on-prem data lake to a public cloud
- The simplified migration journey for Big Data and AI to the cloud with Alluxio
- An overview of the hybrid data lake solution, including how Alluxio and Intel® Optane™ Persistent Memory can provide higher performance at reduced cost
- Case studies: bursting compute to the cloud without making a persistent copy
- A special demo
Name: Adit Madan
Title: Technical Product Manager at Alluxio
Bio: Adit Madan is a technical product manager at Alluxio. He is also a core maintainer and PMC member of the Alluxio Open Source project. He was a research engineer at HPE before joining Alluxio. His experience is in distributed systems, storage systems, and large-scale data analytics. He has an M.S. from Carnegie Mellon University and a B.S. from IIT.
Name: Parviz Peiravi
Title: Global CTO/Principal Engineer for Financial Services Industry Solutions
Bio: Parviz has been with Intel for 23+ years. He holds a degree in Computer and Electrical Engineering and is a recipient of the Intel Achievement Award (IAA) and the Intel Quality Award (IQA). He is primarily responsible for designing and driving the development of Artificial Intelligence, Big Data, Service-Oriented/Microservices Architecture, Cloud, and IoT computing architectures in support of Intel’s focus areas within the financial services industry. He is a member of Silicon Valley CTO Professionals, Linux Foundation, Blockchain Hyperledger, Cloud Computing Group E3C, Cloud Security Alliance (CSA), DMTF, and other organizations.