Kisaco Research

The Evolution of Hyperscale Data Centers: From CPU-Centric to GPU-Accelerated AI Applications

In recent years, hyperscale data centers have been optimized for scale-out stateless applications and zettabyte storage, with a focus on CPU-centric platforms. However, as the infrastructure shifts towards next-generation AI applications, the center of gravity is moving towards GPU/accelerators. This transition from "millions of small stateless applications" to "large AI applications running across clusters of GPUs" is pushing the limits of accelerators, network, memory, topologies, rack power, and other components. To keep up with this dramatic change, innovation is necessary to ensure that hyperscale data centers can continue to support the growing demands of AI applications. This keynote speech will explore the challenges and opportunities of this evolution and highlight the key areas where innovation is needed to enable the future of hyperscale data centers.

Systems Infrastructure/Architecture

AI/ML Compute

Author:

Manoj Wadekar

AI Systems Technologist

Author:

Dr. Vibhor Aggarwal

Manager: Digital & Scientific HPC

Shell

Vibhor is an R&D leader with 14 years of experience and expertise in HPC Software, Scientific Visualization, Cloud Computing and AI technologies. He and his team at Shell are currently working on optimizing HPC software for simulations, large-scale and generative AI, combining Physics and AI models, developing platforms and products for HPC-AI solutions, and emerging HPC areas for the energy transition at the forefront of Digital Innovation. He has two patents and several research publications. Vibhor has a BEng in Computer Engineering from University of Delhi and a PhD in Engineering from University of Warwick.

Read more about Exploring CXL Use Cases and the Future of Disaggregated Heterogeneous Memory Architecture

Transforming In-Memory Database Infrastructure

Oracle AI Vector Search enables enterprises to leverage their own business data to build cutting-edge generative AI solutions. AI Vectors are data structures that encode the key features or essence of unstructured entities such as images or documents. The more similar two entities are, the shorter the mathematical distance between their corresponding AI vectors. With AI Vector Search, Oracle Database is introducing a new vector datatype, new vector indexes (in-memory neighbor graph indexes and neighbor partitioned indexes), and new Vector SQL operators for highly efficient and powerful similarity search queries. Oracle AI Vector Search enables applications to combine their business data with large language models (LLMs) using a technique called Retrieval Augmented Generation (RAG), to deliver highly accurate responses to natural language questions. With AI Vector Search in Oracle Database, users can easily build AI applications that combine relational searches with similarity search, without requiring data movement to a separate vector database, and without any loss of security, data integrity, consistency, or performance.
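The idea that similar entities sit at a short distance from each other can be illustrated with a minimal sketch. This is not Oracle's API or index structure: it is a brute-force search in plain Python, and the function names, the three-dimensional vectors, and the document labels are all hypothetical values chosen for illustration. Euclidean distance is assumed here; other distance metrics are equally possible.

```python
import math


def euclidean_distance(a, b):
    """Return the Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(query, vectors, k=2):
    """Return the names of the k stored vectors closest to the query.

    This is an exhaustive scan over every stored vector. A real vector
    index would avoid comparing the query against all entries.
    """
    ranked = sorted(
        vectors.items(),
        key=lambda item: euclidean_distance(query, item[1]),
    )
    return [name for name, _ in ranked[:k]]


# Hypothetical embeddings: two invoices that resemble each other
# and one unrelated image.
documents = {
    "invoice_2023": [0.9, 0.1, 0.0],
    "invoice_2024": [0.7, 0.3, 0.1],
    "cat_photo": [0.0, 0.1, 0.9],
}

query = [0.85, 0.15, 0.05]
closest = nearest_neighbors(query, documents, k=2)
print(closest)
```

With these made-up vectors the two invoices rank ahead of the photo, because their vectors lie much closer to the query.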

Author:

Tirthankar Lahiri

SVP, Data & In-Memory Technologies

Oracle

Tirthankar Lahiri is Vice President of the Data and In-Memory Technologies group for Oracle Database and is responsible for the Oracle Database Engine (including Database In-Memory, Data and Indexes, Space Management, Transactions, and the Database File System), the Oracle TimesTen In-Memory Database, and Oracle NoSQLDB. Tirthankar has 22 years of experience in the Database industry and has worked extensively in a variety of areas including Manageability, Performance, Scalability, High Availability, Caching, Distributed Concurrency Control, In-Memory Data Management, and NoSQL architectures. He has 27 issued patents and several pending patents in these areas. Tirthankar has a B.Tech in Computer Science from the Indian Institute of Technology (Kharagpur) and an MS in Electrical Engineering from Stanford University.


Recommendation Systems - Data Demands & Infrastructure Requirements

Author:

Puja Das

Senior Director, Personalization

Warner Bros. Entertainment

Dr. Puja Das leads the Personalization team at Warner Bros. Discovery (WBD), which includes offerings on Max, HBO, Discovery+ and many more.

Prior to WBD, she led a team of Applied ML researchers at Apple who focused on building large-scale recommendation systems to serve personalized content on the App Store, Arcade and Apple Books. Her areas of expertise include user modeling, content modeling, recommendation systems, multi-task learning, sequential learning and online convex optimization. She also led the Ads prediction team at Twitter (now X), where she focused on relevance modeling to improve App Ads personalization and monetization across all Twitter surfaces.

She obtained her Ph.D. in Machine Learning from the University of Minnesota, where her dissertation focused on online learning algorithms that work on streaming data. Her dissertation was the recipient of the prestigious IBM Ph.D. Fellowship Award.

She is active in the research community and is part of the program committee at ML and recommendation system conferences. She has mentored several undergraduate and graduate students and participated in various round table discussions through the Grace Hopper Conference, the Women in Machine Learning Program co-located with NeurIPS, AAAI, and the Computing Research Association Women's chapter.


Indirect/Irregular Workloads within Large Simulations and How to Improve Access through Co-Design

Los Alamos National Laboratory (LANL) has a diverse set of High Performance Computing codes. Analysis of many of these codes indicates they are heavily memory bound with sparse memory accesses. High Bandwidth Memory (HBM) has proven a significant advancement in improving the performance of these codes, but the roadmap for major (step function) improvements in memory technologies is unclear. Addressing this challenge will require a renewed focus on high performance memory and processor technologies that take a more aggressive and holistic view of advancements in ISA, microarchitecture, and memory controller technologies. Beyond scientific simulations, advancements in performance of sparse memory accesses will benefit graph analysis, DLRM inference, and database workloads.
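To make "sparse memory accesses" concrete, here is a minimal sketch of one common pattern that produces them. The abstract does not name a specific kernel, so the choice of a sparse matrix-vector product in compressed sparse row (CSR) form is an assumption made for illustration, as are the function name and the sample data. The point is the read `x[col_idx[k]]`: the address depends on data loaded from another array, so consecutive reads can land far apart in memory.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Multiply a CSR-format sparse matrix by a dense vector x.

    values  : nonzero entries, stored row by row
    col_idx : column index of each nonzero entry
    row_ptr : row_ptr[r] .. row_ptr[r + 1] delimits row r in values
    """
    num_rows = len(row_ptr) - 1
    y = [0.0] * num_rows
    for row in range(num_rows):
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # Indirect access: the position read from x is only known
            # after col_idx[k] has been loaded.
            y[row] += values[k] * x[col_idx[k]]
    return y


# Hypothetical 3x4 matrix:
#   [2 0 0 1]
#   [0 3 0 0]
#   [4 0 5 0]
values = [2.0, 1.0, 3.0, 4.0, 5.0]
col_idx = [0, 3, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0, 4.0]

y = csr_matvec(values, col_idx, row_ptr, x)
print(y)  # [6.0, 6.0, 19.0]
```

The reads of `values` and `col_idx` are sequential, while the reads of `x` jump between positions 0, 3, 1, 0 and 2. On large problems that irregular stream is the part that tends to be limited by memory behavior rather than arithmetic.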

Author:

Galen Shipman

Computer Scientist

Los Alamos National Laboratory

Galen Shipman is a computer scientist at Los Alamos National Laboratory (LANL). His interests include programming models, scalable runtime systems, and I/O. As Chief Architect, he leads architecture and technology of Advanced Technology Systems (ATS) at LANL. He has led performance engineering across LANL's multi-physics integrated codes and the advancement and integration of next-generation programming models such as the Legion programming system as part of LANL's next-generation code project, Ristra. His work in storage systems and I/O is currently focused on composable micro-services as part of the Mochi project. His prior work in scalable software for HPC includes major contributions to broadly used technologies including the Lustre parallel file system and Open MPI.

