Deep learning (DL) has shown impressive capabilities across various fields. However, the growing size of DL models is outpacing hardware capacity, creating demand for large models to be trained and run for inference more efficiently and easily. We'll introduce Colossal-AI, a user-friendly DL system that enables users to maximize the efficiency of AI training and inference at drastically reduced cost. It integrates advanced techniques such as efficient multidimensional parallelism, heterogeneous memory management, and adaptive task scheduling. You'll walk away with a better understanding of the parallelism and memory optimization techniques behind large-model training and inference, learn practical applications of the system (including natural language processing, computer vision, and bioinformatics), and be ready to contribute to the coming era of large AI models. Although it isn't required, a basic understanding of DL and distributed systems will help you follow the design of Colossal-AI.
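To give a flavor of the ideas covered, here is a minimal pure-Python sketch of 1D (column-wise) tensor parallelism, the simplest building block of the multidimensional parallelism mentioned above. All names are illustrative and simulate multiple devices in a single process; this is not the Colossal-AI API.

```python
# Illustrative sketch of 1D tensor parallelism (column-wise weight sharding).
# Each simulated "device" holds one shard of the weight matrix and computes
# its slice of the output independently; results are concatenated at the end.

def matmul(x, w):
    # x: (m, k) list-of-lists; w: (k, n) list-of-lists -> (m, n)
    return [[sum(x[i][t] * w[t][j] for t in range(len(w)))
             for j in range(len(w[0]))]
            for i in range(len(x))]

def split_columns(w, parts):
    # Shard the weight matrix column-wise across `parts` simulated devices.
    step = len(w[0]) // parts
    return [[row[p * step:(p + 1) * step] for row in w] for p in range(parts)]

def parallel_linear(x, w, parts=2):
    # Each shard's matmul is independent work that could run on its own GPU;
    # concatenating the per-shard outputs reconstructs the full result.
    outs = [matmul(x, shard) for shard in split_columns(w, parts)]
    return [sum((o[i] for o in outs), []) for i in range(len(x))]

x = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 2.0]]
assert parallel_linear(x, w, parts=2) == matmul(x, w)
```

Because each shard's computation touches only its slice of the weights, the weight memory per device shrinks proportionally to the number of shards, which is the basic reason tensor parallelism lets models exceed single-GPU memory.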