Supermicro Computing Solutions for AI and Deep Learning

Connecting AI with Supermicro Deep Learning Technology

Deep Learning, a subset of Artificial Intelligence (AI) and Machine Learning (ML), is an advanced technique in computer science that uses multi-layered artificial neural networks to accomplish tasks that are too complex to program explicitly. For example, Google Maps processes millions of data points every day to find the best route for a trip or to predict the time needed to reach a destination. Deep Learning consists of two parts: training and inference. Training involves processing as many data points as possible so that the neural network 'learns' the relevant features on its own and adjusts itself to accomplish tasks such as image recognition and speech recognition. Inference refers to taking a previously trained model and using it to make useful predictions and decisions.
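The split between training and inference can be illustrated with a toy sketch. It assumes a single artificial neuron rather than a multi-layer network, uses only the Python standard library, and learns a hypothetical dataset (the logical OR function); the names `train` and `infer` are illustrative and do not belong to any framework mentioned on this page.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=2000, lr=0.5, seed=0):
    """Training: repeatedly adjust the weights to reduce prediction error."""
    rng = random.Random(seed)
    weights = [rng.uniform(-0.1, 0.1) for _ in range(len(samples[0]))]
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = pred - y  # gradient of the log loss w.r.t. the pre-activation
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def infer(model, x) -> bool:
    """Inference: apply the already-trained model to an input, no learning."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) >= 0.5


# Toy dataset (hypothetical, for illustration only): the logical OR function.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
Y = [0, 1, 1, 1]

model = train(X, Y)                          # expensive phase, done once
predictions = [infer(model, x) for x in X]   # cheap phase, done per request
```

Training loops over the data many times and changes the model; inference only reads the finished model. That difference is why training workloads tend to drive the GPU count of a cluster, while inference is usually sized by request volume and latency.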

AI & Deep Learning Platform

Supermicro's solution comes with a customizable selection of Deep Learning frameworks pre-installed, so that end users can start deploying Deep Learning projects right away without having to do any GPU programming. The solutions provide custom installation of deep learning frameworks including TensorFlow, Caffe2, MXNet, Chainer, Microsoft Cognitive Toolkit, and others.

The Supermicro AI & Deep Learning solution provides a complete AI / Deep Learning software stack. Below are the software packages included with the fully integrated, end-to-end solution:

AI & Deep Learning Software
Deep Learning Environment	Frameworks	Caffe, Caffe2, Caffe-MPI, Chainer, Microsoft CNTK, Keras, MXNet, TensorFlow, Theano, PyTorch
	Libraries	cuDNN, NCCL, cuBLAS
	User Access	NVIDIA DIGITS
	Operating Systems	Ubuntu, Docker, NVIDIA Docker
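On a delivered system, one quick way to see which of the listed frameworks are actually present is to probe for their Python packages. The sketch below is only an illustration: the import names in `FRAMEWORK_MODULES` are assumptions based on each project's usual Python package name, and the helper `installed_frameworks` is hypothetical, not part of any Supermicro tooling.

```python
import importlib.util

# Assumed import names; adjust to match the actual installation.
FRAMEWORK_MODULES = {
    "TensorFlow": "tensorflow",
    "PyTorch": "torch",
    "MXNet": "mxnet",
    "Chainer": "chainer",
    "Keras": "keras",
    "Theano": "theano",
    "Microsoft CNTK": "cntk",
    "Caffe": "caffe",
}


def installed_frameworks(modules=FRAMEWORK_MODULES):
    """Return {framework name: True/False} without importing the packages."""
    return {
        name: importlib.util.find_spec(module) is not None
        for name, module in modules.items()
    }


if __name__ == "__main__":
    for name, present in installed_frameworks().items():
        print(f"{name:<16}{'found' if present else 'not found'}")
```

`find_spec` only locates a package and does not import it, so the check stays fast and does not initialize any GPU runtime.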

Advantages of the Supermicro AI & Deep Learning Solution

A computing powerhouse
- The Supermicro AI & Deep Learning cluster is powered by Supermicro SuperServer® systems, which are compact, high-density compute powerhouses. The cluster is equipped with the latest GPUs from Supermicro's partner NVIDIA. Each compute node uses NVIDIA® Tesla® V100 or Ampere A100 GPUs.
High-density parallel compute
- Up to 32 GPUs with up to 2TB of GPU memory for maximum parallel processing, reducing training time for Deep Learning workloads.
Increased bandwidth with NVLink
- Uses NVLink™ technology for faster GPU-to-GPU communication at up to 600 GB/s, further improving system performance on large Deep Learning workloads.
Faster processing with Tensor Cores
- NVIDIA Tesla V100 and Ampere A100 GPUs use the new-generation Tensor Core architecture. Tensor Cores perform trillions of operations per second with high accuracy, which suits Deep Learning deployments, and can deliver up to 312 TFLOPS for training and inference applications.
Scale-out design
- Scale-out architecture built on a high-speed 200G IB EDR network fabric, fully scalable to accommodate future growth.
Rapid Flash Xtreme (RFX) – High-performance all-flash NVMe storage
- RFX is a leading complete storage system, developed and fully tested for AI & Deep Learning applications, combining Supermicro BigTwin™ with the WekaIO parallel file system.
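To put the NVLink figure above in perspective, the sketch below estimates the best-case time to move data between GPUs. It takes the 600 figure as gigabytes per second, which is the unit NVIDIA publishes for the A100's aggregate NVLink bandwidth, and it ignores latency and protocol overhead, so real transfers will be slower. The function name and constants are illustrative.

```python
def ideal_transfer_seconds(payload_gb: float, bandwidth_gb_per_s: float) -> float:
    """Lower bound on transfer time: payload divided by peak bandwidth."""
    if payload_gb < 0:
        raise ValueError("payload must be non-negative")
    if bandwidth_gb_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return payload_gb / bandwidth_gb_per_s


NVLINK_PEAK_GB_PER_S = 600.0  # assumed peak; a theoretical maximum, not a measured rate
A100_MEMORY_GB = 80.0         # memory of one A100 SXM4 80GB GPU

# Best-case time to copy the full memory of one GPU to a peer GPU.
full_copy_s = ideal_transfer_seconds(A100_MEMORY_GB, NVLINK_PEAK_GB_PER_S)
print(f"Ideal full-memory copy: {full_copy_s * 1000:.0f} ms")
```

Even at peak rate, exchanging a full 80GB of GPU memory takes on the order of a tenth of a second, which is why multi-GPU training exchanges only gradients or activations rather than whole memory images.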

Reference Architecture for AI & Deep Learning Systems

Supermicro currently offers the following complete solutions, which have been thoroughly tested and are ready to run. These clusters can be scaled up or down to meet the needs of your Deep Learning projects.


	14U Rack	24U Rack
Product SKU	SRS-14UGPU-AI-A100-01	SRS-24UGPU-AI-A100-01
Compute Capability	312TF/5PF/10PF/20POPS	624TF/10PF/20PF/40POPS
Compute Node	2x SYS-420GP-TNAR	4x SYS-420GP-TNAR
Total GPUs	16x NVIDIA AMPERE A100 SXM4 80GB	32x NVIDIA AMPERE A100 SXM4 80GB
Total GPU Memory	Up to 1,280GB	Up to 2,560GB
Total CPU	4x Intel® Xeon® Gold 6354, 3.00GHz, 18-cores	8x Intel® Xeon® Gold 6354, 3.00GHz, 18-cores
Total System Memory	4TB DDR4-3200MHz ECC	8TB DDR4-3200MHz ECC
Networking	InfiniBand EDR 200Gbps; 10GBASE-T Ethernet	InfiniBand EDR 200Gbps; 10GBASE-T Ethernet
Total Storage*	15.2TB (8 NVME SSDs)	30.4TB (16 NVME SSDs)
Operating System	Ubuntu Linux OS or CentOS Linux	Ubuntu Linux OS or CentOS Linux
Software	Caffe, Caffe2, Digits, Inference Server, PyTorch, NVIDIA® CUDA®, NVIDIA® TensorRT™, Microsoft Cognitive Toolkit (CNTK), MXNet, TensorFlow, Theano, and Torch	Caffe, Caffe2, Digits, Inference Server, PyTorch, NVIDIA® CUDA®, NVIDIA® TensorRT™, Microsoft Cognitive Toolkit (CNTK), MXNet, TensorFlow, Theano, and Torch
Max Power Usage	8.8kW (8,800W)	16.0kW (16,000W)
Dimensions	14 Rack Units, 600 x 800 x 1000 (mm, W x H x D)	24 Rack Units, 598 x 1163 x 1000 (mm, W x H x D)
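The GPU, memory, and storage totals in the table scale linearly with the number of compute nodes. The sketch below reproduces them from per-node figures. Those per-node values (8 GPUs and 4 NVMe SSDs of 1.9TB each) are assumptions derived by dividing the table's totals by the node count; they are not stated directly on this page, and the `RackConfig` class is purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RackConfig:
    """Per-node figures are assumptions derived from the table's totals."""
    name: str
    nodes: int
    gpus_per_node: int = 8        # 16 GPUs / 2 nodes
    gpu_memory_gb: int = 80       # A100 SXM4 80GB
    ssds_per_node: int = 4        # 8 SSDs / 2 nodes
    ssd_capacity_tb: float = 1.9  # 15.2TB / 8 SSDs

    @property
    def total_gpus(self) -> int:
        return self.nodes * self.gpus_per_node

    @property
    def total_gpu_memory_gb(self) -> int:
        return self.total_gpus * self.gpu_memory_gb

    @property
    def total_ssds(self) -> int:
        return self.nodes * self.ssds_per_node

    @property
    def total_storage_tb(self) -> float:
        return round(self.total_ssds * self.ssd_capacity_tb, 1)


rack_14u = RackConfig("14U Rack", nodes=2)
rack_24u = RackConfig("24U Rack", nodes=4)

for rack in (rack_14u, rack_24u):
    print(
        f"{rack.name}: {rack.total_gpus} GPUs, "
        f"{rack.total_gpu_memory_gb:,}GB GPU memory, "
        f"{rack.total_storage_tb}TB on {rack.total_ssds} NVMe SSDs"
    )
```

Doubling the node count from the 14U to the 24U configuration doubles every computed total, matching the two table columns.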

AI & Deep Learning-Ready Server Platforms from Supermicro

SYS-420GP-TNAR

HPC, Artificial Intelligence, Big Data Analytics, Research Lab, Astrophysics, Business Intelligence
Dual Socket P+ (LGA-4189) 3rd Gen Intel® Xeon® Scalable Processors
32 DIMM Slots; Up to 8TB DRAM 3200/2933/2666 ECC DDR4 LRDIMM/RDIMM
Supports Intel® Optane™ Persistent Memory 200 series
10 PCI-E Gen 4.0 X16 LP Slots AIOM/OCP 3.0 Support
2 M.2 NVMe and SATA for boot drive only; 6x 2.5″ Hot-swap NVMe/SATA/SAS drive bays
4 heavy duty fans with optimal fan speed control
4x 2200W redundant Titanium level power supplies

SYS-420GP-TNAR+

HPC, Artificial Intelligence, Big Data Analytics, Research Lab, Astrophysics, Business Intelligence
Dual Socket P+ (LGA-4189) 3rd Gen Intel® Xeon® Scalable Processors
32 DIMM Slots; Up to 8TB DRAM 3200/2933/2666 ECC DDR4 LRDIMM/RDIMM
Supports Intel® Optane™ Persistent Memory 200 series
10 PCI-E Gen 4.0 X16 LP Slots AIOM/OCP 3.0 Support
2 M.2 NVMe and SATA for boot drive only; 6x 2.5″ Hot-swap NVMe/SATA/SAS drive bays
4 heavy duty fans with optimal fan speed control
4x 3000W redundant Titanium level power supplies

Server 4124GO-NART

AI/Deep Learning, High Performance Computing
Dual AMD EPYC™ 7003/7002 Series Processors
8TB Registered ECC DDR4 3200MHz SDRAM in 32 DIMMs
8 PCI-E 4.0 x16 via PCI-E switch; 1 PCI-E 4.0 x16 LP and 1 PCI-E 4.0 x8 LP via CPUs; AIOM Support
6 Hot-swap U.2 NVMe 2.5″ drive bays (4 via PCI-E switch, 2 via CPU)
4 Hot-swap heavy-duty cooling fans
2200W (3+1) Redundant Platinum Level Power Supplies

Server 4124GO-NART+

AI/Deep Learning, High Performance Computing
Dual AMD EPYC™ 7003/7002 Series Processors
8TB Registered ECC DDR4 3200MHz SDRAM in 32 DIMMs
8 PCI-E 4.0 x16 via PCI-E switch; 1 PCI-E 4.0 x16 LP and 1 PCI-E 4.0 x8 LP via CPUs; AIOM Support
6 Hot-swap U.2 NVMe 2.5″ drive bays (4 via PCI-E switch, 2 via CPU)
4 Hot-swap heavy-duty cooling fans
3000W (2+2) Redundant Titanium Level (96%+) Power Supplies

Server 2124GQ-NART

AI/Deep Learning, High Performance Computing, Cloud Computing, Research Laboratory/National Laboratory
Dual AMD EPYC™ 7003/7002 Series Processors
8TB Registered ECC DDR4 3200MHz SDRAM in 32 DIMMs
4 PCI-E Gen 4 x16 (LP), 1 PCI-E Gen 4 x8 (LP)
4 Hot-swap 2.5″ drive bays (SAS/SATA/NVMe Hybrid)
On board BMC supports integrated IPMI 2.0 + KVM with dedicated 10G LAN
2x 2200W Platinum Level power supplies with Smart Power Redundancy

Server 2124GQ-NART+

AI/Deep Learning, High Performance Computing, Cloud Computing, Research Laboratory/National Laboratory
Dual AMD EPYC™ 7003/7002 Series Processors
8TB Registered ECC DDR4 3200MHz SDRAM in 32 DIMMs
4 PCI-E Gen 4 x16 (LP), 1 PCI-E Gen 4 x8 (LP)
4 Hot-swap 2.5″ drive bays (SAS/SATA/NVMe Hybrid)
On board BMC supports integrated IPMI 2.0 + KVM with dedicated 10G LAN
3000W Redundant Titanium Level (96%+) Power Supplies
