r1.15.5-deeprec2201
liutongxuan released this 11 Jan 04:36
This is the first release of DeepRec. DeepRec provides very large-scale distributed training, supporting model training on trillions of samples and embedding processing at the 100-billion scale. For sparse model scenarios, in-depth performance optimization has been carried out on both CPU and GPU platforms.
Major Features and Improvements
Embedding
- Embedding Variable (including feature eviction and feature filter)
- Dynamic Dimension Embedding Variable
- Adaptive Embedding
- Multi-Hash Variable
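These notes do not describe how the Multi-Hash Variable works. As a rough illustration of the general idea behind multi-hash embeddings, the sketch below uses the quotient-remainder trick: each feature id is split into a quotient and a remainder, each part indexes its own small table, and the two rows are combined. This is an assumption about the technique, not DeepRec's implementation; `QREmbedding` and its parameters are hypothetical names and not part of any DeepRec API.

```python
import random


class QREmbedding:
    """Hypothetical quotient-remainder embedding: two small tables replace one large one."""

    def __init__(self, num_ids, num_buckets, dim, seed=0):
        rng = random.Random(seed)
        self.num_buckets = num_buckets
        # Ceiling division so every id below num_ids maps to a valid quotient row.
        num_quotients = (num_ids + num_buckets - 1) // num_buckets
        self.q_table = [[rng.gauss(0.0, 0.1) for _ in range(dim)]
                        for _ in range(num_quotients)]
        self.r_table = [[rng.gauss(0.0, 0.1) for _ in range(dim)]
                        for _ in range(num_buckets)]

    def lookup(self, feature_id):
        # Each (quotient, remainder) pair is unique per id, so ids stay distinguishable.
        q_row = self.q_table[feature_id // self.num_buckets]
        r_row = self.r_table[feature_id % self.num_buckets]
        # Element-wise sum is one way to combine; product or concatenation also work.
        return [a + b for a, b in zip(q_row, r_row)]

    def num_rows(self):
        return len(self.q_table) + len(self.r_table)


emb = QREmbedding(num_ids=1_000_000, num_buckets=1_000, dim=8)
vec = emb.lookup(123_456)
print(len(vec), emb.num_rows())
```

With these illustrative sizes, one million ids are served by 2,000 stored rows instead of 1,000,000, at the cost of ids sharing parameters.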
Distributed Training
- GRPC++
- StarServer
Graph Optimization
- Auto Micro Batch
- Auto Graph Fusion
- Embedding Fusion
- Smart Stage
Runtime Optimization
- CPU Memory Optimization
- GPU Memory Optimization
- GPU Virtual Memory
Optimizer
- AdamAsync Optimizer
- AdagradDecay Optimizer
Op & Hardware Acceleration
- CPU/GPU optimization of dozens of ops, including Unique, Gather, DynamicStitch, BiasAdd, Select, Transpose, SparseSegmentReduction, Where, DynamicPartition, and SparseConcat.
- Support oneDNN 2.3.2 & bf16
- Support TF32
IO & Dataset
- WorkQueue
- KafkaDataset
For more details on these features, see https://deeprec.readthedocs.io/zh/latest/
Release Images
CPU Image
registry.cn-shanghai.aliyuncs.com/pai-dlc-share/deeprec-training:deeprec2201-cpu-py36-ubuntu18.04
GPU Image
registry.cn-shanghai.aliyuncs.com/pai-dlc-share/deeprec-training:deeprec2201-gpu-py36-cu110-ubuntu18.04