Release-1.2.0
Angel 1.2.0 brings a substantial set of optimizations and improvements, adds two new algorithms, and fixes multiple bugs. We recommend that all users upgrade to this version in preparation for the upcoming 1.3.0 release.
Angel Core
- Added support for sparse double Vector/Matrix with long-type keys
- Optimized sparse vector performance: added sparse Vector/Matrix types that support parallel operations
- Optimized PS RPC performance: improved the network model and separated IO operations from RPC request handling
- Extended the MatrixOpLog types: added Sparse Double, Dense Int, and Sparse Float
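To illustrate why long-type keys matter for sparse vectors: feature indices above the int range (roughly 2.1 billion) cannot be addressed by an int-indexed vector. The following is only a minimal sketch built on a plain `HashMap`; the class name `LongKeySparseVector` and all of its methods are hypothetical and are not Angel's actual classes or storage layout.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a sparse double vector with long keys.
// This is NOT Angel's implementation; it only shows the idea.
final class LongKeySparseVector {
    private final long dim;                                   // logical dimension, may exceed Integer.MAX_VALUE
    private final Map<Long, Double> values = new HashMap<>(); // only non-zero entries are stored

    LongKeySparseVector(long dim) {
        this.dim = dim;
    }

    void set(long index, double value) {
        checkIndex(index);
        if (value == 0.0) {
            values.remove(index); // keep the representation sparse
        } else {
            values.put(index, value);
        }
    }

    double get(long index) {
        checkIndex(index);
        return values.getOrDefault(index, 0.0);
    }

    // Dot product that iterates over the vector with fewer stored entries.
    double dot(LongKeySparseVector other) {
        Map<Long, Double> small = values.size() <= other.values.size() ? values : other.values;
        Map<Long, Double> large = small == values ? other.values : values;
        double sum = 0.0;
        for (Map.Entry<Long, Double> e : small.entrySet()) {
            Double v = large.get(e.getKey());
            if (v != null) {
                sum += e.getValue() * v;
            }
        }
        return sum;
    }

    int nonZeroCount() {
        return values.size();
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= dim) {
            throw new IndexOutOfBoundsException("index " + index + " out of range [0, " + dim + ")");
        }
    }
}
```

In this sketch, a vector created with `new LongKeySparseVector(1L << 40)` accepts an index such as `5_000_000_000L`, which would overflow an int key, while memory use grows only with the number of non-zero entries.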
Angel MLlib
- Added the MLR algorithm
- Enhanced FM: PSF is now used for model initialization; computation performance is optimized; classification is supported
- Optimized GBDT: the search for the best split point is now implemented with PSF
- Upgraded LDA to LDA*, keeping the implementation consistent with the VLDB 2017 paper referenced in the README file, and optimized its performance
Spark on Angel
- Added the KMeans algorithm
- Refactored the interfaces; the PSVectorPool concept is now hidden
- Models now support Matrix
Documentation
- Improved psFunc and Core-API documentation
- Added documentation for the PSModel format conversion tool and for the use of global metrics
- Documentation internationalization is 90% complete
Interface Optimization
- PSModel: added the syncClock interface, recommended as a simpler replacement for calling clock().get()
- DataBlock: added the loopingRead interface, which allows data to be read repeatedly for training
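As a rough illustration of the two interface changes above, the sketch below uses small stand-in classes rather than Angel's real ones. `ModelStub` and `BlockStub` are hypothetical, and their internals are assumptions made for this example only. It shows `syncClock` as a blocking wrapper around the `clock().get()` pattern, and `loopingRead` as a read that rewinds to the start of the data instead of signalling that the data is exhausted.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

// Hypothetical stand-in for a PS-backed model; NOT Angel's PSModel.
final class ModelStub {
    private int clockValue = 0;

    // Asynchronous style: the caller has to wait on the returned Future.
    Future<Integer> clock() {
        clockValue++;
        return CompletableFuture.completedFuture(clockValue);
    }

    // Synchronous convenience: in this sketch it simply wraps clock().get()
    // and converts the checked exceptions into an unchecked one.
    int syncClock() {
        try {
            return clock().get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for clock", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("clock failed", e);
        }
    }
}

// Hypothetical stand-in for an in-memory data block; NOT Angel's DataBlock.
final class BlockStub<T> {
    private final List<T> items;
    private int readIndex = 0;

    BlockStub(List<T> items) {
        if (items.isEmpty()) {
            throw new IllegalArgumentException("block must not be empty");
        }
        this.items = items;
    }

    // Plain read: returns null once every item has been consumed.
    T read() {
        return readIndex < items.size() ? items.get(readIndex++) : null;
    }

    // Looping read: after the last item, rewinds and starts again from the first.
    T loopingRead() {
        if (readIndex >= items.size()) {
            readIndex = 0;
        }
        return items.get(readIndex++);
    }
}
```

With these stand-ins, a training loop that runs for several epochs can keep calling `loopingRead()` without resetting the block by hand, and can call `syncClock()` at the end of each iteration instead of handling the `Future` itself.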
~~~ Acknowledgement ~~~
Angel 1.2.0 once again received help from contributors around the world. We thank the following developers for their contributions to this release:
- hbghhy: implemented the KMeans algorithm on Spark on Angel
- hbghhy: added the MLR algorithm, which Alibaba uses for CTR prediction
- shunanzhang: sustained, high-quality translation of the documentation
- [SkyData] Augusto Yao: Fixed a number of bugs [112, 188]
- [Xiaomi] luosmart: Fixed a number of bugs [198, ...]
We are also deeply grateful to the many enthusiastic users in the Angel QQ group for their feedback and suggestions.