Commit

update readme
nlpzhezhao committed Feb 14, 2024
1 parent 6feab42 commit 86531a8
Showing 2 changed files with 5 additions and 7 deletions.
7 changes: 3 additions & 4 deletions README.md
@@ -44,7 +44,6 @@ TencentPretrain has the following features:
* argparse
* packaging
* regex
-* For the mixed precision training you will need apex from NVIDIA
* For the pre-trained model conversion (related with TensorFlow) you will need TensorFlow
* For the tokenization with sentencepiece model you will need [SentencePiece](https://github.com/google/sentencepiece)
* For developing a stacking model you will need LightGBM and [BayesianOptimization](https://github.com/fmfn/BayesianOptimization)
@@ -135,7 +134,7 @@ The above content provides basic ways of using TencentPretrain to pre-process, p
<br/>

## Pre-training data
-This section provides links to a range of :arrow_right: [__pre-training data__](https://github.com/Tencent/TencentPretrain/wiki/Pretraining-data) :arrow_left: .
+This section provides links to a range of :arrow_right: [__pre-training data__](https://github.com/Tencent/TencentPretrain/wiki/Pretraining-data) :arrow_left: . TencentPretrain can load these pre-training data directly.

<br/>

@@ -145,7 +144,7 @@ This section provides links to a range of :arrow_right: [__downstream datasets__
<br/>

## Modelzoo
-With the help of TencentPretrain, we pre-trained models of different properties (e.g. models based on different modalities, encoders, and targets). Detailed introduction of pre-trained models and their download links can be found in :arrow_right: [__modelzoo__](https://github.com/Tencent/TencentPretrain/wiki/Modelzoo) :arrow_left: . All pre-trained models can be loaded by TencentPretrain directly. More pre-trained models will be released in the future.
+With the help of TencentPretrain, we pre-trained models of different properties (e.g. models based on different modalities, encoders, and targets). Detailed introduction of pre-trained models and their download links can be found in :arrow_right: [__modelzoo__](https://github.com/Tencent/TencentPretrain/wiki/Modelzoo) :arrow_left: . All pre-trained models can be loaded by TencentPretrain directly.

<br/>

@@ -183,7 +182,7 @@ TencentPretrain/
```

-The code is well-organized. Users can use and extend upon it with little efforts.
+The code is organized based on components (e.g. embeddings, encoders). Users can use and extend upon it with little efforts.

Comprehensive examples of using TencentPretrain can be found in :arrow_right: [__instructions__](https://github.com/Tencent/TencentPretrain/wiki/Instructions) :arrow_left: , which help users quickly implement pre-training models such as BERT, GPT-2, ELMo, T5, CLIP and fine-tune pre-trained models on a range of downstream tasks.

5 changes: 2 additions & 3 deletions README_ZH.md
@@ -41,7 +41,6 @@ TencentPretrain has the following advantages:
* argparse
* packaging
* regex
-* For mixed precision training you will need apex from NVIDIA
* For converting TensorFlow models you will need TensorFlow
* For tokenization with a sentencepiece model you will need [SentencePiece](https://github.com/google/sentencepiece)
* For model ensembling with stacking you will need LightGBM and [BayesianOptimization](https://github.com/fmfn/BayesianOptimization)
@@ -132,7 +131,7 @@ python3 inference/run_classifier_infer.py --load_model_path models/finetuned_mod
<br>

## Pre-training data
-We provide links to a range of open-source :arrow_right: [__pre-training data__](https://github.com/Tencent/TencentPretrain/wiki/预训练数据) :arrow_left:
+We provide links to a range of open-source :arrow_right: [__pre-training data__](https://github.com/Tencent/TencentPretrain/wiki/预训练数据) :arrow_left: . TencentPretrain can load this pre-training data directly.

<br>

@@ -142,7 +141,7 @@ python3 inference/run_classifier_infer.py --load_model_path models/finetuned_mod
<br>

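## Modelzoo
-With the help of TencentPretrain, we pre-train models of different properties (e.g. based on different modalities, encoders, and targets). Users can find pre-trained models of various properties, along with their descriptions and download links, in the :arrow_right: [__modelzoo__](https://github.com/Tencent/TencentPretrain/wiki/预训练模型仓库) :arrow_left: . All pre-trained models can be loaded by TencentPretrain directly. More pre-trained models will be released in the future.
+With the help of TencentPretrain, we pre-train models of different properties (e.g. based on different modalities, encoders, and targets). Users can find pre-trained models of various properties, along with their descriptions and download links, in the :arrow_right: [__modelzoo__](https://github.com/Tencent/TencentPretrain/wiki/预训练模型仓库) :arrow_left: . All pre-trained models can be loaded by TencentPretrain directly.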

<br>
