From bfe41cfd76a88c5db35e50f9dbb9764638522825 Mon Sep 17 00:00:00 2001
From: "Toshinori Sato (@overlast)"
Date: Thu, 26 Nov 2015 15:52:34 +0900
Subject: [PATCH] Aperiodic data update on 2015-11-26(Thu)

---
 README.ja.md | 2 +-
 README.md    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.ja.md b/README.ja.md
index 50ce32fd..b2544230 100644
--- a/README.ja.md
+++ b/README.ja.md
@@ -12,7 +12,7 @@ Web上の文書の解析をする際には、この辞書と標準のシステ
 
 ## 特徴
 ### 利点
-- MeCab の標準のシステム辞書では正しく分割できない固有表現などの語の表層(表記)とフリガナの組を約203万組(重複エントリを含む)採録しています
+- MeCab の標準のシステム辞書では正しく分割できない固有表現などの語の表層(表記)とフリガナの組を約201.5万組(重複エントリを含む)採録しています
 - この辞書の更新は開発サーバ上で自動的におこなわれます
 - 毎月月初と中旬に更新する予定です
 - Web上の言語資源を活用しているので、更新時に新しい固有表現を採録できます
diff --git a/README.md b/README.md
index 719673ce..57a21373 100644
--- a/README.md
+++ b/README.md
@@ -19,7 +19,7 @@ When you analyze the Web documents, it's better to use this system dictionary an
 
 ## Pros and Cons
 ### Pros
-- Recorded about 2.03 million pairs(including duplicate entries) of surface/furigana(kana indicating the pronunciation of kanji) of the words such as the named entity that can not be tokenized correctly using default system dictionary of MeCab.
+- Recorded about 2.015 million pairs(including duplicate entries) of surface/furigana(kana indicating the pronunciation of kanji) of the words such as the named entity that can not be tokenized correctly using default system dictionary of MeCab.
 - Update process of this dictionary will automatically run on development server.
 - I'm planning to renew this dictionary in monthly beginning of the month and middle of the month.
 - When renewing by utilizing the language resources on Web, a new named entity can be recorded.