NaturalSpeech2

Progress

  • Align datasets
  • Implement modules
  • Training
  • End-To-End Synthesizer
  • Add CE-RVQ loss
  • Subjective Evaluation
  • Objective Evaluation
  • Demo Page

Objective Evaluation

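| Prompt | WER | Speaker Cosine Similarity | Utterance-Level Pitch Mean MAE | Utterance-Level Pitch Std MAE | Utterance-Level Duration Diff |
| --- | --- | --- | --- | --- | --- |
| Ground Truth | 0.86 | - | - | - | - |
| 2 Seconds | | | | | |
| 4 Seconds | | | | | |
| 6 Seconds | | | | | |
| 8 Seconds | | | | | |
| 4 Seconds (PrefixPrompt) (avg utter duration) | | | | | |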
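
The metrics named in the table header can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not this repository's evaluation code: the function names `word_error_rate`, `cosine_similarity`, and `pitch_mae` are hypothetical, WER is assumed to be word-level edit distance divided by reference length, speaker embeddings are assumed to be plain float vectors from some speaker encoder, and pitch inputs are assumed to be per-utterance F0 sequences over voiced frames. The duration metric is left out because the README does not define it.

```python
import math
from statistics import mean, pstdev


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")
    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two speaker embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pitch_mae(ref_pitches, syn_pitches):
    """Utterance-level pitch errors, averaged over utterance pairs.

    Each argument is a list of F0 sequences (assumed: Hz, voiced frames only).
    Returns (MAE of per-utterance pitch mean, MAE of per-utterance pitch std).
    """
    pairs = list(zip(ref_pitches, syn_pitches))
    mean_err = [abs(mean(r) - mean(s)) for r, s in pairs]
    std_err = [abs(pstdev(r) - pstdev(s)) for r, s in pairs]
    return mean(mean_err), mean(std_err)
```

In this sketch each table row would be obtained by averaging these values over all test utterances synthesized with the given prompt length; the ASR model, speaker encoder, and pitch extractor are outside its scope.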