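- Tohoku Kiritan (Japanese: 東北きりたん) is a character published by SSS LLC (SSS合同会社) to support the Tohoku region. She is modeled on kiritanpo (Japanese: きりたんぽ), a local specialty of Kazuno City in northern Akita Prefecture. Voicebanks for her exist for UTAU, NEUTRINO and CeVIO AI (singing synthesis) as well as VOICEROID (speech synthesis).
- On February 22, 2020, the mysterious software engineer SHACHI released a demo song for the Tohoku Kiritan voicebank and, at the same time, publicly released NEUTRINO, a singing synthesis engine based on deep learning: hand her a score and she sings it fluently. [1]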
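Downloading NEUTRINO and the KIRITAN voicebank
Go to the NEUTRINO website and click Start Now.
This opens the official Google Drive folder. Download the following two files and extract them for later use: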
NEUTRINO-macOS-v1.0.0.zip
歌声ライブラリ(Singer Library)/東北きりたん(NEUTRINO-Library)-v1.0.0.zip
Importing KIRITAN and using it as the default voicebank
Move the extracted KIRITAN folder into NEUTRINO/model, then edit NEUTRINO/Run.sh and set ModelDir to KIRITAN:
# NEUTRINO
ModelDir=KIRITAN
StyleShift=0
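Before the first run it can help to confirm that the voicebank landed in the right place. The following is a minimal sketch, not part of NEUTRINO itself: `check_model` is a hypothetical helper, and the four file names are the ones a run reports loading from ./model/KIRITAN/. It assumes it is run from the NEUTRINO directory.

```shell
# Sketch: confirm a voicebank folder holds the four model files a run loads.
# check_model is a hypothetical helper name, not a NEUTRINO command.
check_model() {
  dir=$1
  status=0
  for f in stats_timing.bin model_timing.bin stats_acoustic.bin model_acoustic.bin; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      status=1
    fi
  done
  return $status
}

# Run from the NEUTRINO directory; skipped quietly when Run.sh is not here.
if [ -f Run.sh ]; then
  # Report missing files, then show which voicebank Run.sh currently selects.
  check_model model/KIRITAN && grep -n '^ModelDir=' Run.sh
fi
```

If a file is reported missing, the archive was most likely extracted one directory level too deep or too shallow.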
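Installing gcc
The binaries were compiled with gcc, while macOS ships only clang by default, so gcc has to be installed as well. The command below should install gcc 11 by default; if it installs a different version, explicitly install version 11 instead.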
brew install gcc
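Fixing the dynamic library error
The dynamic library paths recorded in NEUTRINO/bin/musicXMLtoLabel are wrong (they point into the build machine's Homebrew directory) and have to be fixed by hand: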
cd bin
# Original library paths
otool -L musicXMLtoLabel
musicXMLtoLabel:
/Users/user213944/.homebrew/opt/gcc/lib/gcc/11/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.29.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.100.3)
/Users/user213944/.homebrew/opt/gcc/lib/gcc/11/libgcc_s.1.dylib (compatibility version 1.0.0, current version 1.0.0)
# Rewrite the paths
install_name_tool -change /Users/user213944/.homebrew/opt/gcc/lib/gcc/11/libstdc++.6.dylib /usr/local/opt/gcc/lib/gcc/11/libstdc++.6.dylib musicXMLtoLabel
install_name_tool -change /Users/user213944/.homebrew/opt/gcc/lib/gcc/11/libgcc_s.1.dylib /usr/local/opt/gcc/lib/gcc/11/libgcc_s.1.dylib musicXMLtoLabel
# Check again
otool -L musicXMLtoLabel
bin/musicXMLtoLabel:
/usr/local/opt/gcc/lib/gcc/11/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.29.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1311.100.3)
/usr/local/opt/gcc/lib/gcc/11/libgcc_s.1.dylib (compatibility version 1.0.0, current version 1.0.0)
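The commands above hard-code /usr/local, which is Homebrew's prefix on Intel Macs; on Apple Silicon, `brew --prefix` reports /opt/homebrew instead. As a rough sketch under that assumption, the same rewrite can be derived from the local prefix rather than typed by hand. `remap_dylib` and `fix_binary` are hypothetical helper names, and the script assumes it is run from the NEUTRINO directory (not from bin).

```shell
# Sketch: repoint every stale Homebrew gcc path in a binary at the local prefix.
# remap_dylib and fix_binary are hypothetical helper names.

# Replace everything before "/opt/gcc/" with the given Homebrew prefix.
remap_dylib() {
  printf '%s/opt/gcc/%s\n' "$2" "${1#*/opt/gcc/}"
}

fix_binary() {
  bin=$1
  prefix=$(brew --prefix)   # /usr/local on Intel, /opt/homebrew on Apple Silicon
  # Library lines of `otool -L` start with the path; keep only the stale ones.
  otool -L "$bin" | awk '/\.homebrew\/opt\/gcc\// {print $1}' | while read -r old; do
    new=$(remap_dylib "$old" "$prefix")
    if [ -f "$new" ]; then
      install_name_tool -change "$old" "$new" "$bin"
    else
      # Typically means gcc is not installed, or is not version 11.
      echo "missing: $new" >&2
    fi
  done
}

# Only act when the macOS tools and the binary are actually present.
if command -v otool >/dev/null 2>&1 && command -v brew >/dev/null 2>&1 \
   && [ -f bin/musicXMLtoLabel ]; then
  fix_binary bin/musicXMLtoLabel
fi
```

Checking that the target library exists before rewriting avoids trading one broken path for another when the installed gcc is not version 11.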
Synthesis test
Run a test synthesis with sample1. In Run.sh, set BASENAME to sample1:
# Project settings
BASENAME=sample1
NumThreads=4
Run Run.sh and watch the output:
sistine:NEUTRINO rumia$ ./Run.sh
03:15 : start MusicXMLtoLabel
Convert MusicXML to label -> score/musicxml/sample1.musicxml
output full label -> score/label/full/sample1.lab
output mono label -> score/label/mono/sample1.lab
03:15 : start NEUTRINO
NEUTRINO - NEURAL SINGING SYNTHESIZER (Electron v1.0.0-Stable)
Linguistic feature (duration) : 1 [msec]
Load timing model : 241 [msec]
-> Load completed.
-> Tohoku Kiritan - NEUTRINO Singer Character Library (v1.0.0-Stable-Timing model)
Predict timing feature : 409 [msec]
Linguistic feature (acoustic) : 712 [msec]
Load acoustic model : 726 [msec]
-> Load completed.
-> Tohoku Kiritan - NEUTRINO Singer Character Library (v1.0.0-Stable-Acoustic model)
Predict acoustic features : 2051 [msec]
Finish : 66688 [msec]
Generation rate : 0.647793 [gen/sec]
-- File and Parameter information --
label length : 129 [line]
wav length : 43.2 [sec]
frame period : 5 [frame]
full_label : score/label/full/sample1.lab
timing_label : score/label/timing/sample1.lab
output f0 : ./output/sample1.f0
output mgc : ./output/sample1.mgc
output bap : ./output/sample1.bap
model directory : ./model/KIRITAN/
stat_timing : ./model/KIRITAN/stats_timing.bin
model timing : ./model/KIRITAN/model_timing.bin
stat_acoustic : ./model/KIRITAN/stats_acoustic.bin
model acoustic : ./model/KIRITAN/model_acoustic.bin
timing flag : 0
random flag : 0
acoustic flag : 0
number of threads : 4
style shift : 0
------------
04:21 : start WORLD
WORLD - NEUTRINO Edition (v1.0.0-Stable)
Load Acoustic features : 1 [msec]
Decode Acoustic features : 7 [msec]
Synthesis : 186 [msec]
Finish : 1518 [msec]
Generation rate : 28.4585 [gen/sec]
-- File and Parameter information --
wav Length : 43.2 [sec]
sampling rate : 48000 [Hz]
sampling bit : 16 [bit]
pitch shift : 1
formant shift : 1
number of parallel : 4 [thread]
hi-speed synthesis : 0
realtime synthesis : 0
smooth pitch : 0
smooth formant : 0
enhance breathiness : 0
-------------------
04:23.2N : start NSF
NSF_IO - Neural Source Filter (I/O) (v1.0.0-Stable)
Linguistic feature (duration) : 5 [msec]
Linguistic feature (acoustic) : 279 [msec]
Separate feature : 289 [msec]
Synthesis (NSF) : 344 [msec]
Error: Failed to run NSF. Please check log (NSF/NSF.log).
05:15 : END
The last step, NSF, failed. Substitute the variables by hand and try running that step again:
sistine:NEUTRINO rumia$ ./bin/NSF_IO score/label/full/sample1.lab score/label/timing/sample1.lab output/sample1.f0 output/sample1.mgc output/sample1.bap KIRITAN output/sample1_nsf.wav -t
NSF_IO - Neural Source Filter (I/O) (v1.0.0-Stable)
Linguistic feature (duration) : 0 [msec]
Linguistic feature (acoustic) : 239 [msec]
Separate feature : 251 [msec]
Synthesis (NSF) : 310 [msec]
Write wav : 160424 [msec]
Finish : 160666 [msec]
Generation rate : 0.268881 [gen/sec]
-- File and Parameter information --
label length : 129 [line]
wav length : 43.2 [sec]
frame period : 5 [frame]
full_label : score/label/full/sample1.lab
timing_label : score/label/timing/sample1.lab
input f0 : output/sample1.f0
input mgc : output/sample1.mgc
input bap : output/sample1.bap
nsf directory : NSF/
output f0 : NSF/output/f0/
output mgc : NSF/output/mgc/
output bap : NSF/output/bap/
output list : NSF/output/wav.list
output wav : output/sample1_nsf.wav
------------
Success. The final synthesis result has now been written to output/sample1_nsf.wav. It sounds very good, close to a human voice.
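The manual retry can be wrapped so that it works for any project name. This is only a sketch: `nsf_command` is a hypothetical helper, and the argument order is copied verbatim from the manual call above rather than from any NSF_IO documentation. The failed run pointed at NSF/NSF.log, so the sketch prints the end of that log first. It assumes it is run from the NEUTRINO directory and that the names contain no spaces.

```shell
# Sketch: rebuild the manual NSF_IO call for any project name.
# nsf_command is a hypothetical helper; the argument order mirrors the manual run.
nsf_command() {
  base=$1
  model=$2
  echo "./bin/NSF_IO score/label/full/$base.lab score/label/timing/$base.lab output/$base.f0 output/$base.mgc output/$base.bap $model output/${base}_nsf.wav -t"
}

# Run from the NEUTRINO directory; skipped when the binary is not present.
if [ -x bin/NSF_IO ]; then
  if [ -f NSF/NSF.log ]; then
    tail -n 20 NSF/NSF.log   # what the failed step logged
  fi
  sh -c "$(nsf_command sample1 KIRITAN)"
fi
```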
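Editing the score
You can edit a score in MuseScore and place it under score (the test run above read score/musicxml/sample1.musicxml). Then change BASENAME in Run.sh in the same way and run the script again.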
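The copy, edit and rerun cycle can be scripted. The sketch below makes several assumptions: the score was exported as an uncompressed .musicxml file, Run.sh reads scores from score/musicxml as in the test run, and the file name contains no characters that are special to sed (such as `/` or `&`). `score_basename` and `set_basename` are hypothetical helper names.

```shell
# Sketch: point Run.sh at a new score and rerun the pipeline.
# score_basename and set_basename are hypothetical helpers.

# Strip the directory and the .musicxml extension: path/to/song.musicxml -> song
score_basename() {
  basename "$1" .musicxml
}

# Rewrite the BASENAME= line of a Run.sh-style file.
# Writing back with cat keeps the file's executable bit, and avoids
# `sed -i`, whose syntax differs between BSD (macOS) and GNU sed.
set_basename() {
  script=$1
  name=$2
  sed "s/^BASENAME=.*/BASENAME=$name/" "$script" > "$script.tmp" \
    && cat "$script.tmp" > "$script" \
    && rm "$script.tmp"
}

# Usage, from the NEUTRINO directory: sh resynth.sh ~/Scores/song.musicxml
if [ -n "${1:-}" ] && [ -f Run.sh ]; then
  cp "$1" score/musicxml/
  set_basename Run.sh "$(score_basename "$1")"
  ./Run.sh
fi
```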