Questions:
Code study:
https://github.com/shawnwun/NNDIAL
https://github.com/MiuLab/DDQ
* The encoder modules contain:
- LSTM encoder : an LSTM network that encodes the user utterance.
- RNN+CNN tracker : a set of slot trackers that keep track of each slot/value pair across turns.
- DB operator : a discrete database accessing component.
* The decoder modules contain:
- Policy network : a decision-making module that produces the conditional vector for decoding.
- LSTM decoder : an LSTM network that generates the system response.
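To make the data flow between these modules concrete, here is a minimal sketch of one dialogue turn. It is not NNDIAL's code: every neural module is replaced by a simple rule-based stand-in, and the toy database, the slot names, the 0.5 threshold, and all function names are hypothetical and chosen only for illustration. What it tries to show is the order of the modules and what each one passes to the next.

```python
from typing import Dict, List, Tuple

# Hypothetical toy database and slot ontology (made up for this sketch).
DB: List[Dict[str, str]] = [
    {"name": "Golden House", "food": "chinese", "area": "centre"},
    {"name": "Pizza Hut", "food": "italian", "area": "south"},
    {"name": "Rice Boat", "food": "indian", "area": "west"},
]
SLOT_VALUES = {
    "food": ["chinese", "italian", "indian"],
    "area": ["centre", "south", "west"],
}
Belief = Dict[str, Dict[str, float]]  # slot -> value -> probability


def init_belief() -> Belief:
    """Start with no evidence for any slot value."""
    return {s: {v: 0.0 for v in vals} for s, vals in SLOT_VALUES.items()}


def encode_utterance(utterance: str) -> List[str]:
    """Stand-in for the LSTM encoder: lowercase tokens instead of a vector."""
    for ch in ",.?!":
        utterance = utterance.replace(ch, " ")
    return utterance.lower().split()


def track_slots(tokens: List[str], belief: Belief) -> Belief:
    """Stand-in for the RNN+CNN trackers: keyword matching per slot.
    Slots not mentioned in this turn keep their belief from earlier turns."""
    new_belief = {s: dict(dist) for s, dist in belief.items()}
    for slot, values in SLOT_VALUES.items():
        mentioned = [v for v in values if v in tokens]
        if mentioned:
            new_belief[slot] = {v: float(v == mentioned[-1]) for v in values}
    return new_belief


def db_operator(belief: Belief) -> List[int]:
    """Discrete DB access: build a query from each slot's most likely value
    and return a binary match vector over the DB entries."""
    query = {}
    for slot, dist in belief.items():
        value, prob = max(dist.items(), key=lambda kv: kv[1])
        if prob > 0.5:  # assumed threshold; below it the slot is unconstrained
            query[slot] = value
    return [int(all(e[s] == v for s, v in query.items())) for e in DB]


def policy(matches: List[int]) -> Dict[str, object]:
    """Stand-in for the policy network: a dict instead of a condition vector."""
    entity = DB[matches.index(1)] if 1 in matches else None
    return {"n_matches": sum(matches), "entity": entity}


def decode(condition: Dict[str, object]) -> str:
    """Stand-in for the LSTM decoder: template-based response generation."""
    entity = condition["entity"]
    if entity is None:
        return "Sorry, nothing matches."
    if condition["n_matches"] > 1:
        return f"I found {condition['n_matches']} places. Any other preference?"
    return f"{entity['name']} serves {entity['food']} food in the {entity['area']}."


def respond(utterance: str, belief: Belief) -> Tuple[str, Belief]:
    """One turn: encoder -> trackers -> DB operator -> policy -> decoder."""
    tokens = encode_utterance(utterance)
    belief = track_slots(tokens, belief)
    matches = db_operator(belief)
    return decode(policy(matches)), belief
```

Calling `respond` repeatedly with the returned belief carries the slot constraints from one turn to the next, which is the role the trackers play across turns in the description above.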
Architecture:
Basic form of the data:
DB:
Input:
Output:
Training process:
https://github.com/shawnwun/NNDIAL/blob/master/nn/nnsds.py
This file contains the concrete RL training steps and shows how the reward is defined.