High-level overview: 一文读懂注意力机制 - 知乎 (zhihu.com)
Self-attention revolves around keys, values, and queries; the self-attention mechanism has its own scheme for choosing them — all three are derived from the same input sequence.
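A minimal sketch of that selection, assuming one input sequence X of n tokens with model width d_model (the sizes and weight names here are made up for illustration):

import torch

n, d_model = 4, 8                      # hypothetical sizes: 4 tokens, width 8
X = torch.randn(n, d_model)            # the input sequence
W_q = torch.randn(d_model, d_model)    # learned projections (random here, trained in practice)
W_k = torch.randn(d_model, d_model)
W_v = torch.randn(d_model, d_model)
Q, K, V = X @ W_q, X @ W_k, X @ W_v    # queries, keys, and values all come from the same X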
Detailed understanding: see [1–5]
Why the Self-Attention Layer appeared
To solve the problem that architectures commonly used for sequential data, such as RNNs and LSTMs, cannot be parallelized and accelerated on a GPU.
Transformer — before covering this, we first need to cover multi-head attention. The Transformer architecture is actually the same as the Seq2Seq framework discussed earlier, except that the RNN inside Seq2Seq is replaced by Transformer blocks (just like the residual blocks in ResNet-50).
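As a rough sketch of such a block (the hyperparameters d_model=512, n_heads=8, d_ff=2048 follow the paper's base setting; the class name is mine, and it uses the paper's post-norm residual layout):

import torch.nn as nn

class TransformerBlock(nn.Module):
    # one encoder block: self-attention + FFN, each wrapped in residual + LayerNorm
    def __init__(self, d_model=512, n_heads=8, d_ff=2048):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):                            # x: (batch, seq, d_model)
        x = self.norm1(x + self.attn(x, x, x)[0])    # residual connection around attention
        return self.norm2(x + self.ffn(x))           # residual connection around the FFN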
Transformer论文逐段精读【论文精读】 - 哔哩哔哩 // good notes
Attention Is All You Need - 夏末的初雪的博客 - CSDN博客
Attention Is All You Need
Multi-head attention
Self-Attention
Layer Norm
Batch Norm
Every column is a feature; batch norm normalizes each column (each feature) across the batch — the blue rectangle in the figure. Layer norm instead normalizes within each sample, across its features, which is why it copes better with varying sample lengths.
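A small sketch of the difference in normalization axes (the (2, 5, 8) shape is arbitrary):

import torch

x = torch.randn(2, 5, 8)                             # (batch, sequence length, features)
# BatchNorm: statistics per feature, taken across the batch (the "column" / blue rectangle)
bn = torch.nn.BatchNorm1d(8)(x.transpose(1, 2)).transpose(1, 2)  # BatchNorm1d wants (N, C, L)
# LayerNorm: statistics per token, taken across its 8 features — independent of other samples
ln = torch.nn.LayerNorm(8)(x)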
Decoder
LayerNorm
Masked attention: the decoder generates from left to right, so each position is only allowed to attend to the positions before it.
The remaining sublayers work the same way as in the encoder.
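A sketch of that left-to-right (causal) mask applied to the attention scores, with made-up numbers:

import torch

n = 5
mask = torch.triu(torch.ones(n, n, dtype=torch.bool), diagonal=1)  # True above the diagonal = future
scores = torch.randn(n, n)                           # stand-in for the raw scores Q Kᵀ / √d_k
scores = scores.masked_fill(mask, float('-inf'))     # future positions get -inf ...
weights = scores.softmax(dim=-1)                     # ... so softmax assigns them zero weight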
Scaled Dot-Product Attention
The query matrix has n rows (one per query), each of dimension d_k.
Keys have dimension d_k as well, so queries and keys can be dotted together.
Really impressive: parallel computation saves time and improves efficiency.
Masking is optional (only needed on the decoder side).
Concretely speaking:
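The formula from the paper is Attention(Q, K, V) = softmax(Q Kᵀ / √d_k) V; a direct sketch of it (the function name is mine):

import math
import torch

def scaled_dot_product_attention(Q, K, V, mask=None):
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / math.sqrt(d_k)     # (n, m) score matrix, one row per query
    if mask is not None:
        scores = scores.masked_fill(mask, float('-inf'))  # the optional masking step
    return scores.softmax(dim=-1) @ V                     # weighted sum over the m value rows

The √d_k scaling keeps the dot products from growing with the dimension, which would otherwise push the softmax into regions with tiny gradients.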
FFN: the position-wise feed-forward network.
For semantic representation: it has weights to learn, giving better expressive power, and is applied identically at every position.
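In the paper this is FFN(x) = max(0, x·W₁ + b₁)·W₂ + b₂, with d_model=512 and inner width d_ff=2048; a sketch:

import torch.nn as nn

ffn = nn.Sequential(        # applied to each position independently and identically
    nn.Linear(512, 2048),   # expand: d_model -> d_ff
    nn.ReLU(),              # max(0, ·)
    nn.Linear(2048, 512),   # project back: d_ff -> d_model
)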
Positional embedding: attention itself has no notion of position, so we need positional encodings added to the input.
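A sketch of the paper's sinusoidal encoding, PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)) (the function name is mine; it assumes an even d_model):

import math
import torch

def positional_encoding(max_len, d_model):
    pos = torch.arange(max_len, dtype=torch.float).unsqueeze(1)   # (max_len, 1)
    div = torch.exp(torch.arange(0, d_model, 2, dtype=torch.float)
                    * (-math.log(10000.0) / d_model))             # 1 / 10000^(2i/d_model)
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(pos * div)   # even dimensions
    pe[:, 1::2] = torch.cos(pos * div)   # odd dimensions
    return pe                            # added elementwise to the input embeddings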
Better details for the rest:
Transformer论文逐段精读【论文精读】 - 哔哩哔哩, at §3.3 Position-wise Feed-Forward Networks. Author: BeBraveBeCurious, https://www.bilibili.com/read/cv13759416 (source: bilibili)
Forgot to save the notes… will fill them in later.
References
[2] 64 注意力机制【动手学深度学习v2】_哔哩哔哩_bilibili
[3] 65 注意力分数【动手学深度学习v2】_哔哩哔哩_bilibili
[4] 66 使用注意力机制的seq2seq【动手学深度学习v2】_哔哩哔哩_bilibili
[5] 67 自注意力【动手学深度学习v2】_哔哩哔哩_bilibili
[1] 一文读懂注意力机制 - 知乎 (zhihu.com) // summary article
Pytorch 图像处理中注意力机制的代码详解与应用(Bubbliiiing 深度学习 教程)_哔哩哔哩_bilibili