Paper Notes: Co-Attending Free-Form Regions and Detections (AAAI 2018)


Category: Other | Posted: 2018-12-11 00:52:59

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/pku_langzi/article/details/81174386

Co-Attending Free-Form Regions and Detections with Multi-Modal Multiplicative Feature Embedding for Visual Question Answering

Many current VQA methods use the question to locate salient regions in the image and derive the answer from those regions.

Attention mechanisms fall into two main branches: free-form region based and detection-based.

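Used alone, each branch has drawbacks. Free-form attention tends to split an object into many fine-grained patches, and patches such as a cat's body and a dog's body may look very similar, which can mislead the model into a wrong answer. Detection-based mechanisms first detect object regions, which helps with many questions about the foreground, but a question like "How is the weather today?" is harder for them, because there may be no bounding box for something like the sky.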

The paper therefore combines the two approaches so that they complement each other.

Method idea

[Figure: method overview, image not available]

Network structure

[Figure: network structure, image not available]

The free-form branch uses ResNet-152 to extract a 14×14×2048 feature map, which can be viewed as dividing the image into 196 regions.
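To make the "196 regions" reading concrete, here is a minimal NumPy sketch of flattening a 14×14×2048 feature map into one 2048-dimensional vector per grid cell. The array is filled with random values as a stand-in for real ResNet-152 features, and the variable names are illustrative assumptions rather than names from the paper's code.

```python
import numpy as np

# Stand-in for a ResNet-152 feature map; random values, not real features.
rng = np.random.default_rng(0)
feature_map = rng.standard_normal((14, 14, 2048)).astype(np.float32)

# Flatten the spatial grid: each of the 14 * 14 cells becomes one region vector.
regions = feature_map.reshape(-1, feature_map.shape[-1])
print(regions.shape)  # (196, 2048)

# With row-major flattening, region k corresponds to grid cell (k // 14, k % 14).
row, col = divmod(45, 14)
assert np.array_equal(regions[45], feature_map[row, col])
```

Each row of `regions` can then be scored by an attention module as one candidate image region.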


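The detection-based branch uses Faster R-CNN to extract features for 19 bounding boxes, giving a 19×4097 matrix (4096 dimensions are the image feature and 1 is the bounding box's detection score).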
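The 19×4097 shape can be illustrated with a short NumPy sketch that appends one detection score to each 4096-dimensional box feature. The values are random placeholders, and placing the score in the last column is an assumption made for illustration; the notes do not say where the score sits in the vector.

```python
import numpy as np

rng = np.random.default_rng(0)
num_boxes = 19

# Placeholder per-box visual features and detection scores (not real detector output).
box_features = rng.standard_normal((num_boxes, 4096)).astype(np.float32)
box_scores = rng.uniform(0.0, 1.0, size=(num_boxes, 1)).astype(np.float32)

# Assumed layout: 4096 feature dimensions followed by the score as the last column.
detection_features = np.concatenate([box_features, box_scores], axis=1)
print(detection_features.shape)  # (19, 4097)
```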

Attention details

[Figure: attention details, image not available]
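As a rough illustration of what question-guided attention with a multiplicative feature embedding can look like, the sketch below projects region features and a question vector into a shared space, multiplies them element-wise, and normalizes the resulting scores with a softmax. This is a generic sketch, not the paper's exact formulation: the function name, the `tanh` nonlinearity, the question dimension (1024), the shared dimension (512), and the random weights are all assumptions.

```python
import numpy as np

def multiplicative_attention(regions, question, w_v, w_q, w_a):
    """Attend over regions using an element-wise (multiplicative) joint embedding.

    regions:  (n, feat_dim) visual features, one row per region or box
    question: (q_dim,) question embedding
    w_v, w_q: projections into a shared d-dimensional space
    w_a:      (d,) vector that scores each joint embedding
    """
    v = np.tanh(regions @ w_v)       # (n, d) projected visual features
    q = np.tanh(question @ w_q)      # (d,)   projected question
    joint = v * q                    # (n, d) element-wise product, broadcast over regions
    scores = joint @ w_a             # (n,)   one score per region
    exp_scores = np.exp(scores - scores.max())   # numerically stable softmax
    weights = exp_scores / exp_scores.sum()
    attended = weights @ regions     # (feat_dim,) attention-weighted sum of regions
    return weights, attended

rng = np.random.default_rng(0)
n, feat_dim, q_dim, d = 196, 2048, 1024, 512   # q_dim and d are assumed sizes
regions = rng.standard_normal((n, feat_dim))
question = rng.standard_normal(q_dim)
w_v = rng.standard_normal((feat_dim, d)) * 0.01
w_q = rng.standard_normal((q_dim, d)) * 0.01
w_a = rng.standard_normal(d) * 0.01

weights, attended = multiplicative_attention(regions, question, w_v, w_q, w_a)
print(weights.shape, attended.shape)  # (196,) (2048,)
```

The same function could be applied to the 19 detection features by passing a (19, 4097) matrix and a matching `w_v`, which is one way to picture the two branches sharing a question-guided attention design.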

Datasets

VQA, COCO-QA

Results

[Figure: results, image not available]

Paper: Co-Attending Free-Form Regions and Detections with Multi-Modal Multiplicative Feature Embedding for Visual Question Answering

GitHub source: dual-mfa-vqa


Reposted from blog.csdn.net/pku_langzi/article/details/81174386
