Following the Trend: Reading Papers

Other 2020-02-12 15:50:31 Views: 0

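Contents

Issue 1: DeepRL Daily Paper Digest
Issue 2: DeepRL Daily Paper Digest
Issue 3: DeepRL Daily Paper Digest
Issue 4: DeepRL Daily Paper Digest
Issue 5: DeepRL Daily Paper Digest
Issue 6: DeepRL Daily Paper Digest
Issue 7: DeepRL Daily Paper Digest
Issue 8: DeepRL Daily Paper Digest
Issue 9: DeepRL Daily Paper Digest
Issue 10: DeepRL Daily Paper Digest
Issue 11: DeepRL Daily Paper Digest
Understanding the Hindsight Experience Replay Paper in Depth [Pending]
DQN Series
Issue 12: DeepRL Daily Paper Digest
Issue 13: DeepRL Daily Paper Digest
Issue 14: DeepRL Daily Paper Digest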

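I plan to read papers on a schedule and build up knowledge over time, instead of scrambling through a few papers in a panic only when a task lands, as I used to.
Conveniently, a WeChat official account I follow has been posting for the past couple of days in a row, and it looks like a series: 《DeepRL每日论文快报》 (DeepRL Daily Paper Digest), 20191104.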

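Issue 1: DeepRL Daily Paper Digest

【Issue 1: DeepRL Daily Paper Digest】
Featured in this issue:
【20191104_Automatic Testing and Falsification with Dynamically Constrained Reinforcement Learning】
In the limited time I had, nothing here caught my interest. OK, fine, I did not understand it (ಡωಡ) hiahiahia. Skipping it.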

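Issue 2: DeepRL Daily Paper Digest

【Issue 2: DeepRL Daily Paper Digest】
Featured in this issue:
【20191105_Learning Fairness in Multi-Agent Systems】
Fairness, multi-agent, hierarchical. I am seeing more and more hierarchical approaches, though the applications they actually land in are of fairly low complexity.

【20191105_VASE: Variational Assorted Surprise Exploration for Reinforcement Learning】
RL & surprise: Variational Assorted Surprise Exploration (VASE)

【20191104_Baidu PARL Wins the NeurIPS Bionic-Human Challenge Again: Smooth Walking Controlled by Reinforcement Learning】. The RL framework in the article, 【PARL】, looks quite good, though it has both pros and cons. Presumably using PaddlePaddle means giving up TF? Worth trying in the future, but in the short term the switching cost is probably still too high.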

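Issue 3: DeepRL Daily Paper Digest

【Issue 3: DeepRL Daily Paper Digest】
Featured in this issue:
【20191106_Feedback Linearization for Unknown Systems via Reinforcement Learning】
Feedback linearization control & RL. I do not really understand it.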

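Issue 4: DeepRL Daily Paper Digest

【Issue 4: DeepRL Daily Paper Digest】
Featured in this issue:
【20191106_Dynamic Cloth Manipulation with Deep Reinforcement Learning】
Very interesting: the control objective covers not only the trajectory but also requirements on time response and execution speed.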

Issue 5: DeepRL Daily Paper Digest

【Issue 5: DeepRL Daily Paper Digest】
Featured in this issue:
【20191107_Gradient-based Adaptive Markov Chain Monte Carlo】 by DeepMind
‘introduce a gradient-based learning method to automatically adapt Markov chain Monte Carlo (MCMC) proposal distributions to intractable targets’

Issue 6: DeepRL Daily Paper Digest

【Issue 6: DeepRL Daily Paper Digest】
Featured in this issue:
【20191108_Gym-Ignition: Reproducible Robotic Simulations for Reinforcement Learning】
RL + Gazebo, extremely valuable.
【20191108_DeepRacer: Educational Autonomous Racing Platform for Experimentation with Sim2Real Reinforcement Learning】
Transfers from the simulated environment to the real one; close to perfect.

Issue 7: DeepRL Daily Paper Digest

【Issue 7: DeepRL Daily Paper Digest】
Featured in this issue:
【20191115_Mapless Navigation among Dynamics with Social-safety-awareness: a reinforcement learning approach from 2D laser scans】
From 2D simulation, to 3D Gazebo simulation, to the real environment. A good paper.

Issue 8: DeepRL Daily Paper Digest

【Issue 8: DeepRL Daily Paper Digest】
Featured in this issue:
【20191118_Real-Time Reinforcement Learning】
Able to solve different tasks.

Issue 9: DeepRL Daily Paper Digest

【Issue 9: DeepRL Daily Paper Digest】
Featured in this issue:
【20191207_Accelerating Training in Pommerman with Imitation and Reinforcement Learning】
The Pommerman simulation was recently developed to mimic the classic Japanese game Bomberman. Keywords: PPO, complex multi-agent competitive environment, very sparse and delayed rewards.

Issue 10: DeepRL Daily Paper Digest

【Issue 10: DeepRL Daily Paper Digest】
Featured in this issue:
【20191214_Human-Robot Collaboration via Deep Reinforcement Learning of Real-World Interactions】
Joint human-robot learning. SAC: An In-Depth Look at the Soft Actor-Critic Algorithm

Issue 11: DeepRL Daily Paper Digest

【Issue 11: DeepRL Daily Paper Digest】
Featured in this issue:
【20191219_Dota 2 with Large Scale Deep Reinforcement Learning】
The long-awaited Dota 2 write-up has finally been published.

Understanding the Hindsight Experience Replay Paper in Depth [Pending]

【20191018_Understanding the Hindsight Experience Replay Paper in Depth】

DQN Series

【20191224_DQN Series (1): Double Q-learning】
【20191226_DQN Series (2): Double DQN Algorithm Principles and Implementation】
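
As a personal reminder of what these two posts are about, here is a minimal sketch of the Double DQN target next to the plain DQN target. It is not taken from the linked posts: the function names `dqn_target` and `double_dqn_target`, the list-based Q-value inputs, and the default `gamma` are all assumptions made for illustration. The idea shown is the standard one: the online network picks the next action, and the target network evaluates it, which tends to reduce the overestimation caused by taking a max over noisy estimates.

```python
def dqn_target(reward, done, q_target_next, gamma=0.99):
    """Plain DQN target: the target network both selects and evaluates."""
    if done:
        # No bootstrapping from a terminal state.
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double DQN target: online net selects the action, target net evaluates it."""
    if done:
        return reward
    # Action selection uses the online network's Q-values for the next state.
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # Action evaluation uses the target network's Q-value for that action.
    return reward + gamma * q_target_next[best_action]


if __name__ == "__main__":
    # Hypothetical Q-values for a next state with three actions.
    q_online = [1.0, 3.0, 2.0]   # online net prefers action 1
    q_target = [5.0, 0.5, 4.0]   # target net overrates action 0
    print(dqn_target(1.0, False, q_target, gamma=0.9))
    print(double_dqn_target(1.0, False, q_online, q_target, gamma=0.9))
```

In this made-up example the plain target bootstraps from the target network's largest value, while the Double DQN target bootstraps from the target network's value for the action the online network chose, so the two can differ substantially.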

Issue 12: DeepRL Daily Paper Digest

【Issue 12: DeepRL Daily Paper Digest】
Featured in this issue:
【20200110_Predictive Coding for Boosting Deep Reinforcement Learning with Sparse Rewards】
propose an effective reward shaping method through predictive coding to tackle sparse reward problems
【20200110_Interestingness Elements for Explainable Reinforcement Learning: Understanding Agents’ Capabilities and Limitations】
propose an explainable reinforcement learning (XRL) framework that analyzes an agent’s history of interaction with the environment to extract interestingness elements that help explain its behavior

Issue 13: DeepRL Daily Paper Digest

【Issue 13: DeepRL Daily Paper Digest】
Featured in this issue:
【20200121_Continuous-Discrete Reinforcement Learning for Hybrid Control in Robotics】
Controlling a robotic arm.
【20200121_MushroomRL: Simplifying Reinforcement Learning Research】
MushroomRL: an interesting Python library.

Issue 14: DeepRL Daily Paper Digest

【Issue 14: DeepRL Daily Paper Digest】
Featured in this issue:
【20200210_On Simple Reactive Neural Networks for Behaviour-Based Reinforcement Learning】
Our findings suggest that robotic learning can be more effective if each behaviour is learnt in isolation and then combined them to accomplish the task
【20200210_Local Policy Optimization for Trajectory-Centric Reinforcement Learning】

方小汪


Reposted from blog.csdn.net/weixin_42828571/article/details/102907406
