https://www.ncbi.nlm.nih.gov/pubmed/30370339
Model-based and model-free pain avoidance learning.
Author information
- 1. Department of Neural Computation for Decision-making, Advanced Telecommunications Research Institute International, Kyoto, Japan.
- 2. Department of Biology, Stanford University, Stanford, CA, USA.
- 3. Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea.
- 4. Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA.
- 5. Computational and Biological Learning Laboratory, Department of Engineering, University of Cambridge, Cambridge, UK.
- 6. Center for Information and Neural Networks, National Institute for Information and Communications Technology, Osaka, Japan.
Abstract
Background: While there is good evidence that reward learning is underpinned by two distinct decision control systems, a cognitive 'model-based' and a habit-based 'model-free' system, a comparable distinction for punishment avoidance has been much less clear.
Methods: We implemented a pain avoidance task that placed differential emphasis on putative model-based and model-free processing, mirroring a paradigm and modelling approach recently developed for reward-based decision-making. Subjects performed a two-step decision-making task with probabilistic pain outcomes of different quantities. The delivery of outcomes was sometimes contingent on a rule signalled at the beginning of each trial, emulating a form of outcome devaluation.
Results: The behavioural data showed that subjects tended to use a mixed strategy, favouring the simpler model-free learning strategy when outcomes did not depend on the rule, and favouring a model-based strategy when they did. Furthermore, the data were well described by a dynamic transition model between the two controllers. When compared with data from a reward-based task (albeit tested in the context of the scanner), we observed that avoidance involved a significantly greater tendency for subjects to switch between model-free and model-based systems in the face of changes in uncertainty.
Conclusion: Our study suggests a dual-system model of pain avoidance, similar to but possibly more dynamically flexible than reward-based decision-making.
KEYWORDS:
Decision-making; pain avoidance; reinforcement learning; uncertainty
- PMID: 30370339
- PMCID: PMC6187988
- DOI: 10.1177/2398212818772964
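The mixed strategy described in the Results, where choice reflects a weighted blend of the two controllers, can be sketched roughly as below. The action values, the softmax temperature `beta`, and the mixing weight `w` are all hypothetical illustrations, not the paper's fitted model; the paper additionally models dynamic transitions between the controllers, which this sketch omits.

```python
import numpy as np

# Hypothetical first-stage action values for a two-option choice.
# With pain outcomes, values are negative; less negative = less painful.
q_mf = np.array([0.2, -0.5])    # model-free values (e.g. from TD updates)
q_mb = np.array([-0.9, -0.1])   # model-based values (from a learned task model)

def choice_probs(q_mf, q_mb, w, beta=3.0):
    """Softmax over a weighted mix of model-based and model-free values.
    w = 1 -> purely model-based; w = 0 -> purely model-free."""
    q = w * q_mb + (1 - w) * q_mf
    e = np.exp(beta * (q - q.max()))   # subtract max for numerical stability
    return e / e.sum()

# The paper reports the balance shifting toward the model-based system
# when outcomes depend on the signalled rule; here we just vary w by hand.
print(choice_probs(q_mf, q_mb, w=0.0))  # model-free: prefers action 0
print(choice_probs(q_mf, q_mb, w=1.0))  # model-based: prefers action 1
```

With `w = 0` the agent follows its cached model-free values; with `w = 1` it follows the model-based evaluation, so the preferred action flips in this toy setting.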
Conventional reinforcement learning works by reinforcing the agent with rewards.
Here, instead of positive reinforcement, learning happens by avoiding pain, which acts as a negative reward,
and the finding is that this avoidance learning uses a mixed (dual) strategy.
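The idea that the same model-free update rule handles pain as a negative reward can be sketched as below. The pain probabilities, learning rate, and exploration rate are made up for illustration; this is only the model-free half of the dual-system story.

```python
import random

random.seed(1)

# Two actions; 'pain' (reward = -1) is delivered with different probabilities.
# These probabilities are hypothetical, not taken from the paper.
pain_prob = [0.2, 0.8]
q = [0.0, 0.0]          # cached model-free action values
alpha, epsilon = 0.1, 0.1

for _ in range(2000):
    # epsilon-greedy choice over the cached values
    if random.random() < epsilon:
        a = random.randrange(2)
    else:
        a = 0 if q[0] >= q[1] else 1
    r = -1.0 if random.random() < pain_prob[a] else 0.0
    q[a] += alpha * (r - q[a])   # same TD update used for reward learning

print(q)  # the more painful action ends up with the lower (more negative) value
```

Nothing in the update rule changes for avoidance: painful actions simply acquire negative values and are chosen less, which is exactly why the model-based/model-free distinction needed to be tested separately for punishment.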
In machine learning, methods like this keep being extended and applied in new ways.
People work the same way: if you like ice cream, you eat more, and the more you eat, the more you like it.
Vinegar is so sour that you stop drinking it straight, and soon the smell alone is enough to keep you away.
So many factors feed into human decision-making.
If we gradually develop algorithms for those factors,
we should be able to build decision-making tools that match, or even surpass, human ones...
2018 Model-based and model-free pain avoidance learning.pdf