Our paper about Contextual Multi-Armed Bandits got accepted to ICCM. Here is the pdf:
I also gave a talk about Contextual Multi-Armed Bandit tasks at City University. The slides can be found below.
I think the Contextual Multi-Armed Bandit (CMAB) is a really cool task because it combines decision making and function learning, and I really do believe that it can be seen as an almost quintessential decision-making task. At the moment, the model that best describes participants’ behaviour is a Gaussian Process Thompson Sampler. This suggests that participants approach the task in a (close-to-)rational way and probability match over time, choosing each option about as often as they believe it to be the best one. If this is true, it would mean that we are quite smart and well adapted to changing environments. In the near future, I plan to test more and less myopic models against each other.
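
For readers who have not seen a Gaussian Process Thompson Sampler before, here is a minimal sketch of the idea in Python. It is not the model code from the paper. The toy task (two arms whose payoffs are `+context` and `-context`), the squared-exponential kernel, and all parameter values are assumptions made purely for illustration. Each arm gets its own Gaussian Process over contexts; on every trial the sampler draws one value per arm from that arm's posterior at the current context and picks the arm with the largest draw.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between rows of a (n, d) and b (m, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_posterior(x_train, y_train, x_query, noise_var=0.1):
    """Posterior mean and variance of a zero-mean GP at one query context."""
    x_query = np.atleast_2d(np.asarray(x_query, dtype=float))
    prior_var = float(rbf_kernel(x_query, x_query)[0, 0])
    if len(y_train) == 0:
        # No observations for this arm yet: fall back to the prior.
        return 0.0, prior_var
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    k_train = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(y_train))
    k_cross = rbf_kernel(x_train, x_query)[:, 0]
    weights = np.linalg.solve(k_train, k_cross)
    mean = float(weights @ y_train)
    var = float(max(prior_var - weights @ k_cross, 1e-12))
    return mean, var


def thompson_choice(history, context, rng):
    """Sample once from each arm's posterior at `context`; pick the largest."""
    samples = []
    for contexts, rewards in history:
        mean, var = gp_posterior(contexts, rewards, context)
        samples.append(rng.normal(mean, np.sqrt(var)))
    return int(np.argmax(samples))


def simulate(n_trials=200, seed=0):
    """Toy two-armed task (hypothetical): arm 0 pays +context, arm 1 pays -context."""
    rng = np.random.default_rng(seed)
    history = [([], []), ([], [])]  # per arm: (contexts seen, rewards received)
    optimal = []
    for _ in range(n_trials):
        context = rng.uniform(-1.0, 1.0, size=1)
        arm = thompson_choice(history, context, rng)
        expected = context[0] if arm == 0 else -context[0]
        reward = expected + rng.normal(0.0, 0.1)
        history[arm][0].append(context)
        history[arm][1].append(reward)
        optimal.append(expected >= 0.0)  # did we pick the better arm?
    return np.array(optimal)
```

Calling `simulate()` returns one boolean per trial that marks whether the better arm was chosen, so averaging over a late window gives a rough learning curve. Because the choice is driven by a random draw from the posterior rather than by the posterior mean, an arm is picked about as often as the model believes it to be the best one, which is the probability-matching behaviour described above.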