Putting bandits into context (pre-print)

A pre-print of our contextual multi-armed bandit paper is now available on bioRxiv. We model participants' behavior in contextual multi-armed bandit tasks, that is, tasks that require both function learning and decision making, using a combination of Gaussian process regression and a diverse set of acquisition functions. We find that participants' learning generalizes across contexts but remains very local, and that they seem to trade off directly between expected rewards and uncertainty.
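As a rough illustration of the kind of model described above (not the paper's actual implementation, and with all kernel and parameter choices assumed here for simplicity), one can combine a Gaussian process regression posterior with an upper-confidence-bound acquisition function, which trades off expected reward against uncertainty when choosing an arm:

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    # Squared-exponential (RBF) kernel over 1-D inputs
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=0.1):
    # Standard GP regression: posterior mean and variance at test points
    K = rbf_kernel(x_train, x_train) + noise ** 2 * np.eye(len(x_train))
    Ks = rbf_kernel(x_train, x_test)
    Kss = rbf_kernel(x_test, x_test)
    mean = Ks.T @ np.linalg.solve(K, y_train)
    var = np.diag(Kss - Ks.T @ np.linalg.solve(K, Ks))
    return mean, var

def ucb_choice(mean, var, beta=1.0):
    # Upper confidence bound: pick the option whose predicted reward
    # plus beta-weighted uncertainty is highest
    return int(np.argmax(mean + beta * np.sqrt(np.maximum(var, 0.0))))

# Toy example: three observed rewards, four candidate arms
x_train = np.array([0.0, 1.0, 2.0])
y_train = np.array([0.0, 1.0, 0.0])
x_test = np.array([0.0, 1.0, 2.0, 3.0])
mean, var = gp_posterior(x_train, y_train, x_test)
arm = ucb_choice(mean, var)
```

Here `beta` controls the exploration bonus: a larger value favors arms whose outcomes are more uncertain.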

