Fixation patterns in simple choice as optimal use of cognitive resources

Welcome to my poster! This isn't a poster, you say? Well, I think using a big horizontal pdf to present your work in a virtual setting makes about as much sense as putting a screen door on a submarine. Free yourself from the shackles of imitating print media! Embrace the silver lining!

💡

Reading tips: Want a quick overview? Just read the sections with gray backgrounds. Want more details? Click on the triangles to the left of the text to expand hidden content.

Questions? Comments? Looking for a walkthrough? Email me! fredcallaway@princeton.edu

Read the preprint! https://psyarxiv.com/57v6k

Abstract

🏃

TL;DR (short version): We present a model of attention-modulated evidence accumulation for simple N-alternative choice, and approximate the optimal policy for allocating attention within the model. Comparing to human fixation patterns in existing binary and trinary choice datasets, the model accounts for many previously identified effects and we also confirm several novel predictions. Together, our results suggest that the evolving state of the decision process influences the fixation process in a way that is consistent with the optimal use of limited cognitive resources.

Long version (click the triangle to expand)
Even when choosing among a small set of alternatives, people don't perfectly evaluate every option. Instead, decisions seem to be made on the basis of sequentially accumulated, noisy evidence. Previous research has suggested that the accumulation process is modulated by visual attention such that evidence accumulates more rapidly for attended items. But what guides attention itself?
To answer this question, we formalize decision making as a Bayesian evidence accumulation process, similar to a drift diffusion model but with explicit representations of uncertainty. To capture attention, we assume that samples can only be collected for the fixated item. We additionally assume a cost for each sample as well as a cost for switching between items (making saccades). The problem of attention allocation is thus cast as a sequential decision problem in which a decision maker must continuously decide whether to select an item or keep sampling, and in the latter case, which item to sample from. The optimal attention allocation policy is the one that maximizes the expected value of the chosen item less the costs incurred by the decision making process.
We approximate the optimal fixation policy using tools from metareasoning in artificial intelligence. We find that fixations are drawn to items whose value estimates are uncertain and close to those of the competing item(s). Furthermore, we find that in the case of trinary—but not binary—choice, attention is preferentially directed to items with higher estimated value. The model thus provides a normative foundation for recently proposed models of both uncertainty-directed and value-directed attention. It additionally specifies a near-optimal tradeoff between these two factors, as well as a near-optimal stopping rule and fixation-termination rule.
Comparing model predictions to human behavior in two previously collected binary and trinary choice datasets, we find that the model accounts for many previously identified effects and we also confirm several novel predictions. Together, our results suggest that the evolving state of the decision process influences the fixation process in a way that is consistent with the optimal use of limited cognitive resources.

Background

**A common simple choice task** asks participants to make binary choices between snack items. By asking participants to rate the items in an earlier stage of the study and recording their eye fixations while they make their choice, we can see how value and attention jointly determine choice.

Choices and reaction times in simple value-based decision making tasks are well modeled by information sampling (or "evidence accumulation") models such as the Drift Diffusion Model and the Leaky Competing Accumulators model.

Manipulating participants' visual attention shows that people are more likely to choose items that they look at longer when the items are positively valenced; the effect reverses for aversive items.

This effect is captured by models like the attentional DDM, in which evidence accumulates more rapidly for the attended item. However, these models leave a critical question unanswered:

❓ How do people decide what to look at when making decisions ❓

To address this question, we frame attention allocation in simple choice as an active information sampling problem, and derive the optimal policy.

Information sampling model

🏃

TL;DR: At each time step, the decision maker receives a noisy sample of the true utility of the fixated item. She integrates these samples into posterior beliefs by Bayesian inference. Her attention-allocation policy determines both when to stop sampling and also which item to fixate at each step.

**Sampling and belief updating in a binary choice task.** The top row shows the experimental display, with the fixated item denoted by the eye symbol. The bottom two rows depict the first few steps of the sampling and belief updating process. The decision maker's beliefs about the value of each item are denoted by the Gaussian probability density curves. The true values of each item ( $u^{(L/R)}$ , dashed lines) are sampled from standard normal distributions; this is captured in the decision maker's initial belief state (first column). Every time step, $t$ , the decision maker fixates one of the items and receives a noisy sample about the true value of that item ( $x_t$ marks). She then updates her belief about the value of the fixated item using Bayesian updating (shift from light to dark curve). The beliefs for the unfixated item are not updated. The process repeats each time step until the decision maker terminates sampling, at which point she chooses the item with maximal posterior mean.

Bayesian update equations
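
The equations themselves are not reproduced on this page, so here is a sketch of the standard conjugate Gaussian update that matches the description above. The notation is assumed for this sketch: $f_t$ is the item fixated at time $t$, $\mu_t^{(i)}$ and $\lambda_t^{(i)}$ are the posterior mean and precision for item $i$, and $\sigma_x$ is the sampling noise. See the preprint for the exact formulation.

$$x_t \sim \mathcal{N}\big(u^{(f_t)},\, \sigma_x^2\big)$$

$$\lambda_{t+1}^{(i)} = \lambda_t^{(i)} + \sigma_x^{-2}, \qquad \mu_{t+1}^{(i)} = \frac{\lambda_t^{(i)} \mu_t^{(i)} + \sigma_x^{-2}\, x_t}{\lambda_{t+1}^{(i)}} \qquad \text{for } i = f_t$$

$$\lambda_{t+1}^{(i)} = \lambda_t^{(i)}, \qquad \mu_{t+1}^{(i)} = \mu_t^{(i)} \qquad \text{for } i \neq f_t$$

With the standard normal prior described above, the initial belief for every item is $\mu_0^{(i)} = 0$ and $\lambda_0^{(i)} = 1$.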

See Tajima et al. (2016) and (2019) for similar Bayesian formulations of evidence accumulation, but without attentional modulation.

Our model was directly inspired by a formalization of computational resource allocation for artificial intelligence systems.

Optimal policy

🏃

TL;DR: The (approximately) optimal attention policy fixates items whose value estimates are uncertain and close to the competing values. When there are more than two items, it also shows an asymmetric effect of value, attending mostly to the items with the top two estimates.

The optimal policy maximizes expected payoff, which is defined as the value of the chosen item minus the total cost of the decision process, where the cost incurred at each time step includes a fixed sampling cost as well as an additional switching/saccade cost if the fixated item differs from the one fixated at the previous time step.
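
One way to write the objective described above is sketched below. The symbols are assumptions made for this sketch, not necessarily the preprint's notation: $\pi$ is the attention policy, $T$ is the number of samples taken before stopping, $f_t$ is the item fixated at step $t$, $i^*$ is the chosen item, and $\gamma_{\text{sample}}$ and $\gamma_{\text{switch}}$ are the two cost parameters.

$$\pi^* = \arg\max_{\pi}\; \mathbb{E}\left[\, u^{(i^*)} - \sum_{t=1}^{T} \text{cost}(f_{t-1}, f_t) \;\middle|\; \pi \right]$$

$$\text{cost}(f_{t-1}, f_t) = \gamma_{\text{sample}} + \mathbb{1}[f_t \neq f_{t-1}]\; \gamma_{\text{switch}}$$

How the very first fixation is charged (there is no previous item to switch from) is not specified here and is left open in this sketch.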

It's intractable to compute the optimal policy for decisions between more than two items. 😫 Thus, we approximate it using a recently proposed method based on reinforcement learning and value of information features.

**Illustration of the optimal policy. The heatmaps show the probability of fixating on item 1 as a function of the precision of its value estimate and the mean of its relative value estimate.** We see that the optimal policy tends to fixate on items that are uncertain and have estimated values similar to the other items. In the case of trinary—but not binary—choice, we additionally see a stark asymmetry in the effect of relative estimated value. While the policy is likely to sample from an item whose value is substantially higher than the competitors, it is unlikely to sample from an item with value well below. In particular, the policy has a strong preference to sample from the items with best or second-best value estimates.

Two key features drive optimal attention allocation:

🤔 Uncertainty: how unsure am I about this item's value?

😍 Value (when N > 2): how valuable do I think this item is?

Illustration of how uncertainty and value affect the value of sampling an item.
Each panel shows a belief state for trinary choice. The curves show the probability density of each item's value and the areas of the shaded regions show the probability that the item's true value is better than the current leading value estimate (the gray item). This probability correlates strongly with the value of sampling the item because sampling is only valuable if it changes the choice. In each case, it is more valuable to sample the orange item than the purple item because either (top) its value is more uncertain, or (bottom) its value is closer to the leading value.
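
To make the shaded-area idea concrete, here is a minimal Python sketch that computes the probability that an item's true value exceeds the leading estimate under a Gaussian belief. The function names and the example numbers are hypothetical, chosen only to mimic the two panels; they are not taken from the figure.

```python
from math import erf, sqrt


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def prob_beats_leader(mean: float, precision: float, leader_mean: float) -> float:
    """P(true value > leader_mean) under the belief N(mean, 1 / precision)."""
    sd = 1.0 / sqrt(precision)
    return 1.0 - normal_cdf((leader_mean - mean) / sd)


# Hypothetical belief states; the leading estimate is 1.0 in both cases.
# "Top" case: same mean, but the orange item is more uncertain.
top_orange = prob_beats_leader(mean=0.0, precision=1.0, leader_mean=1.0)
top_purple = prob_beats_leader(mean=0.0, precision=4.0, leader_mean=1.0)

# "Bottom" case: same precision, but the orange item is closer to the leader.
bottom_orange = prob_beats_leader(mean=0.5, precision=4.0, leader_mean=1.0)
bottom_purple = prob_beats_leader(mean=0.0, precision=4.0, leader_mean=1.0)
```

In both hypothetical cases the orange item has the larger probability, which is the pattern the panels illustrate. Keep in mind that this probability is only a correlate of the value of sampling, not the value itself.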

Illustration of "value of information" features used to approximate the optimal policy.
Our approximation method comes from the field of rational metareasoning, which is concerned with the allocation of computational resources in AI systems. In our model, a "computation" corresponds to drawing a value sample and updating the belief accordingly. The plot below shows that we can bound the expected benefit of further computation by (relatively) easily computable expressions. Our method basically learns to predict the true value of sampling by interpolating between these extremes. See the preprint for a thorough discussion and derivations.

The solid line shows the average value of the item chosen after different numbers of computations selected by a near-optimal policy assuming no computational costs. The dashed lines show values for two of the VOI features in the initial belief state: VOI_myopic is the value after one computation and VOI_full is the asymptotic value after infinite computations.
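
For readers who want to see how such bounds can be computed, here is a rough Python sketch for independent Gaussian beliefs. It is an illustration under assumptions, not the implementation used for the poster: the function names are hypothetical, the myopic feature uses the closed-form expectation of a maximum, the full feature is estimated by Monte Carlo, and both are expressed as gains over the current best posterior mean rather than as raw values.

```python
import random
from math import erf, exp, pi, sqrt


def normal_pdf(z: float) -> float:
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)


def normal_cdf(z: float) -> float:
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def voi_myopic(means, precisions, item, sample_precision):
    """Expected gain in the best posterior mean from ONE more sample of `item`."""
    lam = precisions[item]
    # Standard deviation of the item's posterior mean after one more sample.
    s = sqrt(1.0 / lam - 1.0 / (lam + sample_precision))
    best_other = max(m for i, m in enumerate(means) if i != item)
    z = (means[item] - best_other) / s
    # E[max(X, best_other)] for X ~ N(means[item], s^2).
    expected_best = best_other + s * (normal_pdf(z) + z * normal_cdf(z))
    return expected_best - max(means)


def voi_full(means, precisions, n_draws=100_000, seed=0):
    """Monte Carlo estimate of the gain from learning every true value exactly."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        total += max(rng.gauss(m, 1.0 / sqrt(p)) for m, p in zip(means, precisions))
    return total / n_draws - max(means)
```

For example, with two items in the initial belief state (`means=[0.0, 0.0]`, `precisions=[1.0, 1.0]`) and a hypothetical `sample_precision=1.0`, the myopic gain is smaller than the full gain, so the two features bracket the benefit of intermediate amounts of sampling.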

Results

🏃

TL;DR: Human fixation patterns show evidence for both uncertainty-directed and value-directed attention. We confirm novel predictions that fixation durations increase over the course of the trial and are shorter in trinary vs. binary choice. However, the optimal policy doesn't account for classic attentional choice biases in binary choice.

Each plot compares human data and model predictions for either binary or trinary choice, as indicated by either two or three dots in the panel. Expand for details.
Error bars (human) and shaded regions (model) indicate 95% confidence intervals computed by 10,000 bootstrap samples (the model confidence intervals are often too small to be visible). To provide a sense of the uncertainty in the fitting process, we depict the predictions of the thirty best-fitting parameter configurations. Each light red line depicts the predictions for one of those configurations. The dark red line shows the aggregated (mean) prediction.
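
As a rough illustration of how such intervals can be obtained, here is a minimal percentile-bootstrap sketch for the mean of one bin of data. The percentile method, the function name, and the choice of the mean as the statistic are assumptions made for this sketch; the exact procedure behind the plots may differ.

```python
import random


def bootstrap_ci(data, n_boot=10_000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `data`."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement and record the mean of each resample.
    boot_means = sorted(sum(rng.choices(data, k=n)) / n for _ in range(n_boot))
    lo_idx = int((1.0 - level) / 2.0 * n_boot)
    hi_idx = int((1.0 + level) / 2.0 * n_boot) - 1
    return boot_means[lo_idx], boot_means[hi_idx]
```

Calling `bootstrap_ci(values)` on the observations in one bin returns the lower and upper ends of the interval for that bin.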

Basic psychometrics

Number of fixations

Uncertainty-directed attention 🤔

We saw that estimate certainty (precision, $\lambda$ ) was a major driver of fixations in the optimal policy. To test for this in humans, we can look at relative cumulative fixation time, because in the model precision increases linearly with fixation time. Consistent with this, we see that people (1) are more likely to fixate on less fixated items, and (2) don't allow any item to get too far below the others.

**Distribution of fixation advantage for a newly fixated item.** Fixation advantage is the cumulative fixation time to the item minus the mean cumulative fixation time to the other item(s). The first fixation is excluded from this plot.
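
The fixation advantage defined above can be computed as in the following Python sketch. The data layout (a list of `(item, duration)` pairs per trial) and the function names are hypothetical, and we assume the advantage is measured at the onset of each new fixation.

```python
def fixation_advantage(cumulative_times, item):
    """Cumulative time on `item` minus the mean cumulative time on the others."""
    others = [t for i, t in enumerate(cumulative_times) if i != item]
    return cumulative_times[item] - sum(others) / len(others)


def new_fixation_advantages(fixations, n_items):
    """Fixation advantage of each newly fixated item, excluding the first fixation.

    `fixations` is a list of (item_index, duration_ms) pairs in temporal order.
    """
    cumulative = [0.0] * n_items
    advantages = []
    for k, (item, duration) in enumerate(fixations):
        if k > 0:  # the first fixation is excluded
            advantages.append(fixation_advantage(cumulative, item))
        cumulative[item] += duration
    return advantages
```

For a binary trial with fixations `[(0, 300), (1, 200), (0, 400)]`, the second fixation starts with an advantage of -300 ms and the third with an advantage of 100 ms.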

Caveat and stronger test of this effect.
A purely mechanical effect can account for the pattern above: the item that is currently fixated will on average have received the most fixation time, but it cannot be the target of a new fixation, which drives down the fixation advantage of newly fixated items. We can use the three-item dataset to address this issue. In this case, the target of each new fixation (excluding the first) must be one of the two items that are not currently fixated. Thus, comparing the cumulative fixation times for these items avoids the previous confound. We see the same pattern in both the data and model predictions. This suggests that uncertainty is not simply driving the decision to make a saccade, but is also influencing the location of that saccade.

Value-directed attention 😍

The second key driver of attention in the optimal policy is estimated value, which directs fixations to the items with the top two posterior means (see heatmaps above). As a result, the model predicts that fixation time depends on signed relative value only in the trinary case. Indeed, people do show a strong effect in the trinary case, but they also seem to show the effect in the binary case (to a lesser extent).

**Proportion of time fixating the left item as a function of its relative rating.**

We draw similar conclusions when looking only at the first fixation.

Looking at the time course of attention suggests that attention is driven by estimated values rather than the items' true values. (expand for explanation)
Previous work has suggested that fixations might be driven by the items' true values. In contrast, our model suggests that fixations are influenced by the value estimates constructed during the decision process. We can try to distinguish between these accounts by looking at the time course of attention. Early in the decision making process, estimated values will be only weakly related to true value. However, with time the value estimates become increasingly accurate and will thus more closely correlate with true value. Thus, if the decision maker always attends to the items with high estimated value, she should be increasingly likely to attend to items with high true value as the trial progresses. In the trinary case, this is exactly what we see. Interestingly, it seems that people immediately attend preferentially to the better item in the binary case, but again, the effect is weaker overall.

**Probability of fixating the lowest rated item as a function of the cumulative fixation time to any of the items** (roughly equal to time since trial onset).

Two additional tests using the third and fourth fixation in the trinary case.
The model makes even starker predictions in the three-item case. Consider all trials in which the decision-maker samples from different items during the first three fixations. The model predicts that the fourth fixation should be to the first-fixated item if its posterior mean is larger than that of the second-fixated item, and vice versa. As a result, the probability that the fourth fixation is a refixation to the first-fixated item increases with the difference in ratings between the first- and second-fixated items (panel D).
Panel E shows a similar effect for the third fixation: the probability of refixating the first-fixated item (rather than fixating the still unfixated item) increases with the value of the first-fixated item.
(D) Probability that the fourth fixation is to the first-fixated item as a function of the difference in rating between that item and the second-fixated item. (E) Probability that the third fixation is to the first-fixated item as a function of its rating.

Fixation durations increase over the trial and are shorter in trinary vs. binary choice

The optimal policy makes two novel (and one pre-established) predictions about fixation durations that are confirmed in the human fixation data.

Later (but non-final) fixations are longer.

Fixations are longer in the binary choice case. (We use the same parameters for both datasets!)

The final fixation is shorter (also predicted by the aDDM).

Intuition for model predictions:
The intuition for the first two effects is that more evidence is needed to alter beliefs when their precision is already high; this occurs late in the trial, especially in the two-item case where samples are split between fewer items. The intuition for the last effect is that the final fixation is cut off when a choice is made.
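
The first part of this intuition can be checked with a couple of lines of Python. Under a conjugate Gaussian update, the weight a new sample gets in the posterior mean shrinks as precision grows. The function name and the noise level `sigma_x = 2.0` are hypothetical, and this is the standard conjugate form rather than a quote from the preprint.

```python
def sample_weight(precision: float, sigma_x: float) -> float:
    """Weight given to one new sample when updating the posterior mean."""
    sample_precision = 1.0 / sigma_x**2
    return sample_precision / (precision + sample_precision)


sigma_x = 2.0  # hypothetical sampling noise
# Precision after n earlier samples of an item, starting from a prior precision of 1.
weights = [sample_weight(1.0 + n / sigma_x**2, sigma_x) for n in range(0, 21, 5)]
```

Because the weight falls with every sample already collected, a late sample moves the estimate less than an early one, so more samples (a longer fixation) are needed to change the belief by the same amount.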

**Duration of fixation by fixation number.** Final fixations are excluded from all but the last bin. A model fixation is defined as a contiguous sequence of samples drawn from the same item, and the duration is 100ms times the number of samples.
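
The definition of a model fixation above translates directly into code. Here is a minimal Python sketch that converts a sequence of sampled items into fixations. The function name and input format are hypothetical; the 100 ms per sample comes from the caption.

```python
from itertools import groupby

SAMPLE_MS = 100  # duration of one sample, as stated in the caption


def samples_to_fixations(sampled_items):
    """Group contiguous samples of the same item into (item, duration_ms) fixations."""
    return [
        (item, SAMPLE_MS * sum(1 for _ in run))
        for item, run in groupby(sampled_items)
    ]
```

For example, the sample sequence `[0, 0, 0, 1, 1, 0]` yields three fixations: 300 ms on item 0, 200 ms on item 1, and 100 ms on item 0.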

Attentional choice biases

A primary and frequently reproduced finding is that people tend to choose the item they looked at more (when the items are desirable). Our model predicts this effect in trinary, but not binary choice.

Probability that the left item is chosen as a function of its final fixation advantage, given by total fixation time to the left item minus the mean total fixation time to the other item(s).

What's going on?
Two mechanisms that predict the attentional choice bias. Left: in the aDDM, attending to an item inflates its value, making it more likely to be chosen; this is a causal chain structure. Right: in the proposed model, value estimates drive both fixations and choice, correlating them through a common cause structure.
The standard explanation for this effect (as implemented in the aDDM) is that (1) attention strengthens evidence, and (2) evidence for desirable items is generally positive; this implies that stronger evidence is also more positive evidence.
In contrast, our model assumes an unbiased prior distribution and so evidence is not systematically positive or negative. As a result, we predict no bias in the binary case.
However, we do predict a bias in the trinary case due to a common cause mechanism. Items with higher value estimates are more likely to be both fixated and chosen.
In a recent preprint, Jang, Sharma, and Drugowitsch present a very similar formalization of attention-modulated binary choice (converging evidence for the approach!). Their model predicts the choice bias because they assume a mean-zero prior (similar to the aDDM). However, in our own attempts to fit the prior, we found that a nearly unbiased prior gives the best description of the full data, even though it fails to capture the choice bias. Puzzling! 🤔

Additional choice biases
Probability that the last fixated item is chosen as a function of its relative rating.
Probability of choosing the first-seen item as a function of the first-fixation duration.

Conclusion

When making simple choices, visual attention and value estimation interact reciprocally, allowing people to sample the most useful information for making a choice.

Future work

many-alternative (N>10) choice

feature-based attention (multi-attribute choice)

neurally plausible approximations to the optimal policy

👉

Related work at SNE: Check out Romy Frömer's talk in the "Risk, uncertainty, delay" session, "Considering what we know and what we don’t know: Expectations and metacognition guide value integration during economic choice".

😄

Thanks for reading! Want more info? Read the preprint or send me an email!