
Fixation patterns in simple choice as optimal use of cognitive resources
Welcome to my poster! This isn't a poster, you say? Well, I think using a big horizontal pdf to present your work in a virtual setting makes about as much sense as putting a screen door on a submarine. Free yourself from the shackles of imitating print media! Embrace the silver lining!
Questions? Comments? Looking for a walkthrough? Email me! fredcallaway@princeton.edu
Read the preprint! https://psyarxiv.com/57v6k
Abstract
Long version
Even when choosing among a small set of alternatives, people don't perfectly evaluate every option. Instead, decisions seem to be made on the basis of sequentially accumulated, noisy evidence. Previous research has suggested that the accumulation process is modulated by visual attention such that evidence accumulates more rapidly for attended items. But what guides attention itself?
To answer this question, we formalize decision making as a Bayesian evidence accumulation process, similar to a drift diffusion model but with explicit representations of uncertainty. To capture attention, we assume that samples can only be collected for the fixated item. We additionally assume a cost for each sample as well as a cost for switching between items (making saccades). The problem of attention allocation is thus cast as a sequential decision problem in which a decision maker must continuously decide whether to select an item or keep sampling, and in the latter case, which item to sample from. The optimal attention allocation policy is the one that maximizes the expected value of the chosen item less the costs incurred by the decision making process.
We approximate the optimal fixation policy using tools from metareasoning in artificial intelligence. We find that fixations are drawn to items whose value estimates are uncertain and close to those of the competing item(s). Furthermore, we find that in the case of trinary—but not binary—choice, attention is preferentially directed to items with higher estimated value. The model thus provides a normative foundation for recently proposed models of both uncertainty-directed and value-directed attention. It additionally specifies a near-optimal tradeoff between these two factors, as well as a near-optimal stopping rule and fixation-termination rule.
Comparing model predictions to human behavior in two previously collected binary and trinary choice datasets, we find that the model accounts for many previously identified effects and we also confirm several novel predictions. Together, our results suggest that the evolving state of the decision process influences the fixation process in a way that is consistent with the optimal use of limited cognitive resources.
Background

- Choices and reaction times in simple value-based decision making tasks are well modeled by information sampling (or "evidence accumulation") models such as the Drift Diffusion Model and the Leaky Competing Accumulators model.
- Manipulating participants' visual attention shows that people are more likely to choose items that they look at longer when the items are positively valenced; the effect reverses for aversive items.
- This effect is captured by models like the attentional DDM, in which evidence accumulates more rapidly for the attended item. However, these models leave a critical question unanswered:
❓ How do people decide what to look at when making decisions ❓
To address this question, we frame attention allocation in simple choice as an active information sampling problem, and derive the optimal policy.
Information sampling model

- See Tajima et al. (2016) and (2019) for similar Bayesian formulations of evidence accumulation, but without attentional modulation.
- Our model was directly inspired by a formalization of computational resource allocation for artificial intelligence systems.
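To make the sampling idea concrete, here is a minimal sketch of a Bayesian evidence accumulator in which only the fixated item's belief is updated. The Gaussian belief, the names `Belief` and `update_fixated`, and the parameter `sample_precision` are assumptions made for illustration; they are not the preprint's notation or code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Belief:
    mean: float       # posterior mean of the item's value
    precision: float  # posterior precision (1 / variance)


def update_fixated(beliefs: List[Belief], fixated: int, sample: float,
                   sample_precision: float) -> List[Belief]:
    """Return new beliefs after one noisy sample of the fixated item.

    Standard conjugate Gaussian update with known sample precision.
    Unattended items are left unchanged (an assumption of this sketch).
    """
    old = beliefs[fixated]
    new_precision = old.precision + sample_precision
    new_mean = (old.precision * old.mean + sample_precision * sample) / new_precision
    updated = list(beliefs)
    updated[fixated] = Belief(new_mean, new_precision)
    return updated


# Two items with identical priors; one sample of value 2.0 for item 0.
beliefs = [Belief(0.0, 1.0), Belief(0.0, 1.0)]
beliefs = update_fixated(beliefs, fixated=0, sample=2.0, sample_precision=1.0)
```

In this sketch the fixated item's mean moves toward the sample and its precision grows, while the other item's belief stays put.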
Optimal policy
The optimal policy maximizes expected payoff, which is defined as the value of the chosen item minus the total cost incurred while deciding. The cost incurred at each time step includes a fixed sampling cost as well as an additional switching/saccade cost if the fixated item is different from the last timestep.
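One way to write this objective, consistent with the verbal description here, is sketched below. The symbols are assumptions introduced for illustration: $u^{(i)}$ is the true value of item $i$, $c$ is the chosen item, $T$ is the number of samples taken before choosing, $f_t$ is the item fixated at step $t$, and $\gamma_{\text{sample}}$ and $\gamma_{\text{switch}}$ are the sampling and switching costs. How the very first fixation is charged is left open in this sketch, and the preprint's notation may differ.

```latex
\pi^* = \arg\max_{\pi} \; \mathbb{E}\!\left[ u^{(c)} - \sum_{t=1}^{T} \mathrm{cost}(f_{t-1}, f_t) \;\middle|\; \pi \right],
\qquad
\mathrm{cost}(f_{t-1}, f_t) = \gamma_{\text{sample}} + \gamma_{\text{switch}} \, \mathbb{1}[f_t \neq f_{t-1}]
```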
It's intractable to compute the optimal policy for decisions between more than two items. 😫 Thus, we approximate it using a recently proposed method based on reinforcement learning and value of information features.

Two key features drive optimal attention allocation:
- 🤔 Uncertainty: how unsure am I about this item's value?
- 😍 Value (when N > 2): how valuable do I think this item is?
Illustration of how uncertainty and value affect the value of sampling an item.
Each panel shows a belief state for trinary choice. The curves show the probability density of each item's value, and the area of each shaded region shows the probability that the item's true value is better than the current leading value estimate (the gray item). This probability correlates strongly with the value of sampling the item because sampling is only valuable if it changes the choice. In each case, it is more valuable to sample the orange item than the purple item because either (top) its value is more uncertain, or (bottom) its value is closer to the leading value.
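The shaded-area probability can be computed directly if beliefs are Gaussian. The sketch below assumes exactly that; the function names and the example numbers are made up for illustration and are not taken from the figure.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_beats_leader(mean: float, sd: float, leading_mean: float) -> float:
    """P(true value > leading_mean) under a Gaussian belief N(mean, sd^2)."""
    return 1.0 - normal_cdf((leading_mean - mean) / sd)


leader = 1.0  # hypothetical leading value estimate

# Top-panel analogue: same mean, different uncertainty.
p_uncertain = prob_beats_leader(mean=0.0, sd=1.0, leading_mean=leader)
p_certain = prob_beats_leader(mean=0.0, sd=0.5, leading_mean=leader)

# Bottom-panel analogue: same uncertainty, different distance to the leader.
p_close = prob_beats_leader(mean=0.5, sd=0.5, leading_mean=leader)
p_far = prob_beats_leader(mean=0.0, sd=0.5, leading_mean=leader)
```

With these numbers, the more uncertain item and the closer item each have the larger chance of overtaking the leader, matching the qualitative pattern in the two rows.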
Illustration of "value of information" features used to approximate the optimal policy.
Our approximation method comes from the field of rational metareasoning, which is concerned with the allocation of computational resources in AI systems. In our model, a "computation" corresponds to drawing a value sample and updating the belief accordingly. The plot below shows that we can bound the expected benefit of further computation by (relatively) easily computable expressions. Our method basically learns to predict the true value of sampling by interpolating between these extremes. See the preprint for a thorough discussion and derivations.
The solid line shows the average value of the item chosen after different numbers of computations selected by a near-optimal policy assuming no computational costs. The dashed lines show values for two of the VOI features in the initial belief state: VOI_myopic is the value after one computation and VOI_full is the asymptotic value after infinite computations.
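For intuition, a myopic value of information has a closed form when beliefs are Gaussian. The sketch below computes the expected gain in the best posterior mean from one more sample of a given item. It assumes Gaussian beliefs with known sample precision, reports the gain relative to choosing now (rather than the raw value plotted above), and uses the hypothetical names `voi_myopic` and `sample_precision`; see the preprint for the features actually used.

```python
import math


def normal_pdf(z: float) -> float:
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def voi_myopic(means, precisions, item, sample_precision):
    """Expected gain in the best posterior mean from one more sample of `item`."""
    mu, lam = means[item], precisions[item]
    best_other = max(m for i, m in enumerate(means) if i != item)
    # Std. dev. of the item's updated posterior mean, before the sample is seen.
    s = math.sqrt(1.0 / lam - 1.0 / (lam + sample_precision))
    z = (mu - best_other) / s
    # E[max(X, best_other)] for X ~ Normal(mu, s^2).
    expected_max = best_other + s * normal_pdf(z) + (mu - best_other) * normal_cdf(z)
    return expected_max - max(means)


v_tie = voi_myopic([0.0, 0.0], [1.0, 1.0], item=0, sample_precision=1.0)
v_uncertain = voi_myopic([0.0, 0.0], [0.5, 1.0], item=0, sample_precision=1.0)
v_far = voi_myopic([-2.0, 0.0], [1.0, 1.0], item=0, sample_precision=1.0)
```

In this sketch, sampling is worth more when the item is less precisely known (`v_uncertain`) and worth less when its estimate is far below the leader (`v_far`).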
Results
Each plot compares human data and model predictions for either binary or trinary choice, as indicated by either two or three dots in the panel.
Error bars (human) and shaded regions (model) indicate 95% confidence intervals computed by 10,000 bootstrap samples (the model confidence intervals are often too small to be visible). To provide a sense of the uncertainty in the fitting process, we depict the predictions of the thirty best-fitting parameter configurations. Each light red line depicts the predictions for one of those parameters. The dark red line shows the aggregated (mean) prediction.
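As a rough illustration of how such intervals can be obtained, the sketch below computes a percentile bootstrap confidence interval for a mean. The percentile method, the function name `bootstrap_ci`, and the toy data are assumptions; the text above does not say which bootstrap variant was used.

```python
import random


def bootstrap_ci(values, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    boot_means = sorted(
        sum(rng.choices(values, k=n)) / n for _ in range(n_boot)
    )
    lo = boot_means[int((alpha / 2) * n_boot)]
    hi = boot_means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Toy data, not from the experiments.
toy = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ci_low, ci_high = bootstrap_ci(toy, n_boot=2000)
```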
Uncertainty-directed attention 🤔
We saw that estimate certainty (precision) was a major driver of fixations in the optimal policy. To test for this in humans, we can look at relative cumulative fixation time, because precision increases linearly with fixation time. Consistent with this, we see that people (1) are more likely to fixate on less fixated items, and (2) don't allow any item to get too far below the others.

Caveat and stronger test of this effect.
A purely mechanical effect can account for the pattern above: the item that is currently fixated will on average have received the most fixation time, but it cannot be the target of a new fixation, which drives down the fixation advantage of newly fixated items. We can use the three-item dataset to address this issue. In this case, the target of each new fixation (excluding the first) must be one of the two items that are not currently fixated. Thus, comparing the cumulative fixation times for these items avoids the previous confound. We see the same pattern in both the data and model predictions. This suggests that uncertainty is not simply driving the decision to make a saccade, but is also influencing the location of that saccade.
Value-directed attention 😍
The second key driver of attention in the optimal policy is estimated value, which directs fixations to the items with the top two posterior means (see heatmaps above). As a result, the model predicts that fixation time depends on signed relative value only in the trinary case. Indeed, people do show a strong effect in the trinary case, but they also seem to show the effect in the binary case (to a lesser extent).

Looking at the time course of attention suggests that attention is driven by estimated values rather than the items' true values.
Previous work has suggested that fixations might be driven by the items' true values. In contrast, our model suggests that fixations are influenced by the value estimates constructed during the decision process. We can try to distinguish between these accounts by looking at the time course of attention. Early in the decision making process, estimated values will be only weakly related to true value. However, with time the value estimates become increasingly accurate and will thus more closely correlate with true value. Thus, if the decision maker always attends to the items with high estimated value, she should be increasingly likely to attend to items with high true value as the trial progresses. In the trinary case, this is exactly what we see. Interestingly, it seems that people immediately attend preferentially to the better item in the binary case, but again, the effect is weaker overall.
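The claim that estimates track true value more closely over time can be checked with a small calculation. The sketch below assumes true values are drawn from a Gaussian prior and that samples carry independent Gaussian noise; under those assumptions the correlation between the posterior mean and the true value has a simple closed form. The function name and the parameter values are hypothetical.

```python
import math


def corr_estimate_true(n_samples: int, prior_sd: float, sample_sd: float) -> float:
    """Correlation between the posterior mean and the true value after n samples.

    The posterior mean is an increasing linear function of the sample mean,
    so this equals the correlation between the true value and the sample mean.
    """
    if n_samples < 1:
        raise ValueError("need at least one sample")
    return prior_sd / math.sqrt(prior_sd ** 2 + sample_sd ** 2 / n_samples)


# Noisy samples: the estimate is weakly informative early, accurate later.
early = corr_estimate_true(1, prior_sd=1.0, sample_sd=3.0)
middle = corr_estimate_true(9, prior_sd=1.0, sample_sd=3.0)
late = corr_estimate_true(100, prior_sd=1.0, sample_sd=3.0)
```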

Two additional tests using the third and fourth fixation in the trinary case.
The model makes even starker predictions in the three-item case. Consider all trials in which the decision-maker samples from different items during the first three fixations. The model predicts that the fourth fixation should be to the first-fixated item if its posterior mean is larger than that of the second-fixated item, and vice versa. As a result, the probability that the fourth fixation is a refixation to the first-fixated item increases with the difference in ratings between the first- and second-fixated items (panel D).
Panel E shows a similar effect for the third fixation: the probability of refixating the first-fixated item (rather than fixating the still unfixated item) increases with the value of the first-fixated item.
(D) Probability that the fourth fixation is to the first-fixated item as a function of the difference in rating between that item and the second-fixated item. (E) Probability that the third fixation is to the first-fixated item as a function of its rating.
Fixation durations increase over the trial and are shorter in trinary vs. binary choice
The optimal policy makes two novel (and one pre-established) predictions about fixation durations that are confirmed in the human fixation data.
- Later (but non-final) fixations are longer.
- Fixations are longer in the binary choice case. (We use the same parameters for both datasets!)
- The final fixation is shorter (also predicted by the aDDM).
Intuition for model predictions:
The intuition for the first two effects is that more evidence is needed to alter beliefs when their precision is already high; this occurs late in the trial, especially in the two-item case where samples are split between fewer items. The intuition for the last effect is that the final fixation is cut off when a choice is made.
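The first part of this intuition can be seen in the standard Gaussian update, where the shift in the posterior mean from a single sample shrinks as the current precision grows. This is a sketch under that Gaussian assumption; `mean_shift` and its arguments are illustrative names.

```python
def mean_shift(precision: float, sample_precision: float, surprise: float) -> float:
    """Change in the posterior mean caused by one sample.

    `surprise` is the sample minus the current mean; the weight on it
    falls as the current precision rises.
    """
    return sample_precision / (precision + sample_precision) * surprise


# The same surprising sample moves a vague belief far more than a sharp one.
shift_early = mean_shift(precision=1.0, sample_precision=1.0, surprise=1.0)
shift_late = mean_shift(precision=9.0, sample_precision=1.0, surprise=1.0)
```

Because each sample moves a precise belief less, more samples (a longer fixation) are needed to change which item looks best.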

Attentional choice biases
A primary and frequently reproduced finding is that people tend to choose the item they looked at more (when the items are desirable). Our model predicts this effect in trinary, but not binary choice.

What's going on?
Two mechanisms that predict the attentional choice bias. Left: in the aDDM, attending to an item inflates its value, making it more likely to be chosen; this is a causal chain structure. Right: in the proposed model, value estimates drive both fixations and choice, correlating them through a common cause structure.
The standard explanation for this effect (as implemented in the aDDM) is that (1) attention strengthens evidence, and (2) evidence for desirable items is generally positive; this implies that stronger evidence is also more positive evidence.
In contrast, our model assumes an unbiased prior distribution and so evidence is not systematically positive or negative. As a result, we predict no bias in the binary case.
However, we do predict a bias in the trinary case due to a common cause mechanism. Items with higher value estimates are more likely to be both fixated and chosen.
In a recent preprint, Jang, Sharma, and Drugowitsch present a very similar formalization of attention-modulated binary choice (converging evidence for the approach!). Their model predicts the choice bias because they assume a mean-zero prior (similar to the aDDM). However, in our own attempts to fit the prior, we found that a nearly unbiased prior gives the best description of the full data, even though it fails to capture the choice bias. Puzzling! 🤔
Conclusion
When making simple choices, visual attention and value estimation interact reciprocally, allowing people to sample the most useful information for making a choice.
Future work
- many-alternative (N>10) choice
- feature-based attention (multi-attribute choice)
- neurally plausible approximations to the optimal policy