Palamedes fits psychometric
functions (PFs) using a maximum likelihood criterion. Simply put, of all possible PFs (i.e., all combinations of possible values for the free parameters), it finds the PF that has the highest probability of generating the exact responses that your observer gave to the stimuli. Unfortunately, there is no analytical way to do this. Instead, Palamedes has to search for this best-fitting PF. This turns out to be tricky. There are an infinite
number of candidates to consider! Understanding how maximum likelihood fitting works will help you get better fits and avoid
mistakes. We'll explain by example.
Let's say you perform a 2AFC experiment. You use 9 different stimulus levels (log stimulus levels run from -0.4 to
0.4 in steps of 0.1) and 10 trials at each of these stimulus levels. The number of correct responses at these stimulus levels
are 6, 6, 7, 9, 10, 7, 9, 10, 10, respectively. The corresponding proportions correct are shown by the black symbols in Figure 1. Nice!
Looks like you used an appropriate set of stimulus levels (proportions correct go from near guessing levels to perfect with
some intermediate values in between).
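To make the example concrete, here is the data set in code. This is a Python transcription for illustration only (the Palamedes code snippets themselves are MATLAB); the variable names mimic Palamedes conventions but are otherwise our own.

```python
# The example data set: 9 log stimulus levels, 10 trials each.
StimLevels = [-0.4, -0.3, -0.2, -0.1, 0.0, 0.1, 0.2, 0.3, 0.4]
NumPos = [6, 6, 7, 9, 10, 7, 9, 10, 10]  # number of correct responses
OutOfNum = [10] * 9                      # number of trials per level

# Proportions correct (the black symbols in Figure 1).
pcorrect = [k / n for k, n in zip(NumPos, OutOfNum)]
print(pcorrect)  # runs from near guessing (0.5 in 2AFC) up to 1.0
```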
Maximizing the Likelihood: Nelder-Mead Simplex Search
So, of all the
possible PFs, which maximizes the likelihood? PFs are completely determined by their
shape (Logistic, Weibull, etc.) and the values of four parameters: the 'threshold', the 'slope', the 'guess rate' and the
'lapse rate'. You used a 2AFC task and that means the guess rate equals 0.5. You have utmost confidence in your observer and
you feel it is safe to assume that the lapse rate equals 0. You also assume that a Gumbel (or log-Weibull) describes the PF
well. This reduces your problem to finding that combination of a threshold value and a slope value that maximizes the likelihood.
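In code, the quantity being maximized looks like this. This is a hedged Python sketch, not Palamedes' own implementation: the Gumbel form and the binomial likelihood are standard, but the function names are ours.

```python
import math

def gumbel(x, alpha, beta, gamma=0.5, lam=0.0):
    # Gumbel (log-Weibull) PF: gamma + (1 - gamma - lambda) *
    # (1 - exp(-10**(beta * (x - alpha)))); alpha = threshold, beta = slope.
    return gamma + (1 - gamma - lam) * (1 - math.exp(-10 ** (beta * (x - alpha))))

def log_likelihood(alpha, beta, stim, num_pos, out_of):
    # Binomial log likelihood summed over stimulus levels; the constant
    # binomial coefficients are omitted since they do not affect the maximum.
    ll = 0.0
    for x, k, n in zip(stim, num_pos, out_of):
        p = min(max(gumbel(x, alpha, beta), 1e-12), 1 - 1e-12)  # guard log(0)
        ll += k * math.log(p) + (n - k) * math.log(1 - p)
    return ll

stim = [-0.4 + 0.1 * i for i in range(9)]
num_pos = [6, 6, 7, 9, 10, 7, 9, 10, 10]
print(log_likelihood(-0.1, 2.0, stim, num_pos, [10] * 9))
```

Maximizing the likelihood is equivalent to maximizing this log likelihood, which is numerically far better behaved.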
How to find it? Palamedes employs a commonly used search algorithm known as the Nelder-Mead
simplex search (Nelder & Mead, 1965). Imagine
a tripod standing on the side of a hill. Longitude corresponds to threshold, latitude corresponds to slope, and the height
of the hill corresponds to likelihood. The tripod's goal is to get to the top of the hill. The position of each of the three feet of the tripod defines a combination of a threshold value and a slope value (in other words, each foot represents a possible PF). The triangle defined by the positions of the feet is termed a 'simplex'. The tripod calculates the likelihood (height
of the hill) associated with each of the three PFs. It then tries to swap out the position
of the lowest foot with a position that is higher up the hill. It uses a couple of simple rules to find a better position
for its lowest foot. For example, the first thing it tries is to swing its lowest foot to the other side of the line that runs between the other two feet (a 'reflection'). If this foot is now higher than the other two feet, it will swing the foot farther in the same direction. If this position farther out is higher still, it puts the foot there (this move is called an 'expansion'). To see the full set of rules, click the link at the bottom of this page. Once the
tripod has moved its lowest foot to a higher
position on the hill, there is now a new lowest
foot. It will now go through the same set of rules to move this new lowest foot to a higher position on the hill. Etc., etc. This process generally works very well: If the tripod starts on the side of a hill, it will generally
find its way to the top of the hill. The animated
gif below (Figure 2) shows the moves the tripod makes for our problem above when started at a rather arbitrary point on the
hill side. Note that it does indeed make it to the top. Note also that along the way the simplex changes its shape and size
and that when it gets close to its target it gets really small.
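The same hill climbing can be reproduced with any Nelder-Mead implementation. The sketch below uses SciPy's; it illustrates the principle but is not Palamedes' own code, and the starting point is chosen arbitrarily.

```python
import math
from scipy.optimize import minimize

stim = [-0.4 + 0.1 * i for i in range(9)]
num_pos = [6, 6, 7, 9, 10, 7, 9, 10, 10]
n = 10

def neg_ll(params):
    # Negative log likelihood of a (threshold, slope) pair for a Gumbel with
    # guess rate 0.5 and lapse rate 0; minimizing this maximizes the likelihood.
    alpha, beta = params
    ll = 0.0
    for x, k in zip(stim, num_pos):
        e = min(beta * (x - alpha), 300.0)  # guard against overflow
        p = 0.5 + 0.5 * (1 - math.exp(-10 ** e))
        p = min(max(p, 1e-12), 1 - 1e-12)
        ll += k * math.log(p) + (n - k) * math.log(1 - p)
    return -ll

# Start the simplex at a rather arbitrary point and let it climb the hill.
res = minimize(neg_ll, x0=[0.0, 1.0], method='Nelder-Mead')
print(res.x)  # best-fitting [threshold, slope]
```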
The simplex will never
make it to the exact position of the highest point on the hill but it can get arbitrarily close to it. The decision to stop
the search is based on the size of the simplex and is set using 'tolerance' values. There are actually two tolerance values
involved, one ('TolX') has to do with the values of the parameters, the other ('TolFun') has to do with the likelihood values.
Simply put, the search stops when all feet of the tripod are no farther than the value of TolX from the best foot in either
the threshold direction or the slope direction (measured in whatever units threshold and slope are measured in) AND the log
likelihood associated with the worst foot is no farther than the value of TolFun below the log likelihood associated with
the best foot. In other words, when the simplex is visualized in three dimensions with the third dimension corresponding to
the value to be maximized, the search stops once the simplex fits in a box with length and width no larger than TolX and a
height no more than TolFun. Both TolX and TolFun are, by default, set to 1e-6 (0.000001) in Palamedes. Once the tolerance
criteria have been met the search stops and the threshold and slope values corresponding to the position of the best foot
are reported as the parameter estimates. The solution converged on using the above strategy is the green curve in Figure 1.
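SciPy's Nelder-Mead exposes the same two stopping criteria under the names 'xatol' and 'fatol'; the sketch below sets them to the Palamedes defaults of 1e-6. The quadratic objective is a stand-in for a negative log likelihood.

```python
from scipy.optimize import minimize

def objective(p):
    # Stand-in for a negative log likelihood: minimum at (1, -2).
    return (p[0] - 1) ** 2 + (p[1] + 2) ** 2

res = minimize(objective, x0=[0.0, 0.0], method='Nelder-Mead',
               options={'xatol': 1e-6,   # cf. TolX: parameter tolerance
                        'fatol': 1e-6})  # cf. TolFun: function-value tolerance
print(res.x)  # within the tolerance criteria of [1, -2]
```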
Code snippet 1 (click link at bottom of page to download code snippets; code snippets require Palamedes) performs the fit described above, starting at the same initial position of the simplex as shown in Figure 2 (note though that after the 23 iterations
shown in Figure 2, the tolerance values have not yet been met, so code snippet 1 performs more iterations than are shown in Figure 2).
This principle will
readily generalize to higher dimensional parameter spaces. For example, let's say you read somewhere that you should allow
the lapse rate to vary when you fit a PF. You now have three parameters to estimate (threshold, slope and lapse rate). Nelder-Mead
will work, in principle, for parameter spaces of any dimension. The simplex will always have n + 1 vertices for an n-dimensional
parameter space. A graphical representation of the three dimensional (threshold x slope x lapse rate) likelihood function
for the data in Figure 1 is given in Figure 3.
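A hedged Python sketch of such a three-parameter fit (threshold, slope, and lapse rate all free); the simplex now has four vertices moving through a 3D parameter space. The crude bound on the lapse rate is our own addition, not something Palamedes does this way.

```python
import math
from scipy.optimize import minimize

stim = [-0.4 + 0.1 * i for i in range(9)]
num_pos = [6, 6, 7, 9, 10, 7, 9, 10, 10]
n = 10

def neg_ll(params):
    alpha, beta, lam = params
    if not (0.0 <= lam < 0.5):   # crude guard: keep the lapse rate sensible
        return float('inf')
    ll = 0.0
    for x, k in zip(stim, num_pos):
        # Gumbel with guess rate 0.5 and free lapse rate lam.
        e = min(beta * (x - alpha), 300.0)  # guard against overflow
        p = 0.5 + (0.5 - lam) * (1 - math.exp(-10 ** e))
        p = min(max(p, 1e-12), 1 - 1e-12)
        ll += k * math.log(p) + (n - k) * math.log(1 - p)
    return -ll

res = minimize(neg_ll, x0=[0.0, 1.0, 0.02], method='Nelder-Mead')
print(res.x)  # [threshold, slope, lapse rate]
```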
Local and Global Maxima: Ending up on the Wrong Hill
The solids in Figure
3 contain those regions where the likelihood is greater than one-fourth the maximum likelihood and the marginal contours show
marginal maxima. The hill analogy doesn't work very well in 3D parameter space, but you can now perhaps think of the simplex
as a tetrapod hovering in parameter space that has, say, a thermometer at the end of each of its four 'feet' and is 'stepping'
through 3D space looking for a heat source. Each step involves moving the coldest foot to a warmer location while keeping
the other three feet in position. The observant reader will have realized the problem. There are two heat sources in the likelihood
function shown in the figure, one hotter than the other. Note that there is still only one unique location that has the absolute
'maximum likelihood'. This happens to have a lapse rate equal to 0 and so it is the solution we found before (when we fixed
the lapse rate at 0). A problem occurs if the simplex starts near the smaller heat source. Each step the tetrapod makes will
be towards a warmer spot and as a result it will move toward the nearer, smaller heat source and converge on its center. Its
center, by the way, corresponds to the PF shown in red in Figure 1. This solution is known as a local maximum: While it is
the maximum point in its immediate neighborhood, it is not the absolute maximum in the parameter space. The inherent problem
is that the simplex will only feel around in its immediate proximity. As such, it will find the top of whatever hill it was
positioned on when it started its search. Importantly, the simplex has no way of knowing whether the top of the hill (or the
source of heat or whatever) it eventually converges on is the top of the highest hill (or the center of the biggest heat source).
In order to avoid ending up in a local maximum, it
is important that we find a starting position for our
simplex that is on the hillside of the highest hill in the likelihood space. One way
in which to do this is to find a starting position by first performing a 'brute force' search
through a grid defined across the parameter
space. The grid is defined by specifying a range of values for all of the free parameters. Likelihood values are then calculated for all combinations
of parameter values contained in the grid. The
combination with the highest likelihood serves as the starting point for the simplex. Lucky
for us, today's computers can perform a brute force search through a finely spaced search grid in a very short period of time.
For example, the (very modest) laptop at which this is written, just performed a brute force search
through this grid: searchGrid.alpha = -.4:.005:.4,
searchGrid.beta = 10.^[-1:.025:2], searchGrid.lambda
= [0:.01:.15] in about 2/3 of a second. There
are a total of 311,696 PFs in this grid (55,809 of which are contained in the space shown in Figure 3). Code snippet 2 performs the fit as described above.
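The idea of the brute-force search can be sketched as nested loops over a grid. The grid below is much coarser than the one quoted above, just to keep the example quick; the best grid point then seeds the simplex.

```python
import math

stim = [-0.4 + 0.1 * i for i in range(9)]
num_pos = [6, 6, 7, 9, 10, 7, 9, 10, 10]
n = 10

def ll(alpha, beta, lam):
    # Log likelihood of a Gumbel with guess rate 0.5 and lapse rate lam.
    total = 0.0
    for x, k in zip(stim, num_pos):
        p = 0.5 + (0.5 - lam) * (1 - math.exp(-10 ** (beta * (x - alpha))))
        p = min(max(p, 1e-12), 1 - 1e-12)
        total += k * math.log(p) + (n - k) * math.log(1 - p)
    return total

alphas = [-0.4 + 0.05 * i for i in range(17)]        # -0.4 : 0.05 : 0.4
betas = [10 ** (-1 + 0.25 * i) for i in range(13)]   # 10.^(-1 : 0.25 : 2)
lambdas = [0.03 * i for i in range(6)]               # 0 : 0.03 : 0.15

# Evaluate every combination; the highest likelihood wins.
best = max((ll(a, b, l), a, b, l) for a in alphas for b in betas for l in lambdas)
print(best[1:])  # the (threshold, slope, lapse) that seed the simplex search
```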
Note that the search grid that searches for a starting point
for a simplex should include all the free parameters that will be considered by the simplex search. For example, the lesser heat source shown in Figure 3 will have a higher likelihood than the greater heat source
when we fix the lapse rate at 0.06. Thus, a brute force search through threshold and slope values that
uses a fixed value of 0.06 for the lapse rate would result in an initial position for the simplex
near the lesser heat source. A subsequent simplex search that
includes the lapse rate but starts at the initial position we found while fixing the lapse rate at
0.06 would then converge on the maximum value in the lesser heat source. This is demonstrated in code snippet 3.
Failed Searches: Not Finding the Top of Any Hill
If a search results in a local maximum as described above, the Nelder-Mead algorithm will nevertheless report that the search
converged successfully (e.g., see code snippet 3). To Nelder-Mead, simply put, reaching the top of the hill that it started
on means that the search was successful. It has no way of knowing whether the peak it has reached corresponds to a local or
a global maximum. This problem of local maxima, however, can be adequately dealt with using a brute-force search as described above. In other situations, however, the Nelder-Mead algorithm will report that it failed to converge. If this happens, Palamedes will generally report
parameter estimates that are nowhere near those expected. Also, the 'exitflag' returned by, say, PAL_PFML_Fit will be set
to 0 (or logical FALSE) in case of a failed search. Under what circumstances might this happen?
Figure 4 shows some hypothetical data from a 2AFC experiment.
There were 5 stimulus intensities used with 10 trials at each of these intensities. Using PAL_PFML_Fit to perform a simplex
search for a threshold value and a slope value while keeping the guess and lapse rates fixed (we used 0.5 and 0.03 respectively)
will not converge. The simplex search will chase after a maximum in the likelihood function but will not find one. After a
criterion number of iterations and/or function evaluations (the default is 800 for each when searching for two parameters),
the simplex gives up on finding a maximum, and PAL_PFML_Fit sets the 'exitflag' to 0 (or logical FALSE) to let you know about
it. The parameter values it returns for this specific problem are -4550.9613 for the threshold and 0.000035559 for the slope
(code snippet 4). These numbers seem outrageous and arbitrary. However, keep in mind that the Gumbel is a monotonically increasing function of stimulus intensity, while these data follow an overall downward trend! As we will argue, Palamedes failed to find a maximum
because there was no maximum to be found! The data do not support the estimation of a threshold and a slope. The seemingly
outrageous and arbitrary fit of Palamedes is a typical case of 'garbage in, garbage out'. Or is it?
The green horizontal line in Figure 4, which has a value of
0.86 (the overall proportion correct across all 50 trials), is not actually a horizontal line. It is
(a tiny part of) a Gumbel function! If by now you guessed that it is a Gumbel with threshold equal to -4550.9613 and slope
equal to 0.000035559 (guess rate = 0.5, and lapse rate = 0.03) you are right. The parameter estimates Palamedes reported are
not arbitrary at all! Even more, from a purely statistical perspective they are not outrageous either! Palamedes came asymptotically close to fitting a horizontal line at the observed overall proportion correct. This is the best fit that can be obtained
given the constraint that a Gumbel needs to be fitted. Note that there is indeed no maximum in the likelihood function. The
line shown in Figure 4 can always be made a little straighter and more horizontal by moving the threshold to an even lower
value and adjusting the slope value such that, within the stimulus intensity range, the function has values that are asymptotically
close to 0.86.
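This is easy to verify numerically. The sketch below plugs the reported estimates into the Gumbel (with guess rate 0.5 and lapse rate 0.03, as above) and shows that the function is essentially flat at 0.86 over any realistic stimulus range.

```python
import math

def gumbel(x, alpha, beta, gamma=0.5, lam=0.03):
    return gamma + (1 - gamma - lam) * (1 - math.exp(-10 ** (beta * (x - alpha))))

alpha, beta = -4550.9613, 0.000035559  # the 'outrageous' estimates
values = [gumbel(x, alpha, beta) for x in (-2, -1, 0, 1, 2)]
print(values)  # every value is within a hair of 0.86
```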
Any researcher who obtains
the data shown in Figure 4 will readily understand that PAL_PFML_Fit will not give a fit with values for the threshold and
slope that the researcher might have expected. It is a little harder to see why problems arise when the data look fine (say
6, 8, 8, 10, 9 correct responses, respectively, for the 5 stimulus levels, each again out of 10 trials). Even though the fit
to these data is fine (threshold = -0.1040, slope = 1.836, code snippet 5), finding a standard error using a bootstrap routine
or determining the goodness of fit using Monte Carlo simulations leads to poor results (specifically: warnings from Palamedes
stating that not all simulations converged combined with very large values for standard errors, code snippet 5). The problem
is that some of the simulations performed during the bootstrap will result in simulated data such as those shown in Figure
4, resulting in failures of convergence and extreme parameter estimates. The standard error that a routine such as PAL_PFML_BootstrapParametric
produces is just the standard deviation of the simulated parameter estimates, and standard deviations are very sensitive to such extreme outliers.
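To make the mechanism concrete, here is a stripped-down parametric bootstrap in Python. It only illustrates the principle; PAL_PFML_BootstrapParametric handles this (including the failure bookkeeping) properly. The generating parameter values and the number of simulations here are arbitrary.

```python
import math
import random
from scipy.optimize import minimize

random.seed(1)
stim = [-0.4 + 0.1 * i for i in range(9)]
n = 10

def psi(x, alpha, beta):  # Gumbel, guess rate 0.5, lapse rate 0
    e = min(beta * (x - alpha), 300.0)  # guard against overflow
    return 0.5 + 0.5 * (1 - math.exp(-10 ** e))

def fit_threshold(num_pos):
    def neg_ll(params):
        alpha, beta = params
        ll = 0.0
        for x, k in zip(stim, num_pos):
            p = min(max(psi(x, alpha, beta), 1e-12), 1 - 1e-12)
            ll += k * math.log(p) + (n - k) * math.log(1 - p)
        return -ll
    return minimize(neg_ll, x0=[-0.1, 2.0], method='Nelder-Mead').x[0]

# Simulate the fitted observer repeatedly and refit each simulated data set.
true_alpha, true_beta = -0.1, 2.0
sims = []
for _ in range(50):  # a real bootstrap would use 400 or more simulations
    k = [sum(random.random() < psi(x, true_alpha, true_beta) for _ in range(n))
         for x in stim]
    sims.append(fit_threshold(k))

mean = sum(sims) / len(sims)
se = (sum((a - mean) ** 2 for a in sims) / len(sims)) ** 0.5
print(se)  # SD of simulated threshold estimates = the bootstrap SE
```

A single wildly divergent simulated fit, like the one in Figure 4, can dominate this standard deviation, which is exactly the failure mode described above.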
How to Deal With Failed Fits
The proper interpretation of such a result is that even though the fit to your data may have resulted
in a realistic value for the slope parameter, you cannot place much confidence at all on this value. The essence of the problem
with this fit is that not enough data were collected to determine reliable estimates of a threshold and a slope. Simply put,
even though the fit to your data produced a realistically valued slope estimate, it is very possible that this was just luck.
Consider this: an observer who behaves like the (nice) PF you fitted can apparently produce garbage data (i.e., data of the kind that cannot be fit; this is why the bootstrap failed). It follows that a garbage observer could also have produced the nice data that your observer produced. In other words, you cannot be confident that your observer is not a producer of
garbage. Once again: You do not have enough data to obtain reliable estimates of a threshold and a slope.
Note that some software packages may 'fail to fail' and will happily report nice-looking estimates of your parameters and their
standard errors even if your data do not support these estimates. This is frustrating for us, because it seems as if Palamedes
fails where other packages seem to succeed. When Palamedes fails to fit some data while another package appears to fit those
data just fine, we encourage you to compare the log-likelihood values of the fit between the two packages and also to inspect
how the fitted functions compare to your data (which is good practice anyway).
We regularly get questions from users regarding failed fits
and it is almost universally the case that the user simply did not have sufficient data to estimate the parameters they were
attempting to estimate. The solution here is to make sure that you have sufficient data to support estimation of the model
that you wish to fit. In other words: either get more data or use a fixed value for the parameters regarding which you do
not have sufficient information in your data. By way of demonstration, code snippet 6 refits the above data but this time
using a fixed value for the slope. It successfully estimates the threshold, generates a standard error for the threshold estimate
and determines the goodness-of-fit for the overall fit of this model. Even the data shown in Figure 4 allow estimation of
a threshold (with a standard error) and a goodness-of-fit of the resulting model. If you use a fixed slope in your fit, it
might be best to avoid a parametric bootstrap (use the nonparametric bootstrap instead). If you do perform a parametric bootstrap, it is important that you use a realistic value for the slope: the SE on the threshold will, in some circumstances, be determined entirely by the slope value that you use.
- Rule number one: Don't estimate
parameters unless your data contain sufficient information to allow their estimation. Rather, fix them (or get more data).
- Make sure to select parameter ranges for the search
grid that are wide enough to contain the PF associated with the (absolute) maximum likelihood. To this end, plot some PFs
with various thresholds and slopes to determine what range of values to include in the search grid.
- Know your Gumbel from your Weibull! (see here: www.palamedestoolbox.org/weibullandfriends.html). Also know, say, your Logistic from your Quick. Etc. For
example, a Logistic with beta = 1.5 has a very different steepness compared to a Quick with beta = 1.5.
- Space slope values in the searchGrid on a log scale
(but provide their values in linear units). Use something like: searchGrid.beta = 10.^[-1:.1:2];
- If you follow an adaptive method with an ML fit,
do not free parameters that were not targeted by the adaptive method (e.g., if you use an up/down method do not estimate the
slope in subsequent ML fit, if you use the original psi method (Kontsevich & Tyler, 1999) do not free the lapse rate in
a subsequent ML fit).
- In case a fit fails to converge, increasing the maximum number of iterations and function evaluations allowed might help [see
code snippet 7 on how to increase these values or refer to help section of a function such as PAL_PFML_Fit (type 'help PAL_PFML_Fit')].
Note though that your likelihood function may not contain a maximum in which case this (or any other) strategy will not work.
- You can increase the precision of parameter estimates
by changing the tolerance values using the 'searchOptions' argument of a function such as PAL_PFML_Fit. Reducing tolerance
values (i.e., increasing precision) might require you to increase the maximum number of iterations and/or function evaluations
allowed also. See code snippet 7.
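For reference, the log-spaced slope grid from the tip above translates to Python as follows (values spaced evenly in log10 but supplied to the fit in linear units):

```python
# Equivalent of searchGrid.beta = 10.^[-1:.1:2] in MATLAB.
betas = [10 ** (-1 + 0.1 * i) for i in range(31)]
print(betas[0], betas[-1])  # spans 0.1 up to 100
```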
Kontsevich, L.L. & Tyler, C.W. (1999). Bayesian adaptive
estimation of psychometric slope and threshold. Vision Research, 39(16), 2729-2737.
Nelder, J.A. & Mead, R. (1965). A simplex method for function
minimization. Computer Journal, 7, 308-313.