拟合心理物理曲线

本章要点

介绍常规的用mean square error拟合心理物理曲线
介绍心理物理曲线背后代表的概率原理
用MLE的方法拟合心理物理曲线

在介绍完常见的心理物理曲线后，我们接下来考虑如何在心理学实验中应用它。

我们先来思考一个简单情境：一个被试在每个试次回答正确的概率是𝜃, 求被试完成了10个试次，结果是1110010101 (1=正确, 0=错误)的概率?

由概率论知识可知

p(data|\theta) = \theta^k * (1-\theta)^{n-k}

其中，k为正确的试次数，n为总试次数

基于上述思考，我们进一步考虑：以上的问题假定每个试次被试做对的概率是恒定的𝜃, 如果每个试次中呈现的刺激强度大小不一，我们如何得到被试每个试次里面做对的概率？想要回答这个问题，就需要用到心理物理曲线。

接下来我们考虑一个RDK的感知决策实验，探究如何使用心理物理曲线去回答上述疑问。

首先我们预定义好一个心理物理函数，这里我们选用前文介绍的Weibull函数

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import minimize

# 先定义weibull函数
def weibullfun(x, alpha, beta, gamma, g):
    k = (-np.log((1-g)/(1-gamma)))**(1/beta)
    y = 1 - (1 - gamma)*np.exp(-(k*x/alpha) ** beta) # hack here to avoid the complex number of 
    return y

然后我们依据前人研究里设置的RDK的coherence水平，绘制Weibull函数中不同coherence水平下的正确率

from scipy.stats import bernoulli
coh = np.array([0.032, 0.064, 0.128, 0.256, 0.512]) # coherence level in literature

# 定义参数
g = 0.82
gamma = 0.5 # Baseline chance level
beta = 3. # slop steepness
alpha = 0.25 # coherence threshold corresponding to g

pCoh = weibullfun(coh, alpha, beta, gamma, g)

plt.plot(coh, pCoh, 'o-')
plt.xlabel('coherence')
plt.ylabel('Prob of correct')

接下来我们模拟出每种coherence水平各1000个试次的数据

cohTrial是一个长度nTrialPerCoh x 5的数组，代表每个trial的coherence level
data是一个长度nTrialPerCoh x 5的数组，代表每个trial被试做对(1)和做错(0)

nTrialPerCoh = 1000 # how many trials per coh level
data = np.empty((coh.size, nTrialPerCoh))
for iCoh in range(coh.size): # loop coherence level
    # use bernouli process to generate data
    data[iCoh, :] = bernoulli.rvs(pCoh[iCoh], size=nTrialPerCoh)

cohTrial = np.tile(coh[:, np.newaxis], (1, nTrialPerCoh)).flatten() #(5000, ) array
data = data.flatten() # (5000, ) array

print(cohTrial) 
print(cohTrial.size) 
print(data) 
print(data.size) 

# 输出结果：
# [0.032 0.032 0.032 ... 0.512 0.512 0.512]
# 5000
# [0. 1. 1. ... 1. 1. 1.]
# 5000

至此，我们模拟出了五种coherence水平的总共5000个试次的数据，每个数据点包含coherence和决策结果两类信息。我们需要利用这些数据来拟合心理物理函数。

这里介绍一种常见的拟合方法——最大似然估计（Maximum Likelihood Estimation，MLE）。假定 f(x|θ) 是 Weibull 函数，其中 θ=α 为曲线的参数，数据D = [d1, d2, ... di ...]。这里我们固定 γ= 0.5, g = 0.82, β = 3。那么负对数似然函数为：

NLL(\theta) = -LL(\theta) = -\log(L(\theta))= -\sum_{i} \log(f(d_i|\theta))

那么我们求θ，就是

\hat{\theta} = \arg \min_{\theta} NLL(\theta)

下面我们写出负对数似然函数这里我们只假定threshold α是需要估计的自由参数

eps = np.finfo(float).eps # 赋值最小精度给eps是为了防止出现log(0)的情况
def negloglikeli(alpha, coh, data):
    g = 0.82
    beta = 3
    gamma = 0.5

    prob = np.zeros(coh.size)
    for i in range(coh.size):
        prob[i] = weibullfun(coh[i], alpha, beta, gamma, g)
        prob[i] = prob[i]*data[i] + (1-prob[i])*(1-data[i])
   
    prob = prob * 0.999 + eps
   
    return -np.log(prob).sum()

下面来优化这个负对数似然函数

from scipy.optimize import minimize

res = minimize(fun=negloglikeli, x0=(0.3), args=(cohTrial, data), bounds=((eps, 1),)) 
# 因为coherence threshold只能是(0,1),所以我们设定边界

print('The estimated threshold is', res.x)

# 输出结果
# The estimated threshold is [0.25238537]

注意，前面的模拟过程有一个前提假设，即假定所有的试次，被试都是按照心理物理曲线严格执行。但在现实世界中，显然存在被试不严格按心理物理曲线进行决策的情况。因此，这里引入失误率(lapse rate)。lapse rate是指被试因为各种原因（比如走神了）不是按照心理曲线来做反应，而是以随机的策略来做决策。假定在所有试次中，有𝜏比例的试次被试会完全走神，随机选择，那么概率是:

p(d_i | \theta) = (1 - \tau) * f(d_i)|\theta) + \tau * \gamma

最后更新于10个月前