【CVX】Bounding consumer preference

2023-10-06 22:16:16

Navigator

Bounding consumer preference
Counting problems with Poisson distribution
- CVX code
Logistic regression
Reference

Bounding consumer preference

We model consumer preference in the following way. We assume there is an underlying utility function $u:\mathbb{R}^n\to\mathbb{R}$ with domain $[0,1]^n$, where $u(x)$ measures the utility the consumer derives from the goods basket $x$. It is reasonable to assume that $u$ is nondecreasing (more of any good never lowers utility) and concave. Concavity models satiation, or decreasing marginal utility as we increase the amount of goods.

Now suppose we are given some consumer preference data, but we do not know the underlying utility function $u$. Specifically, we have a set of goods baskets $a_1, a_2, \dots, a_m\in[0,1]^n$ and some information about preferences among them:
$$
\begin{cases} u(a_i)>u(a_j), & (i,j)\in\mathcal{P}\\ u(a_i)\geq u(a_j), & (i,j)\in\mathcal{P}_{\text{weak}} \end{cases}
$$
Checking whether these data are consistent with some concave, nondecreasing utility function is a feasibility problem with the function $u$ as the infinite-dimensional optimization variable. Since the constraints are all homogeneous in $u$ (scaling $u$ by a positive constant preserves them), each strict inequality can be replaced by a margin of $1$, and we can express the problem in the form
$$
\text{find}\quad u\\
\text{s.t.}\ \begin{cases} u:\mathbb{R}^n\to\mathbb{R}\text{ concave and nondecreasing}\\ u(a_i)\geq u(a_j)+1, & (i,j)\in\mathcal{P}\\ u(a_i)\geq u(a_j), & (i,j)\in\mathcal{P}_{\text{weak}} \end{cases}
$$
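The feasibility problem above is infinite-dimensional, but it can be reduced to a finite one. A sketch of the standard reduction, where $u_i$ stands for the value $u(a_i)$ and $g_i\in\mathbb{R}^n$ for a supergradient of $u$ at $a_i$ (both symbols are introduced here for illustration and are not part of the text above):

$$
\begin{aligned}
\text{find}\quad & u_1,\dots,u_m,\ g_1,\dots,g_m\\
\text{s.t.}\quad & g_i\succeq 0, && i=1,\dots,m\\
& u_j\leq u_i+g_i^T(a_j-a_i), && i,j=1,\dots,m\\
& u_i\geq u_j+1, && (i,j)\in\mathcal{P}\\
& u_i\geq u_j, && (i,j)\in\mathcal{P}_{\text{weak}}
\end{aligned}
$$

This is a linear feasibility problem in the variables $u_i$ and $g_i$. If it is feasible, one consistent utility function is the piecewise-linear $u(x)=\min_{i}\,(u_i+g_i^T(x-a_i))$, which is concave and, because $g_i\succeq 0$, nondecreasing. If it is infeasible, no concave nondecreasing utility function matches the preference data.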

Counting problems with Poisson distribution

In a wide variety of problems the random variable $y$ is nonnegative integer valued, with a Poisson distribution with mean $\mu>0$:
$$
\mathbb{P}(y=k)=\frac{e^{-\mu}\mu^k}{k!}
$$
We model the mean as an affine function of an explanatory variable $u\in\mathbb{R}^n$, that is, $\mu=a^Tu+b$. We are given observations consisting of pairs $(u_i, y_i)$, $i=1,\dots,m$, where $y_i$ is the observed value of $y$ when the explanatory variable equals $u_i\in\mathbb{R}^n$. The goal is to find a maximum likelihood estimate (MLE) of the model parameters $a\in\mathbb{R}^n$ and $b\in\mathbb{R}$ from these data. The likelihood function is
$$
\prod_{i=1}^m\frac{(a^Tu_i+b)^{y_i}\exp(-(a^Tu_i+b))}{y_i!}
$$
so the log-likelihood function is
$$
l(a,b)=\sum_{i=1}^m\left(y_i\log(a^Tu_i+b)-(a^Tu_i+b)-\log(y_i!)\right)
$$
Since the term $\log(y_i!)$ does not depend on $a$ or $b$, an MLE of $a$ and $b$ can be obtained by solving the convex optimization problem
$$
\max_{a,b}\ \sum_{i=1}^m\left(y_i\log(a^Tu_i+b)-(a^Tu_i+b)\right)
$$
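To make the log-likelihood concrete, here is a minimal Python sketch that evaluates $l(a,b)$ directly with the standard library. The function name `poisson_log_likelihood` and the toy data are illustrative assumptions, not part of the original post. With $a=0$ the model has a constant mean, and the maximizer over $b$ should be the sample mean of the $y_i$.

```python
import math

def poisson_log_likelihood(a, b, us, ys):
    """l(a, b) = sum_i (y_i*log(mu_i) - mu_i - log(y_i!)) with mu_i = a^T u_i + b."""
    total = 0.0
    for u, y in zip(us, ys):
        mu = sum(aj * uj for aj, uj in zip(a, u)) + b
        if mu <= 0:
            return float("-inf")  # outside the domain of the logarithm
        # math.lgamma(y + 1) equals log(y!)
        total += y * math.log(mu) - mu - math.lgamma(y + 1)
    return total

# Toy data (assumed for illustration): constant explanatory variable, a = 0.
us = [[0.0], [0.0], [0.0], [0.0]]
ys = [1, 3, 2, 2]  # sample mean is 2
for b in (1.9, 2.0, 2.1):
    print(b, poisson_log_likelihood([0.0], b, us, ys))
```

The value at `b = 2.0` is the largest of the three, as expected for a concave objective whose stationary point is the sample mean.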

CVX code

%%
clc;
clear all;
rng(729);
n = 10;
m = 100;
atrue = rand(n, 1); % true model parameter a
btrue = rand; % true model parameter b

u = rand(n, m);
mu = atrue'*u+btrue;

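%% generate random variables y from a Poisson distribution
% Multiply uniform draws until the running product falls below exp(-mu);
% the number of products still >= exp(-mu) is Poisson with mean mu.
L = exp(-mu);
ns = ceil(max(10*mu)); % uniform draws per sample (caps y at ns)
y = sum(cumprod(rand(ns, m))>=L(ones(ns, 1), :));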

% MLE: maximize the log-likelihood (the constant log(y_i!) term is dropped)
cvx_begin
    variables a(n) bb(1)
    maximize sum(y.*log(a'*u+bb)-(a'*u+bb))
cvx_end
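The one-line Poisson generator in the code above is compact but hard to read. The following Python sketch shows what appears to be the same idea for a single sample: multiply uniform random numbers together and count how many running products stay at or above $e^{-\mu}$. The name `sample_poisson` is an assumption made for this illustration, and unlike the MATLAB version it loops until the product drops below the threshold instead of using a fixed number of draws.

```python
import math
import random

def sample_poisson(mu, rng):
    """Draw one Poisson(mu) sample by multiplying uniform random numbers."""
    limit = math.exp(-mu)
    count = 0
    product = rng.random()
    # Count the running products that are still >= exp(-mu).
    while product >= limit:
        count += 1
        product *= rng.random()
    return count

rng = random.Random(729)
draws = [sample_poisson(3.0, rng) for _ in range(10000)]
print(sum(draws) / len(draws))  # should be close to the mean 3.0
```

This method needs about $\mu+1$ uniform draws per sample on average, so it suits small means such as the ones generated in this example.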

Logistic regression

Consider a random variable $y\in\{0,1\}$ with
$$
\begin{cases} \mathbb{P}(y=1)=p\\ \mathbb{P}(y=0)=1-p \end{cases}
$$
where $p\in[0,1]$ depends on a vector of explanatory variables $u\in\mathbb{R}^n$. The logistic model has the form
$$
p=\frac{\exp(a^Tu+b)}{1+\exp(a^Tu+b)}
$$
with model parameters $a\in\mathbb{R}^n$ and $b\in\mathbb{R}$.
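The CVX objective in this section is the log-likelihood of the logistic model. A short derivation, assuming observations $(u_i,y_i)$, $i=1,\dots,m$, and writing $z_i=a^Tu_i+b$ (the symbol $z_i$ is introduced here for brevity):

$$
\begin{aligned}
\log p_i &= z_i-\log(1+e^{z_i}), \qquad \log(1-p_i) = -\log(1+e^{z_i}),\\
l(a,b) &= \sum_{i=1}^m\left(y_i\log p_i+(1-y_i)\log(1-p_i)\right)
= \sum_{i=1}^m\left(y_i z_i-\log(1+e^{z_i})\right)
\end{aligned}
$$

The first term is linear in $(a,b)$ and $\log(1+e^{z_i})$ is the log-sum-exp of $(0,z_i)$, which is convex, so $l$ is concave and maximizing it is a convex problem. This matches the `log_sum_exp([zeros(1, m); x'*U'])` term in the code.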

%%
rng(729);
% data
a = 1;
b = -5;
m = 100;

u = 10*rand(m, 1);
y = rand(m, 1)<exp(a*u+b)./(1+exp(a*u+b)); % binary variables, 0 or 1
plot(u, y, 'o');
axis([-1, 11, -0.1, 1.1]);

%% cvx
U = [ones(m, 1) u];
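% MOSEK needs a separate license; remove the next line to use CVX's default solver
cvx_solver mosek
% cvx_expert true suppresses the warning CVX prints when it falls back on the
% successive approximation method for exponentials, logarithms and entropy
cvx_expert true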
cvx_begin
    variables x(2)
    maximize (y'*U*x-sum(log_sum_exp([zeros(1, m); x'*U'])))
cvx_end

ind1 = find(y==1);
ind2 = find(y==0);

av = x(2);
bv = x(1);
us = linspace(-1, 11, 1000)';
ps = exp(av*us+bv)./(1+exp(av*us+bv));

hold on;
plot(us, ps, '-');
plot(u(ind1), y(ind1), 'o');
plot(u(ind2), y(ind2), 'o');
hold off;
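The objective passed to CVX above is the logistic log-likelihood $\sum_i (y_i z_i-\log(1+e^{z_i}))$ with $z_i=au_i+b$. As a solver-free cross-check, the sketch below evaluates that quantity in plain Python for scalar $u_i$. The helper names `log1p_exp` and `logistic_log_likelihood` and the toy data are assumptions made for this illustration. The helper rewrites $\log(1+e^{z})$ so that large $|z|$ does not overflow.

```python
import math

def log1p_exp(z):
    """Numerically stable log(1 + exp(z)), i.e. the log-sum-exp of (0, z)."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def logistic_log_likelihood(a, b, us, ys):
    """sum_i (y_i*z_i - log(1 + exp(z_i))) with z_i = a*u_i + b, scalar u_i."""
    total = 0.0
    for u, y in zip(us, ys):
        z = a * u + b
        total += y * z - log1p_exp(z)
    return total

# Toy data (assumed for illustration): larger u makes y = 1 more likely.
us = [0.5, 2.0, 4.5, 6.0, 8.0, 9.5]
ys = [0, 0, 0, 1, 1, 1]
print(logistic_log_likelihood(1.0, -5.0, us, ys))
print(logistic_log_likelihood(0.0, 0.0, us, ys))
```

Evaluating this function at the `av` and `bv` returned by CVX, and at nearby values, is one simple way to sanity-check the fit.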

Reference

S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004, p. 340.