These are notes on Continuous Stochastic Structure Models with Application, by Prof. Vijay S. Mookerjee. This note covers one application of the stochastic structure model: control of negative reviews.
Consider a firm $\mathcal{F}$ whose website lets customers post reviews and ratings of the firm. What should $\mathcal{F}$ do to mitigate the damage from negative reviews and ratings? How much effort should it invest? What is the optimal strategy for responding to them?
Q1. Why does responding to negative reviews help?
A1. Customers browse reviews and ratings to inform their decisions, so responding to negative reviews and ratings should have a positive effect.
Q2. Does replying to negative reviews always have a positive effect?
A2. No. Some reviews come from irrational customers, so replying to every negative review is not optimal. However, there should be an optimal strategy that yields a positive effect.
Abstract Model
- State $X(t)$: average of the $N$ most recent ratings
- 3 forces driving the state $X(t)$:
  - positive reviews
  - negative reviews
  - response to negative reviews
Arrival of reviews
The number of reviews arriving in a fixed time interval is assumed to follow a Poisson Distribution $\mathcal{P}(\lambda)$, where $\lambda$ is the arrival rate.
For instance, an individual keeping track of the amount of mail they receive each day may notice that they receive an average number of 4 letters per day. If receiving any particular piece of mail does not affect the arrival times of future pieces of mail, i.e., if pieces of mail from a wide range of sources arrive independently of one another, then a reasonable assumption is that the number of pieces of mail received in a day obeys a Poisson distribution.1
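The mail example can be imitated with a small simulation. This is a minimal sketch, not part of the lecture: it assumes independent arrivals with exponentially distributed gaps (the standard construction of a Poisson process), and the function name `daily_counts` and the rate of 4 per day are illustrative choices. For Poisson counts the sample mean and sample variance should both be close to the rate.

```python
import random


def daily_counts(rate_per_day, days, seed=0):
    """Count arrivals per day when gaps between arrivals are exponential."""
    rng = random.Random(seed)
    counts = [0] * days
    t = rng.expovariate(rate_per_day)  # time of the first arrival, in days
    while t < days:
        counts[int(t)] += 1            # int(t) is the day the arrival falls in
        t += rng.expovariate(rate_per_day)
    return counts


# Hypothetical setting: 4 arrivals per day on average, observed for 20,000 days.
counts = daily_counts(rate_per_day=4.0, days=20_000)
mean_count = sum(counts) / len(counts)
var_count = sum((c - mean_count) ** 2 for c in counts) / len(counts)
print(f"mean={mean_count:.3f}  variance={var_count:.3f}")
```

A variance-to-mean ratio near 1 is one simple, informal check that count data are compatible with a Poisson assumption.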
Comparison between Poisson Distribution and Binomial Distribution
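- Poisson Distribution $\mathcal{P}(\lambda)$: $\mathbb{P}(X=k) = \frac{\lambda^k e^{-\lambda}}{k!}$ for $k = 0, 1, 2, \dots$; mean $\lambda$, variance $\lambda$.
- Binomial Distribution $\mathcal{B}(n,p)$: $\mathbb{P}(X=k) = \binom{n}{k} p^k (1-p)^{n-k}$ for $k = 0, 1, \dots, n$; mean $np$, variance $np(1-p)$.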
Recommended Reading: Neverland|20171104 Notes - Data Types
Table of Common Distributions - TAMU Stat
Deriving Poisson Distribution from Binomial Distribution
See the Medium post by Andrew Chamberlain, Ph.D., Medium|Deriving the Poisson Distribution from the Binomial Distribution, which is quite clear. One helpful reminder is $\lim_{n \to \infty} \left(1+\frac{1}{n}\right)^n = e$, where $e$ is Euler's number. It follows from this definition that $\lim_{n \to \infty} \left(1-\frac{c}{n}\right)^n = e^{-c}$.
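The limit can also be checked numerically. The sketch below is an illustration rather than part of the referenced derivation: it fixes $\lambda$, sets $p = \lambda / n$, and compares the Binomial PMF with the Poisson PMF at a single point as $n$ grows. The values $\lambda = 4$ and $k = 3$ are arbitrary choices.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


lam, k = 4.0, 3  # arbitrary illustrative values
errors = []
for n in (10, 100, 1000, 10000):
    # Keep n * p = lam fixed while n grows.
    err = abs(binomial_pmf(k, n, lam / n) - poisson_pmf(k, lam))
    errors.append(err)
    print(f"n={n:>6}  |binomial - poisson| = {err:.6f}")
```

The printed gap shrinks as $n$ increases, which is what the limit argument predicts.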
Does a subset of a Poisson-distributed count still follow a Poisson Distribution?3
Let $X$ follow the Poisson Distribution $\mathcal{P}(\lambda)$. Given $X = r$ events, select each event independently with probability $p$, and let $Y = k$ be the number of selected events.
Question: Does $Y$ still follow a Poisson Distribution? If so, with what parameter?
Proof:
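- Marginalization: $\mathbb{P} (Y=k) = \sum_{r=k}^{+\infty} \mathbb{P} (X=r,Y=k) = \sum_{r=k}^{+\infty} \mathbb{P} (X=r) \cdot \mathbb{P} (Y=k|X=r)$;
- $\mathbb{P} (X=r)$ is the PMF of the Poisson Distribution $\mathcal{P}(\lambda)$ and $\mathbb{P} (Y=k|X=r)$ is the PMF of the Binomial Distribution $\mathcal{B}(r,p)$, so $\mathbb{P} (Y=k) = \sum_{r=k}^{+\infty} \frac{e^{-\lambda} \lambda^r}{r!} \binom{r}{k} p^k (1-p)^{r-k}$;
- Factoring out the terms that do not depend on $r$: $\mathbb{P} (Y=k) = \frac{e^{-\lambda} (\lambda p)^k}{k!} \sum_{r=k}^{+\infty} \frac{(\lambda (1-p))^{r-k}}{(r-k)!} = \frac{e^{-\lambda} (\lambda p)^k}{k!} e^{\lambda (1-p)} = \frac{e^{-\lambda p} (\lambda p)^k}{k!}$, which is the PMF of the Poisson Distribution $\mathcal{P}(\lambda p)$.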
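The result $Y \sim \mathcal{P}(\lambda p)$ can be sanity-checked by simulation. The following is a minimal sketch under stated assumptions: `sample_poisson` uses Knuth's multiplication method (chosen here only to stay within the standard library), and the values $\lambda = 6$ and $p = 0.3$ are illustrative, not taken from the lecture. If the claim holds, both the sample mean and the sample variance of $Y$ should be close to $\lambda p$.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def thin(count, p, rng):
    """Keep each of `count` events independently with probability p."""
    return sum(1 for _ in range(count) if rng.random() < p)


rng = random.Random(0)
lam, p, trials = 6.0, 0.3, 50_000  # illustrative values
ys = [thin(sample_poisson(lam, rng), p, rng) for _ in range(trials)]
mean_y = sum(ys) / trials
var_y = sum((y - mean_y) ** 2 for y in ys) / trials
print(f"lam * p = {lam * p:.3f}  mean(Y) = {mean_y:.3f}  var(Y) = {var_y:.3f}")
```

In the review setting, this corresponds to reading $X$ as all arriving reviews and $Y$ as the subset selected with probability $p$.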
Modeling the state
- Force 1: Positive, $\rho \lambda (1-p)(b-X)$;
- Force 2: Negative, $\beta \lambda \rho X$;
- Force 3: Firm can control, $\alpha (b-X)$
$X_t$ is a CIR (Cox-Ingersoll-Ross) process.
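To get a feel for how a CIR process behaves, here is a rough simulation sketch. It uses the textbook CIR form $dX_t = \kappa(\theta - X_t)\,dt + \sigma\sqrt{X_t}\,dW_t$ with an Euler-Maruyama step and full truncation. The names `kappa`, `theta`, `sigma` and all numeric values are assumptions for illustration; these notes do not state how the three forces map onto them.

```python
import math
import random


def simulate_cir(x0, kappa, theta, sigma, dt, steps, seed=0):
    """Euler-Maruyama path of dX = kappa*(theta - X)dt + sigma*sqrt(X)dW."""
    rng = random.Random(seed)
    path = [x0]
    x = x0
    for _ in range(steps):
        x_pos = max(x, 0.0)                  # full truncation keeps sqrt real
        dw = rng.gauss(0.0, math.sqrt(dt))   # Brownian increment over dt
        x = x + kappa * (theta - x_pos) * dt + sigma * math.sqrt(x_pos) * dw
        path.append(x)
    return path


# Hypothetical parameters: a rating-like state reverting toward 4.0.
path = simulate_cir(x0=3.0, kappa=2.0, theta=4.0, sigma=0.3, dt=0.01, steps=5000)
tail_mean = sum(path[-2000:]) / 2000
print(f"mean of the last 2000 steps: {tail_mean:.3f}")
```

The path starts below `theta` and drifts toward it, then fluctuates around that level, with noise that scales with $\sqrt{X_t}$.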
Recommended Reading: Neverland|20171117 Notes: Stochastic Process, Parameter Estimation, PDE
Data
$X_t$ state: average of the $n$ most recent ratings, e.g. $n = 20$.
Ctrip data of reviews and ratings from April 2012 to June 2014.
DID Analysis: $\Delta r_j = \frac{1}{n} \sum_{i=j+1}^{j+n} r_i - \frac{1}{n} \sum_{i=j-n}^{j-1} r_i$, where $n$ is the window size.
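The window difference $\Delta r_j$ is straightforward to compute. The sketch below assumes ratings are stored in a 0-indexed Python list in arrival order and that $j$ is the index of the review that received a response. The function name `rating_change` and the sample ratings are hypothetical.

```python
def rating_change(ratings, j, n):
    """Mean of the n ratings after index j minus mean of the n ratings before it."""
    if j - n < 0 or j + n >= len(ratings):
        raise ValueError("window does not fit inside the rating sequence")
    after = ratings[j + 1 : j + n + 1]   # r_{j+1}, ..., r_{j+n}
    before = ratings[j - n : j]          # r_{j-n}, ..., r_{j-1}
    return sum(after) / n - sum(before) / n


# Made-up ratings; the review at index 3 is the one that was responded to.
ratings = [2, 2, 3, 1, 4, 5, 4]
delta = rating_change(ratings, j=3, n=3)
print(f"delta r_j = {delta:.3f}")
```

A positive value means the average rating in the window after the response is higher than in the window before it.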
Improvement in ratings attributable to responses: $\Delta r_j$ increases after a response.
Average similarity scores (computed with TF-IDF) decrease after a response.
The length of negative reviews increases after a response, which means that the cost of writing a negative review increases.
Propensity Score Matching (PSM):
Validation of Poisson Distribution
Parameter Estimation
$k_1, k_3, \sigma$ $\leftrightarrow$ $\alpha, \beta, \rho, p, \lambda, \sigma$
We don't care about $\sigma$.
$\lambda$: Review arrival rate;
$p$: Prob{Negative Review} $\rightarrow$ separate $\beta$ out
3 equations and 5 parameters to be estimated:
Go outside the problem and find additional data to separate the “endogenous” parameters:
- When no response, $\alpha=0$
- After response, $\alpha>0$
In the equation for $k_1$, we can separate $\alpha$ and $p$.
Prediction compared with ARIMA/GARCH
$ARIMA(p,d,q)$, $GARCH(p,q)$ $\stackrel{V.S.}{\Leftrightarrow}$ SDE
Probability Response Strategy:
- Mean control
- Mean-Variance Control
- Service-level Control
Recommended Reading: Baoduge|TIME SERIES ANALYSIS WITH ARIMA/GARCH MODEL
1. Statistics|The Poisson Distribution
3. Cross Validated|Is a subset of a Poisson process also following a Poisson process?