Taking everything that we have seen so far, we will initially specify our model with polynomial terms for education and experience and a single interaction term between gender and race: \[
\begin{aligned}
\log(wage_i) &= \beta_0 + \beta_1 educ_i + \beta_2 educ_i^2 + \beta_3 exper_i + \beta_4 exper_i^2 \\
&+ \beta_5 metro_i + \beta_6 female_i + \beta_{7} black_i + \beta_{8} female_i \times black_i + \epsilon_i,\quad i = 1,\dots,N
\end{aligned}
\]
We expect the following signs for the non-intercept coefficients:
\(\beta_2 > 0\) - this sign is hard to call. Each additional year of education may add less to wage than the previous one; on the other hand, if the additional year is at the PhD level, it may add more. For now, we will assume the latter, i.e. \(\beta_2 > 0\).
\(\beta_4 < 0\) - experience is generally correlated with age, and the knowledge gained from each additional year of experience tends to level off as experience accumulates. In other words, if you already have a lot of experience, an additional year has a smaller (but not necessarily negative) effect on wage. Note: \(exper^2\) enters alongside \(exper\), so we do not yet know whether there is a number of years of experience beyond which wage decreases. We assume that an additional year still increases wage, but at a lower rate (\(\beta_4 < 0\)) than for someone with fewer years of experience.
\(\beta_{8}\) - while we have already determined that race does not have a significant effect on \(wage\), this does not rule out that a specific combination of race and gender faces discrimination, rather than a race as a whole. To account for this, we include the gender and race interaction term. We also include the separate terms for gender and race, so that the coefficients have a clear interpretation if the interaction term turns out to be statistically significant.
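To make the role of the interaction term concrete, the shift in \(\log(wage_i)\) implied by the specification above can be written out for each gender and race group, relative to the base group (non-female, non-black) and holding the other regressors fixed. This follows directly from the model and uses no additional assumptions: \[
\begin{aligned}
female_i = 0,\ black_i = 0 &: \quad 0 \\
female_i = 1,\ black_i = 0 &: \quad \beta_6 \\
female_i = 0,\ black_i = 1 &: \quad \beta_7 \\
female_i = 1,\ black_i = 1 &: \quad \beta_6 + \beta_7 + \beta_8
\end{aligned}
\] So \(\beta_8\) measures how much the shift for black females differs from the sum of the separate gender and race shifts.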
In the estimated model, the square of education is significant, so we do not remove the lower-order \(educ\) term, even though it is insignificant on its own. The interaction term between race and gender, on the other hand, is insignificant, so we remove it.
When the model includes interaction and polynomial terms, a high \(\rm VIF\) is deceptive: the \(\rm VIF\) is high only because a variable appears together with its own square (or product), and the relationship between the two is NOT linear!
This is why it is better to check for multicollinearity BEFORE including interaction and polynomial terms!
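A minimal sketch of why a squared term inflates the VIF. It relies on the fact that, with exactly two regressors, \(\rm VIF = 1 / (1 - r^2)\), where \(r\) is their sample correlation. The experience values below are hypothetical (an evenly spaced grid from 0 to 40 years), not the dataset used in the text, and the helper names are made up for illustration:

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def vif_two_regressors(xs, ys):
    """With exactly two regressors, both share VIF = 1 / (1 - r^2)."""
    r = pearson_r(xs, ys)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical experience values (assumption: evenly spaced, 0 to 40 years).
exper = list(range(0, 41))
exper_sq = [x ** 2 for x in exper]

# Same variable, but centered at its mean before squaring.
mean_exper = sum(exper) / len(exper)
exper_c = [x - mean_exper for x in exper]
exper_c_sq = [x ** 2 for x in exper_c]

vif_raw = vif_two_regressors(exper, exper_sq)
vif_centered = vif_two_regressors(exper_c, exper_c_sq)
print(f"VIF, exper and exper^2:            {vif_raw:.2f}")
print(f"VIF, centered exper and its square: {vif_centered:.2f}")
```

On this grid the raw pair has a VIF well above the common rule-of-thumb threshold of 10, while the centered pair has a VIF of 1, even though both versions describe the same quadratic curve. The high value therefore reflects how the term was constructed, not redundant information between distinct variables.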
So, we can write down our final model as follows (coefficient standard errors are provided in parentheses): \[
\begin{aligned}
\underset{(s.e.)}{\widehat{\log(wage_i)}} &= \underset{(0.1840)}{1.5556} + \underset{(0.0253)}{0.0367} \cdot educ_i + \underset{(0.0009)}{0.0025} \cdot educ_i^2 + \underset{(0.0040)}{0.0276} \cdot exper_i - \underset{(0.0001)}{0.0004} \cdot exper_i^2 \\
&+ \underset{(0.0382)}{0.1334} \cdot metro_i - \underset{(0.0299)}{0.1592} \cdot female_i
\end{aligned}
\]
We can interpret the coefficients as follows:
The \(metro\) coefficient shows that if a person is from a metropolitan area, their wage will be, on average, approximately \(0.1334 \times 100 = 13.34\%\) higher than that of people who are not from a metropolitan area, ceteris paribus;
The \(female\) coefficient is negative, which indicates that there might be a pay gap between genders - the coefficient shows that if someone is female, their wage is around \(0.1592 \times 100 = 15.92\%\) lower, compared to someone who is not female, ceteris paribus.
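The \(100 \cdot \beta\) readings above are approximations that work well for small coefficients. In a log-linear model the exact percentage change when an indicator variable switches from 0 to 1 is \(100 \cdot (e^{\beta} - 1)\). The sketch below compares the two for the \(metro\) and \(female\) coefficients; the function name is hypothetical and the coefficients are the rounded values from the final model:

```python
import math

def exact_pct_effect(coef):
    """Exact percentage change in wage when a dummy switches from 0 to 1
    in a log-linear model: 100 * (exp(coef) - 1)."""
    return 100.0 * (math.exp(coef) - 1.0)

# Rounded coefficients copied from the final model in the text.
coefs = {"metro": 0.1334, "female": -0.1592}

for name, b in coefs.items():
    approx = 100.0 * b           # the quick reading used in the text
    exact = exact_pct_effect(b)  # the exact log-linear conversion
    print(f"{name}: approx {approx:.2f}%, exact {exact:.2f}%")
```

The two readings differ by roughly one percentage point here, and the gap widens as the coefficient moves further from zero.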
Since \(educ\) and \(exper\) have polynomial terms, their interpretation is a bit different. For example, the partial derivative with respect to \(educ\) is: \[
\dfrac{\partial \widehat{\log(wage_i)}}{\partial educ_i} = 0.0367 + 2 \times 0.0025 \times educ_i = 0.0367 + 0.005 \times educ_i
\] This means that the effect of education now depends on the initial value of education.
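As a quick numerical illustration, the derivative above can be evaluated at a few education levels. The same function, applied to the experience terms of the final model, gives the experience level at which the marginal effect of \(exper\) reaches zero. This is a sketch using the rounded coefficients from the text; since \(0.0004\) is rounded to a single significant digit, the turning point is only a rough figure. The names `marginal_effect` and `exper_turning_point` are made up for this example:

```python
# Rounded coefficients copied from the final model in the text.
B_EDUC, B_EDUC_SQ = 0.0367, 0.0025
B_EXPER, B_EXPER_SQ = 0.0276, -0.0004

def marginal_effect(b_linear, b_square, x):
    """Derivative of b_linear * x + b_square * x**2 with respect to x."""
    return b_linear + 2.0 * b_square * x

for educ in (10, 16, 20):
    me = marginal_effect(B_EDUC, B_EDUC_SQ, educ)
    print(f"educ = {educ}: d log(wage) / d educ = {me:.4f}")

# Experience level at which the marginal effect of exper equals zero.
exper_turning_point = -B_EXPER / (2.0 * B_EXPER_SQ)
print(f"exper turning point: {exper_turning_point:.1f} years")
```

With these rounded values the marginal effect of experience stays positive up to about 34.5 years and turns negative afterwards.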
Another way to look at it is to take two cases:
In the first case we have \(educ_i\), and predict \(\widehat{\log(wage_i)}^{(0)}\).
In the second case, we increase it by one (\(educ_i + 1\)) and calculate \(\widehat{\log(wage_i)}^{(1)}\). We assume that every other explanatory variable is the same (i.e. ceteris paribus).
Then the difference between the two predictions is: \[
\begin{aligned}
\widehat{\log(wage_i)}^{(1)} - \widehat{\log(wage_i)}^{(0)} &= 0.0367 \times (educ_i+1) + 0.0025 \times (educ_i+1)^2 \\
&- (0.0367 \times educ_i + 0.0025 \times educ_i^2) \\
&= 0.0367 + 0.0025 \times (2 \cdot educ_i + 1) \\
&= 0.0392 + 0.005 \times educ_i
\end{aligned}
\] In other words, if education increases by one year, then wage increases by approximately \(100 \cdot \left[ 0.0367 + 0.0025 \cdot (2 \cdot educ + 1) \right] \%\). The effect is more pronounced with higher values of education.
For example, if \(educ_1 = 10\) and \(educ_2 = 20\), then wage will increase by a larger percentage for the individual who has \(20\) years of education and obtains an additional year, compared to someone who goes from \(10\) to \(11\) years of education.
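The comparison between \(10\) and \(20\) years of education can be checked numerically. The sketch below computes the one-year difference in the fitted \(\log(wage)\) directly from the education terms and compares it with the simplified expression \(0.0392 + 0.005 \times educ\). It uses the rounded coefficients from the text, and the function names are hypothetical:

```python
# Rounded education coefficients copied from the final model in the text.
B_EDUC, B_EDUC_SQ = 0.0367, 0.0025

def educ_part(educ):
    """Part of the fitted log(wage) that depends on education."""
    return B_EDUC * educ + B_EDUC_SQ * educ ** 2

def one_year_difference(educ):
    """Change in fitted log(wage) when educ rises to educ + 1,
    with all other regressors held fixed."""
    return educ_part(educ + 1) - educ_part(educ)

def closed_form(educ):
    """The simplified expression derived in the text."""
    return 0.0392 + 0.005 * educ

for educ in (10, 20):
    diff = one_year_difference(educ)
    print(f"educ {educ} -> {educ + 1}: {diff:.4f} (about {100 * diff:.2f}%)")
```

The direct difference and the simplified expression agree, and the increase at \(20\) years is larger than at \(10\) years, in line with the positive coefficient on \(educ^2\).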
Interpreting interaction effects
Assume that our model has interaction effects - how would we interpret the interaction coefficient?
For example, let’s say that our estimated regression model is:
\[
\begin{aligned}
\widehat{\log(wage)} &=
{1.6205} +
{0.0406} \cdot educ +
{0.0019} \cdot educ^2 \\
&+
{0.0286} \cdot exper -
{0.0004} \cdot exper^2 \\
&+ {0.1423} \cdot metro -
{0.0809} \cdot south \\
&- {0.6067} \cdot female +
{0.0312} \cdot \left(female \times educ\right)
\end{aligned}
\] To highlight the effect of gender, we can rearrange the above model as: \[
\begin{aligned}
\widehat{\log(wage)} &=
{1.6205} +
{0.0406} \cdot educ +
{0.0019} \cdot educ^2 \\
&+
{0.0286} \cdot exper -
{0.0004} \cdot exper^2 \\
&+ {0.1423} \cdot metro -
{0.0809} \cdot south \\
&+ \left({0.0312} \cdot educ -
{0.6067}\right) \cdot female
\end{aligned}
\] A possible interpretation could be as follows: if the person is female, then their \(\log(wage)\) differs by \((0.0312 \cdot educ - 0.6067)\), compared to the base non-female group, ceteris paribus.
By specifying this type of model we can see how much education offsets discrimination based on gender: setting \({0.0312} \cdot educ - {0.6067} = 0\) gives \(educ = 0.6067 / 0.0312 \approx 19.45\), i.e. around \(19.45\) years of education would offset the estimated gender-based gap.
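The break-even calculation can be sketched in a few lines. The code below evaluates the gender term \(0.0312 \cdot educ - 0.6067\) at several education levels and solves for the level at which it equals zero. The coefficients are the rounded values from the example model, and the names `female_gap` and `break_even_educ` are made up for illustration:

```python
# Rounded coefficients copied from the interaction model in the text.
B_FEMALE = -0.6067
B_FEMALE_EDUC = 0.0312

def female_gap(educ):
    """Difference in fitted log(wage) between the female and non-female
    groups at a given education level, ceteris paribus."""
    return B_FEMALE + B_FEMALE_EDUC * educ

# Education level at which the estimated gap is zero.
break_even_educ = -B_FEMALE / B_FEMALE_EDUC

for educ in (12, 16, 20):
    print(f"educ = {educ}: gap in log(wage) = {female_gap(educ):+.4f}")
print(f"break-even education: {break_even_educ:.2f} years")
```

Below the break-even level the estimated gap is negative, and it shrinks by \(0.0312\) for each additional year of education.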