California Election Analysis

How Race Shaped California Voter Patterns in the 2016 Presidential Election

Introduction

In most states during the 2016 Presidential Election, Democratic candidate Hillary Clinton received a smaller vote share against Republican nominee Donald Trump than President Barack Obama achieved in 2012 against Mitt Romney. California was one of the few states where Clinton outperformed Obama, winning the Golden State by a margin roughly 7 percentage points larger than Obama’s margin over Romney.

I plan to investigate how this change in vote share is correlated with race and ethnicity at the county level. My project incorporates three separate data sets: 2012 Presidential Election Results By County In California, 2016 Presidential Election Results By County In California, and Demographic Data by California County. For simplicity, I’ve merged the relevant data from each of these sources into a single spreadsheet that I will reference throughout this analysis.

In my analysis, I’ll first calculate the change in the Democratic-GOP margin in each county and create a chart illustrating the overall shifts. Next, I’ll fit a linear regression to model how the percentage of non-Hispanic whites per county maps to changes in that margin, and plot it using R. Finally, I will investigate whether the percentage of non-Hispanic whites in a county is related to the share of votes cast for third-party candidates in 2016, and create the relevant graphics.

Data Analysis

Library Imports

library(ggplot2)
library(readr)
library(readxl)
library(tidyr)
library(knitr)
library(dplyr)

Data Import

CA_Data <- read_excel("CA Election Analysis Data Set.xlsx")

Deriving Democratic Margin In Each County

To begin our analysis, we need to calculate the Democratic margin of victory in each county for both 2012 and 2016. The variables are called “Diff_2012” and “Diff_2016”, and each is simply equal to Dem Percent minus GOP Percent per county in a given year. In counties that the GOP won, this value will be negative.

CA_Data <- CA_Data %>% mutate(Diff_2012 = `2012 Dem Percentage` - `2012 GOP Percentage`)
CA_Data <- CA_Data %>% mutate(Diff_2016 = `2016 Dem Percentage` - `2016 GOP Percentage`)

Let’s create a scatterplot to illustrate how the margin compares across counties over the two election years. I’ve added a red line with a slope of 1 to show where a county would fall if its margin were identical in 2012 and 2016; points above the line shifted towards the Democrat, and points below it shifted towards the GOP.

ggplot(data = CA_Data) +
    geom_point(mapping = aes(x = Diff_2012, y = Diff_2016)) +
    geom_hline(yintercept = 0) +
    geom_vline(xintercept = 0) +
    geom_abline(slope = 1, intercept = 0, colour = "red")

cor(CA_Data$Diff_2012, CA_Data$Diff_2016)

## [1] 0.9803417

The correlation between Diff_2012 and Diff_2016 is .98, which is in the range we’d expect for such a self-evident relationship.

The variable “Dem_Shift” will represent the change in the Democratic margin in each county between 2012 and 2016. “Dem_Shift” is equal to Diff_2016 minus Diff_2012. A positive value indicates that Clinton’s margin was larger than Obama’s, and a negative value indicates that Obama’s margin was larger than Clinton’s.

CA_Data <- CA_Data %>% mutate(Dem_Shift = Diff_2016 - Diff_2012)

Let’s create a histogram to track overall shifts in the state.

ggplot(data = CA_Data) +
    geom_histogram(mapping = aes(x = Dem_Shift*100), boundary = 0,
        binwidth = 2) + 
    labs(title = "Histogram of Democratic Margin Shift", x = "Percentage Shift",
         y = "Number of Counties") +
    geom_vline(xintercept = 0, colour = "blue")

While the 2012 and 2016 margins are highly correlated, there are some obvious non-random deviations. For example, we can see that the vast majority of counties that voted for Obama in 2012 voted for Clinton by larger margins in 2016. Among counties that voted for Romney in 2012, a slight majority voted for Trump in 2016 by greater margins. Overall, more counties swung towards Clinton than towards Trump, which makes sense considering that her statewide margin of victory in 2016 exceeded Obama’s 2012 margin by 7 percentage points. This pattern points to increasing political polarization in California. In the next section, we will analyse counties by percentage of non-Hispanic whites to get a better understanding of the ethnocultural factors behind these trends.
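To put numbers on these claims rather than reading them off the scatterplot, the counties can be cross-tabulated by 2012 winner and by the direction of the 2016 shift. The sketch below is only an illustration on a made-up five-county data frame: the values in `toy` and the column names `Winner_2012` and `Shift_Direction` are hypothetical and are not taken from the data set. The same steps should carry over to CA_Data, which already has Diff_2012 and Dem_Shift.

```r
# Hypothetical toy data: five made-up counties, NOT values from the real data set.
toy <- data.frame(
  Diff_2012 = c(0.30, 0.12, -0.05, -0.20, 0.02),
  Diff_2016 = c(0.38, 0.15, -0.09, -0.18, 0.06)
)
toy$Dem_Shift <- toy$Diff_2016 - toy$Diff_2012

# Label each county by its 2012 winner and by the direction of its 2016 shift.
toy$Winner_2012 <- ifelse(toy$Diff_2012 > 0, "Obama", "Romney")
toy$Shift_Direction <- ifelse(toy$Dem_Shift > 0, "Toward Clinton", "Toward Trump")

# Cross-tabulate: rows are the 2012 winner, columns the direction of the shift.
shift_table <- table(toy$Winner_2012, toy$Shift_Direction)
print(shift_table)
```

Reading the table row by row shows how many Obama counties and how many Romney counties moved in each direction, which is the comparison the paragraph above describes in words.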

Democratic Margin Shift In the Context of Race

In the United States, non-Hispanic whites tend to vote Republican, while minorities such as Black, Hispanic, and Asian Americans generally support Democrats. Let’s see how well this holds up in California, a state that is more liberal across the board, by checking whether the Non-Hispanic White Percentage per county is correlated with the 2016 GOP Percentage per county.

cor(CA_Data$`Non-Hispanic White Percentage`, CA_Data$`2016 GOP Percentage`)

## [1] 0.4443113

ggplot(data = CA_Data, aes(x = `Non-Hispanic White Percentage`, y = `2016 GOP Percentage`)) + 
  geom_point(color='black') +
  geom_smooth(method = "lm", se = FALSE)

The correlation between the percentage of non-Hispanic whites in a county and its vote percentage for Republicans is .44, a moderate positive relationship that is consistent with our assumption. This can be seen on the graph, where I have included a linear best-fit line. Next we will produce a similar plot but switch the y-axis to the “Dem_Shift” variable.
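The best-fit line drawn by geom_smooth(method = "lm") is an ordinary least squares fit, and the same model can be fitted directly with lm() to read off its slope and R-squared. The sketch below uses a made-up six-county data frame; the values in `toy` and the column names `White_Pct` and `GOP_Pct` are hypothetical stand-ins, not the real data. It also checks a useful fact: with a single predictor, R-squared equals the squared correlation, so a correlation of .44 corresponds to roughly 20% of the county-to-county variance.

```r
# Hypothetical toy data: six made-up counties, NOT values from the real data set.
toy <- data.frame(
  White_Pct = c(0.25, 0.35, 0.45, 0.55, 0.65, 0.75),
  GOP_Pct   = c(0.28, 0.31, 0.30, 0.42, 0.45, 0.52)
)

# Fit the same straight line that geom_smooth(method = "lm") draws.
fit <- lm(GOP_Pct ~ White_Pct, data = toy)
print(coef(fit))  # intercept and slope of the best-fit line

# With one predictor, R-squared is the square of the correlation.
r <- cor(toy$White_Pct, toy$GOP_Pct)
print(all.equal(summary(fit)$r.squared, r^2))

# On the real data the call would presumably look like this (untested here):
# fit <- lm(`2016 GOP Percentage` ~ `Non-Hispanic White Percentage`, data = CA_Data)
```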

cor(CA_Data$Dem_Shift, CA_Data$`Non-Hispanic White Percentage`)

## [1] -0.5725335

ggplot(data = CA_Data, aes(x = `Non-Hispanic White Percentage`*100, y = Dem_Shift)) + 
  geom_point(color='black') +
  geom_hline(yintercept = 0) +
  geom_vline(xintercept = 50) +
  geom_smooth(method = "lm", se = FALSE)

The correlation between “Non-Hispanic White Percentage” and “Dem_Shift” is -.57, suggesting an inverse relationship. In absolute terms this is stronger than the .44 correlation above: how “white” a county is tracks its shift towards the GOP in 2016 more closely than it tracks its GOP vote share in general. Judging from the scatterplot, counties with a greater percentage of Non-Hispanic Whites are more likely to have shifted towards the GOP from 2012 to 2016. Conversely, counties with a higher percentage of minorities tended to swing towards the Democrats from 2012 to 2016. In fact, all but one county with a Non-Hispanic White population of less than 50% swung towards Clinton in the 2016 Presidential Election. In the counties with a higher proportion of Non-Hispanic White residents, there was far greater variation in the magnitude and direction of the swing, although the majority shifted towards the GOP.
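The “all but one” observation can be checked with a count instead of by eye. The sketch below splits counties at the 50% line and tabulates the direction of the swing in each group. It runs on a made-up six-county data frame: the values in `toy` and the names `White_Pct`, `Group`, and `Swing` are hypothetical, chosen only to show the steps.

```r
# Hypothetical toy data: six made-up counties, NOT values from the real data set.
toy <- data.frame(
  White_Pct = c(0.20, 0.35, 0.48, 0.60, 0.72, 0.85),
  Dem_Shift = c(0.09, 0.06, -0.01, 0.03, -0.04, -0.07)
)

# Split counties at the 50% Non-Hispanic White line used in the scatterplot.
toy$Group <- ifelse(toy$White_Pct < 0.5, "Majority-minority", "Majority white")
toy$Swing <- ifelse(toy$Dem_Shift > 0, "Toward Clinton", "Toward Trump")

# Counts per group, then the share of each group swinging each way.
swing_table <- table(toy$Group, toy$Swing)
print(swing_table)
print(round(prop.table(swing_table, margin = 1), 2))
```

The row shares make the two groups comparable even though they contain different numbers of counties.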

Third Party Votes and Race

The 2016 Presidential election was extremely controversial because Clinton won the popular vote but lost the electoral college, and therefore the presidency, to Trump. Interestingly, the number of votes cast for third party candidates in the tipping point states of Pennsylvania, Michigan, and Wisconsin was greater than Trump’s margin of victory. Political commentators often claimed that these third party votes were cast by white voters who had little to lose by casting a “protest vote”, in other words voting for a candidate that had no chance of winning instead of voting for the lesser of two perceived evils. While I cannot say whether this is true, we can see how the racial makeup of counties relates to the share of third party votes cast in California.

CA_Data <- CA_Data %>% mutate(Third_Party_2012 = 1 - `2012 Dem Percentage` - `2012 GOP Percentage`)
CA_Data <- CA_Data %>% mutate(Third_Party_2016 = 1 - `2016 Dem Percentage` - `2016 GOP Percentage`)
CA_Data <- CA_Data %>% mutate(Third_Party_Shift = Third_Party_2016 - Third_Party_2012)

We’ve created our three variables. “Third_Party_2012” and “Third_Party_2016” are simply the share of votes in each county that went to neither major-party candidate in that year (Obama or Romney in 2012, Clinton or Trump in 2016). “Third_Party_Shift” is the third party vote share in 2016 minus the third party vote share in 2012 in each county.

ggplot(data = CA_Data) +
  geom_boxplot(mapping = aes(x = "", y = Third_Party_Shift*100)) +
  labs(title = "Percentage Shift Towards Third Party\nCandidates Between 2012 and 2016 ", x = NULL, y = "Percentage Shift")

What we find is that every single county had a larger share of votes go to third party candidates in 2016 than in 2012. The distribution is fairly tight as well, with all observations within the range of a 2 to 7 percentage point shift. Clearly, there was a sizeable movement towards third party candidates in the 2016 Presidential election in California. However, what happens if we apply the same investigation of race as we did earlier?
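Both statements, that every county shifted towards third parties and that the shifts fall in a narrow band, can be verified directly rather than read off the boxplot. The sketch below derives the third party shares the same way as above and then checks the sign and range of the shift. The three counties in `toy` and the short column names (`Dem_2012`, `GOP_2012`, and so on) are hypothetical, not values from the data set.

```r
# Hypothetical toy data: three made-up counties, NOT values from the real data set.
toy <- data.frame(
  Dem_2012 = c(0.60, 0.45, 0.52), GOP_2012 = c(0.37, 0.52, 0.45),
  Dem_2016 = c(0.62, 0.41, 0.53), GOP_2016 = c(0.31, 0.53, 0.42)
)

# Third party share = everything that did not go to the two major parties.
toy$Third_Party_2012 <- 1 - toy$Dem_2012 - toy$GOP_2012
toy$Third_Party_2016 <- 1 - toy$Dem_2016 - toy$GOP_2016
toy$Third_Party_Shift <- toy$Third_Party_2016 - toy$Third_Party_2012

# Did every county move towards third parties, and by how much (in points)?
all_positive <- all(toy$Third_Party_Shift > 0)
shift_range <- range(toy$Third_Party_Shift * 100)
print(all_positive)
print(shift_range)
```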

cor(CA_Data$Third_Party_Shift, CA_Data$`Non-Hispanic White Percentage`)

## [1] 0.3661224

ggplot(data = CA_Data, aes(x = `Non-Hispanic White Percentage`*100, y = Third_Party_Shift)) + 
  geom_point(color='black') +
  geom_hline(yintercept = 0) +
  geom_vline(xintercept = 50) +
  geom_smooth(method = "lm", se = FALSE)

Putting all this together, we see that there is again a positive correlation between the proportion of Non-Hispanic Whites in a county and its shift towards third party candidates in 2016. While the correlation is only .37, the best-fit line still shows a pattern: “whiter” counties tended to shift a larger share of their votes to third party candidates.
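Since .37 is a fairly modest correlation, it is worth asking whether it could plausibly be noise. A rough check is the standard t-test for a correlation coefficient, which needs only the reported value and the number of counties. The sketch below assumes the data set has one row for each of California’s 58 counties with no missing values; that row count is my assumption and is not stated in the output above. On the real data, cor.test() on the same two columns should perform the equivalent test.

```r
r <- 0.3661224  # correlation reported above
n <- 58         # ASSUMPTION: one row per California county, none missing

# t statistic for testing whether the true correlation is zero.
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution with n - 2 degrees of freedom.
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

print(t_stat)
print(p_value)
```

Under that assumption the t statistic comes out a little under 3, which suggests the relationship is unlikely to be pure chance even though it explains only a small part of the variation between counties.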

Conclusion

Political Polarization

The majority of counties in California shifted towards the Democrat between the 2012 and 2016 Presidential elections. Nearly all counties that voted for Obama increased their vote share for Clinton. However, among counties that voted for Romney, most voted for Trump by even higher margins.

Race and Main Party Preference

We found a moderate positive correlation (.44) between the percentage of Non-Hispanic Whites in a county and how Republican that county voted in 2016. More surprising, however, was the stronger relationship (-.57 with “Dem_Shift”) between how “white” a county was and how much its margin shifted towards the GOP in 2016 relative to 2012. This suggests that white voters in the aggregate are trending more Republican over time, which is a common argument to explain the changing political coalitions in “whiter” states like Michigan, Ohio, Wisconsin, and Minnesota that also swung towards Trump in 2016.

One counterpoint, though, is that the least variance around the regression line is in counties that are majority-minority (less than 50% Non-Hispanic White). Nearly all of these already quite Democratic counties voted for Clinton by larger margins than for Obama. One explanation may be that Donald Trump was a particularly controversial candidate who was uniquely distasteful to minority voters in California. However, we can still see that a large contingent of “whiter” counties shifted towards Trump as well, so there is clearly some polarization across the board here.

Race and Third Party Preference

Perhaps the most striking result of all was that every county in California swung towards third parties, by 2 to 7 percentage points, which is pretty huge! We identified a positive correlation between the percentage of Non-Hispanic Whites in a county and a swing towards third party candidates in 2016. That is, “whiter” counties tended to shift more of their votes towards third party candidates in 2016 compared to 2012. Overall, this means that counties with a higher proportion of Non-Hispanic Whites tended to shift both more Republican and more third party at the same time.

Key Takeaways

Putting this all together, we’ve discovered that the state of California is undergoing significant political polarization. We also found that the divergence is related to race: majority-minority counties are shifting more firmly into the Democratic column, while “whiter” counties prefer Republicans more strongly than in the past, or at least preferred Donald Trump over Hillary Clinton. We have also uncovered a trend of “whiter” counties shifting their support from the main parties to third party candidates.

Much of this debate will center around whether the 2016 Presidential Election was an anomaly or the new normal for politics in California. Both Trump and Clinton were historically unpopular candidates, which may have led to some interesting trends within the state’s votes. However, there is no doubt that there is a real effect of race on both choosing your party, and deviating from your previous choices.