Team Undergrad8: 2021 New England Police Expenditure, COSI 116A F23

Eliora Kruman, Harry Cheng, Grant Gu, Cole Simmons

Project-long Course Project as part of COSI 116A: Information Visualization, taught by Prof. Dylan Cashman, Brandeis University.

Motivation

Summary of user needs and motivating questions.

Our motivation was to explore the relationship between state/local budgets and police expenditure. Do state and local budgets correspond with each other? Are there clear trends between police expenditure and economic brackets or population size? What does police expenditure look like across specific regions?

Our visualization was primarily developed for users to discover this information themselves. With it, we hope to identify discrepancies and analyze trends in state revenue and police expenditure. We focused our data on New England states to give users a deeper understanding of a specific region. This visualization isn't meant for fun; it's meant to help identify how this funding is allocated. We believe that users will not necessarily know the results or the specific breakdowns, so we hope to give them the opportunity to discover this data, and perhaps to confirm or disprove pre-existing notions, by interacting with our visualization.

The primary consumers of our visualization will be U.S. citizens and taxpayers. We hope that this visualization will be accessed by people interested in politics in order to better understand where their money is going and to get a fuller picture of the budget. Hopefully, people will find this a more accessible way of engaging with their local government and understanding how their taxes are spent for the community.

Visualization

Demo Video

Visualization explanation

Our goal was to create three visualizations for our users. Since our data is focused on New England, our immediate thought was to create a map. Maps are great for comparing the different states, and they’re also a fun, recognizable visualization for users to interact with. We wanted a monochromatic color scheme, since the data is sequential.

Since we want to analyze trends and outliers, we then decided to create a scatterplot. It’s a simple-to-read visualization that emphasizes many of the comparisons we’re looking to make (position is a strongly identifiable channel). There’s no chart junk, and it’s a direct representation of the data. But since we have multiple components to analyze, we determined that it would be best to create four linked scatterplots. This avoids overwhelming the user with too many encodings in a single chart, and it gives an honest representation of the data, with axes that minimize the lie factor. By adding brushing and linking, we make the states a user selects in one chart stand out in the neighboring charts.

Lastly, we chose to create a treemap. We wanted to create context for our visualizations and give the user a broader understanding of how much revenue these states are receiving. While the scatterplots show the local and state revenues, we thought it was important to show police expenditure as a share of total revenue, and we considered this integral enough to warrant its own visualization. A treemap uses area to proportionately encode which states are spending the most on police given their individual revenue. We used the same monochromatic color scheme so that the comparisons would be consistent across the board.
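The brushing and linking described above can be sketched independently of any charting library. The following is only a minimal illustration of the idea, not the project's D3 code: the `StatePoint` record, the `brush_select` and `linked_highlights` helpers, and every number are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatePoint:
    """One state's position in a single scatterplot (hypothetical fields)."""
    state: str
    x: float
    y: float


def brush_select(points, x0, y0, x1, y1):
    """Return the set of states whose points fall inside the brush rectangle."""
    lo_x, hi_x = sorted((x0, x1))
    lo_y, hi_y = sorted((y0, y1))
    return {p.state for p in points if lo_x <= p.x <= hi_x and lo_y <= p.y <= hi_y}


def linked_highlights(views, selected):
    """For every linked view, flag each state as highlighted or not."""
    return {
        name: {p.state: p.state in selected for p in points}
        for name, points in views.items()
    }


# Placeholder values for illustration only; not taken from the dataset.
population_view = [
    StatePoint("MA", 7.0, 3.0),
    StatePoint("VT", 0.6, 0.4),
    StatePoint("CT", 3.6, 1.5),
]
income_view = [
    StatePoint("MA", 80.0, 3.0),
    StatePoint("VT", 60.0, 0.4),
    StatePoint("CT", 82.0, 1.5),
]

# Brushing in one view produces a set of states...
selected = brush_select(population_view, 3.0, 1.0, 8.0, 4.0)
# ...and every linked view highlights that same set.
highlights = linked_highlights(
    {"population": population_view, "income": income_view}, selected
)
```

Keying the selection on the state name rather than on screen coordinates is what lets charts with different axes, and in principle the map, share one selection.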

Final visualization screenshots (PNG images), design justifications, UI walk-through, and linked presentation slides.

Data Analysis

Summary of data, data types, and data preprocessing.

This data comes from an annual Census Bureau survey of state government finances, covering the 2021 fiscal year. The Census Bureau derives the state financial statistics from administrative records provided by state governments, and merges this data with any additional financial forms the Bureau reaches out to collect. The first column of data is categorical, and it describes the data found in each row (police expenditure, federal revenue, etc.). There are then 51 additional columns, one for each state plus the U.S. total. Each of these columns is further divided into 5 subcolumns: state & local government, state government CV, state government, local government CV, and local government. The subcolumn names are categorical, and each subcolumn is also assigned a numeric ID. The values within each subcolumn are quantitative. Much of this data was superfluous for our research, so we created a smaller dataset focused only on police expenditure and revenue for states in the New England region. We combined this data with state population (quantitative data sourced from Wikipedia) and personal income per capita (quantitative data from an additional data source).
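To make the preprocessing easier to reproduce, here is a minimal sketch of the filtering and joining step. It assumes the wide table has already been loaded into a nested dictionary keyed by row label and then by (state, subcolumn); the row labels in `KEEP_ROWS`, the `build_subset` helper, and all numbers are hypothetical and may not match the labels in the actual file.

```python
NEW_ENGLAND = (
    "Connecticut", "Maine", "Massachusetts",
    "New Hampshire", "Rhode Island", "Vermont",
)
# Hypothetical row labels; the real file may name these rows differently.
KEEP_ROWS = ("Police protection", "Total revenue")
# Only the value subcolumns are kept; the CV subcolumns are dropped.
KEEP_LEVELS = ("State government", "Local government")


def build_subset(wide_table, population, income_per_capita, states=NEW_ENGLAND):
    """Reduce the wide table to one record per state with only the kept fields."""
    records = []
    for state in states:
        record = {
            "state": state,
            "population": population[state],
            "income_per_capita": income_per_capita[state],
        }
        for row_label in KEEP_ROWS:
            for level in KEEP_LEVELS:
                record[f"{row_label} / {level}"] = wide_table[row_label][(state, level)]
        records.append(record)
    return records


# Placeholder numbers purely to exercise the function; not real figures.
wide_table = {
    "Police protection": {
        ("Vermont", "State government"): 1.0,
        ("Vermont", "Local government"): 2.0,
        ("Texas", "State government"): 9.0,
        ("Texas", "Local government"): 9.0,
    },
    "Total revenue": {
        ("Vermont", "State government"): 10.0,
        ("Vermont", "Local government"): 5.0,
        ("Texas", "State government"): 90.0,
        ("Texas", "Local government"): 90.0,
    },
    "Public welfare": {
        ("Vermont", "State government"): 3.0,
        ("Vermont", "Local government"): 0.5,
    },
}

records = build_subset(
    wide_table,
    population={"Vermont": 100},
    income_per_capita={"Vermont": 50},
    states=["Vermont"],
)
```

In this sketch, rows and states outside the kept lists simply never reach the output, which mirrors the trimming described above.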

The first trend we spotted was that state governments spend significantly more on welfare than on policing. However, the average local government spends about twice as much on policing as on welfare. This could be interpreted in one of two ways: either there is a discrepancy between the state and local governments’ goals, or the state already spends so much on welfare that it makes sense for the local government to spend more on policing. The other trends we saw were not surprising. States with higher populations spent more on policing, and states with higher revenue spent more on policing as well. We then examined the police budget in comparison to the personal income per capita of states, and found no clear trend.
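One way to put a number on trends like these is a Pearson correlation coefficient. The sketch below is an assumption about how such a check could be done, not a record of the analysis we ran; `pearson_r` is a hypothetical helper and the sample values are placeholders rather than figures from the dataset. With only six states, any coefficient is a rough indicator at best.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values only: a perfectly linear pair and a reversed pair.
rising = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
falling = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
```

A value near 1 or -1 would support a claim such as "higher population, higher police spending", while a value near 0 would match the "no clear trend" reading for income.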

Task Analysis

Summary of task table.

We had five initial tasks for our users, ranked in order of descending priority:

Compare police funding between regions - highlighting differences
Correlation between economic brackets and police budgeting
Correlation between population size and police budgeting
Do local budgets correspond to state budgets?
Change of police funding over time - is funding increasing/decreasing similarly across the nation, with points of interest such as the ‘80s and the recent Defund the Police Movement?

At a low-level abstraction, all five of these tasks are comparison queries. Comparing police funding between regions also falls under identifying points of interest. At a mid-level abstraction, tasks 2 through 5 focus on exploration, while task 1 focuses on browsing. At a high-level abstraction, all of these tasks focus on discovering and deriving information. Our visualization accomplishes all of these tasks except for the fifth, which we considered to be the lowest priority, since we did not want to overwhelm the user with information. We also found in our Data Analysis section that the relationship between local and state budgets was exactly what you’d anticipate, so that task is a lower priority as well.

Overall, our main goal is to create visualizations that are good at comparing categorical data, and to place an emphasis on the regional (state-based) aspect of this data. We want the visualizations to be easy for users to comprehend, and we want to give users the ability to discover the data themselves by interacting with the multiple components.

Design Process

Sketches and design choices to justify final visualization.

The tasks we prioritized were to compare police funding between states and to identify its correlations with economic brackets and population size. We also sought to compare funding at the state and local levels, so as to show potential discrepancies and similarities between the two. To accomplish this, we created a map of New England states which compares police expenditure per capita through a monochrome color scale. We think this design will be quick for users to grasp and creates a strong visual representation for comparing which states provide more funding. Next, we have a set of linked scatter plots which show more detailed comparisons of economics, populations, and state/local revenues. Users might select a state of interest on the map and look into it in more detail using the scatter plots, which are a strong tool for identifying trends and outliers. The user could also start with the scatter plots and move to the map. Lastly, the treemap provides additional context about which states spend the most on police given their total revenue. We’ve linked our map and scatterplots so that users can compare a subset of states with all the information provided.

The most significant difference from our original plans is that we previously considered focusing on change over time. However, in the data we’ve explored so far, there hasn’t been a significant change in police funding. In addition, we don’t want to give the user too much to focus on.
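As a rough illustration of the map encoding, per-capita expenditure can be mapped onto a single-hue ramp by linear interpolation. This is a sketch under stated assumptions, not the project's code: the endpoint colors and the `per_capita` and `sequential_color` helpers are hypothetical, and plain RGB interpolation is a simplification that D3's own interpolators may not match exactly.

```python
def per_capita(expenditure, population):
    """Expenditure divided by population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return expenditure / population


def hex_to_rgb(color):
    """Convert '#rrggbb' to an (r, g, b) tuple of ints."""
    return tuple(int(color[i:i + 2], 16) for i in (1, 3, 5))


def sequential_color(value, vmin, vmax, light="#deebf7", dark="#08519c"):
    """Map value in [vmin, vmax] to a color between light and dark.

    The default endpoint colors are hypothetical, not the project's palette.
    """
    if vmax == vmin:
        t = 0.5  # a degenerate domain maps everything to the midpoint
    else:
        t = (value - vmin) / (vmax - vmin)
    t = max(0.0, min(1.0, t))  # clamp out-of-range values to the ramp ends
    start, end = hex_to_rgb(light), hex_to_rgb(dark)
    channels = [round(a + t * (b - a)) for a, b in zip(start, end)]
    return "#{:02x}{:02x}{:02x}".format(*channels)


# Placeholder domain for illustration only.
lowest = sequential_color(0.0, 0.0, 10.0)
highest = sequential_color(10.0, 0.0, 10.0)
clamped = sequential_color(25.0, 0.0, 10.0)
```

Because darker always means more, a reader can rank states by shade without consulting the legend, which is the reason for choosing a sequential scheme here.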

Conclusion

Short summary of work completed and areas for improvement/future-work.

We have successfully implemented brushing and linking between all of our scatterplots and the map, as well as from the scatterplots and map to the treemap. We use a consistent blue color encoding across visualizations (which also matches the topic of policing!). Tooltips on the map and treemap give the user details on demand, showing the exact values these visualizations represent. We have created a meaningful display of data to help users explore police budgeting in the New England region, with an emphasis on population, revenue, and personal income per capita. This effectively responds to tasks one through four from our task table. We think this visualization is not overly complex and is well suited for the average taxpayer who wants to learn about their state’s budget.

In the future, some things we would like to do are:

Make the color legends update with brushing and linking, as opposed to a set scale
Enable additional brushing and linking from the treemap
Allow users to reload the page with a region of their choice
Allow users to select the year they'd like to examine, or perhaps move through time on a scale so that they can see the change over time

Acknowledgments

D3: Data-Driven Documents by Mike Bostock.
Pure CSS responsive "Fork me on GitHub" ribbon by Chris Heilmann.
Class assignments: brushing and linking based on homework 4, map based on in-class map assignment
Color gradient tutorial by Visual Cinnamon