During 2020, COVID-19 created uncertainty for many Americans. In response to the economic crisis facing the United States, Congress passed the CARES Act on March 27, 2020. One component of the CARES Act was the Paycheck Protection Program (PPP), which was meant to fund the paychecks of small-business employees during the lockdowns and other unforeseen events. If certain criteria were met, the loan would be fully forgiven at a later date. The question I asked was whether these loans were issued fairly across the state of New Jersey. I wanted to see if the data supported the idea that these loans went to businesses that needed the money to support their employees, regardless of demographics or psychographics.

The dataset I used came from the US Small Business Administration's website. It contained 21,858 entries with information such as business name, location, loan range, and how many employees the loan would help. The biggest challenge with the dataset was finding a way to compare areas. The dataset contained town names and congressional districts, while the latest census data was grouped by county. To line the two up, I manually added a county column to the dataset.
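
The county column was added by hand in this project. As a rough sketch of how the same step could be scripted, the R snippet below joins a town-to-county lookup table onto the loan records and lists the rows that fail to match. The column names (`City`, `town`, `county`) and the sample rows are assumptions made up for illustration, not the actual SBA fields or data.

```r
# Toy loan records; `City` is an assumed column name, not the SBA's.
loans <- data.frame(
  BusinessName = c("A Co", "B LLC", "C Inc", "D Corp"),
  City = c("Newark", "HOBOKEN ", "Trenton", "Nwark"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table mapping each town to its county.
town_to_county <- data.frame(
  town = c("NEWARK", "HOBOKEN", "TRENTON"),
  county = c("Essex", "Hudson", "Mercer"),
  stringsAsFactors = FALSE
)

# Normalize town names so differences in case or spacing do not block a match.
loans$town <- toupper(trimws(loans$City))

# Left join: keep every loan, even when no county is found.
loans_county <- merge(loans, town_to_county, by = "town", all.x = TRUE)

# Rows with a missing county point to misspelled or unknown town names.
unmatched <- loans_county[is.na(loans_county$county), ]
print(unmatched$City)
```

Rows that end up in `unmatched` would still need manual review, which is also one way to surface data-entry errors in the location columns.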

Several assumptions were made during this study. The first assumption was that counties are a viable breakdown of the state to use in this kind of study. Ideally, the dataset and this study would have benefited from knowing the demographics of each program participant. The second assumption was that the information is accurate. When going through the dataset, I found several errors in the location columns. This leads me to believe that the information was entered from a form filled out by the loan recipient. The same issue comes up with the number of jobs the program helped fund: it appears that this field came from a form the loan recipient filled out, and that it was sometimes left blank and defaulted to zero.

I started by creating a few simple graphs to familiarize myself with the data, such as a breakdown of how many loans were issued in each loan range. Based on these graphs, I decided to concentrate on population and loan range as the main factors when looking at the data. I then created several charts looking at factors such as race, age, and education compared to population and the number of loans issued.
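
A breakdown like the one described can be sketched in base R with `table()`. The column name `LoanRange` and the range labels below are placeholders chosen for illustration; the actual labels in the SBA file may differ.

```r
# Toy data; `LoanRange` and its labels ("low", "mid", "high") are assumptions.
loans <- data.frame(
  LoanRange = c("low", "low", "mid", "high", "low", "mid"),
  stringsAsFactors = FALSE
)

# Count the loans in each range and sort from most to least common.
range_counts <- sort(table(loans$LoanRange), decreasing = TRUE)
print(range_counts)
```

Passing `range_counts` to `barplot()` would turn the counts into a simple bar chart.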

Based on the charts created, I was able to determine that the loans were distributed evenly across the state relative to population. The counties with higher populations received more loans, which makes sense considering the loans were meant for small businesses. I included and presented a chart that compared the poverty levels of each county because I think it best represented the findings of all the charts. The number of loans an area received was directly related to the population of the county. Other factors like race, education, and poverty did not have a direct effect on how many loans were secured in the county. I then created a chart of the loan range distribution across the state to check whether each county had an even mix of high and low loan amounts.
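
One way to put numbers behind this comparison is to compute loans per 10,000 residents for each county and check the correlation with population and with poverty. The snippet below is a minimal sketch using made-up counties and values, with assumed column names; it is not the project's data or results.

```r
# Toy county table; all names and numbers are made up for illustration.
county_stats <- data.frame(
  county = c("A", "B", "C", "D"),
  loans = c(210, 390, 620, 780),
  population = c(100000, 200000, 300000, 400000),
  poverty_rate = c(0.12, 0.08, 0.15, 0.10)
)

# Loans per 10,000 residents, so large and small counties are comparable.
county_stats$loans_per_10k <- county_stats$loans / county_stats$population * 10000

# A correlation near 1 means loan counts track population closely.
r_pop <- cor(county_stats$loans, county_stats$population)

# A correlation near 0 means per-capita lending does not track poverty.
r_pov <- cor(county_stats$loans_per_10k, county_stats$poverty_rate)

print(county_stats)
print(c(population = r_pop, poverty = r_pov))
```

With only a few dozen counties at most, these correlations are a rough check rather than a formal test.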

After exploring the distribution of loans in each county, I decided to look at the industries that received the loans. I was particularly interested in the loans secured by businesses classified as offices of lawyers. I created a chart to see whether one area received most of these loans, and the distribution appeared to be in proportion to each area's population.

The last thing I wanted to look at was how many jobs these loans were helping. The first chart I created was a straight plot of the number reported. Unfortunately, the most frequent value was 0, which appeared more than three times as often as the next most common value. After eliminating the 0 values, we get a more readable distribution that leans towards the lower values.
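
The zero-handling step can be sketched as follows. The column name `JobsRetained` and the values are assumptions for illustration, and treating every 0 as a blank entry is itself an assumption, since some loans may truly have reported zero jobs.

```r
# Toy data; `JobsRetained` is an assumed column name.
loans <- data.frame(
  JobsRetained = c(0, 0, 0, 12, 25, 0, 40, 18, 0, 33, 150, 22)
)

# Share of records reporting zero jobs.
zero_share <- mean(loans$JobsRetained == 0)

# Treat zeros as missing (assumption: a 0 means the field was left blank).
jobs_reported <- loans$JobsRetained
jobs_reported[jobs_reported == 0] <- NA

# Summarize the remaining values, ignoring the missing ones.
median_jobs <- median(jobs_reported, na.rm = TRUE)
print(zero_share)
print(summary(jobs_reported))
```

Calling `hist(jobs_reported)` on the result would plot the distribution without the spike at zero.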

In conclusion, the data suggest that the loans were distributed evenly across the state, in proportion to county population. To study this data further, I would like to investigate how many jobs the loan program saved and for how long.

Covid Paycheck Protection Program Analysis using R

Introduction

The COVID-19 pandemic brought about significant economic uncertainty for many Americans. In response, the US Congress passed the CARES Act on March 27, 2020, which included the Paycheck Protection Program (PPP), designed to fund the paychecks of small-business employees during lockdowns and other unforeseen events. The loans would be fully forgiven at a later date if certain criteria were met. The question this project sought to answer was whether these loans were issued fairly across the state of New Jersey. The aim was to determine whether the data supported the idea that these loans went to businesses that needed the money to support their employees, regardless of demographics or psychographics.

Data Collection

The dataset used in this project came from the US Small Business Administration's website. It contained 21,858 entries with information such as business name, location, loan range, and how many employees the loan would help. The dataset posed a challenge when it came to comparing areas: it contained town names and congressional districts, while the latest census data was grouped by county. To work around this issue, a county column was added to the dataset manually.

Data Cleaning

Two assumptions were made during this study. The first was that counties were a viable breakdown of the state to use in this kind of study. Ideally, the dataset and this study would have benefited from knowing the demographics of each program participant. The second was that the information was accurate. When going through the dataset, errors in the location columns were discovered. This suggested that the information was entered from a form filled out by the loan recipient. It was also noted that the field for the number of jobs the program helped fund was sometimes left blank and defaulted to zero.