INTRODUCTION
The Electric Vehicle Population Size History by County dataset, available on data.gov (https://catalog.data.gov/dataset/electric-vehicle-population-size-history-by-county), reports the number of Battery Electric Vehicles (BEVs) and Plug-in Hybrid Electric Vehicles (PHEVs) registered through the Washington State Department of Licensing (DOL). It tracks monthly registration counts by county, separately for passenger vehicles and trucks, which makes it a useful resource for researchers, analysts, and policymakers studying electric vehicle adoption trends among Washington registrations. The data is derived by combining records from the National Highway Traffic Safety Administration (NHTSA), the Environmental Protection Agency (EPA), and the DOL's titling and registration records. Note that the file also contains rows for counties outside Washington (for example Riverside, CA, in the preview below).
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
warnings.simplefilter(action='ignore', category=DeprecationWarning)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Path to the CSV file downloaded from data.gov
projectdata = 'Electric_Vehicle_Population_Size_History_By_County.csv'
df = pd.read_csv(projectdata)
# Preview the first and last rows together with the frame's dimensions
print(df)
                    Date          County State Vehicle Primary Use  \
0      September 30 2022       Riverside    CA           Passenger
1       December 31 2022  Prince William    VA           Passenger
2        January 31 2020          Dakota    MN           Passenger
3           June 30 2022           Ferry    WA               Truck
4           July 31 2021         Douglas    CO           Passenger
...                  ...             ...   ...                 ...
20814    January 31 2023      Rockingham    NH           Passenger
20815       July 31 2020     Carson City    NV           Passenger
20816   February 28 2022          Island    WA           Passenger
20817   December 31 2020       San Diego    CA           Passenger
20818   November 30 2019       Goochland    VA           Passenger

       Battery Electric Vehicles (BEVs)  \
0                                     7
1                                     1
2                                     0
3                                     0
4                                     0
...                                 ...
20814                                 1
20815                                 1
20816                               744
20817                                14
20818                                 3

       Plug-In Hybrid Electric Vehicles (PHEVs)  Electric Vehicle (EV) Total  \
0                                             0                            7
1                                             2                            3
2                                             1                            1
3                                             0                            0
4                                             1                            1
...                                         ...                          ...
20814                                         0                            1
20815                                         0                            1
20816                                       350                         1094
20817                                         2                           16
20818                                         1                            4

       Non-Electric Vehicle Total  Total Vehicles  Percent Electric Vehicles
0                             460             467                       1.50
1                             188             191                       1.57
2                              32              33                       3.03
3                            3575            3575                       0.00
4                              83              84                       1.19
...                           ...             ...                        ...
20814                          14              15                       6.67
20815                          10              11                       9.09
20816                       62257           63351                       1.73
20817                        2724            2740                       0.58
20818                         271             275                       1.45

[20819 rows x 10 columns]
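The Date column in the preview above is stored as text such as "September 30 2022". As a minimal sketch, assuming every value follows that same month-day-year pattern, the strings can be parsed with an explicit format so that malformed entries raise an error instead of being guessed at. The sample values are copied from the preview; the variable names are illustrative only.

```python
import pandas as pd

# Sample values copied from the dataset preview
dates = pd.Series(["September 30 2022", "December 31 2022", "January 31 2020"])

# %B = full month name, %d = day of month, %Y = four-digit year
# (assumption: the whole column uses this pattern)
parsed = pd.to_datetime(dates, format="%B %d %Y")

# Once parsed, calendar parts are available through the .dt accessor
print(parsed.dt.year.tolist())   # [2022, 2022, 2020]
print(parsed.dt.month.tolist())  # [9, 12, 1]
```

Keeping a parsed date column around would allow monthly or yearly grouping later on.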
PREPROCESSING
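# Parse the Date strings (e.g. "September 30 2022") into datetimes
df['Date'] = pd.to_datetime(df['Date'])
# Drop the identifier columns; only the vehicle counts are analyzed below
df.drop(['Date', 'County', 'State'], axis=1, inplace=True)
# Count the missing values in each column
missing_values = df.isnull().sum()
# Keep only the rows that have no missing values
datanotmissing = df.dropna()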
# Select the quantitative columns
quantColumns = df.select_dtypes(include=['int64', 'float64'])
# Get the statistical summary
quantColanalysis = quantColumns.describe()
# Print the statistical summary
print(quantColanalysis)
       Battery Electric Vehicles (BEVs)  \
count                      20819.000000
mean                         217.516211
std                         2278.533317
min                            0.000000
25%                            0.000000
50%                            1.000000
75%                            3.000000
max                        72333.000000

       Plug-In Hybrid Electric Vehicles (PHEVs)  Electric Vehicle (EV) Total  \
count                              20819.000000                 20819.000000
mean                                  80.063644                   297.579855
std                                  646.373208                  2915.504792
min                                    0.000000                     0.000000
25%                                    0.000000                     1.000000
50%                                    1.000000                     1.000000
75%                                    2.000000                     4.000000
max                                17501.000000                 89834.000000

       Non-Electric Vehicle Total  Total Vehicles  Percent Electric Vehicles
count                2.081900e+04    2.081900e+04               20819.000000
mean                 2.509806e+04    2.539564e+04                   4.139216
std                  1.067324e+05    1.090860e+05                  11.055350
min                  0.000000e+00    1.000000e+00                   0.000000
25%                  4.300000e+01    4.400000e+01                   0.390000
50%                  1.630000e+02    1.650000e+02                   1.220000
75%                  8.380000e+03    8.421500e+03                   2.995000
max                  1.399823e+06    1.430937e+06                 100.000000
SUMMARY DATA ANALYSIS
# Numerical summary
numerical_summary = df.describe()
print(numerical_summary)
# Graphical summary: wide-form scatter of each numeric column against the row index
sns.scatterplot(data=df)
plt.show()
       Battery Electric Vehicles (BEVs)  \
count                      20819.000000
mean                         217.516211
std                         2278.533317
min                            0.000000
25%                            0.000000
50%                            1.000000
75%                            3.000000
max                        72333.000000

       Plug-In Hybrid Electric Vehicles (PHEVs)  Electric Vehicle (EV) Total  \
count                              20819.000000                 20819.000000
mean                                  80.063644                   297.579855
std                                  646.373208                  2915.504792
min                                    0.000000                     0.000000
25%                                    0.000000                     1.000000
50%                                    1.000000                     1.000000
75%                                    2.000000                     4.000000
max                                17501.000000                 89834.000000

       Non-Electric Vehicle Total  Total Vehicles  Percent Electric Vehicles
count                2.081900e+04    2.081900e+04               20819.000000
mean                 2.509806e+04    2.539564e+04                   4.139216
std                  1.067324e+05    1.090860e+05                  11.055350
min                  0.000000e+00    1.000000e+00                   0.000000
25%                  4.300000e+01    4.400000e+01                   0.390000
50%                  1.630000e+02    1.650000e+02                   1.220000
75%                  8.380000e+03    8.421500e+03                   2.995000
max                  1.399823e+06    1.430937e+06                 100.000000
# Create correlation matrix
corr_matrix = quantColumns.corr()
# Print the correlation matrix
print(corr_matrix)
plt.figure(figsize=(10, 8))
sns.heatmap(corr_matrix, annot=True, cmap='coolwarm')
plt.title('Correlation Matrix Heatmap')
plt.show()
                                          Battery Electric Vehicles (BEVs)  \
Battery Electric Vehicles (BEVs)                                  1.000000
Plug-In Hybrid Electric Vehicles (PHEVs)                          0.981358
Electric Vehicle (EV) Total                                       0.999092
Non-Electric Vehicle Total                                        0.779821
Total Vehicles                                                    0.789699
Percent Electric Vehicles                                        -0.012019

                                          Plug-In Hybrid Electric Vehicles (PHEVs)  \
Battery Electric Vehicles (BEVs)                                          0.981358
Plug-In Hybrid Electric Vehicles (PHEVs)                                  1.000000
Electric Vehicle (EV) Total                                               0.988656
Non-Electric Vehicle Total                                                0.870713
Total Vehicles                                                            0.878351
Percent Electric Vehicles                                                -0.020843

                                          Electric Vehicle (EV) Total  \
Battery Electric Vehicles (BEVs)                             0.999092
Plug-In Hybrid Electric Vehicles (PHEVs)                     0.988656
Electric Vehicle (EV) Total                                  1.000000
Non-Electric Vehicle Total                                   0.802487
Total Vehicles                                               0.811900
Percent Electric Vehicles                                   -0.014014

                                          Non-Electric Vehicle Total  \
Battery Electric Vehicles (BEVs)                            0.779821
Plug-In Hybrid Electric Vehicles (PHEVs)                    0.870713
Electric Vehicle (EV) Total                                 0.802487
Non-Electric Vehicle Total                                  1.000000
Total Vehicles                                              0.999873
Percent Electric Vehicles                                  -0.063489

                                          Total Vehicles  \
Battery Electric Vehicles (BEVs)                0.789699
Plug-In Hybrid Electric Vehicles (PHEVs)        0.878351
Electric Vehicle (EV) Total                     0.811900
Non-Electric Vehicle Total                      0.999873
Total Vehicles                                  1.000000
Percent Electric Vehicles                      -0.062494

                                          Percent Electric Vehicles
Battery Electric Vehicles (BEVs)                          -0.012019
Plug-In Hybrid Electric Vehicles (PHEVs)                  -0.020843
Electric Vehicle (EV) Total                               -0.014014
Non-Electric Vehicle Total                                -0.063489
Total Vehicles                                            -0.062494
Percent Electric Vehicles                                  1.000000
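Some of the near-perfect correlations above (for example 0.999 between BEVs and the EV total, or 0.9999 between the non-electric total and total vehicles) would be expected if the total columns are simply sums of the others. The sketch below checks that assumption on five rows copied from the dataset preview; it is an illustration on a small sample, not a verification of the full file, and the helper variable names are hypothetical.

```python
import pandas as pd

# Five rows copied from the dataset preview (index 0, 1, 2, 20816, 20817)
sample = pd.DataFrame({
    "Battery Electric Vehicles (BEVs)": [7, 1, 0, 744, 14],
    "Plug-In Hybrid Electric Vehicles (PHEVs)": [0, 2, 1, 350, 2],
    "Electric Vehicle (EV) Total": [7, 3, 1, 1094, 16],
    "Non-Electric Vehicle Total": [460, 188, 32, 62257, 2724],
    "Total Vehicles": [467, 191, 33, 63351, 2740],
    "Percent Electric Vehicles": [1.50, 1.57, 3.03, 1.73, 0.58],
})

# Assumed identity 1: EV total = BEVs + PHEVs
ev_sum = (sample["Battery Electric Vehicles (BEVs)"]
          + sample["Plug-In Hybrid Electric Vehicles (PHEVs)"])
ev_ok = bool((ev_sum == sample["Electric Vehicle (EV) Total"]).all())

# Assumed identity 2: total vehicles = EV total + non-electric total
total_sum = (sample["Electric Vehicle (EV) Total"]
             + sample["Non-Electric Vehicle Total"])
total_ok = bool((total_sum == sample["Total Vehicles"]).all())

# Assumed identity 3: percent = 100 * EV total / total vehicles (rounded to 2 places)
raw_percent = 100 * sample["Electric Vehicle (EV) Total"] / sample["Total Vehicles"]
max_gap = float((raw_percent - sample["Percent Electric Vehicles"]).abs().max())

print(ev_ok, total_ok)   # True True
print(max_gap < 0.006)   # True: differences are within rounding error
```

If the identities hold across the whole file, the correlation between a component and its own total says little about adoption and mostly reflects that the columns are arithmetically linked.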
# Select three pairs of columns
pairs = [
    ('Battery Electric Vehicles (BEVs)', 'Plug-In Hybrid Electric Vehicles (PHEVs)'),
    ('Plug-In Hybrid Electric Vehicles (PHEVs)', 'Non-Electric Vehicle Total'),
    ('Battery Electric Vehicles (BEVs)', 'Non-Electric Vehicle Total'),
]
# Calculate and print the correlation for each pair
for pair in pairs:
    corr = df[pair[0]].corr(df[pair[1]])
    print(f'Correlation between {pair[0]} and {pair[1]}: {corr}')
Correlation between Battery Electric Vehicles (BEVs) and Plug-In Hybrid Electric Vehicles (PHEVs): 0.9813584261124682
Correlation between Plug-In Hybrid Electric Vehicles (PHEVs) and Non-Electric Vehicle Total: 0.8707133558457827
Correlation between Battery Electric Vehicles (BEVs) and Non-Electric Vehicle Total: 0.7798212378906964
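The values above come from `Series.corr`, which computes the Pearson coefficient by default. As a rough sketch of what that number means, the snippet below recomputes Pearson by hand on five BEV/PHEV pairs copied from the dataset preview and compares it with the built-in result. Since the counts are heavily skewed (the summary shows a BEV mean of about 217 against a median of 1), it also computes a rank-based correlation as Pearson on ranks; that comparison is a suggestion of this sketch, not part of the original analysis, and the `pearson` helper is hypothetical.

```python
import numpy as np
import pandas as pd

# Five BEV/PHEV pairs copied from the dataset preview
bevs = pd.Series([7, 1, 0, 744, 14], dtype=float)
phevs = pd.Series([0, 2, 1, 350, 2], dtype=float)

def pearson(x: pd.Series, y: pd.Series) -> float:
    """Pearson r: covariance of x and y divided by the product of their spreads."""
    xd = x - x.mean()
    yd = y - y.mean()
    return float((xd * yd).sum() / np.sqrt((xd ** 2).sum() * (yd ** 2).sum()))

manual = pearson(bevs, phevs)
builtin = bevs.corr(phevs)  # pandas default method is 'pearson'

# Rank-based (Spearman-style) correlation: Pearson applied to the ranks,
# which is less dominated by a single very large county
rank_based = bevs.rank().corr(phevs.rank())

print(np.isclose(manual, builtin))  # True
print(manual, rank_based)
```

On skewed count data, a large gap between the two coefficients would suggest that a few large counties drive the Pearson value.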
DISCUSSION
QUESTION 1 - Can the number of charging stations, average income, population density, and environmental policies in a county predict the adoption rate of electric vehicles (categorized as high, medium, or low)?
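Question 1 needs a categorical target. One possible way to derive it, sketched below, is to bin the Percent Electric Vehicles column with `pd.cut`. The cut points (1% and 5%) and the label names are hypothetical choices made for illustration; the sample percentages are taken from the dataset preview.

```python
import pandas as pd

# Percent Electric Vehicles values taken from the dataset preview
percent_ev = pd.Series([0.00, 0.58, 1.50, 3.03, 6.67, 9.09],
                       name="Percent Electric Vehicles")

# Hypothetical thresholds: [0, 1] -> low, (1, 5] -> medium, (5, 100] -> high
bins = [0, 1, 5, 100]
labels = ["low", "medium", "high"]

# include_lowest=True keeps exact zeros in the first bin
adoption = pd.cut(percent_ev, bins=bins, labels=labels, include_lowest=True)

print(adoption.tolist())
# ['low', 'low', 'medium', 'medium', 'high', 'high']
```

Quantile-based bins (`pd.qcut`) would be an alternative if balanced class sizes matter more than fixed thresholds.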
QUESTION 2 - Do factors such as average commute distance, public transportation availability, electric vehicle incentives, and average electricity cost in a county predict the percentage increase in electric vehicle registrations year over year?
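The target in Question 2, the year-over-year percentage increase in registrations, could be computed per county with `groupby` and `pct_change`. The sketch below uses made-up year-end totals for two placeholder counties ("A" and "B"); neither the numbers nor the `Year` and `YoY Percent Increase` column names come from the dataset.

```python
import pandas as pd

# Hypothetical year-end EV totals for two placeholder counties (not real data)
yearly = pd.DataFrame({
    "County": ["A", "A", "A", "B", "B", "B"],
    "Year": [2020, 2021, 2022, 2020, 2021, 2022],
    "Electric Vehicle (EV) Total": [100, 150, 225, 40, 50, 75],
})

# Sort so that each county's rows are in chronological order
yearly = yearly.sort_values(["County", "Year"]).reset_index(drop=True)

# pct_change within each county: (this year / previous year - 1), scaled to percent
yearly["YoY Percent Increase"] = (
    yearly.groupby("County")["Electric Vehicle (EV) Total"].pct_change() * 100
)

print(yearly)
# The first year of each county has no previous value, so its result is NaN;
# county A grows 50% in both later years, county B grows 25% and then 50%.
```

With the real file, the Date column would first need to be kept and reduced to one observation per county and year (for example the December snapshot).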