The Orange Envelope: Data Cleaning & Visualization

Pay the Meter!




You park your car, turn off the engine, and head to the grocery store. You only need a gallon of milk, so it'll be quick. You run into the store and grab the milk, but you did not take the long line into account. You finally pay for the milk and come back to your car. There is an orange envelope.


"NOOOOoooooOoooo"

Yep, you just got a ticket. And you think, "Why didn't I just pay the quarter?"
Well, if you are like me, you probably never thought an officer was even nearby. It's as if they were hiding around the corner, like a lion hunting its prey. And like the lion eating its prey, the officer eats at your wallet.

And so this got me thinking. Why me? Did the officer decide to check my car because it was black? Was it because I owned a Hyundai? Or was it because it was the end of the month and all the officers had to meet their quotas?


NYC-Parking Violation Issued Fiscal Year 2019
I decided to look into the parking violations issued in fiscal year 2019.
From those parking violations, I chose the meter violations.
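
A rough sketch of that selection step in plain Python. The field name `description` and the sample rows are made up for illustration, so the real column names and the code actually used for this post may well differ.

```python
def is_meter_violation(description):
    """True when the violation description mentions a meter (case-insensitive)."""
    return "meter" in description.lower()

# Hypothetical sample rows, not taken from the real dataset.
records = [
    {"description": "EXPIRED MUNI METER"},
    {"description": "NO PARKING-STREET CLEANING"},
    {"description": "FAIL TO DSPLY MUNI METER RECPT"},
]

# Keep only the rows whose description contains 'meter'.
meter_records = [r for r in records if is_meter_violation(r["description"])]
print(len(meter_records), "of", len(records), "sample rows are meter violations")
```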


The data shows the parking violations containing 'meter' and their respective fine amounts.
Good thing I got the ticket in Queens, because in Manhattan I would have had to pay $30 more.


Sorting and cleaning the data using the time:

To find out when most meter violations were given, the data was sorted by time of day, month, and day of the month.

The data shows that most tickets were given between 12 pm and 2 pm.
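
Here is a minimal sketch of how the hour of each ticket could be pulled out and counted. It assumes the time is stored as a five-character string such as `0130P` (hour, minute, then A or P), and the sample values are invented. Anything that does not fit that shape is dropped rather than guessed at.

```python
from collections import Counter

def parse_hour(raw):
    """Convert a string like '0130P' to an hour from 0 to 23, or None if malformed."""
    raw = raw.strip().upper()
    if len(raw) != 5 or not raw[:4].isdigit() or raw[4] not in "AP":
        return None
    hour, minute = int(raw[:2]), int(raw[2:4])
    if hour > 12 or minute > 59:
        return None
    hour %= 12            # 12xxA becomes hour 0, 12xxP becomes hour 12 below
    if raw[4] == "P":
        hour += 12
    return hour

# Hypothetical sample values, including one malformed entry.
sample_times = ["1215P", "0130P", "1245P", "0945A", "bad"]

hours = (parse_hour(t) for t in sample_times)
hour_counts = Counter(h for h in hours if h is not None)
print(hour_counts.most_common(1))
```

Counting by hour like this is what makes a lunchtime peak visible, whatever tool draws the final chart.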



There seem to be more meter violations on Wednesdays.
The number of tickets given was fairly consistent across months and days of the month.
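
The day-of-week count can be sketched the same way. This assumes the issue date is a `MM/DD/YYYY` string, and the three dates below are invented for the example.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def weekday_name(issue_date):
    """Map an 'MM/DD/YYYY' string to its English weekday name."""
    parsed = datetime.strptime(issue_date, "%m/%d/%Y")
    return DAY_NAMES[parsed.weekday()]  # weekday() returns 0 for Monday

# Hypothetical sample dates.
sample_dates = ["01/02/2019", "01/09/2019", "01/05/2019"]

tickets_per_day = Counter(weekday_name(d) for d in sample_dates)
print(tickets_per_day)
```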




Number of Meter Violations by each Borough



The greatest number of meter violations happened in Manhattan, which is not surprising. However, it was surprising that Staten Island had only 1% of the total number of meter violations.


Sorting and Cleaning Vehicle Make Column:

The vehicle make column was very messy, with around 162 unique values. The codes for some vehicles were not consistent. For example, HYUN and HYUND both refer to HYUNDAI, so they had to be fixed. Other codes were fixed to match their true descriptors.

A list of the top 15 leading car brands in 2019 was used to help clean the vehicle make column.
The values were matched to the top 15 leading car brands with the help of the fuzzywuzzy library. For example, similar values such as HYUN and HYUND were grouped together into HYUNDAI. The rest, which did not fall into the top 15 leading car brands, were labeled OTHER.
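
To give a feel for the grouping, here is a small sketch. It is not the fuzzywuzzy code used for this post: it swaps in `difflib` from the Python standard library so that it runs without extra installs. The brand list is a short illustrative subset, and the prefix shortcut and the 0.7 cutoff are assumptions of mine.

```python
import difflib

# Illustrative subset only, not the full top-15 list.
KNOWN_MAKES = ["HYUNDAI", "HONDA", "FORD", "TOYOTA"]

def normalize_make(raw, known=KNOWN_MAKES, cutoff=0.7):
    """Map a messy make code to a known brand, or 'OTHER' if nothing fits."""
    code = raw.strip().upper()
    # Truncated codes such as HYUN are usually a prefix of the full name.
    if len(code) >= 3:
        prefix_hits = [make for make in known if make.startswith(code)]
        if len(prefix_hits) == 1:
            return prefix_hits[0]
    # Otherwise fall back to approximate string matching.
    close = difflib.get_close_matches(code, known, n=1, cutoff=cutoff)
    return close[0] if close else "OTHER"

for code in ["HYUN", "HYUND", "TOYTA", "XYZ"]:
    print(code, "->", normalize_make(code))
```

Whatever matcher is used, it is worth eyeballing the results, since a loose cutoff can quietly merge two different brands.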


Ford had the highest share of meter-violation tickets with 14%, while Ram came in at 0%. Ram was listed among 2019's leading car brands, yet there were barely any Rams. In fact, out of 1.8 million tickets, only one went to a Ram.
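
The 0% is just rounding. A quick check, treating the 1.8 million as an approximate total:

```python
def share_percent(count, total):
    """Share of the total, expressed as a percentage."""
    return 100 * count / total

# One Ram ticket out of roughly 1.8 million (approximate total from the text).
ram_share = share_percent(1, 1_800_000)

print(f"{ram_share:.6f}%")  # unrounded share, to six decimal places
print(f"{ram_share:.0f}%")  # share rounded to a whole percent, as on a chart
```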


Sorting and Cleaning Vehicle Color Column:

Just like the vehicle make column, the vehicle color column had different color codes representing the same color. Similar colors, such as light blue and dark blue, were also grouped into a single category called blue.

'YW','YELLO','YELL' = 'YELLOW'

White cars were given the greatest number of meter tickets.

Sorting and Cleaning Vehicle Body-Type Column:
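
A lookup table is one simple way to express this kind of color cleanup. In the sketch below the yellow codes come from the example above, while the `LIGHT BLUE` and `DARK BLUE` spellings are placeholders, since the raw codes for those shades are not listed here.

```python
# Canonical color -> raw codes that should collapse into it.
COLOR_ALIASES = {
    "YELLOW": ["YW", "YELLO", "YELL"],
    "BLUE": ["LIGHT BLUE", "DARK BLUE"],  # placeholder spellings (assumption)
}

# Invert the table so each raw code points at its canonical color.
CODE_TO_COLOR = {
    alias: color
    for color, aliases in COLOR_ALIASES.items()
    for alias in aliases
}

def normalize_color(raw):
    """Uppercase, collapse whitespace, then map known aliases to one color."""
    code = " ".join(raw.upper().split())
    return CODE_TO_COLOR.get(code, code)

for raw in ["yw", " Dark  Blue ", "WHITE"]:
    print(repr(raw), "->", normalize_color(raw))
```
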

The vehicle body-type column was sorted into 5 categories: trucks, sedans, suburban, vans, and others. Others could be anything from buses to motorcycles to boats.

Suburban vehicles were given the greatest number of meter tickets.

Results:

More than one third of meter tickets were given to suburban vehicles and more than one third of tickets were given to white vehicles.

Meter tickets were given mostly from 12 pm to 2 pm. Is it because this is lunch time, and more people are out for lunch and forget to feed their meters? Or maybe more officers are out around this time giving out tickets? Or maybe both.

There seemed to be a pattern between the number of meter tickets given and the day of the week. The number increased from Monday to Wednesday, then decreased from Wednesday to Saturday, with Wednesday being the peak.


So what does this mean?

Well, nothing really, because as we know, correlation is not causation.
And you should just feed your meter!




