BACKGROUND
About
To find the 'best' places for a fun night out in NYC, I used 311 noise complaint data and the record of active liquor licenses in NYC, treating the licensed establishments with the most noise complaints as the liveliest. I used several data manipulation methods, including dictionaries, along with mapping and plotting tools for the visualizations.
To clean the data, I selected the columns and values of interest in both datasets.
Created a dictionary to convert each numeric zip code to its neighborhood name.
Created a function to apply this dictionary to each zip code in the data frame.
Created a dictionary to clean street names so that they appear the same in both data frames.
Merged the data frames where incident addresses match, producing a new data frame of places that occur in both original data frames.
Had the user input a neighborhood.
Counted how many times an establishment (bar/club/restaurant) occurs to find the top 5-10 places with the most noise complaints.
Created a function to clean latitude and longitude.
Using folium, created a map to plot the top 10 places to go out.
Using matplotlib, created a bar graph of the top 5 places to go out.
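The dictionary, matching, and counting steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it uses plain Python dictionaries and collections.Counter instead of data frames, and the lookup tables, the function names (neighborhood_for, clean_street, top_places), and the sample records are all assumptions made up for the example.

```python
from collections import Counter

# Hypothetical lookup tables; the project's real dictionaries are not shown.
ZIP_TO_NEIGHBORHOOD = {"10003": "East Village", "10014": "West Village"}
STREET_WORDS = {"ST": "STREET", "AVE": "AVENUE", "E": "EAST", "W": "WEST"}


def neighborhood_for(zip_code):
    """Map a zip code to a neighborhood name, or 'Unknown' if unmapped."""
    # str(...)[:5] also handles zip codes that were read as floats (10003.0).
    return ZIP_TO_NEIGHBORHOOD.get(str(zip_code).strip()[:5], "Unknown")


def clean_street(address):
    """Upper-case an address and expand abbreviations so both datasets agree."""
    words = address.upper().replace(".", "").split()
    return " ".join(STREET_WORDS.get(word, word) for word in words)


def top_places(complaints, licenses, neighborhood, n=10):
    """Count complaints per licensed premise within one neighborhood."""
    # Index licensed premises by cleaned address (stands in for the merge).
    name_by_address = {
        clean_street(lic["address"]): lic["name"] for lic in licenses
    }
    counts = Counter()
    for complaint in complaints:
        if neighborhood_for(complaint["zip"]) != neighborhood:
            continue
        name = name_by_address.get(clean_street(complaint["address"]))
        if name is not None:  # keep only addresses present in both datasets
            counts[name] += 1
    return counts.most_common(n)


# Made-up sample records, for illustration only.
licenses = [
    {"name": "Bar A", "address": "12 E 7th St"},
    {"name": "Club B", "address": "99 W 10 Street"},
]
complaints = [
    {"zip": "10003", "address": "12 EAST 7TH STREET"},
    {"zip": 10003.0, "address": "12 East 7th St."},
    {"zip": "10014", "address": "99 WEST 10 STREET"},
    {"zip": "10003", "address": "1 UNLICENSED AVE"},
]

print(top_places(complaints, licenses, "East Village"))  # [('Bar A', 2)]
```

Matching on cleaned address strings is fragile (unit numbers, typos, and alternate spellings will not match), so the real abbreviation dictionary would likely need to be larger than the four entries assumed here.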
TECHNIQUES
DATA + CITATIONS
This is the "Liquor Authority Current List of Active Licenses" dataset, which includes columns such as License Type Code (for example, OP, which stands for on premise), Country, Serial Number, License Class Code, Premise Name, Premise Address, Premise Zip code, and a few others.
https://data.cityofnewyork.us/Social-Services/311-Noise-Complaints/p5f6-bkga/data
This is the NYC OpenData dataset of 311 Noise Complaints, with columns such as Complaint Type (ex: Noise - Street/Sidewalk), Descriptor (ex: Loud Music/Party), Location Type (ex: Street/Sidewalk), Incident Zip, Incident Address, X, and many others.
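As a rough illustration of selecting the columns of interest from this dataset, the sketch below reads a CSV export with Python's csv module and keeps only the columns named above. The sample rows, the extra Unique Key column, and the select_columns helper are assumptions for the example; the project itself works with data frames.

```python
import csv
import io

# Columns of interest, taken from the dataset description above.
COLUMNS = [
    "Complaint Type",
    "Descriptor",
    "Location Type",
    "Incident Zip",
    "Incident Address",
]

# Made-up rows standing in for a CSV export of the 311 data.
SAMPLE = """Unique Key,Complaint Type,Descriptor,Location Type,Incident Zip,Incident Address
1,Noise - Street/Sidewalk,Loud Music/Party,Street/Sidewalk,10003,12 EAST 7 STREET
2,Noise - Street/Sidewalk,Loud Music/Party,Street/Sidewalk,10003,
"""


def select_columns(csv_file, columns=COLUMNS):
    """Keep only the chosen columns, dropping rows with no incident address."""
    rows = []
    for row in csv.DictReader(csv_file):
        if not row["Incident Address"].strip():
            continue  # rows without an address cannot be matched to a license
        rows.append({col: row[col] for col in columns})
    return rows


rows = select_columns(io.StringIO(SAMPLE))
print(len(rows), rows[0]["Incident Address"])
```

To run this against a downloaded export, an open file object would be passed in place of io.StringIO(SAMPLE).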