BACKGROUND

About

To find the 'best' places for a fun night out in NYC, I combined 311 noise complaint data with the record of active liquor licenses in NYC. I used several data manipulation methods, including dictionaries, along with mapping and plotting tools for the visualizations.

To clean the data, I selected the columns and values of interest in both datasets.
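As a rough sketch of this step, column selection and value filtering in pandas might look like the following. The column names are taken from the 311 dataset description. The inline rows, the `Unused Column` name, the choice of columns to keep, and the `Loud Music/Party` filter are assumptions for illustration, not necessarily the selections used in this project.

```python
import pandas as pd

# Tiny stand-in for the 311 data (made-up rows, for illustration only).
complaints = pd.DataFrame({
    "Complaint Type": ["Noise - Street/Sidewalk", "Noise - Commercial", "Noise - Commercial"],
    "Descriptor": ["Loud Talking", "Loud Music/Party", "Loud Music/Party"],
    "Incident Zip": ["10002", "10003", "11211"],
    "Incident Address": ["1 EXAMPLE ST", "2 EXAMPLE AVE", "3 EXAMPLE ST"],
    "Unused Column": [1, 2, 3],
})

# Keep only the columns of interest (assumed selection).
columns_of_interest = ["Descriptor", "Incident Zip", "Incident Address"]
complaints = complaints[columns_of_interest]

# Keep only the rows with the value of interest (assumed filter).
complaints = complaints[complaints["Descriptor"] == "Loud Music/Party"].reset_index(drop=True)
```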

Created a dictionary to convert the numeric value of each zip-code to its respective neighborhood name.

Created a function to apply this dictionary to each zip-code in the data frame.
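The dictionary and lookup function described above could be sketched roughly as follows. The three entries and the names `ZIP_TO_NEIGHBORHOOD` and `zip_to_neighborhood` are hypothetical; the project's actual dictionary covers far more zip-codes and may group them differently.

```python
# Hypothetical, partial mapping for illustration only.
ZIP_TO_NEIGHBORHOOD = {
    10002: "Lower East Side",
    10003: "East Village",
    11211: "Williamsburg",
}

def zip_to_neighborhood(zip_code):
    """Return the neighborhood name for a zip-code, or "Unknown" if unmapped or invalid."""
    try:
        # Accept ints, floats such as 10002.0, and strings such as "10002".
        key = int(float(zip_code))
    except (TypeError, ValueError, OverflowError):
        return "Unknown"
    return ZIP_TO_NEIGHBORHOOD.get(key, "Unknown")
```

With a pandas data frame, a function like this could be applied to a whole column, for example `df["Neighborhood"] = df["Incident Zip"].apply(zip_to_neighborhood)`, assuming the zip-codes live in a column named `Incident Zip`.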

Created a dictionary to clean street names so that they appear the same in both data frames.
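One way such a street-name dictionary might work is shown below. The abbreviation table and the `clean_address` helper are assumptions made for this sketch; the real dictionary depends on how each dataset actually spells its addresses.

```python
import re

# Hypothetical abbreviation dictionary; the project's real one may differ.
STREET_ABBREVIATIONS = {
    "STREET": "ST",
    "AVENUE": "AVE",
    "BOULEVARD": "BLVD",
    "PLACE": "PL",
    "EAST": "E",
    "WEST": "W",
}

def clean_address(address):
    """Uppercase, strip punctuation, collapse spaces, and abbreviate street words."""
    text = re.sub(r"[^\w\s]", "", str(address).upper())
    words = [STREET_ABBREVIATIONS.get(word, word) for word in text.split()]
    return " ".join(words)
```

Running both address columns through the same function means that "123 East 7th Street" and "123 E 7TH ST" end up as the same string, which is what an exact-match merge needs.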

Merged the data frames where the incident addresses match, producing a new data frame of places that occur in both original data frames.
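A minimal sketch of this merge, assuming the join keys are the `Incident Address` column of the 311 data and the `Premise Address` column of the license data (the rows below are made up):

```python
import pandas as pd

# Made-up rows standing in for the two cleaned data frames.
licenses = pd.DataFrame({
    "Premise Name": ["BAR ONE", "CLUB TWO", "CAFE THREE"],
    "Premise Address": ["1 EXAMPLE ST", "2 EXAMPLE AVE", "3 EXAMPLE BLVD"],
})
complaints = pd.DataFrame({
    "Incident Address": ["1 EXAMPLE ST", "1 EXAMPLE ST", "2 EXAMPLE AVE", "9 OTHER ST"],
    "Descriptor": ["Loud Music/Party"] * 4,
})

# An inner join keeps only addresses present in both frames;
# each complaint at a licensed address becomes one row.
merged = complaints.merge(
    licenses,
    how="inner",
    left_on="Incident Address",
    right_on="Premise Address",
)
```

Because the join is an exact string match, it only finds pairs whose addresses were normalized to the same form beforehand.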

Had the user input a neighborhood.

Counted how many times an establishment (bar/club/restaurant) occurs to find the top 5-10 places with the most noise complaints.
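The neighborhood filter and the count could be combined roughly as follows. This assumes the merged data frame has one row per complaint with `Premise Name` and `Neighborhood` columns; the `top_places` helper and the sample rows are hypothetical.

```python
import pandas as pd

# Made-up merged data: one row per noise complaint at a licensed address.
merged = pd.DataFrame({
    "Premise Name": ["BAR ONE", "BAR ONE", "BAR ONE", "CLUB TWO", "CLUB TWO", "CAFE THREE"],
    "Neighborhood": ["East Village"] * 5 + ["Williamsburg"],
})

def top_places(df, neighborhood, n=5):
    """Return the n establishments with the most complaint rows in a neighborhood."""
    in_area = df[df["Neighborhood"].str.lower() == neighborhood.strip().lower()]
    return in_area["Premise Name"].value_counts().head(n)

# In an interactive script the neighborhood could come from input();
# it is hard-coded here so the sketch runs unattended.
ranking = top_places(merged, "east village")
```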

Created a function to clean latitude and longitude.
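What "cleaning" means here is not spelled out, so the sketch below makes an assumption: values are converted to floats, and anything missing, non-numeric, or outside the valid coordinate range is rejected. The function names are hypothetical.

```python
def clean_coordinate(value, lower, upper):
    """Convert value to float; return None if missing, non-numeric, or out of range."""
    try:
        number = float(value)
    except (TypeError, ValueError):
        return None
    # NaN fails both comparisons, so it is rejected here as well.
    if lower <= number <= upper:
        return number
    return None

def clean_lat_lon(latitude, longitude):
    """Return (lat, lon) as floats, or None if either value is unusable."""
    lat = clean_coordinate(latitude, -90.0, 90.0)
    lon = clean_coordinate(longitude, -180.0, 180.0)
    if lat is None or lon is None:
        return None
    return lat, lon
```

Returning `None` for bad pairs makes it easy to drop those rows before plotting, since a map marker needs both coordinates.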

Using folium, created a map to plot the top 10 places to go out.

Using matplotlib, created a bar graph to find the top 5 places to go out.
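The bar graph could be produced along these lines. The names and counts are made up, and the `Agg` backend is chosen only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up complaint counts for five hypothetical places.
names = ["BAR ONE", "CLUB TWO", "CAFE THREE", "PUB FOUR", "LOUNGE FIVE"]
counts = [42, 35, 28, 19, 11]

fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(names, counts)
ax.set_xlabel("Establishment")
ax.set_ylabel("Noise complaints")
ax.set_title("Top 5 places by noise complaints")
fig.tight_layout()
# fig.savefig("top5.png")  # or plt.show() in an interactive session
```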

TECHNIQUES

DATA + CITATIONS

https://data.ny.gov/Economic-Development/Liquor-Authority-Current-List-of-Active-Licenses/hrvs-fxs2/data

This is the dataset of "The Liquor Authority Current List of Active Licenses", which includes many columns such as License Type Code (for example, OP, which stands for on premise), Country, Serial Number, License Class Code, Premise Name, Premise Address, Premise Zip code, and a few others.

https://data.cityofnewyork.us/Social-Services/311-Noise-Complaints/p5f6-bkga/data

This is the NYC OpenData dataset of 311 Noise Complaints, with multiple columns such as Complaint Type (ex: Noise - Street/Sidewalk), Descriptor (ex: Loud Music/Party), Location Type (ex: Street/Sidewalk), Incident Zip, Incident Address, X and many others.