“A” for Awful?

Introduction

Walking around NYC and looking at restaurants, I notice many more “A” inspection grades than B’s or C’s. I was curious to see what the distribution was and what correlation, if any, there was with Yelp reviews.

Data Sets

The New York City Restaurant Inspection Results data set was exported from Enigma.io.

This data is provided by the Department of Health and Mental Hygiene (DOHMH). The date range used was from 2013 to the start of April 2016. I filtered out pre-permit inspections, negative scores and malformed phone numbers (mainly those containing “_” and those with fewer than 7 digits). I then joined to the Yelp data on the condition that the restaurant had at least 25 Yelp reviews. In total I have a list of 10,114 restaurants in the NYC area.
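The filtering step can be sketched roughly as below. This is a minimal illustration, not the exact code I ran: the field names (`inspection_type`, `score`, `phone`) and the "pre-permit" label text are assumptions about how the export is laid out, and the sample rows are made up.

```python
import re

def has_valid_phone(phone) -> bool:
    """Reject phone values that contain "_" or hold fewer than 7 digits."""
    if phone is None or "_" in phone:
        return False
    return len(re.sub(r"\D", "", phone)) >= 7

def keep_inspection(row: dict) -> bool:
    """Drop pre-permit inspections, negative scores and malformed phones."""
    # Assumption: pre-permit visits are labelled in an inspection-type field.
    if "pre-permit" in row["inspection_type"].lower():
        return False
    if row["score"] < 0:
        return False
    return has_valid_phone(row["phone"])

# Made-up rows, for illustration only.
rows = [
    {"inspection_type": "Cycle Inspection / Initial Inspection", "score": 12, "phone": "2125550100"},
    {"inspection_type": "Pre-permit (Operational) / Initial Inspection", "score": 9, "phone": "2125550101"},
    {"inspection_type": "Cycle Inspection / Re-inspection", "score": -1, "phone": "2125550102"},
    {"inspection_type": "Cycle Inspection / Initial Inspection", "score": 20, "phone": "212_______"},
]
kept = [r for r in rows if keep_inspection(r)]
print(len(kept))
```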


ID Mapping

I am using the Business Phone column as my unique identifier to connect with the Yelp API’s Phone Search Protocol.
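To show what matching on phone number looks like, here is a small sketch. It assumes the Yelp results have already been fetched into a list of dictionaries with `phone`, `rating` and `review_count` keys, and it does not call the Yelp API itself. The helper names and the sample records are hypothetical.

```python
import re

def phone_key(raw: str) -> str:
    """Keep only digits and take the last 10, so '+1 (212) 555-0100' and
    '2125550100' produce the same key."""
    return re.sub(r"\D", "", raw)[-10:]

def join_on_phone(inspections, yelp_businesses, min_reviews=25):
    """Attach Yelp rating data to each inspection row that shares a phone key."""
    yelp_by_phone = {}
    for business in yelp_businesses:
        key = phone_key(business["phone"])
        if key:  # skip records with no usable digits
            yelp_by_phone[key] = business

    joined = []
    for row in inspections:
        match = yelp_by_phone.get(phone_key(row["phone"]))
        if match is not None and match["review_count"] >= min_reviews:
            joined.append({**row,
                           "yelp_rating": match["rating"],
                           "review_count": match["review_count"]})
    return joined

# Made-up records, for illustration only.
inspections = [{"name": "Cafe One", "phone": "2125550100", "score": 10},
               {"name": "Cafe Two", "phone": "2125550199", "score": 21}]
yelp = [{"phone": "+12125550100", "rating": 4.5, "review_count": 120},
        {"phone": "+12125550199", "rating": 3.0, "review_count": 8}]
joined = join_on_phone(inspections, yelp)
print(joined)
```

Only "Cafe One" survives here, because the second match has fewer than 25 reviews.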

 

Grade scale: “Restaurants with a score between 0 and 13 points earn an A, those with 14 to 27 points receive a B and those with 28 or more a C.”
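As code, the grade scale quoted above is a simple lookup. The function name is my own; the cutoffs come straight from the quote.

```python
def score_to_grade(score: int) -> str:
    """Map an inspection score to a letter grade: 0-13 A, 14-27 B, 28+ C."""
    if score < 0:
        raise ValueError("inspection scores are expected to be non-negative")
    if score <= 13:
        return "A"
    if score <= 27:
        return "B"
    return "C"

for s in (0, 13, 14, 27, 28):
    print(s, score_to_grade(s))
```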

 

Exploratory Analysis:

Health Inspection Distribution

 

We see a large step decrease across all boroughs between scores of 13 and 14, which is also the boundary between an A and a B. I was curious as to why this occurs, as the rest of the distribution is much smoother. Any restaurant initially falling short of an A gets a repeat visit within two to three weeks, which allows the restaurant to improve its food safety practices. This is very likely the reason for the strong right skew in the data.

Delta

The jump between an A and a B is highlighted above. There are about 3x as many scores of 13, the cutoff for an A, as there are scores of 14. This jump is not seen at the cutoff between a B and a C.
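A comparison like this can be computed by counting how often each score occurs on either side of a cutoff. The sketch below uses a tiny made-up list of scores purely to show the mechanics; the function name is hypothetical and the numbers are not from the real data.

```python
from collections import Counter

def cutoff_ratio(scores, last_score_in_grade):
    """Ratio of inspections scoring exactly at a grade's upper limit
    to those scoring one point above it."""
    counts = Counter(scores)
    at_limit = counts[last_score_in_grade]
    just_over = counts[last_score_in_grade + 1]
    if just_over == 0:
        raise ValueError("no inspections just above the cutoff")
    return at_limit / just_over

# Toy data, invented for illustration.
toy_scores = [13] * 6 + [14] * 2 + [27] * 3 + [28] * 3
print(cutoff_ratio(toy_scores, 13))  # A/B boundary: 3.0 on this toy data
print(cutoff_ratio(toy_scores, 27))  # B/C boundary: 1.0 on this toy data
```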

By Boro

Boro

The distributions across the boroughs are fairly similar, except for Staten Island, which over-indexes on restaurants with a C grade.

What surprised me most was what I learned from the “How we score and grade” document the DOH put out.

• A public health hazard, such as failing to keep food at the right temperature, triggers a minimum of 7 points. If the violation can’t be corrected before the inspection ends, the Health Department may close the restaurant until it’s fixed.

• A critical violation, for example, serving raw food such as a salad without properly washing it first, carries a minimum of 5 points.

So a restaurant can have mice and food at unsafe temperatures and still earn an A! Based on this, I really hope “Critical” violations will trigger a minimum of 14 points in the future.
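The arithmetic behind that claim is short enough to spell out. It assumes each violation is scored at its stated minimum, that nothing else is cited during the inspection, and that mice count as a critical violation.

```python
PUBLIC_HEALTH_HAZARD_MIN = 7   # e.g. food held at the wrong temperature
CRITICAL_VIOLATION_MIN = 5     # assumption: evidence of mice is scored here
A_GRADE_MAX = 13               # 0 to 13 points earns an A

total = PUBLIC_HEALTH_HAZARD_MIN + CRITICAL_VIOLATION_MIN
print(total, total <= A_GRADE_MAX)           # 12 True: still an A

# The change I am hoping for: a critical violation alone costs 14 points.
proposed_critical_min = 14
print(proposed_critical_min <= A_GRADE_MAX)  # False: no longer an A
```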

The correlation (a mutual relationship or connection between two or more things) between health inspection scores and Yelp ratings is weak and negative (-0.03). In other words, the higher the Yelp rating, the lower the health inspection score, and remember: the lower the score, the better. But the coefficient is so close to zero that it explains only about 0.1% of the variation (r squared is roughly 0.0009), and with a sample this large even a tiny correlation can pass a significance test, so the finding is not meaningful in practice.
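For anyone who wants to check a coefficient like this themselves, here is a plain-Python sketch of Pearson's r together with the usual t statistic for testing whether r differs from zero. The function names and the toy numbers are mine, and this is not necessarily how the figure above was produced.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length, at least 3 long")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def t_statistic(r, n):
    """t statistic for H0: no correlation, with n - 2 degrees of freedom."""
    if abs(r) >= 1:
        raise ValueError("t is undefined for a perfect correlation")
    return r * math.sqrt((n - 2) / (1 - r * r))

# Toy data, invented for illustration.
yelp_ratings = [4.5, 3.0, 4.0, 3.5, 5.0]
health_scores = [10, 12, 30, 9, 13]
print(pearson_r(yelp_ratings, health_scores))

# The rounded figures from this post, assuming all 10,114 restaurants were used.
print(t_statistic(-0.03, 10114), (-0.03) ** 2)
```

Plugging in the rounded figures gives a t of about -3.0 and an r squared of 0.0009: with this many restaurants even a tiny correlation produces a sizeable t, while the share of variance explained stays near 0.1%.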

 

I believe this is because it is a false assumption that higher Yelp ratings mean better health scores. Service, for example, is not reflected in health inspections, nor is price. A restaurant can be squeaky clean and still serve tasteless food. My takeaway was to create quadrants for different pairs of health inspection scores (lower is better) and Yelp ratings:

Q1 High Scores and High Ratings (Maybe best not to know)
Q2 Low Scores and High Ratings (Crème de la crème)
Q3 Low Scores and Low Ratings (Clean but bland)
Q4 High Scores and Low Ratings (Run don’t walk)
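Assigning a restaurant to a quadrant comes down to two threshold checks. The cutoffs in this sketch are placeholders I picked for illustration (a health score of 14, the A/B boundary, and a Yelp rating of 4.0), not necessarily the ones behind the chart.

```python
def quadrant(health_score, yelp_rating, score_cutoff=14, rating_cutoff=4.0):
    """Return Q1-Q4 for a (health score, Yelp rating) pair.

    Assumption: "high score" means health_score >= score_cutoff (worse), and
    "high rating" means yelp_rating >= rating_cutoff (better).
    """
    high_score = health_score >= score_cutoff
    high_rating = yelp_rating >= rating_cutoff
    if high_score and high_rating:
        return "Q1"  # Maybe best not to know
    if not high_score and high_rating:
        return "Q2"  # Crème de la crème
    if not high_score and not high_rating:
        return "Q3"  # Clean but bland
    return "Q4"      # Run don't walk

print(quadrant(30, 4.5), quadrant(4, 4.5), quadrant(4, 2.5), quadrant(30, 2.5))
```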

 

Quadrants

Some Q2 Listings (Score less than 5 so no mice):

  • Perfect Picnic NYC
  • D’Amore Winebar & Ristorante
  • Da Capo
  • Naturally Delicious
  • L’industrie Pizzeria
  • Aux Merveilleux De Fred
  • Peasant Stock

I am not including names of restaurants from other Quadrants, but the data is open, so please explore and let me know of any cool findings.

 

 
