Identifying outliers

I am writing Python code to identify outliers. I have a dataset of 500 restaurant orders and their totals. I want to identify the outliers in the dataset, decide whether each one is a valid data point or an incorrect value, and then remove the invalid ones.

The problem is that I only have the total price of each order, the item names, and the quantities ordered. I was wondering whether it is possible to estimate the price per item mathematically.

Sample of my dataset:

{1215.5$: [('Shrimp', 10), ('Fish&Chips', 6), ('Salmon', 8), ('Pasta', 5)],
 1230.0$: [('Shrimp', 10), ('Salmon', 10), ('Fish&Chips', 8)],
 1234.0$: [('Salmon', 9), ('Fish&Chips', 3), ('Pasta', 8), ('Shrimp', 10)],
 1292.5$: [('Pasta', 7), ('Salmon', 9), ('Fish&Chips', 7), ('Shrimp', 9)],
 1301.5$: [('Pasta', 5), ('Shrimp', 9), ('Salmon', 8), ('Fish&Chips', 10)],
 1314.5$: [('Shrimp', 10), ('Pasta', 5), ('Fish&Chips', 10), ('Salmon', 7)],
 1343.5$: [('Shrimp', 8), ('Fish&Chips', 10), ('Salmon', 9), ('Pasta', 7)]}

If someone can recommend a method to do this mathematically, I can work on the code myself. Thank you.
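
One framing that seems to fit the data, offered as a rough sketch rather than a definitive method: if each item has a single fixed unit price, every order is a linear equation, total = sum of (quantity × unit price). With many more orders than items, the unit prices could be estimated by least squares, and orders whose totals disagree strongly with the fitted prices become outlier candidates. The code below assumes NumPy is available, assumes no discounts, taxes, or tips are folded into the totals, and stores the sample as a list of (total, items) pairs. The names `estimate_prices` and `flag_outliers`, and the parameters `k` and `tol`, are hypothetical and chosen only for illustration.

```python
import numpy as np

# Sample orders from the question, stored as (total, items) pairs.
orders = [
    (1215.5, [("Shrimp", 10), ("Fish&Chips", 6), ("Salmon", 8), ("Pasta", 5)]),
    (1230.0, [("Shrimp", 10), ("Salmon", 10), ("Fish&Chips", 8)]),
    (1234.0, [("Salmon", 9), ("Fish&Chips", 3), ("Pasta", 8), ("Shrimp", 10)]),
    (1292.5, [("Pasta", 7), ("Salmon", 9), ("Fish&Chips", 7), ("Shrimp", 9)]),
    (1301.5, [("Pasta", 5), ("Shrimp", 9), ("Salmon", 8), ("Fish&Chips", 10)]),
    (1314.5, [("Shrimp", 10), ("Pasta", 5), ("Fish&Chips", 10), ("Salmon", 7)]),
    (1343.5, [("Shrimp", 8), ("Fish&Chips", 10), ("Salmon", 9), ("Pasta", 7)]),
]


def estimate_prices(orders):
    """Fit one unit price per item by least squares (assumes fixed prices)."""
    items = sorted({name for _, lines in orders for name, _ in lines})
    col = {name: j for j, name in enumerate(items)}

    # A[i, j] = quantity of item j in order i; b[i] = total of order i.
    A = np.zeros((len(orders), len(items)))
    b = np.array([total for total, _ in orders], dtype=float)
    for i, (_, lines) in enumerate(orders):
        for name, qty in lines:
            A[i, col[name]] += qty

    prices, *_ = np.linalg.lstsq(A, b, rcond=None)
    residuals = b - A @ prices  # observed total minus fitted total
    return dict(zip(items, prices)), residuals


def flag_outliers(residuals, k=3.0, tol=0.01):
    """Flag residuals far from the median, using a MAD-based threshold.

    tol is an absolute floor so floating-point noise is never flagged.
    """
    med = np.median(residuals)
    mad = np.median(np.abs(residuals - med))
    threshold = max(k * 1.4826 * mad, tol)
    return np.abs(residuals - med) > threshold


prices, residuals = estimate_prices(orders)
outliers = flag_outliers(residuals)

for name, price in prices.items():
    print(f"{name}: {price:.2f}")
print("Flagged order indices:", np.flatnonzero(outliers).tolist())
```

Ordinary least squares is itself pulled around by bad rows, so if a noticeable share of the 500 totals is wrong, it may be worth refitting after dropping the flagged orders, or swapping in a robust regression, before trusting the estimated prices.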