What is the best clustering method to detect anomalies in mostly categorical data?

Asked Jun 05 '21 at 16:23

Active Jun 06 '21 at 02:24

Viewed 20 times

I have a dataset with about 85 columns, of which more than 70 are categorical. There is no target column, so my goal is to identify the outliers in this dataset with clustering methods.

What is the best way to approach this? Is it advisable to convert all 70+ categorical columns to dummy variables with pandas and then use a clustering algorithm such as DBSCAN?
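To make the idea concrete, here is a minimal sketch of the dummy-encoding plus DBSCAN approach described above. The toy data, the column names, and the `eps` and `min_samples` values are all assumptions chosen for illustration, and the default Euclidean distance on 0/1 columns is only one possible choice, not a recommendation.

```python
import pandas as pd
from sklearn.cluster import DBSCAN

# Hypothetical toy data: two common category combinations plus one rare row.
df = pd.DataFrame({
    "color": ["red"] * 50 + ["blue"] * 50 + ["green"],
    "size": ["S"] * 50 + ["M"] * 50 + ["XL"],
})

# One-hot encode every categorical column into 0/1 indicator columns.
X = pd.get_dummies(df).to_numpy(dtype=float)

# eps and min_samples are illustrative guesses, not tuned values.
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)

# DBSCAN assigns the label -1 to points that belong to no cluster.
outliers = df[labels == -1]
print(outliers)
```

In this sketch the rows labelled `-1` are treated as outlier candidates. With 70+ categorical columns the encoded matrix can become very wide, so `eps` would likely need tuning on the real data.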

edited Jun 06 '21 at 02:24

nbro

asked Jun 05 '21 at 16:23

user13074756

0 Answers