
I have a dataset with about 85 columns, more than 70 of which are categorical. Since I do not have a target column, my goal is to identify the outliers in this dataset using clustering methods.

What is the best way to approach this? Is it advisable to convert all 70+ categorical columns to dummy variables (one-hot encoding) in pandas and then run a clustering algorithm like DBSCAN?
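For concreteness, here is a minimal sketch of the approach being asked about: one-hot encode with `pandas.get_dummies`, run scikit-learn's `DBSCAN`, and treat the rows it labels as noise (`-1`) as outlier candidates. The function name `flag_outliers_dbscan`, the toy data, and the `eps` and `min_samples` values are all hypothetical and chosen only for illustration. It also assumes plain Euclidean distance on the encoded columns, which may not be a good fit for 70+ one-hot-encoded features.

```python
import pandas as pd
from sklearn.cluster import DBSCAN


def flag_outliers_dbscan(df, eps=0.5, min_samples=5):
    """Return a boolean Series that is True where DBSCAN labels a row as noise."""
    # One-hot encode object/categorical columns; numeric columns pass through as-is
    # (in a real dataset they would probably need scaling first).
    encoded = pd.get_dummies(df, dtype=float)
    # DBSCAN assigns the label -1 to points that belong to no dense cluster.
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(encoded.to_numpy())
    return pd.Series(labels == -1, index=df.index, name="is_outlier")


# Hypothetical toy data: two dense groups plus one row with unseen categories.
toy = pd.DataFrame({
    "color": ["red"] * 20 + ["blue"] * 20 + ["green"],
    "size": ["S"] * 20 + ["L"] * 20 + ["XL"],
})

flags = flag_outliers_dbscan(toy, eps=0.5, min_samples=5)
print(toy[flags])  # rows flagged as outlier candidates
```

On this toy data only the last row (`green`, `XL`) ends up flagged, because it has no neighbours within `eps`. With real data, `eps` and `min_samples` would need tuning, and the choice of distance metric matters a great deal.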

