Questions tagged [data-mining]

The process of discovering patterns in large data sets using AI techniques.

An interdisciplinary subfield involving computer science, statistics, databases, and machine learning. It uses AI to extract information from large data sets and to transform it into formats better suited to the requirements at hand.

10 questions
8
votes
3 answers

What are the differences between machine learning, pattern recognition and data mining?

I know a little about these subjects. I found them similar to each other. Can anybody explain the differences between them?
3
votes
1 answer

Algorithm for seasonal trends

I have a very big table with lots of names and how often each is searched, by date. I would like to find trending patterns: when a name rises and when it falls, without knowing the name or the pattern beforehand. The rise could be during the…
JoergP
2
votes
1 answer

Mining repeated subsequences in a given sequence

Given an alphabet $I=\left\{i_1,i_2,\dots,i_n\right\}$ and a sequence $S=[e_1,e_2,\dots,e_m]$, where items $e_j \in I$, I am interested in finding every single pattern (subsequence of $S$) that appears in $S$ more than $N$ times, and that has a…
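As a rough illustration of the counting part of this problem, here is a minimal sketch that assumes "pattern" means a contiguous run of items and that overlapping occurrences each count. The excerpt is cut off before its remaining constraint, so none is assumed here; the function name and the `max_len` cap are hypothetical and exist only to keep the enumeration bounded.

```python
from collections import Counter


def frequent_contiguous_patterns(seq, min_count, max_len):
    """Return contiguous patterns of length 1..max_len that occur
    more than min_count times in seq (overlaps are counted)."""
    counts = Counter()
    n = len(seq)
    for length in range(1, max_len + 1):
        # Slide a window of the current length over the sequence.
        for start in range(n - length + 1):
            counts[tuple(seq[start:start + length])] += 1
    # Keep only patterns that appear strictly more than min_count times.
    return {p: c for p, c in counts.items() if c > min_count}


# Hypothetical example data: S over the alphabet {a, b, c}, with N = 1.
S = ["a", "b", "a", "b", "c", "a", "b"]
patterns = frequent_contiguous_patterns(S, min_count=1, max_len=3)
```

This brute-force enumeration costs roughly O(m · max_len) window extractions for a sequence of length m, so it only suits small inputs. Non-contiguous subsequences would need a different approach.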
2
votes
1 answer

How to define the "Pre-Processing" in machine learning?

Is every process (such as data acquisition, splitting the data for validation, data cleaning, or feature engineering) that is done on the data before we train the model always called the pre-processing part? Or are there some processes that are not…
2
votes
1 answer

How do I know if my dataset is ready for a machine learning model?

I am new to machine learning and neural networks. Currently, I'm taking some courses on Udemy and reading a book about it, but I still have one big question regarding data pre-processing. In all of those Udemy lessons, people always…
1
vote
0 answers

How to convert structured data from one format to another without specifying the structure

I have lots of text documents structured as { { Item1=[ {a1=1, a2=2, a3=3}, {a1=11, a2=22, a3=33}, {a1=41, a2=52, a3=63}, …
1
vote
0 answers

Is this a classification problem?

I’m not really sure which machine learning approach is best for my problem at hand. I work in an engineering company that designs and builds different kinds of ships. In my particular job, I collect the individual weight of items on these vessels.…
1
vote
1 answer

What data formats/pipelining are best to store and wrangle data which contains both text and float vectors?

Often in NLP projects the data points contain both text and float embeddings, which makes them very tricky to deal with. CSVs take up a ton of memory and are slow to load. But most of the other data formats seem to be meant for either pure text or pure…
1
vote
2 answers

What features should a dataset to predict monthly retail sales for a motorcycle spare parts shop have?

I am making an AI model to predict the monthly retail sales of a motorcycle spare parts shop. For that to be possible, I first have to create a dataset. The problem I am facing is deciding what features the dataset should have. I already did some research on…
Kaawya
0
votes
1 answer

What are some good papers or resources for aspect extraction and opinion modelling from video or audio?

I am quite new to deep learning. I just finished the Deep Learning Specialization by Professor Andrew Ng and DeepLearning.AI. Now, my professor (instructor) has advised me to look into some classic papers for aspect extraction and opinion mining…