For questions related to feature engineering, which is the process of using domain knowledge to extract features from raw data via data mining techniques.
Questions tagged [feature-engineering]
36 questions
4
votes
2 answers
When is it necessary to manually extract features to feed into the neural network rather than providing raw data?
Usually, neural networks use raw data; you do not need to extract features manually. NNs can find and extract good features, which are the patterns in an image, a signal, or any other kind of data. When we check layer outputs in a NN, we can see and visualize how…
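As a minimal sketch of what inspecting intermediate layer outputs can look like (assuming TensorFlow/Keras; the tiny model and the layer name "conv1" are purely illustrative), one can build a second model that exposes a convolutional layer's feature maps:

import numpy as np
import tensorflow as tf

inputs = tf.keras.Input(shape=(28, 28, 1))
x = tf.keras.layers.Conv2D(8, 3, activation="relu", name="conv1")(inputs)
x = tf.keras.layers.Flatten()(x)
outputs = tf.keras.layers.Dense(10, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

# A second model that exposes the feature maps learned by the first conv layer.
feature_extractor = tf.keras.Model(inputs, model.get_layer("conv1").output)

image = np.random.rand(1, 28, 28, 1).astype("float32")  # placeholder input
feature_maps = feature_extractor(image)                  # shape (1, 26, 26, 8)
print(feature_maps.shape)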

dasmehdix
- 257
- 1
- 8
3
votes
1 answer
Is automated feature engineering a path to general AI?
I recently came across the featuretools package, which facilitates automated feature engineering. Here's an explanation of the package:
https://towardsdatascience.com/automated-feature-engineering-in-python-99baf11cc219
Automated feature…

SuperCodeBrah
- 273
- 1
- 11
3
votes
1 answer
How to perform prediction when some features have missing values?
Sorry if this is too much of a noob question; I'm just a beginner.
I have a data set with companies' info. There are 2 kinds of features: financial (revenue and so on) and general info (like the number of employees and date of registration).
I have to predict…
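A common baseline, sketched below under the assumption of purely numeric features (the column meanings are hypothetical), is to impute missing values with a simple statistic such as the median and keep an indicator of where values were missing:

import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([
    [1_000_000.0, 50.0],     # revenue, number of employees (hypothetical columns)
    [np.nan,      12.0],     # revenue missing
    [250_000.0,   np.nan],   # employee count missing
])

# Median imputation plus binary "was missing" indicator columns.
imputer = SimpleImputer(strategy="median", add_indicator=True)
X_imputed = imputer.fit_transform(X)
print(X_imputed)
# columns: revenue, employees, revenue_was_missing, employees_was_missing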

Denis Ka
- 31
- 1
2
votes
2 answers
Are derived or computed inputs bad for CNNs?
I am building a CNN and am wondering whether derived or computed inputs are generally bad for the effectiveness of CNNs, or of NNs in general?
By derived or computed values I mean data that is not "raw" and instead is computed based on the…

NullFucksException
- 21
- 2
2
votes
0 answers
Is it a good practice to split sparse from dense features?
I have a mixture of real (float) and categorical features to use as input in a neural network. I encode the categorical features using one-hot / multi-hot encoding.
If I want to use all the features as input what is usually/empirically the best…
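One option, sketched below with Keras under assumed feature counts (all layer sizes are illustrative, not a recommendation), is to feed the dense floats and the one-hot/multi-hot block through separate input branches and concatenate them afterwards:

import tensorflow as tf

n_dense, n_sparse = 10, 200  # hypothetical feature counts

dense_in = tf.keras.Input(shape=(n_dense,), name="dense_features")
sparse_in = tf.keras.Input(shape=(n_sparse,), name="onehot_features")

# Project the high-dimensional sparse block down before mixing it with the
# dense block; the sizes here are illustrative only.
sparse_proj = tf.keras.layers.Dense(16, activation="relu")(sparse_in)
merged = tf.keras.layers.Concatenate()([dense_in, sparse_proj])
hidden = tf.keras.layers.Dense(32, activation="relu")(merged)
output = tf.keras.layers.Dense(1, activation="sigmoid")(hidden)

model = tf.keras.Model([dense_in, sparse_in], output)
model.compile(optimizer="adam", loss="binary_crossentropy")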

Michael
- 159
- 5
2
votes
1 answer
What can be an example of the prior knowledge used in deep learning systems?
It is known that machine learning algorithms expect feature engineering as an initial step. Now, consider the following paragraph, taken from section 1.1, The deep learning revolution, of the textbook Deep Learning with PyTorch by Eli Stevens, Luca…

hanugm
- 3,571
- 3
- 18
- 50
2
votes
1 answer
How does a decision tree split a continuous feature?
Decision trees learn by measuring the quality of a split with some function; applying this to all features gives the best feature to split on.
However, with a continuous feature it becomes problematic because there are an infinite number of…
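The usual answer is that only a finite number of thresholds matter: the midpoints between consecutive distinct observed values. A minimal NumPy sketch of that idea (using Gini impurity; the toy data is made up):

import numpy as np

def gini(labels):
    """Gini impurity of a label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_threshold(x, y):
    """Return the threshold on feature x that minimises weighted Gini impurity."""
    order = np.argsort(x)
    x, y = x[order], y[order]
    uniq = np.unique(x)
    candidates = (uniq[:-1] + uniq[1:]) / 2.0  # midpoints between distinct values
    best_t, best_score = None, np.inf
    for t in candidates:
        left, right = y[x <= t], y[x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

x = np.array([2.0, 3.5, 1.0, 7.2, 6.8, 4.1])
y = np.array([0, 0, 0, 1, 1, 1])
print(best_threshold(x, y))  # roughly (3.8, 0.0) for this toy data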

Recessive
- 1,346
- 8
- 21
2
votes
1 answer
Feeding a CNN the FFT of an image: a dumb idea?
My dataset consists of about 40,000 200x200px grayscale images of centered blobs bathed in noise, with occasional artifacts like stripes, other blobs of different shapes and sizes, fuzzy speckles and so on in their neighborhood.
They are used in a…
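One way such frequency-domain information could be supplied, sketched below with NumPy only (the blob image here is a random placeholder), is to stack the log-magnitude of the centered 2D FFT as an extra input channel:

import numpy as np

def add_fft_channel(image):
    """image: (H, W) float array -> (H, W, 2) array of [raw, log|FFT|]."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))   # centre the low frequencies
    log_mag = np.log1p(np.abs(spectrum))             # compress the dynamic range
    log_mag = (log_mag - log_mag.min()) / (log_mag.max() - log_mag.min() + 1e-8)
    return np.stack([image, log_mag], axis=-1)

blob = np.random.rand(200, 200)       # placeholder for a 200x200 grayscale blob
print(add_fft_channel(blob).shape)    # (200, 200, 2)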

Dumitrescu Calin
- 21
- 1
2
votes
2 answers
Is feature engineering an important step for a deep learning approach?
I'd like to ask you if feature engineering is an important step for a deep learning approach.
By feature engineering I mean some advanced preprocessing steps, such as looking at histogram distributions and trying to make them look like a normal…
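As one concrete example of such a step, the sketch below applies a power transform that reshapes a skewed feature so its histogram looks roughly Gaussian (scikit-learn's PowerTransformer; the data is synthetic):

import numpy as np
from sklearn.preprocessing import PowerTransformer

rng = np.random.default_rng(0)
skewed = rng.exponential(scale=2.0, size=(1000, 1))  # heavily right-skewed feature

# Yeo-Johnson transform, then standardize to zero mean and unit variance.
pt = PowerTransformer(method="yeo-johnson", standardize=True)
transformed = pt.fit_transform(skewed)

print(skewed.mean(), skewed.std())            # roughly 2.0, 2.0
print(transformed.mean(), transformed.std())  # roughly 0.0, 1.0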

Daviiid
- 563
- 3
- 15
2
votes
0 answers
How to find good features for a linear function approximation in RL with large discrete state set?
I've recently read much about feature engineering in continuous (uncountable) feature spaces. Now I am interested in what methods exist in the setting of large discrete state spaces. For example, consider a board game with a grid as the basic layout. Each…
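One simple hand-crafted feature map for a grid board, sketched below for a made-up game with three piece types, is a flattened one-hot encoding with one indicator per (cell, piece type), which a linear value function can then weight:

import numpy as np

N_PIECE_TYPES = 3  # e.g. empty, own piece, opponent piece (hypothetical)

def board_features(board):
    """board: (rows, cols) int array of piece ids -> flat binary feature vector."""
    rows, cols = board.shape
    features = np.zeros((rows, cols, N_PIECE_TYPES), dtype=np.float32)
    for piece in range(N_PIECE_TYPES):
        features[:, :, piece] = (board == piece)
    return features.ravel()

def value(board, weights):
    """Linear value function approximation: v(s) = w . phi(s)."""
    return board_features(board) @ weights

board = np.array([[0, 1, 2],
                  [1, 0, 0],
                  [2, 2, 1]])
weights = np.zeros(board.size * N_PIECE_TYPES)
print(board_features(board).shape, value(board, weights))  # (27,) 0.0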

s1624210
- 121
- 1
2
votes
2 answers
Why are decision trees and random forests scale invariant?
Feature scaling, in general, is an important stage in the data preprocessing pipeline.
Decision tree and random forest algorithms, though, are scale-invariant, i.e. they work fine without feature scaling. Why is that?
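A small demonstration of the intuition (scikit-learn; synthetic data): splits compare a feature against a threshold, so a strictly monotonic rescaling only moves the thresholds and leaves the learned partition, and hence the predictions, unchanged:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

X_scaled = X * np.array([1000.0, 0.001, 1.0])  # wildly different feature scales

tree_raw = DecisionTreeClassifier(random_state=0).fit(X, y)
tree_scaled = DecisionTreeClassifier(random_state=0).fit(X_scaled, y)

# Same tree structure up to rescaled thresholds, hence identical predictions.
print(np.array_equal(tree_raw.predict(X), tree_scaled.predict(X_scaled)))  # True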

stoic-santiago
- 1,121
- 5
- 18
2
votes
0 answers
Visualisation for Features to Predict Timeseries Data
I have a course assignment to use an LSTM to predict the movement directions of stock prices. One of the things I am asked to do is provide a visualization to compare the predictive powers of a set of N features (e.g. 1-day return, volatility,…

georgi koyrushki
- 21
- 1
2
votes
0 answers
How to feed key-value features (aggregated data) to LSTM?
I have the following time-series aggregated input for an LSTM-based model:
x(0): {y(0,0): {a(0,0), b(0,0)}, y(0,1): {a(0,1), b(0,1)}, ..., y(0,n): {a(0,n), b(0,n)}}
x(1): {y(1,0): {a(1,0), b(1,0)}, y(1,1): {a(1,1), b(1,1)}, ..., y(1,n): {a(1,n),…
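One possible way to turn such per-timestep key-value aggregates into LSTM input, sketched below with made-up keys and values, is to fix an ordered key vocabulary and lay out each key's (a, b) pair at a fixed position, zero-padding missing keys:

import numpy as np

key_vocab = ["y0", "y1", "y2"]                # fixed ordering of keys

series = [                                    # x(0), x(1), ... as dicts
    {"y0": (1.0, 2.0), "y2": (0.5, 0.1)},
    {"y1": (3.0, 1.0)},
]

def vectorise(series, key_vocab):
    """Return a (timesteps, n_keys * 2) array ready for an LSTM."""
    out = np.zeros((len(series), len(key_vocab) * 2), dtype=np.float32)
    for t, step in enumerate(series):
        for k, key in enumerate(key_vocab):
            if key in step:
                out[t, 2 * k: 2 * k + 2] = step[key]
    return out

print(vectorise(series, key_vocab))
# [[1.  2.  0.  0.  0.5 0.1]
#  [0.  0.  3.  1.  0.  0. ]]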

Maximus
- 121
- 3
1
vote
0 answers
Best feature engineering approach for interest-based age classification
I have a dataset which has users (rows) with the list of their interests (IABs), which looks like this
user_id | gender | list of interests
--------+--------+--------------------------------
user 1 | male | games, productivity
user 2 | female |…
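A common starting point for list-of-interests data, sketched below with scikit-learn (the second user's interests are invented for illustration), is a multi-hot encoding with one binary column per interest category:

from sklearn.preprocessing import MultiLabelBinarizer

interests = [
    ["games", "productivity"],   # user 1
    ["fashion", "travel"],       # user 2 (hypothetical values)
]

# One binary column per distinct interest, in alphabetical order.
mlb = MultiLabelBinarizer()
X = mlb.fit_transform(interests)
print(mlb.classes_)  # ['fashion' 'games' 'productivity' 'travel']
print(X)             # [[0 1 1 0]
                     #  [1 0 0 1]]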

theodre7
- 11
- 2
1
vote
1 answer
Features for a Content-Based recommendation system
I'm working on a hybrid recommendation system (collaborative and content-based) for an online ordering/shopping app. So far I've managed to identify a data-source for the collaborative model (likely item-based) but I'm having trouble deciding on…

S_Khan
- 11
- 3