Feature Crosses
In linear regression problems, some data doesn’t lend itself to a direct linear solution. There are many ways to handle such data, but one particularly valuable technique is the feature cross: creating new columns that combine existing features into data a linear model can work with.
For example: let’s suppose you have a data set where examples with negative x and y values OR positive x and y values correlate strongly with one label, while examples with one negative and one positive value correlate with the other label. By creating a feature cross for the value of x * y, you ensure all examples where x and y share a sign come out positive, and all examples where the signs differ come out negative. Suddenly, a straight line becomes a feasible way of separating the labels.
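Here is a minimal sketch of that idea, assuming synthetic data and scikit-learn’s LogisticRegression as the linear model (both are illustrative choices, not from the original):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical XOR-style data: same-sign (x, y) -> label 1,
# mixed-sign (x, y) -> label 0. Not linearly separable as-is.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)

# A linear model on the raw features struggles...
raw_acc = LogisticRegression().fit(X, y).score(X, y)

# ...but appending the cross x * y makes the labels linearly separable.
X_crossed = np.column_stack([X, X[:, 0] * X[:, 1]])
crossed_acc = LogisticRegression().fit(X_crossed, y).score(X_crossed, y)

print(f"raw features: {raw_acc:.2f}, with cross: {crossed_acc:.2f}")
```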
Feature Crosses and One-Hot Vectors
Feature crosses are particularly useful with one-hot vectors. For example: if you have a one-hot-style vector for the musical instruments a person plays (Guitar = 1, Flute = 0, Drums = 0, Vocals = 1, Bass = 0), and another for the types of music they play, crossing the two (one new feature per instrument/genre combination) lets you draw interesting conclusions about the type of music someone who plays guitar AND vocals is likely to perform, rather than relying on isolated data for guitar and vocals. In short: feature crossing is particularly helpful when used with one-hot vectors.
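A small sketch of such a cross, assuming a made-up instrument and genre vocabulary: the cross of two one-hot-style vectors can be formed as an outer product, one crossed feature per pair.

```python
import numpy as np

# Hypothetical vocabularies; the genre list is an assumption for illustration.
instruments = ["Guitar", "Flute", "Drums", "Vocals", "Bass"]
genres = ["Rock", "Jazz", "Folk"]

plays = np.array([1, 0, 0, 1, 0])    # plays guitar and vocals
performs = np.array([1, 0, 1])       # performs rock and folk

# The cross is the outer product: one feature per (instrument, genre) pair.
cross = np.outer(plays, performs).ravel()

# The active crossed features capture combinations like "guitar AND rock".
for idx in np.flatnonzero(cross):
    print(instruments[idx // len(genres)], "x", genres[idx % len(genres)])
```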
Feature Crosses and Weights
Note that feature crosses don’t have to be (and often shouldn’t be) used INSTEAD of the original feature data, but alongside it. Much like regular features, you will need to experiment with finding weights for feature crosses that most accurately fit the training data (and generalize to test data).
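A minimal sketch of that pattern, reusing the synthetic data from the earlier example: the raw features are kept, the cross is appended (here via scikit-learn’s PolynomialFeatures), and training assigns each column its own weight.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import PolynomialFeatures

# Same hypothetical same-sign/mixed-sign data as before.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(200, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)

# interaction_only=True appends the cross x1*x2 alongside x1 and x2,
# rather than replacing them.
X_full = PolynomialFeatures(degree=2, interaction_only=True,
                            include_bias=False).fit_transform(X)
model = LogisticRegression().fit(X_full, y)

# Inspect the learned weights: the cross term should dominate here.
print(dict(zip(["x1", "x2", "x1*x2"], model.coef_[0].round(2))))
```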