Further Readings
Transformations
The transformation process can be used to achieve normal distribution behaviour in a set that does not show it naturally. These ways are supported by the Wolta library:
name |
core operation |
calculation |
|---|---|---|
log |
natural log |
ln(A) |
log-m |
natural log |
ln(A + m) |
log2 |
base 2 log |
log_2(A) |
log2-m |
base 2 log |
log_2(A + m) |
log10 |
common log |
log(A) |
log10-m |
common log |
log(A + m) |
sqrt |
square root |
A^0.5 |
sqrt-m |
square root |
(A + m)^0.5 |
cbrt |
cube root |
A^(1/3) |
Attention
‘A’ stands for a set, m stands for the smallest member of the set A.
Note
This process is reversible.
Categorization With Normal Distribution
If the set has a normal distribution, it can be split into three groups. In order to do that min, max, standard deviation and arithmetic mean should be calculated.
class |
interval |
|---|---|
0 |
min <= y <= mean - std |
1 |
mean - std < y < mean + std |
2 |
mean + std <= y <= max |
Welkin Classification
Welkin classification is designed for prediction on datasets that have discrete value distributions in features for each class. It determines the intervals in every feature for each class independently by calculating the minimum and maximum values.
Attention
In the prediction phase, it may search the most fitted class for a test sample (strategy=’travel’) or when a class has enough coherence, it can be accepted directly as a prediction (strategy=’limit’). Limit strategy is less time consuming than travel strategy.
Attention
Also, there is no need to use all the features for training and predicting, only requested features can be used thanks to the priority parameter.
Dist Regression
Dist Regressor comes up with the idea that the sets with smaller ranges may cause more accurate predictions for regression algorithms. If the output set has a normal distribution then the algorithm splits it into three classes (for more information about it read Categorization With Normal Distribution article) and then trains regression models individually. After that, in the prediction phase, first it uses classification and detects the class then what regression algorithm will be used and finally according to that makes a prediction.
Note
By using random under sampler, classes can be balanced for classification.
Classification with Commune Technique
This technique follows these steps:
Train the models.
Find the algorithm that gives the best accuracy score for the dataset. It is called a ‘general model’.
Find the algorithm that gives the best accuracy score for each class independently. Each one of them is called a ‘submodel.’’
Collect the samples from the validation set that do not have the same result in general and submodel. Predict them again with all algorithms and find the best result given one. It is called an ‘instead model’ and it is used in situations like this in the prediction phase.