Further Readings

Transformations

The transformation process can be used to achieve normal distribution behaviour in a set that does not show it naturally. These ways are supported by the Wolta library:

name

core operation

calculation

log

natural log

ln(A)

log-m

natural log

ln(A + m)

log2

base 2 log

log_2(A)

log2-m

base 2 log

log_2(A + m)

log10

common log

log(A)

log10-m

common log

log(A + m)

sqrt

square root

A^0.5

sqrt-m

square root

(A + m)^0.5

cbrt

cube root

A^(1/3)

Attention

‘A’ stands for a set, m stands for the smallest member of the set A.

Note

This process is reversible.

Categorization With Normal Distribution

If the set has a normal distribution, it can be split into three groups. In order to do that min, max, standard deviation and arithmetic mean should be calculated.

class

interval

0

min <= y <= mean - std

1

mean - std < y < mean + std

2

mean + std <= y <= max

Welkin Classification

Welkin classification is designed for prediction on datasets that have discrete value distributions in features for each class. It determines the intervals in every feature for each class independently by calculating the minimum and maximum values.

Attention

In the prediction phase, it may search the most fitted class for a test sample (strategy=’travel’) or when a class has enough coherence, it can be accepted directly as a prediction (strategy=’limit’). Limit strategy is less time consuming than travel strategy.

Attention

Also, there is no need to use all the features for training and predicting, only requested features can be used thanks to the priority parameter.

Dist Regression

Dist Regressor comes up with the idea that the sets with smaller ranges may cause more accurate predictions for regression algorithms. If the output set has a normal distribution then the algorithm splits it into three classes (for more information about it read Categorization With Normal Distribution article) and then trains regression models individually. After that, in the prediction phase, first it uses classification and detects the class then what regression algorithm will be used and finally according to that makes a prediction.

Note

By using random under sampler, classes can be balanced for classification.

Classification with Commune Technique

This technique follows these steps:

  1. Train the models.

  2. Find the algorithm that gives the best accuracy score for the dataset. It is called a ‘general model’.

  3. Find the algorithm that gives the best accuracy score for each class independently. Each one of them is called a ‘submodel.’’

  4. Collect the samples from the validation set that do not have the same result in general and submodel. Predict them again with all algorithms and find the best result given one. It is called an ‘instead model’ and it is used in situations like this in the prediction phase.