Progressive Tools

This part of the library is designed for progressive learning.

make_run

It trains a series of models and evaluates each of them. The size of the training data grows with every model until it reaches the specified limit.

Parameters     Datatype                  Default Value
model_class    any AI model class
X_train        multidimensional array
y_train        1D array
X_test         multidimensional array
y_test         1D array
init_per       integer                   1
limit_per      integer                   100
increment      integer or float          1
metrics        list                      None
average        string                    weighted
params         dict                      None

Note

init_per must be less than limit_per.
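The constraint above can be illustrated with a small schedule generator. This is a minimal sketch, assuming the percentage starts at init_per and grows by increment for as long as it does not exceed limit_per. The helper name percentage_schedule and the exact stepping rule are assumptions made for illustration, not part of the library.

```python
def percentage_schedule(init_per=1, limit_per=100, increment=1):
    """Return the training-set percentages visited by a progressive run.

    Hypothetical helper: the stepping rule is an assumption, not the
    library's documented behaviour.
    """
    if init_per >= limit_per:
        raise ValueError("init_per must be less than limit_per")
    if increment <= 0:
        raise ValueError("increment must be positive")

    steps = []
    current = init_per
    while current <= limit_per:
        steps.append(current)
        current += increment
    return steps


print(percentage_schedule(10, 50, 10))  # [10, 20, 30, 40, 50]
```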

These are the valid keywords for metrics:

algo_type    metrics keyword    sklearn function
clf          acc                accuracy_score
clf          f1                 f1_score
clf          hamming            hamming_loss
clf          jaccard            jaccard_score
clf          log                log_loss
clf          mcc                matthews_corrcoef
clf          precision          precision_score
clf          recall             recall_score
clf          zol                zero_one_loss
reg          var                explained_variance_score
reg          max                max_error
reg          abs                mean_absolute_error
reg          sq                 mean_squared_error
reg          rsq                root_mean_squared_error
reg          log                mean_squared_log_error
reg          rlog               root_mean_squared_log_error
reg          medabs             median_absolute_error
reg          poisson            mean_poisson_deviance
reg          gamma              mean_gamma_deviance
reg          per                mean_absolute_percentage_error
reg          d2abs              d2_absolute_error_score
reg          d2pin              d2_pinball_score
reg          d2twe              d2_tweedie_score
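Note that the keyword log appears under both clf and reg, so a keyword only identifies a function once the algorithm type is known. The sketch below copies the table above into two dictionaries and resolves a keyword to the name of its sklearn function. The function metric_function_name and its explicit algo_type argument are hypothetical; how the library itself decides the algorithm type is not stated here.

```python
# Keyword -> sklearn.metrics function name, copied from the table above.
CLF_METRICS = {
    "acc": "accuracy_score",
    "f1": "f1_score",
    "hamming": "hamming_loss",
    "jaccard": "jaccard_score",
    "log": "log_loss",
    "mcc": "matthews_corrcoef",
    "precision": "precision_score",
    "recall": "recall_score",
    "zol": "zero_one_loss",
}

REG_METRICS = {
    "var": "explained_variance_score",
    "max": "max_error",
    "abs": "mean_absolute_error",
    "sq": "mean_squared_error",
    "rsq": "root_mean_squared_error",
    "log": "mean_squared_log_error",
    "rlog": "root_mean_squared_log_error",
    "medabs": "median_absolute_error",
    "poisson": "mean_poisson_deviance",
    "gamma": "mean_gamma_deviance",
    "per": "mean_absolute_percentage_error",
    "d2abs": "d2_absolute_error_score",
    "d2pin": "d2_pinball_score",
    "d2twe": "d2_tweedie_score",
}


def metric_function_name(algo_type, keyword):
    """Hypothetical lookup: resolve a metrics keyword for 'clf' or 'reg'."""
    tables = {"clf": CLF_METRICS, "reg": REG_METRICS}
    if algo_type not in tables:
        raise ValueError("algo_type must be 'clf' or 'reg'")
    if keyword not in tables[algo_type]:
        raise ValueError(f"unknown {algo_type} metrics keyword: {keyword!r}")
    return tables[algo_type][keyword]


print(metric_function_name("clf", "log"))  # log_loss
print(metric_function_name("reg", "log"))  # mean_squared_log_error
```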

Attention

The value of average must be one that sklearn's metric functions accept.

Note

params holds settings for the model. The model does not have to be created with its default settings; it can be customized through params.
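One common way to apply such a dictionary is to unpack it as keyword arguments when the model is constructed. The sketch below assumes that behaviour; ToyModel and build_model are hypothetical names used only for illustration, and the library may handle params differently.

```python
class ToyModel:
    """Stand-in for any model class that accepts keyword settings."""

    def __init__(self, depth=3, seed=0):
        self.depth = depth
        self.seed = seed


def build_model(model_class, params=None):
    # Assumption: params is unpacked as keyword arguments to the constructor.
    # With params=None the model is created with its default settings.
    return model_class(**(params or {}))


default_model = build_model(ToyModel)
tuned_model = build_model(ToyModel, {"depth": 8})
print(default_model.depth, tuned_model.depth)  # 3 8
```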

Priority (in return)    Returns           Datatype    Condition
1                       percentage_log    list        always
2                       metrics_log       list        always
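To make the idea concrete, here is a minimal, dependency-free sketch of a progressive run. It is not the library's implementation: progressive_run, MajorityClassifier, and accuracy are hypothetical stand-ins, only accuracy is computed, and the layout of metrics_log (one dict per step) is an assumption.

```python
class MajorityClassifier:
    """Toy stand-in for a model class with fit/predict methods."""

    def fit(self, X, y):
        # Remember the most frequent label seen during training.
        self.label_ = max(set(y), key=list(y).count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def accuracy(y_true, y_pred):
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


def progressive_run(model_class, X_train, y_train, X_test, y_test,
                    init_per=1, limit_per=100, increment=1, params=None):
    """Hypothetical re-creation of a progressive run (accuracy only)."""
    percentage_log, metrics_log = [], []
    per = init_per
    while per <= limit_per:
        # Use the first `per` percent of the training rows (at least one row).
        n_rows = max(1, int(len(X_train) * per / 100))
        model = model_class(**(params or {}))
        model.fit(X_train[:n_rows], y_train[:n_rows])
        score = accuracy(y_test, model.predict(X_test))
        percentage_log.append(per)
        metrics_log.append({"acc": score})
        per += increment
    return percentage_log, metrics_log


X_train = [[i] for i in range(10)]
y_train = [0, 0, 1, 1, 1, 1, 1, 1, 1, 1]
X_test = [[0], [1], [2], [3]]
y_test = [1, 1, 1, 0]

percentages, metrics = progressive_run(
    MajorityClassifier, X_train, y_train, X_test, y_test,
    init_per=20, limit_per=100, increment=40,
)
print(percentages)  # [20, 60, 100]
print(metrics)      # [{'acc': 0.25}, {'acc': 0.75}, {'acc': 0.75}]
```

A fresh model is built at every step here, so each score reflects training on that subset alone.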

get_best_model

It determines the optimal training-data size for the model, based on the logs of a progressive run and the requested metric.

Parameters           Datatype    Default Value
percentage_log       list
metrics_log          list
requested_metrics    string

Priority (in return)    Returns            Datatype            Condition
1                       best_percentage    integer or float    always
2                       best_score         float               always
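The selection step can be sketched as follows. The code assumes that metrics_log[i] is a dict of scores recorded for percentage_log[i], which is a guess about the log layout. pick_best and its higher_is_better flag are hypothetical and are not part of the documented signature; the flag is there because loss-style metrics such as hamming_loss are better when smaller.

```python
def pick_best(percentage_log, metrics_log, requested_metrics,
              higher_is_better=True):
    """Hypothetical selection over the two logs.

    Assumes metrics_log[i] is a dict of scores for percentage_log[i].
    """
    scores = [entry[requested_metrics] for entry in metrics_log]
    chooser = max if higher_is_better else min
    # Index of the best score; on ties the earliest (smallest) step wins.
    best_index = chooser(range(len(scores)), key=scores.__getitem__)
    return percentage_log[best_index], scores[best_index]


percentage_log = [20, 60, 100]
metrics_log = [{"acc": 0.25}, {"acc": 0.80}, {"acc": 0.75}]
print(pick_best(percentage_log, metrics_log, "acc"))  # (60, 0.8)
```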

path_chain

Training data is sometimes kept in several files of different sizes, typically because the data is too big to store in RAM. This function is designed for those situations.

Parameters       Datatype                  Default Value
paths            list
model_class      any AI model class
X_test           multidimensional array
y_test           1D array
output_column    string
metrics          list                      None
average          string                    weighted
params           dict                      None

These are the valid keywords for metrics:

algo_type    metrics keyword    sklearn function
clf          acc                accuracy_score
clf          f1                 f1_score
clf          hamming            hamming_loss
clf          jaccard            jaccard_score
clf          log                log_loss
clf          mcc                matthews_corrcoef
clf          precision          precision_score
clf          recall             recall_score
clf          zol                zero_one_loss
reg          var                explained_variance_score
reg          max                max_error
reg          abs                mean_absolute_error
reg          sq                 mean_squared_error
reg          rsq                root_mean_squared_error
reg          log                mean_squared_log_error
reg          rlog               root_mean_squared_log_error
reg          medabs             median_absolute_error
reg          poisson            mean_poisson_deviance
reg          gamma              mean_gamma_deviance
reg          per                mean_absolute_percentage_error
reg          d2abs              d2_absolute_error_score
reg          d2pin              d2_pinball_score
reg          d2twe              d2_tweedie_score

Attention

The value of average must be one that sklearn's metric functions accept.

Note

params holds settings for the model. The model does not have to be created with its default settings; it can be customized through params.

Priority (in return)    Returns        Datatype    Condition
1                       metrics_log    dict        always
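The following sketch shows one way to process training files one at a time so that only a single file is held in memory. Several details are assumptions rather than documented behaviour: the files are taken to be CSV, a fresh model is trained per file, and metrics_log is keyed by file path. chain_over_paths, load_xy, and MeanRegressor are hypothetical names, and only mean absolute error is computed.

```python
import csv
import os
import tempfile


class MeanRegressor:
    """Toy stand-in for a model class: always predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def load_xy(path, output_column):
    """Read one CSV file and split it into features and target."""
    with open(path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    X = [[float(v) for k, v in row.items() if k != output_column]
         for row in rows]
    y = [float(row[output_column]) for row in rows]
    return X, y


def chain_over_paths(paths, model_class, X_test, y_test, output_column,
                     params=None):
    """Hypothetical sketch: one file in memory at a time, one model per file."""
    metrics_log = {}
    for path in paths:
        X, y = load_xy(path, output_column)
        model = model_class(**(params or {})).fit(X, y)
        preds = model.predict(X_test)
        mae = sum(abs(t - p) for t, p in zip(y_test, preds)) / len(y_test)
        metrics_log[path] = {"abs": mae}
    return metrics_log


# Demo with two small temporary CSV files of different sizes.
with tempfile.TemporaryDirectory() as folder:
    paths = []
    for name, targets in [("small.csv", [1.0, 3.0]),
                          ("large.csv", [2.0, 4.0, 6.0, 8.0])]:
        path = os.path.join(folder, name)
        with open(path, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["feature", "target"])
            writer.writerows([[i, t] for i, t in enumerate(targets)])
        paths.append(path)

    log = chain_over_paths(paths, MeanRegressor, [[0.0], [1.0]],
                           [4.0, 6.0], "target")
    results = [log[p]["abs"] for p in paths]

print(results)  # [3.0, 1.0]
```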