Progressive Tools

This part of the library is designed for progressive learning, that is, training and evaluating models as the amount of training data grows step by step.

make_run

It creates multiple models and evaluates each one on the test data. The share of the training data used grows step by step, starting at init_per and increasing by increment until it reaches limit_per.

Parameters	Datatype	Default Value
model_class	any AI model class
X_train	multidimensional array
y_train	1D array
X_test	multidimensional array
y_test	1D array
init_per	integer	1
limit_per	integer	100
increment	integer or float	1
metrics	list	None
average	string	weighted
params	dict	None

Note

init_per must be less than limit_per.

These are the valid keywords for metrics:

algo_type	metrics keyword	sklearn function
clf	acc	accuracy_score
clf	f1	f1_score
clf	hamming	hamming_loss
clf	jaccard	jaccard_score
clf	log	log_loss
clf	mcc	matthews_corrcoef
clf	precision	precision_score
clf	recall	recall_score
clf	zol	zero_one_loss
reg	var	explained_variance_score
reg	max	max_error
reg	abs	mean_absolute_error
reg	sq	mean_squared_error
reg	rsq	root_mean_squared_error
reg	log	mean_squared_log_error
reg	rlog	root_mean_squared_log_error
reg	medabs	median_absolute_error
reg	poisson	mean_poisson_deviance
reg	gamma	mean_gamma_deviance
reg	per	mean_absolute_percentage_error
reg	d2abs	d2_absolute_error_score
reg	d2pin	d2_pinball_score
reg	d2twe	d2_tweedie_score
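
The table can be read as a lookup keyed by both algo_type and the metrics keyword. Note that log appears under both clf and reg and points to a different sklearn function in each. Below is a minimal sketch of such a lookup. It assumes a hypothetical helper, resolve_metric, which is not part of the library, covers only a few rows of the table, and returns only the sklearn function name.

```python
# Hypothetical lookup built from a few rows of the table above.
# The library's own internals may be organised differently.
METRIC_NAMES = {
    "clf": {
        "acc": "accuracy_score",
        "f1": "f1_score",
        "log": "log_loss",
        "zol": "zero_one_loss",
    },
    "reg": {
        "var": "explained_variance_score",
        "abs": "mean_absolute_error",
        "sq": "mean_squared_error",
        "log": "mean_squared_log_error",
    },
}


def resolve_metric(algo_type, keyword):
    """Return the sklearn.metrics function name for a metrics keyword."""
    try:
        return METRIC_NAMES[algo_type][keyword]
    except KeyError:
        raise ValueError(
            f"unknown metric {keyword!r} for algo_type {algo_type!r}"
        ) from None


# The same keyword resolves differently for classification and regression.
clf_log = resolve_metric("clf", "log")
reg_log = resolve_metric("reg", "log")
print(clf_log)  # log_loss
print(reg_log)  # mean_squared_log_error
```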

Attention

The average value must be one that sklearn’s metrics functions accept, for example micro, macro, or weighted.

Note

params is for the model. The model does not have to be created with its default settings; params can be used to change them.

Priority (in return)	Returns	Datatype	Condition
1	percentage_log	list	always
2	metrics_log	list	always
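
The behaviour described for make_run can be sketched in plain Python. This is not the library's implementation: progressive_run, MajorityClassifier, and accuracy are hypothetical stand-ins. The sketch also assumes that init_per, limit_per, and increment are percentages of the training rows, and that each metrics_log entry is a dict keyed by metrics keyword.

```python
from collections import Counter


class MajorityClassifier:
    """Toy model class: always predicts the most common training label."""

    def __init__(self, fallback=None):
        self.label_ = fallback

    def fit(self, X, y):
        if y:
            self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def progressive_run(model_class, X_train, y_train, X_test, y_test,
                    init_per=1, limit_per=100, increment=1, params=None):
    if init_per >= limit_per:
        raise ValueError("init_per must be less than limit_per")
    percentage_log, metrics_log = [], []
    per = init_per
    while per <= limit_per:
        # Assumption: use the first `per` percent of the rows, at least one row.
        n = max(1, int(len(X_train) * per / 100))
        # params, if given, overrides the model's default settings.
        model = model_class(**(params or {}))
        model.fit(X_train[:n], y_train[:n])
        percentage_log.append(per)
        metrics_log.append({"acc": accuracy(y_test, model.predict(X_test))})
        per += increment
    return percentage_log, metrics_log


X_train = [[i] for i in range(10)]
y_train = [0, 0, 1, 1, 1, 1, 1, 1, 1, 1]
X_test = [[0], [1], [2], [3]]
y_test = [1, 1, 1, 0]

percentages, scores = progressive_run(
    MajorityClassifier, X_train, y_train, X_test, y_test,
    init_per=20, limit_per=100, increment=40,
)
print(percentages)  # [20, 60, 100]
print(scores)       # [{'acc': 0.25}, {'acc': 0.75}, {'acc': 0.75}]
```

A new model is created at every step, so each score reflects a model trained from scratch on that share of the data.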

get_best_model

It finds the optimum dataset size for the model: the entry in percentage_log at which the model achieved its best score on the requested metric.

Parameters	Datatype	Default Value
percentage_log	list
metrics_log	list
requested_metrics	string

Priority (in return)	Returns	Datatype	Condition
1	best_percentage	integer or float	always
2	best_score	float	always
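
The selection step can be sketched as follows. pick_best is a hypothetical helper, not the library's get_best_model. It assumes that metrics_log[i] is a dict of keyword to score for percentage_log[i], and that a higher score is better, which does not hold for loss metrics such as zol or abs.

```python
def pick_best(percentage_log, metrics_log, requested_metrics):
    """Return (percentage, score) for the highest score of one metric."""
    # Assumption: each log entry is a dict keyed by metrics keyword.
    scores = [entry[requested_metrics] for entry in metrics_log]
    # On ties, max() keeps the earliest (smallest) percentage.
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return percentage_log[best_index], scores[best_index]


percentage_log = [20, 60, 100]
metrics_log = [{"acc": 0.25}, {"acc": 0.80}, {"acc": 0.75}]

best_percentage, best_score = pick_best(percentage_log, metrics_log, "acc")
print(best_percentage, best_score)  # 60 0.8
```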

path_chain

Sometimes training data is kept in several files of different sizes, typically because the data is too big to fit in RAM. This function is designed for those situations.

Parameters	Datatype	Default Value
paths	list
model_class	any AI model class
X_test	multidimensional array
y_test	1D array
output_column	string
metrics	list	None
average	string	weighted
params	dict	None

These are the valid keywords for metrics:

algo_type	metrics keyword	sklearn function
clf	acc	accuracy_score
clf	f1	f1_score
clf	hamming	hamming_loss
clf	jaccard	jaccard_score
clf	log	log_loss
clf	mcc	matthews_corrcoef
clf	precision	precision_score
clf	recall	recall_score
clf	zol	zero_one_loss
reg	var	explained_variance_score
reg	max	max_error
reg	abs	mean_absolute_error
reg	sq	mean_squared_error
reg	rsq	root_mean_squared_error
reg	log	mean_squared_log_error
reg	rlog	root_mean_squared_log_error
reg	medabs	median_absolute_error
reg	poisson	mean_poisson_deviance
reg	gamma	mean_gamma_deviance
reg	per	mean_absolute_percentage_error
reg	d2abs	d2_absolute_error_score
reg	d2pin	d2_pinball_score
reg	d2twe	d2_tweedie_score

Attention

The average value must be one that sklearn’s metrics functions accept, for example micro, macro, or weighted.

Note

params is for the model. The model does not have to be created with its default settings; params can be used to change them.

Priority (in return)	Returns	Datatype	Condition
1	metrics_log	dict	always
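
The idea behind path_chain can be sketched with the standard library alone. chain_over_paths, load_xy, and MajorityClassifier are hypothetical stand-ins, not the library's code. The sketch assumes the files are CSV with a header row and numeric feature columns, that a fresh model is fitted per file, and that metrics_log is keyed by path; the library may make different choices on each of these points.

```python
import csv
import os
import tempfile
from collections import Counter


class MajorityClassifier:
    """Toy model class: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def load_xy(path, output_column):
    """Read one CSV file and split it into features and labels."""
    X, y = [], []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            y.append(row.pop(output_column))
            X.append([float(value) for value in row.values()])
    return X, y


def chain_over_paths(paths, model_class, X_test, y_test, output_column,
                     params=None):
    metrics_log = {}
    for path in paths:
        # Only the current file is held in memory.
        X, y = load_xy(path, output_column)
        model = model_class(**(params or {})).fit(X, y)
        predictions = model.predict(X_test)
        correct = sum(t == p for t, p in zip(y_test, predictions))
        metrics_log[path] = {"acc": correct / len(y_test)}
    return metrics_log


with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, labels in [("part1.csv", "aab"), ("part2.csv", "bbbba")]:
        path = os.path.join(tmp, name)
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["feature", "label"])
            writer.writerows([i, label] for i, label in enumerate(labels))
        paths.append(path)

    X_test = [[0.0], [1.0], [2.0], [3.0]]
    y_test = ["a", "a", "a", "b"]
    log = chain_over_paths(paths, MajorityClassifier, X_test, y_test, "label")
    file_scores = [log[p]["acc"] for p in paths]

print(file_scores)  # [0.75, 0.25]
```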