autogl.solver

Auto solver for various graph tasks

class autogl.solver.AutoNodeClassifier(feature_module=None, graph_models=('gat', 'gcn'), nas_algorithms=None, nas_spaces=None, nas_estimators=None, hpo_module='anneal', ensemble_module='voting', max_evals=50, default_trainer='NodeClassificationFull', trainer_hp_space=None, model_hp_spaces=None, size=4, device='auto')[source]

Auto Multi-class Graph Node Classifier.

Used to automatically solve the node classification problems.

Parameters:

feature_module (autogl.module.feature.BaseFeatureEngineer or str or None) – The auto feature engineer, or its name, used to process the given dataset. Set it to None to disable feature engineering. Default None.
graph_models (Sequence of models) – The models to optimize. Each model can be a str, an autogl.module.model.BaseAutoModel, an autogl.module.model.encoders.BaseEncoderMaintainer, or a tuple of (encoder, decoder) when both need to be specified. The encoder can be a str or an autogl.module.model.encoders.BaseEncoderMaintainer, and the decoder can be a str or an autogl.module.model.decoders.BaseDecoderMaintainer. Default ('gat', 'gcn').
nas_algorithms ((list of) autogl.module.nas.algorithm.BaseNAS or str (Optional)) – The (name of) nas algorithms used. Default None.
nas_spaces ((list of) autogl.module.nas.space.BaseSpace or str (Optional)) – The (name of) nas spaces used. Default None.
nas_estimators ((list of) autogl.module.nas.estimator.BaseEstimator or str (Optional)) – The (name of) nas estimators used. Default None.
hpo_module (autogl.module.hpo.BaseHPOptimizer or str or None) – The (name of) hpo module used to search for best hyper parameters. Default anneal. Disable hpo by setting it to None.
ensemble_module (autogl.module.ensemble.BaseEnsembler or str or None) – The (name of) ensemble module used to ensemble the multi-models found. Default voting. Disable ensemble by setting it to None.
max_evals (int (Optional)) – The number of evaluations the hpo module will run. Only effective when hpo_module is a str. Default 50.
default_trainer (str (Optional)) – The name of the trainer used in this solver. Default NodeClassificationFull.
trainer_hp_space (list of dict (Optional)) – A trainer hp space, or a list of trainer hp spaces. If a single trainer hp space is given, it is used for the trainer of every model. If a list is given, each model gets the corresponding trainer hp space. Default None.
model_hp_spaces (list of list of dict (Optional)) – The model hp space configuration. If given, it specifies the hp space of every passed model. If an encoder(-decoder) is passed, the space should be a dict with the keys “encoder” and “decoder” that specify the encoder and decoder hp spaces. Default None.
size (int (Optional)) – The maximum number of models the ensemble module will use. Default 4.
device (torch.device or str) – The device where model will be running on. If set to auto, will use gpu when available. You can also specify the device by directly giving gpu or cuda:0, etc. Default auto.
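
The `trainer_hp_space` rule (one space shared by every model, or one space per model) can be pictured with a small sketch. The helper `assign_trainer_spaces` and the keys inside each hyperparameter dict are hypothetical and only illustrate the documented behaviour; this is not AutoGL's implementation.

```python
from typing import Dict, List, Optional, Sequence, Union

HPSpace = List[dict]  # one trainer hp space: a list of hyperparameter dicts


def assign_trainer_spaces(
    models: Sequence[str],
    trainer_hp_space: Union[HPSpace, List[HPSpace], None],
) -> Dict[str, Optional[HPSpace]]:
    """Map every model name to the trainer hp space it would search."""
    if trainer_hp_space is None:
        # Nothing given: each model keeps its trainer's default space.
        return {model: None for model in models}
    if all(isinstance(item, dict) for item in trainer_hp_space):
        # A single space (list of dict) is shared by every model.
        return {model: trainer_hp_space for model in models}
    # A list of spaces must line up one-to-one with the models.
    if len(trainer_hp_space) != len(models):
        raise ValueError("need exactly one trainer hp space per model")
    return dict(zip(models, trainer_hp_space))


# Hypothetical hyperparameter entries, for illustration only.
shared = [{"name": "lr", "low": 1e-3, "high": 1e-1}]
per_model = [
    [{"name": "lr", "low": 1e-3, "high": 1e-1}],
    [{"name": "lr", "low": 1e-4, "high": 1e-2}],
]

print(assign_trainer_spaces(("gat", "gcn"), shared))
print(assign_trainer_spaces(("gat", "gcn"), per_model))
```

In the first call both models point at the same space; in the second each model receives its own entry.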

evaluate(dataset=None, inplaced=False, inplace=False, use_ensemble=True, use_best=True, name=None, mask='test', label=None, metric='acc')[source]

Evaluate the given dataset.

Parameters:

dataset (torch_geometric.data.dataset.Dataset or None) – The dataset to evaluate on. If `None`, the processed dataset passed to `fit()` is used instead. Default `None`.
inplaced (bool) – Whether the given dataset has already been processed. Only effective when `dataset` is not `None`. If you passed the dataset to `fit()` with `inplace=True` and pass it again to this method, set this argument to `True`; otherwise leave it `False`. Default `False`.
inplace (bool) – Whether to process the given dataset in place. Set it to `True` to save memory by modifying the given dataset directly. Default `False`.
use_ensemble (bool) – Whether to use the ensemble for prediction. Default `True`.
use_best (bool) – Whether to use the best single model for prediction. Only effective when `use_ensemble` is `False`. Default `True`.
name (str or None) – The name of the model used for prediction. Only effective when `use_ensemble` and `use_best` are both `False`. Default `None`.
mask (str) – The data split to predict on. Default `test`.
label (torch.Tensor (Optional)) – The ground truth labels of the predicted dataset split. If not passed, labels are extracted from the input dataset.
metric (str) – The metric used to evaluate the model. Default `acc`.
Returns:	score(s) – the evaluation results according to the evaluator passed.
Return type:	(list of) evaluation scores
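
The three arguments `use_ensemble`, `use_best` and `name` form a precedence chain: each one only matters when the ones before it are `False`. A minimal sketch of that chain follows. The function `choose_predictor` is hypothetical, and raising an error when no name is given is an assumption, not documented behaviour.

```python
from typing import Optional


def choose_predictor(use_ensemble: bool = True,
                     use_best: bool = True,
                     name: Optional[str] = None) -> str:
    """Describe which predictor the three flags select."""
    if use_ensemble:
        # use_best and name are ignored.
        return "ensemble"
    if use_best:
        # name is ignored.
        return "best single model"
    if name is None:
        # Assumption: a name is required once both flags are False.
        raise ValueError("pass name when use_ensemble and use_best are False")
    return f"model named {name!r}"


print(choose_predictor())
print(choose_predictor(use_ensemble=False))
print(choose_predictor(use_ensemble=False, use_best=False, name="gcn"))
```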

fit(dataset, time_limit=-1, inplace=False, train_split=None, val_split=None, balanced=True, evaluation_method='infer', seed=None) → autogl.solver.classifier.node_classifier.AutoNodeClassifier[source]

Fit current solver on given dataset.

Parameters:

dataset (autogl.data.Dataset) – The dataset to fit on. It must contain exactly one graph.
time_limit (int) – The time limit of the whole fit process, in seconds. A negative value disables the limit. Default `-1`.
inplace (bool) – Whether to process the given dataset in place. Set it to `True` to save memory by modifying the given dataset directly. Default `False`.
train_split (float or int (Optional)) – The training split, given as a ratio (`float`) or a count (`int`). Set it to `None` to use the dataset's default train/val/test split. Default `None`.
val_split (float or int (Optional)) – The validation split, given as a ratio (`float`) or a count (`int`). Set it to `None` to use the dataset's default train/val/test split. Default `None`.
balanced (bool) – Whether to create the train/valid/test split in a balanced way. If `True`, the train and validation splits contain the same number of nodes for each class. Default `True`.
evaluation_method ((list of) str or autogl.module.train.evaluation) – The evaluation method, or list of methods, for the current solver. If `infer`, it is determined automatically. Default `infer`.
seed (int (Optional)) – The random seed. If `None`, nothing is seeded and every run is random. Default `None`.
Returns:	self – A reference of current solver.
Return type:	autogl.solver.AutoNodeClassifier
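
`train_split` and `val_split` accept either a ratio or an absolute count. The sketch below shows one way such a value could be turned into a number of nodes. The helper `split_size`, the rounding-down rule and the graph size are assumptions for illustration.

```python
from typing import Union


def split_size(total: int, split: Union[float, int]) -> int:
    """Turn a float ratio or an int count into a number of nodes."""
    if isinstance(split, bool):
        raise TypeError("split must be a float ratio or an int count")
    if isinstance(split, float):
        if not 0.0 < split < 1.0:
            raise ValueError("a ratio must lie strictly between 0 and 1")
        return int(total * split)  # rounding down is an assumption
    if isinstance(split, int):
        if not 0 < split <= total:
            raise ValueError("a count must lie between 1 and total")
        return split
    raise TypeError("split must be a float ratio or an int count")


num_nodes = 2708  # hypothetical graph size
print(split_size(num_nodes, 0.2))  # ratio of all nodes
print(split_size(num_nodes, 500))  # absolute number of nodes
```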

fit_predict(dataset, time_limit=-1, inplace=False, train_split=None, val_split=None, balanced=True, evaluation_method='infer', use_ensemble=True, use_best=True, name=None) → numpy.ndarray[source]

Fit current solver on given dataset and return the predicted value.

Parameters:

dataset (torch_geometric.data.dataset.Dataset) – The dataset to fit on. It must contain exactly one graph.
time_limit (int) – The time limit of the whole fit process, in seconds. A negative value disables the limit. Default `-1`.
inplace (bool) – Whether to process the given dataset in place. Set it to `True` to save memory by modifying the given dataset directly. Default `False`.
train_split (float or int (Optional)) – The training split, given as a ratio (`float`) or a count (`int`). Set it to `None` to use the dataset's default train/val/test split. Default `None`.
val_split (float or int (Optional)) – The validation split, given as a ratio (`float`) or a count (`int`). Set it to `None` to use the dataset's default train/val/test split. Default `None`.
balanced (bool) – Whether to create the train/valid/test split in a balanced way. If `True`, the train and validation splits contain the same number of nodes for each class. Default `True`.
evaluation_method ((list of) str or autogl.module.train.evaluation) – The evaluation method, or list of methods, for the current solver. If `infer`, it is determined automatically. Default `infer`.
use_ensemble (bool) – Whether to use the ensemble for prediction. Default `True`.
use_best (bool) – Whether to use the best single model for prediction. Only effective when `use_ensemble` is `False`. Default `True`.
name (str or None) – The name of the model used for prediction. Only effective when `use_ensemble` and `use_best` are both `False`. Default `None`.
Returns:	result – An array of shape `(N,)`, where `N` is the number of test nodes. The prediction on given dataset.
Return type:	np.ndarray
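
A balanced split draws the same number of nodes from every class. The sketch below illustrates that idea with the standard library only. The helper `balanced_sample` and the toy labels are hypothetical, and the sampling strategy is an assumption rather than the solver's actual code.

```python
import random
from collections import Counter, defaultdict
from typing import Dict, List, Optional, Sequence


def balanced_sample(labels: Sequence[int], per_class: int,
                    seed: Optional[int] = None) -> List[int]:
    """Pick `per_class` node indices from every class."""
    rng = random.Random(seed)  # seed=None gives a different draw each run
    by_class: Dict[int, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    chosen: List[int] = []
    for label in sorted(by_class):
        members = by_class[label]
        if len(members) < per_class:
            raise ValueError(f"class {label} has only {len(members)} nodes")
        chosen.extend(rng.sample(members, per_class))
    return sorted(chosen)


labels = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]  # toy node labels
train_idx = balanced_sample(labels, per_class=2, seed=0)
print(Counter(labels[i] for i in train_idx))
```

Fixing the seed makes the draw repeatable, which mirrors the role of the `seed` argument of `fit()`.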

classmethod from_config(path_or_dict, filetype='auto') → autogl.solver.classifier.node_classifier.AutoNodeClassifier[source]

Load solver from config file.

You can use this function to load a solver directly from a predefined config dict or from a config file path. When a path is passed, only json and yaml files are currently supported.

Parameters:

path_or_dict (str or dict) – The path to the config file, or the config dictionary object.
filetype (str) – The type of the given file when a path is specified. Only `json` and `yaml` are currently supported. Set it to `auto` to detect the file type from the file name. Default `auto`.
Returns:	solver – The solver that is created from given file or dictionary.
Return type:	autogl.solver.AutoNodeClassifier
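
With `filetype='auto'` the file type is detected from the file name. A minimal sketch of such a detection step is shown below. The helper `detect_filetype` is hypothetical, and accepting the `.yml` extension is an assumption.

```python
import os


def detect_filetype(path: str, filetype: str = "auto") -> str:
    """Return 'json' or 'yaml' for a config path."""
    if filetype != "auto":
        if filetype not in ("json", "yaml"):
            raise ValueError(f"unsupported filetype: {filetype}")
        return filetype  # an explicit type wins over the file name
    extension = os.path.splitext(path)[1].lower()
    if extension == ".json":
        return "json"
    if extension in (".yaml", ".yml"):  # treating .yml as yaml is an assumption
        return "yaml"
    raise ValueError(f"cannot detect file type from name: {path}")


print(detect_filetype("configs/solver.json"))
print(detect_filetype("configs/solver.yaml"))
print(detect_filetype("configs/solver.cfg", filetype="json"))
```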

predict(dataset=None, inplaced=False, inplace=False, use_ensemble=True, use_best=True, name=None, mask='test') → numpy.ndarray[source]

Predict the node class number.

Parameters:

dataset (torch_geometric.data.dataset.Dataset or None) – The dataset to predict on. If `None`, the processed dataset passed to `fit()` is used instead. Default `None`.
inplaced (bool) – Whether the given dataset has already been processed. Only effective when `dataset` is not `None`. If you passed the dataset to `fit()` with `inplace=True` and pass it again to this method, set this argument to `True`; otherwise leave it `False`. Default `False`.
inplace (bool) – Whether to process the given dataset in place. Set it to `True` to save memory by modifying the given dataset directly. Default `False`.
use_ensemble (bool) – Whether to use the ensemble for prediction. Default `True`.
use_best (bool) – Whether to use the best single model for prediction. Only effective when `use_ensemble` is `False`. Default `True`.
name (str or None) – The name of the model used for prediction. Only effective when `use_ensemble` and `use_best` are both `False`. Default `None`.
mask (str) – The data split to predict on. Default `test`.
Returns:	result – An array of shape `(N,)`, where `N` is the number of test nodes. The prediction on given dataset.
Return type:	np.ndarray

predict_proba(dataset=None, inplaced=False, inplace=False, use_ensemble=True, use_best=True, name=None, mask='test') → numpy.ndarray[source]

Predict the node probability.

Parameters:

dataset (torch_geometric.data.dataset.Dataset or None) – The dataset to predict on. If `None`, the processed dataset passed to `fit()` is used instead. Default `None`.
inplaced (bool) – Whether the given dataset has already been processed. Only effective when `dataset` is not `None`. If you passed the dataset to `fit()` with `inplace=True` and pass it again to this method, set this argument to `True`; otherwise leave it `False`. Default `False`.
inplace (bool) – Whether to process the given dataset in place. Set it to `True` to save memory by modifying the given dataset directly. Default `False`.
use_ensemble (bool) – Whether to use the ensemble for prediction. Default `True`.
use_best (bool) – Whether to use the best single model for prediction. Only effective when `use_ensemble` is `False`. Default `True`.
name (str or None) – The name of the model used for prediction. Only effective when `use_ensemble` and `use_best` are both `False`. Default `None`.
mask (str) – The data split to predict on. Default `test`.
Returns:	result – An array of shape `(N,C,)`, where `N` is the number of test nodes and `C` is the number of classes. The prediction on given dataset.
Return type:	np.ndarray
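
`predict()` returns one class number per node, shape `(N,)`, while `predict_proba()` returns one row of class probabilities per node, shape `(N, C)`. The sketch below shows how the two shapes relate, assuming the predicted class is simply the most probable one. That assumption is not stated by the reference, and plain lists stand in for the NumPy arrays.

```python
from typing import List, Sequence


def labels_from_proba(proba: Sequence[Sequence[float]]) -> List[int]:
    """Reduce an (N, C) table of probabilities to N class numbers."""
    # For each row, return the column index holding the largest probability.
    return [max(range(len(row)), key=row.__getitem__) for row in proba]


# Toy probabilities for N = 3 nodes and C = 3 classes.
proba = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
]
print(labels_from_proba(proba))
```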

class autogl.solver.AutoGraphClassifier(feature_module=None, graph_models=('gin', 'topkpool'), hpo_module='anneal', ensemble_module='voting', max_evals=50, default_trainer='GraphClassificationFull', trainer_hp_space=None, model_hp_spaces=None, size=4, device='auto')[source]

Auto Multi-class Graph Classifier.

Used to automatically solve the graph classification problems.

Parameters:

feature_module (autogl.module.feature.BaseFeatureEngineer or str or None) – The auto feature engineer, or its name, used to process the given dataset. Set it to None to disable feature engineering. Default None.
graph_models (list of autogl.module.model.BaseModel or list of str) – The models, or their names, to be optimized as backbone. Default ('gin', 'topkpool').
hpo_module (autogl.module.hpo.BaseHPOptimizer or str or None) – The (name of) hpo module used to search for best hyper parameters. Disable hpo by setting it to None. Default anneal.
ensemble_module (autogl.module.ensemble.BaseEnsembler or str or None) – The (name of) ensemble module used to ensemble the multi-models found. Disable ensemble by setting it to None. Default voting.
max_evals (int (Optional)) – The number of evaluations the hpo module will run. Only effective when hpo_module is a str. Default 50.
default_trainer (str (Optional)) – The name of the trainer used in this solver. Default GraphClassificationFull.
trainer_hp_space (Iterable[dict] (Optional)) – A trainer hp space, or a list of trainer hp spaces. If a single trainer hp space is given, it is used for the trainer of every model. If a list is given, each model gets the corresponding trainer hp space. Default None.
model_hp_spaces (list of list of dict (Optional)) – The model hp space configuration. If given, it specifies the hp space of every passed model. If an encoder(-decoder) is passed, the space should be a dict with the keys “encoder” and “decoder” that specify the encoder and decoder hp spaces. Default None.
size (int (Optional)) – The maximum number of models the ensemble module will use. Default 4.
device (torch.device or str) – The device where model will be running on. If set to auto, will use gpu when available. You can also specify the device by directly giving gpu or cuda:0, etc. Default auto.

evaluate(dataset=None, inplaced=False, inplace=False, use_ensemble=True, use_best=True, name=None, mask='test', label=None, metric='acc')[source]

Evaluate the given dataset.

Parameters:

dataset (torch_geometric.data.dataset.Dataset or None) – The dataset to evaluate on. If `None`, the processed dataset passed to `fit()` is used instead. Default `None`.
inplaced (bool) – Whether the given dataset has already been processed. Only effective when `dataset` is not `None`. If you passed the dataset to `fit()` with `inplace=True` and pass it again to this method, set this argument to `True`; otherwise leave it `False`. Default `False`.
inplace (bool) – Whether to process the given dataset in place. Set it to `True` to save memory by modifying the given dataset directly. Default `False`.
use_ensemble (bool) – Whether to use the ensemble for prediction. Default `True`.
use_best (bool) – Whether to use the best single model for prediction. Only effective when `use_ensemble` is `False`. Default `True`.
name (str or None) – The name of the model used for prediction. Only effective when `use_ensemble` and `use_best` are both `False`. Default `None`.
mask (str) – The data split to predict on. Default `test`.
label (torch.Tensor (Optional)) – The ground truth labels of the predicted dataset split. If not passed, labels are extracted from the input dataset.
metric (str) – The metric used to evaluate the model. Default `acc`.
Returns:	score(s) – the evaluation results according to the evaluator passed.
Return type:	(list of) evaluation scores

fit(dataset, time_limit=-1, inplace=False, train_split=None, val_split=None, evaluation_method='infer', seed=None) → autogl.solver.classifier.graph_classifier.AutoGraphClassifier[source]

Fit current solver on given dataset.

Parameters:

dataset (autogl.data.dataset) – The multi-graph dataset to fit on.
time_limit (int) – The time limit of the whole fit process, in seconds. A negative value disables the limit. Default `-1`.
inplace (bool) – Whether to process the given dataset in place. Set it to `True` to save memory by modifying the given dataset directly. Default `False`.
train_split (float or int (Optional)) – The training split, given as a ratio (`float`) or a count (`int`). Set it to `None` to use the dataset's default train/val/test split. Default `None`.
val_split (float or int (Optional)) – The validation split, given as a ratio (`float`) or a count (`int`). Set it to `None` to use the dataset's default train/val/test split. Default `None`.
evaluation_method ((list of) str or autogl.module.train.evaluation) – The evaluation method, or list of methods, for the current solver. If `infer`, it is determined automatically. Default `infer`.
seed (int (Optional)) – The random seed. If `None`, nothing is seeded and every run is random. Default `None`.
Returns:	self – A reference of current solver.
Return type:	autogl.solver.AutoGraphClassifier
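
The `time_limit` argument bounds the whole fit process, and a negative value switches the bound off. The sketch below illustrates that convention on a generic list of trials. The function `run_with_time_limit` is hypothetical, and checking the budget only between trials is an assumption about how such a limit might be enforced.

```python
import time
from typing import Callable, List, Sequence


def run_with_time_limit(trials: Sequence[Callable[[], float]],
                        time_limit: float = -1) -> List[float]:
    """Run trials in order until they finish or the budget is used up."""
    start = time.monotonic()
    results: List[float] = []
    for trial in trials:
        # A negative limit means "no limit", as in fit().
        if time_limit >= 0 and time.monotonic() - start >= time_limit:
            break
        results.append(trial())
    return results


trials = [lambda: 0.71, lambda: 0.78, lambda: 0.80]  # toy validation scores
print(run_with_time_limit(trials))                   # no limit: all trials run
print(run_with_time_limit(trials, time_limit=0))     # zero budget: none run
```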

fit_predict(dataset, time_limit=-1, inplace=False, train_split=None, val_split=None, evaluation_method='infer', seed=None, use_ensemble=True, use_best=True, name=None) → numpy.ndarray[source]

Fit current solver on given dataset and return the predicted value.

Parameters:

dataset (torch_geometric.data.dataset.Dataset) – The multi-graph dataset to fit on.
time_limit (int) – The time limit of the whole fit process, in seconds. A negative value disables the limit. Default `-1`.
inplace (bool) – Whether to process the given dataset in place. Set it to `True` to save memory by modifying the given dataset directly. Default `False`.
train_split (float or int (Optional)) – The training split, given as a ratio (`float`) or a count (`int`). Set it to `None` to use the dataset's default train/val/test split. Default `None`.
val_split (float or int (Optional)) – The validation split, given as a ratio (`float`) or a count (`int`). Set it to `None` to use the dataset's default train/val/test split. Default `None`.
evaluation_method ((list of) str or autogl.module.train.evaluation) – The evaluation method, or list of methods, for the current solver. If `infer`, it is determined automatically. Default `infer`.
seed (int (Optional)) – The random seed. If `None`, nothing is seeded and every run is random. Default `None`.
use_ensemble (bool) – Whether to use the ensemble for prediction. Default `True`.
use_best (bool) – Whether to use the best single model for prediction. Only effective when `use_ensemble` is `False`. Default `True`.
name (str or None) – The name of the model used for prediction. Only effective when `use_ensemble` and `use_best` are both `False`. Default `None`.
Returns:	result – An array of shape `(N,)`, where `N` is the number of test graphs. The prediction on given dataset.
Return type:	np.ndarray

classmethod from_config(path_or_dict, filetype='auto') → autogl.solver.classifier.graph_classifier.AutoGraphClassifier[source]

Load solver from config file.

You can use this function to load a solver directly from a predefined config dict or from a config file path. When a path is passed, only json and yaml files are currently supported.

Parameters:

path_or_dict (str or dict) – The path to the config file, or the config dictionary object.
filetype (str) – The type of the given file when a path is specified. Only `json` and `yaml` are currently supported. Set it to `auto` to detect the file type from the file name. Default `auto`.
Returns:	solver – The solver that is created from given file or dictionary.
Return type:	autogl.solver.AutoGraphClassifier

predict(dataset=None, inplaced=False, inplace=False, use_ensemble=True, use_best=True, name=None, mask='test') → numpy.ndarray[source]

Predict the class number of each graph.

Parameters:

dataset (autogl.data.Dataset or None) – The dataset to predict on. If `None`, the processed dataset passed to `fit()` is used instead. Default `None`.
inplaced (bool) – Whether the given dataset has already been processed. Only effective when `dataset` is not `None`. If you passed the dataset to `fit()` with `inplace=True` and pass it again to this method, set this argument to `True`; otherwise leave it `False`. Default `False`.
inplace (bool) – Whether to process the given dataset in place. Set it to `True` to save memory by modifying the given dataset directly. Default `False`.
use_ensemble (bool) – Whether to use the ensemble for prediction. Default `True`.
use_best (bool) – Whether to use the best single model for prediction. Only effective when `use_ensemble` is `False`. Default `True`.
name (str or None) – The name of the model used for prediction. Only effective when `use_ensemble` and `use_best` are both `False`. Default `None`.
mask (str) – The data split to predict on. Default `test`.
Returns:	result – An array of shape `(N,)`, where `N` is the number of test graphs. The prediction on given dataset.
Return type:	np.ndarray

predict_proba(dataset=None, inplaced=False, inplace=False, use_ensemble=True, use_best=True, name=None, mask='test') → numpy.ndarray[source]

Predict the class probabilities of each graph.

Parameters:

dataset (autogl.data.Dataset or None) – The dataset to predict on. If `None`, the processed dataset passed to `fit()` is used instead. Default `None`.
inplaced (bool) – Whether the given dataset has already been processed. Only effective when `dataset` is not `None`. If you passed the dataset to `fit()` with `inplace=True` and pass it again to this method, set this argument to `True`; otherwise leave it `False`. Default `False`.
inplace (bool) – Whether to process the given dataset in place. Set it to `True` to save memory by modifying the given dataset directly. Default `False`.
use_ensemble (bool) – Whether to use the ensemble for prediction. Default `True`.
use_best (bool) – Whether to use the best single model for prediction. Only effective when `use_ensemble` is `False`. Default `True`.
name (str or None) – The name of the model used for prediction. Only effective when `use_ensemble` and `use_best` are both `False`. Default `None`.
mask (str) – The data split to predict on. Default `test`.
Returns:	result – An array of shape `(N, C)`, where `N` is the number of test graphs and `C` is the number of classes. The prediction on given dataset.
Return type:	np.ndarray

class autogl.solver.AutoLinkPredictor(feature_module=None, graph_models=('gat', 'gcn'), hpo_module='anneal', ensemble_module='voting', max_evals=50, default_trainer='LinkPredictionFull', trainer_hp_space=None, model_hp_spaces=None, size=4, device='auto')[source]

Auto Link Predictor.

Used to automatically solve the link prediction problems.

Parameters:

feature_module (autogl.module.feature.BaseFeatureEngineer or str or None) – The auto feature engineer, or its name, used to process the given dataset. Set it to None to disable feature engineering. Default None.
graph_models (list of autogl.module.model.BaseModel or list of str) – The (name of) models to be optimized as backbone. Default ['gat', 'gcn'].
hpo_module (autogl.module.hpo.BaseHPOptimizer or str or None) – The (name of) hpo module used to search for best hyper parameters. Default anneal. Disable hpo by setting it to None.
ensemble_module (autogl.module.ensemble.BaseEnsembler or str or None) – The (name of) ensemble module used to ensemble the multi-models found. Default voting. Disable ensemble by setting it to None.
max_evals (int (Optional)) – The number of evaluations the hpo module will run. Only effective when hpo_module is a str. Default 50.
default_trainer (str (Optional)) – The name of the trainer used in this solver. Default LinkPredictionFull.
trainer_hp_space (list of dict (Optional)) – A trainer hp space, or a list of trainer hp spaces. If a single trainer hp space is given, it is used for the trainer of every model. If a list is given, each model gets the corresponding trainer hp space. Default None.
model_hp_spaces (list of list of dict (Optional)) – The model hp space configuration. If given, it specifies the hp space of every passed model. If an encoder(-decoder) is passed, the space should be a dict with the keys “encoder” and “decoder” that specify the encoder and decoder hp spaces. Default None.
size (int (Optional)) – The maximum number of models the ensemble module will use. Default 4.
device (torch.device or str) – The device where model will be running on. If set to auto, will use gpu when available. You can also specify the device by directly giving gpu or cuda:0, etc. Default auto.

evaluate(dataset=None, inplaced=False, inplace=False, use_ensemble=True, use_best=True, name=None, mask='test', label=None, metric='auc')[source]

Evaluate the given dataset.

Parameters:

dataset (torch_geometric.data.dataset.Dataset or None) – The dataset to evaluate on. If `None`, the processed dataset passed to `fit()` is used instead. Default `None`.
inplaced (bool) – Whether the given dataset has already been processed. Only effective when `dataset` is not `None`. If you passed the dataset to `fit()` with `inplace=True` and pass it again to this method, set this argument to `True`; otherwise leave it `False`. Default `False`.
inplace (bool) – Whether to process the given dataset in place. Set it to `True` to save memory by modifying the given dataset directly. Default `False`.
use_ensemble (bool) – Whether to use the ensemble for prediction. Default `True`.
use_best (bool) – Whether to use the best single model for prediction. Only effective when `use_ensemble` is `False`. Default `True`.
name (str or None) – The name of the model used for prediction. Only effective when `use_ensemble` and `use_best` are both `False`. Default `None`.
mask (str) – The data split to predict on. Default `test`.
label (torch.Tensor (Optional)) – The ground truth labels of the predicted dataset split. If not passed, labels are extracted from the input dataset.
metric (str) – The metric used to evaluate the model. Default `auc`.
Returns:	score(s) – the evaluation results according to the evaluator passed.
Return type:	(list of) evaluation scores

fit(dataset, time_limit=-1, inplace=False, train_split=None, val_split=None, evaluation_method='infer', seed=None) → autogl.solver.classifier.link_predictor.AutoLinkPredictor[source]

Fit current solver on given dataset.

Parameters:

dataset (torch_geometric.data.dataset.Dataset) – The dataset to fit on. It must contain exactly one graph.
time_limit (int) – The time limit of the whole fit process, in seconds. A negative value disables the limit. Default `-1`.
inplace (bool) – Whether to process the given dataset in place. Set it to `True` to save memory by modifying the given dataset directly. Default `False`.
train_split (float or int (Optional)) – The training split, given as a ratio (`float`) or a count (`int`). Set it to `None` to use the dataset's default train/val/test split. Default `None`.
val_split (float or int (Optional)) – The validation split, given as a ratio (`float`) or a count (`int`). Set it to `None` to use the dataset's default train/val/test split. Default `None`.
evaluation_method ((list of) str or autogl.module.train.evaluation) – The evaluation method, or list of methods, for the current solver. If `infer`, it is determined automatically. Default `infer`.
seed (int (Optional)) – The random seed. If `None`, nothing is seeded and every run is random. Default `None`.
Returns:	self – A reference of current solver.
Return type:	autogl.solver.AutoLinkPredictor

fit_predict(dataset, time_limit=-1, inplace=False, train_split=None, val_split=None, evaluation_method='infer', use_ensemble=True, use_best=True, name=None) → numpy.ndarray[source]

Fit current solver on given dataset and return the predicted value.

Parameters:

dataset (torch_geometric.data.dataset.Dataset) – The dataset to fit on. It must contain exactly one graph.
time_limit (int) – The time limit of the whole fit process, in seconds. A negative value disables the limit. Default `-1`.
inplace (bool) – Whether to process the given dataset in place. Set it to `True` to save memory by modifying the given dataset directly. Default `False`.
train_split (float or int (Optional)) – The training split, given as a ratio (`float`) or a count (`int`). Set it to `None` to use the dataset's default train/val/test split. Default `None`.
val_split (float or int (Optional)) – The validation split, given as a ratio (`float`) or a count (`int`). Set it to `None` to use the dataset's default train/val/test split. Default `None`.
evaluation_method ((list of) str or autogl.module.train.evaluation) – The evaluation method, or list of methods, for the current solver. If `infer`, it is determined automatically. Default `infer`.
use_ensemble (bool) – Whether to use the ensemble for prediction. Default `True`.
use_best (bool) – Whether to use the best single model for prediction. Only effective when `use_ensemble` is `False`. Default `True`.
name (str or None) – The name of the model used for prediction. Only effective when `use_ensemble` and `use_best` are both `False`. Default `None`.
Returns:	result – An array of shape `(N,)`, where `N` is the number of test edges. The prediction on given dataset.
Return type:	np.ndarray

classmethod from_config(path_or_dict, filetype='auto') → autogl.solver.classifier.link_predictor.AutoLinkPredictor[source]

Load solver from config file.

You can use this function to load a solver directly from a predefined config dict or from a config file path. When a path is passed, only json and yaml files are currently supported.

Parameters:

path_or_dict (str or dict) – The path to the config file, or the config dictionary object.
filetype (str) – The type of the given file when a path is specified. Only `json` and `yaml` are currently supported. Set it to `auto` to detect the file type from the file name. Default `auto`.
Returns:	solver – The solver that is created from given file or dictionary.
Return type:	autogl.solver.AutoLinkPredictor

predict(dataset=None, inplaced=False, inplace=False, use_ensemble=True, use_best=True, name=None, mask='test', threshold=0.5) → numpy.ndarray[source]

Predict whether each edge is positive.

Parameters:

dataset (torch_geometric.data.dataset.Dataset or None) – The dataset to predict on. If `None`, the processed dataset passed to `fit()` is used instead. Default `None`.
inplaced (bool) – Whether the given dataset has already been processed. Only effective when `dataset` is not `None`. If you passed the dataset to `fit()` with `inplace=True` and pass it again to this method, set this argument to `True`; otherwise leave it `False`. Default `False`.
inplace (bool) – Whether to process the given dataset in place. Set it to `True` to save memory by modifying the given dataset directly. Default `False`.
use_ensemble (bool) – Whether to use the ensemble for prediction. Default `True`.
use_best (bool) – Whether to use the best single model for prediction. Only effective when `use_ensemble` is `False`. Default `True`.
name (str or None) – The name of the model used for prediction. Only effective when `use_ensemble` and `use_best` are both `False`. Default `None`.
mask (str) – The data split to predict on. Default `test`.
threshold (float) – The threshold used to decide whether an edge is positive. Default `0.5`.
Returns:	result – An array of shape `(N,)`, where `N` is the number of test edges. The prediction on given dataset.
Return type:	np.ndarray
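
The `threshold` argument turns a per-edge score into a positive or negative decision. A minimal sketch of that step follows. The function `edges_from_scores` is hypothetical, and treating a score equal to the threshold as positive is an assumption. Plain lists stand in for the NumPy arrays.

```python
from typing import List, Sequence


def edges_from_scores(scores: Sequence[float],
                      threshold: float = 0.5) -> List[int]:
    """Label each candidate edge 1 (positive) or 0 (negative)."""
    # Assumption: a score exactly at the threshold counts as positive.
    return [1 if score >= threshold else 0 for score in scores]


scores = [0.92, 0.50, 0.31, 0.07]  # toy edge probabilities
print(edges_from_scores(scores))
print(edges_from_scores(scores, threshold=0.9))
```

Raising the threshold trades recall for precision: fewer edges are reported, but those that remain have higher scores.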

predict_proba(dataset=None, inplaced=False, inplace=False, use_ensemble=True, use_best=True, name=None, mask='test') → numpy.ndarray[source]

Predict the node probability.

Parameters:

dataset (torch_geometric.data.dataset.Dataset or None) – The dataset to predict on. If `None`, the processed dataset passed to `fit()` is used instead. Default `None`.
inplaced (bool) – Whether the given dataset has already been processed. Only effective when `dataset` is not `None`. If you passed the dataset to `fit()` with `inplace=True` and pass it again to this method, set this argument to `True`; otherwise leave it `False`. Default `False`.
inplace (bool) – Whether to process the given dataset in place. Set it to `True` to save memory by modifying the given dataset directly. Default `False`.
use_ensemble (bool) – Whether to use the ensemble for prediction. Default `True`.
use_best (bool) – Whether to use the best single model for prediction. Only effective when `use_ensemble` is `False`. Default `True`.
name (str or None) – The name of the model used for prediction. Only effective when `use_ensemble` and `use_best` are both `False`. Default `None`.
mask (str) – The data split to predict on. Default `test`.
Returns:	result – An array of shape `(N,C,)`, where `N` is the number of test nodes and `C` is the number of classes. The prediction on given dataset.
Return type:	np.ndarray

class autogl.solver.LeaderBoard(fields, is_higher_better)[source]

A leaderboard that stores and sorts model performance automatically.

Parameters:

fields (list of str) – A list of field names that describe the model performance. The first field is used as the major field for sorting the models.
is_higher_better (dict of field -> bool) – A mapping that indicates, for each field, whether a higher value is better.

get_best_model(index=0) → str[source]

Get the best model according to the performance of the major field.

Parameters:	index (int) – The index of the model (from good to bad). Default 0.
Returns:	name – The name/identifier of the required model.
Return type:	str

insert_model_performance(name, performance) → None[source]

Add or override a record of model performance. If the given name is already in the leaderboard, its record is overridden.

Parameters:

name (str) – The name/identifier of the model.
performance (dict) – The performance dict. Its keys should be the fields given at initialization, and its values the corresponding scores.
Returns:
Return type:	None

remove_model_performance(name) → None[source]

Remove the record of the given model.

Parameters:	name (str) – The name/identifier of the model to be removed.
Returns:
Return type:	None

set_major_field(field) → None[source]

Set the major field of current LeaderBoard.

Parameters:	field (str) – The major field, should be one of the fields when initialized.
Returns:
Return type:	None

show(top_k=0) → None[source]

Show current LeaderBoard (from best model to worst).

Parameters:	top_k (int) – The number of models shown. If less than or equal to 0, all models are shown. Default 0.
Returns:
Return type:	None
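
The LeaderBoard methods documented above can be pictured with a toy stand-in. The class `MiniLeaderBoard` below is hypothetical: it mirrors the documented method names and semantics in plain Python and is not the library's implementation. The model names and scores are made up.

```python
from typing import Dict, List


class MiniLeaderBoard:
    """Toy stand-in that mirrors the documented LeaderBoard methods."""

    def __init__(self, fields: List[str], is_higher_better: Dict[str, bool]):
        self.fields = list(fields)
        self.is_higher_better = dict(is_higher_better)
        self.major_field = self.fields[0]  # the first field sorts by default
        self.records: Dict[str, Dict[str, float]] = {}

    def set_major_field(self, field: str) -> None:
        if field not in self.fields:
            raise ValueError(f"unknown field: {field}")
        self.major_field = field

    def insert_model_performance(self, name: str,
                                 performance: Dict[str, float]) -> None:
        # Inserting an existing name overrides its record.
        self.records[name] = {f: performance[f] for f in self.fields}

    def remove_model_performance(self, name: str) -> None:
        self.records.pop(name, None)

    def _ranked(self) -> List[str]:
        # Sort from best to worst on the major field.
        return sorted(
            self.records,
            key=lambda n: self.records[n][self.major_field],
            reverse=self.is_higher_better[self.major_field],
        )

    def get_best_model(self, index: int = 0) -> str:
        return self._ranked()[index]

    def show(self, top_k: int = 0) -> None:
        names = self._ranked()
        if top_k > 0:  # top_k <= 0 shows every model
            names = names[:top_k]
        for name in names:
            print(name, self.records[name])


board = MiniLeaderBoard(["acc", "loss"], {"acc": True, "loss": False})
board.insert_model_performance("gcn", {"acc": 0.81, "loss": 0.62})
board.insert_model_performance("gat", {"acc": 0.83, "loss": 0.70})
print(board.get_best_model())  # ranked by acc, higher is better
board.set_major_field("loss")
print(board.get_best_model())  # ranked by loss, lower is better
board.show()
```

Switching the major field changes the ranking without re-inserting any record, which is why `is_higher_better` has to be given for every field up front.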