autogl.solver

Auto solver for various graph tasks

class autogl.solver.AutoNodeClassifier(feature_module=None, graph_models=('gat', 'gcn'), nas_algorithms=None, nas_spaces=None, nas_estimators=None, hpo_module='anneal', ensemble_module='voting', max_evals=50, default_trainer='NodeClassificationFull', trainer_hp_space=None, model_hp_spaces=None, size=4, device='auto')[source]

Auto Multi-class Graph Node Classifier.

Used to automatically solve the node classification problems.

Parameters:
  • feature_module (autogl.module.feature.BaseFeatureEngineer or str or None) – The (name of the) auto feature engineer used to process the given dataset. Disable feature engineering by setting it to None. Default None.
  • graph_models (Sequence of models) – Models can be str, autogl.module.model.BaseAutoModel, autogl.module.model.encoders.BaseEncoderMaintainer or a tuple of (encoder, decoder) if need to specify both encoder and decoder. Encoder can be str or autogl.module.model.encoders.BaseEncoderMaintainer, and decoder can be str or autogl.module.model.decoders.BaseDecoderMaintainer.
  • nas_algorithms ((list of) autogl.module.nas.algorithm.BaseNAS or str (Optional)) – The (name of) nas algorithms used. Default None.
  • nas_spaces ((list of) autogl.module.nas.space.BaseSpace or str (Optional)) – The (name of) nas spaces used. Default None.
  • nas_estimators ((list of) autogl.module.nas.estimator.BaseEstimator or str (Optional)) – The (name of) nas estimators used. Default None.
  • hpo_module (autogl.module.hpo.BaseHPOptimizer or str or None) – The (name of) hpo module used to search for best hyper parameters. Default anneal. Disable hpo by setting it to None.
  • ensemble_module (autogl.module.ensemble.BaseEnsembler or str or None) – The (name of) ensemble module used to ensemble the multi-models found. Default voting. Disable ensemble by setting it to None.
  • max_evals (int (Optional)) – If given, sets the number of evaluations the hpo module will perform. Only effective when hpo_module is a str. Default 50.
  • default_trainer (str (Optional)) – The (name of) the trainer used in this solver. Default to NodeClassificationFull.
  • trainer_hp_space (list of dict (Optional)) – trainer hp space or list of trainer hp spaces configuration. If a single trainer hp space is given, it is used as the trainer hp space for every model. If a list of trainer hp spaces is given, each model is assigned the corresponding trainer hp space. Default None.
  • model_hp_spaces (list of list of dict (Optional)) – model hp space configuration. If given, will specify every hp space of every passed model. Default None. If the encoder(-decoder) is passed, the space should be a dict containing keys “encoder” and “decoder”, specifying the detailed encoder decoder hp spaces.
  • size (int (Optional)) – The maximum number of models the ensemble module will use. Default 4.
  • device (torch.device or str) – The device where model will be running on. If set to auto, will use gpu when available. You can also specify the device by directly giving gpu or cuda:0, etc. Default auto.
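The single-versus-per-model rule described for trainer_hp_space can be sketched in plain Python. This is only an illustration of the documented behaviour, not AutoGL's implementation: the helper name expand_trainer_hp_space and the keys inside the example dict are hypothetical, and the check "a single space is a list of dicts" is an assumption drawn from the parameter's stated type.

```python
from copy import deepcopy


def expand_trainer_hp_space(trainer_hp_space, n_models):
    """Return one trainer hp space per model (hypothetical helper, not an AutoGL API)."""
    if trainer_hp_space is None:
        return [None] * n_models
    # Assumption: a single space is a list of dicts, so it is shared by every model.
    if all(isinstance(item, dict) for item in trainer_hp_space):
        return [deepcopy(trainer_hp_space) for _ in range(n_models)]
    # Otherwise treat the value as one space per model, in the same order as graph_models.
    if len(trainer_hp_space) != n_models:
        raise ValueError("expected one trainer hp space per model")
    return [deepcopy(space) for space in trainer_hp_space]


# Placeholder hyperparameter description; the key names are illustrative only.
shared = [{"name": "max_epoch", "low": 10, "high": 100}]
spaces = expand_trainer_hp_space(shared, n_models=2)
```

Each model receives its own copy, so later changes to one model's space do not leak into another's.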
evaluate(dataset=None, inplaced=False, inplace=False, use_ensemble=True, use_best=True, name=None, mask='test', label=None, metric='acc')[source]

Evaluate the given dataset.

Parameters:
  • dataset (torch_geometric.data.dataset.Dataset or None) – The dataset needed to predict. If None, will use the processed dataset passed to fit() instead. Default None.
  • inplaced (bool) – Whether the given dataset is processed. Only be effective when dataset is not None. If you pass the dataset to fit() with inplace=True, and you pass the dataset again to this method, you should set this argument to True. Otherwise False. Default False.
  • inplace (bool) – Whether we process the given dataset in inplace manner. Default False. Set it to True if you want to save memory by modifying the given dataset directly.
  • use_ensemble (bool) – Whether to use ensemble to do the predict. Default True.
  • use_best (bool) – Whether to use the best single model to do the predict. Will only be effective when use_ensemble is False. Default True.
  • name (str or None) – The name of model used to predict. Will only be effective when use_ensemble and use_best both are False. Default None.
  • mask (str) – The data split to give prediction on. Default test.
  • label (torch.Tensor (Optional)) – The ground truth label of the given predicted dataset split. If not passed, will extract labels from the input dataset.
  • metric (str) – The metric to be used for evaluating the model. Default acc.
Returns:

score(s) – the evaluation results according to the evaluator passed.

Return type:

(list of) evaluation scores
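The precedence among use_ensemble, use_best and name described above can be written out as a small decision function. This is a sketch of the documented rule only; select_predictor is a hypothetical name, and what the solver does when no name is supplied is an assumption (here it raises).

```python
def select_predictor(use_ensemble=True, use_best=True, name=None):
    """Mirror the documented precedence: ensemble, then best single model, then a named model."""
    if use_ensemble:
        return "ensemble"
    if use_best:
        return "best single model"
    if name is None:
        # Assumption: with both flags off, a model name is required.
        raise ValueError("name must be given when use_ensemble and use_best are both False")
    return name
```

With the defaults, the ensemble is used and the other two arguments are ignored.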

fit(dataset, time_limit=-1, inplace=False, train_split=None, val_split=None, balanced=True, evaluation_method='infer', seed=None) → autogl.solver.classifier.node_classifier.AutoNodeClassifier[source]

Fit current solver on given dataset.

Parameters:
  • dataset (autogl.data.Dataset) – The dataset needed to fit on. This dataset must have only one graph.
  • time_limit (int) – The time limit of the whole fit process (in seconds). If set below 0, will ignore time limit. Default -1.
  • inplace (bool) – Whether we process the given dataset in inplace manner. Default False. Set it to True if you want to save memory by modifying the given dataset directly.
  • train_split (float or int (Optional)) – The train ratio (in float) or number (in int) of dataset. If you want to use default train/val/test split in dataset, please set this to None. Default None.
  • val_split (float or int (Optional)) – The validation ratio (in float) or number (in int) of dataset. If you want to use default train/val/test split in dataset, please set this to None. Default None.
  • balanced (bool) – Whether to create the train/valid/test split in a balanced way. If set to True, the train/valid splits will contain the same number of nodes from each class. Default True.
  • evaluation_method ((list of) str or autogl.module.train.evaluation) – A (list of) evaluation method for current solver. If infer, will automatically determine. Default infer.
  • seed (int (Optional)) – The random seed. If set to None, will run everything at random. Default None.
Returns:

self – A reference of current solver.

Return type:

autogl.solver.AutoNodeClassifier
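train_split and val_split accept either a ratio (float) or an absolute count (int). The sketch below shows one way to turn either form into a count. The helper name resolve_split_size is hypothetical, and the truncating rounding and the range checks are assumptions, since the reference does not state them.

```python
def resolve_split_size(split, total):
    """Convert a ratio (float) or a count (int) into a number of items, or None for the default split."""
    if split is None:
        return None  # keep the dataset's own train/val/test split
    if isinstance(split, bool):
        raise TypeError("split must be a float, an int or None")
    if isinstance(split, float):
        if not 0.0 < split < 1.0:
            raise ValueError("a ratio must lie strictly between 0 and 1")
        return int(split * total)  # assumption: fractional items are truncated
    if isinstance(split, int):
        if not 0 < split <= total:
            raise ValueError("a count must lie between 1 and the dataset size")
        return split
    raise TypeError("split must be a float, an int or None")
```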

fit_predict(dataset, time_limit=-1, inplace=False, train_split=None, val_split=None, balanced=True, evaluation_method='infer', use_ensemble=True, use_best=True, name=None) → numpy.ndarray[source]

Fit current solver on given dataset and return the predicted value.

Parameters:
  • dataset (torch_geometric.data.dataset.Dataset) – The dataset needed to fit on. This dataset must have only one graph.
  • time_limit (int) – The time limit of the whole fit process (in seconds). If set below 0, will ignore time limit. Default -1.
  • inplace (bool) – Whether we process the given dataset in inplace manner. Default False. Set it to True if you want to save memory by modifying the given dataset directly.
  • train_split (float or int (Optional)) – The train ratio (in float) or number (in int) of dataset. If you want to use default train/val/test split in dataset, please set this to None. Default None.
  • val_split (float or int (Optional)) – The validation ratio (in float) or number (in int) of dataset. If you want to use default train/val/test split in dataset, please set this to None. Default None.
  • balanced (bool) – Whether to create the train/valid/test split in a balanced way. If set to True, the train/valid splits will contain the same number of nodes from each class. Default True.
  • evaluation_method ((list of) str or autogl.module.train.evaluation) – A (list of) evaluation method for current solver. If infer, will automatically determine. Default infer.
  • use_ensemble (bool) – Whether to use ensemble to do the predict. Default True.
  • use_best (bool) – Whether to use the best single model to do the predict. Will only be effective when use_ensemble is False. Default True.
  • name (str or None) – The name of model used to predict. Will only be effective when use_ensemble and use_best both are False. Default None.
Returns:

result – An array of shape (N,), where N is the number of test nodes. The prediction on given dataset.

Return type:

np.ndarray
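A balanced split, as described for the balanced option, draws the same number of nodes from every class. The following is a minimal standard-library sketch of that idea, not the solver's own splitting code; balanced_sample and its per_class argument are hypothetical.

```python
import random
from collections import defaultdict


def balanced_sample(labels, per_class, seed=None):
    """Pick `per_class` node indices from every class, reproducibly when a seed is given."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    chosen = []
    for label in sorted(by_class):
        members = by_class[label]
        if len(members) < per_class:
            raise ValueError("class %r has fewer than %d nodes" % (label, per_class))
        chosen.extend(rng.sample(members, per_class))
    return sorted(chosen)


labels = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
train_idx = balanced_sample(labels, per_class=2, seed=0)
```

Passing the same seed yields the same indices, which is the usual reason to set seed in fit().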

classmethod from_config(path_or_dict, filetype='auto') → autogl.solver.classifier.node_classifier.AutoNodeClassifier[source]

Load solver from config file.

You can use this function to directly load a solver from predefined config dict or config file path. Currently, only support file type of json or yaml, if you pass a path.

Parameters:
  • path_or_dict (str or dict) – The path to the config file or the config dictionary object
  • filetype (str) – The filetype the given file if the path is specified. Currently only support json or yaml. You can set to auto to automatically detect the file type (from file name). Default auto.
Returns:

solver – The solver that is created from given file or dictionary.

Return type:

autogl.solver.AutoNodeClassifier
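With filetype set to auto, the file type is detected from the file name. A plausible version of that detection is sketched below. The function detect_filetype, the accepted extensions (.json, .yaml, .yml) and the example paths are assumptions for illustration.

```python
import os


def detect_filetype(path, filetype="auto"):
    """Return "json" or "yaml" for a config path, guessing from the extension when asked to."""
    if filetype == "auto":
        extension = os.path.splitext(path)[1].lower()
        if extension == ".json":
            return "json"
        if extension in (".yaml", ".yml"):  # assumption: both spellings are accepted
            return "yaml"
        raise ValueError("cannot infer the file type of %r" % path)
    if filetype not in ("json", "yaml"):
        raise ValueError("only json and yaml are supported")
    return filetype
```

An explicit filetype overrides the extension, which helps when a config file has a non-standard name.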

predict(dataset=None, inplaced=False, inplace=False, use_ensemble=True, use_best=True, name=None, mask='test') → numpy.ndarray[source]

Predict the node class number.

Parameters:
  • dataset (torch_geometric.data.dataset.Dataset or None) – The dataset needed to predict. If None, will use the processed dataset passed to fit() instead. Default None.
  • inplaced (bool) – Whether the given dataset is processed. Only be effective when dataset is not None. If you pass the dataset to fit() with inplace=True, and you pass the dataset again to this method, you should set this argument to True. Otherwise False. Default False.
  • inplace (bool) – Whether we process the given dataset in inplace manner. Default False. Set it to True if you want to save memory by modifying the given dataset directly.
  • use_ensemble (bool) – Whether to use ensemble to do the predict. Default True.
  • use_best (bool) – Whether to use the best single model to do the predict. Will only be effective when use_ensemble is False. Default True.
  • name (str or None) – The name of model used to predict. Will only be effective when use_ensemble and use_best both are False. Default None.
  • mask (str) – The data split to give prediction on. Default test.
Returns:

result – An array of shape (N,), where N is the number of test nodes. The prediction on given dataset.

Return type:

np.ndarray

predict_proba(dataset=None, inplaced=False, inplace=False, use_ensemble=True, use_best=True, name=None, mask='test') → numpy.ndarray[source]

Predict the node probability.

Parameters:
  • dataset (torch_geometric.data.dataset.Dataset or None) – The dataset needed to predict. If None, will use the processed dataset passed to fit() instead. Default None.
  • inplaced (bool) – Whether the given dataset is processed. Only be effective when dataset is not None. If you pass the dataset to fit() with inplace=True, and you pass the dataset again to this method, you should set this argument to True. Otherwise False. Default False.
  • inplace (bool) – Whether we process the given dataset in inplace manner. Default False. Set it to True if you want to save memory by modifying the given dataset directly.
  • use_ensemble (bool) – Whether to use ensemble to do the predict. Default True.
  • use_best (bool) – Whether to use the best single model to do the predict. Will only be effective when use_ensemble is False. Default True.
  • name (str or None) – The name of model used to predict. Will only be effective when use_ensemble and use_best both are False. Default None.
  • mask (str) – The data split to give prediction on. Default test.
Returns:

result – An array of shape (N,C,), where N is the number of test nodes and C is the number of classes. The prediction on given dataset.

Return type:

np.ndarray
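The (N, C) output of predict_proba and the (N,) output of predict are commonly related by taking the most probable class in each row. The sketch below shows that relationship with plain lists. Whether the solver derives predict() exactly this way is not stated here, so treat it as an assumption.

```python
def labels_from_probabilities(probabilities):
    """Map each row of class probabilities to the index of its largest entry."""
    return [max(range(len(row)), key=row.__getitem__) for row in probabilities]


# Two nodes, three classes (made-up numbers).
proba = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]
predicted = labels_from_probabilities(proba)
```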

class autogl.solver.AutoGraphClassifier(feature_module=None, graph_models=('gin', 'topkpool'), hpo_module='anneal', ensemble_module='voting', max_evals=50, default_trainer='GraphClassificationFull', trainer_hp_space=None, model_hp_spaces=None, size=4, device='auto')[source]

Auto Multi-class Graph Classifier.

Used to automatically solve the graph classification problems.

Parameters:
  • feature_module (autogl.module.feature.BaseFeatureEngineer or str or None) – The (name of the) auto feature engineer used to process the given dataset. Disable feature engineering by setting it to None. Default None.
  • graph_models (list of autogl.module.model.BaseModel or list of str) – The (name of) models to be optimized as backbone. Default ('gin', 'topkpool').
  • hpo_module (autogl.module.hpo.BaseHPOptimizer or str or None) – The (name of) hpo module used to search for best hyper parameters. Disable hpo by setting it to None. Default anneal.
  • ensemble_module (autogl.module.ensemble.BaseEnsembler or str or None) – The (name of) ensemble module used to ensemble the multi-models found. Disable ensemble by setting it to None. Default voting.
  • max_evals (int (Optional)) – If given, sets the number of evaluations the hpo module will perform. Only effective when hpo_module is a str. Default 50.
  • default_trainer (str (Optional)) – The (name of) the trainer used in this solver. Default GraphClassificationFull.
  • trainer_hp_space (Iterable[dict] (Optional)) – trainer hp space or list of trainer hp spaces configuration. If a single trainer hp space is given, it is used as the trainer hp space for every model. If a list of trainer hp spaces is given, each model is assigned the corresponding trainer hp space. Default None.
  • model_hp_spaces (list of list of dict (Optional)) – model hp space configuration. If given, will specify every hp space of every passed model. Default None. If the encoder(-decoder) is passed, the space should be a dict containing keys “encoder” and “decoder”, specifying the detailed encoder decoder hp spaces.
  • size (int (Optional)) – The maximum number of models the ensemble module will use. Default 4.
  • device (torch.device or str) – The device where model will be running on. If set to auto, will use gpu when available. You can also specify the device by directly giving gpu or cuda:0, etc. Default auto.
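The device rule ("auto" picks the GPU when one is available) can be illustrated without importing torch. In this sketch the availability check is passed in as a boolean; resolve_device is a hypothetical name, and the choice of "cuda" and "cpu" as the resolved values is an assumption.

```python
def resolve_device(device="auto", cuda_available=False):
    """Resolve the `device` argument to a concrete device string."""
    if device == "auto":
        # In a real setting, cuda_available would come from torch.cuda.is_available().
        return "cuda" if cuda_available else "cpu"
    return str(device)  # explicit choices such as "cuda:0" pass through unchanged
```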
evaluate(dataset=None, inplaced=False, inplace=False, use_ensemble=True, use_best=True, name=None, mask='test', label=None, metric='acc')[source]

Evaluate the given dataset.

Parameters:
  • dataset (torch_geometric.data.dataset.Dataset or None) – The dataset needed to predict. If None, will use the processed dataset passed to fit() instead. Default None.
  • inplaced (bool) – Whether the given dataset is processed. Only be effective when dataset is not None. If you pass the dataset to fit() with inplace=True, and you pass the dataset again to this method, you should set this argument to True. Otherwise False. Default False.
  • inplace (bool) – Whether we process the given dataset in inplace manner. Default False. Set it to True if you want to save memory by modifying the given dataset directly.
  • use_ensemble (bool) – Whether to use ensemble to do the predict. Default True.
  • use_best (bool) – Whether to use the best single model to do the predict. Will only be effective when use_ensemble is False. Default True.
  • name (str or None) – The name of model used to predict. Will only be effective when use_ensemble and use_best both are False. Default None.
  • mask (str) – The data split to give prediction on. Default test.
  • label (torch.Tensor (Optional)) – The ground truth label of the given predicted dataset split. If not passed, will extract labels from the input dataset.
  • metric (str) – The metric to be used for evaluating the model. Default acc.
Returns:

score(s) – the evaluation results according to the evaluator passed.

Return type:

(list of) evaluation scores
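The default metric, acc, is taken here to mean plain classification accuracy: the fraction of predictions that equal the ground truth labels. That reading is an assumption. A minimal sketch:

```python
def accuracy(predictions, labels):
    """Fraction of positions where the prediction matches the ground truth label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one label is required")
    correct = sum(1 for predicted, truth in zip(predictions, labels) if predicted == truth)
    return correct / len(labels)
```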

fit(dataset, time_limit=-1, inplace=False, train_split=None, val_split=None, evaluation_method='infer', seed=None) → autogl.solver.classifier.graph_classifier.AutoGraphClassifier[source]

Fit current solver on given dataset.

Parameters:
  • dataset (autogl.data.dataset) – The multi-graph dataset needed to fit on.
  • time_limit (int) – The time limit of the whole fit process (in seconds). If set below 0, will ignore time limit. Default -1.
  • inplace (bool) – Whether we process the given dataset in inplace manner. Default False. Set it to True if you want to save memory by modifying the given dataset directly.
  • train_split (float or int (Optional)) – The train ratio (in float) or number (in int) of dataset. If you want to use default train/val/test split in dataset, please set this to None. Default None.
  • val_split (float or int (Optional)) – The validation ratio (in float) or number (in int) of dataset. If you want to use default train/val/test split in dataset, please set this to None. Default None.
  • evaluation_method ((list of) str or autogl.module.train.evaluation) – A (list of) evaluation method for current solver. If infer, will automatically determine. Default infer.
  • seed (int (Optional)) – The random seed. If set to None, will run everything at random. Default None.
Returns:

self – A reference of current solver.

Return type:

autogl.solver.AutoGraphClassifier
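The time_limit convention (a negative value disables the limit) can be captured with a small budget object. This is an illustrative sketch, not how the solver tracks time internally; TimeBudget and its injectable clock are hypothetical.

```python
import time


class TimeBudget:
    """Track whether a time limit in seconds has elapsed; a negative limit never expires."""

    def __init__(self, time_limit=-1, clock=time.monotonic):
        self._clock = clock
        self._limit = time_limit
        self._start = clock()

    def expired(self):
        if self._limit < 0:
            return False  # negative limit: ignore the time limit entirely
        return self._clock() - self._start >= self._limit
```

A search loop would check expired() between trials and stop early once it returns True.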

fit_predict(dataset, time_limit=-1, inplace=False, train_split=None, val_split=None, evaluation_method='infer', seed=None, use_ensemble=True, use_best=True, name=None) → numpy.ndarray[source]

Fit current solver on given dataset and return the predicted value.

Parameters:
  • dataset (torch_geometric.data.dataset.Dataset) – The multi-graph dataset needed to fit on.
  • time_limit (int) – The time limit of the whole fit process (in seconds). If set below 0, will ignore time limit. Default -1.
  • inplace (bool) – Whether we process the given dataset in inplace manner. Default False. Set it to True if you want to save memory by modifying the given dataset directly.
  • train_split (float or int (Optional)) – The train ratio (in float) or number (in int) of dataset. If you want to use default train/val/test split in dataset, please set this to None. Default None.
  • val_split (float or int (Optional)) – The validation ratio (in float) or number (in int) of dataset. If you want to use default train/val/test split in dataset, please set this to None. Default None.
  • evaluation_method ((list of) str or autogl.module.train.evaluation) – A (list of) evaluation method for current solver. If infer, will automatically determine. Default infer.
  • seed (int (Optional)) – The random seed. If set to None, will run everything at random. Default None.
  • use_ensemble (bool) – Whether to use ensemble to do the predict. Default True.
  • use_best (bool) – Whether to use the best single model to do the predict. Will only be effective when use_ensemble is False. Default True.
  • name (str or None) – The name of model used to predict. Will only be effective when use_ensemble and use_best both are False. Default None.
Returns:

result – An array of shape (N,), where N is the number of test graphs. The prediction on given dataset.

Return type:

np.ndarray

classmethod from_config(path_or_dict, filetype='auto') → autogl.solver.classifier.graph_classifier.AutoGraphClassifier[source]

Load solver from config file.

You can use this function to directly load a solver from predefined config dict or config file path. Currently, only support file type of json or yaml, if you pass a path.

Parameters:
  • path_or_dict (str or dict) – The path to the config file or the config dictionary object
  • filetype (str) – The filetype the given file if the path is specified. Currently only support json or yaml. You can set to auto to automatically detect the file type (from file name). Default auto.
Returns:

solver – The solver that is created from given file or dictionary.

Return type:

autogl.solver.AutoGraphClassifier

predict(dataset=None, inplaced=False, inplace=False, use_ensemble=True, use_best=True, name=None, mask='test') → numpy.ndarray[source]

Predict the graph class number.

Parameters:
  • dataset (autogl.data.Dataset or None) – The dataset needed to predict. If None, will use the processed dataset passed to fit() instead. Default None.
  • inplaced (bool) – Whether the given dataset is processed. Only be effective when dataset is not None. If you pass the dataset to fit() with inplace=True, and you pass the dataset again to this method, you should set this argument to True. Otherwise False. Default False.
  • inplace (bool) – Whether we process the given dataset in inplace manner. Default False. Set it to True if you want to save memory by modifying the given dataset directly.
  • use_ensemble (bool) – Whether to use ensemble to do the predict. Default True.
  • use_best (bool) – Whether to use the best single model to do the predict. Will only be effective when use_ensemble is False. Default True.
  • name (str or None) – The name of model used to predict. Will only be effective when use_ensemble and use_best both are False. Default None.
  • mask (str) – The data split to give prediction on. Default test.
Returns:

result – An array of shape (N,), where N is the number of test graphs. The prediction on given dataset.

Return type:

np.ndarray

predict_proba(dataset=None, inplaced=False, inplace=False, use_ensemble=True, use_best=True, name=None, mask='test') → numpy.ndarray[source]

Predict the graph class probabilities.

Parameters:
  • dataset (autogl.data.Dataset or None) – The dataset needed to predict. If None, will use the processed dataset passed to fit() instead. Default None.
  • inplaced (bool) – Whether the given dataset is processed. Only be effective when dataset is not None. If you pass the dataset to fit() with inplace=True, and you pass the dataset again to this method, you should set this argument to True. Otherwise False. Default False.
  • inplace (bool) – Whether we process the given dataset in inplace manner. Default False. Set it to True if you want to save memory by modifying the given dataset directly.
  • use_ensemble (bool) – Whether to use ensemble to do the predict. Default True.
  • use_best (bool) – Whether to use the best single model to do the predict. Will only be effective when use_ensemble is False. Default True.
  • name (str or None) – The name of model used to predict. Will only be effective when use_ensemble and use_best both are False. Default None.
  • mask (str) – The data split to give prediction on. Default test.
Returns:

result – An array of shape (N,C,), where N is the number of test graphs and C is the number of classes. The prediction on given dataset.

Return type:

np.ndarray

class autogl.solver.AutoLinkPredictor(feature_module=None, graph_models=('gat', 'gcn'), hpo_module='anneal', ensemble_module='voting', max_evals=50, default_trainer='LinkPredictionFull', trainer_hp_space=None, model_hp_spaces=None, size=4, device='auto')[source]

Auto Link Predictor.

Used to automatically solve the link prediction problems.

Parameters:
  • feature_module (autogl.module.feature.BaseFeatureEngineer or str or None) – The (name of the) auto feature engineer used to process the given dataset. Disable feature engineering by setting it to None. Default None.
  • graph_models (list of autogl.module.model.BaseModel or list of str) – The (name of) models to be optimized as backbone. Default ['gat', 'gcn'].
  • hpo_module (autogl.module.hpo.BaseHPOptimizer or str or None) – The (name of) hpo module used to search for best hyper parameters. Default anneal. Disable hpo by setting it to None.
  • ensemble_module (autogl.module.ensemble.BaseEnsembler or str or None) – The (name of) ensemble module used to ensemble the multi-models found. Default voting. Disable ensemble by setting it to None.
  • max_evals (int (Optional)) – If given, sets the number of evaluations the hpo module will perform. Only effective when hpo_module is a str. Default 50.
  • default_trainer (str (Optional)) – The (name of) the trainer used in this solver. Default LinkPredictionFull.
  • trainer_hp_space (list of dict (Optional)) – trainer hp space or list of trainer hp spaces configuration. If a single trainer hp space is given, it is used as the trainer hp space for every model. If a list of trainer hp spaces is given, each model is assigned the corresponding trainer hp space. Default None.
  • model_hp_spaces (list of list of dict (Optional)) – model hp space configuration. If given, will specify every hp space of every passed model. Default None. If the encoder(-decoder) is passed, the space should be a dict containing keys “encoder” and “decoder”, specifying the detailed encoder decoder hp spaces.
  • size (int (Optional)) – The maximum number of models the ensemble module will use. Default 4.
  • device (torch.device or str) – The device where model will be running on. If set to auto, will use gpu when available. You can also specify the device by directly giving gpu or cuda:0, etc. Default auto.
evaluate(dataset=None, inplaced=False, inplace=False, use_ensemble=True, use_best=True, name=None, mask='test', label=None, metric='auc')[source]

Evaluate the given dataset.

Parameters:
  • dataset (torch_geometric.data.dataset.Dataset or None) – The dataset needed to predict. If None, will use the processed dataset passed to fit() instead. Default None.
  • inplaced (bool) – Whether the given dataset is processed. Only be effective when dataset is not None. If you pass the dataset to fit() with inplace=True, and you pass the dataset again to this method, you should set this argument to True. Otherwise False. Default False.
  • inplace (bool) – Whether we process the given dataset in inplace manner. Default False. Set it to True if you want to save memory by modifying the given dataset directly.
  • use_ensemble (bool) – Whether to use ensemble to do the predict. Default True.
  • use_best (bool) – Whether to use the best single model to do the predict. Will only be effective when use_ensemble is False. Default True.
  • name (str or None) – The name of model used to predict. Will only be effective when use_ensemble and use_best both are False. Default None.
  • mask (str) – The data split to give prediction on. Default test.
  • label (torch.Tensor (Optional)) – The ground truth label of the given predicted dataset split. If not passed, will extract labels from the input dataset.
  • metric (str) – The metric to be used for evaluating the model. Default auc.
Returns:

score(s) – the evaluation results according to the evaluator passed.

Return type:

(list of) evaluation scores
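For link prediction the default metric is auc. Assuming this denotes the area under the ROC curve, it equals the probability that a randomly chosen positive link is scored above a randomly chosen negative one, with ties counting one half. The sketch below computes it by direct pair counting, which is fine for small examples but quadratic in the number of links. roc_auc is a hypothetical helper, not the evaluator the solver uses.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    positives = [score for score, label in zip(scores, labels) if label == 1]
    negatives = [score for score, label in zip(scores, labels) if label == 0]
    if not positives or not negatives:
        raise ValueError("both positive and negative examples are required")
    wins = 0.0
    for positive in positives:
        for negative in negatives:
            if positive > negative:
                wins += 1.0
            elif positive == negative:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))
```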

fit(dataset, time_limit=-1, inplace=False, train_split=None, val_split=None, evaluation_method='infer', seed=None) → autogl.solver.classifier.link_predictor.AutoLinkPredictor[source]

Fit current solver on given dataset.

Parameters:
  • dataset (torch_geometric.data.dataset.Dataset) – The dataset needed to fit on. This dataset must have only one graph.
  • time_limit (int) – The time limit of the whole fit process (in seconds). If set below 0, will ignore time limit. Default -1.
  • inplace (bool) – Whether we process the given dataset in inplace manner. Default False. Set it to True if you want to save memory by modifying the given dataset directly.
  • train_split (float or int (Optional)) – The train ratio (in float) or number (in int) of dataset. If you want to use default train/val/test split in dataset, please set this to None. Default None.
  • val_split (float or int (Optional)) – The validation ratio (in float) or number (in int) of dataset. If you want to use default train/val/test split in dataset, please set this to None. Default None.
  • evaluation_method ((list of) str or autogl.module.train.evaluation) – A (list of) evaluation method for current solver. If infer, will automatically determine. Default infer.
  • seed (int (Optional)) – The random seed. If set to None, will run everything at random. Default None.
Returns:

self – A reference of current solver.

Return type:

autogl.solver.AutoLinkPredictor

fit_predict(dataset, time_limit=-1, inplace=False, train_split=None, val_split=None, evaluation_method='infer', use_ensemble=True, use_best=True, name=None) → numpy.ndarray[source]

Fit current solver on given dataset and return the predicted value.

Parameters:
  • dataset (torch_geometric.data.dataset.Dataset) – The dataset needed to fit on. This dataset must have only one graph.
  • time_limit (int) – The time limit of the whole fit process (in seconds). If set below 0, will ignore time limit. Default -1.
  • inplace (bool) – Whether we process the given dataset in inplace manner. Default False. Set it to True if you want to save memory by modifying the given dataset directly.
  • train_split (float or int (Optional)) – The train ratio (in float) or number (in int) of dataset. If you want to use default train/val/test split in dataset, please set this to None. Default None.
  • val_split (float or int (Optional)) – The validation ratio (in float) or number (in int) of dataset. If you want to use default train/val/test split in dataset, please set this to None. Default None.
  • balanced (bool) – Whether to create the train/valid/test split in a balanced way. If set to True, the train/valid will have the same number of different classes. Default False.
  • evaluation_method ((list of) str or autogl.module.train.evaluation) – A (list of) evaluation method for current solver. If infer, will automatically determine. Default infer.
  • use_ensemble (bool) – Whether to use ensemble to do the predict. Default True.
  • use_best (bool) – Whether to use the best single model to do the predict. Will only be effective when use_ensemble is False. Default True.
  • name (str or None) – The name of model used to predict. Will only be effective when use_ensemble and use_best both are False. Default None.
Returns:

result – An array of shape (N,), where N is the number of test nodes. The prediction on given dataset.

Return type:

np.ndarray

classmethod from_config(path_or_dict, filetype='auto') → autogl.solver.classifier.link_predictor.AutoLinkPredictor[source]

Load solver from config file.

You can use this function to directly load a solver from predefined config dict or config file path. Currently, only support file type of json or yaml, if you pass a path.

Parameters:
  • path_or_dict (str or dict) – The path to the config file or the config dictionary object
  • filetype (str) – The filetype the given file if the path is specified. Currently only support json or yaml. You can set to auto to automatically detect the file type (from file name). Default auto.
Returns:

solver – The solver that is created from given file or dictionary.

Return type:

autogl.solver.AutoLinkPredictor

predict(dataset=None, inplaced=False, inplace=False, use_ensemble=True, use_best=True, name=None, mask='test', threshold=0.5) → numpy.ndarray[source]

Predict whether each link in the given split is positive.

Parameters:
  • dataset (torch_geometric.data.dataset.Dataset or None) – The dataset needed to predict. If None, will use the processed dataset passed to fit() instead. Default None.
  • inplaced (bool) – Whether the given dataset is processed. Only be effective when dataset is not None. If you pass the dataset to fit() with inplace=True, and you pass the dataset again to this method, you should set this argument to True. Otherwise False. Default False.
  • inplace (bool) – Whether we process the given dataset in inplace manner. Default False. Set it to True if you want to save memory by modifying the given dataset directly.
  • use_ensemble (bool) – Whether to use ensemble to do the predict. Default True.
  • use_best (bool) – Whether to use the best single model to do the predict. Will only be effective when use_ensemble is False. Default True.
  • name (str or None) – The name of model used to predict. Will only be effective when use_ensemble and use_best both are False. Default None.
  • mask (str) – The data split to give prediction on. Default test.
  • threshold (float) – The threshold to judge whether the edges are positive or not. Default 0.5.
Returns:

result – An array of shape (N,), where N is the number of test edges. The prediction on given dataset.

Return type:

np.ndarray
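The threshold argument turns per-edge scores into binary decisions. The sketch below shows that step on a plain list. Whether the comparison is strict (greater than) or inclusive is not specified above, so the strict form used here is an assumption, as is the helper name links_from_scores.

```python
def links_from_scores(scores, threshold=0.5):
    """Label an edge as positive (1) when its score exceeds the threshold, else negative (0)."""
    return [1 if score > threshold else 0 for score in scores]
```

Lowering the threshold marks more edges as positive, trading precision for recall.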

predict_proba(dataset=None, inplaced=False, inplace=False, use_ensemble=True, use_best=True, name=None, mask='test') → numpy.ndarray[source]

Predict the node probability.

Parameters:
  • dataset (torch_geometric.data.dataset.Dataset or None) – The dataset needed to predict. If None, will use the processed dataset passed to fit() instead. Default None.
  • inplaced (bool) – Whether the given dataset is processed. Only be effective when dataset is not None. If you pass the dataset to fit() with inplace=True, and you pass the dataset again to this method, you should set this argument to True. Otherwise False. Default False.
  • inplace (bool) – Whether we process the given dataset in inplace manner. Default False. Set it to True if you want to save memory by modifying the given dataset directly.
  • use_ensemble (bool) – Whether to use ensemble to do the predict. Default True.
  • use_best (bool) – Whether to use the best single model to do the predict. Will only be effective when use_ensemble is False. Default True.
  • name (str or None) – The name of model used to predict. Will only be effective when use_ensemble and use_best both are False. Default None.
  • mask (str) – The data split to give prediction on. Default test.
Returns:

result – An array of shape (N,C,), where N is the number of test nodes and C is the number of classes. The prediction on given dataset.

Return type:

np.ndarray

class autogl.solver.LeaderBoard(fields, is_higher_better)[source]

The leaderboard used to store and sort model performance automatically.

Parameters:
  • fields (list of str) – A list of field names that describe the model performance. The first field is used as the major field for sorting the model performances.
  • is_higher_better (dict of field -> bool) – A mapping that indicates, for each field, whether a higher value is better.
get_best_model(index=0) → str[source]

Get the best model according to the performance of the major field.

Parameters:index (int) – The index of the model (from good to bad). Default 0.
Returns:name – The name/identifier of the required model.
Return type:str
insert_model_performance(name, performance) → None[source]

Add or override a record of model performance. If the given name is already in the leaderboard, its record is overwritten.

Parameters:
  • name (str) – The model name/identifier that identifies the model.
  • performance (dict) – The performance dict. The key inside the dict should be the fields when initialized. The value of the dict should be the corresponding scores.
Returns:

Return type:

None

remove_model_performance(name) → None[source]

Remove the record of the given model.

Parameters:name (str) – The name/identifier of the model to remove.
Returns:
Return type:None
set_major_field(field) → None[source]

Set the major field of current LeaderBoard.

Parameters:field (str) – The major field, should be one of the fields when initialized.
Returns:
Return type:None
show(top_k=0) → None[source]

Show current LeaderBoard (from best model to worst).

Parameters:top_k (int) – Controls the number of models shown. If less than or equal to 0, all models are shown. Default 0.
Returns:
Return type:None
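The behaviour documented for LeaderBoard (a major field used for sorting, per-field direction, overriding on repeated insert) can be mimicked by a small stand-in class. MiniLeaderBoard below is a hypothetical re-implementation for illustration only. It omits show() and does not claim to match the real class's internals, and the model names and scores are made up.

```python
class MiniLeaderBoard:
    """A simplified stand-in that sorts models by one major performance field."""

    def __init__(self, fields, is_higher_better):
        self.fields = list(fields)
        self.is_higher_better = dict(is_higher_better)
        self.major_field = self.fields[0]  # the first field is the major field
        self.records = {}

    def insert_model_performance(self, name, performance):
        # Re-inserting an existing name overwrites its record.
        self.records[name] = {field: performance[field] for field in self.fields}

    def remove_model_performance(self, name):
        self.records.pop(name, None)

    def set_major_field(self, field):
        if field not in self.fields:
            raise ValueError("unknown field %r" % field)
        self.major_field = field

    def get_best_model(self, index=0):
        # Sort from best to worst according to the major field's direction.
        ranked = sorted(
            self.records,
            key=lambda name: self.records[name][self.major_field],
            reverse=self.is_higher_better[self.major_field],
        )
        return ranked[index]


board = MiniLeaderBoard(["acc", "loss"], {"acc": True, "loss": False})
board.insert_model_performance("gcn", {"acc": 0.81, "loss": 0.62})
board.insert_model_performance("gat", {"acc": 0.83, "loss": 0.70})
```

Switching the major field with set_major_field("loss") reverses the ranking in this example, because lower loss is better.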