: _homo_cn: ========================== 图分类模型 ========================== 构建图分类模块 ===================================== .. In AutoGL, we support two graph classification models, ``gin`` and ``topk``. 在AutoGL中,我们支持两种图分类模型: ``gin`` and ``topk`` 。 AutoGIN >>>>>>> .. The graph isomorphism operator from the “How Powerful are Graph Neural Networks?” paper 图同构算子(Graph Isomorphism Operator)出自论文“How Powerful are Graph Neural Networks?”中, 图同构网络(Graph Isomorphism Network (GIN))是一种图神经网络,出自论文 `“How Powerful are Graph Neural Networks” `_ 。 .. The layer is 层间更新方式为: .. math:: \mathbf{x}^{\prime}_i = h_{\mathbf{\Theta}} \left( (1 + \epsilon) \cdot \mathbf{x}_i + \sum_{j \in \mathcal{N}(i)} \mathbf{x}_j \right) 或者: .. math:: \mathbf{X}^{\prime} = h_{\mathbf{\Theta}} \left( \left( \mathbf{A} + (1 + \epsilon) \cdot \mathbf{I} \right) \cdot \mathbf{X} \right), .. here :math:`h_{\mathbf{\Theta}}` denotes a neural network, *.i.e.* an MLP. 这里 :math:`h_{\mathbf{\Theta}}` 代表一个神经网络, 例如一个多层感知机(MLP). .. PARAMETERS: 参数包括: .. - num_features: `int` - The dimension of features. - num_features: `int` - 特征的维度. .. - num_classes: `int` - The number of classes. - num_classes: `int` - 类别的数量. .. - device: `torch.device` or `str` - The device where model will be running on. - device: `torch.device` or `str` - 用于运行模型的设备. .. - init: `bool` - If True(False), the model will (not) be initialized. - init: `bool` - 如果设为True(False),模型会(不会)被初始化. .. code-block:: python class AutoGIN(BaseModel): r""" AutoGIN. The model used in this automodel is GIN, i.e., the graph isomorphism network from the `"How Powerful are Graph Neural Networks?" `_ paper. The layer is .. math:: \mathbf{x}^{\prime}_i = h_{\mathbf{\Theta}} \left( (1 + \epsilon) \cdot \mathbf{x}_i + \sum_{j \in \mathcal{N}(i)} \mathbf{x}_j \right) or .. math:: \mathbf{X}^{\prime} = h_{\mathbf{\Theta}} \left( \left( \mathbf{A} + (1 + \epsilon) \cdot \mathbf{I} \right) \cdot \mathbf{X} \right), here :math:`h_{\mathbf{\Theta}}` denotes a neural network, *.i.e.* an MLP. Parameters ---------- num_features: `int`. The dimension of features. num_classes: `int`. The number of classes. device: `torch.device` or `str` The device where model will be running on. init: `bool`. If True(False), the model will (not) be initialized. """ def __init__( self, num_features=None, num_classes=None, device=None, init=False, num_graph_features=None, **args ): super(AutoGIN, self).__init__() self.num_features = num_features if num_features is not None else 0 self.num_classes = int(num_classes) if num_classes is not None else 0 self.num_graph_features = ( int(num_graph_features) if num_graph_features is not None else 0 ) self.device = device if device is not None else "cpu" self.params = { "features_num": self.num_features, "num_class": self.num_classes, "num_graph_features": self.num_graph_features, } self.space = [ { "parameterName": "num_layers", "type": "DISCRETE", "feasiblePoints": "4,5,6", }, { "parameterName": "hidden", "type": "NUMERICAL_LIST", "numericalType": "INTEGER", "length": 5, "minValue": [8, 8, 8, 8, 8], "maxValue": [64, 64, 64, 64, 64], "scalingType": "LOG", "cutPara": ("num_layers",), "cutFunc": lambda x: x[0] - 1, }, { "parameterName": "dropout", "type": "DOUBLE", "maxValue": 0.9, "minValue": 0.1, "scalingType": "LINEAR", }, { "parameterName": "act", "type": "CATEGORICAL", "feasiblePoints": ["leaky_relu", "relu", "elu", "tanh"], }, { "parameterName": "eps", "type": "CATEGORICAL", "feasiblePoints": ["True", "False"], }, { "parameterName": "mlp_layers", "type": "DISCRETE", "feasiblePoints": "2,3,4", }, { "parameterName": "neighbor_pooling_type", "type": "CATEGORICAL", "feasiblePoints": ["sum", "mean", "max"], }, { "parameterName": "graph_pooling_type", "type": "CATEGORICAL", "feasiblePoints": ["sum", "mean", "max"], }, ] self.hyperparams = { "num_layers": 5, "hidden": [64,64,64,64], "dropout": 0.5, "act": "relu", "eps": "False", "mlp_layers": 2, "neighbor_pooling_type": "sum", "graph_pooling_type": "sum" } self.initialized = False if init is True: self.initialize() .. Hyperparameters in GIN: GIN中的超参数: .. - num_layers: `int` - number of GIN layers. .. - hidden: `List[int]` - hidden size for each hidden layer. .. - dropout: `float` - dropout probability. .. - act: `str` - type of activation function. .. - eps: `str` - whether to train parameter :math:`epsilon` in the GIN layer. .. - mlp_layers: `int` - number of MLP layers in the GIN layer. .. - neighbor_pooling_type: `str` - pooling type in the GIN layer. .. - graph_pooling_type: `str` - graph pooling type following the last GIN layer. - num_layers: `int` - GIN的层数。 - hidden: `List[int]` - 每个隐藏层的大小。 - dropout: `float` - 随机失活(Dropout)的概率。 - act: `str` - 激活函数的类型。 - eps: `str` - 是否在GIN层中训练参数 :math:`epsilon` 。 - mlp_layers: `int` - GIN中的多层感知机(MLP)层数。 - neighbor_pooling_type: `str` - GIN中的池化(pooling)层类型。 - graph_pooling_type: `str` - GIN最后一层之后的图池化(graph pooling)类型。 .. You could get define your own ``gin`` model by using ``from_hyper_parameter`` function and specify the hyperpameryers. You could get define your own ``gin`` model by using ``from_hyper_parameter`` function and specify the hyperpameryers. 你可以通过使用 ``from_hyper_parameter`` 函数定义你自己的 ``gin`` 模型,并对其指定超参数。 .. code-block:: python # pyg version from autogl.module.model.pyg import AutoGIN # from autogl.module.model.dgl import AutoGIN # dgl version model = AutoGIN( num_features=dataset.num_node_features, num_classes=dataset.num_classes, num_graph_features=0, init=False ).from_hyper_parameter({ # hp from model "num_layers": 5, "hidden": [64,64,64,64], "dropout": 0.5, "act": "relu", "eps": "False", "mlp_layers": 2, "neighbor_pooling_type": "sum", "graph_pooling_type": "sum" }).model .. Then you can train the model for 100 epochs. 然后你可以对模型进行100次的训练: .. code-block:: python import torch.nn.functional as F # Define the loss optimizer. optimizer = torch.optim.Adam(model.parameters(), lr=0.01) # Training for epoch in range(100): model.train() for data in train_loader: data = data.to(args.device) optimizer.zero_grad() output = model(data) loss = F.nll_loss(output, data.y) loss.backward() optimizer.step() .. Finally, evaluate the trained model. 最后,你可以评估该模型: .. code-block:: python def test(model, loader, args): model.eval() correct = 0 for data in loader: data = data.to(args.device) output = model(data) pred = output.max(dim=1)[1] correct += pred.eq(data.y).sum().item() return correct / len(loader.dataset) acc = test(model, test_loader, args) .. Automatic Search for Graph Classification Tasks 图分类任务的自动搜索 =============================================== .. In AutoGL, we also provide a high-level API Solver to control the overall pipeline. We encapsulated the training process in the Building GNN Modules part for graph classification tasks in the solver ``AutoGraphClassifier`` that supports automatic hyperparametric optimization as well as feature engineering and ensemble. In this part, we will show you how to use ``AutoGraphClassifier``. .. In AutoGL, we also provide a high-level API Solver to control the overall pipeline. We encapsulated the training process in the Building GNN Modules part for graph classification tasks in the solver ``AutoGraphClassifier`` that supports automatic hyperparametric optimization as well as feature engineering and ensemble. In this part, we will show you how to use ``AutoGraphClassifier``. 在AutoGL中,我们还提供了一个高级的API求解器来控制整个流水线。我们将构建图神经网络模块部分的训练过程封装在求解器 ``AutoGraphClassifier`` 中以用于图分类任务,它支持自动超参数优化,特征工程及集成。 在这一部分,我们提供了一个例子来指导如何使用 ``AutoGraphClassifier`` : .. code-block:: python solver = AutoGraphClassifier( feature_module=None, graph_models=[args.model], hpo_module='random', ensemble_module=None, device=args.device, max_evals=1, trainer_hp_space = fixed( **{ # hp from trainer "max_epoch": args.epoch, "batch_size": args.batch_size, "early_stopping_round": args.epoch + 1, "lr": args.lr, "weight_decay": 0, } ), model_hp_spaces=[ fixed(**{ # hp from model "num_layers": 5, "hidden": [64,64,64,64], "dropout": 0.5, "act": "relu", "eps": "False", "mlp_layers": 2, "neighbor_pooling_type": "sum", "graph_pooling_type": "sum" }) if args.model == 'gin' else fixed(**{ "ratio": 0.8, "dropout": 0.5, "act": "relu" }), ] ) # fit auto model solver.fit(dataset, evaluation_method=['acc']) # prediction out = solver.predict(dataset, mask='test')