AutoGL Model

In AutoGL, we use model and automodel to define the logic of graph nerual networks and make it compatible with hyper parameter optimization. Currently we support the following models for given tasks.

Tasks	Models
Node Classification	`gcn`, `gat`, `sage`
Graph Classification	`gin`, `topk`
Link Prediction	`gcn`, `gat`, `sage`

Lazy Initialization

In current AutoGL pipeline, some important hyper-parameters related with model cannot be set outside before the pipeline (e.g. input dimensions, which can only be caluclated during running after feature engineered). Therefore, in automodel, we use lazy initialization to initialize the core model. When the automodel initialization method __init__() is called with argument init be False, only (part of) the hyper-parameters will be set. The automodel will have its core model only after initialize() is explicitly called, which will be done automatically in solver and from_hyper_parameter(), after all the hyper-parameters are set properly.

Define your own model and automodel

We highly recommend you to define both model and automodel, although you only need your automodel to communicate with solver and trainer. The model will be responsible for the parameters initialization and forward logic declaration, while the automodel will be responsible for the hyper-parameter definiton and organization.

General customization

Let’s say you want to implement a simple MLP for node classification and want to let AutoGL find the best hyper-parameters for you. You can first define the logics assuming all the hyper-parameters are given.

import torch

# define mlp model, need to inherit from torch.nn.Module
class MyMLP(torch.nn.Module):
    # assume you already get all the hyper-parameters
    def __init__(self, in_channels, num_classes, layer_num, dim):
        super().__init__()
        if layer_num == 1:
            ops = [torch.nn.Linear(in_channels, num_classes)]
        else:
            ops = [torch.nn.Linear(in_channels, dim)]
            for i in range(layer_num - 2):
                ops.append(torch.nn.Linear(dim, dim))
            ops.append(torch.nn.Linear(dim, num_classes))

        self.core = torch.nn.Sequential(*ops)

    # this method is required
    def forward(self, data):
        # data: torch_geometric.data.Data
        assert hasattr(data, 'x'), 'MLP only support graph data with features'
        x = data.x
        return torch.nn.functional.log_softmax(self.core(x))

After you define the logic of model, you can now define your automodel to manage the hyper-parameters.

from autogl.module.model import BaseModel

# define your automodel, need to inherit from BaseModel
class MyAutoMLP(BaseModel):
    def __init__(self):
        # (required) make sure you call __init__ of super with init argument properly set.
        # if you do not want to initialize inside __init__, please pass False.
        super().__init__(init=False)

        # (required) define the search space
        self.space = [
            {'parameterName': 'layer_num', 'type': 'INTEGER', 'minValue': 1, 'maxValue': 5, 'scalingType': 'LINEAR'},
            {'parameterName': 'dim', 'type': 'INTEGER', 'minValue': 64, 'maxValue': 128, 'scalingType': 'LINEAR'}
        ]

        # set default hyper-parameters
        self.layer_num = 2
        self.dim = 72

        # for the hyper-parameters that are related with dataset, you can just set them to None
        self.num_classes = None
        self.num_features = None

        # (required) since we don't know the num_classes and num_features until we see the dataset,
        # we cannot initialize the models when instantiated. the initialized will be set to False.
        self.initialized = False

        # (required) set the device of current auto model
        self.device = torch.device('cuda')

    # (required) get current hyper-parameters of this automodel
    # need to return a dictionary whose keys are the same with self.space
    def get_hyper_parameter(self):
        return {
            'layer_num': self.layer_num,
            'dim': self.dim
        }

    # (required) override to interact with num_classes
    def get_num_classes(self):
        return self.num_classes

    # (required) override to interact with num_classes
    def set_num_classes(self, n_classes):
        self.num_classes = n_classes

    # (required) override to interact with num_features
    def get_num_features(self):
        return self.num_features

    # (required) override to interact with num_features
    def set_num_features(self, n_features):
        self.num_features = n_features

    # (required) instantiate the core MLP model using corresponding hyper-parameters
    def initialize(self):
        # (required) you need to make sure the core model is named as `self.model`
        self.model = MyMLP(
            in_channels = self.num_features,
            num_classes = self.num_classes,
            layer_num = self.layer_num,
            dim = self.dim
        ).to(self.device)

        self.initialized = True

    # (required) override to create a copy of model using provided hyper-parameters
    def from_hyper_parameter(self, hp):
        # hp is a dictionary that contains keys and values corrsponding to your self.space
        # in this case, it will be in form {'layer_num': XX, 'dim': XX}

        # create a new instance
        ret = self.__class__()

        # set the hyper-parameters related to dataset and device
        ret.num_classes = self.num_classes
        ret.num_features = self.num_features
        ret.device = self.device

        # set the hyper-parameters according to hp
        ret.layer_num = hp['layer_num']
        ret.dim = hp['dim']

        # initialize it before returning
        ret.initialize()

        return ret

Then, you can use this node classification model as part of AutoNodeClassifier solver.

from autogl.solver import AutoNodeClassifier

solver = AutoNodeClassifier(graph_models=(MyAutoMLP(),))

The model for graph classification is generally the same, except that you can now also receive the num_graph_features (the dimension of the graph-level feature) through overriding set_num_graph_features(self, n_graph_features) of BaseModel. Also, please remember to return graph-level logits instead of node-level one in forward() of model.

Model for link prediction

For link prediction, the definition of model is a bit different with the common forward definition. You need to implement the lp_encode(self, data) and lp_decode(self, x, pos_edge_index, neg_edge_index) to interact with LinkPredictionTrainer and AutoLinkPredictor. Taking the class MyMLP defined above for example, if you want to perform link prediction:

class MyMLPForLP(torch.nn.Module):
    # num_classes is removed since it is invalid for link prediction
    def __init__(self, in_channels, layer_num, dim):
        super().__init__()
        ops = [torch.nn.Linear(in_channels, dim)]
        for i in range(layer_num - 1):
            ops.append(torch.nn.Linear(dim, dim))

        self.core = torch.nn.Sequential(*ops)

    # (required) for interaction with link prediction trainer and solver
    def lp_encode(self, data):
        return self.core(data.x)

    # (required) for interaction with link prediction trainer and solver
    def lp_decode(self, x, pos_edge_index, neg_edge_index):
        # first, get all the edge_index need calculated
        edge_index = torch.cat([pos_edge_index, neg_edge_index], dim=-1)
        # then, use dot-products to calculate logits, you can use whatever decode method you want
        logits = (x[edge_index[0]] * x[edge_index[1]]).sum(dim=-1)
        return logits

class MyAutoMLPForLP(MyAutoMLP):
    def initialize(self):
        # init MyMLPForLP instead of MyMLP
        self.model = MyMLPForLP(
            in_channels = self.num_features,
            layer_num = self.layer_num,
            dim = self.dim
        ).to(self.device)

        self.initialized = True

Model with sampling support

Towards efficient representation learning on large-scale graph, AutoGL currently support node classification using sampling techniques including node-wise sampling, layer-wise sampling, and graph-wise sampling. See more about sampling in AutoGL Trainer.

In order to conduct node classification using sampling technique with your custom model, further adaptation and modification are generally required. According to the Message Passing mechanism of Graph Neural Network (GNN), numerous nodes in the multi-hop neighborhood of evaluation set or test set are potentially involved to evaluate the GNN model on large-scale graph dataset. As the representations for those numerous nodes are likely to occupy large amount of computational resource, the common forwarding process is generally infeasible for model evaluation on large-scale graph. An iterative representation learning mechanism is a practical and feasible way to evaluate Sequential Model, which only consists of multiple sequential layers, with each layer taking a Data aggregate as input. The input Data has the same functionality with torch_geometric.data.Data, which conventionally provides properties x, edge_index, and optional edge_weight. If your custom model is composed of concatenated layers, you would better make your model inherit ClassificationSupportedSequentialModel to utilize the layer-wise representation learning mechanism to efficiently conduct representation learning for your custom sequential model.

import autogl
from autogl.module.model.base import ClassificationSupportedSequentialModel

# override Linear so that it can take graph data as input
class Linear(torch.nn.Linear):
    def forward(self, data):
        return super().forward(data.x)

class MyMLPSampling(ClassificationSupportedSequentialModel):
    def __init__(self, in_channels, num_classes, layer_num, dim):
        super().__init__()
        if layer_num == 1:
            ops = [Linear(in_channels, num_classes)]
        else:
            ops = [Linear(in_channels, dim)]
            for i in range(layer_num - 2):
                ops.append(Linear(dim, dim))
            ops.append(Linear(dim, num_classes))

        self.core = torch.nn.ModuleList(ops)

    # (required) override sequential_encoding_layers property to interact with sampling
    @property
    def sequential_encoding_layers(self) -> torch.nn.ModuleList:
        return self.core

    # (required) define the encode logic of classification for sampling
    def cls_encode(self, data):
        # if you use sampling, the data will be passed in two possible ways,
        # you can judge it use following rules
        if hasattr(data, 'edge_indexes'):
            # the edge_indexes are a list of edge_index, one for each layer
            edge_indexes = data.edge_indexes
            edge_weights = [None] * len(self.core) if getattr(data, 'edge_weights', None) is None else data.edge_weights
        else:
            # the edge_index and edge_weight will stay the same as default
            edge_indexes = [data.edge_index] * len(self.core)
            edge_weights = [getattr(data, 'edge_weight', None)] * len(self.core)

        x = data.x
        for i in range(len(self.core)):
            data = autogl.data.Data(x=x, edge_index=edge_indexes[i])
            data.edge_weight = edge_weights[i]
            x = self.sequential_encoding_layers[i](data)
        return x

    # (required) define the decode logic of classification for sampling
    def cls_decode(self, x):
        return torch.nn.functional.log_softmax(x)