AutoGL Dataset

We provide various common datasets based on PyTorch-Geometric, Deep Graph Library and OGB. In addition, AutoGL provides a unified abstraction, GeneralStaticGraph, which represents both static homogeneous graphs and static heterogeneous graphs.

A basic example of constructing instances of GeneralStaticGraph is shown below.

import torch
from autogl.data.graph import GeneralStaticGraph, GeneralStaticGraphGenerator

''' Construct a custom homogeneous graph '''
custom_static_homogeneous_graph: GeneralStaticGraph = GeneralStaticGraphGenerator.create_homogeneous_static_graph(
    {'x': torch.rand(2708, 3), 'y': torch.rand(2708, 1)}, torch.randint(0, 1024, (2, 10556))
)

''' Construct a custom heterogeneous graph '''
custom_static_heterogeneous_graph: GeneralStaticGraph = GeneralStaticGraphGenerator.create_heterogeneous_static_graph(
    {
        'author': {'x': torch.rand(1024, 3), 'y': torch.rand(1024, 1)},
        'paper': {'feat': torch.rand(2048, 10), 'z': torch.rand(2048, 13)}
    },
    {
        ('author', 'writing', 'paper'): (torch.randint(0, 1024, (2, 5120)), torch.rand(5120, 10)),
        ('author', 'reading', 'paper'): torch.randint(0, 1024, (2, 3840)),
    }
)
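
Because edge indices refer to node positions, it can help to check them against the node counts before building a graph. The sketch below is only an illustration: `check_edge_index` is a hypothetical helper, not part of AutoGL, and it uses plain Python lists in place of tensors. It assumes that the first row of the edge index holds source nodes and the second row holds target nodes, as in PyTorch-Geometric.

```python
from typing import Sequence


def check_edge_index(
    edge_index: Sequence[Sequence[int]],
    num_source_nodes: int,
    num_target_nodes: int,
) -> bool:
    """Return True if every edge endpoint is a valid node position.

    Assumes edge_index has two rows of equal length: row 0 holds
    source node indices and row 1 holds target node indices.
    """
    if len(edge_index) != 2 or len(edge_index[0]) != len(edge_index[1]):
        return False
    sources, targets = edge_index
    return (
        all(0 <= s < num_source_nodes for s in sources)
        and all(0 <= t < num_target_nodes for t in targets)
    )


# Toy 'author' -> 'paper' relation with 3 authors and 4 papers.
valid_edges = [[0, 1, 2], [3, 0, 1]]
invalid_edges = [[0, 1, 3], [3, 0, 1]]  # author index 3 does not exist

print(check_edge_index(valid_edges, num_source_nodes=3, num_target_nodes=4))
print(check_edge_index(invalid_edges, num_source_nodes=3, num_target_nodes=4))
```

For a heterogeneous relation such as ('author', 'writing', 'paper'), the source count would be the number of 'author' nodes and the target count the number of 'paper' nodes.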

Supported datasets

AutoGL now supports the following benchmarks for different tasks:

Semi-supervised node classification: Cora, Citeseer, Pubmed, Amazon Computers, Amazon Photo, Coauthor CS, Coauthor Physics, Reddit, etc.

Dataset PyG DGL default train/val/test split
Cora
Citeseer
Pubmed
Amazon Computers  
Amazon Photo  
Coauthor CS  
Coauthor Physics  
Reddit
ogbn-products
ogbn-proteins
ogbn-arxiv
ogbn-papers100M
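
Where a dataset does not come with a default train/val/test split, one common option is a random split over the node indices. The following is a minimal sketch of that idea in plain Python. The function name `random_node_split` and its default ratios are assumptions made for illustration and are not AutoGL API.

```python
import random
from typing import Dict, List


def random_node_split(
    num_nodes: int,
    train_ratio: float = 0.6,
    val_ratio: float = 0.2,
    seed: int = 0,
) -> Dict[str, List[int]]:
    """Shuffle node indices and cut them into train/val/test parts."""
    if train_ratio < 0 or val_ratio < 0 or train_ratio + val_ratio > 1:
        raise ValueError("ratios must be non-negative and sum to at most 1")
    indices = list(range(num_nodes))
    # A dedicated Random instance keeps the split reproducible.
    random.Random(seed).shuffle(indices)
    num_train = int(num_nodes * train_ratio)
    num_val = int(num_nodes * val_ratio)
    return {
        "train": indices[:num_train],
        "val": indices[num_train:num_train + num_val],
        # Whatever remains after train and val becomes the test part.
        "test": indices[num_train + num_val:],
    }


split = random_node_split(num_nodes=10)
print({name: len(nodes) for name, nodes in split.items()})
```

The three parts are disjoint and together cover every node, and fixing the seed makes the split repeatable across runs.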

Graph classification: MUTAG, IMDB-Binary, IMDB-Multi, PROTEINS, COLLAB, etc.

Dataset PyG DGL Node Feature Label Edge Features
MUTAG
IMDB-Binary    
IMDB-Multi    
PROTEINS  
COLLAB    
ogbg-molhiv
ogbg-molpcba
ogbg-ppa  
ogbg-code2

Link prediction: At present, AutoGL conducts automatic link prediction on the homogeneous graph datasets used for node classification.
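
To make the idea of reusing a node classification graph for link prediction concrete, the sketch below shows a generic preparation step: some existing edges are held out as positive test examples, and unconnected node pairs are sampled as negative examples. This is an illustration in plain Python, not a description of AutoGL's internal procedure. The helper names `split_edges` and `sample_negative_edges` are hypothetical, and the graph is assumed to be undirected.

```python
import random
from typing import List, Set, Tuple

Edge = Tuple[int, int]


def split_edges(
    edges: List[Edge], test_ratio: float = 0.2, seed: int = 0
) -> Tuple[List[Edge], List[Edge]]:
    """Shuffle the edges and hold out a fraction as positive test edges."""
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)
    num_test = int(len(shuffled) * test_ratio)
    return shuffled[num_test:], shuffled[:num_test]


def sample_negative_edges(
    edges: List[Edge], num_nodes: int, num_samples: int, seed: int = 0
) -> List[Edge]:
    """Sample node pairs that are not connected in either direction."""
    existing: Set[Edge] = set(edges) | {(v, u) for u, v in edges}
    # Enumerating all non-edges is fine for a toy graph,
    # but too costly for large graphs.
    candidates = [
        (u, v)
        for u in range(num_nodes)
        for v in range(u + 1, num_nodes)
        if (u, v) not in existing
    ]
    count = min(num_samples, len(candidates))
    return random.Random(seed).sample(candidates, count)


# Toy undirected graph: a cycle over 5 nodes.
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
train_pos, test_pos = split_edges(toy_edges)
test_neg = sample_negative_edges(toy_edges, num_nodes=5, num_samples=len(test_pos))
print(len(train_pos), len(test_pos), len(test_neg))
```

A link prediction model would then be trained on the remaining edges and evaluated on how well it separates the held-out positive edges from the sampled negative pairs.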

Construct a custom dataset from instances of GeneralStaticGraph

The following example shows how to compose a custom dataset from a sequence of GeneralStaticGraph instances.