AutoGL Feature Engineering
We provide a series of node and graph feature engineers for you to compose within a feature engineering pipeline.
Quick Start
# 1. Choose a dataset.
from autogl.datasets import build_dataset_from_name
data = build_dataset_from_name('cora')
# 2. Compose a feature engineering pipeline
from autogl.module.feature._base_feature_engineer._base_feature_engineer import _ComposedFeatureEngineer
from autogl.module.feature import EigenFeatureGenerator
from autogl.module.feature import NetLSD
# you may compose feature engineering bases through autogl.module.feature._base_feature_engineer
fe = _ComposedFeatureEngineer([
EigenFeatureGenerator(size=32),
NetLSD()
])
# 3. Fit and transform the data
fe.fit(data)
data1=fe.transform(data,inplace=False)
List of FE base names
Now three kinds of feature engineering bases are supported,namely generators
, selectors
, graph
.You can import
bases from according module as is mentioned in the Quick Start
part. Or you may want to just list names of bases
in configurations or as arguments of the autogl solver.
generators
selectors
graph
NetLSD
is a graph feature generation method. please refer to the according document.
A set of graph feature extractors implemented in NetworkX are wrapped, please refer to NetworkX for details. (NxLargeCliqueSize
, NxAverageClusteringApproximate
, NxDegreeAssortativityCoefficient
, NxDegreePearsonCorrelationCoefficient
, NxHasBridge
,``NxGraphCliqueNumber``, NxGraphNumberOfCliques
, NxTransitivity
, NxAverageClustering
, NxIsConnected
, NxNumberConnectedComponents
,
NxIsDistanceRegular
, NxLocalEfficiency
, NxGlobalEfficiency
, NxIsEulerian
)
The taxonomy of base types is based on the way of transforming features. generators
concatenate the original features with ones newly generated
or just overwrite the original ones. Instead of generating new features , selectors
try to select useful features and keep learned selecting methods
in the base itself. The former two types of bases can be exploited in node or edge level (modification upon each
node or edge feature) ,while graph
focuses on feature engineering in graph level (modification upon each graph feature).
For your convenience in further development,you may want to design a new item by inheriting one of them.
Of course, you can directly inherit the BaseFeature
as well.
Create Your Own FE
You can create your own feature engineering object by simply inheriting one of feature engineering base types ,namely generators
, selectors
, graph
,
and overloading methods extract_xx_features
.
# for example : create a node one-hot feature.
import autogl
import torch
from autogl.module.feature._generators._basic import BaseFeatureGenerator
class OneHotFeatureGenerator(BaseFeatureGenerator):
# if overrider_features==False , concat the features with original features; otherwise override.
def __init__(self, override_features: bool = False):
super(BaseFeatureGenerator, self).__init__(override_features)
def _extract_nodes_feature(self, data: autogl.data.Data) -> torch.Tensor:
num_nodes: int = (
data.x.size(0)
if data.x is not None and isinstance(data.x, torch.Tensor)
else (data.edge_index.max().item() + 1)
)
return torch.eye(num_nodes)