autogl.module.feature
We support feature engineering for both the PyTorch Geometric and Deep Graph Library backends.
-
class autogl.module.feature.EigenFeatureGenerator(size: int = 32)
Concatenates Eigen features to the node features.
Notes
An implementation of [1]
References
[1] Ziwei Zhang, Peng Cui, Jian Pei, Xin Wang, Wenwu Zhu: Eigen-GNN: A Graph Structure Preserving Plug-in for GNNs. TKDE (2021). https://arxiv.org/abs/2006.04330
Parameters: size (int) – EigenGNN hidden size
-
class autogl.module.feature.GraphletGenerator(override_features: bool = False)
Generates local graphlet counts as features. The implementation follows [2].
References
[2] Ahmed, N. K., Willke, T. L., & Rossi, R. A. (2016). Estimation of local subgraph counts. Proceedings - 2016 IEEE International Conference on Big Data, Big Data 2016, 586–595. https://doi.org/10.1109/BigData.2016.7840651
-
class autogl.module.feature.LocalDegreeProfileGenerator(override_features: bool = False)
Appends the Local Degree Profile (LDP) from the “A Simple yet Effective Baseline for Non-attribute Graph Classification” paper
\[\mathbf{x}_i = \mathbf{x}_i \, \Vert \, (\deg(i), \min(DN(i)), \max(DN(i)), \textrm{mean}(DN(i)), \textrm{std}(DN(i)))\]
to the node features, where \(DN(i) = \{ \deg(j) \mid j \in \mathcal{N}(i) \}\).
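The formula above can be illustrated with a small pure-Python sketch. It does not call AutoGL or PyTorch Geometric; the function name `local_degree_profile` is hypothetical, the graph is assumed to be undirected, and the standard deviation is assumed to be the population form (the library may use a different convention).

```python
import math


def local_degree_profile(num_nodes, edges):
    """Return (deg, min, max, mean, std) of neighbour degrees for every node.

    Assumptions: undirected graph, population standard deviation,
    and an all-zero profile for isolated nodes.
    """
    neighbors = [set() for _ in range(num_nodes)]
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    deg = [len(n) for n in neighbors]

    profile = []
    for i in range(num_nodes):
        dn = [deg[j] for j in neighbors[i]]  # DN(i): degrees of the neighbours of i
        if not dn:
            profile.append((0.0, 0.0, 0.0, 0.0, 0.0))
            continue
        mean = sum(dn) / len(dn)
        var = sum((d - mean) ** 2 for d in dn) / len(dn)
        profile.append(
            (float(deg[i]), float(min(dn)), float(max(dn)), mean, math.sqrt(var))
        )
    return profile


# Path graph 0 - 1 - 2
ldp_path = local_degree_profile(3, [(0, 1), (1, 2)])
# Small tree: 1 - 0 - 2 - 3
ldp_tree = local_degree_profile(4, [(0, 1), (0, 2), (2, 3)])
```

Each returned tuple is what would be concatenated onto the existing feature vector of the corresponding node.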
-
class autogl.module.feature.OneHotDegreeGenerator(max_degree: int = 1000, in_degree: bool = False, cat: bool = True)
Adds the node degree as a one-hot encoding to the node features.
Parameters:
- max_degree (int) – Maximum degree.
- in_degree (bool, optional) – If set to True, computes the in-degree of nodes instead of the out-degree. (default: False)
- cat (bool, optional) – Concatenates node degrees to node features instead of replacing them. (default: True)
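To make the three parameters concrete, here is a minimal pure-Python sketch of one-hot degree encoding. It is not the library's implementation: the helper names `one_hot_degree` and `attach` are hypothetical, edges are assumed to be directed `(source, target)` pairs, and the encoding length is assumed to be `max_degree + 1` (one slot per degree from 0 to `max_degree`).

```python
def one_hot_degree(num_nodes, edges, max_degree, in_degree=False):
    """One-hot encode each node's out-degree (or in-degree) in max_degree + 1 slots."""
    degree = [0] * num_nodes
    for src, dst in edges:
        # Count at the target for in-degree, at the source for out-degree.
        degree[dst if in_degree else src] += 1

    encodings = []
    for d in degree:
        if d > max_degree:
            raise ValueError(f"degree {d} exceeds max_degree={max_degree}")
        row = [0.0] * (max_degree + 1)
        row[d] = 1.0
        encodings.append(row)
    return encodings


def attach(features, encodings, cat=True):
    """Concatenate encodings onto features (cat=True) or replace the features."""
    if cat:
        return [list(f) + e for f, e in zip(features, encodings)]
    return [list(e) for e in encodings]


edges = [(0, 1), (0, 2), (1, 2)]
out_enc = one_hot_degree(3, edges, max_degree=2)
in_enc = one_hot_degree(3, edges, max_degree=2, in_degree=True)

features = [[0.5], [0.1], [0.9]]
combined = attach(features, out_enc, cat=True)
replaced = attach(features, out_enc, cat=False)
```

With the default `max_degree` of 1000, every node would gain a vector of roughly a thousand mostly-zero entries, so a smaller value may be preferable when the maximum degree of the graph is known.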
-
class autogl.module.feature.NetLSD(*args, **kwargs)
Notes
A graph feature generation method. This is a simple wrapper of NetLSD [3].
References
[3] A. Tsitsulin, D. Mottin, P. Karras, A. Bronstein, and E. Müller, “NetLSD: Hearing the shape of a graph,” Proc. ACM SIGKDD Int. Conf. Knowl. Discov. Data Min., pp. 2347–2356, 2018.
-
class autogl.module.feature.GBDTFeatureSelector(fixlen: int = 10, *args, **kwargs)
A simple wrapper of LightGBM that uses importance ranking to select the top-K features.
Parameters: fixlen (int) – K for top-K important features.
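The selection step, keeping the `fixlen` highest-ranked features, can be sketched without LightGBM. In the sketch below the importance scores are hand-written placeholders (in the real class they would come from a trained gradient-boosting model), and `select_top_k` and `keep_columns` are hypothetical helper names.

```python
def select_top_k(importances, k):
    """Return the column indices of the k most important features, in column order."""
    ranked = sorted(range(len(importances)), key=lambda i: importances[i], reverse=True)
    return sorted(ranked[:k])


def keep_columns(features, columns):
    """Keep only the chosen columns of a row-major feature matrix."""
    return [[row[c] for c in columns] for row in features]


# Placeholder importance scores for four features (assumed, not computed).
importances = [0.1, 0.7, 0.0, 0.4]
chosen = select_top_k(importances, k=2)

features = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
]
reduced = keep_columns(features, chosen)
```

If `k` is larger than the number of available features, the slice simply returns all of them.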
-
class autogl.module.feature.AutoFeatureEngineer(fix_length: int = 200, max_epoch: int = 5, time_budget: Optional[int] = None, feature_selector=<class 'autogl.module.feature._selectors._gbdt.GBDTFeatureSelector'>, verbosity: int = 0, *args, **kwargs)
Notes
An implementation of the automatic feature engineering method DeepGL [4], which iteratively generates features by aggregating neighbour features and selects a fixed number of them, so that important graph-aware features are added automatically.
References
[4] Rossi, R. A., Zhou, R., & Ahmed, N. K. (2020). Deep Inductive Graph Representation Learning. IEEE Transactions on Knowledge and Data Engineering, 32(3), 438–452. https://doi.org/10.1109/TKDE.2018.2878247
Parameters:
- fix_length (int) – Fixed number of features selected in every epoch. The final number of features added will be fix_length times max_epoch, 200 times 5 by default.
- max_epoch (int) – Number of epochs in the whole process.
- time_budget (int) – Time budget (in seconds) for the feature engineering process; None for no time budget. Note that this is a soft budget: it is enforced through a rough time estimate based on previous iterations, so the actual running time may exceed it.
- y_sel_func (Callable) – Feature selector function object used for selection at each iteration, LightGBM by default. Note that the original paper uses connected components of the feature graph; you may implement that yourself if you want.
- verbosity (int) – Hides any information except error and fatal messages if verbosity < 1.
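The generate-then-select loop described in the notes can be sketched roughly as follows. This is a simplified illustration, not the AutoGL implementation: it uses only mean aggregation over neighbours, substitutes column variance for the LightGBM-based selector, ignores `time_budget`, and all function names are hypothetical.

```python
def aggregate_mean(features, neighbors):
    """Generate candidate features by averaging each column over a node's neighbours."""
    width = len(features[0])
    out = []
    for nbrs in neighbors:
        if not nbrs:
            out.append([0.0] * width)
        else:
            out.append([sum(features[j][c] for j in nbrs) / len(nbrs) for c in range(width)])
    return out


def variance_scores(features):
    """Stand-in selector score (an assumption): population variance of each column."""
    n = len(features)
    scores = []
    for c in range(len(features[0])):
        col = [row[c] for row in features]
        mean = sum(col) / n
        scores.append(sum((x - mean) ** 2 for x in col) / n)
    return scores


def engineer(features, neighbors, fix_length, max_epoch):
    """Run max_epoch rounds, each adding at most fix_length selected features."""
    result = [list(row) for row in features]
    current = features
    for _ in range(max_epoch):
        candidates = aggregate_mean(current, neighbors)
        scores = variance_scores(candidates)
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        keep = sorted(ranked[:fix_length])
        selected = [[row[c] for c in keep] for row in candidates]
        for row, extra in zip(result, selected):
            row.extend(extra)
        current = selected  # the next epoch aggregates the features just selected
    return result


# Path graph 0 - 1 - 2 with two input features per node.
neighbors = [[1], [0, 2], [1]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
engineered = engineer(features, neighbors, fix_length=1, max_epoch=3)
```

Here each of the 3 epochs adds 1 feature, so every node ends with 2 + 3 × 1 = 5 features; with the documented defaults the same arithmetic gives up to 200 × 5 = 1000 added features.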