autogl.data
-
class autogl.data.Data(x=None, edge_index=None, edge_attr=None, y=None, pos=None)
A plain Python object modeling a single graph with various (optional) attributes:
Parameters:
- x (Tensor, optional) – Node feature matrix with shape [num_nodes, num_node_features]. (default: None)
- edge_index (LongTensor, optional) – Graph connectivity in COO format with shape [2, num_edges]. (default: None)
- edge_attr (Tensor, optional) – Edge feature matrix with shape [num_edges, num_edge_features]. (default: None)
- y (Tensor, optional) – Graph or node targets with arbitrary shape. (default: None)
- pos (Tensor, optional) – Node position matrix with shape [num_nodes, num_dimensions]. (default: None)
The data object is not restricted to these attributes and can be extended with any additional data.
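The shapes listed above are easier to check on a concrete graph. The sketch below uses plain Python lists in place of tensors, so it runs without AutoGL installed. The helper `shape` and the reading of the two `edge_index` rows as source and target nodes are assumptions made for illustration.

```python
# A 3-node graph with 4 directed edges: 0->1, 1->0, 1->2, 2->1.
# Plain lists stand in for tensors here (an assumption for illustration).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # [num_nodes, num_node_features]
edge_index = [[0, 1, 1, 2],                # row 0: assumed source node of each edge
              [1, 0, 2, 1]]                # row 1: assumed target node of each edge
edge_attr = [[0.5], [0.5], [2.0], [2.0]]   # [num_edges, num_edge_features]
y = [0, 1, 0]                              # one target per node


def shape(nested):
    """Return the (rows, cols) shape of a rectangular list of lists."""
    return (len(nested), len(nested[0]))


num_nodes, num_node_features = shape(x)
_, num_edges = shape(edge_index)
```

In COO format each column of `edge_index` describes one edge, which is why its second dimension matches the first dimension of `edge_attr`.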
-
__call__(*keys)
Iterates over the attributes *keys in the data, yielding their attribute names and content. If *keys is not given, this method iterates over all present attributes.
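The calling convention can be sketched with a small stand-in class. `TinyData` is hypothetical and is not AutoGL's `Data`. It assumes that attributes set to `None` count as absent and that names are visited in sorted order, neither of which is stated above.

```python
class TinyData:
    """Hypothetical stand-in holding optional attributes, not AutoGL's Data."""

    def __init__(self, **attributes):
        for name, value in attributes.items():
            setattr(self, name, value)

    def __call__(self, *keys):
        # With no keys, fall back to every attribute stored on the object.
        names = keys if keys else sorted(vars(self))
        for name in names:
            value = getattr(self, name, None)
            if value is not None:  # assumption: None means "not present"
                yield name, value


data = TinyData(x=[[1.0], [2.0]], y=[0, 1], pos=None)
selected = list(data("x", "pos"))   # only the requested keys
everything = list(data())           # all present attributes
```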
-
__inc__(key, value)
Returns the incremental count used to cumulatively increase the value of the next attribute of key when creating batches.
Note: This method is for internal use only and should only be overridden if the batch concatenation process is corrupted for a specific data attribute.
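To see why an incremental count is needed, consider merging several graphs into one batch: the node ids of each later graph must be shifted so they do not collide with earlier ones. The sketch below assumes the increment for `edge_index` is the node count of the preceding graphs, which is common in graph batching but is not confirmed here for AutoGL. The function `batch_edge_index` is hypothetical.

```python
def batch_edge_index(graphs):
    """Concatenate edge indices of several graphs into one batch.

    `graphs` is a list of (num_nodes, edge_index) pairs, where edge_index
    is a [2, num_edges] list of lists. Each graph's node ids are shifted
    by the number of nodes seen so far (the assumed incremental count).
    """
    batched = [[], []]
    offset = 0
    for num_nodes, edge_index in graphs:
        for row in range(2):
            batched[row].extend(node + offset for node in edge_index[row])
        offset += num_nodes  # the increment applied to the next graph
    return batched


g1 = (3, [[0, 1], [1, 2]])   # 3 nodes, edges 0->1 and 1->2
g2 = (2, [[0], [1]])         # 2 nodes, edge 0->1
batched = batch_edge_index([g1, g2])
```

After batching, the single edge of the second graph refers to nodes 3 and 4 instead of 0 and 1.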
-
__iter__()
Iterates over all present attributes in the data, yielding their attribute names and content.
-
apply(func, *keys)
Applies the function func to all attributes *keys. If *keys is not given, func is applied to all present attributes.
-
cat_dim(key, value)
Returns the dimension in which the attribute key with content value gets concatenated when creating batches.
Note: This method is for internal use only and should only be overridden if the batch concatenation process is corrupted for a specific data attribute.
-
contiguous(*keys)
Ensures a contiguous memory layout for all attributes *keys. If *keys is not given, all present attributes are ensured to have a contiguous memory layout.
-
is_coalesced()
Returns True if edge indices are ordered and do not contain duplicate entries.
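The condition can be written out on a plain list-of-lists `edge_index`. This is a minimal sketch that assumes "ordered" means sorted by the first row and then by the second. The function `edges_are_coalesced` is a hypothetical helper, not the library method.

```python
def edges_are_coalesced(edge_index):
    """Return True if edges are sorted by (row 0, row 1) with no duplicates."""
    edges = list(zip(edge_index[0], edge_index[1]))
    # Strictly increasing pairs are both sorted and duplicate-free.
    return all(a < b for a, b in zip(edges, edges[1:]))


sorted_unique = edges_are_coalesced([[0, 0, 1], [1, 2, 0]])
has_duplicate = edges_are_coalesced([[0, 0], [1, 1]])
out_of_order = edges_are_coalesced([[1, 0], [0, 1]])
```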
-
keys
Returns all names of graph attributes.
-
num_edges
Returns the number of edges in the graph.
-
num_features
Returns the number of features per node in the graph.
-
random_splits_mask(train_ratio, val_ratio, seed=None)
If the data has masks for train/val/test, returns the splits with the specified ratios.
Parameters:
- train_ratio (float) – the portion of the data used for training.
- val_ratio (float) – the portion of the data used for validation.
- seed (int, optional) – random seed for splitting the dataset.
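A ratio-based mask split can be sketched as follows. This is not AutoGL's implementation: the function `ratio_split_masks` is hypothetical, it rounds the split sizes down, and it assumes that every node left over after training and validation goes to the test set.

```python
import random


def ratio_split_masks(num_nodes, train_ratio, val_ratio, seed=None):
    """Return (train_mask, val_mask, test_mask) as lists of booleans."""
    order = list(range(num_nodes))
    random.Random(seed).shuffle(order)   # seeded shuffle for reproducibility
    num_train = int(num_nodes * train_ratio)
    num_val = int(num_nodes * val_ratio)
    train_ids = set(order[:num_train])
    val_ids = set(order[num_train:num_train + num_val])
    train_mask = [i in train_ids for i in range(num_nodes)]
    val_mask = [i in val_ids for i in range(num_nodes)]
    # Whatever is left after train and validation goes to test (assumption).
    test_mask = [not (t or v) for t, v in zip(train_mask, val_mask)]
    return train_mask, val_mask, test_mask


train_mask, val_mask, test_mask = ratio_split_masks(8, 0.5, 0.25, seed=0)
```

Each mask has one boolean per node, and the three masks never select the same node twice.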
-
random_splits_mask_class(num_train_per_class, num_val, num_test, seed=None)
If the data has masks for train/val/test, returns the splits with a specified number of training samples from every class.
Parameters:
- num_train_per_class (int) – the number of samples from every class used for training.
- num_val (int) – the total number of nodes used for validation.
- num_test (int) – the total number of nodes used for testing.
- seed (int, optional) – random seed for splitting the dataset.
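The per-class variant differs in that the training set is balanced across classes, while validation and test sizes are given as totals. The sketch below is one way to do this under stated assumptions: `class_split_masks` is a hypothetical function, labels are passed in as a plain list, and validation and test nodes are drawn from whatever remains after the training nodes are picked.

```python
import random


def class_split_masks(labels, num_train_per_class, num_val, num_test, seed=None):
    """Return (train_mask, val_mask, test_mask) as lists of booleans."""
    order = list(range(len(labels)))
    random.Random(seed).shuffle(order)

    taken = {}                 # class label -> training nodes picked so far
    train_ids, rest = [], []
    for i in order:
        if taken.get(labels[i], 0) < num_train_per_class:
            train_ids.append(i)
            taken[labels[i]] = taken.get(labels[i], 0) + 1
        else:
            rest.append(i)

    # Validation and test nodes come from the remaining shuffled nodes.
    val_ids = rest[:num_val]
    test_ids = rest[num_val:num_val + num_test]

    def to_mask(ids):
        chosen = set(ids)
        return [i in chosen for i in range(len(labels))]

    return to_mask(train_ids), to_mask(val_ids), to_mask(test_ids)


labels = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
train_mask, val_mask, test_mask = class_split_masks(labels, 2, 3, 3, seed=0)
```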
-
random_splits_nodes(train_ratio, val_ratio, seed=None)
If the data uses node ids for train/val/test, returns the splits with the specified ratios.
Parameters:
- train_ratio (float) – the portion of the data used for training.
- val_ratio (float) – the portion of the data used for validation.
- seed (int, optional) – random seed for splitting the dataset.
-
random_splits_nodes_class(num_train_per_class, num_val, num_test, seed=None)
If the data uses node ids for train/val/test, returns the splits with a specified number of training samples from every class.
Parameters:
- num_train_per_class (int) – the number of samples from every class used for training.
- num_val (int) – the total number of nodes used for validation.
- num_test (int) – the total number of nodes used for testing.
- seed (int, optional) – random seed for splitting the dataset.
-
class autogl.data.InMemoryDataset(data: Iterable[_D], train_index: Optional[Iterable[int]] = Ellipsis, val_index: Optional[Iterable[int]] = Ellipsis, test_index: Optional[Iterable[int]] = Ellipsis, schema: Optional[autogl.data._dataset._dataset._Schema] = Ellipsis)
-
class autogl.data.InMemoryStaticGraphSet(graphs: Iterable[autogl.data.graph._general_static_graph._general_static_graph.GeneralStaticGraph], train_index: Optional[Iterable[int]] = Ellipsis, val_index: Optional[Iterable[int]] = Ellipsis, test_index: Optional[Iterable[int]] = Ellipsis)
-
autogl.data.download_url(url, folder, name=None, log=True)
Downloads the content of a URL to a specific folder.
Parameters:
- url (string) – The URL.
- folder (string) – The folder.
- log (bool, optional) – If False, will not print anything to the console. (default: True)
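The behavior described above can be approximated with the standard library alone. The sketch below is a hypothetical stand-in named `fetch_to_folder`, not the library function. In particular, it assumes that `name` overrides the file name otherwise taken from the last segment of the URL, which the parameter list above does not document. The demonstration uses a local `file://` URL so that no network access is required.

```python
import os
import pathlib
import tempfile
import urllib.request


def fetch_to_folder(url, folder, name=None, log=True):
    """Save the content of `url` into `folder` and return the file path.

    Hypothetical stand-in; assumes `name` overrides the file name taken
    from the last segment of the URL.
    """
    filename = name if name is not None else url.rstrip("/").rpartition("/")[2]
    path = os.path.join(folder, filename)
    if log:
        print(f"Downloading {url}")
    os.makedirs(folder, exist_ok=True)
    with urllib.request.urlopen(url) as response, open(path, "wb") as out:
        out.write(response.read())
    return path


# Demonstration against a local file URL, so nothing is fetched remotely.
with tempfile.TemporaryDirectory() as tmp:
    source = pathlib.Path(tmp, "source.txt")
    source.write_bytes(b"hello")
    saved = fetch_to_folder(source.as_uri(), os.path.join(tmp, "out"), log=False)
    saved_name = os.path.basename(saved)
    content = pathlib.Path(saved).read_bytes()
```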
-
autogl.data.extract_tar(path, folder, mode='r:gz', log=True)
Extracts a tar archive to a specific folder.
Parameters:
- path (string) – The path to the tar archive.
- folder (string) – The folder.
- mode (string, optional) – The compression mode. (default: "r:gz")
- log (bool, optional) – If False, will not print anything to the console. (default: True)
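The `mode` string follows the convention of Python's `tarfile` module, where `"r:gz"` means reading a gzip-compressed archive. The sketch below shows an equivalent extraction using only the standard library. `unpack_tar` is a hypothetical stand-in, and the assumption that the library function wraps `tarfile` in this way is not confirmed by the text above.

```python
import os
import tarfile
import tempfile


def unpack_tar(path, folder, mode="r:gz", log=True):
    """Extract the tar archive at `path` into `folder` (hypothetical stand-in)."""
    if log:
        print(f"Extracting {path}")
    os.makedirs(folder, exist_ok=True)
    with tarfile.open(path, mode) as archive:
        archive.extractall(folder)


# Build a small gzip-compressed archive, then extract it again.
with tempfile.TemporaryDirectory() as tmp:
    member = os.path.join(tmp, "hello.txt")
    with open(member, "w") as f:
        f.write("hi")
    archive_path = os.path.join(tmp, "demo.tar.gz")
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(member, arcname="hello.txt")
    out_dir = os.path.join(tmp, "out")
    unpack_tar(archive_path, out_dir, log=False)
    extracted = sorted(os.listdir(out_dir))
    with open(os.path.join(out_dir, "hello.txt")) as f:
        text = f.read()
```

Only extract archives from trusted sources, since tar members can carry unexpected paths.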