Datasets¶

A number of real-world graphs are bundled with the library and exposed through the libam.datasets submodule. Importing the submodule gives you ready-to-use Dataset objects, with no manual download or unpacking steps required.

The first time you access a dataset, its file is fetched over the network from the libam-datasets repository and cached on disk in the per-user pooch cache (e.g. ~/.cache/libam). Subsequent accesses load directly from that cache, so each dataset is downloaded only once.
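
The download-once behaviour described above can be illustrated with a small standard-library sketch. This is not libam's or pooch's code: the function `fetch_once` and its `download` callback are hypothetical names chosen for illustration, and the real cache layout may differ.

```python
from pathlib import Path
from typing import Callable


def fetch_once(name: str, cache_dir: Path, download: Callable[[str], bytes]) -> Path:
    """Return the cached path for `name`, downloading only if it is missing.

    Hypothetical helper: `download` stands in for whatever network fetch
    the library actually performs.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / name
    if not target.exists():
        # First access: fetch the bytes and store them on disk.
        target.write_bytes(download(name))
    # Later accesses reach this line without calling `download` again.
    return target
```

Calling `fetch_once` twice with the same `name` and `cache_dir` invokes `download` only on the first call, which is the property the cache is meant to provide.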

Loading a dataset¶

Every dataset exposes two methods. Use graph() to get the raw graph as a NetworkX object, or graphpair() to get a GraphPair ready for alignment:

from libam import datasets

# Downloaded and cached on first access, loaded from cache thereafter.
# networkx.Graph, which can be used to create a GraphPair
g = datasets.bio_celegans.graph()

# libam.GraphPair, which can be passed directly to alignment algorithms
pair = datasets.bio_celegans.graphpair().permute().add_noise(target_noise=0.05)

Available datasets¶

The following datasets are available as attributes of libam.datasets.

Single-graph datasets (structure only), often used to build mirrored graph pairs by calling graphpair() together with permute() and add_noise():

bio_celegans
bio_dmela
ca_astro_ph
ca_erdos992
ca_gr_qc
ca_netscience
in_arenas
inf_euroroad
inf_power
soc_facebook
soc_hamsterster
socfb_bowdoin47
socfb_hamilton46
socfb_haverford76
socfb_swarthmore42
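
To make the idea of a mirrored graph pair concrete, here is a standard-library sketch of what permuting and adding noise can mean for a plain edge list. It is an illustration only: the function `mirrored_pair` is hypothetical, and the noise model used here (dropping a fraction of target edges) is an assumption, so the behaviour of libam's own permute() and add_noise() may differ.

```python
import random


def mirrored_pair(edges, noise=0.05, seed=0):
    """Return (source_edges, target_edges, mapping) for an undirected edge list.

    Hypothetical helper, not part of libam.
    """
    rng = random.Random(seed)
    nodes = sorted({u for e in edges for u in e})

    # Permute: relabel every node through a random bijection.
    shuffled = nodes[:]
    rng.shuffle(shuffled)
    mapping = dict(zip(nodes, shuffled))
    target = [(mapping[u], mapping[v]) for u, v in edges]

    # Add noise: drop a fraction of the target edges (one possible noise model).
    n_drop = int(round(noise * len(target)))
    drop = set(rng.sample(range(len(target)), n_drop))
    target = [e for i, e in enumerate(target) if i not in drop]

    return list(edges), target, mapping
```

The returned `mapping` plays the role of the ground truth: it records which target node each source node was relabelled to, so an alignment algorithm's output can be checked against it.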

Paired datasets, each holding a source graph, a target graph and a ground-truth mapping. These include node features unless noted:

cora
douban
acm_dblp
allmv_tmdb
fb_tw
ppi
foursquare (no node features)
phone (no node features)
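
A ground-truth mapping is typically used to score an alignment. The sketch below assumes the mapping can be viewed as a dictionary from source node to target node; how libam actually stores it is not stated here, and `alignment_accuracy` is a hypothetical helper rather than a library function.

```python
def alignment_accuracy(predicted: dict, ground_truth: dict) -> float:
    """Fraction of ground-truth source nodes mapped to the correct target node."""
    if not ground_truth:
        raise ValueError("ground_truth must not be empty")
    # A source node missing from `predicted` counts as a wrong match.
    correct = sum(1 for s, t in ground_truth.items() if predicted.get(s) == t)
    return correct / len(ground_truth)
```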

The `Dataset` class¶

class libam.datasets.Dataset(filename: str, loader: Callable[[Path], Any], parser: Callable[[...], GraphPair], members: list[str] | None = None)¶
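
The signature suggests a two-stage design: a loader that reads the cached file and a parser that turns the loaded data into the final object. The sketch below shows one plausible reading of that split. `SketchDataset` and its `build` method are hypothetical stand-ins, not the real class, and the `members` parameter is left out because the signature alone does not say what it controls.

```python
from pathlib import Path
from typing import Any, Callable


class SketchDataset:
    """Hypothetical stand-in that mimics the loader/parser split; not libam.datasets.Dataset."""

    def __init__(self, filename: str, loader: Callable[[Path], Any], parser: Callable[..., Any]):
        self.filename = filename
        self.loader = loader
        self.parser = parser

    def build(self, cache_dir: Path) -> Any:
        # The loader turns the cached file into raw data;
        # the parser turns that raw data into the final object.
        raw = self.loader(cache_dir / self.filename)
        return self.parser(raw)
```

Keeping the two callables separate means the same file reader can be reused across datasets that share a format while each dataset supplies its own parsing logic.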
