Real Data

The quickstart builds a graph pair from a synthetic graph. This example does the same end-to-end pipeline on a real, bundled dataset and shows both an unrestricted run and a restricted (anchor-based) run.

Loading a bundled dataset

Datasets are accessed straight from libam.datasets and downloaded on first use (see Datasets). A single-graph dataset has identical source and target, so we permute() it and add noise to create a non-trivial benchmark with known ground truth:

import libam
from libam import algorithms as alg
from libam import datasets

pair = datasets.bio_celegans.graphpair().permute().add_noise(target_noise=0.05)
print(f"Source edges: {pair.src.number_of_edges()}, "
      f"Target edges: {pair.tar.number_of_edges()}")

Comparing several algorithms

Every algorithm takes the same pair and exposes align(). Running a few and scoring them with accuracy() gives a quick comparison. Sharing one parameter dictionary across algorithms is convenient; unused keys are ignored (silence the resulting warnings by raising the libam log level):

import logging
logging.getLogger("libam").setLevel(logging.ERROR)

params = {
    "iterations": 1, "simple": True, "mu": 0.05, "efn": 3,  # fugal
    "eta": 0.2, "init_sim": 1, "eig_type": 0,               # grampa_s
    "maxiter": 20, "alpha": 0.85,                           # isorank
}

algorithms = [
    alg.fugal(pair, **params),
    alg.grampa_s(pair, **params),
    alg.isorank(pair, **params),
]

for algorithm in algorithms:
    P = algorithm.align()
    score = libam.evaluation.accuracy(pair, P)
    print(f"{algorithm.name}: accuracy {score:.4f}")

Using anchors with a restricted algorithm

Restricted algorithms take a set of known correspondences. Sample them from the ground truth with get_anchor_links() and pass them as anchor_links:

anchor_links = pair.get_anchor_links(0.1)  # 10% of nodes as anchors

algorithm = alg.joena(pair, anchor_links=anchor_links)
P = algorithm.align()
print(f"{algorithm.name}: accuracy {libam.evaluation.accuracy(pair, P):.4f}")