module cobalt
function split_train_test
split_train_test(
headers: list[str],
X: Tensor,
seqid_th: float,
rnd_gen: Generator | None = None
) → tuple[list, Tensor, list, Tensor]
Splits X into a training set T and a test set S such that no sequence in S has more than a fraction 'seqid_th' of its residues identical to any sequence in T.
Args:
    headers (list[str]): List of sequence headers.
    X (torch.Tensor): Encoded input MSA.
    seqid_th (float): Threshold sequence identity.
    rnd_gen (torch.Generator, optional): Random number generator. Defaults to None.
Returns:
    tuple[list, torch.Tensor, list, torch.Tensor]: Training and test sets.
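The constraint described above can be illustrated with a small, library-independent check. This is only a sketch in pure Python: `fractional_identity` and `split_is_valid` are hypothetical helpers, not part of the `cobalt` module, and the sketch assumes that identity is the number of matching aligned positions divided by the alignment length, with gap symbols treated like any other symbol. The module itself may define identity differently.

```python
from typing import Sequence


def fractional_identity(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of aligned positions at which two equal-length rows agree.

    Assumption: gaps are counted like any other symbol.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    if not a:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def split_is_valid(
    train: Sequence[Sequence[int]],
    test: Sequence[Sequence[int]],
    seqid_th: float,
) -> bool:
    """True if no test row exceeds seqid_th identity to any training row."""
    return all(
        fractional_identity(s, t) <= seqid_th
        for s in test
        for t in train
    )


# Toy encoded alignment rows (integers stand in for residue codes).
train = [[0, 1, 2, 3], [0, 1, 2, 0]]
test = [[3, 1, 0, 2]]

print(split_is_valid(train, test, seqid_th=0.5))  # True: max cross identity is 0.25
print(split_is_valid(train, test, seqid_th=0.2))  # False: 0.25 > 0.2
```

A check of this kind can be run on the four values returned by `split_train_test` after converting the tensors to nested lists, for instance with `Tensor.tolist()`, provided the encoding stores one integer per residue.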
function prune_redundant_sequences
prune_redundant_sequences(
headers: list[str],
X: Tensor,
seqid_th: float,
rnd_gen: Generator | None = None
) → tuple[list, Tensor]
Prunes sequences from X such that no sequence has more than 'seqid_th' fraction of its residues identical to any other sequence in the set.
Args:
    headers (list[str]): List of sequence headers.
    X (torch.Tensor): Encoded input MSA.
    seqid_th (float): Threshold sequence identity.
    rnd_gen (torch.Generator, optional): Random number generator. Defaults to None.
Returns:
    tuple[list, torch.Tensor]: Pruned sequences.
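One simple way to obtain a set that satisfies this property is a greedy pass over the sequences in random order. The sketch below is an illustration only and is not claimed to be the procedure `prune_redundant_sequences` uses. `greedy_prune` and `fractional_identity` are hypothetical names, Python's `random.Random` stands in for `torch.Generator`, and identity is assumed to be the fraction of matching aligned positions.

```python
import random
from typing import Optional, Sequence


def fractional_identity(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of aligned positions at which two equal-length rows agree."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_prune(
    headers: Sequence[str],
    rows: Sequence[Sequence[int]],
    seqid_th: float,
    rng: Optional[random.Random] = None,
):
    """Keep a row only if it is at most seqid_th identical to every kept row."""
    rng = rng or random.Random()
    order = list(range(len(rows)))
    rng.shuffle(order)  # a random visiting order decides which redundant rows survive

    kept = []
    for i in order:
        if all(fractional_identity(rows[i], rows[j]) <= seqid_th for j in kept):
            kept.append(i)

    kept.sort()  # restore the original input order in the output
    return [headers[i] for i in kept], [rows[i] for i in kept]


headers = ["a", "b", "c"]
rows = [[0, 1, 2, 3], [0, 1, 2, 0], [3, 3, 3, 1]]

# "a" and "b" are 75% identical, so exactly one of them is dropped at a 0.5
# threshold; "c" shares no position with either and is always kept.
kept_headers, kept_rows = greedy_prune(headers, rows, 0.5, random.Random(0))
print(kept_headers)
```

The greedy pass guarantees the pairwise constraint but not a maximal result: a different visiting order can keep a different subset, which is why a seeded generator matters for reproducibility.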
function run_cobalt
run_cobalt(
headers: list[str],
X: Tensor,
t1: float,
t2: float,
t3: float,
max_train: int | None,
max_test: int | None,
rnd_gen: Generator | None = None
) → tuple[list, Tensor, list, Tensor]
Runs the Cobalt algorithm to split the input MSA into training and test sets.
Args:
    headers (list[str]): List of sequence headers.
    X (torch.Tensor): Encoded input MSA.
    t1 (float): No sequence in the test set S has more than this fraction of its residues identical to any sequence in the training set T.
    t2 (float): No pair of test sequences has a fractional identity above this value.
    t3 (float): No pair of training sequences has a fractional identity above this value.
    max_train (int | None): Maximum number of sequences in the training set.
    max_test (int | None): Maximum number of sequences in the test set.
    rnd_gen (torch.Generator, optional): Random number generator. Defaults to None.
Returns:
    tuple[list, torch.Tensor, list, torch.Tensor]: Training and test sets.
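The five constraints documented for `run_cobalt` (the thresholds `t1`, `t2`, `t3` and the size caps `max_train`, `max_test`) can be verified on any candidate split with a short checker. The sketch below is pure Python and independent of the module. `check_cobalt_output` and its helpers are hypothetical names, and identity is again assumed to be the fraction of matching aligned positions.

```python
from itertools import combinations
from typing import Optional, Sequence

Rows = Sequence[Sequence[int]]


def fractional_identity(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of aligned positions at which two equal-length rows agree."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def max_pairwise_identity(rows: Rows) -> float:
    """Largest identity between any two distinct rows (0.0 if fewer than two)."""
    return max(
        (fractional_identity(a, b) for a, b in combinations(rows, 2)),
        default=0.0,
    )


def max_cross_identity(test: Rows, train: Rows) -> float:
    """Largest identity between any test row and any training row."""
    return max(
        (fractional_identity(s, t) for s in test for t in train),
        default=0.0,
    )


def check_cobalt_output(
    train: Rows,
    test: Rows,
    t1: float,
    t2: float,
    t3: float,
    max_train: Optional[int] = None,
    max_test: Optional[int] = None,
) -> dict:
    """Report, per documented constraint, whether the split satisfies it."""
    return {
        "t1_cross": max_cross_identity(test, train) <= t1,
        "t2_test": max_pairwise_identity(test) <= t2,
        "t3_train": max_pairwise_identity(train) <= t3,
        "max_train": max_train is None or len(train) <= max_train,
        "max_test": max_test is None or len(test) <= max_test,
    }


train = [[0, 1, 2, 3], [3, 2, 1, 0]]
test = [[1, 1, 1, 1], [2, 2, 2, 2]]

report = check_cobalt_output(train, test, t1=0.3, t2=0.5, t3=0.9, max_train=2)
print(all(report.values()))  # True: max cross identity is 0.25, both within-set maxima are 0.0
```

Returning one flag per constraint, instead of a single boolean, makes it easier to see which threshold a given split violates.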
This file was automatically generated via lazydocs.