module adabmDCA.stats
function get_freq_single_point
get_freq_single_point(
data: Tensor,
weights: Optional[Tensor] = None,
pseudo_count: float = 0.0
) → Tensor
Computes the single-point frequencies of the input MSA.
Args:
data (torch.Tensor): One-hot encoded data array.
weights (Optional[torch.Tensor], optional): Weights of the sequences. Defaults to None.
pseudo_count (float, optional): Pseudo count to be added to the frequencies. Defaults to 0.0.
Raises:
ValueError: If the input data is not a 3D tensor.
Returns:
torch.Tensor: Single point frequencies.
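As a rough illustration of the quantity described here, the sketch below derives weighted single-site frequencies from a one-hot tensor in plain PyTorch. It is not the library's implementation: the helper name `single_point_freq_sketch`, the `(M, L, q)` layout (sequences, positions, states) and the pseudo-count mixing `(1 - pseudo_count) * f + pseudo_count / q` are assumptions made for the example.

```python
import torch

def single_point_freq_sketch(data, weights=None, pseudo_count=0.0):
    """Hypothetical helper: weighted single-site frequencies of a one-hot MSA."""
    if data.dim() != 3:
        raise ValueError("Expected a 3D one-hot tensor of shape (M, L, q)")
    M, L, q = data.shape
    if weights is None:
        weights = torch.ones(M, dtype=data.dtype)
    # Normalize the weights so that they sum to one.
    w = (weights / weights.sum()).view(M, 1, 1)
    freq = (data * w).sum(dim=0)  # shape (L, q)
    # Assumed regularization: mix with the uniform distribution over q states.
    return (1.0 - pseudo_count) * freq + pseudo_count / q

# Toy alignment: 3 sequences, 2 positions, 2 states.
msa = torch.tensor([[0, 1], [0, 0], [1, 1]])
one_hot = torch.nn.functional.one_hot(msa, num_classes=2).float()
fi = single_point_freq_sketch(one_hot)
```

Under these assumptions each row of `fi` sums to one, and a pseudo count of 1.0 would return the uniform distribution at every position.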
function get_freq_two_points
get_freq_two_points(
data: Tensor,
weights: Optional[Tensor] = None,
pseudo_count: float = 0.0
) → Tensor
Computes the two-point statistics of the input MSA.
Args:
data (torch.Tensor): One-hot encoded data array.
weights (Optional[torch.Tensor], optional): Array of weights to assign to the sequences of shape. Defaults to None.
pseudo_count (float, optional): Pseudo count for the single-point and two-point statistics. Acts as a regularization. Defaults to 0.0.
Raises:
ValueError: If the input data is not a 3D tensor.
Returns:
torch.Tensor: Matrix of two-point frequencies of shape (L, q, L, q).
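The `(L, q, L, q)` layout of the result can be made concrete with a short sketch. The helper `two_point_freq_sketch` below is hypothetical, assumes input of shape `(M, L, q)`, and leaves out the pseudo count; it only shows how a weighted pairwise frequency tensor of that shape could be assembled.

```python
import torch

def two_point_freq_sketch(data, weights=None):
    """Hypothetical helper: weighted pairwise frequencies, shape (L, q, L, q)."""
    M, L, q = data.shape
    if weights is None:
        weights = torch.ones(M, dtype=data.dtype)
    w = weights / weights.sum()
    # Sum over sequences m of w[m] * x[m, i, a] * x[m, j, b].
    return torch.einsum("m,mia,mjb->iajb", w, data, data)

# Toy alignment: 3 sequences, 2 positions, 2 states.
msa = torch.tensor([[0, 1], [0, 0], [1, 1]])
one_hot = torch.nn.functional.one_hot(msa, num_classes=2).float()
fij = two_point_freq_sketch(one_hot)
```

In this sketch `fij[i, a, j, b]` is the weighted fraction of sequences with state `a` at position `i` and state `b` at position `j`, so the tensor is symmetric under swapping `(i, a)` with `(j, b)`.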
function generate_unique_triplets
generate_unique_triplets(
L: int,
ntriplets: int,
device: device = device(type='cpu')
) → Tensor
Generates a set of unique triplets of positions. Used to compute the three-point statistics.
Args:
L (int): Length of the sequences.
ntriplets (int): Number of triplets to be generated.
device (torch.device, optional): Device to perform computations on. Defaults to "cpu".
Returns:
torch.Tensor: Tensor of shape (ntriplets, 3) containing the indices of the triplets.
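One simple way to obtain distinct position triplets is sketched below. The helper `unique_triplets_sketch`, its `seed` argument and the convention `i < j < k` are assumptions for illustration; the library may sample differently, and enumerating every combination as done here is only practical for short sequences.

```python
import torch

def unique_triplets_sketch(L, ntriplets, seed=0):
    """Hypothetical helper: sample distinct triplets i < j < k of positions."""
    gen = torch.Generator().manual_seed(seed)
    # Enumerate all L-choose-3 triplets; fine for short sequences only.
    all_triplets = torch.combinations(torch.arange(L), r=3)
    if ntriplets > all_triplets.shape[0]:
        raise ValueError("Requested more triplets than exist for this length")
    # Shuffle the row indices and keep the first ntriplets of them.
    picked = torch.randperm(all_triplets.shape[0], generator=gen)[:ntriplets]
    return all_triplets[picked]

triplets = unique_triplets_sketch(L=6, ntriplets=5)
```

Because rows are drawn without replacement from the full list of combinations, no triplet can appear twice in the output.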
function get_freq_three_points
get_freq_three_points(
nat: Tensor,
gen: Tensor,
ntriplets: int,
weights: Optional[Tensor] = None,
device: device = device(type='cpu')
) → Tuple[Tensor, Tensor]
Computes the 3-body connected correlation statistics of the input MSAs.
Args:
nat (torch.Tensor): Input MSA representing natural data in one-hot encoding.
gen (torch.Tensor): Input MSA representing generated data in one-hot encoding.
ntriplets (int): Number of triplets to test.
weights (Optional[torch.Tensor], optional): Importance weights for the natural sequences. Defaults to None.
device (torch.device, optional): Device to perform computations on. Defaults to "cpu".
Returns:
Tuple[torch.Tensor, torch.Tensor]: Natural and generated three-point connected correlations for ntriplets randomly drawn triplets.
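For reference, a connected three-point correlation is commonly defined as the third-order cumulant of the one-hot variables. The expression below is that standard definition, written with f for empirical frequencies at positions i, j, k and states a, b, c. That the function uses exactly this form is an assumption, since the description above does not spell it out.

```latex
C_{ijk}(a,b,c) = f_{ijk}(a,b,c)
  - f_{ij}(a,b)\,f_k(c) - f_{ik}(a,c)\,f_j(b) - f_{jk}(b,c)\,f_i(a)
  + 2\,f_i(a)\,f_j(b)\,f_k(c)
```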
function get_covariance_matrix
get_covariance_matrix(
data: Tensor,
weights: Optional[Tensor] = None,
pseudo_count: float = 0.0
) → Tensor
Computes the weighted covariance matrix of the input multiple sequence alignment (MSA).
Args:
data (torch.Tensor): Input MSA in one-hot variables.
weights (torch.Tensor | None, optional): Importance weights of the sequences. Defaults to None.
pseudo_count (float, optional): Pseudo count. Defaults to 0.0.
Returns:
torch.Tensor: Covariance matrix.
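A weighted covariance of one-hot variables can be sketched as the pairwise frequencies minus the outer product of the single-site frequencies. The helper `covariance_sketch` below is hypothetical; the `(M, L, q)` input layout and the flattened `(L*q, L*q)` output are assumptions, and the pseudo count is omitted.

```python
import torch

def covariance_sketch(data, weights=None):
    """Hypothetical helper: weighted covariance of flattened one-hot variables."""
    M, L, q = data.shape
    if weights is None:
        weights = torch.ones(M, dtype=data.dtype)
    w = weights / weights.sum()
    x = data.reshape(M, L * q)
    mean = w @ x                       # single-site frequencies, shape (L*q,)
    second = (x * w[:, None]).T @ x    # pairwise frequencies, shape (L*q, L*q)
    return second - torch.outer(mean, mean)

# Toy alignment: 3 sequences, 2 positions, 2 states.
msa = torch.tensor([[0, 1], [0, 0], [1, 1]])
one_hot = torch.nn.functional.one_hot(msa, num_classes=2).float()
cov = covariance_sketch(one_hot)
```

Since the one-hot states at a site sum to one in every sequence, summing this matrix over the states of any single site gives zero.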
function extract_Cij_from_freq
extract_Cij_from_freq(
fij: Tensor,
pij: Tensor,
fi: Tensor,
pi: Tensor,
mask: Optional[Tensor] = None
) → Tuple[Tensor, Tensor]
Extracts the lower triangular part of the covariance matrices of the natural data and generated data starting from the frequencies.
Args:
fij (torch.Tensor): Two-point frequencies of the natural data.
pij (torch.Tensor): Two-point frequencies of the generated data.
fi (torch.Tensor): Single-point frequencies of the natural data.
pi (torch.Tensor): Single-point frequencies of the generated data.
mask (Optional[torch.Tensor], optional): Mask for comparing just a subset of the couplings. Defaults to None.
Returns:
Tuple[torch.Tensor, torch.Tensor]: Extracted covariance matrix entries of the natural data and generated data.
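The step from frequencies to covariance entries can be sketched as follows for a single data set; the same operation would be applied once to `(fij, fi)` and once to `(pij, pi)`. The helper `lower_triangle_cov_sketch` is hypothetical, and excluding the diagonal, ignoring the mask and flattening the output are all assumptions.

```python
import torch

def lower_triangle_cov_sketch(fij, fi):
    """Hypothetical helper: connected correlations for position pairs i > j."""
    L, q = fi.shape
    # C_ij(a, b) = f_ij(a, b) - f_i(a) * f_j(b)
    cij = fij - torch.einsum("ia,jb->iajb", fi, fi)
    # Strictly lower-triangular position pairs (assumed: diagonal excluded).
    rows, cols = torch.tril_indices(L, L, offset=-1)
    return cij[rows, :, cols, :].reshape(-1)

# Toy alignment: 4 sequences, 3 positions, 2 states, uniform weights.
msa = torch.tensor([[0, 1, 0], [0, 0, 1], [1, 1, 1], [1, 0, 0]])
x = torch.nn.functional.one_hot(msa, num_classes=2).float()
fi = x.mean(dim=0)
fij = torch.einsum("mia,mjb->iajb", x, x) / x.shape[0]
c_nat = lower_triangle_cov_sketch(fij, fi)
```

With 3 positions and 2 states there are 3 position pairs with `i > j`, each contributing a 2 x 2 block, so `c_nat` holds 12 values.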
function extract_Cij_from_seqs
extract_Cij_from_seqs(
data: Tensor,
chains: Tensor,
weights: Optional[Tensor] = None,
pseudo_count: float = 0.0,
mask: Optional[Tensor] = None
) → Tuple[Tensor, Tensor]
Extracts the lower triangular part of the covariance matrices of the natural data and generated data starting from the sequences.
Args:
data (torch.Tensor): Natural data sequences.
chains (torch.Tensor): Generated data sequences.
weights (torch.Tensor | None, optional): Weights of the sequences. Defaults to None.
pseudo_count (float, optional): Pseudo count for the single and two points statistics. Acts as a regularization. Defaults to 0.0.
mask (torch.Tensor | None, optional): Mask for comparing just a subset of the couplings. Defaults to None.
Returns:
Tuple[torch.Tensor, torch.Tensor]: Extracted covariance matrix entries of the natural data and generated data.
function get_correlation_two_points
get_correlation_two_points(
fij: Tensor,
pij: Tensor,
fi: Tensor,
pi: Tensor,
mask: Optional[Tensor] = None
) → Tuple[float, float]
Computes the Pearson coefficient and the slope between the two-point frequencies of data and chains.
Args:
fij (torch.Tensor): Two-point frequencies of the natural data.
pij (torch.Tensor): Two-point frequencies of the generated data.
fi (torch.Tensor): Single-point frequencies of the natural data.
pi (torch.Tensor): Single-point frequencies of the generated data.
mask (Optional[torch.Tensor], optional): Mask to select the couplings to use for the correlation coefficient. Defaults to None.
Returns:
Tuple[float, float]: Pearson correlation coefficient of the two-sites statistics and slope of the interpolating line.
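The two returned numbers can be illustrated on a pair of flat vectors of statistics. The helper `pearson_and_slope_sketch` below is hypothetical. Regressing the generated values on the natural ones, and feeding it covariance entries rather than raw frequencies, are assumptions, as the description above does not fix either choice.

```python
import torch

def pearson_and_slope_sketch(x, y):
    """Hypothetical helper: Pearson coefficient and least-squares slope of y on x."""
    xc = x - x.mean()
    yc = y - y.mean()
    pearson = (xc * yc).sum() / torch.sqrt((xc ** 2).sum() * (yc ** 2).sum())
    slope = (xc * yc).sum() / (xc ** 2).sum()
    return pearson.item(), slope.item()

# Made-up values: the "generated" statistics are exactly half the "natural" ones.
c_nat = torch.tensor([0.10, -0.05, 0.02, 0.08])
c_gen = 0.5 * c_nat
pearson, slope = pearson_and_slope_sketch(c_nat, c_gen)
```

The toy input shows why both numbers are useful: a perfectly linear relation gives a Pearson coefficient of 1 even though the generated statistics are too small by a factor of two, which only the slope of 0.5 reveals.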
This file was automatically generated via lazydocs.