
module stats


function get_freq_single_point

get_freq_single_point(
    data: Tensor,
    weights: Tensor | None = None,
    pseudo_count: float = 0.0
) → Tensor

Computes the single-point frequencies of the input multiple sequence alignment (MSA).

Args:

  • data (torch.Tensor): One-hot encoded data array.
  • weights (torch.Tensor | None, optional): Weights of the sequences. Defaults to None.
  • pseudo_count (float, optional): Pseudo count to be added to the frequencies. Defaults to 0.0.

Raises:

  • ValueError: If the input data is not a 3D tensor.

Returns:

  • torch.Tensor: Single point frequencies.
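
As a rough illustration of the quantity described above, the following sketch computes weighted single-point frequencies in plain PyTorch. It is not the library's implementation: the helper name single_point_freq_sketch is hypothetical, the data is assumed to have shape (N, L, q), and the pseudo count is assumed to mix the empirical frequencies with a uniform distribution as (1 - pseudo_count) * f + pseudo_count / q.

```python
import torch


def single_point_freq_sketch(data, weights=None, pseudo_count=0.0):
    """Hypothetical helper, for illustration only (not the library function)."""
    if data.dim() != 3:
        raise ValueError("Expected a 3D one-hot tensor of shape (N, L, q)")
    N, L, q = data.shape
    if weights is None:
        # Assumption: missing weights mean every sequence counts equally
        weights = torch.ones(N, dtype=data.dtype, device=data.device)
    # Normalise the weights so that they sum to one
    w = (weights / weights.sum()).view(N, 1, 1)
    fi = (data * w).sum(dim=0)  # shape (L, q)
    # Assumed regularisation: blend with the uniform distribution over q states
    return (1.0 - pseudo_count) * fi + pseudo_count / q


# Toy MSA: 4 sequences, 3 positions, 2 states
seqs = torch.tensor([[0, 1, 0], [0, 1, 1], [1, 1, 0], [0, 0, 0]])
data = torch.nn.functional.one_hot(seqs, num_classes=2).float()
fi = single_point_freq_sketch(data)
```

Under these assumptions each row of the result is a probability distribution over the q states of one position, with or without the pseudo count.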

function get_freq_two_points

get_freq_two_points(
    data: Tensor,
    weights: Tensor | None = None,
    pseudo_count: float = 0.0
) → Tensor

Computes the two-point frequencies of the input MSA.

Args:

  • data (torch.Tensor): One-hot encoded data array.
  • weights (torch.Tensor | None, optional): Weights to assign to the sequences, one per sequence. Defaults to None.
  • pseudo_count (float, optional): Pseudo count for the single and two points statistics. Acts as a regularization. Defaults to 0.0.

Raises:

  • ValueError: If the input data is not a 3D tensor.

Returns:

  • torch.Tensor: Matrix of two-point frequencies of shape (L, q, L, q).
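
To make the (L, q, L, q) layout concrete, here is a minimal sketch of a weighted two-point frequency computation. The helper name two_point_freq_sketch is hypothetical, and both the pseudo-count rule (pseudo_count / q**2 for pairs) and the handling of the i == j blocks (diagonal matrices built from the single-point frequencies) are assumptions rather than confirmed library behaviour.

```python
import torch


def two_point_freq_sketch(data, weights=None, pseudo_count=0.0):
    """Hypothetical helper, for illustration only (not the library function)."""
    if data.dim() != 3:
        raise ValueError("Expected a 3D one-hot tensor of shape (N, L, q)")
    N, L, q = data.shape
    if weights is None:
        weights = torch.ones(N, dtype=data.dtype, device=data.device)
    w = weights / weights.sum()
    # Weighted averages of x_i(a) and of x_i(a) * x_j(b)
    fi = torch.einsum("n,nia->ia", w, data)
    fij = torch.einsum("n,nia,njb->iajb", w, data, data)
    # Assumed regularisation towards the uniform distribution
    fi = (1.0 - pseudo_count) * fi + pseudo_count / q
    fij = (1.0 - pseudo_count) * fij + pseudo_count / q**2
    # Assumption: a position paired with itself reduces to its own frequencies
    for i in range(L):
        fij[i, :, i, :] = torch.diag(fi[i])
    return fij


# Toy MSA: 4 sequences, 3 positions, 2 states
seqs = torch.tensor([[0, 1, 0], [0, 1, 1], [1, 1, 0], [0, 0, 0]])
data = torch.nn.functional.one_hot(seqs, num_classes=2).float()
fij = two_point_freq_sketch(data)
```

Summing a block fij[i, :, j, :] over its last index recovers the single-point frequencies of position i, which is a convenient sanity check.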

function generate_unique_triplets

generate_unique_triplets(
    L: int,
    ntriplets: int,
    device: device = device(type='cpu')
) → Tensor

Generates a set of unique triplets of positions, used to compute the three-point statistics.

Args:

  • L (int): Length of the sequences.
  • ntriplets (int): Number of triplets to be generated.
  • device (torch.device, optional): Device to perform computations on. Defaults to "cpu".

Returns:

  • torch.Tensor: Tensor of shape (ntriplets, 3) containing the indices of the triplets.
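
One simple way to obtain unique triplets is to enumerate every combination of three positions and keep a random subset. The sketch below does that; the helper name unique_triplets_sketch is hypothetical, and both the enumeration strategy and the i < j < k ordering within each triplet are assumptions (the library may sample differently).

```python
import torch


def unique_triplets_sketch(L, ntriplets, device=torch.device("cpu")):
    """Hypothetical helper, for illustration only (not the library function)."""
    # All combinations i < j < k of the L positions, shape (C(L, 3), 3)
    all_triplets = torch.combinations(torch.arange(L, device=device), r=3)
    if ntriplets > all_triplets.shape[0]:
        raise ValueError("Requested more triplets than exist for this L")
    # A random subset without replacement guarantees uniqueness
    keep = torch.randperm(all_triplets.shape[0], device=device)[:ntriplets]
    return all_triplets[keep]


triplets = unique_triplets_sketch(L=10, ntriplets=20)
```

Enumerating all combinations costs memory proportional to L**3 / 6, so this approach only suits moderate sequence lengths.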

function get_freq_three_points

get_freq_three_points(
    nat: Tensor,
    gen: Tensor,
    ntriplets: int,
    weights: Tensor | None = None,
    device: device = device(type='cpu')
) → Tuple[Tensor, Tensor]

Computes the 3-body connected correlation statistics of the input MSAs.

Args:

  • nat (torch.Tensor): Input MSA representing natural data in one-hot encoding.
  • gen (torch.Tensor): Input MSA representing generated data in one-hot encoding.
  • ntriplets (int): Number of triplets to test.
  • weights (torch.Tensor | None, optional): Importance weights for the natural sequences. Defaults to None.
  • device (torch.device, optional): Device to perform computations on. Defaults to "cpu".

Returns:

  • Tuple[torch.Tensor, torch.Tensor]: Three-point connected correlations of the natural and generated data, computed on ntriplets randomly drawn triplets of positions.
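
For reference, a three-point connected correlation can be written as the weighted average of the product of three centred one-hot variables. The sketch below evaluates that expression for a single MSA on a given set of triplets; the helper name three_point_connected_sketch and the (T, q, q, q) output layout are assumptions, and the library function additionally handles the natural/generated pair and the triplet sampling.

```python
import torch


def three_point_connected_sketch(data, triplets, weights=None):
    """Hypothetical helper, for illustration only (not the library function)."""
    N, L, q = data.shape
    if weights is None:
        weights = torch.ones(N, dtype=data.dtype, device=data.device)
    w = weights / weights.sum()
    # Centre the one-hot variables around their weighted mean
    fi = torch.einsum("n,nia->ia", w, data)
    centered = data - fi  # broadcasts (N, L, q) - (L, q)
    xi = centered[:, triplets[:, 0], :]  # shape (N, T, q)
    xj = centered[:, triplets[:, 1], :]
    xk = centered[:, triplets[:, 2], :]
    # Weighted average of the triple product for every triplet and state combination
    return torch.einsum("n,nta,ntb,ntc->tabc", w, xi, xj, xk)


torch.manual_seed(0)
seqs = torch.randint(0, 3, (50, 5))  # 50 sequences, 5 positions, 3 states
data = torch.nn.functional.one_hot(seqs, num_classes=3).float()
triplets = torch.tensor([[0, 1, 2], [1, 3, 4]])
c3 = three_point_connected_sketch(data, triplets)
```

Because each centred one-hot vector sums to zero over its states, summing the result over any one of the three state indices gives zero.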

function get_covariance_matrix

get_covariance_matrix(
    data: Tensor,
    weights: Tensor | None = None,
    pseudo_count: float = 0.0
) → Tensor

Computes the weighted covariance matrix of the input multiple sequence alignment (MSA).

Args:

  • data (torch.Tensor): Input MSA in one-hot encoding.
  • weights (torch.Tensor | None, optional): Importance weights of the sequences. Defaults to None.
  • pseudo_count (float, optional): Pseudo count. Defaults to 0.0.

Returns:

  • torch.Tensor: Covariance matrix.
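
The covariance of one-hot variables is the two-point frequency minus the product of the single-point frequencies. A minimal sketch of that computation follows; the helper name covariance_sketch is hypothetical, the (L*q, L*q) output shape is an assumption (the reference does not state the shape), and the pseudo count is left out for brevity.

```python
import torch


def covariance_sketch(data, weights=None):
    """Hypothetical helper, for illustration only (not the library function)."""
    N, L, q = data.shape
    if weights is None:
        weights = torch.ones(N, dtype=data.dtype, device=data.device)
    w = weights / weights.sum()
    x = data.reshape(N, L * q)  # flatten (position, state) into one axis
    mean = w @ x  # weighted single-point frequencies, shape (L*q,)
    second = (x * w[:, None]).T @ x  # weighted two-point frequencies
    return second - torch.outer(mean, mean)


# Toy MSA: 4 sequences, 3 positions, 2 states
seqs = torch.tensor([[0, 1, 0], [0, 1, 1], [1, 1, 0], [0, 0, 0]])
data = torch.nn.functional.one_hot(seqs, num_classes=2).float()
C = covariance_sketch(data)
```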

function extract_Cij_from_freq

extract_Cij_from_freq(
    fij: Tensor,
    pij: Tensor,
    fi: Tensor,
    pi: Tensor,
    mask: Tensor | None = None
) → Tuple[float, float]

Extracts the lower triangular part of the covariance matrices of the data and chains starting from the frequencies.

Args:

  • fij (torch.Tensor): Two-point frequencies of the data.
  • pij (torch.Tensor): Two-point frequencies of the chains.
  • fi (torch.Tensor): Single-point frequencies of the data.
  • pi (torch.Tensor): Single-point frequencies of the chains.
  • mask (torch.Tensor | None, optional): Mask for comparing just a subset of the couplings. Defaults to None.

Returns:

  • Tuple[float, float]: Lower triangular entries of the covariance matrices of the data and chains.
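
The extraction step can be pictured as follows: build the covariance from the frequencies, flatten it to a square matrix, and read off the entries strictly below the diagonal. This is a sketch under assumptions, not the library's code: the helper name lower_triangle_cij_sketch is hypothetical, it handles one set of frequencies at a time, and the optional mask is omitted.

```python
import torch


def lower_triangle_cij_sketch(fij, fi):
    """Hypothetical helper, for illustration only (not the library function)."""
    L, q = fi.shape
    # Connected two-point statistics: C_ij(a, b) = f_ij(a, b) - f_i(a) * f_j(b)
    cov = fij - torch.einsum("ia,jb->iajb", fi, fi)
    cov = cov.reshape(L * q, L * q)
    # Indices of the entries strictly below the main diagonal
    rows, cols = torch.tril_indices(L * q, L * q, offset=-1)
    return cov[rows, cols]


# Toy frequencies from 4 sequences, 3 positions, 2 states
seqs = torch.tensor([[0, 1, 0], [0, 1, 1], [1, 1, 0], [0, 0, 0]])
data = torch.nn.functional.one_hot(seqs, num_classes=2).float()
fi = data.mean(dim=0)
fij = torch.einsum("nia,njb->iajb", data, data) / data.shape[0]
cij = lower_triangle_cij_sketch(fij, fi)
```

For a matrix of side L*q this yields L*q*(L*q - 1)/2 values, so each pair of (position, state) entries is counted once.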

function extract_Cij_from_seqs

extract_Cij_from_seqs(
    data: Tensor,
    chains: Tensor,
    weights: Tensor | None = None,
    pseudo_count: float = 0.0,
    mask: Tensor | None = None
) → Tuple[float, float]

Extracts the lower triangular part of the covariance matrices of the data and chains starting from the sequences.

Args:

  • data (torch.Tensor): Data sequences.
  • chains (torch.Tensor): Chain sequences.
  • weights (torch.Tensor | None, optional): Weights of the sequences. Defaults to None.
  • pseudo_count (float, optional): Pseudo count for the single and two points statistics. Acts as a regularization. Defaults to 0.0.
  • mask (torch.Tensor | None, optional): Mask for comparing just a subset of the couplings. Defaults to None.

Returns:

  • Tuple[float, float]: Lower triangular entries of the covariance matrices of the data and chains.

function get_correlation_two_points

get_correlation_two_points(
    fij: Tensor,
    pij: Tensor,
    fi: Tensor,
    pi: Tensor,
    mask: Tensor | None = None
) → Tuple[float, float]

Computes the Pearson coefficient and the slope between the two-point frequencies of data and chains.

Args:

  • fij (torch.Tensor): Two-point frequencies of the data.
  • pij (torch.Tensor): Two-point frequencies of the chains.
  • fi (torch.Tensor): Single-point frequencies of the data.
  • pi (torch.Tensor): Single-point frequencies of the chains.
  • mask (torch.Tensor | None, optional): Mask to select the couplings to use for the correlation coefficient. Defaults to None.

Returns:

  • Tuple[float, float]: Pearson correlation coefficient of the two-site statistics and slope of the interpolating line.
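
Given two flattened vectors of two-site statistics, one for the data and one for the chains, the Pearson coefficient and a least-squares slope can be computed as in the sketch below. The helper name pearson_and_slope_sketch is hypothetical, and regressing the chain values on the data values (rather than the reverse) is an assumption about the convention.

```python
import torch


def pearson_and_slope_sketch(x, y):
    """Hypothetical helper, for illustration only (not the library function)."""
    xc = x - x.mean()
    yc = y - y.mean()
    cross = (xc * yc).sum()
    # Pearson coefficient: normalised covariance of the two vectors
    pearson = cross / torch.sqrt((xc**2).sum() * (yc**2).sum())
    # Least-squares slope of y against x
    slope = cross / (xc**2).sum()
    return pearson.item(), slope.item()


x = torch.tensor([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0
pearson, slope = pearson_and_slope_sketch(x, y)
```

A Pearson coefficient near 1 with a slope near 1 indicates that the chains reproduce the two-site statistics of the data both in ranking and in magnitude.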

This file was automatically generated via lazydocs.