API Overview
Modules
cobalt
dataset
dca
fasta
functional
io
: The io module provides the Python interfaces to stream handling.
plot
resampling
sampling
statmech
stats
utils
Classes
dataset.DatasetDCA
: Dataset class for handling multiple sequence alignment data.
Functions
cobalt.prune_redundant_sequences
: Prunes sequences from X such that no sequence has more than 'seqid_th' fraction of its residues identical to any other sequence in the set.
cobalt.run_cobalt
: Runs the Cobalt algorithm to split the input MSA into training and test sets.
cobalt.split_train_test
: Splits X into two sets, T and S, such that no sequence in S has more than
dca.get_contact_map
: Computes the contact map from the model coupling matrix.
dca.get_mf_contact_map
: Computes the contact map from the model coupling matrix.
dca.get_seqid
: When average is True:
dca.set_zerosum_gauge
: Sets the zero-sum gauge on the coupling matrix.
fasta.compute_weights
: Computes the weight to be assigned to each sequence 's' in 'data' as 1 / n_clust, where 'n_clust' is the number of sequences
fasta.decode_sequence
: Takes a numeric sequence or a list of sequences as input and returns the corresponding string encoding.
fasta.encode_sequence
: Encodes a sequence or a list of sequences into a numeric format.
fasta.get_tokens
: Converts the alphabet into the corresponding tokens.
fasta.import_from_fasta
: Import sequences from a fasta file. The following operations are performed:
fasta.validate_alphabet
: Check if the chosen alphabet is compatible with the input sequences.
fasta.write_fasta
: Generate a fasta file with the input sequences.
functional.one_hot
: A fast one-hot encoding function, faster than the PyTorch one, that works with torch.int32 input and returns a float Tensor.
io.load_chains
: Loads the sequences from a fasta file and returns the numeric-encoded version.
io.load_params
: Import the parameters of the model from a file.
io.load_params_oldformat
: Import the parameters of the model from a file. Assumes the old DCA format.
io.save_chains
: Saves the chains in a fasta file.
io.save_params
: Saves the parameters of the model in a file.
io.save_params_oldformat
: Saves the parameters of the model in a file. Assumes the old DCA format.
plot.plot_PCA
: Makes the scatter plot of the components (pc1, pc2) of the input data and shows the histograms of the components.
plot.plot_autocorrelation
: Plots the time-autocorrelation curve of the sequence identity and the generated and data sequence identities.
plot.plot_contact_map
: Plots the contact map.
plot.plot_pearson_sampling
: Plots the Pearson correlation coefficient over sampling time.
plot.plot_scatter_correlations
: Plots the scatter plot of the data and generated Cij and Cijk values.
sampling.get_sampler
: Returns the sampling function corresponding to the chosen method.
sampling.gibbs_mutate
: Attempts to perform num_mut mutations using the Gibbs sampler.
sampling.gibbs_sampling
: Gibbs sampling.
sampling.metropolis
: Metropolis sampling.
sampling.metropolis_mutate
: Attempts to perform num_mut mutations using the Metropolis sampler.
statmech.compute_energy
: Compute the DCA energy of the sequences in X.
statmech.compute_entropy
: Compute the entropy of the DCA model.
statmech.compute_logZ_exact
: Compute the log-partition function of the model.
statmech.compute_log_likelihood
: Compute the log-likelihood of the model.
statmech.enumerate_states
: Enumerate all possible states of a system of L sites and q states.
statmech.iterate_tap
: Iterates the TAP equations until convergence.
stats.extract_Cij_from_freq
: Extracts the lower triangular part of the covariance matrices of the data and chains starting from the frequencies.
stats.extract_Cij_from_seqs
: Extracts the lower triangular part of the covariance matrices of the data and chains starting from the sequences.
stats.generate_unique_triplets
: Generates a set of unique triplets of positions. Used to compute the 3-points statistics.
stats.get_correlation_two_points
: Computes the Pearson coefficient and the slope between the two-point frequencies of data and chains.
stats.get_covariance_matrix
: Computes the weighted covariance matrix of the input multiple sequence alignment.
stats.get_freq_single_point
: Computes the single point frequencies of the input MSA.
stats.get_freq_three_points
: Computes the 3-body connected correlation statistics of the input MSAs.
stats.get_freq_two_points
: Computes the 2-points statistics of the input MSA.
utils.get_device
: Returns the device on which to store the tensors.
utils.get_dtype
: Returns the data type of the tensors.
utils.get_mask_save
: Returns the mask to save the upper-triangular part of the coupling matrix.
utils.init_chains
: Initialize the chains of the DCA model. If 'fi' is provided, the chains are sampled from the
utils.init_parameters
: Initialize the parameters of the DCA model.
utils.resample_sequences
: Extracts 'nextract' sequences from data with replacement according to the weights.
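
To make a few of the entries above concrete, here is a minimal, library-independent sketch in plain Python of three operations the list describes: encoding sequences into a numeric format, computing weighted single-point frequencies of an MSA, and enumerating all states of a system of L sites with q states. The function names, signatures, and the toy alphabet are hypothetical and are not the package's API; the documented functions may take different arguments and return tensors rather than lists.

```python
from itertools import product

# Hypothetical toy alphabet for illustration; not the package's token set.
TOKENS = "-ACGU"


def encode(seqs, tokens=TOKENS):
    """Map each character of each sequence to its index in `tokens`."""
    index = {c: i for i, c in enumerate(tokens)}
    return [[index[c] for c in s] for s in seqs]


def single_point_freq(encoded, q, weights=None):
    """Weighted frequency freq[i][a] of token a at position i."""
    if weights is None:
        weights = [1.0] * len(encoded)  # uniform weights by default
    total = sum(weights)
    length = len(encoded[0])
    freq = [[0.0] * q for _ in range(length)]
    for seq, w in zip(encoded, weights):
        for i, a in enumerate(seq):
            freq[i][a] += w / total
    return freq


def enumerate_states(L, q):
    """List all q**L configurations of L sites with q states each."""
    return [list(s) for s in product(range(q), repeat=L)]


msa = ["AC-", "AG-", "UC-"]            # toy alignment, three sequences
X = encode(msa)                        # numeric encoding
fi = single_point_freq(X, q=len(TOKENS))
states = enumerate_states(L=2, q=3)    # 3**2 = 9 configurations
```

Exhaustive enumeration grows as q**L, so a sketch like `enumerate_states` is only practical for very small systems.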
This file was automatically generated via lazydocs.