API Overview
Modules
adabmDCA.checkpoint
adabmDCA.cobalt
adabmDCA.dataset
adabmDCA.dca
adabmDCA.fasta
adabmDCA.functional
adabmDCA.graph
adabmDCA.io
adabmDCA.plot
adabmDCA.resampling
adabmDCA.sampling
adabmDCA.statmech
adabmDCA.stats
adabmDCA.training
adabmDCA.utils
Classes
checkpoint.Checkpoint: Helper class to save the model's parameters and chains at regular intervals during training and to log the
dataset.DatasetDCA: Dataset class for handling multiple sequence alignment (MSA) data.
Functions
cobalt.prune_redundant_sequences: Prunes sequences from X such that no sequence has more than 'seqid_th' fraction of its residues identical to any other sequence in the set.
cobalt.run_cobalt: Runs the Cobalt algorithm to split the input MSA into training and test sets.
cobalt.split_train_test: Splits X into two sets, T and S, such that no sequence in S has more than
dca.get_contact_map: Computes the contact map from the model coupling matrix.
dca.get_mf_contact_map: Computes the contact map from the data using the mean-field approximation.
dca.get_seqid: Returns a tensor containing the sequence identities between two sets of one-hot encoded sequences.
dca.get_seqid_stats: If s2 is provided, computes the mean and the standard deviation of the mean sequence identity between two sets of one-hot encoded sequences.
dca.set_zerosum_gauge: Sets the zero-sum gauge on the coupling matrix.
fasta.compute_weights: Computes the weight to be assigned to each sequence 's' in 'data' as 1 / n_clust, where 'n_clust' is the number of sequences
fasta.decode_sequence: Takes a numeric sequence or list of sequences as input and returns the corresponding string encoding.
fasta.encode_sequence: Encodes a sequence or a list of sequences into a numeric format.
fasta.get_tokens: Converts a known alphabet into the corresponding tokens, otherwise returns the custom alphabet.
fasta.import_from_fasta: Imports sequences from a fasta or compressed fasta (.fas.gz) file. The following operations are performed:
fasta.validate_alphabet: Checks whether the chosen alphabet is compatible with the input sequences.
fasta.write_fasta: Generates a fasta file with the input sequences.
functional.one_hot: A one-hot encoding function, faster than the PyTorch one, that works with torch.int32 input and returns a float Tensor.
graph.decimate_graph: Performs one decimation step and updates the parameters and mask.
graph.update_mask_activation: Updates the mask by removing the nactivate couplings with the smallest Dkl.
graph.update_mask_decimation: Updates the mask by removing the n_remove couplings with the smallest Dkl.
io.load_chains: Loads the sequences from a fasta file and returns the one-hot encoded version.
io.load_params: Imports the parameters of the model from a text file.
io.load_params_old: Imports the parameters of the model from a file.
io.load_params_oldformat: Imports the parameters of the model from a file. Assumes the old DCA format.
io.save_chains: Saves the chains in a fasta file.
io.save_params: Saves the parameters of the model in a file.
io.save_params_oldformat: Saves the parameters of the model in a file. Assumes the old DCA format.
plot.plot_PCA: Makes the scatter plot of the components (pc1, pc2) of the input data and shows the histograms of the components.
plot.plot_autocorrelation: Plots the time-autocorrelation curve of the sequence identity and the generated and data sequence identities.
plot.plot_contact_map: Plots the contact map.
plot.plot_pearson_sampling: Plots the Pearson correlation coefficient over sampling time.
plot.plot_scatter_correlations: Plots the scatter plot of the data and generated Cij and Cijk values.
resampling.compute_mixing_time: Computes the mixing time using the t and t/2 method. The sampling will halt when the mixing time is reached or
sampling.get_sampler: Returns the sampling function corresponding to the chosen method.
sampling.gibbs_sampling: Gibbs sampling. Attempts L * nsweeps mutations to each sequence in 'chains'.
sampling.gibbs_step_independent_sites: Performs a single mutation using the Gibbs sampler. This version selects different random sites for each chain. It is
sampling.gibbs_step_uniform_sites: Performs a single mutation using the Gibbs sampler. In this version, the mutation is attempted at the same sites for all chains.
sampling.metropolis_sampling: Metropolis sampling. Attempts L * nsweeps mutations to each sequence in 'chains'.
sampling.metropolis_step_independent_sites: Performs a single mutation using the Metropolis sampler. This version selects different random sites for each chain. It is
sampling.metropolis_step_uniform_sites: Performs a single mutation using the Metropolis sampler. In this version, the mutation is attempted at the same sites for all chains.
sampling.sampling_profile: Samples from the profile model defined by the local biases only.
statmech.compute_energy: Computes the DCA energy for a batch of sequences.
statmech.compute_entropy: Computes the entropy of the DCA model.
statmech.compute_logZ_exact: Computes the log-partition function of the model.
statmech.compute_log_likelihood: Computes the log-likelihood of the model.
statmech.enumerate_states: Enumerates all possible states of a system of L sites and q states.
statmech.iterate_tap: Iterates the TAP equations until convergence.
stats.extract_Cij_from_freq: Extracts the lower triangular part of the covariance matrices of the natural data and generated data starting from the frequencies.
stats.extract_Cij_from_seqs: Extracts the lower triangular part of the covariance matrices of the natural data and generated data starting from the sequences.
stats.generate_unique_triplets: Generates a set of unique triplets of positions. Used to compute the 3-point statistics.
stats.get_correlation_two_points: Computes the Pearson coefficient and the slope between the two-point frequencies of data and chains.
stats.get_covariance_matrix: Computes the weighted covariance matrix of the input multiple sequence alignment.
stats.get_freq_single_point: Computes the single-point frequencies of the input MSA.
stats.get_freq_three_points: Computes the 3-body connected correlation statistics of the input MSAs.
stats.get_freq_two_points: Computes the 2-point statistics of the input MSA.
training.train_eaDCA: Fits an eaDCA model on the training data and saves the results in a file.
training.train_edDCA: Fits an edDCA model on the training data and saves the results in a file.
training.train_graph: Trains the model on a given graph until the target Pearson correlation is reached or the maximum number of epochs is exceeded.
training.update_params: Updates the parameters of the model.
utils.get_device: Returns the device on which to store the tensors.
utils.get_dtype: Returns the data type of the tensors.
utils.get_mask_save: Returns the mask to save the upper-triangular part of the coupling matrix.
utils.init_chains: Initializes the Markov chains of the DCA model. If 'fi' is provided, the chains are sampled from the
utils.init_parameters: Initializes the parameters of the DCA model. The bias terms are initialized
utils.resample_sequences: Extracts nextract sequences from data with replacement according to the weights.
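
The summary of fasta.compute_weights above gives the weight of a sequence as 1 / n_clust, but the description is cut off before it says what n_clust counts. The sketch below illustrates the idea in pure Python, assuming the common DCA reweighting convention: n_clust is the number of sequences in the alignment (the sequence itself included) whose fractional identity with it is at least a threshold. The names sequence_identity, compute_weights_sketch and th are hypothetical and are not the package's API; the real function works on encoded tensors and may differ in both signature and convention.

```python
def sequence_identity(a: str, b: str) -> float:
    """Fraction of aligned positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same aligned length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def compute_weights_sketch(msa: list[str], th: float = 0.8) -> list[float]:
    """Weight each sequence by 1 / n_clust (hypothetical helper, not adabmDCA's API).

    ASSUMPTION: n_clust counts the sequences, the sequence itself included,
    whose identity with it is at least `th`.
    """
    weights = []
    for s in msa:
        # Count every sequence in the alignment that is "close" to s.
        n_clust = sum(sequence_identity(s, t) >= th for t in msa)
        weights.append(1.0 / n_clust)
    return weights


# Toy alignment: three near-identical sequences and one outlier.
msa = ["ACDEF", "ACDEG", "ACDEF", "WYWYW"]
weights = compute_weights_sketch(msa, th=0.8)
print(weights)
print("effective number of sequences:", sum(weights))
```

Under this assumption the three similar sequences share a cluster and are each down-weighted, while the outlier keeps full weight. The sum of the weights is often read as the effective number of sequences in the alignment.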
This file was automatically generated via lazydocs.