Script Arguments
In this section we list all the possible command-line arguments for the main routines of adabmDCA 2.0.
Train a DCA model
| Command | Default value | Description |
|---|---|---|
| | N/A | Filename of the dataset to be used for training the model. |
| | DCA_model | Path to the folder where to save the model. |
| | bmDCA | Type of model to be trained. Possible options are |
| | None | Path to the file containing the weights of the sequences. If |
| | None | Path to the file containing the model’s parameters. Required to resume training. |
| | None | Path to the FASTA file containing the model’s chains. Required to resume training. |
| | None | A label to identify different algorithm runs. It prefixes the output files with this label. |
| | protein | Type of encoding for the sequences. Choose among |
| | 0.05 | Learning rate. |
| | 10 | Number of sweeps for each gradient estimation. |
| | gibbs | Sampling method to be used. Possible options are |
| | 10000 | Number of Markov chains to run in parallel. |
| | 0.95 | Pearson correlation coefficient on the two-site statistics to be reached. |
| | 50000 | Maximum number of epochs allowed. |
| | None | Pseudo count for the single and two-site statistics. Acts as a regularization. If |
| | 0 | Random seed. |
| | 1 | Number of threads used in the multithreaded Julia version. |
| | cuda | Device to use in the Python version, e.g. “cpu” or “cuda”. |
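
The stopping criterion in the table (a target Pearson correlation on the two-site statistics, default 0.95) can be illustrated with a minimal sketch. The function names and the toy frequency values below are hypothetical, and how adabmDCA flattens or centers the statistics before comparing them is an assumption not specified here.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def has_converged(data_stats, model_stats, target=0.95):
    """True once the correlation between the two sets of statistics reaches the target."""
    return pearson(data_stats, model_stats) >= target

# Hypothetical flattened two-site frequencies (toy values, not real data).
data_stats = [0.10, 0.25, 0.05, 0.40, 0.20]
model_stats = [0.12, 0.22, 0.07, 0.38, 0.21]
print(has_converged(data_stats, model_stats))
```

In a training loop this check would typically be combined with the maximum number of epochs, so that training stops at whichever limit is hit first.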
eaDCA options
| Command | Default value | Description |
|---|---|---|
| | 10 | Number of gradient updates to be performed on a given graph. |
| | 0.001 | Fraction of inactive couplings to try to activate at each graph update. |
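
As a rough illustration of the activation fraction, the sketch below turns it into a number of candidate couplings per graph update. The function name, the rounding rule, and the floor of one candidate are assumptions made for the example, not the tool's documented behavior.

```python
def n_activation_candidates(n_inactive, fraction=0.001):
    """Number of inactive couplings proposed for activation in one graph update.

    Rounding to the nearest integer and proposing at least one coupling
    are assumptions of this sketch.
    """
    if n_inactive <= 0:
        return 0
    return max(1, int(round(n_inactive * fraction)))

# With 50,000 inactive couplings and the default fraction,
# 50 couplings would be proposed in this sketch.
print(n_activation_candidates(50_000))
```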
edDCA options
| Command | Default value | Description |
|---|---|---|
| | 10 | The number of gradient updates applied at each step of the graph convergence process. |
| | 0.02 | Target density to be reached. |
| | 0.01 | Fraction of remaining couplings to be pruned at each decimation step. |
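
To get a feel for how the target density and the pruning fraction interact, the following sketch counts the decimation steps needed when each step removes a fixed fraction of the remaining couplings. It assumes a fully connected starting graph (density 1.0) and ignores integer rounding of the number of pruned couplings; the function name is hypothetical.

```python
def decimation_schedule(target_density=0.02, prune_fraction=0.01, start_density=1.0):
    """Count decimation steps until the coupling density drops to the target.

    Each step keeps a fraction (1 - prune_fraction) of the remaining couplings.
    """
    density = start_density
    steps = 0
    while density > target_density:
        density *= 1.0 - prune_fraction
        steps += 1
    return steps, density

steps, final_density = decimation_schedule()
print(steps, final_density)
```

Under these assumptions the defaults require a few hundred decimation steps, each followed by the gradient updates listed in the table, which gives an idea of the cost of a full run.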
Sampling from a DCA model
| Command | Default value | Description |
|---|---|---|
| | N/A | Path to the file containing the parameters of the DCA model to sample from. |
| | N/A | Filename of the dataset MSA. |
| | N/A | Path to the folder where to save the output. |
| | None | Number of samples to generate. |
| | None | Path to the file containing the weights of the sequences. If |
| | 10000 | Number of data sequences to use for computing the mixing time. The value min( |
| | 2 | Number of mixing times used to generate |
| | 10000 | Maximum number of sweeps allowed. |
| | protein | Type of encoding for the sequences. Choose among |
| | gibbs | Sampling method to be used. Possible options are |
| | 1.0 | Inverse temperature to be used for the sampling. |
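
The effect of the inverse temperature can be sketched with the Boltzmann distribution p(s) ∝ exp(-β E(s)) over a handful of states. The function name and the toy energies are hypothetical; the example only shows how β reshapes the probabilities and is not the sampler itself.

```python
import math

def boltzmann_probabilities(energies, beta=1.0):
    """Normalized Boltzmann weights exp(-beta * E) for a list of energies."""
    e_min = min(energies)  # shift energies for numerical stability
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

toy_energies = [-2.0, -1.0, 0.0]  # toy values, not real model energies
for beta in (0.5, 1.0, 2.0):
    print(beta, boltzmann_probabilities(toy_energies, beta))
```

Values of β above 1 concentrate the samples on low-energy sequences, while values below 1 flatten the distribution.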
Computing DCA energies of a MSA
| Command | Default value | Description |
|---|---|---|
| | N/A | Filename of the input MSA. |
| | N/A | Path to the file containing the parameters of the DCA model. |
| | N/A | Path to the folder where to save the output. |
| | protein | Type of encoding for the sequences. Choose among |
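
For orientation, the DCA energy of a sequence under a Potts model is E(a) = -Σ_i h_i(a_i) - Σ_{i<j} J_ij(a_i, a_j). The sketch below evaluates it for an integer-encoded sequence; the data layout (nested lists for the fields, a dictionary keyed by position pairs for the couplings) and the toy parameters are assumptions for illustration, not the format of the model's parameter file.

```python
def potts_energy(seq, fields, couplings):
    """Potts energy of an integer-encoded sequence.

    fields[i][a] is the field for symbol a at position i;
    couplings[(i, j)][a][b] is the coupling for i < j.
    """
    energy = 0.0
    for i, a in enumerate(seq):
        energy -= fields[i][a]
    for (i, j), J in couplings.items():
        energy -= J[seq[i]][seq[j]]
    return energy

# Toy model with 3 positions and 2 symbols (values are illustrative only).
fields = [[0.5, -0.5], [0.0, 1.0], [0.2, 0.3]]
couplings = {
    (0, 1): [[1.0, 0.0], [0.0, 1.0]],
    (1, 2): [[0.0, -1.0], [0.5, 0.0]],
}
print(potts_energy([0, 1, 0], fields, couplings))
```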
Generate a Deep Mutational Scan (DMS) from a wild type
| Command | Default value | Description |
|---|---|---|
| | N/A | Filename of the input MSA containing the wild type. If multiple sequences are present, the first one is used. |
| | N/A | Path to the file containing the parameters of the DCA model. |
| | N/A | Path to the folder where to save the output. |
| | protein | Type of encoding for the sequences. Choose among |
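
Conceptually, a deep mutational scan enumerates every single-site mutant of the wild type and records its energy difference with respect to the wild type. The sketch below shows this with a placeholder energy function; the function names, the mutation labels (wild-type symbol, 1-based position, mutant symbol), and the toy energy are assumptions, and whether gap symbols are included among the mutations is not specified here.

```python
def single_mutants(wild_type, alphabet):
    """Yield (position, wild-type symbol, mutant symbol, mutant sequence)."""
    for pos, wt_symbol in enumerate(wild_type):
        for symbol in alphabet:
            if symbol != wt_symbol:
                mutant = wild_type[:pos] + symbol + wild_type[pos + 1:]
                yield pos, wt_symbol, symbol, mutant

def deep_mutational_scan(wild_type, alphabet, energy):
    """Map each single mutation to its energy difference from the wild type."""
    e_wt = energy(wild_type)
    return {
        f"{wt}{pos + 1}{mut}": energy(seq) - e_wt
        for pos, wt, mut, seq in single_mutants(wild_type, alphabet)
    }

def toy_energy(seq):
    """Placeholder energy: each 'A' lowers the energy by one unit."""
    return -seq.count("A")

dms = deep_mutational_scan("ACA", "AC", toy_energy)
print(dms)
```

With a DCA model, `toy_energy` would be replaced by the model's energy function; negative differences then mark mutations the model considers favorable.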
Compute the Frobenius contact matrix
| Command | Default value | Description |
|---|---|---|
| | N/A | Path to the file containing the parameters of the DCA model. |
| | N/A | Path to the folder where to save the output. |
| | None | If provided, adds a label to the output files inside the output folder. |
| | protein | Type of encoding for the sequences. Choose among |
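
A Frobenius contact score is commonly computed as F_ij = sqrt(Σ_{a,b} J_ij(a,b)²), often followed by the average product correction (APC). The sketch below implements both steps on a toy coupling dictionary. The data layout and function names are hypothetical, and any preprocessing the tool may apply (such as a gauge change or the exclusion of the gap symbol) is deliberately left out.

```python
import math

def frobenius_scores(couplings, length):
    """Symmetric matrix of Frobenius norms of the coupling blocks J_ij."""
    F = [[0.0] * length for _ in range(length)]
    for (i, j), J in couplings.items():
        norm = math.sqrt(sum(v * v for row in J for v in row))
        F[i][j] = F[j][i] = norm
    return F

def average_product_correction(F):
    """Subtract the APC term (row mean * column mean / overall mean).

    Assumes at least one nonzero score, otherwise the overall mean is zero.
    """
    n = len(F)
    row_mean = [sum(F[i][j] for j in range(n) if j != i) / (n - 1) for i in range(n)]
    total_mean = sum(row_mean) / n
    return [
        [0.0 if i == j else F[i][j] - row_mean[i] * row_mean[j] / total_mean
         for j in range(n)]
        for i in range(n)
    ]

# Toy couplings for 3 positions and 2 symbols (illustrative values only).
toy_couplings = {(0, 1): [[3.0, 0.0], [0.0, 4.0]]}
F = frobenius_scores(toy_couplings, 3)
F_apc = average_product_correction(F)
print(F[0][1], F_apc[0][1])
```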