Script Arguments
This section lists the command-line arguments for the main routines of adabmDCA 2.0.
Train a DCA model
| Command | Default value | Description |
|---|---|---|
| | N/A | Filename of the dataset to be used for training the model. |
| | DCA_model | Path to the folder where the model is saved. |
| | bmDCA | Type of model to be trained. Possible options are |
| | None | Path to the file containing the weights of the sequences. If |
| | 0.8 | Sequence identity threshold used for computing the sequence weights. |
| | N/A | If this flag is used, the routine assigns uniform weights to the sequences. |
| | None | Path to the file containing the model’s parameters. Required to resume training. |
| | None | Path to the FASTA file containing the model’s chains. Required to resume training. |
| | None | Label identifying the run. It is used as a prefix for the output files. |
| | protein | Type of encoding for the sequences. Choose among |
| | 0.05 | Learning rate. |
| | 10 | Number of sweeps for each gradient estimation. |
| | gibbs | Sampling method to be used. Possible options are |
| | 10000 | Number of Markov chains to run in parallel. |
| | 0.95 | Target Pearson correlation coefficient on the two-site statistics. |
| | 50000 | Maximum number of epochs allowed. |
| | None | Pseudocount for the single- and two-site statistics. Acts as a regularization. If |
| | 0 | Random seed. |
| | 1 | Number of threads used in the Julia multithreaded version. |
| | cuda | Device to use, either cuda (GPU) or CPU. Used in the Python version. |
| | float32 | Data type to use, either float32 or float64. Used in the Python version. |
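
The sequence identity threshold is commonly used in DCA to down-weight similar sequences: each sequence receives a weight equal to the inverse of the number of sequences in the alignment that are at least that similar to it. The following is a minimal NumPy sketch of this convention, not the adabmDCA implementation. The function name `compute_weights`, the integer encoding of the MSA, and the use of `>=` at the threshold are assumptions made for illustration.

```python
import numpy as np

def compute_weights(msa: np.ndarray, seqid_threshold: float = 0.8) -> np.ndarray:
    """Return one weight per sequence of an integer-encoded MSA of shape (M, L)."""
    # Pairwise fraction of identical positions, shape (M, M).
    identity = (msa[:, None, :] == msa[None, :, :]).mean(axis=2)
    # Each sequence counts itself, so the neighbour count is at least 1.
    n_neighbours = (identity >= seqid_threshold).sum(axis=1)
    return 1.0 / n_neighbours

# Toy alignment (hypothetical data): the first two sequences are 90% identical,
# the third one shares no position with the others.
msa = np.array([
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 0],
    [9, 8, 7, 6, 5, 4, 3, 2, 1, 1],
])
weights = compute_weights(msa)   # [0.5, 0.5, 1.0]
effective_size = weights.sum()   # 2.0
```

The pairwise comparison above needs memory proportional to M² × L, so it is only suitable for small alignments.
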
eaDCA options
| Command | Default value | Description |
|---|---|---|
| | 10 | Number of gradient updates to be performed on a given graph. |
| | 0.001 | Fraction of inactive couplings to try to activate at each graph update. |
edDCA options
| Command | Default value | Description |
|---|---|---|
| | 10 | The number of gradient updates applied at each step of the graph convergence process. |
| | 0.02 | Target density to be reached. |
| | 0.01 | Fraction of remaining couplings to be pruned at each decimation step. |
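
To get a feeling for how the target density and the pruning fraction interact, the short calculation below estimates how many decimation steps are needed with the default values. It assumes that decimation starts from a fully connected graph (density 1) and that every step removes exactly the stated fraction of the remaining couplings, so the density after k steps is (1 - fraction)^k. The function and parameter names are hypothetical.

```python
import math

def decimation_steps(target_density: float = 0.02,
                     pruning_fraction: float = 0.01,
                     start_density: float = 1.0) -> int:
    """Smallest k such that start_density * (1 - pruning_fraction)**k <= target_density."""
    ratio = target_density / start_density
    return math.ceil(math.log(ratio) / math.log(1.0 - pruning_fraction))

steps = decimation_steps()   # 390 with the default values
```

Under these assumptions, a larger pruning fraction reaches the target in fewer steps, at the cost of a coarser decimation.
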
Sampling from a DCA model
| Command | Default value | Description |
|---|---|---|
| | N/A | Path to the file containing the parameters of the DCA model to sample from. |
| | N/A | Filename of the dataset MSA. |
| | N/A | Path to the folder where the output is saved. |
| | None | Number of samples to generate. |
| | None | Label identifying the run. It is used as a prefix for the output files. |
| | None | Path to the file containing the weights of the sequences. If |
| | 0.8 | Sequence identity threshold used for computing the sequence weights. |
| | N/A | If this flag is used, the routine assigns uniform weights to the sequences. |
| | 10000 | Number of data sequences to use for computing the mixing time. The value min( |
| | 2 | Number of mixing times used to generate ‘ngen’ sequences starting from a random initialization. |
| | 10000 | Maximum number of sweeps allowed. |
| | protein | Type of encoding for the sequences. Choose among |
| | gibbs | Sampling method to be used. Possible options are |
| | 1.0 | Inverse temperature to be used for the sampling. |
| | None | Pseudocount for the single- and two-site statistics. Acts as a regularization. If |
| | cuda | Device to use, either cuda (GPU) or CPU. Used in the Python version. |
| | float32 | Data type to use, either float32 or float64. Used in the Python version. |
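
As an illustration of what a sweep of the `gibbs` sampler at a given inverse temperature does, here is a minimal, unoptimized NumPy sketch. It is not the adabmDCA code: it assumes fields `h` of shape (L, q), couplings `J` of shape (L, q, L, q) with `J[i, :, i, :] = 0`, and the sign convention in which a state's probability is proportional to exp(beta × (fields + couplings)). The function name `gibbs_sweep` is hypothetical.

```python
import numpy as np

def gibbs_sweep(seq, h, J, beta, rng):
    """Resample every site of `seq` once, in place, and return it."""
    L, q = h.shape
    sites = np.arange(L)
    for i in range(L):
        # Unnormalized log-probabilities of the q states at site i, given the
        # current states of all other sites (the j == i term is zero by assumption).
        logits = beta * (h[i] + J[i][:, sites, seq].sum(axis=1))
        probs = np.exp(logits - logits.max())   # subtract the max for stability
        probs /= probs.sum()
        seq[i] = rng.choice(q, p=probs)
    return seq

rng = np.random.default_rng(0)
L, q = 6, 4
h = np.zeros((L, q))
h[:, 2] = 50.0                  # toy model: a strong field favours state 2
J = np.zeros((L, q, L, q))
seq = rng.integers(q, size=L)   # random initialization
for _ in range(2):              # two sweeps
    gibbs_sweep(seq, h, J, beta=1.0, rng=rng)
```

Lowering `beta` flattens the conditional distributions and makes the samples more diverse, while raising it concentrates them on low-energy sequences.
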
Computing DCA energies of a MSA
| Command | Default value | Description |
|---|---|---|
| | N/A | Filename of the input MSA. |
| | N/A | Path to the file containing the parameters of the DCA model. |
| | N/A | Path to the folder where the output is saved. |
| | protein | Type of encoding for the sequences. Choose among |
| | cuda | Device to use, either cuda (GPU) or CPU. Used in the Python version. |
| | float32 | Data type to use, either float32 or float64. Used in the Python version. |
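
For reference, the DCA energy of a sequence is conventionally the negative sum of its fields and pairwise couplings. The sketch below evaluates it for every sequence of an integer-encoded MSA. It is a hedged illustration rather than the package's routine: the array shapes (`h` of shape (L, q), `J` of shape (L, q, L, q)), the symmetry of `J` with zero `i == j` blocks, and the function name `dca_energies` are assumptions.

```python
import numpy as np

def dca_energies(msa, h, J):
    """Energy of each sequence in an integer-encoded MSA of shape (M, L)."""
    M, L = msa.shape
    sites = np.arange(L)
    # Sum of the fields h[i, a_i] over sites, shape (M,).
    fields = h[sites, msa].sum(axis=1)
    # pair[m, i, j] = J[i, a_i, j, a_j] for sequence m, shape (M, L, L).
    pair = J[sites[:, None], msa[:, :, None], sites[None, :], msa[:, None, :]]
    # J is assumed symmetric, so each pair is counted twice: hence the 0.5.
    return -fields - 0.5 * pair.sum(axis=(1, 2))

# Toy model with L = 2 sites and q = 2 states (hypothetical values).
K = np.array([[1.0, 2.0], [3.0, 4.0]])
h = np.array([[0.1, 0.2], [0.3, 0.4]])
J = np.zeros((2, 2, 2, 2))
J[0, :, 1, :] = K
J[1, :, 0, :] = K.T
msa = np.array([[0, 1], [1, 0]])
energies = dca_energies(msa, h, J)   # approximately [-2.5, -3.5]
```
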
Generate a Deep Mutational Scan (DMS) from a wild type
| Command | Default value | Description |
|---|---|---|
| | N/A | Filename of the input MSA containing the wild type. If multiple sequences are present, the first one is used. |
| | N/A | Path to the file containing the parameters of the DCA model. |
| | N/A | Path to the folder where the output is saved. |
| | protein | Type of encoding for the sequences. Choose among |
| | cuda | Device to use, either cuda (GPU) or CPU. Used in the Python version. |
| | float32 | Data type to use, either float32 or float64. Used in the Python version. |
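
A deep mutational scan of this kind can be pictured as the table of energy differences between each single-site mutant and the wild type. The following sketch computes that table under the usual DCA energy convention. It is an assumption-laden illustration, not the package's implementation: `h` has shape (L, q), `J` has shape (L, q, L, q) and is symmetric with zero `i == j` blocks, and the function name is hypothetical.

```python
import numpy as np

def single_mutant_delta_energies(wt, h, J):
    """delta[i, a] = E(wild type with site i set to a) - E(wild type)."""
    L, q = h.shape
    sites = np.arange(L)
    # local[i, a] = h[i, a] + sum_j J[i, a, j, wt[j]]: the contribution of state a
    # at site i in the wild-type background (the j == i term is zero by assumption).
    local = h + J[:, :, sites, wt].sum(axis=2)
    # Energies carry a minus sign, so a larger local term means a lower energy.
    return -(local - local[sites, wt][:, None])

# Toy model with L = 2 sites and q = 2 states (hypothetical values).
K = np.array([[1.0, 2.0], [3.0, 4.0]])
h = np.array([[0.1, 0.2], [0.3, 0.4]])
J = np.zeros((2, 2, 2, 2))
J[0, :, 1, :] = K
J[1, :, 0, :] = K.T
wt = np.array([0, 1])
delta = single_mutant_delta_energies(wt, h, J)
```

Entries corresponding to the wild-type residue are zero by construction; under this sign convention, negative entries mark mutations that the model scores as more favourable than the wild type.
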
Compute the Frobenius contact matrix
| Command | Default value | Description |
|---|---|---|
| | N/A | Path to the file containing the parameters of the DCA model. |
| | N/A | Path to the folder where the output is saved. |
| | None | If provided, adds a label to the output files inside the output folder. |
| | protein | Type of encoding for the sequences. Choose among |
| | cuda | Device to use, either cuda (GPU) or CPU. Used in the Python version. |
| | float32 | Data type to use, either float32 or float64. Used in the Python version. |
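
As background, a Frobenius contact matrix is typically obtained by taking the Frobenius norm of each q × q coupling block, usually after moving the couplings to the zero-sum gauge and often followed by an average product correction (APC). The sketch below shows one common variant of this recipe. Whether adabmDCA applies exactly these steps (for example, whether gap states are excluded) is not stated here, so treat the gauge choice, the APC formula and the function name as assumptions.

```python
import numpy as np

def frobenius_contact_matrix(J, apc=True):
    """Return an (L, L) score matrix from couplings J of shape (L, q, L, q)."""
    # Zero-sum gauge: remove row, column and overall means of every q x q block.
    Jz = (J
          - J.mean(axis=1, keepdims=True)
          - J.mean(axis=3, keepdims=True)
          + J.mean(axis=(1, 3), keepdims=True))
    # Frobenius norm of each block.
    F = np.sqrt((Jz ** 2).sum(axis=(1, 3)))
    np.fill_diagonal(F, 0.0)
    if apc:
        # Average product correction (one common variant).
        F = F - np.outer(F.mean(axis=1), F.mean(axis=0)) / F.mean()
        np.fill_diagonal(F, 0.0)
    return F

# Random symmetric toy couplings (hypothetical data) with L = 3, q = 2.
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 2, 3, 2))
J = A + A.transpose(2, 3, 0, 1)
for i in range(3):
    J[i, :, i, :] = 0.0
scores = frobenius_contact_matrix(J)
```

Higher scores are usually read as stronger evidence for a contact between the two sites.
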
¹ Used in specific versions of the software.