Rohmer Coralie
msa-limit

Repository

git clone https://gitlab.cristal.univ-lille.fr/crohmer/msa-limit.git
cd msa-limit
./msa-limit.py test
Usage:
msa-limit.py -i <file_reads> -r <file_ref> [-options]

Arguments:
  required:
    -i <string>
       nanopore long reads file (fasta or fastq)
    -r <string>
       reference sequence file (fasta, a single sequence)
       IUPAC consensus sequence in the diploid case

  optional:
    -n <string>
       default: date and time of execution
       name of the experiment
    -o <int>
       default: 10
       number of regions to be tested
    -b <int>,<int>,...
       start position(s) of region(s) (replaces -o)
    -d <int>,<int>,...
       default: 10,20,50
       sequencing depth(s) (number of reads)
    -s <int>,<int>,...
       default: 100,200
       size(s) of region(s)
    -t <int>,<int>,...
       default: 50
       threshold(s) for consensus sequences
    -m <string>,<string>,...
       default: muscle,mafft,poa,kalign,spoa,kalign3,clustalo,abpoa,tcoffee (all)
       MSA software(s) to run
    -h
       help

Ex: ./msa-limit.py -i reads.fastq -r ref.fasta -b 1,150 -n exp -d 10,100 -s 100,200 -t 50,75 -m mafft,poa
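To gauge how much work the example command requests, the option lists can be expanded into individual runs. The sketch below assumes that the pipeline evaluates every combination of region, region size, depth and tool, and then builds one consensus per threshold from each alignment. That is an assumption about the pipeline's behaviour, not something the help text states, and `expand_runs` is a hypothetical helper, not part of msa-limit.

```python
from itertools import product


def expand_runs(starts, sizes, depths, tools, thresholds):
    """Enumerate (start, size, depth, tool, threshold) combinations.

    Hypothetical helper: assumes a full Cartesian product of the option lists.
    """
    return list(product(starts, sizes, depths, tools, thresholds))


# Values taken from the example command above.
runs = expand_runs(
    starts=[1, 150],         # -b 1,150
    sizes=[100, 200],        # -s 100,200
    depths=[10, 100],        # -d 10,100
    tools=["mafft", "poa"],  # -m mafft,poa
    thresholds=[50, 75],     # -t 50,75
)

# Assumption: the threshold only affects the consensus step, so runs that
# differ only by threshold share the same alignment.
alignments = {run[:4] for run in runs}
print(len(alignments), "alignments,", len(runs), "consensus sequences")
```

Under these assumptions the example expands to 16 alignments and 32 consensus sequences, which is a quick way to estimate run time before adding more depths or tools.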

Other modes:
  test
      Launches a pipeline test
  list
      List of existing experiments
  summary
      Human-readable summary of experiments
        optional:
         -n <string> <string> <string> ...
            default: all the names of the experiments
            names of the experiments you want to display in the summary.
  run_config <string> <string> ...
       Launches the pipeline from configuration file(s)
         required: path to the configuration file(s).
  rulegraph
       Displays a graph of the snakemake rules
I: <reads_file>    #REQUIRED, absolute path preferred
R: <ref_file>      #REQUIRED, absolute path preferred
n: test            #OPTIONAL, -n
D: [10,20,50]      #OPTIONAL, -d
S: [100,200]       #OPTIONAL, -s
T: [50]            #OPTIONAL, -t
M: [muscle,mafft]  #OPTIONAL, -m
O: 10              #OPTIONAL, -o, can be replaced by -b (B: [1,150])
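Before launching a long run from a configuration file, it can help to check the optional entries for typos. The following is a minimal sketch using only the standard library. It is not how msa-limit reads its configuration (Snakemake presumably parses the file as YAML), and `parse_config` and `check_config` are hypothetical helpers. The required read and reference entries are left out of the sample, and the list of known tools is copied from the `-m` default in the help text.

```python
KNOWN_TOOLS = {"muscle", "mafft", "poa", "kalign", "spoa",
               "kalign3", "clustalo", "abpoa", "tcoffee"}


def parse_config(text):
    """Parse 'KEY: value  #comment' lines; '[a,b]' becomes a list of strings."""
    config = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        if value.startswith("[") and value.endswith("]"):
            value = [item.strip() for item in value[1:-1].split(",") if item.strip()]
        config[key] = value
    return config


def as_list(value):
    """Treat a single scalar value as a one-element list."""
    return value if isinstance(value, list) else [value]


def check_config(config):
    """Return a list of problems found in the optional entries."""
    problems = []
    for key in ("D", "S", "T"):  # depths, region sizes, thresholds
        for item in as_list(config.get(key, [])):
            if not item.isdigit() or int(item) <= 0:
                problems.append(f"{key}: '{item}' is not a positive integer")
    for tool in as_list(config.get("M", [])):
        if tool not in KNOWN_TOOLS:
            problems.append(f"M: unknown tool '{tool}'")
    return problems


sample = """
n: test
D: [10,20,50]
S: [100,200]
T: [50]
M: [muscle,mafft,clustalw]   # 'clustalw' is a deliberate mistake
O: 10
"""
print(check_config(parse_config(sample)))
```

With the sample above, the only reported problem is the unknown tool name `clustalw`.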
snakemake --configfile <config_file> -c24 --use-conda
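When several configuration files have to be run one after another, the command above can be assembled programmatically. This is only a sketch: `snakemake_command` is a hypothetical helper, the file names are placeholders, and `--cores` is the long form of the `-c` option used above.

```python
import shlex


def snakemake_command(config_file, cores=24, use_conda=True):
    """Build the argument list for one pipeline run (hypothetical helper)."""
    command = ["snakemake", "--configfile", config_file, "--cores", str(cores)]
    if use_conda:
        command.append("--use-conda")
    return command


for config_file in ["exp1.yaml", "exp2.yaml"]:  # placeholder file names
    command = snakemake_command(config_file, cores=4)
    print(shlex.join(command))
    # To actually launch the run from the repository root:
    # import subprocess; subprocess.run(command, check=True)
```

Building the command as a list rather than a single string avoids quoting problems when a configuration path contains spaces.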
conda create -n <new_msa>
conda activate <new_msa>
conda install <new_msa>
conda env export >env_conda/<new_msa>.yaml
rule <new_msa> :
    input :
        os.path.join('{data_set}','selected_read','reads_r{region_size}_d{depth}.fasta')
    output :
        time = os.path.join('{data_set}','time','MSA_<new_msa>_r{region_size}_d{depth}'),
        out = os.path.join('{data_set}','msa','MSA_<new_msa>_r{region_size}_d{depth}.fasta')
    message:
        "<new_msa> for {wildcards.data_set} (Region size={wildcards.region_size} & Depth={wildcards.depth})"
    log:
        os.path.join('{data_set}','logs','6_<new_msa>_r{region_size}_d{depth}.log')
    conda:   #Only if you use conda
        "env_conda/<new_msa>.yaml"
    shell :
        './src/run_MSA.sh "<command_to_launch_the_software>" {input} {output.out} {output.time} {log} 1'
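To see which files the rule above reads and writes for a given experiment, the path patterns can be filled in by hand. The sketch below imitates wildcard substitution with plain `str.format`. It is not how Snakemake resolves wildcards internally, and the tool name `mytool` and the helper `resolve` are hypothetical stand-ins for `<new_msa>`.

```python
import os

TOOL = "mytool"  # hypothetical name substituted for <new_msa>

# Patterns copied from the rule above.
PATTERNS = {
    "input": os.path.join("{data_set}", "selected_read",
                          "reads_r{region_size}_d{depth}.fasta"),
    "time": os.path.join("{data_set}", "time",
                         "MSA_" + TOOL + "_r{region_size}_d{depth}"),
    "out": os.path.join("{data_set}", "msa",
                        "MSA_" + TOOL + "_r{region_size}_d{depth}.fasta"),
    "log": os.path.join("{data_set}", "logs",
                        "6_" + TOOL + "_r{region_size}_d{depth}.log"),
}


def resolve(data_set, region_size, depth):
    """Fill the wildcards of every pattern for one combination of values."""
    return {
        name: pattern.format(data_set=data_set, region_size=region_size, depth=depth)
        for name, pattern in PATTERNS.items()
    }


for name, path in resolve("exp", 100, 10).items():
    print(f"{name:>5}: {path}")
```

Each combination of experiment name, region size and depth therefore maps to one read file, one alignment, one timing file and one log, so a new tool's rule has to keep the same naming scheme.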