If you aready of a reference genome or assembly, you can skip the de Novo steps and do as follows:
bwa index final_contigs_300.fasta
OR A_model_genome.fasta
.amb
, .ann
, etc.fastq
build a list of the fastq RA and RB files into different columns:
ls *RA* | sed "s/\.fastq//g" > bamlistA
ls *RB* | sed "s/\.fastq//g" > bamlistB
paste bamlist? > bamlist
run_align.sh
script For example copy cp ~millermr/common/run_align.sh ./
sh run_align.sh bamlist final_contigs_300.fasta
RA
and RB
) and aligning to the _300.fasta
file (or model genome file)Sometimes the nodes fail, and you need to find/re-run scripts for certain files. Best option is to find/sort the files that did work, and then grep against the files that didn’t to make a new list, then re-run run_align.sh
script with new list.
ls *.flt.bam | sed "s/\.sort\.flt\.bam//g" | grep -f - -v bamlist > bamlist_rerun