Add prefixsplit function #168

Tixii · 2021-02-22T21:06:29Z

This adds the ability to read a FASTQ/FASTA file and split it into separate files based on the prefix of each read, to enable faster sorting of read sets.

Usage: seqtk prefixsplit [options] <output_filename> <in.fa>
Options:
-p INT length of prefix
-A force FASTA output (discard quality)
-C drop comments at the header lines

It will create files for each prefix of the specified length, e.g.
output_filename.AA.fa
output_filename.AC.fa
....
plus a single file that contains those reads with an N at any position in the prefix:
output_filename.N.fa

Currently only prefix lengths of 1, 2, or 3 are possible, as I felt that creating more than 64 files wouldn't be useful.
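
For illustration, the mapping from a read's prefix to one of the 4^p output files could look roughly like the sketch below. This is not the code in this PR: `prefix_index` and `prefix_label` are hypothetical helper names, and treating lowercase bases like uppercase and sending reads shorter than the prefix to the N file are assumptions made for the example.

```c
/* Hypothetical helper: map the first k bases of seq to an index in
 * [0, 4^k). Returns -1 if the prefix contains N (or any other non-ACGT
 * character) or if the read is shorter than k bases. */
static int prefix_index(const char *seq, int k)
{
    int idx = 0;
    for (int i = 0; i < k; ++i) {
        int code;
        switch (seq[i]) {
        case 'A': case 'a': code = 0; break;
        case 'C': case 'c': code = 1; break;
        case 'G': case 'g': code = 2; break;
        case 'T': case 't': code = 3; break;
        default: return -1; /* N, other symbols, or end of string */
        }
        idx = idx * 4 + code; /* base-4 encoding, first base most significant */
    }
    return idx;
}

/* Hypothetical helper: write the file-name label for an index into buf,
 * which must hold at least k + 1 bytes (and at least 2 bytes).
 * A negative index yields the shared "N" label. */
static void prefix_label(int idx, int k, char *buf)
{
    if (idx < 0) {
        buf[0] = 'N';
        buf[1] = '\0';
        return;
    }
    for (int i = k - 1; i >= 0; --i) {
        buf[i] = "ACGT"[idx & 3]; /* decode two bits per base */
        idx >>= 2;
    }
    buf[k] = '\0';
}
```

With a prefix length of 3 the index ranges over 0 to 63, which matches the 64 per-prefix files, and every read that returns -1 goes to the single N file.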

There are options to remove the quality scores and drop comments using the same methods as the seqtk seq function.

I have tried to stick to the coding style of the rest of the file; however, this is my first time coding in C, so I am sure there are improvements that could be made.

typo

Unknown added 4 commits February 22, 2021 13:04

fix usage help message

2950ce1

typo

Add option to only print the sequence with no other information

77fcebc

add seqtk exe to gitignore

0eee545
