Please note ...
FANSe3 is a commercial development project of Chi-Biotech Co. Ltd.
The public (free) version of FANSe3 provided here is intended for trials only and offers a basic, limited feature set. For the full version (for both academic and commercial use), please contact Chi-Biotech to obtain a licence.
Feature | Free version | Commercial version |
Parallel CPU cores | Limited to 2 | Unlimited (>256) |
Cross-node parallelization | No | Standard version: no; FANSe3s version: yes |
Unique mapping | Supported | Supported |
Fast indel detection | Yes | Yes |
Masked genome | Supported | Supported |
Export all multi-mapped locations | Up to 200 | Up to 200 |
Max read length | 1000 | Unlimited |
Unidirectional mapping | No | Yes, forward or reverse, for strand-specific applications |
Sequencer artifact compensation | No | Yes, optimized for Illumina and BGISEQ/MGISEQ flowcells, also compatible with Ion Torrent |
Supported formats | FASTQ only | FASTQ, FASTA, FQC, one-line nucleotide, etc. (use the FQC format for the best performance) |
Disk I/O saving | No | Yes |
Batch mode | No | Yes: the reference is indexed once and multiple datasets are then mapped sequentially, saving indexing time |
Performance optimization for RNA-seq | No | Yes, up to 20x faster than the free version |
Optimized for single-cell RNA-seq | No | Yes, produces far fewer intermediate files |
Trim reads while mapping | No | Yes |
Direct quantification for RNA-seq | No | Yes, no other programs are needed to obtain read counts and RPKM values |
Command-line usage
FANSe3 -R<ref.fa> -D<reads.fq> [-O<out.fanse3>] [-E3] [-S14] [-C2] [-H1] [--indel] [--unique] [--mask]
Option | Optional? | Explanation |
-R | compulsory | Reference sequence file (FASTA format). Supports UNC names (like \\server1\myfolder\abc.fa). Supports Chinese characters. Sequence names in the FASTA file are not restricted and may contain spaces and special characters. |
-D | compulsory | FASTQ dataset file. Supports UNC name (like \\server1\myfolder\ionS5-1.fq). Supports Chinese characters. |
-O | optional | Output file name. Generated automatically when omitted. Supports UNC names (like \\server1\myfolder\ionS5-1.fanse3). Supports Chinese characters. |
-E | optional | Error allowance (Levenshtein distance). Default=3. Mismatches and indels are both counted as errors. It can be set as an integer or a percentage. Integer: e.g. -E5 allows a fixed number of errors in the alignment. This is preferred when the read length is fixed, e.g. for Illumina and BGISEQ/MGISEQ sequencers. Percentage: e.g. -E5% sets the error allowance as a percentage of the read length. This is useful when the read length is variable, e.g. for Ion Torrent and Helicos sequencers, or for short fragments left after adapter trimming. |
-S | optional | Seed length. Default=14. Can be set to any integer from 6 to 14. A larger seed length is faster but may lose more reads when the error rate exceeds 6%. For data with high error rates, please refer to the FANSe2 paper to choose a suitable seed length and its estimated accuracy. |
-C | optional | Parallel CPU cores. Default=2. In the free version, the maximum is 2. |
-H | optional | Batch size: the number of reads (in millions) loaded per batch. -H2 means 2 million reads per batch. -H0.5 means 0.5 million reads per batch. |
--indel | optional | Fast indel detection on. Equivalent to the "-I1" in FANSe2. |
--unique | optional | Unique mapping. When this toggle is present, the uniquely mapped reads will be stored in the .fanse3 file, and the multi-mapped reads will be stored in a separate -multimap.fanse3 file. |
--mask | optional | Masked genome. When this toggle is present, the lower-case letters in the reference sequences will not be considered. Equivalent to the "-M1" in FANSe2. |
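The difference between the integer and percentage forms of -E can be illustrated with a short Python sketch. It is not part of FANSe3: the function name allowed_errors is hypothetical, and rounding the percentage down is an assumption, since the rounding rule used by FANSe3 is not stated here.

```python
import math


def allowed_errors(read_length: int, allowance: str) -> int:
    """Resolve an -E style value ("5" or "5%") to an error count for one read.

    ASSUMPTION: a percentage is rounded down to a whole number of errors.
    The rounding rule FANSe3 actually applies is not documented here.
    """
    if allowance.endswith("%"):
        percent = float(allowance[:-1])
        return math.floor(read_length * percent / 100)
    return int(allowance)


# A fixed allowance ignores read length; a percentage scales with it.
for length in (50, 100, 250):
    print(length, allowed_errors(length, "5"), allowed_errors(length, "5%"))
```

Under this assumption a 5% allowance tolerates fewer errors than -E5 on reads shorter than 100 nt and more on longer reads, which is why the percentage form suits variable-length data.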
Quick examples:
FANSe3 -Rref.fa -Dillumina.fq --unique
FANSe3 -Rref.fa -Diontorrent.fq -E5% --indel --unique
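When FANSe3 is called from a script, the command line can be assembled programmatically. The following Python sketch uses only the options listed in the table above. The helper name build_fanse3_command is hypothetical, and the sketch assumes the FANSe3 executable is reachable on the PATH.

```python
def build_fanse3_command(ref, reads, out=None, errors="3", seed=14, cores=2,
                         batch_millions=None, indel=False, unique=False,
                         mask=False):
    """Assemble a FANSe3 argument list from the documented options."""
    if not 6 <= seed <= 14:
        raise ValueError("seed length must be an integer from 6 to 14")
    # Values are attached directly to the option letter, as in the usage line.
    args = ["FANSe3", f"-R{ref}", f"-D{reads}"]
    if out:
        args.append(f"-O{out}")
    args += [f"-E{errors}", f"-S{seed}", f"-C{cores}"]
    if batch_millions is not None:
        args.append(f"-H{batch_millions}")
    if indel:
        args.append("--indel")
    if unique:
        args.append("--unique")
    if mask:
        args.append("--mask")
    return args


cmd = build_fanse3_command("ref.fa", "iontorrent.fq", errors="5%",
                           indel=True, unique=True)
print(" ".join(cmd))
# To execute it (requires FANSe3 to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces or non-ASCII characters.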
Result file format
There will be four result files:
Format description of the .fanse3 files
The .fanse3 files are very similar to the .fanse2 files, with some additional information.
Example of a uniquely mapped read:
42628 AGCAAGGACTAACCCCTATACC .................x....
F NM_001190470 1 142 1
Line 1:
42628 = read name (exactly as in FASTQ, could be a string)
AGCAAGGACTAACCCCTATACC = read nucleotide sequence
.................x.... = alignment. "x"=mismatch; "-"=deletion; nucleotide=insertion.
Line 2:
F = direction (F = forward, R = reverse)
NM_001190470 = mapped reference name
1 = error count (Levenshtein Distance)
142 = position (0-based)
1 = number of mapped locations with equal Levenshtein Distance (1 = uniquely mapped)
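The two-line record described above can be read with a few lines of Python. This is a minimal sketch, not an official parser: the names Hit, split_fields, parse_unique_record and count_alignment_errors are hypothetical. It assumes that fields are tab-separated in real files (falling back to whitespace so the example as printed here also parses) and that "." in the alignment string marks a matching base.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    read_name: str
    sequence: str
    alignment: str
    direction: str   # "F" or "R"
    reference: str
    errors: int      # Levenshtein distance
    position: int    # 0-based
    locations: int   # 1 = uniquely mapped


def split_fields(line):
    # ASSUMPTION: real files use tabs; whitespace is only a fallback and
    # would break on read names that contain spaces.
    line = line.rstrip("\r\n")
    return line.split("\t") if "\t" in line else line.split()


def parse_unique_record(line1, line2):
    name, seq, aln = split_fields(line1)
    direction, ref, errors, pos, count = split_fields(line2)
    return Hit(name, seq, aln, direction, ref, int(errors), int(pos), int(count))


def count_alignment_errors(alignment):
    # Every character other than "." is taken as one error:
    # "x" = mismatch, "-" = deletion, a nucleotide letter = insertion.
    return sum(1 for ch in alignment if ch != ".")


hit = parse_unique_record(
    "42628 AGCAAGGACTAACCCCTATACC .................x....",
    "F NM_001190470 1 142 1",
)
print(hit.reference, hit.position, count_alignment_errors(hit.alignment))
```

For the example read, the single "x" in the alignment string agrees with the error count of 1 on line 2.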
Example of a multi-mapped read:
369061 AGCTGGTACAGAAAGCCAAATTCGCTG ....................x......,....................x......
F,F NM_003404,NM_139323 1 405,310 2
For multi-mapped reads, the alignments, directions, reference names and positions are each given as multiple values separated by commas (","), one value per mapped location.
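A multi-mapped record can be expanded into one entry per mapped location by splitting the comma-separated fields. The Python sketch below is illustrative only: the function name parse_multi_record is hypothetical, and it splits fields on whitespace as in the printed example, which is an assumption about the real delimiter.

```python
def parse_multi_record(line1, line2):
    """Expand one two-line .fanse3 record into a list of per-location hits."""
    # ASSUMPTION: whitespace-delimited fields, as in the printed example.
    name, seq, alignments = line1.split()
    directions, references, errors, positions, count = line2.split()

    columns = [
        alignments.split(","),
        directions.split(","),
        references.split(","),
        positions.split(","),
    ]
    # All comma-separated fields should list the same number of locations.
    if len({len(col) for col in columns}) != 1 or len(columns[0]) != int(count):
        raise ValueError(f"inconsistent location count for read {name}")

    return [
        {
            "read_name": name,
            "sequence": seq,
            "alignment": aln,
            "direction": direction,
            "reference": ref,
            "errors": int(errors),
            "position": int(pos),
        }
        for aln, direction, ref, pos in zip(*columns)
    ]


hits = parse_multi_record(
    "369061 AGCTGGTACAGAAAGCCAAATTCGCTG "
    "....................x......,....................x......",
    "F,F NM_003404,NM_139323 1 405,310 2",
)
for h in hits:
    print(h["reference"], h["position"], h["direction"])
```

The same function also handles a uniquely mapped record, which simply yields a list with one entry.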
Auxiliary programs
FANSe3toBED converts a FANSe3 mapping result file (normally with the .fanse3 extension) to the .BED file format for visualization. Many visualization tools, such as the UCSC Genome Browser and IGV, can display .BED files.
FANSe3toBED a.fanse3
a.fanse3: the fanse3 mapping result file.
Output: a.BED
Prerequisite: .NET Framework 4.7.2. This program runs on 32-bit and 64-bit Windows platforms.
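To show how a mapped location relates to a BED interval, here is a rough Python sketch. It is not the FANSe3toBED implementation, whose exact output is not described here. The function name hit_to_bed is hypothetical, and the sketch assumes that the reported position is the leftmost reference coordinate of the alignment and that the interval length equals the read length (indels are ignored).

```python
def hit_to_bed(reference, position, read_length, read_name, direction):
    """Format one mapped location as a six-column BED line.

    ASSUMPTIONS: `position` is the 0-based leftmost reference coordinate,
    and the aligned span equals the read length (indels are ignored).
    """
    strand = "+" if direction == "F" else "-"
    start = position              # BED starts are 0-based, like .fanse3 positions
    end = position + read_length  # BED ends are exclusive
    score = "0"                   # placeholder; BED scores range from 0 to 1000
    return "\t".join([reference, str(start), str(end), read_name, score, strand])


# The uniquely mapped example read: 22 nt, forward strand, position 142.
print(hit_to_bed("NM_001190470", 142, 22, "42628", "F"))
```

Because both formats use 0-based start coordinates, the position can be copied without an offset under these assumptions.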