Merge branch 'master' of https://git.wmi.amu.edu.pl/ahypki/bioinf-2023-2024-introduction-cpp
commit dfc612c819
@@ -6,12 +6,19 @@ https://en.wikipedia.org/wiki/FASTA_format
 
 The FASTA format with one sequence for an elephant looks like this:
 
->gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]
-LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV
-EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG
-LLILILLLLLLALLSPDMLGDPDNHMPADPLNTPLHIKPEWYFLFAYAILRSVPNKLGGVLALFLSIVIL
-GLMPFLHTSKHRSMMLRPLSQALFWTLTMDLLTLTWIGSQPVEYPYTIIGQMASILYFSIILAFLPIAGX
-IENY
+>BTBSCRYR
+tgcaccaaacatgtctaaagctggaaccaaaattactttctttgaagacaaaaactttca
+aggccgccactatgacagcgattgcgactgtgcagatttccacatgtacctgagccgctg
+caactccatcagagtggaaggaggcacctgggctgtgtatgaaaggcccaattttgctgg
+gtacatgtacatcctaccccggggcgagtatcctgagtaccagcactggatgggcctcaa
+cgaccgcctcagctcctgcagggctgttcacctgtctagtggaggccagtataagcttca
+gatctttgagaaaggggattttaatggtcagatgcatgagaccacggaagactgcccttc
+catcatggagcagttccacatgcgggaggtccactcctgtaaggtgctggagggcgcctg
+gatcttctatgagctgcccaactaccgaggcaggcagtacctgctggacaagaaggagta
+ccggaagcccgtcgactggggtgcagcttccccagctgtccagtctttccgccgcattgt
+ggagtgatgatacagatgcggccaaacgctggctggccttgtcatccaaataagcattat
+aaataaaacaattggcatgc
+
 
 
 The FASTA format has two separate parts:
@@ -20,7 +27,7 @@ The FASTA format has two separate parts:
 2. sequence
 
 
-Thus, the exercise for reading a FASTA file briefly looks like this:
+Thus, the exercise for reading a FASTA file, in C, briefly looks like this:
 
 1. Select a sequence and save it to a file, e.g. seq.fasta
 
@@ -28,4 +35,20 @@ Thus, the exercise for reading a FASTA file briefly looks like this:
 
 https://cplusplus.com/reference/cstdio/fopen/
 
 Tip: Use the correct mode.
+
+3. Read the file line by line, using e.g. the code snippet from here:
+
+https://cplusplus.com/forum/beginner/22558/
+
+4. Check whether the line starts with ';' or '>'; if it does, skip that line.
+
+If the line is OK, count how many A, C, T and G characters it contains and
+save that information.
+
+To check whether a line starts with a given character, you can
+simply index the string (line[0] is the first character in the
+array line).
+
+The end.
+
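Steps 2 to 4 of the exercise could be sketched in C roughly as follows. This is only one possible approach, not the solution the course expects: the names `base_counts`, `count_bases` and `count_fasta` are hypothetical, `fgets` is used for line reading rather than whatever the linked forum snippet does, and letters are counted case-insensitively on the assumption that lowercase sequences (like the example above) should count too.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the four nucleotide counts. */
struct base_counts {
    long a, c, g, t;
};

/* Add every A, C, G, T (upper or lower case) in `text` to `counts`.
   All other characters, including the newline, are ignored. */
static void count_bases(const char *text, struct base_counts *counts)
{
    for (const char *p = text; *p != '\0'; ++p) {
        switch (toupper((unsigned char)*p)) {
        case 'A': counts->a++; break;
        case 'C': counts->c++; break;
        case 'G': counts->g++; break;
        case 'T': counts->t++; break;
        default: break;
        }
    }
}

/* Read the file at `path` line by line, skip lines starting with '>' or ';',
   and count the bases in all remaining lines.
   Returns 0 on success, -1 if the file cannot be opened. */
static int count_fasta(const char *path, struct base_counts *counts)
{
    FILE *fp = fopen(path, "r"); /* "r": read an existing text file */
    if (fp == NULL)
        return -1;

    char buf[256];
    int at_line_start = 1; /* does buf begin a new line of the file? */
    int skip = 0;          /* is the current line a header or comment? */

    while (fgets(buf, sizeof buf, fp) != NULL) {
        if (at_line_start)
            skip = (buf[0] == '>' || buf[0] == ';');
        if (!skip)
            count_bases(buf, counts);
        /* A line longer than buf arrives in several chunks; only a chunk
           that contains '\n' ends the line. */
        at_line_start = (strchr(buf, '\n') != NULL);
    }

    fclose(fp);
    return 0;
}
```

To try it, call `count_fasta("seq.fasta", &counts)` from `main` with a zero-initialised `struct base_counts counts` and print the four fields with `printf`. Tracking `at_line_start` keeps a long header line from being counted as sequence when it does not fit into the buffer in one read.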