Merge branch 'master' of https://git.wmi.amu.edu.pl/ahypki/bioinf-2023-2024-introduction-cpp
commit dfc612c819
@@ -6,12 +6,19 @@ https://en.wikipedia.org/wiki/FASTA_format

A FASTA file with two records, a protein sequence for an elephant and a nucleotide sequence, looks like this:

>gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]
LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV
EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG
LLILILLLLLLALLSPDMLGDPDNHMPADPLNTPLHIKPEWYFLFAYAILRSVPNKLGGVLALFLSIVIL
GLMPFLHTSKHRSMMLRPLSQALFWTLTMDLLTLTWIGSQPVEYPYTIIGQMASILYFSIILAFLPIAGX
IENY
>BTBSCRYR
tgcaccaaacatgtctaaagctggaaccaaaattactttctttgaagacaaaaactttca
aggccgccactatgacagcgattgcgactgtgcagatttccacatgtacctgagccgctg
caactccatcagagtggaaggaggcacctgggctgtgtatgaaaggcccaattttgctgg
gtacatgtacatcctaccccggggcgagtatcctgagtaccagcactggatgggcctcaa
cgaccgcctcagctcctgcagggctgttcacctgtctagtggaggccagtataagcttca
gatctttgagaaaggggattttaatggtcagatgcatgagaccacggaagactgcccttc
catcatggagcagttccacatgcgggaggtccactcctgtaaggtgctggagggcgcctg
gatcttctatgagctgcccaactaccgaggcaggcagtacctgctggacaagaaggagta
ccggaagcccgtcgactggggtgcagcttccccagctgtccagtctttccgccgcattgt
ggagtgatgatacagatgcggccaaacgctggctggccttgtcatccaaataagcattat
aaataaaacaattggcatgc

The FASTA format has two separate parts:

@@ -20,7 +27,7 @@ The FASTA format has two separate parts:
2. sequence

-Thus, the exercise for reading a FASTA file briefly looks like this:
+Thus, the exercise for reading a FASTA file, in C, briefly looks like this:

1. Select a sequence and save it to a file, e.g. seq.fasta

@@ -29,3 +36,19 @@ Thus, the exercise for reading a FASTA file briefly looks like this:
https://cplusplus.com/reference/cstdio/fopen/

Tip: Use the correct mode ("r" opens a file for reading).

3. Read the file line by line, e.g. using the snippet of code from here:

https://cplusplus.com/forum/beginner/22558/

4. Check whether the line starts with ';' or '>'; if it does, skip that line.

If the line is OK, count how many A, C, T, and G characters it contains and
save that information.

To check whether a line starts with a given character, you can
simply index the string (line[0] is the first character in the
array line).
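
The steps above could be sketched in C roughly as follows. This is a minimal sketch under stated assumptions, not the reference solution for the exercise: the names base_counts, is_header_or_comment, count_bases_in_line and count_bases_in_file are hypothetical, the 4096-byte buffer assumes that no line in the file is longer than that, and treating lowercase letters the same as uppercase ones is an assumption made because the nucleotide record in the example is written in lowercase.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper type: running totals for the four bases. */
struct base_counts {
    long a, c, g, t;
};

/* Returns 1 if the line is a FASTA header ('>') or a comment (';'), else 0. */
int is_header_or_comment(const char *line)
{
    return line[0] == '>' || line[0] == ';';
}

/* Adds the A, C, G and T characters found in one line to the totals.
   Assumption: case is ignored, so 'a' and 'A' are counted together. */
void count_bases_in_line(const char *line, struct base_counts *counts)
{
    for (size_t i = 0; line[i] != '\0'; i++) {
        switch (toupper((unsigned char)line[i])) {
        case 'A': counts->a++; break;
        case 'C': counts->c++; break;
        case 'G': counts->g++; break;
        case 'T': counts->t++; break;
        default: break; /* newline and any other character are ignored */
        }
    }
}

/* Opens the file for reading, reads it line by line, skips header and
   comment lines, and counts bases in the rest.
   Returns 0 on success and -1 if the file could not be opened.
   Assumption: every line fits in the 4096-byte buffer. */
int count_bases_in_file(const char *path, struct base_counts *counts)
{
    char line[4096];
    FILE *fp = fopen(path, "r"); /* "r": open an existing file for reading */

    if (fp == NULL)
        return -1;

    counts->a = counts->c = counts->g = counts->t = 0;
    while (fgets(line, sizeof line, fp) != NULL) {
        if (is_header_or_comment(line))
            continue;
        count_bases_in_line(line, counts);
    }

    fclose(fp);
    return 0;
}
```

Calling count_bases_in_file("seq.fasta", &counts) from main and printing the four fields with printf would complete the exercise. Note that the counts are only meaningful for a nucleotide sequence. In a protein record the letters A, C, G and T stand for amino acids.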
The end.