diff --git a/ex1-reading-fasta/description b/ex1-reading-fasta/description
index 6d7824d..557cde4 100644
--- a/ex1-reading-fasta/description
+++ b/ex1-reading-fasta/description
@@ -6,12 +6,19 @@ https://en.wikipedia.org/wiki/FASTA_format
 
 The FASTA format with one sequence for an elephant looks like this:
 
->gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]
-LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV
-EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG
-LLILILLLLLLALLSPDMLGDPDNHMPADPLNTPLHIKPEWYFLFAYAILRSVPNKLGGVLALFLSIVIL
-GLMPFLHTSKHRSMMLRPLSQALFWTLTMDLLTLTWIGSQPVEYPYTIIGQMASILYFSIILAFLPIAGX
-IENY
+>BTBSCRYR
+tgcaccaaacatgtctaaagctggaaccaaaattactttctttgaagacaaaaactttca
+aggccgccactatgacagcgattgcgactgtgcagatttccacatgtacctgagccgctg
+caactccatcagagtggaaggaggcacctgggctgtgtatgaaaggcccaattttgctgg
+gtacatgtacatcctaccccggggcgagtatcctgagtaccagcactggatgggcctcaa
+cgaccgcctcagctcctgcagggctgttcacctgtctagtggaggccagtataagcttca
+gatctttgagaaaggggattttaatggtcagatgcatgagaccacggaagactgcccttc
+catcatggagcagttccacatgcgggaggtccactcctgtaaggtgctggagggcgcctg
+gatcttctatgagctgcccaactaccgaggcaggcagtacctgctggacaagaaggagta
+ccggaagcccgtcgactggggtgcagcttccccagctgtccagtctttccgccgcattgt
+ggagtgatgatacagatgcggccaaacgctggctggccttgtcatccaaataagcattat
+aaataaaacaattggcatgc
+
 
 The FASTA format has to separate parts:
 
@@ -20,7 +27,7 @@ The FASTA format has to separate parts:
 
 2. sequence
 
-Thus, the excercise for reading FASTA file briefly looks like this:
+Thus, the excercise for reading FASTA file, in C, briefly looks like this:
 
 1. Select sequence and save it to a file, e.g. seq.fasta
 
@@ -28,4 +35,20 @@ Thus, the excercise for reading FASTA file briefly looks like this:
 
 https://cplusplus.com/reference/cstdio/fopen/
 
-Tip: Use correct mode.
\ No newline at end of file
+Tip: Use correct mode.
+
+3. Read the file, line by line, using e.g. snippet of code from here:
+
+https://cplusplus.com/forum/beginner/22558/
+
+4. Check if the lines starts with ';' or with '>', then skip such a line
+
+If the line is OK, then count how many A, C, T, G are in the given line and
+save that information.
+
+In order to check if the line starts with a given character you can
+simply use index of a string (line[0] - this is first character in the
+array line).
+
+The end.
+
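
Steps 2 to 4 of the description can be sketched in C roughly as follows. This is only one possible solution, not the reference one. The names base_counts, count_bases, count_fasta_stream and count_fasta_file are hypothetical and do not come from the exercise. Two assumptions are made: the file is opened in text read mode ("r"), and letters are counted case-insensitively, because the new example sequence is written in lowercase.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the four counters. */
struct base_counts {
    unsigned long a, c, g, t;
};

/* Count A, C, G, T in one chunk of text (assumption: case-insensitive).
 * Any other character, including the newline, is ignored. */
void count_bases(const char *text, struct base_counts *counts)
{
    assert(text != NULL && counts != NULL);

    for (const char *p = text; *p != '\0'; ++p) {
        switch (toupper((unsigned char)*p)) {
        case 'A': counts->a++; break;
        case 'C': counts->c++; break;
        case 'G': counts->g++; break;
        case 'T': counts->t++; break;
        default: break;
        }
    }
}

/* Read an already opened stream line by line and add up the bases.
 * Lines starting with '>' or ';' are skipped (checked via buf[0]).
 * Returns 0 on success, -1 on a read error. */
int count_fasta_stream(FILE *fp, struct base_counts *counts)
{
    char buf[256];
    int at_line_start = 1; /* does the next chunk begin a new line? */
    int skipping = 0;      /* are we inside a header or comment line? */

    assert(fp != NULL && counts != NULL);

    while (fgets(buf, sizeof buf, fp) != NULL) {
        if (at_line_start)
            skipping = (buf[0] == '>' || buf[0] == ';');
        if (!skipping)
            count_bases(buf, counts);
        /* A chunk without '\n' means the line was longer than buf
         * and continues in the next fgets call. */
        at_line_start = (strchr(buf, '\n') != NULL);
    }
    return ferror(fp) ? -1 : 0;
}

/* Open the file in text read mode, count, and close it again.
 * Returns 0 on success, -1 if the file cannot be opened or read. */
int count_fasta_file(const char *path, struct base_counts *counts)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    int status = count_fasta_stream(fp, counts);
    fclose(fp);
    return status;
}
```

To try it, add a main function that declares a zero-initialized struct base_counts, calls count_fasta_file("seq.fasta", &counts), and prints the four fields with printf and the %lu format. The at_line_start flag is there so that a header line longer than the buffer is skipped as a whole instead of having its tail counted as sequence.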