'Adding desc for FASTA reading'

2024-04-26 08:54:47 +02:00 · 2024-04-26 08:54:47 +02:00 · 4e75e8a9c6
commit 4e75e8a9c6
parent 890d6186a5
1 changed files with 31 additions and 0 deletions
--- a/ex1-reading-fasta/description
+++ b/ex1-reading-fasta/description
@@ -0,0 +1,31 @@
+The task is to read a FASTA file and count how many times the letters A, C, T, and G occur in an example sequence.
+
+FASTA files can be downloaded from the internet. For example, the Wikipedia page on the format provides a sample sequence:
+
+https://en.wikipedia.org/wiki/FASTA_format
+
+A FASTA file with a single sequence, here from an elephant, looks like this:
+
+>gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]
+LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV
+EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG
+LLILILLLLLLALLSPDMLGDPDNHMPADPLNTPLHIKPEWYFLFAYAILRSVPNKLGGVLALFLSIVIL
+GLMPFLHTSKHRSMMLRPLSQALFWTLTMDLLTLTWIGSQPVEYPYTIIGQMASILYFSIILAFLPIAGX
+IENY
+
+
+The FASTA format has two separate parts:
+
+1. header (a single line starting with '>')
+2. sequence (one or more lines following the header)
+
+
+Thus, the exercise for reading a FASTA file briefly looks like this:
+
+1. Select a sequence and save it to a file, e.g. seq.fasta
+
+2. Open the file seq.fasta in C with fopen:
+
+https://cplusplus.com/reference/cstdio/fopen/
+
+Tip: Use the correct mode; the file only needs to be read, not written.
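
The steps in the description can be sketched in C roughly as follows. This is only a minimal sketch, not a reference solution: the function names count_actg and count_actg_in_file are hypothetical, and the code assumes that only uppercase letters are counted and that every line starting with '>' is a header. Note that the elephant example is a protein sequence, so the sketch simply counts the characters A, C, T, and G as they appear.

```c
#include <stdio.h>

/* Hypothetical helper (not part of the exercise text): counts the letters
 * A, C, T, G in the sequence part of a FASTA stream. counts[0..3] receive
 * the totals for A, C, T, G in that order. Header lines (those starting
 * with '>') are skipped. Assumption: only uppercase letters are counted. */
void count_actg(FILE *fp, long counts[4])
{
    int c;
    int at_line_start = 1;  /* true when the next character begins a line */
    int in_header = 0;      /* true while inside a '>' header line */

    counts[0] = counts[1] = counts[2] = counts[3] = 0;

    while ((c = fgetc(fp)) != EOF) {
        if (at_line_start)
            in_header = (c == '>');
        at_line_start = (c == '\n');
        if (in_header)
            continue;       /* header text is not part of the sequence */
        switch (c) {
        case 'A': counts[0]++; break;
        case 'C': counts[1]++; break;
        case 'T': counts[2]++; break;
        case 'G': counts[3]++; break;
        default: break;     /* other letters and newlines are ignored */
        }
    }
}

/* Hypothetical wrapper: opens `path` for reading and counts A, C, T, G.
 * Returns 0 on success, -1 if the file could not be opened. */
int count_actg_in_file(const char *path, long counts[4])
{
    FILE *fp = fopen(path, "r");  /* "r": open an existing file for reading */
    if (fp == NULL)
        return -1;
    count_actg(fp, counts);
    fclose(fp);
    return 0;
}
```

To try it, call count_actg_in_file("seq.fasta", counts) from main with a long counts[4] array and print the four totals with printf. Reading character by character avoids having to guess a maximum line length.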