'Adding desc for FASTA reading'
parent
890d6186a5
commit
4e75e8a9c6
31
ex1-reading-fasta/description
Normal file
@@ -0,0 +1,31 @@
The job is to read a FASTA file and count how many times each of the letters A, C, T and G occurs in an example sequence.
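
The counting part of the task can be sketched on its own, before any file handling is involved. The snippet below is a minimal sketch, not the reference solution: the names `letter_counts` and `count_letters` are made up for illustration, and it assumes the sequence is already in memory as an upper-case, NUL-terminated string.

```c
#include <stddef.h>

/* Tally of the four letters the exercise asks for (hypothetical type). */
struct letter_counts {
    size_t a, c, g, t;
};

/* Count A, C, G and T in a NUL-terminated string.
   Assumption: letters are upper case. Every other character,
   including line breaks, is ignored. */
struct letter_counts count_letters(const char *seq)
{
    struct letter_counts n = {0, 0, 0, 0};

    for (; *seq != '\0'; ++seq) {
        switch (*seq) {
        case 'A': n.a++; break;
        case 'C': n.c++; break;
        case 'G': n.g++; break;
        case 'T': n.t++; break;
        default:  break;   /* not one of the four letters */
        }
    }
    return n;
}
```

For the string "GATTACA" this gives a = 3, c = 1, g = 1, t = 2. Note that A, C, G and T are nucleotide codes in a DNA sequence; in a protein sequence such as the cytochrome b example, the same letters are amino-acid codes, and the function counts them all the same.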

FASTA files can be downloaded from the internet. For example, the Wikipedia page on the format contains a sample sequence:

https://en.wikipedia.org/wiki/FASTA_format

A FASTA file with a single sequence, here cytochrome b of an elephant, looks like this:

>gi|5524211|gb|AAD44166.1| cytochrome b [Elephas maximus maximus]
LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV
EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG
LLILILLLLLLALLSPDMLGDPDNHMPADPLNTPLHIKPEWYFLFAYAILRSVPNKLGGVLALFLSIVIL
GLMPFLHTSKHRSMMLRPLSQALFWTLTMDLLTLTWIGSQPVEYPYTIIGQMASILYFSIILAFLPIAGX
IENY

The FASTA format has two separate parts:

1. header (a single line starting with ">")
2. sequence (the lines that follow the header)
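
The split into header and sequence can be made line by line: a line whose first character is ">" is the header, everything else is sequence data. The following is a minimal sketch under stated assumptions: `is_header_line` and `sequence_length` are hypothetical helper names, the record is already held in memory as one string, and lines end with "\n" only (no "\r").

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the line is a FASTA header (starts with '>'), 0 otherwise. */
int is_header_line(const char *line)
{
    return line[0] == '>';
}

/* Length of the sequence part of a FASTA record held in memory.
   Header lines and line breaks are not counted.
   Assumption: lines are separated by '\n' only. */
size_t sequence_length(const char *record)
{
    size_t total = 0;
    const char *line = record;

    while (*line != '\0') {
        size_t len = strcspn(line, "\n");  /* characters up to end of line */

        if (!is_header_line(line))
            total += len;

        line += len;
        if (*line == '\n')
            ++line;                        /* step over the line break */
    }
    return total;
}
```

For example, `sequence_length(">seq1 test\nACGT\nAC\n")` skips the header line and returns 6.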

Thus, in brief, the exercise for reading a FASTA file looks like this:

1. Select a sequence and save it to a file, e.g. seq.fasta

2. Open the file seq.fasta in C with fopen:

https://cplusplus.com/reference/cstdio/fopen/

Tip: Use the correct mode.
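
Opening the file could look roughly like the sketch below. It is one possible approach, not the expected solution: the function name `count_lines` is made up for illustration, and the mode "r" is chosen on the assumption that the file only needs to be read as text. The return value of fopen is checked, because it is NULL when the file cannot be opened.

```c
#include <stdio.h>

/* Opens a text file for reading and returns the number of lines in it,
   or -1 if the file cannot be opened. */
long count_lines(const char *path)
{
    FILE *fp = fopen(path, "r");   /* "r": read-only, the file must exist */
    if (fp == NULL)
        return -1;

    long lines = 0;
    int ch;
    int last = '\n';               /* so that an empty file counts 0 lines */

    while ((ch = fgetc(fp)) != EOF) {
        if (ch == '\n')
            ++lines;
        last = ch;
    }
    if (last != '\n')
        ++lines;                   /* final line without a trailing newline */

    fclose(fp);
    return lines;
}
```

Calling `count_lines("seq.fasta")` returns -1 if seq.fasta is missing from the current directory, which is a quick way to check that step 1 was done correctly. The same fopen, read loop and fclose skeleton can then be reused for the actual letter counting.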