GNBF5010 Homework 2

Please zip all your files for Homework 2, including the scripts, input files and output files if any, into a single file called YourLastname_Firstname_HW2.zip (or .rar). Then submit it to Blackboard on or before Wednesday, 23 October 2019.

NOTE 1: You will need to add the necessary comments to your program to explain your code. Examples of commenting can be found in the textbook.
NOTE 2: Test your program with various test cases to ensure that it works properly.

1. Unknown Letters
Write a program to list which letters in the file seqs.txt are not A, T, C, or G. It should list each letter only once. Hint: Start with an empty list for unknown letters. Then use two loops to scan the letters in each sequence.

2. Sequence Properties
Write a program that 1) reads all sequences in seqs.txt and stores them in a list called seqs, 2) presents the user with a menu for selecting various properties of the sequences, and 3) shows the corresponding results based on the user's choice. The menu should include:
1) Number of sequences in the input file
2) Number of occurrences of a specific sequence, e.g. GGATC (The program will prompt the user with another message for the target sequence.)
3) Number of sequences that are longer than a particular length, e.g. 1000 bases (The program will ask the user again for the minimum length.)
4) Number of sequences with GC content higher than a given value, e.g. 50% (The GC content can be calculated as (num_of_G + num_of_C) / seq_total_len.)
5) The combination of choices 3 and 4: Number of sequences longer than a particular length and with GC content over a particular value
In your program, there should be separate functions for the analysis in options 1 to 4.
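The GC-content formula in option 4 can be sketched as follows. This is only a minimal illustration, assuming each sequence is a plain string of bases; the function names gc_content and count_gc_at_least and the sample sequences are hypothetical and not part of the assignment.

```python
def gc_content(seq):
    """Return the GC fraction of seq: (num_of_G + num_of_C) / seq_total_len."""
    seq = seq.upper()
    if not seq:  # avoid dividing by zero on an empty sequence
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def count_gc_at_least(seqs, min_gc_percent):
    """Count sequences whose GC% is >= min_gc_percent (e.g. 50 for 50%)."""
    return sum(1 for s in seqs if gc_content(s) * 100 >= min_gc_percent)


# Made-up sequences, used only to exercise the functions above.
example_seqs = ["GGATCC", "ATATAT", "GCGC"]
print(count_gc_at_least(example_seqs, 50))
```

The fraction is multiplied by 100 before the comparison because the menu takes min_GC as a percentage.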
Your program should work like this:

    Please select the sequences property that you want to display, or press 0 to
    exit the program.
    1) Total number of sequences
    2) Number of pattern occurrences
    3) Number of sequences with length >= min_len
    4) Number of sequences with GC% >= min_GC
    5) Number of sequences with length >= min_len and GC% >= min_GC
    Enter the choice: 4
    Enter the minimum GC content (min_GC): 50
    Calculating …
    There are 36 sequences with GC% >= 50%.
    ==
    Please select the sequences property that you want to display, or press 0 to
    exit the program.
    1) Total number of sequences
    2) Number of pattern occurrences
    3) Number of sequences with length >= min_len
    4) Number of sequences with GC% >= min_GC
    5) Number of sequences with length >= min_len and GC% >= min_GC
    Enter the choice: 5
    Enter the minimum length (min_len): 1000
    Enter the minimum GC content (min_GC): 40
    Calculating …
    There are 10 sequences with length >= 1000 bases and GC% >= 40%.
    ==
    Please select the sequences property that you want to display, or press 0 to
    exit the program.
    1) Total number of sequences
    2) Number of pattern occurrences
    3) Number of sequences with length >= min_len
    4) Number of sequences with GC% >= min_GC
    5) Number of sequences with length >= min_len and GC% >= min_GC
    Enter the choice: 0
    Exiting the program …

3. Unique Words
Write a program that displays a list of all the unique words found in the file uniq_words.txt. Print your results in alphabetical order and in lowercase. Hint: Store the words as the elements of a set; remove punctuation by using string.punctuation from the string module.

4. Molecular Weight
a) Make a Python dictionary mapping one-letter amino acid codes (the keys) to their molecular weights (the values), for all 22 amino acids. The molecular weights of the 22 amino acids can be found in the table on the next page.
As an example, the molecular weight of C (Cysteine) is 121.
b) Print out a list of all the amino acids sorted by their molecular weights from the heaviest to the lightest. Hint: You may need to sort the items of the dictionary in question (a) based on the values; example output:

    AA  MW
    W   204Da
    Y   181Da
    R   174Da
    F   165Da
    …   …

c) Read the protein sequence from lysozyme.fasta and calculate the molecular weight of this protein using the dictionary created in question (a).

5. Palindromic sequence
A palindromic sequence is a nucleic acid sequence in a double-stranded DNA or RNA molecule wherein reading in a certain direction (e.g. 5' to 3') on one strand matches the sequence reading in the same direction (e.g. 5' to 3') on the complementary strand. Here is an example:

    5'-GAATTC-3'
    3'-CTTAAG-5'

where on both strands, reading from 5' to 3' leads to the same sequence: GAATTC. The DNA sequence GAATTC is thus said to be palindromic. For more details about the function of palindromic sequences, see here. Now, write a program that reads DNA sequences from the file palin_seq.txt and uses recursion to determine whether each of them is a palindromic sequence. Print the results of your program in the following format.

    1) ATCGAT --- YES
    2) GAATTC --- YES
    3) ATCGGCTA --- NO
    …

Hint: Use string slicing to refer to and compare the characters on either end of the sequence string.
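The definition in question 5 amounts to saying that a palindromic sequence equals its own reverse complement, so the first base must pair with the last, the second with the second-to-last, and so on. A minimal recursive sketch of that check, assuming uppercase DNA strings containing only A, T, C and G (the names PAIR and is_palindromic are illustrative, not part of the assignment):

```python
# Watson-Crick pairing for DNA bases.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_palindromic(seq):
    """Recursively check whether seq equals its own reverse complement."""
    if len(seq) == 0:
        return True   # every outer pair has matched
    if len(seq) == 1:
        return False  # a lone middle base cannot pair with itself
    # The first base must be the complement of the last base;
    # an unknown letter has no entry in PAIR and so never matches.
    if PAIR.get(seq[0]) != seq[-1]:
        return False
    return is_palindromic(seq[1:-1])  # slice off both ends and recurse


for i, s in enumerate(["ATCGAT", "GAATTC", "ATCGGCTA"], start=1):
    print(f"{i}) {s} --- {'YES' if is_palindromic(s) else 'NO'}")
```

Note that ATCGGCTA reads the same backwards as plain text, yet it is not palindromic in the biological sense, which is why the check compares complements instead of identical characters.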