Analyzing The Function Of A Non Coding

While humans are obviously different from other mammals such as dogs or mice, about 5% of the human genome consists of conserved sequences shared by all mammals. Interestingly, over two thirds of these sequences do not code for protein, but this does not necessarily mean that they are non-functional. One likely possibility is that these non-coding regions, conserved across mammalian genomes, function in genetic regulation. Before the function of these regions can be determined, however, the regions themselves must first be identified. Once the conserved sequences have been identified, they can be classified according to function. One particularly informative way to decipher the function of a non-coding DNA sequence is to determine …

Next, they filtered out any conserved non-coding elements (CNEs) shorter than twelve base pairs, because sequences of twelve or more base pairs are expected to occur only rarely by chance alone. The remaining sequences were then collapsed into groups based on sequence similarity, and a consensus sequence (termed a motif) was generated for each group. These consensus sequences were generated by calculating a position weight matrix (PWM), which is generally done by looking at each position within a set of aligned sequences, determining how often each of the four possible nucleotides is present at that position, and assigning higher weights to the nucleotides that occur more frequently there. With this newly generated catalog of sequences, the authors further characterized each motif based on conservation. They used several quantitative measures to determine how prevalent each CNE is within the human genome and how well conserved it is. First, they simply counted how often each motif occurred within the human genome. They then calculated the ratio of the number of actual occurrences within the human genome to the number of times the sequence is expected to be found by chance. Higher ratios therefore imply that the motif is present in the genome at a higher rate than expected by chance. Next, the conservation across mammals was observed by taking the number of times the motif was detected in the same region across many
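The position weight matrix and the observed-to-expected ratio described above can be illustrated with a short Python sketch. This is not the authors' pipeline. The function names, the use of raw per-position frequencies as weights (published PWMs often use log-odds scores instead), the uniform background of 0.25 per nucleotide, and exact string matching are all assumptions made for illustration.

```python
import math
from collections import Counter

NUCLEOTIDES = "ACGT"


def position_frequency_matrix(sequences):
    """Return one {nucleotide: frequency} dict per aligned position.

    Assumes equal-length, already aligned sequences containing only A, C, G, T.
    """
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    matrix = []
    for i in range(length):
        counts = Counter(seq[i] for seq in sequences)
        total = sum(counts[n] for n in NUCLEOTIDES)
        # More frequent nucleotides receive a higher weight at this position.
        matrix.append({n: counts[n] / total for n in NUCLEOTIDES})
    return matrix


def consensus(matrix):
    """Pick the highest-weight nucleotide at each position."""
    return "".join(max(NUCLEOTIDES, key=lambda n: column[n]) for column in matrix)


def count_occurrences(motif, genome):
    """Count exact matches of the motif, allowing overlaps."""
    k = len(motif)
    return sum(1 for i in range(len(genome) - k + 1) if genome[i:i + k] == motif)


def expected_occurrences(motif, genome_length, base_freqs=None):
    """Expected count if every base were drawn independently (assumed background)."""
    if base_freqs is None:
        base_freqs = {n: 0.25 for n in NUCLEOTIDES}
    probability = math.prod(base_freqs[base] for base in motif)
    return probability * (genome_length - len(motif) + 1)


def enrichment_ratio(motif, genome, base_freqs=None):
    """Observed / expected; values above 1 mean more occurrences than chance predicts."""
    observed = count_occurrences(motif, genome)
    return observed / expected_occurrences(motif, len(genome), base_freqs)


# Toy data, invented for this example only.
group = ["ACGT", "ACGA", "ACGT", "TCGT"]
pfm = position_frequency_matrix(group)
motif = consensus(pfm)
```

On the toy group, the first column of `pfm` gives A a weight of 0.75 and T a weight of 0.25, so the consensus starts with A. A real analysis would scan a whole genome and would likely use a background model fitted to its actual base composition rather than the uniform one assumed here.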
