|Title:||Identification of Human Proteins Using Linguist’s Tools|
|Institution:||Indian Statistical Institute|
|Linguistic Field:||Discipline of Linguistics; Linguistic Theories; Syntax|
The symbolic sequences of the exons that make human proteins are subjected to methods of statistical linguistics. The ideas developed for natural languages by G.K. Zipf, when applied to these sequences, show significant promise. In particular, we argue that the Zipf exponent differentiates, and hence identifies, disparate human sequences. The codons, 64 in number, are distributed over the coding part of DNA sequences. Metaphorically speaking, the sequences almost resemble a linguistic structure. The distribution function is the plot of frequency versus rank of the codons. These distributions are characterized by parameters that are almost universal, i.e., gene independent. The authors present a theory for calculating the universal (gene-independent) part with the help of linguistic theory. The gene-specific part, however, has undetermined overlaps and fluctuations.
Keywords: Codon, Proteins, Syntax, Zipf's Law.
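The rank-frequency distribution described in the abstract can be illustrated with a minimal sketch. It assumes the coding sequence is read in frame from its first base, and it estimates the Zipf exponent by an ordinary least-squares fit of log frequency against log rank. That fitting procedure is an assumption made for illustration; the abstract does not state how the authors estimated the exponent. The function names `codon_rank_frequency` and `zipf_exponent` are hypothetical.

```python
import math
from collections import Counter


def codon_rank_frequency(seq):
    """Return codon counts sorted from most to least frequent (rank 1 first).

    Assumes the sequence is already in frame; a trailing partial codon is dropped.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    return sorted(Counter(codons).values(), reverse=True)


def zipf_exponent(freqs):
    """Estimate the Zipf exponent as minus the slope of log(frequency) vs log(rank).

    Uses a plain least-squares line fit (an illustrative choice, not the paper's method).
    """
    if len(freqs) < 2:
        raise ValueError("need at least two ranks to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


# Toy sequence, not real gene data.
freqs = codon_rank_frequency("ATGGCCATGGCCATGAAA")
exponent = zipf_exponent(freqs)
```

On real exon sequences the fit would run over up to 64 ranks, one per codon, and the resulting exponent could then be compared across genes.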
|Venue:||Saha Institute of Nuclear Physics|
|Publication Info:||Indian Journal of Biochemistry and Biophysics, Vol. 38, pp. 124-127, February and April, 2001|