16 September 2017

Gene-making: some questions answered

In my previous post I attempted to identify all the ways that a new gene can come about, after defining what I meant by "new" and "gene." Two questions came up, one a comment on the post, and the other via Facebook.

1. What about exon shuffling? Isn't that a mechanism by which new genes are made?

Exon shuffling is, roughly, the creation of new coding sequences (genes, as I'm defining them in this series) by the rearrangement of pieces of coding sequence. (An exon is a piece of coding sequence in DNA or RNA.) It's a nice descriptive phrase: the exons are gene pieces, and they can be moved around to make new combinations that code for new proteins. So is this an additional mechanism for the creation of genes?

No, it's not.

It's just another (perhaps better) way to describe the mechanisms that I described in my previous post. Exon shuffling is the result of the operation of those mechanisms. Specifically, mechanism 6, "Rearrangement of DNA in the genome," is probably the most common driver of exon shuffling. And sometimes simple duplication of a gene (my mechanism 4) can result in exon shuffling and birth of new genes. A recent paper in PLoS Genetics describes this and documents how it has contributed to gene birth in fruit flies. This means that when I claimed that gene duplication can't make new genes "by definition," I was underestimating the ways that even simple duplication can yield new functional combinations of genetic material.

2. Larry Moran has written extensively on evolutionary genetics, and on new gene birth specifically, and is a longtime online friend of mine. He commented on the post on Facebook, raising two concerns. First, he referred to possible confusion (I would say flexibility) about the definition of 'gene' and then noted that genes need not code for protein. He's right on both counts. But I'm writing about new protein-coding genes specifically. I'm making it as clear as I can, but I guess it's worth repeating: I am exploring new findings about the birth of new proteins, encoded by new genes. It's easy to make a new gene when 'gene' includes DNA sequences that make RNA that does stuff other than code for protein. That's interesting, and important, but it's not as remarkable (in my opinion) as a new gene that makes a new protein. And it's not the subject of this series.

Second, Larry doesn't agree with me that Arhgap11b is a de novo gene. This will quickly become semantic, and therefore uninteresting, but I think we have to call the Arhgap11b protein a new protein. I introduced the details in a previous post, and will make this case more explicitly soon, but my basic argument is that Arhgap11b includes a substantial stretch of new protein sequence that has been tacked onto pre-existing sequence. The resulting protein has a new and (so far) unknown function. Those two facts (new sequence and new function) make this a new gene in my view, despite the fact that some of its sequence is not new. And regardless of how much new sequence we require before calling a gene "new," we have to reckon with the fact that new protein sequences, even those generated randomly, have the capacity to step into new functional roles, essentially overnight.
