
Copyright © 2006 by the Genetics Society of America
DOI: 10.1534/genetics.105.050336

Recently Evolved Genes Identified From Drosophila yakuba and D. erecta

David J. Begun,1 Heather A. Lindfors, Melissa E. Thompson and Alisha K. Holloway

Section of Evolution and Ecology, University of California, Davis, California 95616

Accepted for publication November 22, 2005

The fraction of the genome associated with male reproduction in Drosophila may be unusually dynamic. For example, male reproduction-related genes show higher-than-average rates of protein divergence and gene expression evolution compared to most Drosophila genes. Drosophila male reproduction may also be enriched for novel genetic functions. Our earlier work, based on accessory gland protein genes (Acp’s) in D. simulans and D. melanogaster, suggested that the melanogaster subgroup Acp’s may be lost and/or gained on a relatively rapid timescale. Here we investigate this possibility more thoroughly through description of the accessory gland transcriptome in two melanogaster subgroup species, D. yakuba and D. erecta. A genomic analysis of previously unknown genes isolated from cDNA libraries of these species revealed several cases of genes present in one or both species, yet absent from ingroup and outgroup species. We found no evidence that these novel genes are attributable primarily to duplication and divergence, which suggests the possibility that Acp’s or other genes coding for small proteins may originate from ancestrally noncoding DNA.
An extensive literature documenting the unusually rapid evolution of reproductive traits in many taxa suggests that sexual selection may be a primary agent of evolution in natural animal populations (e.g., Eberhard 1985; Andersson 1994; Birkhead and Moller 1998). Although most data bearing on evolution of reproductive traits are morphological or behavioral in nature, directional selection on reproductive function should be manifest in patterns of genome evolution. For example, a genomic approach for identifying biological functions that may be under directional selection is to use sequence divergence in concert with gene annotation to identify functions enriched for rapidly evolving proteins (e.g., Nielsen et al. 2005; Richards et al. 2005). Such analyses support the idea that proteins functioning in male reproduction in Drosophila, mice, and primates evolve unusually quickly (Zhang et al. 2004; Good and Nachman 2005; Nielsen et al. 2005; Richards et al. 2005). Such data do not prove that rapid evolution results from directional selection. However, the repeatability across taxa of the pattern of rapid protein evolution is certainly consistent with this idea.

Drosophila ACPs (seminal fluid proteins) have been the subject of several evolutionary and functional investigations. These proteins elicit manifold physiological and behavioral changes in females (reviewed in Chapman and Davies 2004) and play an important role in sperm storage (Neubaum and Wolfner 1999; Tram and Wolfner 1999). They evolve quite rapidly compared to most proteins (Begun et al. 2000; Swanson et al. 2001; Holloway and Begun 2004; Kern et al. 2004; Mueller et al. 2005; Wagstaff and Begun 2005a,b). Population genetic evidence for directional selection on Acp’s has been found in the melanogaster subgroup, the repleta group, and the obscura group of Drosophila (Tsaur and Wu 1997; Aguadé 1999; Begun et al. 2000; Holloway and Begun 2004; Kern et al. 2004; Wagstaff and Begun 2005a,b; Begun and Lindfors 2005), perhaps due to male–male, male–female, or fly–pathogen interactions.

As noted previously, genomic surveys of divergence of male reproduction-related genes have demonstrated that they evolve rapidly compared to most other protein classes. Indeed, many testis-expressed Drosophila melanogaster genes have no obvious homolog in D. pseudoobscura (Richards et al. 2005), which is consistent with either very rapid evolution or gene presence/absence variation (i.e., lineage-restricted genes). The notion that genes coding for male reproductive functions may be enriched for lineage-restricted genes in Drosophila is supported by reports of recently evolved, novel genes that are expressed in Drosophila testes (Long and Langley 1993; Nurminsky et al. 1998; Betran and Long 2003).

Although there has been little systematic investigation regarding the question of whether reproductive functions are characteristic of lineage-restricted genes, we previously reported that in Drosophila, an Acp in a given species is sometimes absent from a related species (Begun and Lindfors 2005; Wagstaff and Begun 2005a). For example, 6 of 13 D. melanogaster Acp’s investigated were absent from D. pseudoobscura (Wagstaff and Begun 2005a). A subsequent analysis of additional D. melanogaster Acp’s vs. D. pseudoobscura yielded comparable results (Mueller et al. 2005). A subset of the D. melanogaster Acp’s that are absent from D. pseudoobscura have loss-of-function phenotypes or show evidence of directional selection in D. melanogaster/D. simulans, which suggests that invoking “functional redundancy” and gene loss is overly simplistic. In fact, these analyses of D. melanogaster vs. D. pseudoobscura could not broach the issue of whether the lineage distribution of Acp’s in these two species is explained by gene loss in D. pseudoobscura, gene gain in D. melanogaster, or some combination. We also found putative cases of recent loss of [...] 2005). For example, D. melanogaster is missing an Acp that was present in the common ancestor of D. melanogaster and D. simulans and that is present as a single-copy gene in D. simulans, indicating that this gene was lost within [...] did not find unambiguous evidence for gains of Acp’s in the melanogaster subgroup. Nevertheless, loss of Acp’s implies either that compensatory gains maintain melanogaster subgroup seminal fluid protein-coding capacity or that the melanogaster subgroup is evolving toward a lower equilibrium number of Acp’s per genome.

The gain and/or loss of Acp’s over time will result in the gradual functional divergence of seminal fluid function between Drosophila lineages, presumably under the influence of natural selection. One possible mechanism for gene gain is duplication followed by functional divergence (Ohno 1970). However, computational analysis of the D. melanogaster genome suggested that most duplicated Acp’s are ancient (Holloway and Begun 2004; Mueller et al. 2005), which does not support the idea that recent losses of the melanogaster subgroup Acp’s are entirely compensated for by recent duplication and divergence. The purpose of the work presented here was to systematically investigate potential gains of Acp’s in the melanogaster subgroup of Drosophila. This was accomplished by description of the accessory gland transcriptome in D. yakuba and D. erecta, followed by computational analysis of melanogaster group species genome assemblies. We have assumed that D. yakuba and D. erecta are sister species (Ko et al. 2003; Parsch 2003); D. ananassae served as the outgroup.

1Corresponding author: Section of Evolution and Ecology, University of [...]

MATERIALS AND METHODS

D. yakuba and D. erecta accessory gland cDNA libraries and ESTs: Accessory glands from 100 D. yakuba males (line Tai18E2) and 45 D. erecta males (line 14021-0224.0) were dissected in RNA-Later (Ambion, Austin, TX). Total accessory gland RNA was isolated using the Ambion mirVana miRNA kit and DNase treated (Ambion DNA-Free kit). RACE-ready cDNA was synthesized from 2 μg of each prep [Invitrogen (San Diego) GeneRacer kit; the SSIII module and oligo(dT) primer were used for the RT step]. The resulting cDNA was amplified (eight cycles for D. erecta; five cycles for D. yakuba) using the Roche Expand High Fidelity PCR System. Amplified libraries were purified [QIAGEN (Chatsworth, CA) QIAquick PCR purification kit], incubated in Promega (Madison, WI) Taq polymerase, and ligated into PCR4 TOPO vector (Invitrogen). Ligations were transformed and plated, with the resulting colonies subjected to PCR using vector primers. Colony PCR products were sequenced at the University of California at Davis College of Agricultural and Environmental Sciences Genomics Facility. For D. yakuba, 415 clones were sequenced. They yielded 360 high-quality sequences, which assembled (Lasergene) into 119 unique contigs. For D. erecta, 333 clones were sequenced. They yielded 252 high-quality sequences and 114 unique contigs. Unique D. yakuba and D. erecta accessory gland ESTs can be found under GenBank accession nos. [...] The complexity of these libraries appears to be considerably greater than that estimated from random sequencing of a D. mojavensis accessory gland cDNA library (Wagstaff and Begun 2005b; 26 transcripts from 139 random clones). This suggests that Drosophila species vary in the complexity of the accessory gland transcriptome, but more quantitative data would be required to address this issue.

Analysis of ESTs: Each unique EST was compared by BLAST to predicted D. melanogaster genes and proteins. ESTs returning E-values <1e-15 were considered to be candidate unannotated homologous Acp’s or candidate Acp’s absent from the D. melanogaster genome. Each candidate was then compared (BLASTn) to D. melanogaster chromosome arms to determine if there was evidence for an unannotated D. melanogaster gene corresponding to the D. yakuba or D. erecta EST. ESTs that failed to show convincing BLAST hits to D. melanogaster were candidate lineage-restricted genes (although they could also be highly diverged orthologs). RACE was used to isolate the entire transcript associated with each putative lineage-restricted gene. These genes were investigated in terms of splicing, predicted protein sequence, and whether they were present as putative single-copy genes in D. yakuba or D. erecta on the basis of BLAST or BLAT analyses to genome assemblies. Finally, given that most ACPs have strongly predicted signal sequences (Swanson et al. 2001), which are required for secretion, the predicted proteins were analyzed by SignalP to determine the likely presence/absence of a signal peptide (Bendtsen et al. 2004). Candidate lineage-restricted genes were subjected to additional investigation, as described in the next section.

TABLE 1. Summary of inferred phylogenetic distributions of genes

Search for orthologs based on syntenic alignments: Syntenic regions of variable size (generally several kilobases) encompassing each candidate gene were isolated from the D. yakuba or D. erecta genome assemblies [BLAT via the UCSC genome browser (Kent et al. 2002; http://genome.ucsc.edu) to D. yakuba (Release 1.0; Washington University Medical Genome Sequencing Center) or BLAST to D. erecta contigs (October 2004 assembly; sequencing by Agencourt) at http://rana.lbl.gov/drosophila/]. These regions were then analyzed by BLAT to identify putative orthologous regions of the D. melanogaster genome. This resulted in a putative orthologous region from D. melanogaster, D. yakuba, and D. erecta for each candidate, along with the gene annotation derived from our EST/RACE data and computational analysis for either D. yakuba or D. erecta. Finally, we attempted to isolate a syntenic region from D. ananassae (July 2004 assembly; sequencing by Agencourt) for each candidate. Generally, this was more difficult (and not always successful), probably because of greater sequence divergence, and often required investigation of larger genomic regions, occasionally up to 10–15 kb. Each gene region identified from a D. yakuba or a D. erecta accessory gland EST was investigated in detail in the corresponding region of the other species. This entailed pairwise alignments using the Martinez/Needleman-Wunsch algorithm as implemented in DNASTAR and/or multispecies alignments using ClustalW v. 1.82. In many cases, there was no DNA in other species corresponding to the gene of interest. In other cases, there was apparently a homologous sequence, but no obvious conserved open reading frame (ORF). For the latter, we computationally investigated the genomic sequence in the homologous region to determine protein-coding capacity and whether any putative proteins showed sequence similarity or similar protein lengths relative to the candidate, or whether a predicted protein had a predicted signal sequence. In a few cases, these investigations revealed evidence for highly diverged orthologous genes, likely Acp’s, which would have gone undetected on the basis of the alignment of DNA sequences.
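As a rough illustration of the protein-coding-capacity check described above, the following Python sketch lists open reading frames on both strands and in all three frames of a genomic region. It is not the authors' code. The names `find_orfs` and `reverse_complement`, the toy sequence, and the choice to count codons without the stop codon are assumptions made for this example; the 40-codon default follows the ORF definition the paper gives for its scan of intergenic and intronic sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=40):
    """Yield (strand, start, end, orf_dna) for ATG-initiated ORFs.

    An ORF starts at the first ATG after the previous in-frame stop and
    ends at the first in-frame stop codon. Coordinates are 0-based and
    end-exclusive on the scanned strand; for strand "-", they index the
    reverse complement. Length is counted in codons, excluding the stop
    (an assumption; the source does not say how the stop is counted).
    """
    seq = seq.upper()
    for strand, dna in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            orf_start = None
            for pos in range(frame, len(dna) - 2, 3):
                codon = dna[pos:pos + 3]
                if orf_start is None:
                    if codon == "ATG":
                        orf_start = pos
                elif codon in STOP_CODONS:
                    n_codons = (pos - orf_start) // 3
                    if n_codons >= min_codons:
                        yield strand, orf_start, pos + 3, dna[orf_start:pos + 3]
                    orf_start = None


if __name__ == "__main__":
    # Toy region (hypothetical): one short ORF on the forward strand.
    region = "CC" + "ATG" + "GCT" * 4 + "TAA" + "G"
    for orf in find_orfs(region, min_codons=5):
        print(orf)
```

Each reported ORF could then be translated and passed to a signal peptide predictor such as SignalP; that step is left out here because it depends on an external program.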
Population genetic analysis: Molecular population genetic data were collected for several D. yakuba- and/or D. erecta-specific genes. High-fidelity PCR was used to amplify Acp’s from multiple D. yakuba isofemale lines and a single D. teissieri isofemale line (provided by P. Andolfatto and M. Long, respectively). These PCR products were cloned and subjected to colony PCR. A single allele was isolated and sequenced from each line. Summary statistics and tests of neutral evolution were generated by use of DnaSP (Rozas et al. 2003). Sequence data for the population genetics analysis can be found under GenBank accession nos. DQ318145–319181.
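One test of neutral evolution of this kind, and the one the paper cites for its pooled analysis (McDonald and Kreitman 1991), compares the ratio of nonsynonymous to synonymous changes among fixed differences with the same ratio among polymorphisms. The sketch below is a minimal illustration, assuming a two-sided Fisher's exact test on the 2 × 2 table. The function names are hypothetical, the counts in the usage lines are illustrative only and are not data from this study, and the source does not say which statistic DnaSP applied.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p = sum(q for q in (prob(x) for x in range(lo, hi + 1))
            if q <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def mk_test(fixed_nonsyn, fixed_syn, poly_nonsyn, poly_syn):
    """Return (neutrality_index, p_value) for a McDonald-Kreitman table.

    A neutrality index below 1 means an excess of nonsynonymous fixed
    differences, the direction expected under adaptive protein divergence.
    """
    p_value = fisher_exact_two_sided(fixed_nonsyn, fixed_syn,
                                     poly_nonsyn, poly_syn)
    denominator = poly_syn * fixed_nonsyn
    if denominator == 0:
        ni = float("nan")  # index undefined when a required cell is empty
    else:
        ni = (poly_nonsyn * fixed_syn) / denominator
    return ni, p_value


if __name__ == "__main__":
    # Illustrative counts only (hypothetical, not from this study).
    ni, p = mk_test(fixed_nonsyn=7, fixed_syn=17, poly_nonsyn=2, poly_syn=42)
    print(f"neutrality index = {ni:.3f}, two-sided P = {p:.4f}")
```

Pooling genes, as the paper does, amounts to summing the four cells across loci before calling `mk_test`; this assumes the loci can reasonably be treated as one sample.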
Signal sequence potential of D. melanogaster intergenic and intronic sequences: Intergenic sequences (defined as sequen- SignalP probabilities and lengths are from D. yakuba for D.
ces between two adjacent genes, independent of a strand) and yakuba-derived genes and from D. erecta for D. erecta-derived genes.
introns were obtained from release 4.1 of the D. melanogastergenome. Introns were parsed to mask known exons embed-ded within them. RepeatMasker (S more detail below. None are associated with repetitive used to mask repetitive elements of intergenic and intronic sequences; all show male-specific expression as deter- sequences. A Perl script was used to identify single-exon ORFs mined by RT–PCR on templates generated from RNA in the remaining DNA. An ORF was defined as a continuous isolated from whole adult males or females. Syntenic sequence starting with an ATG that extends at least 40 codons alignments of these putative lineage-restricted genes and ends with the first termination codon. ORFs from bothstrands and all reading frames were included in the data and orthologous regions can be found in Supplemental set. SignalP version 3.0 was used to predict the presence or Data B at http://www.genetics.org/supplemental/; pu- absence of signal peptides, which are characteristic of secreted tative CDS regions are in boldface type with the excep- proteins (Bendtsen et al. 2004). SignalP employs two meth- tion of Gene144, for which the transcript is in boldface ods, a neural network method and a hidden Markov model, for type; introns are underlined. Table 1 summarizes inferred detecting signal sequences. We accepted that an ORF had asignal sequence if both the neural network and hidden phylogenetic distributions of putative lineage-restricted Markov model (posterior probability $ 0.95) predicted that genes and some physical properties of the gene/pro- tein, including the probability that the predicted aminoacid has a signal sequence, which is frequently found inAcp’s (Swanson et al. 2001). Table 2 presents the results of BLAST analysis of several D. yakuba accessory gland Many of our D. yakuba/D. erecta accessory gland ESTs ESTs corresponding to putative novel genes compared returned highly significant BLAST hits to annotated D.
to the genomes of D. yakuba (April 2004 assembly), D.
melanogaster genes or proteins. These were not consid- melanogaster (release 4.2.1), D. erecta (August 2005 assem- ered further. Several ESTs had highly significant BLAST bly), and D. ananassae (August 2005 assembly). Table 3 hits to unannotated D. melanogaster sequence (as well as provides summary statistics of D. yakuba polymorphism to D. yakuba and D. erecta genomic sequence). On the basis and divergence to D. teissieri for five genes.
of the conserved location and organization of an open Putative lineage-restricted genes identified from D.
reading frame and the presence of a strongly predicted yakuba accessory gland ESTs: Acp134 codes for a pre- signal sequence in either D. yakuba or D. erecta and D.
dicted protein of 35 residues. This gene is represented in melanogaster, we consider 20 genes to be candidates the D. yakuba testis ESTcollection (CV785591, CV785729, for previously unknown Acp’s that are shared among CV786139), probably as a result of low-level contami- melanogaster subgroup species [supplemental Data A nation of the testis dissection with accessory gland at http://www.genetics.org/supplemental/ presents the tissue. Acp134 returns no significant BLAST results vs.
putative D. melanogaster protein-coding sequence (CDS) D. melanogaster, D. erecta, or D. ananassae. The putative for each gene]. However, additional empirical work syntenic alignments for the D. yakuba Acp134 region with would be required to solidify their status as such.
D. melanogaster, D. erecta, and D. ananassae suggest that Accessory gland ESTs for which we failed to find there are no plausible orthologous protein-coding regions putative orthologs in other species are presented in in D. melanogaster, D. erecta, or D. ananassae that correspond putative syntenic alignment between D. yakuba and D.
ananassae is presented in supplemental data at http:// BLASTn results (default parameters) of D. yakuba ESTs from www.genetics.org/supplemental/. However, the quality putative orphans to the D. yakuba genome (two best hits), to other melanogaster subgroup species genomes (best hit), of this alignment leads us to consider the status of the and annotation of the corresponding microsyntenic Acp223 codes for a predicted protein of 116 residues.
It is located between the D. yakuba orthologs of Obp56f and Obp56e. Indeed, the organization of the three genes is similar, which together with their physical location,suggests that they are paralogous. D. erecta also has a copy of Acp233. D. yakuba Acp223 is more highly diverged from the D. yakuba Obp56e and Obp56f genes than these genes are from one another. A partial, homologous D.
melanogaster ORF appears to be present; however, it codes for a predicted protein of only 44 residues, which leaves it with questionable status in D. melanogaster (Supplemental Data B at http://www.genetics.org/ supplemental/). A syntenic alignment of the putative D. ananassae orthologous region with D. yakuba pro- vides no evidence for a D. ananassae copy of Acp223.
Acp224 codes for a predicted protein of 231 residues in D. yakuba and is located within an intron of CG31757.
An alignment of the orthologous region from D. erectareveals that the reading frame starting with the D. yakuba to D. yakuba Acp134. Moreover, a computational analysis of initiation codon codes for a predicted protein of 75 these orthologous regions also revealed no potential residues. However, the fact that the D. yakuba gene and genes that were plausible orthologs. These data strongly the putative D. erecta ortholog are extremely divergent in suggest that Acp134 is present only in D. yakuba.
terms of length and sequence casts some doubt on the Acp225 codes for a predicted protein of 121 residues.
status of the D. erecta gene. To address this uncertainty, The syntenic alignment strongly suggests that there is we used RACE on accessory gland cDNA to isolate the no ortholog of Acp225 in D. melanogaster or D. erecta. A ends of the D. erecta gene. The RACE results revealed small ORF (36 bp) in D. erecta in the region near the first that there is an apparently orthologous D. erecta tran- exon of D. yakuba Acp225 is clearly not orthologous. A script, which codes for two potential ORFs (89 codons D. yakuba/D. teissieri population genetics data for putative orphans n is the number of D. yakuba alleles sampled. For D. teissieri, n ¼ 1 for all loci. Genes are on chromosome arm 2R with the exception of Acp225, which is on 3R. Divergence estimates are Jukes–Cantor corrected.
and the aforementioned 75 codons) that share the same RACE), along with the absence of a genomic poly(A) reading frame (but different initiation codons). The sequence downstream of the transcript, suggests that it is shorter ORF has a more strongly predicted signal se- not the result of genomic contamination. We unsuccess- quence, which suggests that it is the more likely candi- fully attempted to amplify the homologous region by date. Acp224 is the only putative Acp from our study that RT–PCR using RNA isolated from whole D. melanogaster has a recognizable functional domain based on an NCBI males. This failure is consistent with the idea that this gene conserved domain search (Marchler-Bauer et al. 2003).
is not present in each of the melanogaster subgroup species.
The D. yakuba copy has three predicted Kazal-type serpin Acp157a codes for a 112-residue-long predicted pro- domains, while the D. erecta copy has one such predicted tein. An alignment of the D. yakuba Acp157a region to domain. Serpin domains have previously been observed orthologous regions of the D. erecta and D. melanogaster in Drosophila Acp’s (Swanson et al. 2001; Mueller et al.
genomes shows that D. erecta contains an ortholog, while 2004). Syntenic alignments of D. yakuba Acp224 region D. melanogaster does not. A similar alignment to the pu- vs. D. melanogaster and D. ananassae (Supplemental Data tative orthologous region of the D. ananassae assembly B at http://www.genetics.org/supplemental/) strongly strongly suggests that the gene is not in this species.
suggest that the gene is absent from these species. Thus, Thus, Acp157a is likely a D. yakuba/D. erecta-specific gene.
Acp224 is likely a very rapidly evolving D. yakuba/D.
D. yakuba, but not other species, harbors a nearby, recent duplication (4 kb 59) of Acp157a. However, this dupli- Acp158 codes for a predicted protein of 71 residues.
cation has no long open reading frame, suggesting that Syntenic alignments of orthologous regions in D. mela- it is a D. yakuba-specific pseudogene.
nogaster and D. erecta provide no evidence of an ortho- Putative lineage-restricted genes identified from D.
logous gene in these species. This gene is located within erecta accessory gland ESTs: Acp100 codes for a pre- an intron of Pkc53E. Another putative Acp, Acp133, which dicted protein of 190 residues. A potential highly di- is likely shared in D. melanogaster, D. yakuba, and D. erecta, verged D. yakuba ortholog is present. This D. yakuba gene is located 1.2 kb 59 of Acp158 in D. yakuba, also in a shares the putative D. erecta initiation codon, but with a Pkc53E intron. Acp133 and Acp158 code for proteins of predicted length of 263 residues, is significantly longer roughly equal length (62 and 71 residues, respectively) than the predicted D. erecta protein. Both species share a and both are composed of two small exons and one canonical polyadenlyation signal downstream of their small intron. These similarities, along with their physical putative stop codons. A syntenic alignment between D.
proximity, suggest the possibility that the two genes are erecta and D. melanogaster suggests that the gene is absent related by duplication. However, their predicted protein from the latter. We were unable to generate a convincing sequences are too highly diverged to provide strong syntenic alignment with D. ananassae.
evidence of homology. The data are consistent with the Gene 37 codes for a predicted protein of 80 residues.
idea that Acp158 is a highly diverged duplication of This protein does not have a predicted signal sequence, Acp133 that is present only in D. yakuba. This implies casting some doubt on its status as an Acp. Syntenic align- either that Acp158 is a recent duplication that has di- ments to D. yakuba, D. melanogaster, and D. ananassae sug- verged incredibly rapidly or that Acp158 is an old dupli- gest that this gene is D. erecta specific. We computationally cation that has been lost multiple times in the melanogaster discovered a second putative open reading frame (single subgroup. Alternatively, it is possible that these two genes exon, 210 residues) that is 39 of gene 37 and coded on the are not paralogous. The alignment of the D. yakuba opposite strand (the putative CDS is annotated by left- Acp158 region with the putative orthologous region of facing arrows in the supplemental data alignment at D. ananassae suggests that neither it nor Acp133 is pres- http://www.genetics.org/supplemental/). This second pu- ent in this species, although some uncertainty regarding tative gene, which contains a strongly predicted signal the alignment means that this conclusion should be sequence and a predicted fibrinogen domain, overlaps gene 37 (their putative 39-ends overlap). The best hit in Gene144 has a single exon. The protein-coding po- a BLASTp analysis of this second gene to D. melanogaster tential of this gene is unclear. Transcript data from our proteins is to CG30281 (6e-36, 40% identity). CG30281 original cDNA clone and RACE experiments suggest the is associated with the gene ontology terms ‘‘receptor possibility of three open reading frames, two of which binding’’ and ‘‘defense response.’’ It appears to be D.
start with methionine and code for predicted proteins of erecta specific. However, we were unable to generate a D.
14 residues and one of which starts with isoleucine and erecta RT–PCR product, which casts doubt on its status.
codes for a predicted protein of 39 residues (which is not Population genetics of lineage-restricted Acp’s: We predicted to have a signal sequence). None of the three collected polymorphism and divergence data from several open reading frames is conserved in D. melanogaster, D. yakuba/D. erecta-specific putative Acp’s to investigate although there is apparently orthologous genomic se- mechanisms of protein evolution between D. yakuba and quence. This is likely not an Acp, and may not be a protein- D. teissieri (Table 2). The data, pooled across genes, coding gene (e.g., Tupyet al. 2005). However, the fact that reject the null (neutral) model (Kimura 1983) in the we isolated this putative transcript twice (cDNA clone and direction of adaptive protein divergence (McDonald and Kreitman 1991); however, only one gene, Acp158, is sequence. Acp’s have several features that make this sug- individually significant. Removing the data from Acp158 gestion worth considering. First, they tend to have short yields a nonsignificant test on data from the remaining open reading frames, of which there are huge numbers genes (P ¼ 0.17). Thus, although the rates of protein in noncoding genomic sequence. Second, as secreted divergence reported here are high compared to most proteins, a signal sequence is the primary functional Drosophila genes (e.g., Begun 2002; Richards et al.
element. Although signal sequences tend to be hydro- 2005), there is no strong support for recent, recurrent phobic and a-helical (Doudna and Batey 2004), the directional selection on these genes overall.
amino acid sequences are not always highly conserved(Nielsen et al. 1997). Third, Acp’s frequently have noknown functional domains apart from their signal sequences (Swanson et al. 2001; Mueller et al. 2005; We discovered several genes, many of which are likely Wagstaff and Begun 2005b), which is consistent with Acp’s, that have a lineage-restricted distribution in the the potential for a large degree of functional and evo- melanogaster subgroup. Each lineage-restricted gene de- lutionary lability. Finally, seminal fluid function may be scribed here could be explained in two ways: (i) as a novel under stronger or more frequent directional selection gene gained in D. yakuba, D. erecta, or their common an- than many other biological functions, which may make cestor or (ii) as multiple losses of a gene. One’s intuition is it more likely for novel Acp’s to invade populations.
that gains of novel genetic functions are much less likely Unannotated portions of eukaryotic genomes (and, than losses. The problem with this formulation is that it indeed, random DNA sequences) contain many short raises the question, How many losses must one invoke (e.g., 30–100 residues) open reading frames. A fraction before entertaining the hypothesis of gene gain as equally of new mutations, most of which are likely deleterious (or more) parsimonious? Regardless of the conclusion (Hahn et al. 2003), may create promoters near such for any particular Acp, it seems unreasonable to repeat- ORFs, thereby driving their expression, even if at a low edly invoke multiple losses and disallow occasional gains, level. Moreover, the consensus, highly conserved animal as this would imply that ancestral seminal fluid function polyadenylation signal AATAAA (Zhao et al. 1999) is is being lost from Drosophila, which seems unlikely.
short, simple, and, therefore, common. Thus, at muta- Thus, we favor the interpretation that some of the tion-selection balance there is likely a large pool of small orphan genes described here are newly evolved.
open reading frames (many of which possess signal What are plausible mechanisms for the origin of novel sequences) that are a short mutational distance from del- Acp’s? One possibility is duplication and divergence eterious expression and translation. Occasionally, how- (Holloway and Begun 2004; Mueller et al. 2005). For ever, a ‘‘spuriously’’ expressed ORF coding for a small, example, Acp158, which appears to be present only in D.
secreted peptide could be recruited into a novel function yakuba, may be a highly diverged duplicate of Acp133, which is present in D. melanogaster, D. yakuba, and D.
erecta. However, most of our orphans cannot be explained this way (Table 2), as BLAST results support the idea that they are unique. This is consistent with previous analyses of the D. melanogaster genome suggesting the presence of few recent Acp duplications (Holloway and Begun 2004; Mueller et al. 2005). An alternative possibility is that novel genetic functions can be co-opted from previously noncoding sequence. Such phenomena have been observed before. For example, the recently evolved D. melanogaster gene, Sdic, is partially derived from an intron of a cytoplasmic dynein gene (Nurminsky et al. 1998). In notothenioid fishes, intronic sequence from an ancestral trypsinogen gene has been co-opted into protein-coding function in a descendant antifreeze protein (Chen et al. 1997). Such examples support the plausibility of the recruitment of ancestral noncoding sequence into coding function. For the genes described here, however, there is neither evidence for partial derivation from ancestral protein-coding sequence nor evidence of association with transposable elements or other repetitive sequences.
These observations raise the question of the plausibility of the birth of novel Acp’s entirely from small open reading frames present in ancestrally noncoding [...]
To investigate the plausibility of this scenario, we carried out an analysis of the signal peptide-coding potential of the intergenic and intronic portions of the D. melanogaster reference sequence. We found that RepeatMasked D. melanogaster intergenic sequence harbors 174,779 open reading frames of ≥40 residues. Of these, we conservatively estimate that 6071 (3.5%) have a strongly predicted signal sequence (SignalP, hidden Markov model P > 0.95 and positive neural network prediction). The corresponding numbers for introns are 53,003 ORFs and 1963 strongly predicted signal sequences (3.7%). Although a small fraction of these ORFs may be previously undescribed genes or exons, it seems more likely that the coding potential for novel, small, secreted peptides in Drosophila noncoding DNA is impressively large. Recent reports that a surprisingly high fraction of eukaryotic genomes is transcribed (Bertone et al. 2004; Stolc et al. 2004, 2005) would favor the mutation-selection-recruitment model for the origin of small peptides. Direct support for this model could be best obtained through the discovery of small, novel, polymorphic proteins in populations.
It seems clear that Acp’s are much more likely than most other genes to have lineage-restricted distributions.
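As a rough illustration of the open-reading-frame census described above, the following Python sketch counts ORFs of at least 40 residues on both strands of a sequence and recomputes the reported signal-sequence fractions. It is not the pipeline used in this study. The ORF definition (first in-frame ATG to the next in-frame stop codon), the treatment of masked N bases as frame-breaking, and the function names `find_orfs` and `percent` are all assumptions made for this example, and the SignalP step is not reproduced.

```python
from typing import Iterator, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_residues: int = 40) -> Iterator[Tuple[str, int, int]]:
    """Yield (strand, frame, n_residues) for each ORF of >= min_residues.

    Assumed definition (not taken from the paper): an ORF runs from the
    first in-frame ATG after the previous stop to the next in-frame stop
    codon; the stop codon itself is not counted as a residue.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if "N" in codon:
                    # Assumption: masked bases interrupt any open frame.
                    start = None
                elif codon in STOP_CODONS:
                    if start is not None:
                        n_residues = (i - start) // 3
                        if n_residues >= min_residues:
                            yield strand, frame, n_residues
                    start = None
                elif start is None and codon == "ATG":
                    start = i


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


if __name__ == "__main__":
    # Toy sequence: ATG followed by 40 alanine codons and a stop codon.
    toy = "ATG" + "GCT" * 40 + "TAA"
    print(list(find_orfs(toy)))
    # Fractions recomputed from the counts reported in the text.
    print(round(percent(6071, 174779), 1), round(percent(1963, 53003), 1))
```

Applied to the counts reported above, `percent(6071, 174779)` and `percent(1963, 53003)` round to 3.5 and 3.7, respectively.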
The proximate and ultimate explanations for this pattern are unclear, although, in principle, the small size of Acp’s and the fact that they may be under unusually strong directional selection may contribute to a rapid gain of seminal fluid proteins. Comparative functional analysis of Acp’s, including the lineage-restricted genes described here, could greatly illuminate their evolution.
M. Levine, S. Schaeffer, and two anonymous reviewers provided useful comments. This work was supported by National Science Foundation grant DEB-0327049 and National Institutes of Health [...]

[...] Positive selection drives the evolution of the Acp29AB accessory gland protein in Drosophila. Genetics 152: [...]
[...] Sexual Selection. Princeton University Press, [...]
[...] Protein variation in Drosophila simulans and comparison of genes from centromeric versus non-centromeric regions of chromosome 3. Mol. Biol. Evol. 19: 201–203.
[...] Acp complement in the melanogaster subgroup of Drosophila. [...]
Begun, D. J., P. Whitley, B. Todd, H. Waldrip-Dail and A. G. [...] Molecular population genetics of male accessory gland proteins in Drosophila. Genetics 156: 1879–1888.
Bendtsen, J. D., H. Nielsen, G. von Heijne and S. Brunak, [...] Improved prediction of signal peptides: SignalP 3.0. [...]
Bertone, P., V. Stolc, T. E. Royce, J. S. Rozowsky, A. E. Urban et al., [...] Global identification of human transcribed sequences with genome tiling arrays. Science 306: 2242–2246.
[...] posed gene with specific male expression under positive Darwinian selection. Genetics 164: 977–988.
Birkhead, T. R. and A. P. Moller (Editors), 1998 [...] and Sexual Selection. Academic Press, San Diego.
[...] seminal fluid proteins of male Drosophila melanogaster fruit [...]
Chen, L., A. L. DeVries and C-H. C. Cheng, 1997 [...] antifreeze glycoprotein from a trypsinogen gene in Antarctic notothenioid fish. Proc. Natl. Acad. Sci. USA 94: 3811–3816.
[...] recognition particle. Annu. Rev. Biochem. 73: 539–557.
[...] Sexual Selection and Animal Genitalia. Harvard [...]
[...] are positively correlated with developmental timing of expression during mouse spermatogenesis. Mol. Biol. Evol. 22: 1044–[...]
Hahn, M. W., J. E. Stajich and G. A. Wray, 2003 [...] against spurious transcription factor binding sites. Mol. Biol. [...]
[...] population genetics of duplicated accessory gland protein genes in Drosophila. Mol. Biol. Evol. 21: 1625–1628.
Kent, W. J., C. W. Sugnet, T. S. Furey, K. M. Roskin, T. H. Pringle [...] The human genome browser at UCSC. Genome Res. [...]
Kern, A. D., C. D. Jones and D. J. Begun, 2004 [...] population genetics of male accessory gland proteins in the Drosophila simulans complex. Genetics 167: 725–735.
[...] The Neutral Theory of Molecular Evolution. Cambridge [...]
Ko, W. Y., R. M. David and H. Akashi, 2003 [...] of the Drosophila melanogaster species subgroup. J. Mol. Evol. 57: 562–573.
[...] of jingwei, a chimeric processed functional gene in Drosophila. Science 260: 91–95.
Marchler-Bauer, A., J. B. Anderson, C. DeWeese-Scott, N. D. [...] database of conserved domain alignments. Nucleic Acids Res. 31: 383–387.
[...]tion at the Adh locus in Drosophila. Nature 351: 652–654.
Mueller, J. L., D. R. Ripoll, C. F. Aquadro and M. F. Wolfner, [...] Comparative structural modeling and inference of conserved protein classes in Drosophila seminal fluid. Proc. Natl. [...]
Mueller, J. L., K. RaviRam, L. A. McGraw, M. C. Bloch Qazi, E. D. [...] Cross-species comparison of Drosophila male accessory gland protein genes. Genetics 171: 131–143.
[...] melanogaster females require a seminal fluid protein, Acp36DE, to store sperm efficiently. Genetics 153: 845–857.
Nielsen, H., J. Engelbrecht, S. Brunak and G. von Heijne, [...] Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng. 10: 1–6.
Nielsen, R., C. Bustamante, A. G. Clark, S. Glanowski, T. B. Sackton [...] A scan for positively selected genes in the genomes of humans and chimpanzees. PLoS Biol. 3(6): e170.
Nurminsky, D. I., M. V. Nurminskaya, D. DeAguiar and D. L. [...] Selective sweep of a newly evolved sperm-specific gene in Drosophila. Nature 396: 572–575.
[...] Evolution by Gene Duplication. Springer-Verlag, Berlin.
[...] Selective constraints on intron evolution in Drosophila. [...]
Richards, S., Y. Liu, B. B. Bettencourt, P. Hradecky, S. Letovsky [...] Comparative genome sequencing of Drosophila pseudoobscura: chromosomal, gene, and cis-element evolution. Genome [...]
Rozas, J., J. C. Sanchez-DelBarrio, X. Messegyer and R. Rozas, [...] DnaSP, DNA polymorphism analysis by the coalescent and other methods. Bioinformatics 19: 2496–2497.
Smit, A. F. A., R. Hubley and P. Green, 1996–2004 RepeatMasker Open-3.0 (http://www.repeatmasker.org).
Stolc, V., Z. Gauhar, C. Mason, G. Halasz, M. F. van Batenburg [...] A gene expression map for the euchromatic genome of Drosophila melanogaster. Science 306: 655–660.
Stolc, V., M. J. Samanta, W. Tongpsait, H. Sethi, S. Liang et al., [...] Identification of transcribed sequences in Arabidopsis thaliana by using high-resolution genome tiling arrays. Proc. Natl. [...]
Swanson, W. J., A. G. Clark, H. Waldrip-Dail, M. F. Wolfner and [...] Evolutionary EST analysis identifies rapidly evolving male reproductive proteins in Drosophila. Proc. Natl. [...]
[...] essential for sperm storage in Drosophila melanogaster. Genetics [...]
[...] evolution of a gene of male reproduction, Acp26Aa of Drosophila. Mol. Biol. Evol. 14: 544–549.
Tupy, J. L., A. M. Bailey, G. Dailey, M. Evans-Holm, C. W. Siebel [...] Identification of putative noncoding polyadenylated transcripts in Drosophila melanogaster. Proc. Natl. Acad. Sci. [...]
[...] accessory gland protein genes in Drosophila melanogaster and D. pseudoobscura. Mol. Biol. Evol. 22: 818–832.
[...] genetics of accessory gland protein genes and testis-expressed genes in Drosophila mojavensis and D. arizonae. Genetics 171: 1083–1101.
Zhang, Z., T. M. Hambuch and J. Parsch, 2004 [...] of sex-biased genes in Drosophila. Mol. Biol. Evol. 21: 2130–2139.
[...] ends in eukaryotes: mechanism, regulation, and interrelationships with other steps in mRNA synthesis. Microbiol. Mol. Biol. [...]

Source: http://www.eve.ucdavis.edu/djbegun/Begunetal_Genetics06.pdf
