Short insert paired-end reads were corrected and assembled with SGA v0.9.798 (Supplementary Fig. 26a). This assembly was used to calculate the k-mer distribution for all odd k of 41–81, using GenomeTools v.1.3.799. The k-mer length for which the maximum number of unique k-mers was present was used as the k-mer setting in a second assembly, using Velvet v1.2.03100 with SGA-corrected reads. For species with 3 kb mate-pair data, the Velvet assembly was scaffolded using SSPACE101. Contigs were extended, and gaps closed and shortened, using Gapfiller102 and IMAGE103. Short fragment reads were remapped to the assembly using SMALT (see URLs), and unaligned reads assembled using Velvet100 and this merged with the main assembly. The assembly was re-scaffolded using SSPACE101, and consensus base quality improved with iCORN104. REAPR105 was used to break incorrectly assembled scaffolds/contigs. We carried out manual improvement for Wuchereria bancrofti and D. medinensis using Gap5106 and Illumina read-pairs.
WSI assembly quality control
Contamination screening. Assemblies were screened for contamination using BLAST107 against vertebrate and invertebrate sequences (see ref. 108). For Anisakis simplex, the assembly contained minor laboratory contamination with S. mansoni, which we removed using BLASTN against S. mansoni.Assembly completeness. CEGMA v2.4109 was used to assess completeness. Consistent sets of CEGMA genes were missing from some phylogenetic groups (Supplementary Table 2); these were discounted from the completeness calculation for those species (‘CEGMA’ in Supplementary Table 2).
Effect of repeats. We re-mapped the short-insert library’s reads to the appropriate assembly using SMALT (see URLs; indexing -k13 -s4 and mapping -y 0.9 -x -r 1). For each scaffold of ≥8 kb, median (meds) and mean (ms) per-base read-depth were calculated using BEDTools110, and genome-wide depth (medg) calculated as the median meds (ref. 17). For a ls bp scaffold, the extra sequence that would be gained by ‘uncollapsing’ repeats was estimated as es = (ms − medg) × ls/medg (Supplementary Table 5).
WSI gene prediction
Our pipeline111 had four steps (Supplementary Fig. 27a). First, repeats were masked. Second, preliminary gene predictions, to use as input for MAKER v2.2.28112 were generated using Augustus 2.5.5113, SNAP 2013-02-16114, GeneMark-ES 2.3a115, genBlastG116 and RATT117. Third, species-specific ESTs and complementary DNAs from INSDC118, and proteins from related species, were aligned to the genome using BLAST107. Last, EST/protein alignments and gene models were used by MAKER to produce a gene set.McDonnell Genome Institute (MGI) data production
The genomes of six species were sequenced at MGI (Supplementary Tables 1 and 2).MGI sequencing, assembly and quality control
Genome sequencing was carried out on Illumina and 454 instruments (see ref. 119). The workflow for each assembly is in Supplementary Table 1.Three kilobase, 8 kb and fragment 454 reads (or Illumina reads) were subject to adapter removal, quality trimming and length filtering (Supplementary Fig. 26b). Cleaned 454 reads were assembled using Newbler120 before being scaffolded with an in-house tool CIGA, which links contigs based on cDNA evidence. Cleaned Illumina reads were assembled using AllPaths-LG121. The assembly was scaffolded further using an in-house tool Pygap, using Illumina short paired-end sequences; and L_RNA_scaffolder122, using 454 cDNA data.
An assisted assembly approach was used for Trichinella nativa, whereby ‘cleaned’ Illumina 3 kb paired-end sequence data were mapped against the T. spiralis genome using bwa123 (Supplementary Fig. 26b), and the T. nativa residues were substituted at aligned positions (see ref. 119).
Adaptor sequences and contaminants were identified by comparison to a database of vectors and contaminants, using Megablast124.
MGI transcriptome sequencing and gene prediction
Transcriptome libraries (Supplementary Table 22) were generated with the Illumina TS stranded protocol, and reads assembled using Trinity125 (see ref. 119).Genes were predicted using MAKER112, based on input gene models from SNAP114, FGENESH (Softberry), Augustus113, and aligned messenger RNA, EST, transcriptome and protein data from the same or related species (Supplementary Fig. 27b; see ref. 119).
Blaxter Nematode and Neglected Genomics (BaNG) data production
The genomes of three species were sequenced by BaNG (Supplementary Tables 1 and 2).Sequencing was performed on Illumina HiSeq 2000 and HiSeq 2500 instruments, using 100 or 125 base, paired-end protocols. Paired-end libraries were generated using the Illumina TruSeq protocol.
Sequence data were filtered of contaminating host reads using blobtools126. Cleaned reads were normalized with the khmer software127 using a k-mer of 41, and then assembled with ABySS (v1.3.3)128, with a minimum of three pairs needed to connect contigs during scaffolding (n = 3) (Supplementary Fig. 26c). Assemblies were assessed using blobtools and CEGMA109.
Augustus113 was used to predict gene models, trained using annotations from MAKER112. As hints for MAKER, we used Litomosoides sigmodontis 454 RNA sequencing data assembled with MIRA129 and Newbler120, and Onchocerca ochengi Illumina RNA sequencing data130 assembled using Trinity131 (Supplementary Fig. 27c).
Defining high-quality ‘tier 1’ species
A subset of nematode and platyhelminth genomes, termed ‘tier 1’, was selected that had better-quality assemblies and spanned the major clades (Supplementary Table 4). To choose these, species were selected that (1) had contiguous assemblies (usually N50/scaffold-count >5), and complete proteomes (usually CEGMA partial >85%), or (2) that helped to ensure ~50% of the genera in each species group (‘Analysis group’ in Supplementary Table 4) were represented.Analysis of repeat content and genome size
For each species, repeat libraries were built using RepeatModeler (see URLs), TransposonPSI (see URLs) and LTRharvest132, and the three libraries merged (see ref. 133). The merged library was used to mask repeats in a species’ genome using RepeatMasker (see URLs; –s).The initial standard regression model and stepwise model fitting used ‘lm’ and ‘step’ in R v3.2.2. The Bayesian mixed-effect model used MCMCglmm134 (v2.24). To create a mixed-effect model, the species tree (see Methods) was transformed into an ultrametric tree using PATHd8135, with a small constant added to short branches to ensure no zero-length branches were reconstructed; and outgroup species were removed.
Compara database
An in-house Ensembl Compara46 database was constructed containing the 81 platyhelminths and nematodes, and 10 additional outgroups (Supplementary Table 2). All parasitic nematode/platyhelminth species with gene sets available at the time (April 2014) were included.The species tree used to construct the initial version of our database used an edited version of the National Center for Biotechnology Information (NCBI) taxonomy136 with several controversial speciation nodes represented as multifurcations. For our final database, the input species tree was derived by building a tree based on the previous database version, based on one-to-one orthologs present in ≥20 species. To do this, proteins in each ortholog group were aligned using MAFFT v6.857137; alignments trimmed using GBlocks v0.91b138, concatenated and used to build a maximum likelihood tree using a partitioned analysis in RAxML v7.8.6139, using the minimum Akaike’s information criterion (minAIC) model for each ortholog group.
The database was queried to identify gene families, orthologs and paralogs.
Species tree and tree based on gene family presence
We identified 202 gene families present in ≥25% of the 91 species (81 helminths and 10 outgroups) in our Compara database (Methods) and always single-copy. For each family, amino acid sequences were aligned using MAFFT v7.205137 (-auto). Each alignment was trimmed using GBlocks v0.91b138 (-b4 = 4 -b3 = 4 -b5 = h), and its likelihood calculated on a maximum-parsimony guide tree for all relatively simple (single-matrix) amino acid substitution models in RAxML v8.0.24139, and the minAIC model identified. Alignments were concatenated and a maximum-likelihood tree built, under a partitioned model in which sites from a gene were assigned the minAIC model for that gene, with a discrete gamma distribution of rates across sites. Relationships within outgroup lineages were constrained to match the standard view of metazoan relationships (for example, Dunn et al.140). The final tree was the highest likelihood one from five search replicates with different random number seeds. One hundred bootstrap resampling replicates were performed, each based on a single rapid search.We also constructed a maximum-likelihood phylogeny based on gene family presence/absence for families not shared by all 81 nematode/platyhelminth species, using RAxML v8.2.8139, with a two-state model and the Lewis method to correct for absence of constant-state observations.
Functional annotation
InterProScan141 v5.0.7 was used to identify conserved domains from all predicted proteins. A name was assigned to each predicted protein based on curated information in UniProt142 for orthologs identified from our Compara database (Methods), or based on InterPro143 domains (see ref. 144). Gene ontology (GO) terms were assigned by transferring GO terms from orthologs144, and using InterProScan.Signal peptides and transmembrane domains were predicted using Phobius145 v1.01 and SecretomeP146 v1.0. A protein predicted by Phobius to have a transmembrane domain was categorized as ‘membrane-bound’, and non-membrane-bound proteins as ‘classically secreted’ if Phobius predicted a signal peptide within 70 amino acids of their start. Remaining proteins in which SecretomeP predicted a signal peptide were classified as ‘non-classically secreted’ (Supplementary Table 7).
Pairwise combinations of Pfam domains were identified in proteins of the 81 nematodes and platyhelminths. After excluding those present in complete genomes of other phyla in UniProt (June 2016), we classified a combination as ‘nematode-specific’ (or ‘flatworm-specific’) if it was present in >30% of nematodes (platyhelminths) and no platyhelminths (nematodes) (Supplementary Table 14).
Synapomorphic gene families
Families in our Compara database (Methods) were analyzed using KinFin v0.8.3147, by providing InterPro IDs (Methods) and a species tree that had clades III, IV and V as a polytomy (Fig. 2). Synapomorphic families were identified at 25 nodes of interest (Supplementary Table 8), by using Dollo parsimony and requiring a family must contain genes from ≥1 descendant species from each child node of the node of interest, and must not contain other species. Families were filtered to retain those that (1) contained ≥90% of descendant species of the node of interest, and (2) in which >90% of species contained ≥1 gene with a particular InterPro domain.Candidate lateral gene transfers
Ferrochelatase families in our Compara database (Methods) were extracted by screening for a Ferrochelatase (IPR001015) domain. Additional ferrochelatases were retrieved from NCBI for 17 bacterial taxa (Supplementary Table 8c). Sequences were aligned using MAFFT v7.267 (E-INS-i algorithm)137 and the alignment trimmed using trimAl v1.4148. Phylogenetic analysis was carried out using RAxML139 under the PROTGAMMAGTR model, and 20 alternative runs on distinct starting trees. Non-parametric bootstrap analysis was carried out for 100 replicates.For cobyric acid synthase and acetate/succinate transporter, the top BLAST hits from GenBank, and representative sequences from other taxonomic groups, were aligned with MAFFT v7.205137 (-auto), and alignments trimmed with trimal v1.4148. Phylogenetic analyses were performed using RAxML v8.2.8139 under the model that minimized the AIC (LG4X for cobyric acid synthase, LG4M for acetate transporter), based on 5 random-addition-sequence replicates, and 100 non-parametric bootstrap replicates.
Gene family expansions
We used three metrics to identify families in our Compara database (Methods) that varied greatly in gene count across species (see ref. 149). To control for fragmented assemblies, we used summed protein length per species (in a family) as a proxy for gene count in these metrics:1. Coefficient of variation:
its mean.
2. Maximum Z-score:
the mean of the summed protein length (per species) in c, and the standard deviation in summed protein length per species in species outside c.
3. Maximum enrichment coefficient:
To increase reliability, these metrics were calculated by only considering tier 1 species (those with high-quality assemblies; Methods). Our code for calculating metrics is available (see URLs).
Putative GPCRs, identified from the literature and GO:0004930 annotations in WormBase154, were used to identify families in our Compara database (Methods). For each family, HHSuite155 was used to search Uniprot, SCOPUS, Pfam, and PDB; 200 families hitting ≥2 databases were deemed actual GPCR families (see ref. 156). Additional families were identified from synapomorphies (Methods) and curation, giving 230 GPCR families (Supplementary Table 15).
To build a phylogenetic tree of ion channels, known genes from C. elegans157, Brugia malayi158, Haemonchus contortus159, Oesophagostomum dentatum159 and S. mansoni84 were gathered, and their homologues in Compara families in WormBase ParaSite160. Genes with <3 or="">8 transmembrane domains (predicted by HMMTOP161) were discarded. Genes were aligned with MAFFT137, and the alignment trimmed with trimAl148. The phylogeny was inferred with MrBayes3.2162. Posterior probabilities were calculated from eight reversible jump Markov chain Monte Carlo chains over 20,000,000 generations.3>
Kinase models were taken from Kinomer163, and thresholds optimized to detect known C. elegans kinases (see ref. 164). The final thresholds were used to filter HMMER search results (against Kinomer) for nematode and platyhelminth species (Supplementary Table 23).
C. elegans ABC transporter and cys-loop receptor subunit genes were collated from WormBase154, to which we added H. contortus acr-26 and acr-27 (absent from C. elegans85). Homologs in nematodes and platyhelminths were identified using BLASTP (Supplementary Tables 16 and 17).
Pathway coverage was the fraction of ECs in a reference pathway that were annotated in a species (see ref. 173). We included pathways for which KEGG had a reference pathway for a nematode/platyhelminth (Supplementary Table 18e). Presence of KEGG modules was predicted using modDFS174, and species clustered based on module presence using Ward-linkage, based on Jaccard similarity index175.
Chokepoint enzymes were predicted following Taylor et al.176, using subnetworks of KEGG networks formed by just the enzymes (ECs) we had annotated in each particular species.
To assign a ‘target score’ to each worm gene, the main factors considered were similarity to known drug targets; lack of human homologues; and whether C. elegans/Drosophila melanogaster homologues had lethal phenotypes (see ref. 178).
Our top 15% (249) of highest-scoring worm targets had 292,499 compounds. These were filtered by selecting compounds that (1) co-appeared in a PDBe179 (Protein Data Bank in Europe) structure with the ChEMBL target; or (2) had median pChEMBL > 5; leaving 131,452 ‘top drug candidates’.
SCP/TAPS
SCP/TAPS genes were identified as having Pfam PF00188, or being in a SCP/TAPS family in our Compara database (Methods). Those between 146 aa (shortest C. elegans SCP/TAPS) and 1,000 aa were included in the phylogenetic analysis (Supplementary Table 10). Clusters were detected among sequences from a species group (‘analysis group’ in Supplementary Table 4) using USEARCH150 (UCLUST, aa identity cut-off = 0.70), and a consensus sequence generated for each cluster. The consensus sequences were aligned using MAFFT137 (v7.271, –localpair –maxiterate 2 –retree 1 –bl 45); the alignment trimmed with trimAl148 (-gt 0.006); and a maximum likelihood tree built using FastTreeMP151 (v2.1.7 SSE3, -wag -gamma).Proteins historically targeted for drug development
Each nematode/platyhelminth proteome was searched against candidate proteases using MEROPS batch-BLAST152 (E < 0.001), and PfamScan153 was used to identify additional homologues in some species (Supplementary Table 11).Putative GPCRs, identified from the literature and GO:0004930 annotations in WormBase154, were used to identify families in our Compara database (Methods). For each family, HHSuite155 was used to search Uniprot, SCOPUS, Pfam, and PDB; 200 families hitting ≥2 databases were deemed actual GPCR families (see ref. 156). Additional families were identified from synapomorphies (Methods) and curation, giving 230 GPCR families (Supplementary Table 15).
To build a phylogenetic tree of ion channels, known genes from C. elegans157, Brugia malayi158, Haemonchus contortus159, Oesophagostomum dentatum159 and S. mansoni84 were gathered, and their homologues in Compara families in WormBase ParaSite160. Genes with <3 or="">8 transmembrane domains (predicted by HMMTOP161) were discarded. Genes were aligned with MAFFT137, and the alignment trimmed with trimAl148. The phylogeny was inferred with MrBayes3.2162. Posterior probabilities were calculated from eight reversible jump Markov chain Monte Carlo chains over 20,000,000 generations.3>
Kinase models were taken from Kinomer163, and thresholds optimized to detect known C. elegans kinases (see ref. 164). The final thresholds were used to filter HMMER search results (against Kinomer) for nematode and platyhelminth species (Supplementary Table 23).
C. elegans ABC transporter and cys-loop receptor subunit genes were collated from WormBase154, to which we added H. contortus acr-26 and acr-27 (absent from C. elegans85). Homologs in nematodes and platyhelminths were identified using BLASTP (Supplementary Tables 16 and 17).
GO and InterPro/Pfam annotation enrichment
Counts of proteins annotated with each GO term (or InterPro/Pfam domain) per species were normalized by dividing by the total GO annotations in a particular species. To test for enrichment of a particular GO term in a species group (‘analysis group’ in Supplementary Table 4), we used a Mann-Whitney U test to compare normalized counts in that species group, to those in all other species (Supplementary Table 24).Metabolism
EC (Enzyme Commission number) predictions for nematodes and platyhelminths were derived by combining DETECT v2.0165, PRIAM166, KAAS167 and BRENDA168 (see ref. 169, Supplementary Fig. 28 and Supplementary Table 18), and supplemented for the 33 tier 1 species (Methods) by pathway hole-filling using Pathway Tools170 (v18.5). Comparisons of all 81 species (Supplementary Fig. 20a and Supplementary Table 20) did not include ECs from hole-filling. Lower confidence ECs were inferred using families from our Compara database (Methods). Auxotrophies were predicted using Pathway Tools and BioCyc171. To predict carbohydrate-active enzymes, HMMER3 was used to search dbCAN172 (Supplementary Table 25).Pathway coverage was the fraction of ECs in a reference pathway that were annotated in a species (see ref. 173). We included pathways for which KEGG had a reference pathway for a nematode/platyhelminth (Supplementary Table 18e). Presence of KEGG modules was predicted using modDFS174, and species clustered based on module presence using Ward-linkage, based on Jaccard similarity index175.
Chokepoint enzymes were predicted following Taylor et al.176, using subnetworks of KEGG networks formed by just the enzymes (ECs) we had annotated in each particular species.
Potential anthelmintic drug targets and drugs
Potential drug targets
Nematode and platyhelminth proteins from tier 1 species (with high-quality assemblies; Methods) were searched against single-protein targets from ChEMBL v21177 using BLASTP (E ≤ 1 × 10−10). After collapsing by gene family, 1,925 worm genes remained.To assign a ‘target score’ to each worm gene, the main factors considered were similarity to known drug targets; lack of human homologues; and whether C. elegans/Drosophila melanogaster homologues had lethal phenotypes (see ref. 178).
Potential new anthelmintic drugs
ChEMBL v21177 was used to identify 827,889 compounds with activities against ChEMBL targets to which worm proteins had BLAST matches. To calculate ‘compound scores’, we prioritized compounds in high clinical development phases, oral/topical administration, crystal structures, properties consistent with oral drugs and lacking toxicity (see ref. 178).Our top 15% (249) of highest-scoring worm targets had 292,499 compounds. These were filtered by selecting compounds that (1) co-appeared in a PDBe179 (Protein Data Bank in Europe) structure with the ChEMBL target; or (2) had median pChEMBL > 5; leaving 131,452 ‘top drug candidates’.
A ‘diverse screening set’
The 131,452 candidates were placed into 27,944 chemical classes, based on ECFP4 fingerprints (see ref. 178). They were filtered by (1) discarding medicinal chemistry compounds that did not co-appear in a PDBe structure with the ChEMBL target, or have median pCHEMBL > 7; (2) checking availability for purchase in ZINC 15180; and (3) for each worm target, taking the highest-scoring compound from each class; this gave 5,046 compounds.Self-organizing map
We constructed a self-organizing map of our diverse screening set plus known anthelmintic compounds (Supplementary Table 21a; see ref. 178), using Kohonen v3.02181 in R v3.3.0, using a 20 × 20 cell hexagonal, non-toroidal grid. The self-organizing map was trained for 4,000 steps, where training optimized Tanimoto distances between ECFP4 fingerprints.Reporting Summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.Data availability
Sequence data have been deposited in the European
Nucleotide Archive (ENA). Assemblies and annotation are available at
WormBase and WormBase-ParaSite (https://parasite.wormbase.org/). All have been submitted to GenBank under the BioProject IDs listed in Supplementary Table 1.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1.G.B.D. 2015 Disease and Injury Incidence and Prevalence Collaborators. Global, regional, and national incidence, prevalence, and years lived with disability for 310 diseases and injuries, 1990-2015: a systematic analysis for the Global Burden of Disease Study 2015. Lancet 388, 1545–1602 (2016).
Charlier,
J., van der Voort, M., Kenyon, F., Skuce, P. & Vercruysse, J.
Chasing helminths and their economic impact on farmed ruminants. Trends Parasitol. 30, 361–367 (2014).
Jones, J. T. et al. Top 10 plant-parasitic nematodes in molecular plant pathology. Mol. Plant Pathol. 14, 946–961 (2013).
Furtado, L. F., de Paiva Bello, A. C. & Rabelo, E. M. Benzimidazole resistance in helminths: From problem to diagnosis. Acta Trop. 162, 95–102 (2016).
Kaplan, R. M. & Vidyashankar, A. N. An inconvenient truth: global worming and anthelmintic resistance. Vet. Parasitol. 186, 70–78 (2012).
Hewitson, J. P. & Maizels, R. M. Vaccination against helminth parasite infections. Expert Rev. Vaccines 13, 473–487 (2014).
Ntalli, N. G. & Caboni, P. Botanical nematicides: a review. J. Agric. Food. Chem. 60, 9929–9940 (2012).
Young, N. D. et al. Whole-genome sequence of Schistosoma haematobium. Nat. Genet. 44, 221–225 (2012).
The Schistosoma japonicum Genome Sequencing and Functional Analysis Consortium. The Schistosoma japonicum genome reveals features of host–parasite interplay. Nature 460, 345–351 (2009).
Protasio,
A. V. et al. A systematically improved high quality genome and
transcriptome of the human blood fluke Schistosoma mansoni. PLoS Negl. Trop. Dis. 6, e1455 (2012).
Wang, X. et al. The draft genome of the carcinogenic human liver fluke Clonorchis sinensis. Genome. Biol. 12, R107 (2011).
McNulty, S. N. et al. Genomes of Fasciola hepatica from the Americas reveal colonization with Neorickettsia endobacteria related to the agents of potomac horse and human sennetsu fevers. PLoS Genet. 13, e1006537 (2017).
Tsai, I. J. et al. The genomes of four tapeworm species reveal adaptations to parasitism. Nature 496, 57–63 (2013).
Bennett, H. M. et al. The genome of the sparganosis tapeworm Spirometra erinaceieuropaei isolated from the biopsy of a migrating brain lesion. Genome. Biol. 15, 510 (2014).
Schiffer, P. H. et al. The genome of Romanomermis culicivorax: revealing fundamental changes in the core developmental genetic toolkit in Nematoda. BMC Genomics 14, 923 (2013).
Mitreva, M. et al. The draft genome of the parasitic nematode Trichinella spiralis. Nat. Genet. 43, 228–235 (2011).
Foth,
B. J. et al. Whipworm genome and dual-species transcriptome analyses
provide molecular insights into an intimate host–parasite interaction. Nat. Genet. 46, 693–700 (2014).
Hunt, V. L. et al. The genomic basis of parasitism in the Strongyloides clade of nematodes. Nat. Genet. 48, 299–307 (2016).
Kikuchi, T. et al. Genomic insights into the origin of parasitism in the emerging plant pathogen Bursaphelenchus xylophilus. PLoS Pathog. 7, e1002219 (2011).
Cotton, J. A. et al. The genome and life-stage specific transcriptomes of Globodera pallida elucidate key aspects of plant parasitism by a cyst nematode. Genome Biol. 15, R43 (2014).
Opperman, C. H. et al. Sequence and genetic map of Meloidogyne hapla: a compact nematode genome for plant parasitism. Proc. Natl Acad. Sci. USA 105, 14802–14807 (2008).
Ghedin, E. et al. Draft genome of the filarial nematode parasite Brugia malayi. Science 317, 1756–1760 (2007).
Godel, C. et al. The genome of the heartworm, Dirofilaria immitis, reveals drug and vaccine targets. FASEB J. 26, 4650–4661 (2012).
Desjardins, C. A. et al. Genomics of Loa loa, a Wolbachia-free filarial parasite of humans. Nat. Genet. 45, 495–500 (2013).
Cotton, J. A. et al. The genome of Onchocerca volvulus, agent of river blindness. Nat. Microbiol. 2, 16216 (2016).
Wang, J. et al. Silencing of germline-expressed genes by DNA elimination in somatic cells. Dev. Cell 23, 1072–1080 (2012).
Tang, Y. T. et al. Genome of the human hookworm Necator americanus. Nat. Genet. 46, 261–269 (2014).
Tyagi,
R. et al. Cracking the nodule worm code advances knowledge of parasite
biology and biotechnology to tackle major diseases of livestock. Biotechnol. Adv. 33, 980–991 (2015).
McNulty, S. N. et al. Dictyocaulus viviparus genome, variome and transcriptome elucidate lungworm biology and support future intervention. Sci. Rep. 6, 20316 (2016).
Laing, R. et al. The genome and transcriptome of Haemonchus contortus, a key model parasite for drug and vaccine discovery. Genome. Biol. 14, R88 (2013).
C. elegans Sequencing Consortium. Genome sequence of the nematode C. elegans: a platform for investigating biology. Science 282, 2012–2018 (1998).
Dieterich, C. et al. The Pristionchus pacificus genome provides a unique perspective on nematode lifestyle and parasitism. Nat. Genet. 40, 1193–1198 (2008).
Robb, S. M., Ross, E. & Sanchez Alvarado, A. SmedGD: the Schmidtea mediterranea genome database. Nucleic Acids Res. 36, D599–D606 (2008).
Srinivasan, J. et al. The draft genome and transcriptome of Panagrellus redivivus are shaped by the harsh demands of a free-living lifestyle. Genetics 193, 1279–1295 (2013).
Srivastava, M. et al. The Amphimedon queenslandica genome and the evolution of animal complexity. Nature 466, 720–726 (2010).
Simakov, O. et al. Insights into bilaterian evolution from three spiralian genomes. Nature 493, 526–531 (2013).
Satou, Y. et al. Improved genome assembly and evidence-based global gene model set for the chordate Ciona intestinalis: new insight into intron and operon populations. Genome Biol. 9, R152 (2008).
Zhang, G. et al. The oyster genome reveals stress adaptation and complexity of shell formation. Nature 490, 49–54 (2012).
Howe, K. et al. The zebrafish reference genome sequence and its relationship to the human genome. Nature 496, 498–503 (2013).
Adams, M. D. et al. The genome sequence of Drosophila melanogaster. Science 287, 2185–2195 (2000).
International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature 409, 860–921 (2001).
Pagel Van Zee, J. et al. Tick genomics: the Ixodes genome project and beyond. Int. J. Parasitol. 37, 1297–1305 (2007).
Putnam, N. H. et al. Sea anemone genome reveals ancestral eumetazoan gene repertoire and genomic organization. Science 317, 86–94 (2007).
Srivastava, M. et al. The Trichoplax genome and the nature of placozoans. Nature 454, 955–960 (2008).
Lynch, M., Bobay, L. M., Catania, F., Gout, J. F. & Rho, M. The repatterning of eukaryotic genomes by random genetic drift. Annu. Rev. Genomics Hum. Genet. 12, 347–366 (2011).
Vilella, A. J. et al. EnsemblCompara GeneTrees: complete, duplication-aware phylogenetic trees in vertebrates. Genome Res. 19, 327–335 (2009).
Rao, A. U., Carta, L. K., Lesuisse, E. & Hamza, I. Lack of heme synthesis in a free-living eukaryote. Proc. Natl Acad. Sci. USA 102, 4270–4275 (2005).
Wu, B. et al. Interdomain lateral gene transfer of an essential ferrochelatase gene in human parasitic nematodes. Proc. Natl Acad. Sci. USA 110, 7748–7753 (2013).
Nagayasu, E. et al. Identification of a bacteria-like ferrochelatase in Strongyloides venezuelensis, an animal parasitic nematode. PLoS ONE 8, e58458 (2013).
Casaravilla, C. et al. Characterization of myo-inositol hexakisphosphate deposits from larval Echinococcus granulosus. FEBS J. 273, 3192–3203 (2006).
Diaz, A., Casaravilla, C., Barrios, A. A. & Ferreira, A. M. Parasite molecules and host responses in cystic echinococcosis. Parasite Immunol. 38, 193–205 (2016).
Santos, R. et al. A comprehensive map of molecular drug targets. Nat. Rev. Drug Discov. 16, 19–34 (2017).
Valentim,
C. L. et al. Genetic and molecular basis of drug resistance and
species-specific drug action in schistosome parasites. Science 342, 1385–1389 (2013).
Parsons, L. M. et al. Caenorhabditis elegans bacterial pathogen resistant bus-4 mutants produce altered mucins. PLoS ONE 9, e107250 (2014).
Shapira, M. et al. A conserved role for a GATA transcription factor in regulating epithelial innate immune responses. Proc. Natl Acad. Sci. USA 103, 14086–14091 (2006).
Hewitson, J. P. et al. Proteomic analysis of secretory products from the model gastrointestinal nematode Heligmosomoides polygyrus reveals dominance of venom allergen-like (VAL) proteins. J. Proteomics 74, 1573–1594 (2011).
van
der Hoeven, R., Cruz, M. R., Chavez, V. & Garsin, D. A.
Localization of the dual oxidase BLI-3 and characterization of its NADPH
oxidase domain during infection of Caenorhabditis elegans. PLoS ONE 10, e0124091 (2015).
Esteban,
M. R., Giovinazzo, G., de la Hera, A. & Goday, C. PUMA1: a novel
protein that associates with the centrosomes, spindle and centromeres in
the nematode Parascaris. J. Cell Sci. 111, 723–735 (1998).
Tobler, H., Etter, A. & Muller, F. Chromatin diminution in nematode development. Trends Genet. 8, 427–432 (1992).
Albarqi, M. M. et al. Regulation of life cycle checkpoints and developmental activation of infective larvae in Strongyloides stercoralis by dafachronic acid. PLoS Pathog. 12, e1005358 (2016).
Zarlenga, D. S., Nisbet, A. J., Gasbarre, L. C. & Garrett, W. M. A calcium-activated nucleotidase secreted from Ostertagia ostertagi 4th-stage larvae is a member of the novel salivary apyrases present in blood-feeding arthropods. Parasitology 138, 333–343 (2011).
Cathcart,
M. K. & Bhattacharjee, A. Monoamine oxidase A (MAO-A): a signature
marker of alternatively activated monocytes/macrophages. Inflamm. Cell Signal. 1, e161 (2014).
Coakley,
G., Maizels, R. M. & Buck, A. H. Exosomes and other extracellular
vesicles: the new communicators in parasite infections. Trends Parasitol. 31, 477–489 (2015).
Wu, C. et al. Mapping the binding between the tetraspanin molecule (Sjc23) of Schistosoma japonicum and human non-immune IgG. PLoS ONE 6, e19112 (2011).
Krautz-Peterson, G. et al. Schistosoma mansoni
infection of mice, rats and humans elicits a strong antibody response
to a limited number of reduction-sensitive epitopes on five major
tegumental membrane proteins. PLoS Negl. Trop. Dis. 11, e0005306 (2017).
Prior, A. et al. A surface-associated retinol- and fatty acid-binding protein (Gp-FAR-1) from the potato cyst nematode Globodera pallida: lipid binding activities, structural analysis and expression pattern. Biochem. J. 356, 387–394 (2001).
Rey-Burusco,
M. F. et al. Diversity in the structures and ligand-binding sites of
nematode fatty acid and retinol-binding proteins revealed by Na-FAR-1
from Necator americanus. Biochem. J. 471, 403–414 (2015).
Dell, A., Haslam, S. M. & Morris, H. R. in Parasitic Nematodes: Molecular Biology, Biochemistry and Immunology (eds. Kennedy, M. W. & Harnett, W.) 285–307 (Cabi Publishing, Oxfordshire, UK, 2013).
Rodrigues, J. A. et al. Parasite glycobiology: a bittersweet symphony. PLoS Pathog. 11, e1005169 (2015).
Anderson, L. et al. Schistosoma mansoni egg, adult male and female comparative gene expression analysis and identification of novel genes by RNA-Seq. PLoS Negl. Trop. Dis. 9, e0004334 (2015).
Gong, H. et al. A novel PAN/apple domain-containing protein from Toxoplasma gondii: characterization and receptor identification. PLoS ONE 7, e30169 (2012).
Cantacessi,
C. et al. A portrait of the “SCP/TAPS” proteins of
eukaryotes—developing a framework for fundamental research and
biotechnological outcomes. Biotechnol. Adv. 27, 376–388 (2009).
McKerrow, J. H., Caffrey, C., Kelly, B., Loke, P. & Sajid, M. Proteases in parasitic diseases. Annu. Rev. Pathol. 1, 497–536 (2006).
Williamson, A. L. et al. Ancylostoma caninum MTP-1, an astacin-like metalloprotease secreted by infective hookworm larvae, is involved in tissue migration. Infect. Immun. 74, 961–967 (2006).
Williamson, A. L. et al. A multi-enzyme cascade of hemoglobin proteolysis in the intestine of blood-feeding hookworms. J. Biol. Chem. 279, 35950–35957 (2004).
Delcroix, M. et al. A multienzyme network functions in intestinal protein digestion by a platyhelminth parasite. J. Biol. Chem. 281, 39316–39329 (2006).
Duffy,
M. S., Cevasco, D. K., Zarlenga, D. S., Sukhumavasi, W. & Appleton,
J. A. Cathepsin B homologue at the interface between a parasitic
nematode and its intermediate host. Infect. Immun. 74, 1297–1304 (2006).
Cancela, M. et al. A distinctive repertoire of cathepsins is expressed by juvenile invasive Fasciola hepatica. Biochimie 90, 1461–1475 (2008).
Knox, D. P. Proteinase inhibitors and helminth parasite infection. Parasite Immunol. 29, 57–71 (2007).
Rehman, A. A., Ahsan, H. & Khan, F. H. Alpha-2-macroglobulin: a physiological guardian. J. Cell. Physiol. 228, 1665–1675 (2013).
Martzen, M. R., Geise, G. L., Hogan, B. J. & Peanasky, R. J. Ascaris suum: localization by immunochemical and fluorescent probes of host proteases and parasite proteinase inhibitors in cross-sections. Exp. Parasitol. 60, 139–149 (1985).
Nei,
M., Niimura, Y. & Nozawa, M. The evolution of animal chemosensory
receptor gene repertoires: roles of chance and necessity. Nat. Rev. Genet. 9, 951–963 (2008).
Lynagh, T. et al. Molecular basis for convergent evolution of glutamate recognition by pentameric ligand-gated ion channels. Sci. Rep. 5, 8558 (2015).
MacDonald, K. et al. Functional characterization of a novel family of acetylcholine-gated chloride channels in Schistosoma mansoni. PLoS Pathog. 10, e1004181 (2014).
Courtot, E. et al. Functional characterization of a novel class of morantel-sensitive acetylcholine receptors in nematodes. PLoS Pathog. 11, e1005267 (2015).
Vasiliou, V., Vasiliou, K. & Nebert, D. W. Human ATP-binding cassette (ABC) transporter family. Hum. Genomics 3, 281–290 (2009).
Martin-Duran,
J. M., Ryan, J. F., Vellutini, B. C., Pang, K. & Hejnol, A.
Increased taxon sampling reveals thousands of hidden orthologs in
flatworms. Genome Res. 27, 1263–1272 (2017).
Kanehisa,
M., Goto, S., Sato, Y., Furumichi, M. & Tanabe, M. KEGG for
integration and interpretation of large-scale molecular data sets. Nucleic Acids Res. 40, D109–D114 (2012).
Kondrashov,
F. A., Koonin, E. V., Morgunov, I. G., Finogenova, T. V. &
Kondrashova, M. N. Evolution of glyoxylate cycle enzymes in Metazoa:
evidence of multiple horizontal transfer events and pseudogene
formation. Biol. Direct. 1, 31 (2006).
Harder, A. The biochemistry of Haemonchus contortus and other parasitic nematodes. Adv. Parasitol. 93, 69–94 (2016).
Marr, J. J. & Müller, M. Biochemistry and Molecular Biology of Parasites (Academic Press, San Diego, CA, USA, 1995).
Pearce, E. J. & Huang, S. C. The metabolic control of schistosome egg production. Cell Microbiol. 17, 796–801 (2015).
Mehlhorn, H. (ed.) Encyclopedia of Parasitology (Springer, New York, NY, USA, 2008).
Watson, E. et al. Metabolic network rewiring of propionate flux compensates vitamin B12 deficiency in C. elegans. eLife 5, e17670 (2016).
Van Soest, P. J. Nutritional Ecology of the Ruminant (Cornell Univ. Press, Ithaca, NY, USA, 1994).
Taylor, C. M. et al. Using existing drugs as leads for broad spectrum anthelmintics targeting protein kinases. PLoS Pathog. 9, e1003149 (2013).
Vermeire,
J. J., Suzuki, B. M. & Caffrey, C. R. Odanacatib, a cathepsin K
cysteine protease inhibitor, kills hookworm in vivo. Pharmaceuticals 9, 39 (2016).
Simpson, J. T. & Durbin, R. Efficient de novo assembly of large genomes using compressed data structures. Genome Res. 22, 549–556 (2012).
Gremme,
G., Steinbiss, S. & Kurtz, S. GenomeTools: a comprehensive software
library for efficient processing of structured genome annotations. IEEE/ACM Trans. Comput. Biol. Bioinform. 10, 645–656 (2013).
Zerbino, D. R. & Birney, E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18, 821–829 (2008).
Boetzer, M., Henkel, C. V., Jansen, H. J., Butler, D. & Pirovano, W. Scaffolding pre-assembled contigs using SSPACE. Bioinformatics 27, 578–579 (2011).
Boetzer, M. & Pirovano, W. Toward almost closed genomes with GapFiller. Genome Biol. 13, R56 (2012).
Tsai,
I. J., Otto, T. D. & Berriman, M. Improving draft assemblies by
iterative mapping and assembly of short reads to eliminate gaps. Genome Biol. 11, R41 (2010).
Otto,
T. D., Sanders, M., Berriman, M. & Newbold, C. Iterative correction
of reference nucleotides (iCORN) using second generation sequencing
technology. Bioinformatics 26, 1704–1707 (2010).
Hunt, M. et al. REAPR: a universal tool for genome assembly evaluation. Genome Biol. 14, R47 (2013).
Bonfield, J. K. & Whitwham, A. Gap5—editing the billion fragment sequence assembly. Bioinformatics 26, 1699–1703 (2010).
Altschul, S. F. et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402 (1997).
Coghlan, A. L., Gordon, D. & Berriman, M. Contamination screening of parasitic worm genome assemblies. Protoc. Exch. https://doi.org/10.1038/protex.2018.038 (2018).
Parra, G., Bradnam, K., Ning, Z., Keane, T. & Korf, I. Assessing the gene space in draft genomes. Nucleic Acids Res. 37, 289–297 (2009).
Quinlan, A. R. BEDTools: the Swiss-army tool for genome feature analysis. Curr. Protoc. Bioinformatics 47, 11.12.1–11.12.34 (2014).
Stanley,
E., Coghlan, A. L. & Berriman, M. A MAKER pipeline for prediction
of protein-coding genes in parasitic worm genomes. Protoc. Exch. https://doi.org/10.1038/protex.2018.056 (2018).
Holt,
C. & Yandell, M. MAKER2: an annotation pipeline and genome-database
management tool for second-generation genome projects. BMC Bioinformatics 12, 491 (2011).
Stanke, M. et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 34, W435–W439 (2006).
Korf, I. Gene finding in novel genomes. BMC Bioinformatics 5, 59 (2004).
Ter-Hovhannisyan,
V., Lomsadze, A., Chernoff, Y. O. & Borodovsky, M. Gene prediction
in novel fungal genomes using an ab initio algorithm with unsupervised
training. Genome Res. 18, 1979–1990 (2008).
She, R. et al. genBlastG: using BLAST searches to build homologous gene models. Bioinformatics 27, 2141–2143 (2011).
Otto, T. D., Dillon, G. P., Degrave, W. S. & Berriman, M. RATT: rapid annotation transfer tool. Nucleic Acids Res. 39, e57 (2011).
Cochrane,
G., Karsch-Mizrachi, I. & Takagi, T., International Nucleotide
Sequence Database Collbaoration. The International Nucleotide Sequence
Database Collaboration. Nucleic Acids Res. 44, D48–D50 (2016).
Martin, J. & Mitreva, M. Genomic and transcriptomic data production for helminths. Protoc. Exch. https://doi.org/10.1038/protex.2018.044 (2018).
Margulies, M. et al. Genome sequencing in microfabricated high-density picolitre reactors. Nature 437, 376–380 (2005).
Butler, J. et al. ALLPATHS: de novo assembly of whole-genome shotgun microreads. Genome Res. 18, 810–820 (2008).
Xue, W. et al. L_RNA_scaffolder: scaffolding genomes with transcripts. BMC Genomics 14, 604 (2013).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Morgulis, A. et al. Database indexing for production MegaBLAST searches. Bioinformatics 24, 1757–1764 (2008).
Grabherr, M. G. et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat. Biotechnol. 29, 644–652 (2011).
Kumar, S. & Blaxter, M. L. Simultaneous genome sequencing of symbionts and their hosts. Symbiosis 55, 119–126 (2011).
Crusoe, M. R. et al. The khmer software package: enabling efficient nucleotide sequence analysis. F1000Res. 4, 900 (2015).
Simpson, J. T. et al. ABySS: a parallel assembler for short read sequence data. Genome Res. 19, 1117–1123 (2009).
Chevreux,
B. et al. Using the miraEST assembler for reliable and automated mRNA
transcript assembly and SNP detection in sequenced ESTs. Genome Res. 14, 1147–1159 (2004).
Darby, A. C. et al. Analysis of gene expression from the Wolbachia genome of a filarial nematode supports both metabolic and defensive roles within the symbiosis. Genome Res. 22, 2467–2477 (2012).
Haas,
B. J. et al. De novo transcript sequence reconstruction from RNA-seq
using the Trinity platform for reference generation and analysis. Nat. Protoc. 8, 1494–1512 (2013).
Ellinghaus,
D., Kurtz, S. & Willhoeft, U. LTRharvest, an efficient and flexible
software for de novo detection of LTR retrotransposons. BMC Bioinformatics 9, 18 (2008).
Coghlan,
A. L., Tsai, I. J. & Berriman, M. Creation of a comprehensive
repeat library for a newly sequenced parasitic worm genome. Protoc. Exch. https://doi.org/10.1038/protex.2018.054 (2018).
Hadfield, J. D. MCMC methods for multi-response generalized linear mixed models: the MCMCglmm R package. J. Stat. Softw. 33, 1–22 (2010).
Britton,
T., Anderson, C. L., Jacquet, D., Lundqvist, S. & Bremer, K.
Estimating divergence times in large phylogenetic trees. Syst. Biol. 56, 741–752 (2007).
Tatusova, T. Update on genomic databases and resources at the National Center for Biotechnology Information. Methods Mol. Biol. 1415, 3–30 (2016).
Katoh, K. & Standley, D. M. MAFFT: iterative refinement and additional methods. Methods Mol. Biol. 1079, 131–146 (2014).
Castresana, J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol. Biol. Evol. 17, 540–552 (2000).
Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313 (2014).
Dunn, C. W. et al. Broad phylogenomic sampling improves resolution of the animal tree of life. Nature 452, 745–749 (2008).
Jones, P. et al. InterProScan 5: genome-scale protein function classification. Bioinformatics 30, 1236–1240 (2014).
Bairoch, A. & Apweiler, R. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res. 28, 4–48 (2000).
Hunter, S. et al. InterPro in 2011: new developments in the family and domain prediction database. Nucleic Acids Res. 40, D306–D312 (2012).
Coghlan, A. L. & Berriman, M. Functional annotation of parasitic worm genomes, by assigning protein names and GO terms. Protoc. Exch. https://doi.org/10.1038/protex.2018.055 (2018).
Kall,
L., Krogh, A. & Sonnhammer, E. L. Advantages of combined
transmembrane topology and signal peptide prediction—the Phobius web
server. Nucleic Acids Res. 35, W429–W432 (2007).
Bendtsen,
J. D., Jensen, L. J., Blom, N., Von Heijne, G. & Brunak, S.
Feature-based prediction of non-classical and leaderless protein
secretion. Protein Eng. Des. Sel. 17, 349–356 (2004).
Laetsch, D. R. & Blaxter, M. L. KinFin: software for taxon-aware analysis of clustered protein sequences. G3 7, 3349–3357 (2017).
Capella-Gutierrez,
S., Silla-Martinez, J. M. & Gabaldon, T. trimAl: a tool for
automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).
Ribeiro,
D., Coghlan, A. L., Harsha, B. & Berriman, M. Identification of
lineage-specific gene family expansions in a database of gene families. Protoc. Exch. https://doi.org/10.1038/protex.2018.057 (2018).
Edgar, R. C. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26, 2460–2461 (2010).
Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2—approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490 (2010).
Rawlings,
N. D. & Morton, F. R. The MEROPS batch BLAST: a tool to detect
peptidases and their non-peptidase homologues in a genome. Biochimie 90, 243–259 (2008).
Finn, R. D. et al. The Pfam protein families database. Nucleic Acids Res. 38, D211–D222 (2010).
Howe, K. L. et al. WormBase 2016: expanding to enable helminth genomic research. Nucleic Acids Res. 44, D774–D780 (2016).
Soding, J. Protein homology detection by HMM-HMM comparison. Bioinformatics 21, 951–960 (2005).
Wheeler, N., Day, T., Zamanian, M. & Kimber, M. GPCR identification in parasitic worm genome assemblies. Protoc. Exch. https://doi.org/10.1038/protex.2018.061 (2018).
Jones, A. K., Davis, P., Hodgkin, J. & Sattelle, D. B. The nicotinic acetylcholine receptor gene family of the nematode Caenorhabditis elegans: an update on nomenclature. Invert. Neurosci. 7, 129–131 (2007).
Li, B. W., Rush, A. C. & Weil, G. J. Expression of five acetylcholine receptor subunit genes in Brugia malayi adult worms. Int. J. Parasitol. Drugs Drug Resist. 5, 100–109 (2015).
Buxton,
S. K. et al. Investigation of acetylcholine receptor diversity in a
nematode parasite leads to characterization of tribendimidine- and
derquantel-sensitive nAChRs. PLoS Pathog. 10, e1003870 (2014).
Howe,
K. L., Bolt, B. J., Shafie, M., Kersey, P. & Berriman, M. WormBase
ParaSite—a comprehensive resource for helminth genomics. Mol. Biochem. Parasitol. 215, 2–10 (2017).
Tusnady, G. E. & Simon, I. The HMMTOP transmembrane topology prediction server. Bioinformatics 17, 849–850 (2001).
Ronquist, F. et al. MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst. Biol. 61, 539–542 (2012).
Miranda-Saavedra, D. & Barton, G. J. Classification and functional annotation of eukaryotic protein kinases. Proteins 68, 893–914 (2007).
Martin, J. & Mitreva, M. Kinase annotation for helminths. Protoc. Exch. https://doi.org/10.1038/protex.2018.042 (2018).
Hung,
S. S., Wasmuth, J., Sanford, C. & Parkinson, J. DETECT—a density
estimation tool for enzyme classification and its application to Plasmodium falciparum. Bioinformatics 26, 1690–1698 (2010).
Claudel-Renard, C., Chevalet, C., Faraut, T. & Kahn, D. Enzyme-specific profiles for genome annotation: PRIAM. Nucleic Acids Res. 31, 6633–6639 (2003).
Moriya,
Y., Itoh, M., Okuda, S., Yoshizawa, A. C. & Kanehisa, M. KAAS: an
automatic genome annotation and pathway reconstruction server. Nucleic Acids Res. 35, W182–W185 (2007).
Chang, A. et al. BRENDA in 2015: exciting developments in its 25th year of existence. Nucleic Acids Res. 43, D439–D446 (2015).
Swapna, S., Tyagi, R., Mitreva, M. & Parkinson, J. Annotating metabolic enzymes in parasitic worm proteomes. Protoc. Exch. https://doi.org/10.1038/protex.2018.047 (2018).
Karp, P. D. et al. Pathway Tools version 19.0 update: software for pathway/genome informatics and systems biology. Brief. Bioinformatics 17, 877–890 (2016).
Caspi, R. et al. The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res. 40, D742–D753 (2012).
Lombard,
V., Golaconda Ramulu, H., Drula, E., Coutinho, P. M. & Henrissat,
B. The carbohydrate-active enzymes database (CAZy) in 2013. Nucleic Acids Res. 42, D490–D495 (2014).
Tyagi, R., Swapna, S., Parkinson, J. & Mitreva, M. Comparative analysis of metabolism in parasitic worms. Protoc. Exch. https://doi.org/10.1038/protex.2018.048 (2018).
Tyagi, R., Rosa, B. A., Lewis, W. G. & Mitreva, M. Pan-phylum comparison of nematode metabolic potential. PLoS Negl. Trop. Dis. 9, e0003788 (2015).
Real, R. & Vargas, J. M. The probabilistic basis of Jaccard’s index of similarity. Syst. Biol. 45, 380–385 (1996).
Taylor, C. M. et al. Discovery of anthelmintic drug targets and drugs using chokepoints in nematode metabolic pathways. PLoS Pathog. 9, e1003505 (2013).
Gaulton, A. et al. The ChEMBL database in 2017. Nucleic Acids Res. 45, D945–D954 (2017).
Coghlan, A. L. et al. Creating a screening set of potential anthelmintic compounds using ChEMBL. Protoc. Exch. https://doi.org/10.1038/protex.2018.053 (2018).
Velankar, S. et al. PDBe: improved accessibility of macromolecular structure data from PDB and EMDB. Nucleic Acids Res. 44, D385–D395 (2016).
Sterling, T. & Irwin, J. J. ZINC 15—ligand discovery for everyone. J. Chem. Inf. Model. 55, 2324–2337 (2015).
Wehrens, R. & Buydens, L. M. C. Self- and super-organizing maps in R: the kohonen package. J. Stat. Softw. 21, 1–19 (2007).
Nenhum comentário:
Postar um comentário
Observação: somente um membro deste blog pode postar um comentário.