AppleGFDB: The Apple Gene Function & Gene Family DataBase v1.0

 

Update news



2011-3-8
Apple Gene Family Collection Finished


2011-3-20
Apple CDD Finished


2011-4-8
Apple GO finished


2011-4-30
Apple Interpro finished


2011-5-10
MicroRNA database finished


2011-5-20
Blast and Tools finished


2011-6-12
System Integration Finished


2011-6-15
System Test Online



 

Introduction

The Apple Gene Function and Gene Family Database (AppleGFDB) is supported by the National Research Center for Apple Engineering and Technology and the State Key Laboratory of Crop Biology. AppleGFDB aims to collect any information that is helpful for apple genome annotation.

AppleGFDB provides the genome sequence of apple ('Golden Delicious', Malus domestica Borkh., family Rosaceae, tribe Pyreae) and annotation of the 17 apple chromosomes. The genome and peptide sequences were downloaded from the GDR database and the FEM-IASMA Computational Biology Web Resources. These data are available through search pages and through our Genome Browser, which provides an integrated display of the annotation data.

AppleGFDB includes 63,541 gene models and 301,186 exons. Apple genes that have already been studied were given priority for annotation. For the many apple genes that have not yet been studied, the annotations of the most similar Arabidopsis gene in TAIR and the most similar Populus gene in PlantGDB are used as references. Currently, the genome annotation consists of the following parts:

1. GO analysis

Because few apple genes have been functionally studied, their functions have to be inferred from known genes on the basis of sequence similarity. This requires a resource that provides detailed annotation of genes in model plants. Gene Ontology (GO) provides structured functional annotation and classification for several plants, so we chose it to predict the unknown functions of apple genes. For each apple gene, we identified its best homologous gene in Arabidopsis and took the GO terms of that Arabidopsis gene as the annotation of the apple gene. In addition, apple GO analysis was also performed with the GO tools and InterProScan.
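
The best-hit GO transfer described above can be sketched as follows. This is only a minimal illustration, not the pipeline actually used to build the database: the function names, the tuple layout of the similarity hits, and the tie-breaking rule (higher bit score wins when e-values are equal) are all assumptions made for the example.

```python
def best_hits(blast_rows):
    """Keep, for each apple gene, the Arabidopsis hit with the lowest e-value.

    blast_rows: iterable of (apple_gene, arabidopsis_gene, evalue, bitscore).
    Assumption: ties on e-value are broken by the higher bit score.
    """
    best = {}
    for apple, ath, evalue, bits in blast_rows:
        key = (evalue, -bits)  # smaller key means a better hit
        if apple not in best or key < best[apple][0]:
            best[apple] = (key, ath)
    return {apple: ath for apple, (_, ath) in best.items()}


def transfer_go(blast_rows, arabidopsis_go):
    """Annotate each apple gene with the GO terms of its best Arabidopsis hit.

    arabidopsis_go: dict mapping an Arabidopsis gene ID to a set of GO term IDs.
    Apple genes whose best hit has no GO terms receive an empty list.
    """
    hits = best_hits(blast_rows)
    return {
        apple: sorted(arabidopsis_go.get(ath, ()))
        for apple, ath in hits.items()
    }
```

Calling `transfer_go(rows, go_table)` with a list of similarity hits and an Arabidopsis GO table returns one sorted list of GO terms per apple gene. Genes without any hit are simply absent from the result.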

2. Conserved Domain analysis

Protein domain structures with functional motifs help users grasp the function of a protein quickly. Because CDD collects many important protein databases, including Pfam, SMART, COG, TIGRFAM, and the NCBI Protein Clusters, its domain information is comprehensive. We therefore selected the NCBI CDD (Marchler-Bauer et al. 2011) and its Batch CD-Search tool to analyze the domains of the protein encoded by each apple gene, and then checked the validity of the output against Pfam and SMART. The analyzed data were then organized into an apple protein domain database. By entering a domain name in the protein domain interface, users can retrieve all apple genes that carry this domain. For convenience, the protein structure with its domains is shown as a concise, illustrative map on the information page of each apple gene. The apple CDD data can also be used to examine our gene family classification: we checked whether the genes classified into each family have the typical domain of that family. The apple genes were found to carry the domains typical of the family in which they are classified, which indicates that our gene family classification is reasonable. AppleGADB 2.0 collected all predicted conserved domains using the NCBI Batch CD-Search tool and the Pfam search tool.
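
The domain lookup and the family consistency check described above can be sketched as follows. This is a hedged illustration only: the data layout (a dictionary from gene ID to a list of domain names), the function names, and the case-insensitive matching are assumptions, not the database's actual schema or interface.

```python
from collections import defaultdict


def build_domain_index(gene_domains):
    """Invert {gene: [domain, ...]} into {domain (lowercase): {gene, ...}}."""
    index = defaultdict(set)
    for gene, domains in gene_domains.items():
        for domain in domains:
            index[domain.lower()].add(gene)
    return index


def genes_with_domain(index, name):
    """Return all genes carrying the named domain (case-insensitive)."""
    return sorted(index.get(name.lower(), ()))


def members_missing_domain(family_members, gene_domains, typical_domain):
    """List family members that lack the family's typical domain.

    An empty result means every member carries the domain, which supports
    the family classification.
    """
    wanted = typical_domain.lower()
    return sorted(
        gene
        for gene in family_members
        if wanted not in {d.lower() for d in gene_domains.get(gene, ())}
    )
```

With such an index, a domain query is a single dictionary lookup, and the consistency check reduces to reporting the members for which the typical domain is absent.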

3. Gene Family classification

This part collects gene families based on alignment to the Arabidopsis genome, following the criteria of the Arabidopsis gene family classification. Most genes in the same gene family have similar protein structures and the same functional domains. We used MUSCLE (Edgar et al. 2004) to generate a multiple sequence alignment for each Arabidopsis gene family. Each multiple sequence alignment was then input into SAM 3.5 to build a hidden Markov model (HMM) for that family. Every gene sequence was aligned against each of these HMMs, yielding one e-value per family. The lower the e-value, the better the gene sequence fits the HMM. Each gene was therefore assigned to the family whose HMM produced the lowest e-value.
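
The final assignment step can be sketched as follows, assuming the e-values from scoring one gene against every family HMM have already been gathered into a dictionary. The function name and the optional `max_evalue` cutoff are assumptions added for illustration; the text above only states that the family with the lowest e-value is chosen.

```python
def assign_family(evalues, max_evalue=None):
    """Pick the gene family whose HMM gave the lowest e-value for one gene.

    evalues: dict mapping family name to the e-value of the gene against
             that family's HMM.
    max_evalue: optional cutoff (an assumption, not stated in the text);
                if the best e-value exceeds it, the gene stays unassigned.
    Returns the family name, or None if nothing qualifies.
    """
    if not evalues:
        return None
    # Sort key includes the family name so that ties resolve deterministically.
    family, evalue = min(evalues.items(), key=lambda kv: (kv[1], kv[0]))
    if max_evalue is not None and evalue > max_evalue:
        return None
    return family
```

A cutoff of this kind is a common safeguard, because without one every gene is placed in some family even when its best e-value is poor.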

4. InterPro

Information from well-studied plants such as Arabidopsis gives us a chance to investigate apple proteins using many databases. Among them, InterPro is a useful database of protein families, domains and functional sites, in which identifiable features found in known proteins can be applied to novel protein sequences in order to characterize them functionally. Because the functions of most apple proteins are unknown, InterPro is useful for predicting their roles. The InterPro classification system also provides another criterion for analyzing gene function and annotation. The InterPro data were obtained with the InterProScan tool at EBI.

5. Gene evolution analysis

The evolution analysis was performed by aligning each apple gene against the genomes of other plants; the gene and protein structures of the best orthologous gene in these plants are shown as figures.

6. miRNA

The predicted microRNAs were mainly collected from the available publications and the microRNA database. In addition, conserved miRNAs were predicted from the apple genome with the miRPI software by Dr. Guodong Yang and Dr. Zhenlin Wei.

7. Blast Sequence Search

This component lets users search a query nucleotide or protein sequence against all sequences stored in AppleGADB. Users can submit a query sequence and change the BLAST parameters. After the BLAST search, the significant hit genes are ranked by the e-value of each hit. Users can also provide the query as text in FASTA format; after the query is submitted, the hit genes are listed in order of their sequence alignment rank.
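
The two steps described above, reading a FASTA-format query and ranking hits by e-value, can be sketched as follows. This is an illustrative sketch rather than the site's actual implementation: the function names, the hit tuple layout, and the default cutoff of 10 are assumptions, and the search itself is left out.

```python
def parse_fasta(text):
    """Parse FASTA-format text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
        else:
            raise ValueError("sequence data found before the first '>' header")
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def rank_hits(hits, max_evalue=10.0):
    """Drop hits above the e-value cutoff and sort the rest, best first.

    hits: iterable of (gene_id, evalue, bitscore).
    Assumption: ties on e-value are broken by the higher bit score.
    """
    kept = [hit for hit in hits if hit[1] <= max_evalue]
    return sorted(kept, key=lambda hit: (hit[1], -hit[2]))
```

Sorting in ascending e-value order puts the most significant hit first, which matches the ranking described above.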

 

Supported by:

National Research Center for Apple Engineering and Technology

State Key Laboratory of Crop Biology

 

Publication:

Zhang, S., Chen, G. H., Liu, Y., Chen, H., Yang, G., Yuan, X., ... & Shu, H. (2013). Apple gene function and gene family database: an integrated bioinformatics database for apple research. Plant Growth Regulation, 1-8.

For any problems or suggestions, please contact:

Dr. Shizhong Zhang
National Research Center for Apple Engineering and Technology
College of Horticulture Science and Technology
Shandong Agricultural University
Tai'an 271018, China
Email: gfdb@sdau.edu.cn

Dr. Guang Hui Chen
State Key Laboratory of Crop Biology
College of Life Science
Shandong Agricultural University
Tai'an 271018, China
Email: chenguanghui@ymail.com

Current Address:
Institute of Genetics and Developmental Biology, CAS, Beijing 100101, China.

Dr. Yukun Liu
Key Laboratory for Forest Resource Conservation and Utilization in the Southwest Mountains of China,
College of Forestry, Southwest Forestry University
P. R. China
Email: ykliu@swfu.edu.cn


All copyrights are reserved by Prof. Huairui Shu's lab at the National Research Center for Apple Engineering and Technology, Shandong Agricultural University

Web site design & administration: Shizhong Zhang, Guanghui Chen and Yukun Liu; IE 8 and 1600×900 resolution suggested