New Citation 20 July 2019

1. Disease classification via gene network integrating modules and pathways
Disease classification based on gene information is significant as a foundation for achieving precision medicine. Previous works focus on classifying diseases according to the gene expression data of patient samples, and on constructing disease networks based on the overlap of disease genes, as many genes have been confirmed to be associated with diseases. In this work, the effects of diseases on human biological functions are assessed (evaluated) from the perspective (viewpoint) of gene network modules and pathways, and the distances between diseases are defined to build the classification models. In total, 1728 diseases are divided into 12 and 14 categories by the intensity (strength) and scope (range) of effects on pathways, respectively. Each category is a mix of several types of diseases identified based on congenital (inborn, innate) and acquired factors as well as diseased tissues and organs. The disease classification models based on the gene network parallel the traditional pathology (study of disease) classification based on anatomic (dissection-based) and clinical (bedside) manifestations (signs, presentations), and enable us to look at diseases from the viewpoint of commonalities in etiology and pathology. Our models provide a foundation for exploring combination therapy of diseases, which in turn may inform strategies for future gene-targeted therapy.
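The abstract does not say how the distance between two diseases is defined. As a rough illustration of the general idea only, the sketch below assumes each disease is represented by the set of pathways it affects and uses the Jaccard distance between those sets. The distance choice, the function names, and the disease and pathway labels are all assumptions made for this note, not the paper's method or data.

```python
def jaccard_distance(a, b):
    """Return 1 - |A & B| / |A | B| for two sets.

    0.0 means identical sets, 1.0 means no shared element.
    Two empty sets are treated as identical (distance 0.0).
    """
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pairwise_distances(profiles):
    """Compute the distance for every unordered pair of diseases.

    `profiles` maps a disease name to the set of pathways it affects.
    """
    names = sorted(profiles)
    return {
        (x, y): jaccard_distance(profiles[x], profiles[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    }


# Hypothetical profiles with made-up labels (not taken from the paper).
disease_pathways = {
    "disease_A": {"pathway_1", "pathway_2", "pathway_3"},
    "disease_B": {"pathway_2", "pathway_3", "pathway_4"},
    "disease_C": {"pathway_5"},
}

distances = pairwise_distances(disease_pathways)
for pair, value in distances.items():
    print(pair, round(value, 3))
```

A distance table of this kind could then be passed to any clustering method to obtain categories; the abstract does not state which method the authors used.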

2. Allergenicity prediction of novel and modified proteins: Not a mission impossible! Development of a Random Forest allergenicity prediction model
Abstract
Alternative and sustainable protein sources (e.g., algae, duckweed, insects) are required to produce (future) foods. However, introduction of new food sources to the market requires a thorough risk assessment of nutritional, microbial and toxicological risks and potential allergic responses (allergic reactions). Yet, the risk assessment of allergenic potential of novel proteins is challenging. Currently, guidance for genetically modified proteins relies on a weight-of-evidence approach (a method of weighing the available evidence). Current Codex (2009) and EFSA (2010; 2017) guidance indicates that sequence identity to known allergens is acceptable for predicting the cross-reactive potential of novel proteins, and that resistance to pepsin digestion (cleavage by pepsin) and glycosylation status are used for evaluating de novo allergenicity potential. Other physicochemical (physical and chemical) and biochemical protein properties, however, are not used in the current weight-of-evidence approach. In this study, we have used the Random Forest algorithm for developing an in silico model that yields a prediction of the allergenic potential of a protein based on its physicochemical and biochemical properties. The final model contains twenty-nine variables, which were all calculated using the protein sequence by means of the ProtParam software and the PSIPred Protein Sequence Analysis program. Proteins were assigned as allergenic when present in the COMPARE database.
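To make "variables calculated from the protein sequence" concrete, here is a minimal sketch of two sequence-derived descriptors: GRAVY (the mean Kyte-Doolittle hydropathy, one of the quantities ProtParam reports) and the fraction of aromatic residues. The abstract does not list the twenty-nine variables, so whether either descriptor is among them is an assumption; the function names and the example sequence are made up for illustration.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

AROMATIC = set("FWY")


def clean_sequence(sequence):
    """Upper-case the sequence and reject empty or non-standard input."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"unsupported residues: {sorted(unknown)}")
    return seq


def gravy(sequence):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    seq = clean_sequence(sequence)
    return sum(KYTE_DOOLITTLE[r] for r in seq) / len(seq)


def aromatic_fraction(sequence):
    """Fraction of residues that are Phe, Trp or Tyr."""
    seq = clean_sequence(sequence)
    return sum(r in AROMATIC for r in seq) / len(seq)


def sequence_features(sequence):
    """Collect the descriptors into one feature dictionary for a protein."""
    seq = clean_sequence(sequence)
    return {
        "length": float(len(seq)),
        "gravy": gravy(seq),
        "aromatic_fraction": aromatic_fraction(seq),
    }


# Made-up example sequence, used only to show the call.
print(sequence_features("MKTAYIAKQRQISFVKSHFSRQ"))
```

A feature dictionary like this, computed for each protein and paired with an allergen / non-allergen label, is the kind of input a Random Forest classifier would be trained on.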

Results show robust model performance, with sensitivity, specificity and accuracy each ≥85%. As the model only requires the protein sequence for calculations, it can be easily incorporated into the existing risk assessment approach. In conclusion, the model developed in this study improves the predictability of the allergenicity of new or modified food proteins, as demonstrated for insect proteins.
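For reference, the three reported measures follow from the counts of a binary confusion matrix (allergen = positive class). The sketch below uses the standard definitions; the counts in the example are hypothetical and are not the paper's results.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity and accuracy from confusion counts.

    tp: allergens predicted as allergens
    fn: allergens predicted as non-allergens
    tn: non-allergens predicted as non-allergens
    fp: non-allergens predicted as allergens
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts, chosen only to illustrate the calculation.
metrics = classification_metrics(tp=90, fp=15, tn=85, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting all three matters because accuracy alone can look high on an imbalanced set of proteins even when one class is predicted poorly.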
