Text-mining solutions for biomedical research: enabling integrative biology

As well as providing background information for research, scientific publications can be processed to transform textual information into database content or complex networks and can be integrated with existing knowledge resources to suggest novel hypotheses. 

The latest developments in text-mining solutions allow a shift from the analysis of abstracts to the analysis of the full text of papers. For the future, we expect that seamless querying and searching across biomedical knowledge will facilitate the systematic generation and exploration of hypotheses as well as the identification of new research topics and existing controversies.

Categories of text-mining solutions: information retrieval, information extraction, building knowledge bases and knowledge discovery.

Information retrieval:

The user submits a query to the search engine and receives in return the documents or passages of text that match the query.
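The query-in, documents-out loop can be illustrated with a minimal sketch. It assumes a toy in-memory collection and ranks documents simply by how often the query terms occur in them. The function names `tokenize` and `search`, the document identifiers and the example sentences are all invented for illustration; real biomedical search engines use far more elaborate indexing and ranking.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def search(query, documents):
    """Rank documents by the number of query-term occurrences they contain."""
    query_terms = set(tokenize(query))
    scored = []
    for doc_id, text in documents.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in query_terms)
        if score > 0:  # documents without any query term are not returned
            scored.append((doc_id, score))
    # Highest score first; ties are broken by document id for stable output.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))

# Hypothetical toy collection (identifiers and sentences are made up).
documents = {
    "doc1": "BRCA1 mutations are associated with breast cancer.",
    "doc2": "TP53 regulates the cell cycle.",
    "doc3": "Breast cancer risk and BRCA1 screening in breast clinics.",
}
results = search("BRCA1 breast", documents)
print(results)
```

Raw term counting is used here only because it is the simplest scoring scheme to read; it is a stand-in, not a description of how any particular search engine ranks results.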

 

Information extraction:

It comprises the identification of entities as well as of the relationships between those entities.

Dictionary-based approach: an entity mentioned in the text is matched to the best-fitting entry in a dictionary resource and is then immediately linked to a database entry.
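A minimal sketch of dictionary-based matching, assuming a tiny hand-made dictionary that maps lowercase surface forms directly to database identifiers. The identifiers (`GENE:0001`, `GENE:0002`) and the function name `annotate` are placeholders invented for this example, not entries of any real database. It uses exact token lookup, whereas real systems also handle multi-word terms and spelling variants.

```python
import re

# Hypothetical mini-dictionary: surface form -> made-up database identifier.
DICTIONARY = {
    "tp53": "GENE:0001",
    "p53": "GENE:0001",   # synonym that resolves to the same entry
    "brca1": "GENE:0002",
}

def annotate(text, dictionary):
    """Return (mention, start, end, identifier) for every dictionary hit."""
    hits = []
    for match in re.finditer(r"[A-Za-z0-9]+", text):
        identifier = dictionary.get(match.group().lower())
        if identifier is not None:
            # The lookup itself yields the database link; no second step needed.
            hits.append((match.group(), match.start(), match.end(), identifier))
    return hits

annotations = annotate("Loss of p53 and BRCA1 was observed.", DICTIONARY)
print(annotations)
```

Because each dictionary entry already carries an identifier, recognising a mention and linking it to the database happen in the same step, which is the property the note above describes.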

Machine learning approach: a computer program identifies entity mentions as strings in the text, and a secondary analysis is required to link each entity to the correct database entry. This approach can identify only entities that are referred to in a pre-annotated training corpus, so it relies on the quality of the annotations in that corpus.

Entity recognition, Entity disambiguation, Entity normalization

Identification of relations between named entities: (1) co-occurrence; (2) construction of a network from all related scientific publications; (3) extraction of specific types of statements.
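The first two steps, co-occurrence and network construction, can be sketched as follows. The sketch assumes that entity recognition has already been done and that each sentence is represented by the list of entities found in it. The function name `cooccurrence_network` and the input lists are invented for illustration.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_network(sentences_entities):
    """Count how often each unordered pair of entities shares a sentence."""
    edges = Counter()
    for entities in sentences_entities:
        # Deduplicate and sort so that (A, B) and (B, A) map to the same edge.
        for a, b in combinations(sorted(set(entities)), 2):
            edges[(a, b)] += 1
    return edges

# Hypothetical input: entities already recognised in four sentences.
sentences_entities = [
    ["BRCA1", "TP53"],
    ["TP53", "MDM2"],
    ["TP53", "BRCA1", "MDM2"],
    ["BRCA1"],  # a single entity produces no pair
]
network = cooccurrence_network(sentences_entities)
for (a, b), weight in sorted(network.items()):
    print(a, b, weight)
```

The edge weights only say that two entities were mentioned together. They do not say what kind of relation holds between them, which is why extracting specific types of statements is listed as a separate step.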

 

Knowledge Bases:

Knowledge bases can be built that contain the collected statements together with the supporting evidence and provenance, in the form of references to the scientific literature.

A large number of databases now use text mining to gather their data.
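One way to picture a statement stored together with its provenance is the small in-memory sketch below. The class names `Statement` and `KnowledgeBase`, the relation labels and the reference strings (`REF:1` and so on) are all placeholders invented for this example. Real knowledge bases use proper database schemas and real literature identifiers.

```python
from dataclasses import dataclass, field

@dataclass
class Statement:
    """One extracted fact plus the references that support it."""
    subject: str
    relation: str
    obj: str
    evidence: list = field(default_factory=list)

class KnowledgeBase:
    """Toy store that merges repeated statements and accumulates evidence."""

    def __init__(self):
        self._statements = {}

    def add(self, subject, relation, obj, reference):
        key = (subject, relation, obj)
        statement = self._statements.setdefault(
            key, Statement(subject, relation, obj)
        )
        if reference not in statement.evidence:
            statement.evidence.append(reference)
        return statement

    def about(self, entity):
        """Return all statements in which the entity is subject or object."""
        return [
            s for s in self._statements.values()
            if entity in (s.subject, s.obj)
        ]

# Hypothetical content; the reference strings are placeholders.
kb = KnowledgeBase()
kb.add("MDM2", "inhibits", "TP53", "REF:1")
kb.add("MDM2", "inhibits", "TP53", "REF:2")  # same fact, second source
kb.add("BRCA1", "interacts_with", "TP53", "REF:3")
for statement in kb.about("TP53"):
    print(statement)
```

Keeping the evidence list on the statement itself means that every fact can be traced back to the publications it was collected from.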

Knowledge discovery:

It aims to identify hidden or as yet undiscovered knowledge by applying data-mining algorithms.

A fairly new approach is the combination of text mining and ontologies to generate hypotheses.

