Recent Trends in NLP-Based Text Analytics (With Specific Reference to Semantic Information: Medical Applications)

Advancements in the medical sector have appeared across many technological areas (Yang et al., 2015). However, these advancements have been accompanied by challenges such as higher costs, increased complexity, and poor quality of care (Cortada, Gordon, & Lenihan, 2012). For example, in the U.S. healthcare sector, 21-47% of expenditures have been attributed to inefficiency, such as the inappropriate use of antibiotics (Berwick & Hackbarth, 2012). A small part of these costs has also been linked to low-quality care, because of which many people die in the U.S. every year (Makary & Daniel, 2016). Better decision-making based on the available information is therefore a promising way to alleviate these issues. Consequently, the healthcare sector is adopting information technology in its management systems (Prokosch & Ganslandt, 2009).

While information technology has made it possible to gather vast amounts of data, analytics tools and techniques make it possible to retrieve the relevant information from this unstructured data and transform it to support better decision-making (Islam, Hasan, Wang, Germack, & Noor-E-Alam, 2018). Techniques such as data mining and text mining have helped medical professionals predict and diagnose diseases and treat patients, leading to better service quality and lower expenses (Tomar & Agarwal, 2013). According to estimates, applying data mining could save the U.S. medical sector $450 billion every year (Herland, Khoshgoftaar, & Wald, 2014).

Notably, the growing use of e-health records, and the corresponding interest in using them to improve data quality, has shown that analyzing open text (the free-text data captured in health records) is highly critical (Dhole & Uke, 2014). Several techniques have been used for extracting information from medical text. For example, natural language processing employs tools such as part-of-speech taggers and named-entity recognizers. When applying these tools, it has been observed that medical text differs from ordinary text because it includes complex terminology.

These types of text usually include abstracts of scientific papers, medical reports, and similar documents. The primary task of the techniques used is to classify papers into categories so that they can be fed into a database and retrieved by the end-user as required (Moreno & Redondo, 2016). Recognition of biomedical entities involves identifying entity names and grouping them by medical domain, such as proteins, genes, and drugs. Although numerous sources are available, no single resource can be considered complete, because new medicines are discovered continually; this is the main hindrance for NLP.

The approaches used to associate the recognized entities can be grouped as follows:

• Linguistic-based approaches: This approach uses parsers to understand syntactic structures and map them to semantic representations.

• Pattern-based approaches: This approach employs a set of patterns, defined by domain experts, for associating the entities.

• Machine Learning-based approaches: This approach is based on deriving associations in new categories that represent similar text (Andreu-Perez, Poon, Merrifield, Wong, & Yang, 2015).
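
As an illustration of the pattern-based approach, the following is a minimal sketch in Python. The regular expression, the unit list, and the function name `extract_drug_doses` are assumptions made for this example and do not come from any of the cited works; a real system would rely on patterns curated by domain experts.

```python
import re

# Hypothetical expert-written pattern: a drug name followed by a numeric
# dose and a unit, e.g. "amoxicillin 500 mg". The unit list is illustrative.
DOSE_PATTERN = re.compile(
    r"\b(?P<drug>[A-Za-z][A-Za-z-]+)\s+"
    r"(?P<dose>\d+(?:\.\d+)?)\s*"
    r"(?P<unit>mg|mcg|g|ml)\b",
    re.IGNORECASE,
)


def extract_drug_doses(text):
    """Return (drug, dose, unit) triples for every match of the pattern."""
    return [
        (m.group("drug").lower(), float(m.group("dose")), m.group("unit").lower())
        for m in DOSE_PATTERN.finditer(text)
    ]


note = "Patient started on Amoxicillin 500 mg and ibuprofen 200mg as needed."
print(extract_drug_doses(note))
```

A pattern this simple treats any word that precedes a dose as a drug name, which is one reason pattern-based systems are usually combined with curated vocabularies.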

To summarize, NLP enables a machine to process human language and transform it into a machine-understandable format. NLP originated in the 1960s, and it gained popularity with the large-scale adoption of the World Wide Web and search engines.

NLP processing is divided into the following steps:

  • Text Preprocessing/Tokenization
  • Lexical Analysis
  • Syntactical Analysis
  • Semantic Analysis
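
The four steps above can be sketched as a toy pipeline in Python. Everything here is an assumption made for illustration: the small `LEXICON`, the tag names, and the simple "adjectives followed by a noun" rule stand in for the taggers, parsers, and medical vocabularies a real system would use.

```python
import re

# Hypothetical toy lexicon: word -> (part-of-speech tag, semantic category).
LEXICON = {
    "patient": ("NOUN", "PERSON"),
    "reports": ("VERB", None),
    "severe": ("ADJ", None),
    "headache": ("NOUN", "SYMPTOM"),
    "and": ("CONJ", None),
    "nausea": ("NOUN", "SYMPTOM"),
}


def tokenize(text):
    """Step 1: split raw text into lowercase word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def lexical_analysis(tokens):
    """Step 2: attach a part-of-speech tag to each token (UNK if unknown)."""
    return [(t, LEXICON.get(t, ("UNK", None))[0]) for t in tokens]


def syntactic_analysis(tagged):
    """Step 3: group optional adjectives followed by a noun into noun phrases."""
    phrases, current = [], []
    for token, tag in tagged:
        if tag == "ADJ":
            current.append(token)
        elif tag == "NOUN":
            current.append(token)
            phrases.append(current)
            current = []
        else:
            current = []  # any other tag breaks the phrase being built
    return phrases


def semantic_analysis(phrases):
    """Step 4: keep phrases whose head noun has a semantic category."""
    results = []
    for phrase in phrases:
        category = LEXICON[phrase[-1]][1]  # the head noun is the last word
        if category is not None:
            results.append((" ".join(phrase), category))
    return results


def analyze(text):
    """Run the four steps in order on one piece of text."""
    return semantic_analysis(syntactic_analysis(lexical_analysis(tokenize(text))))


print(analyze("The patient reports severe headache and nausea."))
```

Each function consumes the output of the previous one, which mirrors the ordering of the steps: later stages depend on the structure produced by earlier ones.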

Employing NLP is therefore of paramount significance for the medical field, since there is a large amount of unstructured data that needs to be refined in order to avoid inefficient outcomes. Although electronic medical records are becoming popular and are now commonly used, very few systems can extract relevant data from these medical documents. Hence, there is a need to process the raw data using NLP techniques.

In addition to processing the available data, the healthcare sector is also concerned with extracting relevant data from different sources or from data stored in different databases (Luque, Luna, Luque, & Ventura, 2018). Extracting relevant data from unstructured data is another challenge in the healthcare domain. Text mining is therefore used to extract structured information from unstructured data. In the last few years, the amount of information generated in the field of medicine has grown continuously, especially the information produced by professionals in their day-to-day routine (Feldman, Hazekamp, & Chawla, 2016). This is because the condition of each patient is discussed by several healthcare professionals in textual form, which is stored in different formats. Because of this unstructured data, obtaining structured or relevant data for decision-making becomes more complex.

To conclude, the primary challenge faced by medical professionals is not simply retrieving data, but obtaining significant and relevant information from the database using text mining, and then employing NLP (a part of text mining) to process the data obtained so as to get the desired information. The phases involved in extracting useful medical data from unstructured data are as follows:

  1. Pre-processing stage: In this stage, unstructured data is arranged as per the standards and then processed further using NLP techniques.
  2. Text representation stage: In this stage, unstructured data is converted into an appropriate form which enables it to be analyzed efficiently.
  3. Discovery stage: This is the last stage of processing unstructured data. In this stage, the information expected by the end-user is obtained by employing techniques such as clustering and classification.
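
The three stages can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a method from the cited works: the stop-word list, the bag-of-words representation, the nearest-example classifier based on cosine similarity, and the four labelled notes in `TRAINING` are all hypothetical choices made for the example.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; real systems use larger, domain-aware lists.
STOPWORDS = {"the", "a", "an", "of", "and", "with", "has", "is", "in", "on", "for"}


def preprocess(text):
    """Stage 1: lowercase, tokenize, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def represent(tokens):
    """Stage 2: turn tokens into a bag-of-words vector (word -> count)."""
    return Counter(tokens)


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def classify(text, labelled_examples):
    """Stage 3: return the label of the most similar labelled example."""
    vector = represent(preprocess(text))
    scored = (
        (label, cosine(vector, represent(preprocess(example))))
        for example, label in labelled_examples
    )
    best_label, _ = max(scored, key=lambda pair: pair[1])
    return best_label


# Hypothetical labelled notes, invented for this illustration.
TRAINING = [
    ("Patient has chest pain and shortness of breath", "cardiology"),
    ("Irregular heartbeat with chest tightness", "cardiology"),
    ("Itchy red rash on the arm", "dermatology"),
    ("Dry skin with a persistent rash", "dermatology"),
]

print(classify("Complains of chest pain", TRAINING))
```

Classification is used for the discovery stage here only because it needs the least code; clustering would operate on the same vectors produced by the representation stage.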

The continuous advancement of data analytics and its application to medicine has been a great support for medical professionals and patients alike. Some of the benefits are listed below:

  • Better quality of care services offered to patients (Delespierre, Denormandie, Bar-Hen, & Josseran, 2017).
  • Reduced medical errors (Cohan, Fong, Ratwani, & Goharian, 2017).
  • Enabling prevention and detection of diseases in an earlier stage (Just, 2017).
  • Facilitating the availability of most beneficial treatment for patients (Palanisamy & Thirunavukarasu, 2017).
  • Reducing both the time and the expenditure incurred by healthcare management (Wencheng, Fang, Zhiping, Shengqun, & Guoyan, 2017).

References

  1. Andreu-Perez, J., Poon, C. C. Y., Merrifield, R. D., Wong, S. T. C., & Yang, G.-Z. (2015). Big Data for Health. IEEE Journal of Biomedical and Health Informatics, 19(4), 1193–1208.
  2. Berwick, D. M., & Hackbarth, A. D. (2012). Eliminating waste in US health care. JAMA, 307(14), 1513–1516.
  3. Cohan, A., Fong, A., Ratwani, R. M., & Goharian, N. (2017). Identifying Harm Events in Clinical Care through Medical Narratives. Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics – ACM-BCB ’17, 52–59.
  4. Cortada, J. W., Gordon, D., & Lenihan, B. (2012). The value of analytics in healthcare.
  5. Delespierre, T., Denormandie, P., Bar-Hen, A., & Josseran, L. (2017). Empirical advances with text mining of electronic health records. BMC Medical Informatics and Decision Making, 17(1), 127.
  6. Dhole, G., & Uke, N. (2014). Medical Information Extraction Using Natural Language Interpretation. SSRN Electronic Journal, 1(1), 19–25.
  7. Feldman, K., Hazekamp, N., & Chawla, N. V. (2016). Mining the Clinical Narrative: All Text are Not Equal. 2016 IEEE International Conference on Healthcare Informatics (ICHI), 271–280.
  8. Herland, M., Khoshgoftaar, T. M., & Wald, R. (2014). A review of data mining using big data in health informatics. Journal of Big Data, 1(1).
  9. Islam, M. S., Hasan, M. M., Wang, X., Germack, H. D., & Noor-E-Alam, M. (2018). A Systematic Review on Healthcare Analytics: Application and Theoretical Perspective of Data Mining. Healthcare (Basel, Switzerland), 6(2).
  10. Just, E. (2017). How to Use Text Analytics in Healthcare to Improve Outcomes—Why You Need More than NLP.
  11. Luque, C., Luna, J. M., Luque, M., & Ventura, S. (2018). An advanced review on text mining in medicine. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 9(3).
  12. Makary, M. A., & Daniel, M. (2016). Medical error-the third leading cause of death in the US. BMJ (Clinical Research Ed.), 353, i2139.
  13. Moreno, A., & Redondo, T. (2016). Text Analytics: the convergence of Big Data and Artificial Intelligence. International Journal of Interactive Multimedia and Artificial Intelligence, 3(6), 57.
  14. Palanisamy, V., & Thirunavukarasu, R. (2017). Implications of big data analytics in developing healthcare frameworks – A review. Journal of King Saud University – Computer and Information Sciences, 31(4), 415–425.
  15. Prokosch, H. U., & Ganslandt, T. (2009). Perspectives for medical informatics. Reusing the electronic medical record for clinical research. Methods of Information in Medicine, 48(1), 38–44.
  16. Tomar, D., & Agarwal, S. (2013). A survey on Data Mining approaches for Healthcare. International Journal of Bio-Science and Bio-Technology, 5(5), 241–266.
  17. Wencheng, S. U. N., Fang, L. I. U., Zhiping, C. A. I., Shengqun, F., & Guoyan, W. (2017). A Survey of Data Processing of EMR (Electronic Medical Record) Based on Data Mining. (August), 1–16.
  18. Yang, J.-J., Li, J., Mulder, J., Wang, Y., Chen, S., Wu, H., … Pan, H. (2015). Emerging information technologies for enhanced healthcare. Computers in Industry, 69, 3–11.
