By Michael Cheyne
When it comes to creating a complete view of the patient journey, the challenge doesn’t lie in finding enough data. There are vast amounts of healthcare data in the ecosystem from patients, providers, and other healthcare organizations. The challenge lies in acquiring manageable data that is fit-for-purpose and can be mined for insights at scale. With the tools to accomplish this, we can not only understand the complete patient journey but also inform decisions that improve patient outcomes.
The value of integrated lab data
To make the best healthcare decisions, it is important to factor in all the data generated along the patient journey. Lab results are among the most influential datasets because they help explain the “whys” behind a patient’s clinical manifestations of disease. While claims, prescriptions, and other medical data are crucial to an informed study, the detail and sheer volume of information in lab reports represent a wealth of untapped data that most organizations do not take full advantage of, for a variety of reasons: data complexity, lack of experience, and the traditional focus on treatment information.
When pharmaceutical organizations are equipped with the tools to factor in lab data with other patient-level data, the most complete view of the patient journey is revealed. With that understanding, healthcare stakeholders can form actionable insights to make their work faster and more impactful for everyone involved.
The unique challenges presented by lab data
Managing and interpreting lab data presents several challenges that often prevent it from being fully factored into the patient journey. Lab data is different from other, more traditional forms of healthcare data because results are often unstructured or semi-structured, making the data difficult to master. These challenges arise in large part because lab data is generated by clinicians for other clinicians (i.e. humans communicating sometimes complex information to other humans) and not for automated categorization, storage, or interpretation at scale. This is in contrast to prescription and medical office claims data, which are structured so that computers can talk to other computers, adjudicating payment at light speed.
Additionally, lab entries are not standardized across the industry. Different labs, and even providers within the same lab, use different forms and formats for reporting data. The Logical Observation Identifiers Names and Codes (LOINC) ontology was introduced to support standardization across labs and other types of clinical records, but labs often use their own proprietary codes, which still need to be mapped back through LOINC before results can be compared across labs. Even when labs attempt to adopt the LOINC standard in reporting their results, they can mismap a test to an incorrect code, generating confusion. More often, they find that no fully applicable LOINC code exists for the test they are performing, particularly for newer tests or methods.

Even when the data has been mapped to LOINC correctly, units, terminology, and other information needed for interpretation can vary across labs. As long as the assigned physician can understand the lab process and results, mass interpretation of the data is a distant concern for labs, making normalization a challenge. Further, a significant amount of important information is often buried in clinicians’ notes, requiring either more time to decipher them by hand or the use of natural language processing (NLP) technology if an organization wants to take full advantage of all available data.
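To make the mapping problem concrete, here is a minimal sketch of normalizing results from two hypothetical labs onto a shared LOINC code and canonical unit. The lab identifiers, local codes, and crosswalk entries are invented for illustration; real crosswalks span thousands of entries and many unit systems.

```python
# Hypothetical crosswalk from lab-specific codes to LOINC codes.
PROPRIETARY_TO_LOINC = {
    ("LAB_A", "GLU-01"): "2345-7",  # Glucose [Mass/volume] in Serum or Plasma
    ("LAB_B", "GLUC"):   "2345-7",  # same test, different local code
}

# Conversion factors into a canonical unit (mg/dL), here for glucose.
UNIT_TO_MG_DL = {"mg/dL": 1.0, "mmol/L": 18.016}

def normalize_result(lab_id, local_code, value, unit):
    """Map a raw lab result onto a LOINC code and canonical unit.

    Returns None when the code or unit cannot be mapped, mirroring the
    real-world case where no applicable LOINC code exists yet.
    """
    loinc = PROPRIETARY_TO_LOINC.get((lab_id, local_code))
    factor = UNIT_TO_MG_DL.get(unit)
    if loinc is None or factor is None:
        return None
    return {"loinc": loinc, "value": value * factor, "unit": "mg/dL"}

# Two labs reporting the same test in different codes and units
# land on one comparable record.
print(normalize_result("LAB_B", "GLUC", 5.5, "mmol/L"))
```

The point of the sketch is that the hard work lives in the lookup tables, which is exactly the part that does not exist industry-wide and must be built and maintained.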
In this raw format, these datasets can’t be utilized or integrated with other healthcare data to uncover new insights and conclusions. Ultimately, most pharmaceutical companies simply do not have the manpower or infrastructure to effectively normalize lab data at scale, integrate it with other types of healthcare data, and generate meaningful insights from the full patient-level view of the data.
Methods for addressing these challenges
Historically, a clinician would create a map for a specific condition by iterating and adding to the data by hand over a period of several weeks. Each map provided insights for only one condition under one set of circumstances, which made iteration extremely time-consuming and scaling nearly impossible. Simply using existing software to normalize data works for individual datasets, but not for ingesting, normalizing, and applying millions of lab results. Fortunately, data scientists have developed tools that replace this process with machine learning (ML), identifying patterns and extracting value at faster speeds and on a massive scale.
Over time, as these self-refining algorithms and models evaluate more and more data, they improve in accuracy and are able to process data at a greater scale. As the models become more mature, they can even generate the initial interpretations and mapping, allowing a clinician to review them for accuracy and significantly shortening the process. While the clinician remains a critical piece of the process, ML reduces the time to value and produces results that are standardized and scalable.
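A toy illustration of this “model proposes, clinician reviews” workflow, using simple string similarity in place of a trained model; the test names, LOINC codes, and confidence threshold are assumptions made for the sketch.

```python
from difflib import SequenceMatcher

# A tiny stand-in for a learned vocabulary of already-mapped tests.
KNOWN_TESTS = {
    "hemoglobin a1c": "4548-4",
    "glucose serum": "2345-7",
    "creatinine serum": "2160-0",
}

def suggest_mapping(raw_name, threshold=0.8):
    """Return (loinc, confidence, needs_review) for a raw test name.

    High-confidence suggestions flow straight through; low-confidence
    ones are queued for a clinician, so human review shrinks from
    building every map by hand to checking the model's uncertain calls.
    """
    best_code, best_score = None, 0.0
    for name, code in KNOWN_TESTS.items():
        score = SequenceMatcher(None, raw_name.lower(), name).ratio()
        if score > best_score:
            best_code, best_score = code, score
    return best_code, best_score, best_score < threshold

code, confidence, needs_review = suggest_mapping("Hemoglobin A1C")
```

In a real pipeline the similarity function would be a trained model and the review queue would feed corrections back in, which is what makes the system self-refining.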
Additionally, data scientists and clinical experts have successfully leveraged NLP to extract information from text fields in lab results, such as those that house clinician notes. This data adds a level of clinical specificity that has never before been possible at a large scale. Currently, NLP can read pathology notes, learn healthcare acronyms, decipher typos, and even learn and apply new words.
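As a rough sketch of what such extraction looks like, the toy pipeline below expands a small acronym dictionary and pulls a cancer-stage mention out of an invented note with a regular expression; production systems would rely on trained NLP models rather than hand-written patterns.

```python
import re

# A tiny, invented acronym dictionary; real systems learn these at scale.
ACRONYMS = {"hx": "history", "dx": "diagnosis", "mets": "metastases"}

def expand_acronyms(note):
    """Replace known clinical acronyms with their expansions."""
    return re.sub(
        r"\b(" + "|".join(ACRONYMS) + r")\b",
        lambda m: ACRONYMS[m.group(1).lower()],
        note,
        flags=re.IGNORECASE,
    )

def extract_stage(note):
    """Find a cancer stage mention (e.g. 'stage IIIb') in a note, if any."""
    match = re.search(r"stage\s+(I{1,3}V?|IV)[a-c]?", note, re.IGNORECASE)
    return match.group(0) if match else None

note = "Pt hx of NSCLC, stage IIIb, no distant mets noted."
print(extract_stage(expand_acronyms(note)))  # stage IIIb
```

Even this crude version shows the payoff: a free-text sentence becomes a queryable field (disease stage) that structured claims data alone would never surface.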
Not only do ML and NLP significantly increase data normalization speeds and capabilities, they also make massive scalability an option. Models can utilize the nearly endless supply of healthcare data rather than picking and choosing normalized datasets from single sources. With this approach, a clinician is needed only to review the output of the model, rather than constantly overseeing and tweaking it.
Additionally, existing models are advanced enough to determine expected mapping for new data types. For example, if data is brought in from a new lab or by a client in their own data format, the established generalized solutions can accurately determine the probable mapping of the disease or test. This allows for the rapid application of existing models to new data which cuts down the time to insight significantly.
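One way to picture this generalization is a mapper that ranks candidate codes for a test name it has never seen, as in the hedged sketch below; the reference vocabulary, scoring scheme, and the new lab’s naming convention are all invented for illustration.

```python
from difflib import SequenceMatcher

# Reference vocabulary built from previously mapped labs (invented).
REFERENCE_VOCAB = {
    "2345-7": "glucose serum",
    "4548-4": "hemoglobin a1c",
    "2160-0": "creatinine serum",
}

def rank_candidates(raw_name, top_k=2):
    """Return the top_k (loinc, normalized score) candidates
    for a test name arriving in an unseen lab's own format."""
    scored = [
        (code, SequenceMatcher(None, raw_name.lower(), name).ratio())
        for code, name in REFERENCE_VOCAB.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    total = sum(score for _, score in scored) or 1.0
    return [(code, score / total) for code, score in scored[:top_k]]

# A new lab reports glucose in its own house style; the established
# vocabulary still yields a probable mapping without manual rework.
print(rank_candidates("GLUCOSE, SER/PLAS"))
```

The ranked output is what lets an analyst apply an existing model to a new client feed and only adjudicate the close calls, cutting time to insight.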
Investing in powerful tools
High-stakes healthcare decisions shouldn’t be made without taking as much data as possible into consideration. While lab data is unstandardized and challenging to manage, it holds incredibly valuable information that can give organizations a more complete view of the patient journey, provided they invest in the tools needed to extract its value. By using existing tools to standardize lab data, or by partnering with data experts who already have vast ecosystems of harmonized data, pharmaceutical companies can unlock the potential of lab results on a massive scale and revolutionize their decision-making.
Michael Cheyne is vice president of clinical centers of excellence at Prognos Health, a leading clinically-focused data and analytics platform company that is on a mission to improve patient outcomes.