3 Strategies for AI Development Teams to Make Sense of Unstructured Healthcare Data


By Vatsal Ghiya

Much has been made about the potential for artificial intelligence to transform the healthcare industry, and for good reason. Sophisticated AI platforms are fueled by data, and healthcare organizations have that in abundance. So why has the industry lagged behind others in terms of AI adoption?

That’s a multifaceted question with many possible answers. All of them, however, will undoubtedly highlight one obstacle in particular: large amounts of unstructured data.

Easy to Generate, Hard to Use

Unstructured data is everywhere in clinical settings, often taking the form of nurses’ notes, physician transcripts, and other patient information that’s usually stored in silos across multiple organizations. Developing robust algorithms capable of converting unstructured data into structured data suitable for analysis is expensive and time-consuming. Moreover, analyzing unstructured data directly is often problematic because data points associated with the same patient might not be connected, which makes it difficult to find correlations between them.

Structured data, in comparison, is easy to tap into. In the healthcare industry, patient demographic information, medication codes, and other data stored via electronic health records in a standardized format are the most common examples of structured data. Unfortunately, the vast majority of healthcare data is unstructured. As hospitals and physicians’ offices increasingly rely on connected devices and digital platforms to interact with patients, they’re generating exponentially more of it.
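
To make the contrast concrete, here is a minimal Python sketch of the kind of conversion described above: pulling a patient identifier and a blood pressure reading out of free-text notes, then grouping the results so that readings for the same patient are connected. The notes, the ID format, and the function names are all invented for illustration; real clinical text is far messier, and production systems typically rely on clinical NLP tooling rather than a pair of regular expressions.

```python
import re
from collections import defaultdict

# Hypothetical free-text notes; the IDs, wording, and values are invented.
NOTES = [
    "Patient ID 1042: BP 138/88, reports mild headache. Continue lisinopril 10 mg.",
    "Pt ID 2210 - BP 121/79. No complaints.",
    "Patient ID 1042: follow-up, BP 129/84, headache resolved.",
]

# Assumed conventions: an ID is written "ID <digits>", a reading "BP <sys>/<dia>".
PATIENT_ID = re.compile(r"\bID\s+(\d+)")
BLOOD_PRESSURE = re.compile(r"\bBP\s+(\d{2,3})/(\d{2,3})")


def extract_record(note):
    """Pull a patient ID and blood pressure reading out of one note, if present."""
    id_match = PATIENT_ID.search(note)
    if not id_match:
        return None  # a note that names no patient cannot be linked
    record = {"patient_id": int(id_match.group(1)), "systolic": None, "diastolic": None}
    bp_match = BLOOD_PRESSURE.search(note)
    if bp_match:
        record["systolic"] = int(bp_match.group(1))
        record["diastolic"] = int(bp_match.group(2))
    return record


def link_by_patient(notes):
    """Group extracted records by patient ID so one patient's readings sit together."""
    linked = defaultdict(list)
    for note in notes:
        record = extract_record(note)
        if record is not None:
            linked[record["patient_id"]].append(record)
    return dict(linked)


if __name__ == "__main__":
    for patient_id, records in link_by_patient(NOTES).items():
        print(patient_id, [(r["systolic"], r["diastolic"]) for r in records])
```

Even in this toy case, anything the patterns do not anticipate (a misspelled label, a reading written out in words) is silently dropped, which hints at why doing this reliably at scale is expensive.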

Now more than ever, the industry needs the power of AI to generate actionable insights from all this scattered information. With that in mind, here are three ways healthcare organizations and product teams can pave the way for more viable, impactful AI solutions.

1. Increase connectivity between data points. It’s hard to quantify the time and capital investment that would be required to create interlinkages among the technologies, platforms, and systems that make up the industry’s sprawling digital infrastructure. Even so, such an investment would eventually pay off. Sweeping change could take many years, however, especially in a highly regulated industry like healthcare. In the meantime, product teams must do their part to develop new strategies for taking advantage of unstructured data, and those that do might be pleasantly surprised. They’ll save time that would otherwise be spent loading information into rows and columns, and they might gain access to valuable insights that frequently get lost when unstructured data is converted to a structured format.

2. Utilize partners to annotate and label data sets. AI-based tools can analyze medical imaging data from X-rays, CAT scans, and other procedures to help radiologists and physicians make faster, more accurate diagnoses and acquire critical knowledge. To do this, however, the tools need to be trained with large amounts of complex data that has been painstakingly annotated by experts who understand what medical professionals need from a computer assistant. Finding a team that has both the technical capabilities and the domain expertise to annotate unstructured data sets for training sophisticated AI isn’t easy, but doing so can dramatically speed up and improve the development process. A partner can also help develop useful algorithms on top of annotated data, so teams can focus on other critical objectives.

3. Aim for progress over perfection. Developing any AI-based solution is an iterative process that requires teams to continually test, gather feedback, and optimize their models. Unfortunately, this process doesn’t always lead to a viable solution. Healthcare organizations must set clear goals and prioritize very specific use cases prior to beginning development if they ever hope to generate a return on their AI investment. Similarly, having transparent conversations about existing constraints and organizational limitations will help streamline resource provision and perhaps prevent the need for much harder conversations months (or years) into the project.

Barring unprecedented reform, unstructured data will continue to be the rule rather than the exception in the healthcare industry. Instead of lamenting this as an insurmountable barrier to innovation, healthcare organizations and product teams should use existing constraints to focus their efforts. Change won’t happen overnight; without slow, gradual progress, it won’t happen at all. Using the above strategies, teams can take the small steps needed to ensure that it does.

Vatsal Ghiya is a serial entrepreneur with more than 20 years of experience in healthcare AI software and services. He is the CEO and co-founder of Shaip, which enables the on-demand scaling of its platform, processes, and people for companies with the most demanding machine learning and artificial intelligence initiatives.

