The third annual NLP Summit is upon us, and it’s the biggest yet. For years, this free virtual event has brought together the brightest minds in artificial intelligence (AI), and more specifically, natural language processing (NLP), to share knowledge, best practices, and advances in the field. The summit takes place in just a week (Oct. 4-6) online, and with companies like Roche, Mount Sinai Health System, Merck, Cigna, Change Healthcare, Johnson & Johnson, and others presenting, it’s not one you want to miss.
NLP has been on a steady growth trajectory for the last several years. In fact, 60% of respondents in a recent survey indicated that their NLP budgets grew by at least 10% from the previous year, while a third (33%) said their budgets grew by at least 30%. Few industries have benefited from advances in AI and NLP more than healthcare and life sciences. The technology has enabled everything from accelerating clinical trials, predicting patient risk earlier, and identifying candidates for drug discovery or drug repurposing to summarizing new research on COVID-19 and better understanding social determinants of health.
The NLP Summit is where leaders from healthcare and beyond join forces to map out what’s next in NLP, including the challenges and promises ahead. The event will be broken down into three full days dedicated to open source, healthcare, and applications. While this year’s program covers many growth areas of NLP, three major themes have emerged: large language models, responsible AI practices, and multimodal learning. Here’s a closer look at these themes and why they matter to healthcare AI practitioners.
Large Language Models
Large language models (LLMs) are, as the name implies, large: often many gigabytes in size and trained on enormous amounts of text data. One of their main advantages is that they significantly extend what systems can do with text. For example, they are very effective for many downstream tasks, such as text generation, translation, summarization, and semantic search. Several sessions in the program will cover applications of LLMs.
As you can imagine, in a clinical setting, accuracy is of the utmost importance. A crucial part of NLP is monitoring models over time to ensure they are still performing accurately and driving the best (in this case, safest) results. The ability of LLMs to effectively inform downstream tasks is a huge advantage for users, but models degrade over time, and so do all the tasks they power. While LLMs are far from a fail-safe solution, they currently provide the most accurate solutions available, as long as it is understood that tuning and monitoring are still necessary.
Responsible AI
It’s hard to talk about AI or NLP without bringing up ethics. We rely on our technology to perform accurately and with the best intent, just as we entrust our clinicians to do no harm. But as long as humans have bias, so too will AI. It’s not something we can eliminate entirely (although that would be nice), but it is something that we should proactively measure, mitigate, and minimize. For healthcare, this includes factors like social determinants of health, access to care, and illnesses that disproportionately affect certain populations.
Several sessions at the NLP Summit will cover responsible AI. For example, one keynote from John Snow Labs will explore readily usable tools to automatically test for and fix problems with model robustness, errors in training data, and gender bias. Another session, presented by Cigna, will focus on keeping LLMs current by effectively monitoring for models that no longer represent the latest, state-of-the-art research. The talk will investigate how LLMs fail to model unseen or changing language, and techniques to help detect these instances. Methods that can be used to alleviate the problem of covariate shift will also be discussed.
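To make the idea of detecting unseen or changing language a little more concrete, here is a minimal sketch in Python. It is not the method from the Cigna talk or from any particular library. It simply assumes that one crude signal of covariate shift is the share of incoming words that never appeared in the text a model was trained on. The function names, the sample sentences, and the 0.2 threshold are all hypothetical choices made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on runs of non-alphanumeric characters."""
    return [tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok]


def build_vocabulary(reference_texts, min_count=1):
    """Collect every token seen at least min_count times in the reference texts."""
    counts = Counter(tok for text in reference_texts for tok in tokenize(text))
    return {tok for tok, n in counts.items() if n >= min_count}


def oov_rate(texts, vocabulary):
    """Return the fraction of tokens in texts that are missing from vocabulary."""
    tokens = [tok for text in texts for tok in tokenize(text)]
    if not tokens:
        return 0.0
    unseen = sum(1 for tok in tokens if tok not in vocabulary)
    return unseen / len(tokens)


def drift_detected(reference_texts, incoming_texts, threshold=0.2):
    """Flag drift when the out-of-vocabulary rate exceeds a hypothetical threshold."""
    vocabulary = build_vocabulary(reference_texts)
    return oov_rate(incoming_texts, vocabulary) > threshold


# Hypothetical stand-ins for training-time notes and newly arriving notes.
reference_notes = [
    "Patient reports chest pain",
    "Patient denies fever or cough",
]
incoming_notes = ["Covid booster telehealth"]

if __name__ == "__main__":
    print(drift_detected(reference_notes, incoming_notes))
```

A real monitoring setup would likely track a statistic like this over time and pair it with model-level metrics, since vocabulary overlap alone says nothing about whether predictions have actually become less accurate.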
Multimodal Learning
Multimodal learning combines various data types, including text, images, and audio, to achieve better, more accurate results. Healthcare presents the very real challenge of combining structured data (medical claims, prescriptions, medical device measurements) with unstructured data (medical notes, radiology images, conversations) to build a more cohesive view of a patient and their medical journey. Unsurprisingly, doing this at scale requires strong healthcare-specific NLP.
Roche will be presenting a session about deep learning for relation extraction from clinical documents. Another session presented by Meaning will pinpoint effective strategies for creating systems that can synthesize human-like speech, and discuss breakthroughs in generative AI. Speech technology is an area with significant potential in the healthcare space.
According to MarketsandMarkets, the NLP market size is expected to grow to nearly $50 billion by 2027. Several factors are driving this, including urgent new medical challenges that require new solutions and new technology that makes those solutions possible, and the result is increased investment in healthcare language understanding. There’s no better place to learn about how healthcare is propelling NLP forward than at this year’s NLP Summit. If you’re interested in attending, it’s not too late to register for free. Hope to see you there!
David Talby is CTO of John Snow Labs.