Modern Extractive Question Answering Annotators, Notable Performance Enhancements, and State-of-the-Art Models Characterize Spark NLP 4.0, the newest release from John Snow Labs.
Fremont, CA: “As the most widely used NLP library in the enterprise, we have a responsibility to deliver accurate, production-grade, state-of-the-art NLP software,” comments David Talby, CTO, John Snow Labs. Spark NLP 4.0 has been launched by John Snow Labs, a Healthcare AI and NLP business and provider of the Spark NLP library. Spark NLP 4.0 is the company's most significant release this year, featuring new question-answering annotators, huge performance enhancements, optimizations on new hardware platforms, and more than 1,000 state-of-the-art pre-trained transformer models available in numerous languages. This exemplifies John Snow Labs' continued dedication to providing the most advanced and accurate NLP tools to the global AI community.
The question-answering annotators introduced in Spark NLP 4.0 let the software answer arbitrary natural-language questions about a given document. The models return an answer together with the location in the document where it was found. Hundreds of pre-trained models based on BERT, ALBERT, DeBERTa, RoBERTa, DistilBERT, Longformer, and XLM-RoBERTa are available out of the box, covering numerous languages, document formats, and performance objectives. The models are already trained and fine-tuned, so users can begin using them immediately. This is another step toward making NLP easier to implement in production-grade systems.
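To illustrate what "extractive" means here, the following is a minimal Python sketch of how an answer and its position can be read off a document. It is not Spark NLP's API or implementation: the function names (`tokenize_with_offsets`, `best_span`, `extract_answer`) are hypothetical, and the per-token start and end scores are hand-written stand-ins for what a fine-tuned transformer would produce for the question "What did John Snow Labs release?".

```python
import re
from typing import Dict, List, Tuple, Union


def tokenize_with_offsets(text: str) -> List[Tuple[str, int, int]]:
    """Split on whitespace, keeping (token, start, end) character offsets."""
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def best_span(start_scores: List[float], end_scores: List[float],
              max_len: int = 8) -> Tuple[int, int]:
    """Return token indices (i, j), i <= j, maximizing start[i] + end[j]."""
    best, best_score = (0, 0), float("-inf")
    for i, start in enumerate(start_scores):
        # Only consider spans of at most max_len tokens.
        for j in range(i, min(i + max_len, len(end_scores))):
            score = start + end_scores[j]
            if score > best_score:
                best, best_score = (i, j), score
    return best


def extract_answer(context: str, start_scores: List[float],
                   end_scores: List[float]) -> Dict[str, Union[str, int]]:
    """Map the best-scoring token span back to text and character offsets."""
    tokens = tokenize_with_offsets(context)
    i, j = best_span(start_scores, end_scores)
    begin, end = tokens[i][1], tokens[j][2]
    return {"answer": context[begin:end], "begin": begin, "end": end}


context = "John Snow Labs released Spark NLP 4.0 this year"

# ASSUMPTION: made-up scores, one per whitespace token, standing in for
# model output. A real model would compute these from question + context.
start_scores = [0.1, 0.0, 0.0, 0.2, 2.5, 0.3, 0.1, 0.0, 0.0]
end_scores = [0.0, 0.0, 0.3, 0.1, 0.2, 0.4, 2.8, 0.1, 0.0]

result = extract_answer(context, start_scores, end_scores)
print(result)
```

The answer is always a literal span copied from the document, which is why such models can report where the answer was found as well as what it is.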
Spark NLP 4.0 is optimized for recent hardware, with official support for Apple silicon (M1) processors and Intel's oneAPI Deep Neural Network Library (oneDNN). Enabling oneDNN can increase the performance of transformer-based models running on CPUs by up to 97 percent. In addition, as a result of optimizations for Nvidia GPUs, users are seeing performance increases of up to 700 percent. Support for the latest runtimes of Databricks, AWS EMR, and Kubernetes is also a noteworthy feature.
The release also improves accuracy on key tasks, setting a new state of the art for two popular ones. The first is named entity recognition (NER), for which Spark NLP 4.0 offers the most accurate model on the widely used CoNLL-2003 benchmark among open-source NLP libraries. The second is coreference resolution, which uses BERT-based span classification to outperform conventional methods and libraries.
Spark NLP has been in production for only five years, and already 33 percent of the world's businesses use it. According to Gradient Flow, this rises to 59 percent among AI practitioners in the healthcare and life sciences. Customers of John Snow Labs include, among others, half of the world's top 10 pharmaceutical businesses and the three largest healthcare companies in the United States.