Making sense of data retrieval with Elasticsearch

by Eamon McCarthy Earls
Assistant Editor, MSDynamicsWorld.com

Microsoft is aggressively promoting new Copilot capabilities. For many in the Microsoft channel, the hype raises some important questions. How far will offerings like Copilot, which use OpenAI large language models (LLMs), expand on existing enterprise search? Users are still learning the exact extent of these capabilities in practice and what is involved in customizing LLMs for their own data.

Speaking during a session at Build 2023, Nick Chow of Elastic shed some light on how data is retrieved with Azure Machine Learning and Elasticsearch.

Models like ChatGPT are limited by public training data that can be relatively generic. Chow explained that fine-tuning LLMs is costly. Context windows provide a way to customize a model's responses at a somewhat lower cost, because the information is supplied alongside the prompt rather than trained into the model, but they limit how much information can be passed to the LLM. “[Data] needs to be labeled, often manually,” he said. Training LLMs can be a grueling process, with data labeling conducted by low-wage data entry employees.
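To make the context window limit concrete, here is a minimal sketch of how retrieved passages might be trimmed to fit a fixed budget before they are handed to an LLM. It is an illustration only, not the approach Chow presented: the function names `pack_context` and `build_prompt` are hypothetical, the passages are assumed to arrive already ranked by a search engine such as Elasticsearch, and a simple word count stands in for a real tokenizer.

```python
def pack_context(passages, budget):
    """Greedily keep ranked passages whose combined size fits the budget.

    passages: list of strings, assumed to be sorted by relevance already.
    budget: maximum size, measured here in whitespace-separated words
            (an assumption; a real system would count model tokens).
    """
    selected = []
    used = 0
    for text in passages:
        cost = len(text.split())  # crude stand-in for a token count
        if used + cost > budget:
            continue  # this passage would overflow the window, so skip it
        selected.append(text)
        used += cost
    return selected, used


def build_prompt(question, passages, budget=50):
    """Assemble a prompt from the question and the passages that fit."""
    selected, _ = pack_context(passages, budget)
    context = "\n".join(f"- {p}" for p in selected)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    ranked = [
        "Invoices are approved by the finance team.",
        "Approval thresholds are reviewed every quarter by the controller.",
        "Refunds take five days.",
    ]
    print(build_prompt("Who approves invoices?", ranked, budget=12))
```

The trade-off is visible in the budget parameter: a smaller budget costs less per request, but lower-ranked passages are left out no matter how relevant they might have been.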

About Eamon McCarthy Earls

As the assistant editor at MSDynamicsWorld.com, Eamon helps to oversee editorial content on the site and supports site management and strategy. He can be reached at eearls@msdynamicsworld.com.

Before joining MSDynamicsWorld.com, Eamon was editor for SearchNetworking.com at TechTarget, where he covered networking technology, IoT, and cybersecurity. He is also the author of multiple books and previously contributed to publications such as the Boston Globe, Milford Daily News, and DefenceWeb.