Top legal tech trend prediction for 2025: Data, data and more data management
Firms that prioritise their data strategy will be best placed to support and derive value from their AI initiatives in 2025, writes Andrew Lindsay, general manager at LexisNexis Enterprise Solutions
2024 marked the year that generative AI (genAI) adoption in the industry was proven a certainty. Even the staunchest sceptics are now recognising that genAI is here to stay in legal. But was it also the year the AI ‘hype bubble’ burst?
The initial excitement and DIY approach to AI has been trumped by the need to demonstrate tangible ROI. The Boston Consulting Group recently reported that 74% of companies struggle to achieve and scale value when adopting AI, and legal providers are taking note, only considering investing in AI solutions that deliver demonstrable business value to their firm.
Data strategy and management should be the number one priority for law firms
It’s safe to say that a robust data strategy for continuous and thorough data management should be the number one focus for firms in 2025. There is now widespread recognition that the promise of AI, and more specifically genAI, relies almost entirely on the quality and integrity of the data that the large language models (LLMs) are fed.
Research shows that technology companies are predicted to exhaust publicly available data for LLM training as early as 2026. So, here’s the rub, especially for firms tempted to buy ‘sexy’ systems that say ‘AI’ on the tin as a shortcut to AI adoption — firms need to start thinking about their own data source and how to glean value from it, before even attempting to gain efficiencies from AI exploiting internal knowledge.
Let’s face it, no firm can deliver better advice and lawyering simply by using tools built on publicly available data (eg, ChatGPT). But by using their own data to train their LLM? Now, that’s when real value can be derived! A legal practice’s invaluable knowledge and expertise resides in the data held within its systems, so it makes sense that truly extracting value from AI technology depends on using the firm’s own data warehouse to power it, in addition, of course, to external private and proprietary sources of data whose quality, reliability and integrity are proven.
Untangling data: a messy affair
However, if we’re honest, many lawyers’ data houses are not necessarily in ‘order’, and solving the data management problem can undoubtedly be difficult and messy. Admittedly, such projects are not going to be exciting, but they are essential, not only because of the promise AI holds for the future, but also because of the increasing client and legislative demands weighing on modern legal practitioners. Firms are therefore better off having a well-developed data strategy that is already ‘well on the way’ by the time they implement AI, rather than treating data cleansing and management as tomorrow’s job.
The garage analogy is a fitting comparison. Picture a garage that for the better part of its existence has been filled with ‘stuff’, the door lowered to hide its clutter. Sorting through it, deciding what to discard or retain, and then organising the useful ‘stuff’ is a daunting, drawn-out and potentially painful exercise. What are the decision-making criteria for what to keep or throw away? Should the whole garage be organised in one go, or should the process be staggered? However arduous, it is better to consider these factors before decluttering to make sure nothing of value is lost. With those questions answered, a plan of action can be determined, and the garage will finally be in a fit state to house the shiny new car and to qualify for the lower insurance premium that comes with keeping it in a locked garage. A win-win!
Considerations for data management
Today, firms have disparate, disorganised and duplicated (even triplicated) data across various formats — Word, Excel, Outlook, PMS/CMS/CRM systems, and more. A key reason for this is that in most firms, digital transformation has occurred gradually over the years, often by converting hard copies into digital files.
A careful process of identifying the best, most representative documents to use for training, rather than just feeding all available documents into the model, is crucial. Given the colossal volume of data residing in firms, they must conceptualise and build a data framework to collect, store, curate and manage that data so that every piece of it is held only once. This data normalisation is important to ensure data quality and integrity.
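To make the idea of holding every piece of data only once more concrete, here is a minimal Python sketch that flags duplicate documents by hashing their text after light normalisation. It is an illustration under stated assumptions rather than a recommended implementation: the file locations, sample text and function names are hypothetical, and exact-match hashing will not catch near-duplicates such as lightly edited versions.

```python
import hashlib
import re


def fingerprint(text: str) -> str:
    """Hash a document's text after light normalisation (case and whitespace)."""
    normalised = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def deduplicate(documents: dict[str, str]) -> tuple[dict[str, str], dict[str, str]]:
    """Split documents into first copies and redundant copies.

    Returns (kept, duplicates). `kept` maps location -> text for the first copy
    seen; `duplicates` maps each redundant location -> the location that was kept.
    """
    first_seen: dict[str, str] = {}  # fingerprint -> location of the kept copy
    kept: dict[str, str] = {}
    duplicates: dict[str, str] = {}
    for location, text in documents.items():
        digest = fingerprint(text)
        if digest in first_seen:
            duplicates[location] = first_seen[digest]
        else:
            first_seen[digest] = location
            kept[location] = text
    return kept, duplicates


# Hypothetical sample data: the same lease saved in the DMS and as an email attachment.
documents = {
    "dms/matter-001/lease.docx": "This Lease is made between the parties.",
    "email/attachments/lease (1).docx": "This  lease is made between the parties.\n",
    "dms/matter-002/nda.docx": "This Agreement protects confidential information.",
}
kept, duplicates = deduplicate(documents)
```

In this sample, the email attachment is reported as a duplicate of the copy held in the document management system, leaving two unique documents. A real project would extract text from Word, Excel and Outlook files first and would probably need fuzzy matching on top of this.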
Inevitably, some files will be drafts, outdated versions, or not representative of the firm’s best practices. If these lower-quality documents are used to train the LLM, the model could produce biased or inaccurate outputs. Firms must therefore determine which data is trustworthy and appropriate for training AI models. The firm’s data strategy must be driven by the business need and its timeline for AI adoption. Of course, the ultimate vision has to be a carefully cleansed and automatically managed data environment, but realistically this goal cannot be realised in one day. Data strategy and management projects can take anywhere from two to five years to deliver full ROI. In the interim, firms must decide what data they need immediately for training the AI models so that the AI adoption vision can progress.
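One way to picture this interim triage is a simple metadata filter that sets aside drafts, superseded versions and documents that have not been reviewed recently. The Python sketch below is illustrative only: the record fields (`status`, `superseded`, `last_reviewed`), the sample documents and the cutoff date are all assumptions, and a firm’s real criteria would rest on lawyers’ judgement rather than metadata alone.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DocumentRecord:
    name: str
    status: str          # assumed values: "draft" or "final"
    superseded: bool     # True if a newer version of the document exists
    last_reviewed: date  # when a lawyer last confirmed the content


def is_training_candidate(doc: DocumentRecord, cutoff: date) -> bool:
    """Keep only final, current documents reviewed on or after the cutoff date."""
    return doc.status == "final" and not doc.superseded and doc.last_reviewed >= cutoff


# Hypothetical records standing in for a firm's document metadata.
records = [
    DocumentRecord("share-purchase-agreement-v7", "final", False, date(2024, 3, 1)),
    DocumentRecord("share-purchase-agreement-v6", "final", True, date(2023, 9, 12)),
    DocumentRecord("employment-contract-draft", "draft", False, date(2024, 5, 20)),
    DocumentRecord("lease-precedent-2015", "final", False, date(2015, 6, 30)),
]

candidates = [
    doc.name for doc in records if is_training_candidate(doc, cutoff=date(2020, 1, 1))
]
```

Here only the current, final share purchase agreement passes the filter. The value of a rule like this is less in the code than in forcing the firm to write down what ‘trustworthy’ means for its own documents.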
Part of the data strategy must also be determining, or even taking a stand on, who actually owns the data that will be used to train the LLMs: the firm or its clients? Some food for thought: a client instructs the firm to act on its behalf. The firm uses its knowledge, expertise and experience to handle the legal case, delivering an outcome for the client. Thereafter, the firm anonymises the client files and data, removing client-specific information to the extent that it is not possible to identify the client. The experience accumulated through thousands of such cases and files (in the form of data) is invaluable and, if fed to the LLM, will improve the firm’s future legal service delivery. Nonetheless, it is important to define the intended use of this client data from the get-go and to ensure the client is informed, even if the data is anonymised, so that no compliance issues arise.
In summary, firms that prioritise the development and execution of a realistic and comprehensive data strategy are the ones that will be best placed to support and derive value from their AI initiatives in 2025.
As the AI market matures, partners and business owners will be expecting to see real, tangible return for their continued commitment, creating ‘AI hope’ as opposed to ‘AI hype’, proving the value that AI holds for the future of law… and the business.