Chatting with GPT
The internet’s new favourite AI tool is quickly changing legal tech.
Scott Stevenson, CEO of Rally, and co-founder Daniel Di Maria can hardly keep up with demand. Their AI contract tool, Spellbook, has caught the attention of lawyers as far away as Germany and Singapore. The software works as a Microsoft Word plug-in and uses GPT-3.5 to draft clauses and create plain-language explanations that can be sent to clients. Lawyers can also ask questions through its ChatGPT function. The high demand is prompting the pair to raise another round of investment to hire more employees.
"ChatGPT has been good for us," says Stevenson. "We focus on user experience for lawyers. Previous tech was not very good and can interrupt the flow of work. We wanted to keep it simple. It's hard to make tools that stick in this market."
ChatGPT has taken over the conversation in the tech industry. It's rare for a tech phenomenon to capture the hearts and imagination of lawyers and the public at the same time. When OpenAI launched ChatGPT last November, more than one million users signed up within the first week. It now has more than 100 million users. On February 1, the company announced ChatGPT Plus, a US$20 monthly subscription for U.S. users. The race is on for educators, employers and lawyers to adapt to this new technology.
While many lawyers are hearing about ChatGPT for the first time, the underlying technology has been on the market for years. OpenAI launched the first version of GPT in 2018. The latest, GPT-3.5, is one of many AI models that tech startups can access through application programming interfaces (APIs) to build their own platforms.
Startups and software engineers have been quietly building and using the tech for years. Colin Lachance, CEO of Jurisage AI, has worked with GPT to create a more efficient way to classify and summarize case law.
Machine learning models normally need a training set of data to learn to identify and classify information. Because GPT has already been trained on language, it can perform this task without one. According to Lachance, GPT may be able to perform with 90 to 95 percent accuracy, making the work cheaper than other models.
"Let's say you wanted to train software to identify criminal cases," he says. "You can get a law student to do it and then rely on machine learning to do pattern matching based on the law student's work. With language models like GPT, I don't need the student to do the initial work. The machine will go through the cases, and the law student will check its work, with the results being the training set for a fine-tuned machine learning model."
Lachance hopes to use ChatGPT as a way for users to have a "conversation" with specific cases. "Using the Jurisage dashboard, we'll let you explore topics in a conversational mode," he says. "And precise language won't be necessary. For example, asking about either vicarious liability or contributory negligence may surface both, and their relationship to each other in the case."
What makes ChatGPT so intriguing is that it appears to answer any question. Researchers Daniel Martin Katz and Michael James Bommarito decided to run a test: have ChatGPT complete the multistate multiple-choice section of the bar exam from the U.S. National Conference of Bar Examiners. Their paper, "GPT Takes the Bar Exam," documents how the tool has improved over the years. ChatGPT passed the test, answering 68% of the questions correctly, an improvement on GPT-3's 35%. GPT-2 wasn't even able to understand the questions.
"It was hard to communicate about AI and now people can see it render a poem or document," says Katz, a law professor at Illinois Tech - Chicago Kent Law. "We created a simple way to demonstrate GPT's increased capacity in understanding not just general language but legal language as well. The bar exam was a way to do that."
The paper went viral, even prompting The Onion to mock ChatGPT for being forced to forgo its dreams of being an AI art bot to pursue the law.
For Katz and other researchers, the real question is whether we've reached a turning point in AI development. A longstanding debate is whether any AI program will pass the Turing test, which Alan Turing devised in 1950: if humans cannot tell whether they are conversing with a program or a person, the program passes as an intelligent, thinking entity. "The public can see this technology broadly is beginning to cross the Rubicon," says Katz, adding that the breakthrough is machines being able to understand nuance in language.
"Legal complexity is increasing and lawyers need to offset that with AI tools," says Katz. "Some types of work can be mechanized, but we still need humans to be a part of the oversight. People should work collaboratively with these tools."
Fact or hallucination?
Ask ChatGPT about anything and eventually you'll learn something, only to find out it isn't real.
When ChatGPT gives you a fake answer, it's known as a "hallucination." AI researchers borrowed the term from cognitive science and psychology, where it describes perceiving something that did not occur, even though the person experiencing it believes it's real. An AI hallucination is a fabricated output, such as a fake court case, constructed from the data the model has absorbed from the internet and other sources.
Leslie McCallum is working hard to prevent her AI tool from hallucinating. Her company, Lexata, makes an AI research tool for navigating the complex and technical language of U.S. and Canadian securities regulation. She has been using GPT-3 since 2021, experimenting with it in OpenAI's "playground" interface for a couple of years. Using statistics and identifying the right inputs through prompt engineering, her goal is to avoid "garbage in, garbage out."
"GPT-3 is one part of Lexata's natural language processing workflow," says McCallum. "Lexata is building something that is "high quality in, high quality out. It's very simple to use — no training is required, unlike most legal technology research tools. We plan to cover all areas of Canadian and U.S. securities law and have our sights set on U.K. law and global climate disclosure rules too."
The key to avoiding AI hallucinations is understanding how the tool is built on the backend. Some answers should always remain the same, like the colour of Santa's suit. Developers make design decisions that affect how likely the model is to hallucinate.
"The playground facilitates trial and error," says McCallum. "For example, there's a parameter called temperature. If the temperature is set close to 0, you'll get statistically safer responses. When the temperature is set higher, the system will take more risk with language, so you're more likely to get creative output but also a factually inaccurate response."
That risk is making some people hesitant to use the software. In response to criticism, OpenAI launched AI Text Classifier to help educators detect whether students used AI to complete homework. McCallum believes AI tools can't be foolproof and that there needs to be human oversight.
"As human beings, we cannot abandon responsibility for what we are creating," says McCallum. "AI doesn't exist without humans building it. Even if you ask ChatGPT whether it can think for itself and it replies, 'Yes, I certainly can, and I'm highly intelligent,' don't be fooled. It cannot. Humans built it to make it behave as human-like as possible. But it's still just 1s and 0s."
Rally's co-founders also have this issue on their radar. They're working on a tool that lets clients train the AI to draft and explain contracts using their own templates.
"Lawyers want guardrails, so they don't get hallucinations," says Stevenson. "We have legal professionals who want to use their own historical documents to train the software. Eventually, we want to go to the full matter level. We want clients to be able to upload minute books and the AI to create shareholder agreements and other documents automatically. That way, you can ask a question like who are the shareholders or what are the ten riskiest contracts we did this year."
The future of GPT
It's still anyone's guess how fast the technology will evolve. Katz, along with other researchers, recently released a paper studying the growing prevalence of AI tools and how general and specialized tools perform in the market.
"Small models are trained on specific topics and are competing with big, general models," says Katz. "But if there are big specialized models, they will likely beat the general, big models. It's about integrating general knowledge with domain knowledge. The sweet spot is depth and breadth of knowledge. Lawyers have three layers of knowledge — hyper task-specific knowledge, general legal knowledge, and overall general knowledge. The key is to connect the hyper-specific knowledge with the other two."
The battle for GPT supremacy has already begun. Microsoft recently announced a multibillion-dollar investment in OpenAI and launched a paid subscription service for Microsoft Teams that uses ChatGPT.
"This will be the most exciting year we've had in tech in a long time," says Katz. "OpenAI got to market and had one of the greatest demo rollouts I've ever seen." But that won't go unanswered, he says. Google is about to release its own chatbots. Amazon, Meta and IBM will all have their own versions, too. "There are lots of choices you make when creating a model, so they will have different capabilities."
We'll just have to wait and see where ChatGPT takes the conversation next.