
Simplifying the statistics behind the Elusion Test

03 August 2021 | Written by Altlaw

The Elusion Test is a validation test run at the stabilisation point of a Technology Assisted Review (TAR). It estimates how accurately your active learning algorithm has identified relevant documents. There are two types of Elusion Test, so you can run a test specific to your data and desired outcomes:

• Fixed Testing – A test sample of a specified size is created.

• Statistical Testing – A random test sample is created, with the sample size determined by a given confidence level and margin of error.
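
To make the link between confidence level, margin of error and sample size concrete, here is a minimal Python sketch. It assumes the standard normal-approximation formula for a proportion with a finite population correction. The article does not say which formula any particular review platform uses, so the function name `sample_size`, its parameters and the default `expected_rate` of 0.5 (the most conservative choice) are illustrative assumptions only.

```python
import math
from statistics import NormalDist


def sample_size(population: int, confidence: float, margin_of_error: float,
                expected_rate: float = 0.5) -> int:
    """Estimate how many documents to sample from a discard pile.

    Hypothetical helper: uses the normal approximation for a proportion,
    which may differ from the method your review platform applies.
    """
    # Two-sided z-score for the requested confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size needed if the discard pile were effectively infinite.
    n_infinite = (z ** 2) * expected_rate * (1 - expected_rate) / margin_of_error ** 2
    # Finite population correction: smaller piles need proportionally fewer documents.
    n_finite = n_infinite / (1 + (n_infinite - 1) / population)
    # Round up, and never ask for more documents than the pile contains.
    return min(population, math.ceil(n_finite))


if __name__ == "__main__":
    for confidence in (0.90, 0.95, 0.99):
        print(confidence, sample_size(10_000, confidence, 0.05))
```

Under these assumptions, a discard pile of 10,000 documents at a 95% confidence level and a 5% margin of error calls for a sample of 370 documents. Raising the confidence level or tightening the margin of error increases that number.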

Regardless of which test you choose, the confidence level and margin of error are crucial to interpreting the results of your Elusion Test correctly, yet their meaning is not always clear. To understand what they refer to in the case of an Elusion Test, we must first understand that the test is run on a sample of documents taken from the 'discard pile' created by the active learning algorithm. The discard pile is made up of all the documents to which the algorithm has assigned a relevancy score below your desired cutoff. None of these documents has been reviewed, so we as reviewers have no idea what the document landscape of this pile looks like. This is where confidence level and margin of error come into play.


Figure 1: A random sample of documents is taken from the discard pile to create the sample set on which the Elusion Test is run.

Confidence level is the probability, expressed as a percentage, that the sample on which the Elusion Test was run is an accurate reflection of the entire discard pile. Simply put, if we had a discard pile of 10,000 documents and test reviewed only 100, we would have tested just 1% of the document pool, so we would be unlikely to have an accurate representation of the pool as a whole. If, on the other hand, we test reviewed 5,000 documents, we would have tested 50% of the documents and would be much more likely to understand the document landscape.

Note: these percentages are NOT the confidence level itself!
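
The effect of sample size described above can be illustrated with a small simulation. The sketch below builds a hypothetical discard pile of 10,000 documents, of which 200 (2%) are relevant, then repeatedly draws samples of 100 and of 5,000 documents and measures how much the observed rate of relevant documents varies between draws. The 2% figure, the number of trials and the function names are assumptions made purely for illustration. They do not come from any real review.

```python
import random
import statistics


def spread_of_sample_rates(pile, sample_size, trials, rng):
    """Standard deviation of the relevant-document rate across repeated samples."""
    rates = []
    for _ in range(trials):
        # Draw documents at random, without replacement, as a test review would.
        sample = rng.sample(pile, sample_size)
        rates.append(sum(sample) / sample_size)
    return statistics.pstdev(rates)


def compare_sample_sizes(seed: int = 0):
    """Compare how stable the estimate is for samples of 100 and 5,000 documents."""
    rng = random.Random(seed)
    # Hypothetical discard pile: 1 marks a relevant document, 0 a non-relevant one.
    pile = [1] * 200 + [0] * 9_800
    small = spread_of_sample_rates(pile, 100, trials=200, rng=rng)
    large = spread_of_sample_rates(pile, 5_000, trials=200, rng=rng)
    return small, large


if __name__ == "__main__":
    small, large = compare_sample_sizes()
    print(f"spread with 100 documents:   {small:.4f}")
    print(f"spread with 5,000 documents: {large:.4f}")
```

The smaller the spread, the more closely any single sample is likely to mirror the pile as a whole, which is why the larger sample gives a more trustworthy picture.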

As an example, a 95% confidence level means there is a 95% likelihood that the actual number of eluded documents falls within the predicted range given by the Elusion Test. The higher the confidence level you need, the more documents you will have to review and the longer the Elusion Test will take.
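
As a rough illustration of how such a predicted range can be derived, the sketch below turns the result of a test review into an estimated range of eluded documents. It assumes a Wilson score interval, a common choice when the proportion being measured is small. The article does not state which interval method an Elusion Test uses, so the function `elusion_estimate` and its parameters are hypothetical, and your platform's figures may differ.

```python
import math
from statistics import NormalDist


def elusion_estimate(discard_pile_size: int, sample_size: int,
                     relevant_in_sample: int, confidence: float = 0.95) -> dict:
    """Estimate the range of relevant documents left in the discard pile.

    Hypothetical helper using a Wilson score interval (an assumption,
    not necessarily the method used by any given review platform).
    """
    # Two-sided z-score for the requested confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Observed elusion rate: share of sampled documents found to be relevant.
    rate = relevant_in_sample / sample_size
    denominator = 1 + z ** 2 / sample_size
    centre = (rate + z ** 2 / (2 * sample_size)) / denominator
    half_width = z * math.sqrt(
        rate * (1 - rate) / sample_size + z ** 2 / (4 * sample_size ** 2)
    ) / denominator
    low = max(0.0, centre - half_width)
    high = min(1.0, centre + half_width)
    return {
        "elusion_rate": rate,
        # Scale the rates up to the whole discard pile, rounding the bounds outwards.
        "estimated_docs": round(rate * discard_pile_size),
        "low_docs": math.floor(low * discard_pile_size),
        "high_docs": math.ceil(high * discard_pile_size),
    }


if __name__ == "__main__":
    # Example: 4 relevant documents found in a sample of 370 from a pile of 10,000.
    print(elusion_estimate(10_000, 370, 4))
```

Calling the function with a higher `confidence` value widens the range for the same sample. To keep the range narrow at a higher confidence level, more documents have to be reviewed.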
