Code To Tune or Not To Tune? Zero-shot Models for Legal Case Entailment.

Related tags

Deep Learningcoliee
Overview

COLIEE 2021 - task 2: Legal Case Entailment

This repository contains the code to reproduce NeuralMind's submissions to COLIEE 2021 presented in the paper To Tune or Not To Tune? Zero-shot Models for Legal Case Entailment. There has been mounting evidence that pretrained language models fine-tuned on large and diverse supervised datasets can transfer well to a variety of out-of-domain tasks. In this work, we investigate this transfer ability to the legal domain. For that, we participated in the legal case entailment task of COLIEE 2021, in which we use such models with no adaptations to the target domain. Our submissions achieved the highest scores, surpassing the second-best submission by more than six percentage points. Our experiments confirm a counter-intuitive result in the new paradigm of pretrained language models: that given limited labeled data, models with little or no adaption to the target task can be more robust to changes in the data distribution and perform better on held-out datasets than models fine-tuned on it.

Models

monoT5-zero-shot: We use a model T5 Large fine-tuned on MS MARCO, a dataset of approximately 530k query and relevant passage pairs. We use a checkpoint available at Huggingface’smodel hub that was trained with a learning rate of 10−3 using batches of 128 examples for 10k steps, or approximately one epoch of the MS MARCO dataset. In each batch, a roughly equal number of positive and negative examples are sampled.

monoT5: We further fine-tune monoT5-zero-shot on the COLIEE 2020 training set following a similar training procedure described for monoT5-zero-shot. The model is fine-tuned with a learning rate of 10−3 for 80 steps using batches of size 128, which corresponds to 20 epochs. Each batch has the same number of positive and negative examples.

DeBERTa: Decoding-enhanced BERT with disentangled attention(DeBERTa) improves on the original BERT and RoBERTa architectures by introducing two techniques: the disentangled attention mechanism and an enhanced mask decoder. Both improvements seek to introduce positional information to the pretraining procedure, both in terms of the absolute position of a token and the relative position between them. We fine-tune DeBERTa on the COLIEE 2020 training set following a similar training procedure described for monoT5.

DebertaT5 (Ensemble): We use the following method to combine the predictions of monoT5 and DeBERTa (both fine-tuned on COLIEE 2020 dataset): We concatenate the final set of paragraphs selected by each model and remove duplicates, preserving the highest score. It is important to note that our method does not combine scores between models. The final answer for each test example is composed of individual answers from one or both models. It ensures that only answers with a certain degree of confidence are maintained, which generally leads to an increase in precision.

Results

Model Train data Evaluation F1 Description
Median of submissions Coliee 58.60
Coliee 2nd best team Coliee 62.74
DeBERTa (ours) Coliee Coliee 63.39 Single model
monoT5 (ours) Coliee Coliee 66.10 Single model
monoT5-zero-shot (ours) MS Marco Coliee 68.72 Single model
DebertaT5 (ours) Coliee Coliee 69.12 Ensemble

In this table, we present the results. Our main finding is that our zero-shot model achieved the best result of a single model on 2021 test data, outperforming DeBERTa and monoT5, which were fine-tuned on the COLIEE dataset. As far as we know, this is the first time that a zero-shot model outperforms fine-tuned models in the task of legal case entailment. Given limited annotated data for fine-tuning and a held-out test data, such as the COLIEE dataset, our results suggest that a zero-shot model fine-tuned on a large out-of-domain dataset may be more robust to changes in data distribution and may generalize better on unseen data than models fine-tuned on a small domain-specific dataset. Moreover, our ensemble method effectively combines DeBERTa and monoT5 predictions,achieving the best score among all submissions (row 6). It is important to note that despite the performance of DebertaT5 being the best in the COLIEE competition, the ensemble method requires training time, computational resources and perhaps also data augmentation to perform well on the task, while monoT5-zero-shot does not need any adaptation. The model is available online and ready to use.

Conclusion

Based on those results, we question the common assumption that it is necessary to have labeled training data on the target domain to perform well on a task. Our results suggest that fine-tuning on a large labeled dataset may be enough.

How do I get the dataset?

Those who wish to use previous COLIEE data for a trial, please contact rabelo(at)ualberta.ca.

How do I evaluate?

As our best model is a zero-shot one, we provide only the evaluation script.

References

[1] Document Ranking with a Pretrained Sequence-to-Sequence Model

[2] DeBERTa: Decoding-enhanced BERT with Disentangled Attention

[3] ICAIL '21: Proceedings of the Eighteenth International Conference on Artificial Intelligence and Law

[4] Proceedings of the Eigth International Competition on Legal Information Extraction/Entailment

How do I cite this work?

 @article{to_tune,
    title={To Tune or Not To Tune? Zero-shot Models for Legal Case Entailment},
    author={Moraes, Guilherme and Rodrigues, Ruan and Lotufo, Roberto and Nogueira, Rodrigo},
    journal={ICAIL '21: Proceedings of the Eighteenth International Conference on Artificial Intelligence and Law June 2021 Pages 295–300},
    url={https://dl.acm.org/doi/10.1145/3462757.3466103},
    year={2021}
}
Owner
NeuralMind
Deep Learning for NLP and image processing
NeuralMind
Exploring Relational Context for Multi-Task Dense Prediction [ICCV 2021]

Adaptive Task-Relational Context (ATRC) This repository provides source code for the ICCV 2021 paper Exploring Relational Context for Multi-Task Dense

David Brüggemann 35 Dec 05, 2022
Its a Plant Leaf Disease Detection System based on Machine Learning.

My_Project_Code Its a Plant Leaf Disease Detection System based on Machine Learning. I have used Tomato Leaves Dataset from kaggle. This system detect

Sanskriti Sidola 3 Jun 15, 2022
The official implementation of NeurIPS 2021 paper: Finding Optimal Tangent Points for Reducing Distortions of Hard-label Attacks

Introduction This repository includes the source code for "Finding Optimal Tangent Points for Reducing Distortions of Hard-label Attacks", which is pu

machen 11 Nov 27, 2022
Educational 2D SLAM implementation based on ICP and Pose Graph

slam-playground Educational 2D SLAM implementation based on ICP and Pose Graph How to use: Use keyboard arrow keys to navigate robot. Press 'r' to vie

Kirill 19 Dec 17, 2022
ECAENet (TensorFlow and Keras)

ECAENet: EfficientNet with Efficient Channel Attention for Plant Species Recognition (SCI:Q3) (Journal of Intelligent & Fuzzy Systems)

4 Dec 22, 2022
Official Python implementation of the 'Sparse deconvolution'-v0.3.0

Sparse deconvolution Python v0.3.0 Official Python implementation of the 'Sparse deconvolution', and the CPU (NumPy) and GPU (CuPy) calculation backen

Weisong Zhao 23 Dec 28, 2022
Official code for Next Check-ins Prediction via History and Friendship on Location-Based Social Networks (MDM 2018)

MUC Next Check-ins Prediction via History and Friendship on Location-Based Social Networks (MDM 2018) Performance Details for Accuracy: | Dataset

Yijun Su 3 Oct 09, 2022
Tensorflow implementation of Semi-supervised Sequence Learning (https://arxiv.org/abs/1511.01432)

Transfer Learning for Text Classification with Tensorflow Tensorflow implementation of Semi-supervised Sequence Learning(https://arxiv.org/abs/1511.01

DONGJUN LEE 82 Oct 22, 2022
PyTorch implementation for Stochastic Fine-grained Labeling of Multi-state Sign Glosses for Continuous Sign Language Recognition.

Stochastic CSLR This is the PyTorch implementation for the ECCV 2020 paper: Stochastic Fine-grained Labeling of Multi-state Sign Glosses for Continuou

Zhe Niu 28 Dec 19, 2022
Research code of ICCV 2021 paper "Mesh Graphormer"

MeshGraphormer ✨ ✨ This is our research code of Mesh Graphormer. Mesh Graphormer is a new transformer-based method for human pose and mesh reconsructi

Microsoft 251 Jan 08, 2023
Code for HodgeNet: Learning Spectral Geometry on Triangle Meshes, in SIGGRAPH 2021.

HodgeNet | Webpage | Paper | Video HodgeNet: Learning Spectral Geometry on Triangle Meshes Dmitriy Smirnov, Justin Solomon SIGGRAPH 2021 Set-up To ins

Dima Smirnov 61 Nov 27, 2022
MLJetReconstruction - using machine learning to reconstruct jets for CMS

MLJetReconstruction - using machine learning to reconstruct jets for CMS The C++ data extraction code used here was based heavily on that foundv here.

ALPhA Davidson 0 Nov 17, 2021
Enhancing Column Generation by a Machine-Learning-BasedPricing Heuristic for Graph Coloring

Enhancing Column Generation by a Machine-Learning-BasedPricing Heuristic for Graph Coloring (to appear at AAAI 2022) We propose a machine-learning-bas

YunzhuangS 2 May 02, 2022
A bare-bones TensorFlow framework for Bayesian deep learning and Gaussian process approximation

Aboleth A bare-bones TensorFlow framework for Bayesian deep learning and Gaussian process approximation [1] with stochastic gradient variational Bayes

Gradient Institute 127 Dec 12, 2022
Python calculations for the position of the sun and moon.

Astral This is 'astral' a Python module which calculates Times for various positions of the sun: dawn, sunrise, solar noon, sunset, dusk, solar elevat

Simon Kennedy 169 Dec 20, 2022
Certis - Certis, A High-Quality Backtesting Engine

Certis - Backtesting For y'all Certis is a powerful, lightweight, simple backtes

Yeachan-Heo 46 Oct 30, 2022
Readings for "A Unified View of Relational Deep Learning for Polypharmacy Side Effect, Combination Therapy, and Drug-Drug Interaction Prediction."

Polypharmacy - DDI - Synergy Survey The Survey Paper This repository accompanies our survey paper A Unified View of Relational Deep Learning for Polyp

AstraZeneca 79 Jan 05, 2023
FlowTorch is a PyTorch library for learning and sampling from complex probability distributions using a class of methods called Normalizing Flows

FlowTorch is a PyTorch library for learning and sampling from complex probability distributions using a class of methods called Normalizing Flows.

Meta Incubator 272 Jan 02, 2023
The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.

The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate. Website • Key Features • How To Use • Docs •

Pytorch Lightning 21.1k Jan 01, 2023
Collapse by Conditioning: Training Class-conditional GANs with Limited Data

Collapse by Conditioning: Training Class-conditional GANs with Limited Data Moha

Mohamad Shahbazi 33 Dec 06, 2022