Fine-Tuning DeBERTa for Named Entity Recognition (NER)
In this article, we'll briefly describe how to fine-tune a DeBERTa base model on a Named Entity Recognition (NER) task.
If you would like to jump straight to the code, you can find it in this repository. If you want to download the trained model and play with it, you can do so on the model card page.
Context
Here we fine-tune an NLP model to perform Named Entity Recognition, that is, the task of identifying the named entities mentioned in a sentence. This task can be framed as token classification, where the model classifies each token into different classes such as Location, Person, Organization…
Here we will use a pretrained transformer-based model, namely DeBERTa, and fine-tune it on a token classification task using the conll2003 dataset. This dataset contains tokenized sentences along with the entity labels assigned to the tokens.
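As an illustration, here is a minimal sketch (assuming the Hugging Face datasets package) of what loading the dataset and inspecting one training example looks like; each example stores the words of a sentence together with the entity class id assigned to each word:

# a sketch: load the conll2003 dataset and look at one training example
from datasets import load_dataset

raw_datasets = load_dataset("conll2003")
example = raw_datasets["train"][0]
print(example["tokens"])     # the words of the sentence
print(example["ner_tags"])   # the entity class id assigned to each word
# mapping from class ids to label names (O, B-PER, I-PER, ...)
print(raw_datasets["train"].features["ner_tags"].feature.names)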
The base DeBERTa model is a transformer-based model from Microsoft that builds on BERT and RoBERTa with two changes to the attention mechanism: disentangled attention and an enhanced mask decoder. It outperforms BERT and RoBERTa on a majority of NLU tasks.
Set up
To fine-tune this model on the conll2003 dataset, we used Google Colaboratory, as it provides Jupyter notebooks with free GPU computation.
Fine-tuning
Once the Colaboratory notebook is set up, here are the different steps to follow:
- Install the transformers and datasets packages from Hugging Face
- Load the conll2003 dataset, which contains the training instances for the NER task
- Instantiate the tokenizer for the deberta-base model; we used transformers.AutoTokenizer
- Tokenize the dataset, align the labels and pad the sentences, as the tokens produced by the DeBERTa tokenizer are not the same as the ones present in the conll2003 dataset (a sketch of this step is given after this list)
- Define mappings between entity type ids and their labels (B-LOC, I-LOC, B-PER, I-PER…)
- Define the training parameters with transformers.TrainingArguments; we kept the parameter values used in the Token Classification chapter of the Hugging Face course (a sketch of the training setup also follows this list)
- Train the model using transformers.Trainer
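To make the tokenization and label-alignment step more concrete, here is a minimal sketch adapted from the approach shown in the Hugging Face Token Classification course. It assumes the raw_datasets object loaded earlier; padding is handled later by the data collator, and the add_prefix_space argument passed to the tokenizer is explained below:

# a sketch of tokenization and label alignment for the deberta tokenizer
from transformers import AutoTokenizer

model_tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-base", add_prefix_space=True)

def align_labels_with_tokens(labels, word_ids):
    new_labels = []
    current_word = None
    for word_id in word_ids:
        if word_id is None:
            # special tokens get the ignore index
            new_labels.append(-100)
        elif word_id != current_word:
            # first sub-token of a word keeps the word's label
            current_word = word_id
            new_labels.append(labels[word_id])
        else:
            # following sub-tokens of the same word: turn B-XXX into I-XXX
            label = labels[word_id]
            new_labels.append(label + 1 if label % 2 == 1 else label)
    return new_labels

def tokenize_and_align_labels(examples):
    tokenized_inputs = model_tokenizer(
        examples["tokens"], truncation=True, is_split_into_words=True
    )
    tokenized_inputs["labels"] = [
        align_labels_with_tokens(labels, tokenized_inputs.word_ids(i))
        for i, labels in enumerate(examples["ner_tags"])
    ]
    return tokenized_inputs

tokenized_datasets = raw_datasets.map(
    tokenize_and_align_labels,
    batched=True,
    remove_columns=raw_datasets["train"].column_names,
)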
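And here is a sketch of the label mappings and the training setup. The parameter values come from the Hugging Face course; the output directory name is hypothetical, and the seqeval metric computation from the course is omitted for brevity:

# a sketch of the label mappings and the Trainer setup
from transformers import (
    AutoModelForTokenClassification,
    DataCollatorForTokenClassification,
    TrainingArguments,
    Trainer,
)

label_names = raw_datasets["train"].features["ner_tags"].feature.names
id2label = {i: name for i, name in enumerate(label_names)}
label2id = {name: i for i, name in enumerate(label_names)}

model = AutoModelForTokenClassification.from_pretrained(
    "microsoft/deberta-base", id2label=id2label, label2id=label2id
)

# dynamic padding of the tokenized sentences and their labels
data_collator = DataCollatorForTokenClassification(tokenizer=model_tokenizer)

# parameter values from the Hugging Face Token Classification course
args = TrainingArguments(
    "deberta-base-finetuned-ner",  # hypothetical output directory name
    evaluation_strategy="epoch",
    save_strategy="epoch",
    learning_rate=2e-5,
    num_train_epochs=3,
    weight_decay=0.01,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
    data_collator=data_collator,
    tokenizer=model_tokenizer,
)
trainer.train()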
These steps are highlighted by Markdown cell headers in the Colaboratory notebook. This process is mostly the same as the one presented in the excellent Hugging Face course on Token Classification, with the exception of one modification:
- When loading the tokenizer, we need to pass the add_prefix_space argument, as we would for a RoBERTa model.
# adding prefix space for the deberta model
from transformers import AutoTokenizer

model_name = "microsoft/deberta-base"
model_tokenizer = AutoTokenizer.from_pretrained(model_name, add_prefix_space=True)
Publishing/Inference demo
Once uploaded to the Hugging Face Hub, the model can be tried directly on its model card page. The next step will be to deploy the model behind an API, or to take the saved model and use it to run inference on some text to perform Named Entity Recognition.
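As a quick example, the fine-tuned model can be used for inference through the transformers pipeline API; the checkpoint name below is hypothetical and should be replaced by the path or Hub id of your trained model:

# a sketch of running NER inference with the fine-tuned model
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="path/to/deberta-base-finetuned-ner",  # hypothetical checkpoint
    aggregation_strategy="simple",  # group sub-tokens back into whole entities
)
print(ner("My name is Wolfgang and I live in Berlin."))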