TEXT NORMALIZATION AND SPELL CORRECTION OF PUNJABI TEXT

Authors

  • Jagdish Kaur
  • C P Kamboj

Keywords:

Tokenization, Normalization, Neural Networks, Transformer, Deep Learning

Abstract

Text normalization is the practice of mapping non-standard words into a standardized, canonical form. Training a Punjabi language model for a grammar checker is a tedious task because little correct Punjabi data is available. Data collected from different sources may contain noisy text, spelling errors, and unwanted text, all of which require normalization before the data are suitable for training a language model. In this paper we review various text normalization methods, including spelling correction, and present our framework for normalizing Punjabi text. We treat the normalization of Punjabi text as a neural machine translation problem. We propose a hybrid approach with two parts. The first is a deep learning-based encoder-decoder model, obtained by fine-tuning a transformer with a copy-input method, which performs text normalization and spelling correction of misspelled Punjabi words. The second is a statistical technique that can be used as pre-processing or post-processing to enhance the performance of the proposed model architecture. We trained and evaluated the proposed model on a prepared Punjabi parallel dataset of correct-incorrect word pairs. The experiments show that the proposed model achieves significant performance on various semiotic classes and outperforms other existing models in terms of accuracy.

Published

2025-06-28

How to Cite

Jagdish Kaur, & C P Kamboj. (2025). TEXT NORMALIZATION AND SPELL CORRECTION OF PUNJABI TEXT. Utilitas Mathematica, 122(1), 1498–1516. Retrieved from https://utilitasmathematica.com/index.php/Index/article/view/2380
