XLNet is a large autoregressive transformer trained on more data and with more compute than BERT, using an improved training objective. It outperforms BERT on a range of language-understanding benchmarks.
XLNet uses permutation language modelling, which predicts tokens in a random factorisation order. This lets the model learn bidirectional context without BERT-style masking. Its base architecture is Transformer-XL, whose segment-level recurrence helps the permutation-based training capture long-range dependencies.
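The core idea of permutation language modelling can be illustrated with a small sketch: sample a random factorisation order, then let each token attend only to tokens that come earlier in that order. This is a minimal, self-contained illustration of the masking pattern, not the paper's actual two-stream attention implementation; the function name and shapes are ours.

```python
import random

def permutation_attention_mask(seq_len, seed=0):
    """Illustrative mask for permutation language modelling.

    Sample a random factorisation order z over token positions; a token at
    position i may attend only to positions that appear before i in z.
    Returns (order, mask) where mask[i][j] is True if token i may see token j.
    """
    rng = random.Random(seed)
    order = list(range(seq_len))
    rng.shuffle(order)                       # random factorisation order z
    rank = {pos: t for t, pos in enumerate(order)}
    mask = [[rank[j] < rank[i] for j in range(seq_len)]
            for i in range(seq_len)]
    return order, mask

order, mask = permutation_attention_mask(5, seed=42)
```

Because the order is resampled across training examples, every position is eventually predicted from contexts on both its left and its right, which is how the model picks up bidirectional relationships while remaining autoregressive.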
- Project: XLNet
- Authors: Zhilin Yang, Zihang Dai, Yiming Yang, Jaime Carbonell, Ruslan Salakhutdinov, and Quoc V. Le
- Initial Release: 2019
- Type: NLP
- License: Apache License 2.0
- Contains: BERT-Large, XLNet-Base, and XLNet-Large
- Language: Python, Jupyter Notebook, Shell
- GitHub: /zihangdai/xlnet with 5.8k stars and 10 contributors
- Twitter: None
- Applications: Predicting words based on context