Description
The last layer's hidden state is not always the best representation of a text. In the literature, outputs from intermediate layers are also used to improve predictive performance.
Here is a notebook to get started: https://colab.research.google.com/drive/1mdodbRk6ayA6g0pCTWxoaQB3cJ8ecbjj?usp=sharing.
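
To make the request concrete, here is a minimal sketch of what combining layers could look like. It is not taken from the notebook: the helper names `mean_pool_tokens` and `combine_layers` are hypothetical, and it uses plain Python lists as a stand-in for the per-layer hidden states (for example, the tuple that Hugging Face models return when called with `output_hidden_states=True`).

```python
from typing import List, Sequence

Vector = List[float]
Layer = List[Vector]  # one vector per token, i.e. shape (seq_len, hidden)


def mean_pool_tokens(layer: Layer) -> Vector:
    """Average the token vectors of one layer into a single vector."""
    n_tokens = len(layer)
    hidden = len(layer[0])
    return [sum(tok[d] for tok in layer) / n_tokens for d in range(hidden)]


def combine_layers(
    hidden_states: Sequence[Layer],
    layer_ids: Sequence[int],
    mode: str = "mean",
) -> Vector:
    """Build a text representation from the selected layers.

    Hypothetical helper: each selected layer is mean-pooled over tokens,
    then the pooled vectors are either averaged or concatenated.
    """
    pooled = [mean_pool_tokens(hidden_states[i]) for i in layer_ids]
    if mode == "mean":
        return [sum(vals) / len(pooled) for vals in zip(*pooled)]
    if mode == "concat":
        return [x for vec in pooled for x in vec]
    raise ValueError(f"unknown mode: {mode!r}")


# Toy input (assumed data): 3 layers, 2 tokens, hidden size 2.
toy = [
    [[0.0, 0.0], [2.0, 2.0]],
    [[1.0, 3.0], [3.0, 5.0]],
    [[4.0, 0.0], [6.0, 2.0]],
]

last_layer_only = combine_layers(toy, [-1])
last_two_mean = combine_layers(toy, [-2, -1], mode="mean")
last_two_concat = combine_layers(toy, [-2, -1], mode="concat")
print(last_layer_only, last_two_mean, last_two_concat)
```

Which layers to select and whether to average or concatenate them are exactly the choices this issue would expose; the sketch only shows the shape of the computation.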