https://arxiv.org/abs/2307.02486
The "Scaling Transformers to 1,000,000,000 Tokens" (LongNet) paper, combined with this work, seems like a promising path toward effectively unbounded context length. FoT also feels related to L2P (Learning to Prompt), which maintains a pool of prompts to mitigate forgetting while applying continual learning to a model. Maybe the database of key/value pairs accessed via kNN could be blended with an L2P-style prompt pool, and LongNet's dilated-attention algorithm could likewise benefit from contrastive learning.
Thoughts?
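For concreteness, here is a minimal sketch (not the repository's actual implementation) of the kind of kNN-accessed key/value memory mentioned above: external (key, value) pairs sit in a flat store, and each query attends only to its top-k nearest keys. The class name `KVMemory` and all shapes are illustrative assumptions.

```python
# Hypothetical sketch of a kNN key/value memory; not the FoT codebase's API.
import torch
import torch.nn.functional as F


class KVMemory:
    """Flat store of past (key, value) pairs, queried by inner-product kNN."""

    def __init__(self, dim: int):
        self.keys = torch.empty(0, dim)
        self.values = torch.empty(0, dim)

    def add(self, keys: torch.Tensor, values: torch.Tensor) -> None:
        # Append (key, value) pairs, e.g. produced while processing earlier context.
        self.keys = torch.cat([self.keys, keys], dim=0)
        self.values = torch.cat([self.values, values], dim=0)

    def attend(self, queries: torch.Tensor, topk: int = 16) -> torch.Tensor:
        # Retrieve the top-k most similar stored keys per query, then attend to them only.
        scores = queries @ self.keys.T                      # (num_queries, num_stored)
        k = min(topk, self.keys.shape[0])
        top_scores, top_idx = scores.topk(k, dim=-1)        # (num_queries, k)
        weights = F.softmax(top_scores / queries.shape[-1] ** 0.5, dim=-1)
        retrieved = self.values[top_idx]                    # (num_queries, k, dim)
        return torch.einsum("qk,qkd->qd", weights, retrieved)


# Usage: store keys/values from earlier segments, then query with current tokens.
memory = KVMemory(dim=64)
memory.add(torch.randn(1024, 64), torch.randn(1024, 64))
out = memory.attend(torch.randn(8, 64), topk=16)            # (8, 64)
```

An L2P-style prompt pool could, in principle, be keyed by the same kNN lookup, which is what makes the two ideas feel compatible.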
Hi, thanks for your interest in our work! From my understanding of the LongNet paper, the main idea of FoT (training on negative examples while utilizing a longer context) and the dilated attention from LongNet are largely orthogonal, which would make combining the two methods an interesting research direction to explore!
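To illustrate why the two are orthogonal, below is a toy, single-head sketch of dilated attention under strong simplifying assumptions (one segment length, one dilation rate, no causal masking, no mixing across rates); it is not the LongNet implementation. It only changes which local positions attend to each other, while FoT's contrastive training and the kNN memory above concern the external keys/values, so the two could be stacked.

```python
# Hedged toy sketch of dilated attention; real LongNet mixes several
# (segment length, dilation) configurations and offsets them across heads.
import torch
import torch.nn.functional as F


def dilated_attention(q, k, v, segment_len: int = 64, dilation: int = 4):
    """q, k, v: (seq_len, dim); assumes seq_len is divisible by segment_len."""
    seq_len, dim = q.shape
    out = torch.zeros_like(v)
    for start in range(0, seq_len, segment_len):
        # Within each segment, keep only every `dilation`-th position.
        idx = torch.arange(start, start + segment_len, dilation)
        qs, ks, vs = q[idx], k[idx], v[idx]
        attn = F.softmax(qs @ ks.T / dim ** 0.5, dim=-1)
        out[idx] = attn @ vs
        # Positions skipped by the dilation get zero output in this toy version;
        # in LongNet they are covered by other heads using different offsets.
    return out


x = torch.randn(256, 64)
y = dilated_attention(x, x, x)   # sparse attention over the long local context
```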