I believe the library's model support depends on the tokenizer, not on the model brand.
For example, any of the latest Falcon models should work because they use PreTrainedTokenizerFast. However, the following error arises:
NotImplementedError: Tokenizer not supported: PreTrainedTokenizerFast
This seems to happen because the library recognizes models by their path or name rather than by their tokenizer type, so tiiuae/Falcon3-1B-Base is rejected even though it should be supported. A fix would be to compare the loaded tokenizer against a list of supported tokenizer classes instead of relying on naming conventions.
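To make the proposal concrete, here is a minimal sketch of what a class-based check could look like. The names `SUPPORTED_TOKENIZER_CLASSES`, `is_supported_tokenizer`, and `check_tokenizer` are hypothetical and not taken from the library, and the allow-list contents are an assumption; the library's real dispatch logic may be organized differently.

```python
# Hypothetical allow-list of tokenizer class names (an assumption for this
# sketch; the library's actual set of supported tokenizers may differ).
SUPPORTED_TOKENIZER_CLASSES = {"PreTrainedTokenizerFast"}


def is_supported_tokenizer(tokenizer) -> bool:
    """Return True if the tokenizer's class, or any of its base classes,
    is named in the allow-list. The model path or name is never consulted."""
    return any(
        cls.__name__ in SUPPORTED_TOKENIZER_CLASSES
        for cls in type(tokenizer).__mro__
    )


def check_tokenizer(tokenizer) -> None:
    """Raise the same kind of error as in the report, but only when the
    tokenizer class itself is not recognized."""
    if not is_supported_tokenizer(tokenizer):
        raise NotImplementedError(
            f"Tokenizer not supported: {type(tokenizer).__name__}"
        )
```

With a check like this, a tokenizer loaded for tiiuae/Falcon3-1B-Base would pass as long as its class is, or derives from, PreTrainedTokenizerFast. Walking the class hierarchy also means model-specific subclasses of a supported tokenizer are accepted without listing each one.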