server : Add option to return token pieces in /tokenize endpoint (#9108)
* server : added with_pieces functionality to /tokenize endpoint
* server : Add tokenize with pieces tests to server.feature
* Handle case if tokenizer splits along utf8 continuation bytes
* Add example of token splitting
* Remove trailing ws
* Fix trailing ws
* Maybe fix ci
* maybe this fix windows ci?

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
Mathijs Henquet committed
78203641fee3b1f82abaff0c7f667e1b4a286390
Parent: e6b7801
Committed by GitHub <noreply@github.com>
on 9/12/2024, 8:30:11 PM
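
The commit message mentions handling tokenizers that split along UTF-8 continuation bytes. The problem can be illustrated with a minimal sketch: a token piece that is not valid UTF-8 on its own cannot be sent as a JSON string, so it needs some fallback representation. The helper name `piece_to_json` and the fallback to a list of byte values are assumptions made for illustration; the commit message does not state the exact response format, and this is not the server's C++ code.

```python
from typing import List, Union


def piece_to_json(piece: bytes) -> Union[str, List[int]]:
    """Hypothetical helper: return a JSON-friendly form of one token piece.

    Assumption: a valid UTF-8 piece is returned as a string, and any
    other piece falls back to a list of its raw byte values.
    """
    try:
        return piece.decode("utf-8")
    except UnicodeDecodeError:
        # The piece is cut through a multi-byte character, so it cannot
        # be represented as text by itself.
        return list(piece)


# "á" is two bytes in UTF-8. Simulate a tokenizer that emits each byte
# as its own token, splitting the character in the middle.
encoded = "á".encode("utf-8")
pieces = [encoded[:1], encoded[1:]]
result = [piece_to_json(p) for p in pieces]
print(result)
```

A piece that is complete text, such as `b"Hello"`, comes back unchanged as the string `"Hello"`, while each half of the split character comes back as a one-element list of byte values.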