When deploying a Transformer model, the complexity increases because, besides the model itself, one also has to decide how to deploy the tokenizer. Is it better to run the tokenizer on the server, or to have end users tokenize on the client side? And if the tokenizer is deployed alongside the model on the server, what is the best way to do so while keeping inference latency and cost to a minimum?
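For concreteness, here is a minimal sketch of the server-side option the question describes, with the tokenizer and model co-located behind a single endpoint. The framework (FastAPI) and checkpoint name are assumptions chosen purely for illustration, not a recommendation.

```python
# Hypothetical illustration: tokenizer + model served together behind one endpoint.
from fastapi import FastAPI
from pydantic import BaseModel
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_ID = "distilbert-base-uncased-finetuned-sst-2-english"  # example checkpoint

# Load both artifacts once at startup so each request only pays for
# tokenization plus the forward pass.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

app = FastAPI()

class PredictRequest(BaseModel):
    text: str

@app.post("/predict")
def predict(req: PredictRequest):
    # Tokenization happens on the server, right next to the model.
    inputs = tokenizer(req.text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    label_id = int(logits.argmax(dim=-1))
    return {"label": model.config.id2label[label_id]}
```

The alternative being asked about would instead have the client run the tokenizer locally and send only token IDs to such an endpoint.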