It would work the same way; you would just need to connect to your local model. For example, change the code to compute the embeddings with your local model and store them in Milvus. After that, run inference by calling your local model.
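The workflow above can be sketched roughly like this. Note the assumptions: `embed_local` is a hypothetical stub standing in for your local embedding model, and the in-memory `store` list stands in for a Milvus collection (in real code you'd use pymilvus insert/search calls instead):

```python
# Sketch of the pipeline: embed text locally, store vectors, answer by similarity.
# embed_local() is a TOY stand-in for a real local embedding model, and the
# `store` list is a stand-in for a Milvus collection -- not actual Milvus code.
import hashlib
import math

def embed_local(text: str, dim: int = 8) -> list[float]:
    """Toy deterministic embedding; replace with a call to your local model."""
    digest = hashlib.sha256(text.encode()).digest()
    vec = [digest[i % len(digest)] / 255.0 for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]  # unit-normalize so dot product = cosine

store: list[tuple[str, list[float]]] = []  # stands in for a Milvus collection

def insert(text: str) -> None:
    # In real code this would be a pymilvus insert into your collection.
    store.append((text, embed_local(text)))

def search(query: str, top_k: int = 1) -> list[str]:
    # In real code this would be a pymilvus vector search against Milvus.
    q = embed_local(query)
    scored = [(sum(a * b for a, b in zip(q, v)), t) for t, v in store]
    scored.sort(reverse=True)  # highest cosine similarity first
    return [t for _, t in scored[:top_k]]

insert("Milvus is a vector database.")
insert("Local models can compute embeddings.")
# An identical query embeds to the identical vector, so it ranks first.
print(search("Milvus is a vector database.", top_k=1)[0])
```

With a real embedding model the same shape applies: semantically similar queries land near the stored vectors, which the toy hash-based stub cannot do.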
I haven't used inference with a local API, so I can't help with that, but for embeddings I used this model and it worked quite fast; it was also a top-2 model on the Hugging Face leaderboard (links: Leaderboard, Model).
I didn't do any training, just simple embed + inference.
Plus you can always go the pirate way as well. I do that for the most expensive games, for games from companies I dislike, or as a trial for games I'm interested in buying.