Why would you want to replace the ViT model? Our ViT (SigLIP-400M) is trained end-to-end with the LLM, so directly replacing it would lead to incorrect results unless you retrain the model.
We have a CLIP model that has been fine-tuned on medical images, and we wanted to check whether SigLIP can be swapped for our CLIP.
The same goes for a medical Whisper and the Qwen LLM, since we already have each of these domain fine-tuned components individually.
Theoretically, this is feasible; however, you need to convert the ViT to the NaViT-SigLIP-400M format, paying particular attention to the embedding. It is advisable to retrain thoroughly after the replacement to ensure good performance.
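One way to see how far apart the two encoders are before attempting any conversion is to compare their parameter names and shapes. The following is only a minimal sketch: `diff_param_shapes` is a hypothetical helper, and the parameter names and shapes are made up for illustration; they are not the real MiniCPM-o, SigLIP, or CLIP values.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def diff_param_shapes(
    target: Dict[str, Shape], candidate: Dict[str, Shape]
) -> Dict[str, List[str]]:
    """Compare two name -> shape mappings (hypothetical helper).

    Reports what would block copying `candidate` weights into `target`:
    - missing:    parameters the target expects but the candidate lacks
    - unexpected: parameters the candidate has but the target does not
    - mismatched: parameters present in both with different shapes
    """
    missing = sorted(k for k in target if k not in candidate)
    unexpected = sorted(k for k in candidate if k not in target)
    mismatched = sorted(
        k for k in target if k in candidate and target[k] != candidate[k]
    )
    return {"missing": missing, "unexpected": unexpected, "mismatched": mismatched}


# Made-up names and shapes, for illustration only.
target_shapes = {
    "embeddings.patch_embedding.weight": (8, 3, 2, 2),
    "embeddings.position_embedding.weight": (16, 8),
}
candidate_shapes = {
    "embeddings.patch_embedding.weight": (4, 3, 2, 2),
    "embeddings.class_embedding": (4,),
}

report = diff_param_shapes(target_shapes, candidate_shapes)
for kind, names in report.items():
    print(kind, names)
```

With PyTorch models, the two mappings can be collected with `{k: tuple(v.shape) for k, v in model.state_dict().items()}`. Any entry under the embedding layers in this report is a spot where the weights cannot simply be copied over and conversion plus retraining is needed.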
Team, is it possible to replace the existing SigLIP-400M with our CLIP encoder in the MiniCPM-o model? If yes, could you please assist with directions?