📄️ VLM
CLIP is a multimodal machine learning model proposed by OpenAI. By performing comparative learning on large-scale image and text pairs, the model can simultaneously process images and text, mapping them into a shared vector space. This example demonstrates using CLIP for image management and text search on the RDK platform.