Contrastive Language-Image Pre-Training
CLIP architecture
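The contrastive objective behind this architecture can be sketched in a few lines. This is a minimal, dependency-free illustration and not the original training code: it assumes the image and text encoders have already produced one embedding per item, that row i of each batch is a matching image/text pair, and that the temperature is a fixed value (0.07 here is an assumption; in practice it is typically learned). The function names are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def clip_style_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings."""
    images = [l2_normalize(v) for v in image_embs]
    texts = [l2_normalize(v) for v in text_embs]
    # logits[i][j] = scaled cosine similarity between image i and text j
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]
    logits_transposed = [list(col) for col in zip(*logits)]
    image_to_text = cross_entropy_diagonal(logits)
    text_to_image = cross_entropy_diagonal(logits_transposed)
    return 0.5 * (image_to_text + text_to_image)
```

The loss is small when each image embedding is closest to its own caption's embedding and large when the pairs are shuffled, which is what pushes the two encoders into a shared embedding space.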
Here are some more applications of multimodal models.
receipt bookkeeping
image source: Recycle This Pittsburgh
scan grocery receipts
OCR
AI text decoding
code expense report
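The last step of this pipeline, turning decoded receipt text into an expense report, can be illustrated with a small sketch. It assumes the scan and OCR steps have already produced plain text, and that each purchased item sits on its own line ending in a price with two decimals. That layout, the `SUMMARY_LABELS` set, and the function names are all hypothetical; real receipts vary widely, which is why a multimodal model is useful for the decoding step.

```python
import re
from decimal import Decimal

# Assumed line layout: "<item name> <price>", e.g. "MILK 2% 3.49" or "BREAD $2.50".
LINE_PATTERN = re.compile(r"^(?P<name>.+?)\s+\$?(?P<price>\d+\.\d{2})$")

# Lines that carry totals rather than purchased items (hypothetical label set).
SUMMARY_LABELS = {"TOTAL", "SUBTOTAL", "TAX"}


def parse_receipt_lines(ocr_text):
    """Return (item name, price) pairs found in OCR'd receipt text."""
    items = []
    for raw_line in ocr_text.splitlines():
        match = LINE_PATTERN.match(raw_line.strip())
        if match is None:
            continue  # store header, greeting, or unreadable line
        name = match.group("name")
        if name.upper() in SUMMARY_LABELS:
            continue  # skip summary rows so they are not double counted
        items.append((name, Decimal(match.group("price"))))
    return items


def expense_total(items):
    """Sum item prices using Decimal to avoid floating-point rounding."""
    return sum((price for _, price in items), Decimal("0"))


sample_text = "GROCERY MART\nMILK 2% 3.49\nBREAD $2.50\nTOTAL 5.99\nTHANK YOU\n"
sample_items = parse_receipt_lines(sample_text)
```

Prices are kept as `Decimal` rather than `float` because expense totals should add up exactly to the cent.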
COPUS
Visual Question Answering
VQA
transfer learning between RNA and ATAC sequencing
scButterfly
data types