microsoft/AI-For-Beginners

mirrored from https://github.com/microsoft/AI-For-Beginners

# Multi-Modal Networks

Following the success of transformer models on NLP tasks, there have been many attempts to apply the same or similar architectures to computer vision. There is also growing interest in building models that *combine* vision and natural language capabilities. One such attempt is CLIP, developed by OpenAI.

## Contrastive Language-Image Pre-Training (CLIP)
