Vishwanath Akuthota

Machines don't understand human language & that is where we need embeddings for Generative AI

LLMs represent the meaning and context of the data they are fed in a specialized numerical format known as embeddings. Imagine capturing the essence of a word, image or video in a single list of numbers, called a vector. That’s the power of vector embeddings, one of the most fascinating and influential concepts in machine learning today.


For example, images of animals such as cats and dogs are unstructured data, so a conventional database can store the files but cannot search them by meaning. They are therefore converted into a machine-readable numeric format, which is what we call embeddings, and then stored in a vector database.
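To make the idea of storing and searching embeddings more concrete, here is a minimal sketch of an in-memory "vector database". Everything in it is an assumption for illustration: the `ToyVectorStore` class is hypothetical, and the three-number vectors are made up by hand rather than produced by a real image model. Real vector databases use the same principle but add indexing so the search scales.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

class ToyVectorStore:
    """Hypothetical in-memory store: keeps (label, vector) pairs in a list."""

    def __init__(self):
        self._items = []

    def add(self, label, vector):
        self._items.append((label, vector))

    def search(self, query, k=1):
        # Rank every stored vector by similarity to the query (brute force).
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(query, item[1]),
            reverse=True,
        )
        return [label for label, _ in ranked[:k]]

store = ToyVectorStore()
# Hand-made vectors standing in for real image embeddings.
store.add("cat photo", [0.9, 0.8, 0.1])
store.add("dog photo", [0.8, 0.9, 0.2])
store.add("car photo", [0.1, 0.2, 0.9])

query = [0.85, 0.75, 0.15]  # made-up embedding of a new cat picture
nearest = store.search(query, k=2)
print(nearest)  # ['cat photo', 'dog photo']
```

The two animal photos come back first because their vectors point in nearly the same direction as the query, while the car vector points elsewhere.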


By translating unstructured and high-dimensional data into a lower-dimensional space, embeddings make it possible to perform computations such as similarity search and clustering more efficiently.
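A small sketch can show why a compact, dense representation helps. It compares a one-hot encoding (one dimension per vocabulary word) with a dense embedding table. The three-word vocabulary and the two-dimensional vectors are invented for illustration; real vocabularies have tens of thousands of words and real embeddings have hundreds of dimensions.

```python
vocab = ["cat", "dog", "car"]

def one_hot(word):
    """One dimension per vocabulary word; grows with the vocabulary size."""
    return [1.0 if w == word else 0.0 for w in vocab]

# Hypothetical dense embedding table: fewer dimensions than the vocabulary.
dense = {
    "cat": [0.9, 0.1],
    "dog": [0.8, 0.2],
    "car": [0.1, 0.9],
}

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# One-hot vectors of different words never overlap, so they carry
# no information about which words are related.
one_hot_sim = dot(one_hot("cat"), one_hot("dog"))

# Dense vectors can place related words close together.
dense_cat_dog = dot(dense["cat"], dense["dog"])
dense_cat_car = dot(dense["cat"], dense["car"])

print(one_hot_sim)                    # 0.0
print(dense_cat_dog > dense_cat_car)  # True
```

The dense vectors are both shorter and more informative: "cat" scores higher against "dog" than against "car", which the one-hot encoding cannot express.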


Types of Embedding:

While most of us have commonly used text embeddings, embeddings can also be utilised for various other types of data, such as images, graphs, and more.


⮕ Word Embeddings: Embeddings of individual words. Models: Word2Vec, GloVe, and FastText.


⮕ Sentence Embeddings: Embeddings of entire sentences as vectors that capture the overall meaning and context of the sentences. Models: Universal Sentence Encoder (USE) and SkipThought.


⮕ Document Embeddings: Embeddings of entire documents that capture the semantic information and context of the whole document. Models: Doc2Vec (an implementation of Paragraph Vectors).


⮕ Image Embeddings: Capture different visual features of an image. Models: CNNs such as ResNet and VGG.


⮕ User/Product Embeddings: Represent users and products in a system as vectors that capture their preferences, behaviors, attributes and characteristics. These are primarily used in recommendation systems.
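To illustrate how word embeddings relate to sentence embeddings in the list above, here is a minimal sketch that averages word vectors into one sentence vector (often called mean pooling). This is a simple baseline, not how Universal Sentence Encoder or SkipThought work internally; those models learn sentence vectors directly. The `word_vectors` table and the `mean_pool` helper are hypothetical, with hand-made numbers.

```python
# Hypothetical 3-dimensional word vectors (real ones have hundreds of dimensions).
word_vectors = {
    "cats": [0.9, 0.1, 0.0],
    "chase": [0.2, 0.7, 0.1],
    "mice": [0.8, 0.2, 0.0],
}

def mean_pool(tokens, table):
    """Average the vectors of all known tokens, dimension by dimension."""
    known = [table[t] for t in tokens if t in table]
    if not known:
        raise ValueError("none of the tokens has a vector")
    dim = len(known[0])
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]

sentence_vector = mean_pool("cats chase mice".split(), word_vectors)
print(len(sentence_vector))  # 3: same size as each word vector
```

The sentence vector has the same number of dimensions as the word vectors, so it can be stored and compared in exactly the same way.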





Below are some common embedding models we can use.

⮕ Cohere’s Embedding: Powerful for processing short texts with under 512 tokens.


⮕ Mistral Embedding: A strong embedding model for downstream AI/ML tasks such as text classification and sentiment analysis.


⮕ OpenAI Embeddings: OpenAI is currently one of the market leaders in embedding models. Among its offerings, OpenAI's second-generation text embedding model, text-embedding-ada-002, has proven to give top-notch results across various use cases.
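As a rough sketch of how one of these hosted models might be called, the helper below requests embeddings through a client object that follows the shape of the official `openai` Python package (`client.embeddings.create`). The `embed_texts` function is a hypothetical wrapper written for this post, and the commented usage lines assume the package is installed and an API key is set in the `OPENAI_API_KEY` environment variable.

```python
def embed_texts(client, texts, model="text-embedding-ada-002"):
    """Return one embedding (a list of floats) per input text.

    `client` is assumed to expose `embeddings.create(model=..., input=...)`,
    as the OpenAI Python client does.
    """
    response = client.embeddings.create(model=model, input=texts)
    return [item.embedding for item in response.data]

# Example usage (assumes the `openai` package and a valid API key):
# from openai import OpenAI
# client = OpenAI()  # reads OPENAI_API_KEY from the environment
# vectors = embed_texts(client, ["cat", "dog"])
# print(len(vectors[0]))  # text-embedding-ada-002 returns 1536 numbers per text
```

Passing the client in as an argument keeps the helper easy to try with other providers that offer a compatible interface.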


Let's build a future where humans and AI work together to achieve extraordinary things!


Let's keep the conversation going!

What are your thoughts on the limitations of AI for struggling companies? Share your experiences and ideas for successful AI adoption.


Contact us (info@drpinnacle.com) today to learn more about how we can help you.
