Understanding LLM and RAG: My Notes — Part I
Artificial Intelligence (AI) refers to training machines to learn from data and exhibit intelligent behavior. AI is broadly classified into two types: Predictive AI and Generative AI.
Types of Artificial Intelligence
Predictive AI
Predictive AI is used to analyze existing data and make predictions about future events. It helps in identifying patterns and estimating probabilities. For example:
- Image recognition: Identifying objects in images.
- Stock market analysis: Predicting stock price movements based on historical data (a toy sketch of this follows below).
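As a toy illustration of the stock example, the sketch below fits a simple linear trend to a short, made-up price series with scikit-learn and predicts the next value. The library choice and the numbers are assumptions for illustration only, not a real forecasting method.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up daily closing prices (illustrative only, not real market data)
prices = np.array([101.0, 102.5, 101.8, 103.2, 104.0, 105.1])
days = np.arange(len(prices)).reshape(-1, 1)  # day index is the only feature

# Fit a simple linear trend: price ≈ slope * day + intercept
model = LinearRegression().fit(days, prices)

# Predict the next day's price from the learned trend
next_day = np.array([[len(prices)]])
print("Predicted next price:", model.predict(next_day)[0])
```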
Generative AI
Generative AI is used to create new content, such as images, videos, and text. It learns patterns from existing data and generates novel outputs. Examples include:
- AI-generated artwork and videos.
- Writing essays, poems, and code.
- Creating synthetic voices and music.
Machine Learning and Deep Learning
Machine Learning (ML) and Deep Learning (DL) are subsets of AI that focus on gaining intelligence through data-driven training. A simple analogy is how humans acquire knowledge by learning through schools and colleges.
Machine Learning Methods
Machine learning can be categorized into different types; a short sketch contrasting two of them follows the list:
- Supervised Learning: Similar to a school setting where a teacher provides labeled examples for training.
- Semi-Supervised Learning: Comparable to homeschooling, where both labeled and unlabeled data are used for learning.
- Unsupervised Learning: Learning without direct supervision, similar to self-learning without attending formal education.
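To make the supervised versus unsupervised distinction concrete, here is a minimal sketch assuming scikit-learn and a tiny made-up dataset: the first model is handed labels (the "teacher"), while the second has to find structure in the same points on its own.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

# Tiny made-up dataset: 2D points forming two groups
X = np.array([[1.0, 1.2], [0.9, 1.0], [1.1, 0.8],   # points near (1, 1)
              [4.0, 4.2], [3.9, 4.1], [4.2, 3.8]])  # points near (4, 4)

# Supervised: labels act as the "teacher" providing the right answers
y = np.array([0, 0, 0, 1, 1, 1])
clf = LogisticRegression().fit(X, y)
print("Supervised prediction for [1, 1]:", clf.predict([[1.0, 1.0]])[0])

# Unsupervised: no labels; the model groups similar points by itself
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("Unsupervised cluster assignments:", km.labels_)
```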
Deep Learning
Deep learning is a specialized form of ML that uses neural networks with multiple layers of interconnected nodes. These intermediate layers are called hidden layers, and each connection (edge) between nodes has a weight assigned to it. By adjusting the weights and applying activation functions, the network optimizes its predictions. The trained weights, together with the chosen activation functions, are then used on unseen test data to generate predictions.
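As a minimal sketch of those ideas (using NumPy and random stand-in weights, since no specific framework is assumed), the code below runs one forward pass through a single hidden layer: the weight matrices are the trainable parameters, and ReLU and sigmoid play the role of activation functions. In real training, the weights would be adjusted, for example by backpropagation, to reduce prediction error.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)          # activation for the hidden layer

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))    # squashes the output into (0, 1)

x = rng.normal(size=3)                 # one input sample with 3 features

# Weights and biases for a hidden layer (3 -> 4) and an output layer (4 -> 1).
# Random stand-ins here; training would adjust these values.
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)

hidden = relu(W1 @ x + b1)             # hidden layer: weighted sum + activation
output = sigmoid(W2 @ hidden + b2)     # output layer: weighted sum + activation
print("Prediction:", output[0])
```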
Large Language Models (LLMs)
Large Language Models (LLMs) are a type of generative AI trained on massive amounts of text data, with a very large number of parameters (often billions). Some popular LLMs are listed below, followed by a small text-generation sketch:
- LLaMA (Meta)
- GPT (OpenAI)
- Claude (Anthropic)
- Gemini (Google DeepMind)
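For a quick hands-on taste, the sketch below uses the Hugging Face transformers pipeline to generate a continuation of a prompt. GPT-2 is chosen only because it is small enough to run locally; the model and generation settings are illustrative assumptions, not recommendations.

```python
# pip install transformers torch
from transformers import pipeline

# GPT-2 is used here only because it is lightweight; any causal language
# model supported by transformers could be swapped in.
generator = pipeline("text-generation", model="gpt2")

result = generator("Retrieval-Augmented Generation is", max_new_tokens=30)
print(result[0]["generated_text"])
```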
Retrieval-Augmented Generation (RAG)
Why RAG is Needed
LLMs are generally trained on publicly available, general-knowledge data. However, enterprises often require AI to provide answers specific to their organization. Training a custom LLM from scratch is resource-intensive, requiring enormous computing power and memory. Retrieval-Augmented Generation (RAG) solves this problem by doing the following (a minimal sketch of the flow appears after the list):
- Storing organization-specific data in a vector database.
- Retrieving relevant information from the database and providing it as context to the LLM.
- Improving the accuracy and relevance of AI-generated responses.
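A minimal sketch of that flow is shown below. The embed() function is a hypothetical stand-in (a real system would call an embedding model), and a plain Python list plays the role of the vector database; the point is only to show the retrieve-then-augment pattern, not a production setup.

```python
import numpy as np

def embed(text, vocab):
    """Hypothetical embedding: a bag-of-words count vector. A real RAG
    system would call an embedding model instead."""
    words = text.lower().replace("?", "").replace(".", "").split()
    return np.array([float(words.count(w)) for w in vocab])

def cosine(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9)

# 1. Store organization-specific snippets as vectors (a stand-in for a vector DB).
docs = [
    "Refunds are processed within 5 business days.",
    "Our support desk is open 9am to 6pm on weekdays.",
]
vocab = sorted({w for d in docs for w in d.lower().replace(".", "").split()})
index = [(doc, embed(doc, vocab)) for doc in docs]

# 2. Retrieve the snippet whose vector is most similar to the question's.
question = "How long do refunds take?"
q = embed(question, vocab)
best_doc = max(index, key=lambda item: cosine(item[1], q))[0]

# 3. Provide the retrieved snippet as context to the LLM.
prompt = f"Context: {best_doc}\n\nQuestion: {question}\nAnswer:"
print(prompt)  # this augmented prompt would be sent to the LLM
```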
Vector Databases
A vector database is a specialized type of database optimized for semantic search. Instead of storing data in traditional tabular form, it stores numerical vector representations (embeddings) and finds similar items by comparing vectors with mathematical similarity measures.
Common Similarity Search Algorithms
- Cosine Similarity: Measures the cosine of the angle between two vectors to determine their similarity.
- Euclidean Distance: Computes the straight-line distance between two points in vector space (both measures are sketched in code below).
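Both measures are easy to express directly with NumPy; the short sketch below computes them for two small example vectors (the numbers are arbitrary).

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 1.0])

# Cosine similarity: (a . b) / (|a| * |b|); 1.0 means the vectors point the same way.
cosine_sim = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Euclidean distance: straight-line distance between the two points; 0.0 means identical.
euclidean_dist = np.linalg.norm(a - b)

print("Cosine similarity:", cosine_sim)
print("Euclidean distance:", euclidean_dist)
```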
Popular Vector Databases
- Weaviate
- Milvus
These databases enhance AI applications by enabling efficient retrieval of relevant data, making them essential for enterprise AI solutions.
In the next part, we will work on a mini-project where we integrate LLM with a vector database to create a functional AI application. Stay tuned for hands-on implementation!