
Glossary

A

Term Description
A100 Ampere 100, a data-center GPU by NVIDIA built on the Ampere architecture, which is named after the French mathematician and physicist André-Marie Ampère.
Adam Adam: A Method for Stochastic Optimization, an algorithm for first-order gradient-based optimization of stochastic objective functions.
Agentic RAG Agentic Retrieval-Augmented Generation, an advanced iteration of RAG relying on autonomous agents to handle complex reasoning, structured planning, and iterative tool-use over document sets.
AGI Artificial General Intelligence, a hypothesized form of AI more capable than today's systems, able to perform a wide range of tasks at or beyond human level and to improve its own capabilities.
AlexNet A pioneering convolutional neural network that won the 2012 ImageNet Large Scale Visual Recognition Challenge.
AlphaGo Mastering the game of Go with deep neural networks and tree search, a computer program that defeated a professional human Go player for the first time.
ALIGN A Large-scale ImaGe and Noisy-text embedding, trained by Google on a dataset of 1.8 billion image-text pairs.
AutoGen A framework that enables the development of LLM applications using multiple agents that can converse with each other to solve tasks.
AutoGPT An experimental open-source application showcasing the capabilities of the GPT-4 language model to autonomously achieve goals by breaking them down into sub-tasks.
AWQ Activation-aware Weight Quantization, an algorithm for quantizing LLMs, similar to GPTQ.
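The Adam entry above describes first-order optimization from moving averages of the gradient and its square. A minimal NumPy sketch of one Adam step (a toy example, not the reference implementation; hyperparameters are the paper's defaults):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: moving-average moments with bias correction."""
    m = beta1 * m + (1 - beta1) * grad       # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2  # second moment (uncentered variance)
    m_hat = m / (1 - beta1 ** t)             # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(x) = x^2 starting from x = 5
theta, m, v = np.array([5.0]), np.zeros(1), np.zeros(1)
for t in range(1, 2001):
    grad = 2 * theta
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.05)
# theta is now close to 0
```

The per-parameter scaling by `sqrt(v_hat)` is what makes Adam adaptive: steps shrink where gradients are consistently large.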

B

Term Description
BART Bidirectional and Auto-Regressive Transformers, a sequence-to-sequence model by Facebook AI (Meta).
BELEBELE A Bambara word meaning "big, large, fat, great"; also a dataset containing 900 unique multiple-choice reading comprehension questions, each associated with one of 488 distinct passages.
BERT Bidirectional Encoder Representations from Transformers, an LLM by Google.
BIG-Bench Beyond the Imitation Game Benchmark, a benchmark for measuring the performance of language models across a diverse set of tasks.
BiT Big Transfer, a family of transfer learning models pre-trained on large datasets.
BLEU BiLingual Evaluation Understudy, a metric for evaluating a generated sentence against a reference sentence.
BLOOM BigScience Large Open-science Open-access Multilingual Language Model
BPE Byte Pair Encoding, a tokenization method.
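BPE, listed above, builds a vocabulary by repeatedly merging the most frequent adjacent symbol pair in the training corpus. A toy sketch of that merge loop (the corpus and the number of merges are illustrative):

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across the corpus, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get)

def merge_pair(words, pair):
    """Replace every occurrence of the pair with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Toy corpus: word (as a tuple of characters) -> frequency
corpus = {tuple("lower"): 5, tuple("lowest"): 2, tuple("newer"): 6}
for _ in range(3):
    corpus = merge_pair(corpus, most_frequent_pair(corpus))
# After 3 merges, frequent substrings like "wer" and "lo" become single tokens.
```

Real tokenizers record the learned merge order so the same merges can be replayed on new text.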

C

Term Description
C4 The Colossal Clean Crawled Corpus, a dataset of 800GB of English text collected from the web.
ChatDev Communicative Agents for Software Development, an approach simulating a virtual software company run by interacting autonomous AI agents.
Chinchilla A 70B-parameter model by DeepMind, trained as a compute-optimal model on 1.4 trillion tokens.
CLIP Contrastive Language-Image Pre-training, maps data of different modalities, text and images, into a shared embedding space.
Computer Use Also known as the OS or GUI agent paradigm, in which multimodal LLMs directly interact with computer environments, e.g. by controlling the mouse and keyboard.
CoQA CoQA is a large-scale dataset for building Conversational Question Answering systems. CoQA contains 127,000+ questions with answers collected from 8000+ conversations.
CoT Chain-of-Thought prompting, a method that enables large language models to tackle complex reasoning tasks through intermediate reasoning steps.
CrewAI A Python framework for orchestrating role-playing, autonomous AI agents for structured multi-agent collaboration tasks.
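Chain-of-Thought prompting, described in the CoT entry above, works by prepending worked examples whose answers spell out intermediate reasoning before the final answer. A minimal illustration of assembling such a prompt (the exemplar is the well-known tennis-ball problem from the CoT paper; the actual LLM call is omitted):

```python
# A few-shot chain-of-thought prompt: the exemplar shows worked reasoning
# before the answer, nudging the model to emit intermediate steps itself.
exemplar = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 tennis balls each. "
    "How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
)
question = "Q: A baker made 23 cookies and sold 15. How many are left?\nA:"
prompt = exemplar + question  # sent to the LLM, which continues the reasoning
```

Zero-shot variants drop the exemplar and simply append a cue such as "Let's think step by step."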

D

Term Description
DALL-E A text-to-image model by OpenAI; the name is a portmanteau of the animated Pixar robot WALL-E and the Spanish surrealist artist Salvador Dalí.
DDPM Denoising Diffusion Probabilistic Models, a class of generative models that synthesize high-quality data by reversing a diffusion process.
DSPy A framework for compiling declarative language model calls into self-improving pipelines, optimizing LM prompts and weights algorithmically.
DPO Direct Preference Optimization, a reparameterization of the reward model in RLHF that trains the policy directly with a simple classification loss.
DPR Dense Passage Retrieval

E

Term Description
ELMo Embeddings from Language Models
ERNIE Enhanced Representation through kNowledge IntEgration
ELECTRA Efficiently Learning an Encoder that Classifies Token Replacements Accurately

F

Term Description
FAIR Facebook AI Research
FLAN Finetuned LAnguage Net, an instruction-tuned language model family by Google.
FLOPS Floating Point Operations Per Second
FLoRes Facebook Low Resource Machine Translation Benchmark, a low-resource MT dataset.

G

Term Description
GAN Generative Adversarial Network, a class of machine learning frameworks designed to produce new data that follows the same distribution as the training set.
GAVIE GPT4-Assisted Visual Instruction Evaluation, an approach to evaluating visual instruction tuning that needs no human-annotated ground-truth answers and adapts to diverse instruction formats.
Generative Agents Interactive simulacra of human behavior that can simulate daily life and interactions within a sandbox environment, guided by large language models.
GGML Georgi Gerganov Machine Learning, a C library focused on machine learning
GLaM Generalist Language Model, a family of language models which uses a sparsely activated mixture-of-experts architecture to scale the model capacity while also incurring substantially less training cost compared to dense variants.
GNN Graph Neural Network, a class of artificial neural networks for processing data that can be represented as graphs.
GSM8K Grade School Math 8K, GSM8K is a dataset of 8.5K high quality linguistically diverse grade school math word problems created by human problem writers.
GOFAI Good Old-Fashioned Artificial Intelligence
GPT Generative Pre-trained Transformer, a family of large language models developed by OpenAI.

H

Term Description
HNSW Hierarchical Navigable Small Worlds
HH-RLHF Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

I

Term Description
ILSVRC2012 ImageNet Large Scale Visual Recognition Challenge 2012, a competition to estimate the content of photographs for the purpose of retrieval and automatic annotation using a subset of the large hand-labeled ImageNet dataset (10,000,000 labeled images depicting 10,000+ object categories) as training.
InstructGPT Language models trained to follow instructions with human feedback; a precursor model to ChatGPT.

J

Term Description
JFT JFT-300M is an internal Google dataset used for training image classification models. Images are labeled by an algorithm using a complex mixture of raw web signals, connections between web pages, and user feedback.

L

Term Description
LAION-400M Large-scale Artificial Intelligence Open Network, an open dataset of 400 million CLIP-filtered image-text pairs.
LaMDA Language Model for Dialogue Applications
LangChain A framework designed to simplify the creation of applications using large language models, particularly well-suited for agentic workflows and tool integration.
LangGraph A framework for building stateful, multi-actor applications with LLMs, extending LangChain with cyclic computational graphs crucial for reliable agent orchestration.
LCM Latent Consistency Models
LLaMA Large Language Model Meta AI
LLaSM Large Language and Speech Model
LLM Large Language Model, an AI model trained on massive amounts of text data to understand language and generate novel, human-like content.
LLaVA Large Language and Vision Assistant, an end-to-end trained large multimodal model that connects a vision encoder and LLM for general purpose visual and language understanding.
LMM Large Multimodal Model, a model that handles multiple modalities such as text and images; examples include DeepMind's Flamingo, Google's PaLM-E, Salesforce's BLIP, Microsoft's KOSMOS-1, and Tencent's Macaw-LLM. Chatbots such as ChatGPT and Gemini are LMMs.
LoRA Low Rank Adaptation, a fine-tuning method that uses low-rank matrices to adapt a pre-trained model to a new task.
LRV Large-scale Robust Visual, a large and diverse visual instruction tuning dataset.
LSTM Long Short-Term Memory, an artificial recurrent neural network architecture capable of learning order dependence in sequence prediction problems.
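LoRA, listed above, freezes the pretrained weight and learns a low-rank additive update. A NumPy sketch of the adapted forward pass (dimensions and initialization are illustrative; B starts at zero, so training begins exactly at the frozen model):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2                         # model dim 8, adapter rank 2

W = rng.normal(size=(d, d))         # frozen pretrained weight (not trained)
A = rng.normal(size=(r, d)) * 0.01  # trainable low-rank factor, small init
B = np.zeros((d, r))                # trainable factor, zero init => no-op adapter

x = rng.normal(size=d)
y = W @ x + B @ (A @ x)             # adapted forward pass: W x + B A x

# Only A and B are trained: 2*d*r = 32 parameters vs d*d = 64 for full W.
```

The rank r controls the trade-off between adapter capacity and parameter count.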

M

Term Description
M3W Multi Modal Massive Web, an image and text dataset by DeepMind. This is used to train Flamingo, a multimodal LLM.
MAWPS A Math Word Problem Repository, an online repository of math word problems providing a unified testbed for evaluating different algorithms.
MemGPT Memory for GPT, a system that teaches language models how to manage their own memory hierarchy, creating the illusion of infinite context size for longer-running agent sessions.
ML Machine Learning, a component of AI that allows computers to learn from data and improve predictions without explicit programming; coupled with training sets, it can also generate new content.
MLP Multilayer Perceptron, a feedforward artificial neural network built from multiple layers of perceptrons.
MLLM Multimodal Large Language Model
MLM Masked Language Model
MMLU Massive Multitask Language Understanding, a benchmark measuring a text model's multitask accuracy.
MoA Mixture-of-Agents, a methodology that enhances LLM capabilities by layering and aggregating outputs from multiple diverse models.
MoE Mixture of Experts, a neural network architecture in which subsets of the parameters are experts specialized for different inputs, coordinated by a gating network.
MRC Machine Reading Comprehension
MTPB Multi-Turn Programming Benchmark, a benchmark consisting of 115 diverse problem sets that are factorized into multi-turn prompts
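The MoE entry above describes expert subnetworks coordinated by a gating network. A toy NumPy sketch of top-k gating (the expert functions, dimensions, and renormalization scheme here are illustrative, not any specific model's routing):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def moe_forward(x, experts, W_gate, k=2):
    """Route x to the top-k experts and mix their outputs by gate weight."""
    scores = softmax(W_gate @ x)
    top = np.argsort(scores)[-k:]              # indices of the k best experts
    weights = scores[top] / scores[top].sum()  # renormalize over selected experts
    return sum(w * experts[i](x) for i, w in zip(top, weights))

rng = np.random.default_rng(1)
d, n_experts = 4, 3
# Each "expert" is just a linear map here; real experts are full sub-networks.
experts = [(lambda W: (lambda x: W @ x))(rng.normal(size=(d, d)))
           for _ in range(n_experts)]
W_gate = rng.normal(size=(n_experts, d))
y = moe_forward(rng.normal(size=d), experts, W_gate)
```

Because only k of the n experts run per input, compute grows much more slowly than parameter count.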

N

Term Description
NEFTune Noisy Embedding Instruction Fine-Tuning, an algorithm that adds noise to the embedding layer during the forward pass of fine-tuning.
NeRF Neural Radiance Field, a method for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function.
NeurIPS Neural Information Processing Systems, a major machine learning conference.
NLP Natural language processing. A branch of AI that uses machine learning and deep learning to give computers the ability to understand human language, often using learning algorithms, statistical models and linguistic rules.
NLG Natural Language Generation. A branch of AI that uses machine learning and deep learning to generate human-like language.
NLU Natural Language Understanding, to understand the relationship and meaning in text data.
NSP Next Sentence Prediction
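The NEFTune entry above can be made concrete: following the paper's formulation, uniform noise is scaled by alpha / sqrt(L * d) before being added to the sequence embeddings. A sketch (alpha and the tensor shapes here are illustrative):

```python
import numpy as np

def neftune_noise(embeddings, alpha=5.0, rng=np.random.default_rng(0)):
    """Add uniform noise scaled by alpha / sqrt(L * d) to token embeddings."""
    L, d = embeddings.shape                # sequence length, embedding dim
    eps = rng.uniform(-1, 1, size=(L, d))  # noise in [-1, 1]
    return embeddings + (alpha / np.sqrt(L * d)) * eps
```

The scaling keeps the perturbation magnitude roughly constant as sequence length and embedding size vary; noise is applied only during training, not at inference.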

P

Term Description
PaLM Pathways Language Model, a large language model by Google.
PEFT Parameter Efficient Fine-Tuning
POMDP Partially Observable Markov Decision Process, a model for decision making in situations where outcomes are partly random and partly under the control of a decision maker.
POPE Polling-based Object Probing Evaluation, an evaluation method for probing object hallucination in LVLMs.
PPO Proximal Policy Optimization, a foundational reinforcement learning algorithm, used in RLHF for learning from human preferences.

Q

Term Description
QLoRA Quantized Low Rank Adaptation, a fine-tuning method that combines Quantization and LoRA (Low-Rank Adapters).
Quantization Quantization is the process of reducing the numerical precision of a model's tensors, making the model more compact and the operations faster in execution.
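One common instance of the quantization described above is symmetric per-tensor int8 quantization, where each float is mapped to an 8-bit integer times a shared scale. A toy NumPy sketch (not any particular library's implementation):

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor int8 quantization: x is approximated by scale * q."""
    scale = np.abs(x).max() / 127.0                    # one scale for the tensor
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

x = np.random.default_rng(0).normal(size=1000).astype(np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)
# Each element is off by at most half a quantization step (scale / 2).
```

Schemes like GPTQ and AWQ refine this idea with per-group scales and calibration data to preserve model quality at 4-bit precision.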

R

Term Description
RAG Retrieval-Augmented Generation, an AI framework that combines an information retrieval component with a text generation model to improve the quality of responses generated by LLMs.
Reflexion Language Agents with Verbal Reinforcement Learning, an approach where agents reflect on past failures to improve their subsequent actions without updating model weights.
ResNet A Residual Neural Network (a.k.a. Residual Network, ResNet) is a deep learning model in which the weight layers learn residual functions with reference to the layer inputs.
ReAct Reasoning and Acting, a paradigm for utilizing language models' reasoning traces and action plans interleaved to solve complex tasks.
RLHF Reinforcement Learning from Human Feedback
RoBERTa Robustly Optimized BERT Approach
ROUGE Recall-Oriented Understudy for Gisting Evaluation, a metric for evaluating a generated sentence against a reference sentence.
RoPE Rotary Position Embedding, an improvement over the traditional sinusoidal positional embedding in the Transformer architecture.
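The retrieval step of RAG, described above, typically ranks documents by embedding similarity and prepends the top hits to the LLM prompt. A toy NumPy sketch, with hand-made vectors standing in for a real embedding model:

```python
import numpy as np

def retrieve(query_vec, doc_vecs, k=2):
    """Rank documents by cosine similarity and return the top-k indices."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = d @ q                        # cosine similarity per document
    return np.argsort(sims)[::-1][:k]   # best-first

docs = ["RoPE rotates query/key pairs",
        "ROUGE measures n-gram recall",
        "RAG retrieves then generates"]
# Toy embeddings; a real system would encode docs and query with one model.
vecs = np.array([[1.0, 0.1, 0.0], [0.0, 1.0, 0.1], [0.9, 0.2, 0.8]])
query = np.array([1.0, 0.0, 0.7])
top = retrieve(query, vecs)
context = "\n".join(docs[i] for i in top)  # prepended to the LLM prompt
```

At scale, the exhaustive similarity scan is replaced by an approximate nearest-neighbor index such as HNSW (see the H section).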

S

Term Description
SFT Supervised Fine-Tuning, fine-tuning a pre-trained model on labeled examples; commonly the first stage of LLM alignment.
SQuAD Stanford Question Answering Dataset
SVAMP Simple Variations on Arithmetic Math word Problems is a challenge set to enable more robust evaluation of automatic MWP (Math Word Problem) solvers
SWE-agent Software Engineering agent, an agent-computer interface leveraging language models to fix bugs and resolve issues autonomously in real GitHub repositories.
SLM Depending on context, either Small Language Model or Statistical Language Model.

T

Term Description
T5 Text to Text Transfer Transformer
Toolformer Language Models Can Teach Themselves to Use Tools, a model trained to decide which APIs to call, when to call them, what arguments to pass, and how to best incorporate the results.
ToT Tree of Thoughts, a framework that generalizes Chain of Thought by allowing an LM to explore multiple reasoning paths over coherent units of text ("thoughts").
Transformer A deep learning architecture based on attention mechanisms, introduced in "Attention Is All You Need."
TRL Transformer Reinforcement Learning, a framework for training and evaluating RL agents in the context of language generation tasks.

U

Term Description
U-Net Convolutional Networks for Biomedical Image Segmentation, a fully convolutional network architecture designed for semantic image segmentation.

V

Term Description
VAE Variational Autoencoder, a generative model that learns to encode data into a low-dimensional latent space and decode it back.
VIGC Visual Instruction Generation and Correction, a framework that enables multimodal large language models to generate instruction-tuning data and progressively enhance its quality on the fly.
ViT Vision Transformer, a vision model based as closely as possible on the Transformer architecture originally designed for text-based tasks.
VLU Vision Language Understanding, like Natural Language Understanding (NLU) but for images
Voyager An open-ended embodied agent with large language models, demonstrating autonomous skill discovery and continuous learning (most notably within the game Minecraft).
VRAM Video Random Access Memory, a special type of memory that stores graphics data for the GPU.

W

Term Description
Woodpecker A training-free, five-step method for correcting hallucinations in MLLMs.
Word2Vec Efficient Estimation of Word Representations in Vector Space, a group of related models that are used to produce word embeddings, preserving semantic relationships.

X

Term Description
XLM Cross-lingual Language Models
XLU Cross-lingual Understanding
XNLI Cross-lingual Natural Language Inference
XLNet Generalized Autoregressive Pretraining for Language Understanding

Y

Term Description
YOLO You Only Look Once, a state-of-the-art, real-time object detection system.

Z

Term Description
ZeRO Zero Redundancy Optimizer

Note: PRs are accepted. Feel free to add more terms and their details.