
Glossary

A

Term Description
A100 Ampere 100, a data-center GPU by NVIDIA built on the Ampere architecture, which is named after the French mathematician and physicist André-Marie Ampère.
Adam Adam: A Method for Stochastic Optimization, an algorithm for first-order gradient-based optimization of stochastic objective functions.
Agentic RAG Agentic Retrieval-Augmented Generation, an advanced iteration of RAG relying on autonomous agents to handle complex reasoning, structured planning, and iterative tool-use over document sets.
AGI Artificial General Intelligence, a hypothesized form of AI more capable than today's systems, able to perform a wide range of tasks at or beyond human level and to improve its own capabilities.
AlexNet A pioneering convolutional neural network that won the 2012 ImageNet Large Scale Visual Recognition Challenge.
AlphaGo Mastering the game of Go with deep neural networks and tree search, a computer program that defeated a professional human Go player for the first time.
ALIGN A Large-scale ImaGe and Noisy-text embedding, trained by Google on a dataset of 1.8 billion image-text pairs.
AutoGen A framework that enables the development of LLM applications using multiple agents that can converse with each other to solve tasks.
AutoGPT An experimental open-source application showcasing the capabilities of the GPT-4 language model to autonomously achieve goals by breaking them down into sub-tasks.
AWQ Activation-aware Weight Quantization, an algorithm for quantizing LLMs, similar to GPTQ.
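The Adam entry above describes first-order optimization from moving averages of the gradient and its square. A minimal NumPy sketch of one Adam step (a toy example, not the reference implementation; hyperparameters are the paper's defaults):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: moving-average moments with bias correction."""
    m = beta1 * m + (1 - beta1) * grad       # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2  # second moment (uncentered variance)
    m_hat = m / (1 - beta1 ** t)             # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(x) = x^2 starting from x = 5
theta, m, v = np.array([5.0]), np.zeros(1), np.zeros(1)
for t in range(1, 2001):
    grad = 2 * theta
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.05)
# theta is now close to 0
```

The per-parameter scaling by `sqrt(v_hat)` is what makes Adam adaptive: steps shrink where gradients are consistently large.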

B

Term Description
BART Bidirectional and Auto-Regressive Transformers, a sequence-to-sequence model by Facebook AI (Meta).
BELEBELE A Bambara word meaning "big, large, fat, great"; also a dataset containing 900 unique multiple-choice reading comprehension questions, each associated with one of 488 distinct passages.
BERT Bidirectional Encoder Representations from Transformers, an LLM by Google.
BIG-Bench Beyond the Imitation Game Benchmark, a benchmark for measuring the performance of language models across a diverse set of tasks.
BiT Big Transfer, a family of transfer learning models pre-trained on large datasets.
BLEU BiLingual Evaluation Understudy, a metric for evaluating a generated sentence against a reference sentence.
BLOOM BigScience Large Open-science Open-access Multilingual Language Model
BPE Byte Pair Encoding, a tokenization method.
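BPE, listed above, builds a vocabulary by repeatedly merging the most frequent adjacent symbol pair in the training corpus. A toy sketch of that merge loop (the corpus and the number of merges are illustrative):

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across the corpus, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return max(pairs, key=pairs.get)

def merge_pair(words, pair):
    """Replace every occurrence of the pair with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Toy corpus: word (as a tuple of characters) -> frequency
corpus = {tuple("lower"): 5, tuple("lowest"): 2, tuple("newer"): 6}
for _ in range(3):
    corpus = merge_pair(corpus, most_frequent_pair(corpus))
# After 3 merges, frequent substrings like "wer" and "lo" become single tokens.
```

Real tokenizers record the learned merge order so the same merges can be replayed on new text.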

C

Term Description
C4 The Colossal Clean Crawled Corpus, a dataset of 800GB of English text collected from the web.
ChatDev Communicative Agents for Software Development, an approach simulating a virtual software company run by interacting autonomous AI agents.
Chinchilla A 70B-parameter model by DeepMind, trained as a compute-optimal model on 1.4 trillion tokens.
CLIP Contrastive Language-Image Pre-training, maps data of different modalities, text and images, into a shared embedding space.
Computer Use Also known as the OS or GUI agent paradigm, in which multimodal LLMs directly interact with computer environments, e.g. by controlling the mouse and keyboard.
CoQA CoQA is a large-scale dataset for building Conversational Question Answering systems. CoQA contains 127,000+ questions with answers collected from 8000+ conversations.
CoT Chain-of-Thought prompting, a method that enables large language models to tackle complex reasoning tasks through intermediate reasoning steps.
CrewAI A Python framework for orchestrating role-playing, autonomous AI agents for structured multi-agent collaboration tasks.
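Chain-of-Thought prompting, described in the CoT entry above, works by prepending worked examples whose answers spell out intermediate reasoning before the final answer. A minimal illustration of assembling such a prompt (the exemplar is the well-known tennis-ball problem from the CoT paper; the actual LLM call is omitted):

```python
# A few-shot chain-of-thought prompt: the exemplar shows worked reasoning
# before the answer, nudging the model to emit intermediate steps itself.
exemplar = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 tennis balls each. "
    "How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
)
question = "Q: A baker made 23 cookies and sold 15. How many are left?\nA:"
prompt = exemplar + question  # sent to the LLM, which continues the reasoning
```

Zero-shot variants drop the exemplar and simply append a cue such as "Let's think step by step."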

D

Term Description
DALL-E A text-to-image model by OpenAI; the name is a portmanteau of the animated Pixar robot WALL-E and the Spanish surrealist artist Salvador Dalí.
DDPM Denoising Diffusion Probabilistic Models, a class of generative models that synthesize high-quality data by reversing a diffusion process.
DSPy A framework for compiling declarative language model calls into self-improving pipelines, optimizing LM prompts and weights algorithmically.
DPO Direct Preference Optimization, a reparameterization of the reward model in RLHF that trains the policy directly with a simple classification loss.
DPR Dense Passage Retrieval

E

Term Description
ELMo Embeddings from Language Models
ERNIE Enhanced Representation through kNowledge IntEgration
ELECTRA Efficiently Learning an Encoder that Classifies Token Replacements Accurately

F

Term Description
FAIR Facebook AI Research
FLAN Finetuned LAnguage Net, an instruction-tuned language model family by Google.
FLOPS Floating Point Operations Per Second
FLoRes Facebook Low Resource Machine Translation Benchmark, a low-resource MT dataset.

G

Term Description
GAN Generative Adversarial Network, a class of machine learning frameworks designed to produce new data that follows the same distribution as the training set.
GAVIE GPT4-Assisted Visual Instruction Evaluation, an approach to evaluating visual instruction tuning that needs no human-annotated ground-truth answers and adapts to diverse instruction formats.
Generative Agents Interactive simulacra of human behavior that can simulate daily life and interactions within a sandbox environment, guided by large language models.
GGML Georgi Gerganov Machine Learning, a C library focused on machine learning
GLaM Generalist Language Model, a family of language models which uses a sparsely activated mixture-of-experts architecture to scale the model capacity while also incurring substantially less training cost compared to dense variants.
GNN Graph Neural Network, a class of artificial neural networks for processing data that can be represented as graphs.
GSM8K Grade School Math 8K, GSM8K is a dataset of 8.5K high quality linguistically diverse grade school math word problems created by human problem writers.
GOFAI Good Old-Fashioned Artificial Intelligence
GPT Generative Pre-trained Transformer, a family of large language models developed by OpenAI.

H

Term Description
HNSW Hierarchical Navigable Small Worlds
HH-RLHF Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback

I

Term Description
ILSVRC2012 ImageNet Large Scale Visual Recognition Challenge 2012, a competition to estimate the content of photographs for the purpose of retrieval and automatic annotation using a subset of the large hand-labeled ImageNet dataset (10,000,000 labeled images depicting 10,000+ object categories) as training.
InstructGPT Language models trained to follow instructions with human feedback; a precursor model to ChatGPT.

J

Term Description
JFT JFT-300M is an internal Google dataset used for training image classification models. Images are labeled by an algorithm using a complex mixture of raw web signals, connections between web pages, and user feedback.

L

Term Description
LAION-400M Large-scale Artificial Intelligence Open Network, an open dataset of 400 million CLIP-filtered image-text pairs.
LaMDA Language Model for Dialogue Applications
LangChain A framework designed to simplify the creation of applications using large language models, particularly well-suited for agentic workflows and tool integration.
LangGraph A framework for building stateful, multi-actor applications with LLMs, extending LangChain with cyclic computational graphs crucial for reliable agent orchestration.
LCM Latent Consistency Models
LLaMA Large Language Model Meta AI
LLaSM Large Language and Speech Model
LLM Large Language Model, an AI model trained on massive amounts of text data to understand language and generate novel, human-like content.
LLaVA Large Language and Vision Assistant, an end-to-end trained large multimodal model that connects a vision encoder and LLM for general purpose visual and language understanding.
LMM Large Multimodal Model, a model that handles multiple modalities such as text and images; examples include DeepMind's Flamingo, Google's PaLM-E, Salesforce's BLIP, Microsoft's KOSMOS-1, and Tencent's Macaw-LLM. Chatbots such as ChatGPT and Gemini are LMMs.
LoRA Low Rank Adaptation, a fine-tuning method that uses low-rank matrices to adapt a pre-trained model to a new task.
LRV Large-scale Robust Visual, a large and diverse visual instruction tuning dataset.
LSTM Long Short-Term Memory, an artificial recurrent neural network architecture capable of learning order dependence in sequence prediction problems.
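LoRA, listed above, freezes the pretrained weight and learns a low-rank additive update. A NumPy sketch of the adapted forward pass (dimensions and initialization are illustrative; B starts at zero, so training begins exactly at the frozen model):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2                         # model dim 8, adapter rank 2

W = rng.normal(size=(d, d))         # frozen pretrained weight (not trained)
A = rng.normal(size=(r, d)) * 0.01  # trainable low-rank factor, small init
B = np.zeros((d, r))                # trainable factor, zero init => no-op adapter

x = rng.normal(size=d)
y = W @ x + B @ (A @ x)             # adapted forward pass: W x + B A x

# Only A and B are trained: 2*d*r = 32 parameters vs d*d = 64 for full W.
```

The rank r controls the trade-off between adapter capacity and parameter count.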

M

Term Description
M3W Multi Modal Massive Web, an image and text dataset by DeepMind. This is used to train Flamingo, a multimodal LLM.
MAWPS A Math Word Problem Repository, an online repository of math word problems providing a unified testbed for evaluating different algorithms.
MemGPT Memory for GPT, a system that teaches language models how to manage their own memory hierarchy, creating the illusion of infinite context size for longer-running agent sessions.
ML Machine Learning, a component of AI that allows computers to learn from data and improve predictions without explicit programming; coupled with training sets, it can also generate new content.
MLP Multilayer Perceptron, a feedforward artificial neural network built from multiple layers of perceptrons.
MLLM Multimodal Large Language Model
MLM Masked Language Model
MMLU Massive Multitask Language Understanding, a benchmark measuring a text model's multitask accuracy.
MoA Mixture-of-Agents, a methodology that enhances LLM capabilities by layering and aggregating outputs from multiple diverse models.
MoE Mixture of Experts, a neural network architecture in which subsets of the parameters are experts specialized for different inputs, coordinated by a gating network.
MRC Machine Reading Comprehension
MTPB Multi-Turn Programming Benchmark, a benchmark consisting of 115 diverse problem sets that are factorized into multi-turn prompts
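The MoE entry above describes expert subnetworks coordinated by a gating network. A toy NumPy sketch of top-k gating (the expert functions, dimensions, and renormalization scheme here are illustrative, not any specific model's routing):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def moe_forward(x, experts, W_gate, k=2):
    """Route x to the top-k experts and mix their outputs by gate weight."""
    scores = softmax(W_gate @ x)
    top = np.argsort(scores)[-k:]              # indices of the k best experts
    weights = scores[top] / scores[top].sum()  # renormalize over selected experts
    return sum(w * experts[i](x) for i, w in zip(top, weights))

rng = np.random.default_rng(1)
d, n_experts = 4, 3
# Each "expert" is just a linear map here; real experts are full sub-networks.
experts = [(lambda W: (lambda x: W @ x))(rng.normal(size=(d, d)))
           for _ in range(n_experts)]
W_gate = rng.normal(size=(n_experts, d))
y = moe_forward(rng.normal(size=d), experts, W_gate)
```

Because only k of the n experts run per input, compute grows much more slowly than parameter count.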

N

Term Description
NEFTune Noisy Embedding Instruction Fine-Tuning, an algorithm that adds noise to the embedding layer during the forward pass of fine-tuning.
NeRF Neural Radiance Field, a method for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function.
NeurIPS Neural Information Processing Systems, a major machine learning conference.
NLP Natural language processing. A branch of AI that uses machine learning and deep learning to give computers the ability to understand human language, often using learning algorithms, statistical models and linguistic rules.
NLG Natural Language Generation. A branch of AI that uses machine learning and deep learning to generate human-like language.
NLU Natural Language Understanding, to understand the relationship and meaning in text data.
NSP Next Sentence Prediction
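The NEFTune entry above can be made concrete: following the paper's formulation, uniform noise is scaled by alpha / sqrt(L * d) before being added to the sequence embeddings. A sketch (alpha and the tensor shapes here are illustrative):

```python
import numpy as np

def neftune_noise(embeddings, alpha=5.0, rng=np.random.default_rng(0)):
    """Add uniform noise scaled by alpha / sqrt(L * d) to token embeddings."""
    L, d = embeddings.shape                # sequence length, embedding dim
    eps = rng.uniform(-1, 1, size=(L, d))  # noise in [-1, 1]
    return embeddings + (alpha / np.sqrt(L * d)) * eps
```

The scaling keeps the perturbation magnitude roughly constant as sequence length and embedding size vary; noise is applied only during training, not at inference.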

P

Term Description
PaLM Pathways Language Model, a large language model by Google.
PEFT Parameter Efficient Fine-Tuning
POMDP Partially Observable Markov Decision Process, a model for decision making in situations where outcomes are partly random and partly under the control of a decision maker.
POPE Polling-based Object Probing Evaluation, an evaluation method for probing object hallucination in LVLMs.
PPO Proximal Policy Optimization, a foundational reinforcement learning algorithm, used in RLHF for learning from human preferences.

Q

Term Description
QLoRA Quantized Low Rank Adaptation, a fine-tuning method that combines Quantization and LoRA (Low-Rank Adapters).
Quantization Quantization is the process of reducing the numerical precision of a model's tensors, making the model more compact and the operations faster in execution.
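One common instance of the quantization described above is symmetric per-tensor int8 quantization, where each float is mapped to an 8-bit integer times a shared scale. A toy NumPy sketch (not any particular library's implementation):

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor int8 quantization: x is approximated by scale * q."""
    scale = np.abs(x).max() / 127.0                    # one scale for the tensor
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

x = np.random.default_rng(0).normal(size=1000).astype(np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)
# Each element is off by at most half a quantization step (scale / 2).
```

Schemes like GPTQ and AWQ refine this idea with per-group scales and calibration data to preserve model quality at 4-bit precision.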

R

Term Description
RAG Retrieval-Augmented Generation, an AI framework that combines an information retrieval component with a text generation model to improve the quality of responses generated by LLMs.
Reflexion Language Agents with Verbal Reinforcement Learning, an approach where agents reflect on past failures to improve their subsequent actions without updating model weights.
ResNet A Residual Neural Network (a.k.a. Residual Network, ResNet) is a deep learning model in which the weight layers learn residual functions with reference to the layer inputs.
ReAct Reasoning and Acting, a paradigm for utilizing language models' reasoning traces and action plans interleaved to solve complex tasks.
RLHF Reinforcement Learning from Human Feedback
RoBERTa Robustly Optimized BERT Approach
ROUGE Recall-Oriented Understudy for Gisting Evaluation, a metric for evaluating a generated sentence against a reference sentence.
RoPE Rotary Position Embedding, an improvement over the traditional sinusoidal positional embedding in the Transformer architecture.
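The retrieval step of RAG, described above, typically ranks documents by embedding similarity and prepends the top hits to the LLM prompt. A toy NumPy sketch, with hand-made vectors standing in for a real embedding model:

```python
import numpy as np

def retrieve(query_vec, doc_vecs, k=2):
    """Rank documents by cosine similarity and return the top-k indices."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = d @ q                        # cosine similarity per document
    return np.argsort(sims)[::-1][:k]   # best-first

docs = ["RoPE rotates query/key pairs",
        "ROUGE measures n-gram recall",
        "RAG retrieves then generates"]
# Toy embeddings; a real system would encode docs and query with one model.
vecs = np.array([[1.0, 0.1, 0.0], [0.0, 1.0, 0.1], [0.9, 0.2, 0.8]])
query = np.array([1.0, 0.0, 0.7])
top = retrieve(query, vecs)
context = "\n".join(docs[i] for i in top)  # prepended to the LLM prompt
```

At scale, the exhaustive similarity scan is replaced by an approximate nearest-neighbor index such as HNSW (see the H section).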

S

Term Description
SFT Supervised Fine-Tuning, fine-tuning a pre-trained model on labeled examples; commonly the first stage of LLM alignment.
SQuAD Stanford Question Answering Dataset
SVAMP Simple Variations on Arithmetic Math word Problems is a challenge set to enable more robust evaluation of automatic MWP (Math Word Problem) solvers
SWE-agent Software Engineering agent, an agent-computer interface leveraging language models to fix bugs and resolve issues autonomously in real GitHub repositories.
SLM Depending on context, either Small Language Model or Statistical Language Model.

T

Term Description
T5 Text to Text Transfer Transformer
Toolformer Language Models Can Teach Themselves to Use Tools, a model trained to decide which APIs to call, when to call them, what arguments to pass, and how to best incorporate the results.
ToT Tree of Thoughts, a framework that generalizes Chain of Thought by allowing an LM to explore multiple reasoning paths over coherent units of text ("thoughts").
Transformer A deep learning architecture based on attention mechanisms, introduced in "Attention Is All You Need."
TRL Transformer Reinforcement Learning, a framework for training and evaluating RL agents in the context of language generation tasks.

U

Term Description
U-Net Convolutional Networks for Biomedical Image Segmentation, a fully convolutional network architecture designed for semantic image segmentation.

V

Term Description
VAE Variational Autoencoder, a generative model that learns to encode data into a low-dimensional latent space and decode it back.
VIGC Visual Instruction Generation and Correction, a framework that enables multimodal large language models to generate instruction-tuning data and progressively enhance its quality on the fly.
ViT Vision Transformer, a vision model based as closely as possible on the Transformer architecture originally designed for text-based tasks.
VLU Vision Language Understanding, like Natural Language Understanding (NLU) but for images
Voyager An open-ended embodied agent with large language models, demonstrating autonomous skill discovery and continuous learning (most notably within the game Minecraft).
VRAM Video Random Access Memory, a special type of memory that stores graphics data for the GPU.

W

Term Description
Woodpecker A training-free, five-step method for correcting hallucinations in MLLMs.
Word2Vec Efficient Estimation of Word Representations in Vector Space, a group of related models that are used to produce word embeddings, preserving semantic relationships.

X

Term Description
XLM Cross-lingual Language Models
XLU Cross-lingual Understanding
XNLI Cross-lingual Natural Language Inference
XLNet Generalized Autoregressive Pretraining for Language Understanding

Y

Term Description
YOLO You Only Look Once, a state-of-the-art, real-time object detection system.

Z

Term Description
ZeRO Zero Redundancy Optimizer

Note: PRs are accepted. Feel free to add more terms and their details.