AI Glossary

The Definitive Glossary for Understanding Artificial Intelligence (AI)
Algorithm: A set of rules or steps followed to solve a problem or perform a task.
Artificial Intelligence (AI): Simulation of human intelligence by machines, especially computers.
Machine Learning: A subset of AI where machines improve performance through experience.
Deep Learning: A type of machine learning using neural networks with many layers.
Neural Network: A model built from layers of interconnected nodes, loosely inspired by the human brain, that learns to recognize patterns.
Supervised Learning: Machine learning with labeled data for training.
Unsupervised Learning: Machine learning with unlabeled data, finding hidden patterns.
Reinforcement Learning: Learning by interacting with an environment and receiving rewards or penalties.
Natural Language Processing (NLP): AI that understands and processes human language.
Computer Vision: AI that interprets and understands visual information from the world.
Data Mining: The process of discovering patterns in large data sets.
Big Data: Extremely large data sets analyzed to reveal patterns and trends.
Predictive Analytics: Using data, statistical algorithms, and machine learning techniques to predict future outcomes.
Classification: Assigning data into predefined categories.
Regression: A statistical method for predicting a continuous outcome.
Clustering: Grouping data points into clusters based on similarity.
Anomaly Detection: Identifying rare items, events, or observations which differ significantly from the majority of the data.
Decision Tree: A tree-structured model that makes predictions by following a sequence of if-then rules on feature values.
Random Forest: An ensemble of decision trees used for classification and regression.
Support Vector Machine (SVM): A supervised learning model used for classification and regression analysis.
K-Nearest Neighbors (KNN): A simple algorithm that stores all available cases and classifies new cases based on a similarity measure.
Gradient Descent: An optimization algorithm used to minimize the cost function in machine learning models (see the worked sketch after the glossary).
Overfitting: When a model learns the training data too well, including its noise and details, hurting its performance on new data.
Underfitting: When a model is too simple to capture the underlying pattern of the data.
Cross-Validation: A technique for assessing how the results of a statistical analysis will generalize to an independent data set (see the sketch after the glossary).
Training Data: Data used to train machine learning models.
Test Data: Data used to test the trained model's performance.
Validation Data: Data held out during training and used to tune hyperparameters and monitor model performance.
Hyperparameter: A configuration value set before the learning process begins that controls how the model learns, such as the learning rate or tree depth.
Feature Engineering: The process of using domain knowledge to extract features from raw data.
Feature Selection: The process of selecting a subset of relevant features for model construction.
Dimensionality Reduction: Reducing the number of random variables under consideration by obtaining a set of principal variables.
Principal Component Analysis (PCA): A dimensionality-reduction technique that transforms data into uncorrelated components ordered by how much variance they explain (see the sketch after the glossary).
Linear Regression: A linear approach to modeling the relationship between a dependent variable and one or more independent variables.
Logistic Regression: A statistical model that uses a logistic function to model a binary dependent variable.
Bias: Error introduced by approximating a real-world problem with a model that is too simple.
Variance: Error introduced by the model's sensitivity to small fluctuations in the training set.
Loss Function: A method of evaluating how well a specific algorithm models the given data.
Gradient Boosting: A machine learning technique for regression and classification problems, which builds a model in a stage-wise fashion.
AdaBoost: A boosting algorithm that combines multiple weak classifiers to create a strong classifier.
Bagging: Short for bootstrap aggregating; training multiple models on random samples of the data and combining their predictions to reduce variance and improve accuracy.
Ensemble Learning: Combining the predictions of multiple models to achieve better performance than any single model alone.
Convolutional Neural Network (CNN): A deep learning architecture that takes in an input image, assigns learnable importance to different aspects of the image, and distinguishes one object or class from another.
Recurrent Neural Network (RNN): A type of neural network where connections between nodes can create cycles, allowing output from some nodes to affect subsequent input to the same nodes.
Long Short-Term Memory (LSTM): A type of RNN architecture designed to avoid the long-term dependency problem.
Autoencoder: A type of artificial neural network used to learn efficient codings of input data.
Generative Adversarial Network (GAN): A class of machine learning frameworks in which two neural networks, a generator and a discriminator, compete with each other to generate new, realistic data.
Transfer Learning: A machine learning method where a model developed for a particular task is reused as the starting point for a model on a second task.
Tokenization: Breaking text into smaller pieces, like words or phrases, for analysis.
Embedding: A learned representation for text where words that have the same meaning have a similar representation.
Bag of Words (BoW): A representation of text that describes the occurrence of words within a document.
Term Frequency-Inverse Document Frequency (TF-IDF): A statistical measure used to evaluate how important a word is to a document in a collection (see the text-features sketch after the glossary).
Word2Vec: A group of related models that are used to produce word embeddings.
Sentence Embedding: A method to represent entire sentences as vectors.
Named Entity Recognition (NER): A process in NLP that locates and classifies named entities in text into predefined categories.
Sentiment Analysis: The process of determining the emotional tone behind a series of words.
Text Classification: Assigning predefined categories to text.
Text Generation: Using machine learning to generate new, similar text based on a given input.
Chatbot: A computer program designed to simulate conversation with human users.
Speech Recognition: The ability of a machine to identify words and phrases in spoken language and convert them to a machine-readable format.
Image Recognition: The process of identifying and detecting an object or feature in a digital image or video.
Object Detection: Identifying and locating objects within an image.
Image Segmentation: Partitioning a digital image into multiple segments to make it easier to analyze.
Generative Model: A model that learns how the data itself is distributed and can generate new examples, rather than only separating existing examples into classes.
Discriminative Model: A model that differentiates between different kinds of data instances.
Markov Decision Process (MDP): A mathematical process for making a sequence of decisions.
Bayesian Network: A graphical model that represents the probabilistic relationships among a set of variables.
Hidden Markov Model (HMM): A statistical model where the system being modeled is assumed to be a Markov process with hidden states.
Fuzzy Logic: A form of many-valued logic dealing with approximate, rather than fixed and exact, reasoning.
Expert System: A computer system that emulates the decision-making ability of a human expert.
Heuristic: A problem-solving approach using practical methods for immediate solutions.
Cognitive Computing: Technologies that mimic human brain function to perform tasks.
Autonomous Systems: Systems capable of performing tasks without human intervention.
Robotics: The branch of technology dealing with the design, construction, operation, and application of robots.
Internet of Things (IoT): Interconnected devices that communicate and exchange data over the internet.
Edge Computing: Computing that’s done at or near the source of data.
Cloud Computing: Delivering computing services over the internet.
Quantum Computing: Computing using quantum-mechanical phenomena.
Blockchain: A decentralized digital ledger of transactions.
Cryptography: The practice of securing communication from third parties.
Cybersecurity: Protecting systems, networks, and programs from digital attacks.
Data Science: A field that uses scientific methods, processes, algorithms, and systems to extract knowledge from data.
Data Engineer: A professional who prepares ‘big data’ for analytical or operational uses.
Data Analyst: A professional who collects, processes, and performs statistical analyses of data.
Data Visualization: The graphical representation of information and data.
Business Intelligence (BI): Technologies and strategies used by enterprises for data analysis and management.
Data Warehousing: The process of constructing and using a data warehouse.
ETL (Extract, Transform, Load): A process in data warehousing responsible for pulling data out of the source systems and placing it into a data warehouse.
SQL (Structured Query Language): A standard programming language for relational database management and data manipulation.
NoSQL: A class of database systems that store data in non-relational forms, such as documents, key-value pairs, or graphs, instead of traditional SQL tables.
Hadoop: An open-source framework for storing and processing big data.
Spark: An open-source unified analytics engine for big data processing.
Tableau: A data visualization tool.
Power BI: A business analytics tool by Microsoft.
Python: A high-level programming language used for general-purpose programming.
R: A programming language and software environment for statistical computing and graphics.
Java: A high-level, class-based, object-oriented programming language.
C++: A general-purpose programming language created as an extension of C.
TensorFlow: An open-source machine learning framework developed by Google.
PyTorch: An open-source machine learning library developed by Facebook.
Keras: An open-source software library that provides a Python interface for neural networks.
Scikit-learn: A free software machine learning library for the Python programming language.
OpenCV: An open-source computer vision and machine learning software library.
API (Application Programming Interface): A set of functions and procedures allowing the creation of applications that access features or data of an operating system, application, or other service.
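
The sketches below illustrate a few of the terms above in Python. They are minimal examples under stated assumptions, not reference implementations.

Gradient descent: a toy one-variable linear regression fit by repeatedly stepping against the gradient of a mean-squared-error loss function. The data, learning rate, and iteration count are made up for illustration.

import numpy as np

# Toy data: y is roughly 3*x + 1 plus noise (illustrative values).
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 3.0 * x + 1.0 + rng.normal(0.0, 1.0, size=100)

w, b = 0.0, 0.0        # model parameters: slope and intercept
learning_rate = 0.01   # hyperparameter chosen before training
n_iterations = 5000

for _ in range(n_iterations):
    y_pred = w * x + b
    error = y_pred - y
    loss = np.mean(error ** 2)        # mean squared error loss
    grad_w = 2 * np.mean(error * x)   # d(loss)/dw
    grad_b = 2 * np.mean(error)       # d(loss)/db
    w -= learning_rate * grad_w       # step against the gradient
    b -= learning_rate * grad_b

print(f"learned w={w:.2f}, b={b:.2f}, final loss={loss:.3f}")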
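
Training data, test data, and cross-validation: a scikit-learn sketch that holds out a test set, estimates generalization with 5-fold cross-validation, and compares training accuracy with test accuracy as a rough overfitting check. The random forest model and the 80/20 split are illustrative choices.

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

X, y = load_iris(return_X_y=True)

# Hold out 20% of the data as a test set (illustrative split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)

# 5-fold cross-validation on the training data estimates how well
# the model is likely to generalize to unseen data.
cv_scores = cross_val_score(model, X_train, y_train, cv=5)
print("cross-validation accuracy:", cv_scores.mean())

# Fit on the training set, then evaluate once on the held-out test set.
model.fit(X_train, y_train)
print("training accuracy:", model.score(X_train, y_train))
print("test accuracy:", model.score(X_test, y_test))
# A large gap between training and test accuracy is a common sign of overfitting.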
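
Principal component analysis: projecting the 4-feature iris data onto its two leading principal components with scikit-learn; the choice of two components is an illustrative assumption.

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)

# Reduce 4 features to the 2 directions that explain the most variance.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print("original shape:", X.shape)         # (150, 4)
print("reduced shape:", X_reduced.shape)  # (150, 2)
print("variance explained per component:", pca.explained_variance_ratio_)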
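
Text features: bag-of-words counts and TF-IDF weights with scikit-learn; the three example sentences are made up for illustration.

from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs make good pets",
]

# Bag of words: each document becomes a vector of raw word counts.
bow = CountVectorizer()
counts = bow.fit_transform(docs)
print(bow.get_feature_names_out())
print(counts.toarray())

# TF-IDF re-weights the counts so words that appear in every document
# (like "the") count for less than words that distinguish documents.
tfidf = TfidfVectorizer()
weights = tfidf.fit_transform(docs)
print(weights.toarray().round(2))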
