Glossary

Prompt injection is an attack or failure mode where external content tries to manipulate the system's instructions. It happens when a large language model treats untrusted text, documents, webpages, emails ...

An autoencoder is a neural network that learns to compress data and reconstruct it. It can be used as a more advanced dimensionality reduction method for complex data such as ...

Jailbreaking is an attempt to bypass a model's safety rules or restrictions. In large language models, it usually means using carefully crafted prompts, context or interaction patterns to make the ...

Feature selection is a machine learning technique used to choose the most useful input variables from an original dataset. The goal is not to create new variables, but to keep ...

Data leakage is a situation where a machine learning model receives information during training that would not be available in real use. The model then appears to perform very well ...

Model explainability is the ability to understand why a machine learning model produced a certain output, prediction or recommendation. It helps people see which inputs influenced the result, whether the ...

Exploitation is the use of the best-known action based on current knowledge. In reinforcement learning, it means that an agent chooses the option that currently seems most rewarding, instead of ...

Reward hacking is a situation where a model learns to optimise the reward signal while missing the real purpose of the task. The system technically does what it is rewarded ...

An outlier is a point that appears unusual compared with the rest of the data. It may be a rare but valid observation, a measurement error, a data quality problem, ...

RLHF, short for reinforcement learning from human feedback, is a training approach in which human preferences are used to help shape model behaviour. Instead of telling the model only what ...
