Source: Dataiku

Published: April 2020

### Key Data Science Concepts

*Circulated: June 1, 2020*

- **Machine Learning**: programming systems to perform a task without coding rule-based instructions
- **Deep Learning**: a subset of ML where systems can learn hidden patterns from data
- **Model**: a representation of the real world using mathematics
- **Algorithm**: a set of rules used to make a calculation
- **Training set**: a dataset used to find potentially predictive relationships used to create a model
- **Test set**: a dataset with the same structure as the training set used to measure the performance of models
- **Training**: the process of creating a model from the training set
- **Target**: a dependent variable that is the output that a model predicts (e.g., price of a house)
- **Feature**: an independent variable that is a measurable piece of data (e.g., # of bathrooms in a house)
- **Overfitting**: a model that corresponds too closely with a particular set of data (i.e., training set) and may fail to fit additional data (i.e., test set)
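
To make several of these terms concrete, here is a minimal sketch in plain Python built on the house-price example from the definitions. Everything in it is an assumption made for illustration: the toy numbers are invented, and the names `train_linear`, `train_memorizer`, and `mean_squared_error` are hypothetical helpers, not part of any library. It trains two models on the same training set and measures both on a test set. The "memorizer" is a deliberately overfit model: it reproduces the training set exactly but has nothing useful to say about houses it has not seen.

```python
# Hypothetical toy data: each row is (bathrooms, price in $1000s).
# "bathrooms" is the feature, "price" is the target.
training_set = [(1, 150.0), (2, 210.0), (3, 230.0), (4, 310.0)]
test_set = [(1, 160.0), (2, 200.0), (3, 250.0), (5, 350.0)]


def train_linear(rows):
    """Training: fit price = slope * bathrooms + intercept by least squares."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def train_memorizer(rows):
    """Training: memorize every training row (an intentionally overfit model).

    Unseen feature values fall back to the mean training target.
    """
    table = dict(rows)
    fallback = sum(y for _, y in rows) / len(rows)
    return lambda x: table.get(x, fallback)


def mean_squared_error(model, rows):
    """Average squared gap between predicted and actual target values."""
    return sum((model(x) - y) ** 2 for x, y in rows) / len(rows)


linear_model = train_linear(training_set)
memorizer = train_memorizer(training_set)

for name, model in [("linear", linear_model), ("memorizer", memorizer)]:
    print(
        name,
        "| training-set MSE:", mean_squared_error(model, training_set),
        "| test-set MSE:", mean_squared_error(model, test_set),
    )
```

On this toy data the memorizer has zero error on the training set but a much larger error on the test set than the linear model, which is the pattern the overfitting definition describes. Real projects would typically use a library and far more data; the sketch only shows how the terms relate to one another.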