![](https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F9801f5e1-77da-4381-9354-cdd80cee384d_800x1910.jpeg)
Source: Dataiku
Published: April 2020
Key Data Science Concepts
Circulated: June 1, 2020
Machine Learning: building systems that learn to perform a task from data, without hand-coded rule-based instructions
Deep Learning: a subset of ML in which multi-layer neural networks learn hidden patterns from data
Model: a representation of the real world using mathematics
Algorithm: a set of rules used to make a calculation
Training set: a dataset used to find potentially predictive relationships, from which a model is created
Test set: a dataset, with the same structure as the training set but kept out of training, used to measure the performance of a model
Training: the process of creating a model from the training set
Target: the dependent variable that a model predicts as its output (e.g., price of a house)
Feature: an independent variable; a measurable piece of data used as model input (e.g., number of bathrooms in a house)
Overfitting: when a model corresponds too closely to a particular set of data (i.e., the training set) and may therefore fail to fit additional data (e.g., the test set)
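
To make the terms above concrete, here is a minimal Python sketch that is not from the Dataiku infographic. It assumes a tiny made-up dataset with one feature (number of bathrooms) and one target (house price, in thousands of dollars). The helper names `train_linear`, `train_memorizer`, and `mean_abs_error` are hypothetical. Two models are trained on the same training set: a straight-line fit, and a lookup table that simply memorizes the training examples, which is one simple way to illustrate overfitting.

```python
# Hypothetical data: feature = number of bathrooms, target = price in $1000s.
train_features = [1, 2, 3, 4]
train_targets = [130, 180, 330, 380]
# The test set has the same structure but describes different houses.
test_features = [1, 2, 3, 4]
test_targets = [95, 205, 295, 410]


def train_linear(features, targets):
    """Training: fit price = slope * bathrooms + intercept by least squares."""
    n = len(features)
    mean_x = sum(features) / n
    mean_y = sum(targets) / n
    sxx = sum((x - mean_x) ** 2 for x in features)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(features, targets))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def train_memorizer(features, targets):
    """Training: memorize every training example (a deliberately overfit model)."""
    table = dict(zip(features, targets))
    fallback = sum(targets) / len(targets)  # used only for unseen feature values
    return lambda x: table.get(x, fallback)


def mean_abs_error(model, features, targets):
    """Average absolute gap between the model's predictions and the targets."""
    return sum(abs(model(x) - y) for x, y in zip(features, targets)) / len(targets)


linear = train_linear(train_features, train_targets)
memorizer = train_memorizer(train_features, train_targets)

for name, model in [("linear", linear), ("memorizer", memorizer)]:
    train_err = mean_abs_error(model, train_features, train_targets)
    test_err = mean_abs_error(model, test_features, test_targets)
    print(f"{name}: train error {train_err:.2f}, test error {test_err:.2f}")
```

On this made-up data the memorizer scores a perfect 0.00 on the training set but 31.25 on the test set, while the linear model scores 20.00 and 13.75. A model that looks flawless on its training data yet does worse on held-out data is the pattern the overfitting definition describes.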