4.4. What is a Model?#

In the realm of science and engineering, a model serves as a simplified representation of a system, capturing the essential features to facilitate understanding, prediction, and manipulation. A system refers to a set of interacting components that work together according to a specific principles or procedures.

Consider the process of baking bread: this seemingly straightforward task involves a series of steps governed by the principles of chemistry and thermodynamics. However, the underlying mechanisms—such as yeast fermentation, gluten development, and heat transfer—are complex, and their full intricacies may not be immediately apparent. To navigate and comprehend these complexities, scientists and engineers employ models to distill the world’s workings into manageable and understandable segments.

By abstracting and simplifying reality, models serve as invaluable tools in predicting behavior, testing hypotheses, and designing new systems, thereby bridging the gap between theoretical concepts and practical applications.

4.4.1. Some Real Life Models#

Newton’s Second Law of Motion

\[ F = ma \]

This is a foundational model in physics that relates force (\(F\)), mass (\(m\)), and acceleration (\(a\)). It simplifies the complex phenomenon of motion into a predictive equation, widely used in real-world applications like calculating the force to break a rib, optimizing sports performance, and understanding everyday actions like pushing a shopping cart.

The Ideal Gas Law

\[PV=nRT\]

The Idea Gas Law is a cornerstone model in chemistry that describes the relationship between pressure (\(P\)), volume (\(V\)), temperature (\(T\)), and the amount of gas (\(n\)) using the ideal gas constant (\(R\)).

This equation is widely applied in predicting how gases behave under various conditions, such as calculating the pressure inside a container at a specific temperature, or determining how much gas is needed to inflate a balloon.

4.4.2. Predicting House Prices#

Let’s consider a new example. Imagine you’re exploring a neighborhood and want to understand how different factors influence house prices. You gather data on various properties, noting details such as:

  • Number of Bedrooms: How many bedrooms does each house have?

  • Number of Bathrooms: How many bathrooms are available?

  • Square Footage: What is the total living area of the house?

  • Location: Is the house situated near schools, parks, or shopping centers?

With this information, your goal is to try and predict the price of a house, based on its characteristics.

These characteristics are known as features, or predictor variables. They are the inputs that help the model make predictions.

The response variable, also called the target variable, is what you want to predict. In this case, our target variables is the house price.

4.4.3. How Does The Model Work?#

As the model learns from the collected data, it begins to identify relationships between specific house features and the price. For instance, it might learn that houses with more bedrooms or larger square footage tend to have higher prices.

Once trained, the model can estimate the price of a new house, given a set of features. For example, if you provide details of a house with three bedrooms, two bathrooms, 1,500 square feet, and a specific location, the model uses the patterns it has learned to predict its likely price.

4.4.4. Baseline Model#

Before diving into complex machine learning models, it’s essential to start with a baseline model. This simple model serves as a reference point, helping you understand the minimum performance level and providing a benchmark against which more sophisticated models can be compared.

4.4.4.1. What Is a Baseline Model?#

A baseline model can be thought of as a simple approach in making predictions, particularily without much consideration of the input features.

Continuing with our house price prediction scenario, a baseline model might predict the average house price based solely on the data it has seen, without considering any specific features of the houses. For instance, if we want to create a baseline model from a dataset of houses with an average price of \(500,000, the baseline model would predict \)500,000 for every new house it encounters. Regardless of the number of bedrooms, bathrooms, square footage, or location, this baseline model will always predict the same value.

4.4.4.2. What’s The Deal?#

It might seem silly to only predict the same value, but a baseline model serves as a great benchmark for the models we will create!

In particular, it provides a standard for evaluating the effectiveness of more complex models. If your advanced model doesn’t outperform the baseline, it indicates that the model may not be capturing meaningful patterns in the data.

4.4.4.3. Types of Baseline Models#

  • Mean Predictor: Predicts the average value of the target variable. In our example, it would predict the average house price for all houses.

  • Median Predictor: Predicts the median value of the target variable (e.g. the median house price).

  • Random Predictor: Makes random predictions within the range of the target variable (e.g. if our range of house prices was from \(100,000 to \)1,000,000 we would predict a random house price each time ).

4.4.5. Come On, Let’s Build One Already!#

Not quite! There’s still a bit more theory we need to cover. Hang on tight though