Correctness of a Model
Correctness refers to how accurately a model represents the real-world system it is designed to simulate.
- To determine correctness, we compare the results generated by the model with data observed in the original problem.
- If the model's predictions match real-world observations within an acceptable margin of error, the model is considered correct.
Steps to Assess Correctness
- Collect Real-World Data: Gather data from the original problem.
- Run the Model: Use the model to generate results.
- Compare Results: Analyze how closely the model's results match the real-world data.
- Identify Discrepancies: Note any significant differences.
- Refine the Model: Adjust the model to improve its accuracy.
Example: Traffic Flow Model
- Real-World Data: Average speed of cars on a highway is 60 km/h.
- Model Prediction: The model predicts an average speed of 55 km/h.
- Comparison: The model underestimates the speed by 5 km/h (about 8%), so it is close but not exact.
- Refinement: Adjust the model's assumptions, for example by reducing how often it assumes traffic jams occur, then re-run it and compare again.
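The comparison in this example comes down to a little arithmetic. As a minimal sketch in Python (the function name `compare` is illustrative and not part of any library), the signed error and the relative error can be computed like this:

```python
def compare(predicted, observed):
    """Return the signed error and the relative error (as a percentage)."""
    error = predicted - observed        # negative means the model underestimates
    relative = 100 * error / observed   # error as a percentage of the observed value
    return error, relative


# Traffic flow example: model predicts 55 km/h, observed average is 60 km/h
error, relative = compare(predicted=55, observed=60)
print(f"error = {error} km/h ({relative:.1f}%)")  # error = -5 km/h (-8.3%)
```

Whether an 8% error is acceptable depends on what the model is used for, so the tolerance is a decision for the modeller rather than something the calculation can supply.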
Factors Affecting Correctness
- Simplifications: Models often simplify real-world systems, which can lead to inaccuracies.
- Assumptions: Incorrect assumptions can cause the model to deviate from reality.
- Data Quality: Poor quality or incomplete data can affect the model's accuracy.
A model that is too complex may be difficult to validate, while a model that is too simple may lack accuracy.
Challenges in Ensuring Correctness
- Dynamic Systems: Real-world systems may change over time, making it hard for static models to remain accurate.
- Unpredictable Variables: Some factors, like human behavior, are difficult to model accurately.
- Data Limitations: Limited or outdated data can reduce the model's correctness.
Example: Predicting Electricity Demand
Problem:
A national power company wants to predict daily electricity demand (in megawatts, MW) for a large city so it can plan energy production efficiently.
Model Assumption:
The company builds a model that predicts demand based on two factors:
- Temperature (higher demand when hotter due to air conditioning).
- Day of the week (higher demand on weekdays than weekends).
The model predicts:
- Weekdays: 5000 MW at 25°C, increasing by 100 MW for each additional degree.
- Weekends: 3760 MW at 25°C, increasing by 80 MW per degree.
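Each rule above is a straight line in temperature: a baseline at 25°C plus a fixed increase per degree. A minimal Python sketch of that form follows; the name `predict_demand` and its parameters are illustrative, and it assumes the same line applies at every temperature, which the text states only for temperatures above 25°C.

```python
def predict_demand(temp_c, base_mw, rate_mw_per_deg, ref_temp_c=25):
    """Linear rule: baseline demand at the reference temperature
    plus a fixed number of MW for each degree above it."""
    return base_mw + rate_mw_per_deg * (temp_c - ref_temp_c)


# Weekday rule from the text: 5000 MW at 25°C, plus 100 MW per extra degree
print(predict_demand(27, base_mw=5000, rate_mw_per_deg=100))  # 5200
print(predict_demand(30, base_mw=5000, rate_mw_per_deg=100))  # 5500
```

The weekend rule has the same form with its own baseline and rate, so it can reuse the function with different arguments.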
Real-World Data Collection (One Week):
- Monday (27°C): Model predicted 5200 MW → Observed 5100 MW
- Tuesday (30°C): Model predicted 5500 MW → Observed 5400 MW
- Wednesday (28°C): Model predicted 5300 MW → Observed 5200 MW
- Thursday (26°C): Model predicted 5100 MW → Observed 5000 MW
- Friday (29°C): Model predicted 5400 MW → Observed 5600 MW
- Saturday (31°C): Model predicted 4240 MW → Observed 4500 MW
- Sunday (30°C): Model predicted 4160 MW → Observed 4300 MW
Comparison of Model vs Real Data:
- For weekdays (Mon–Thu), the model is close but consistently overestimates demand by 100 MW (about 2%).
- On Friday, the model underestimates demand by 200 MW (predicted 5400 MW vs observed 5600 MW).
- On weekends, the model consistently underestimates demand, by 260 MW on Saturday and 140 MW on Sunday.
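To check these comparisons against the one-week table, the signed error for each day can be computed directly. This is a minimal sketch in Python; the variable names are illustrative, and the figures are copied from the table above.

```python
# (day, predicted MW, observed MW), copied from the one-week table
week = [
    ("Mon", 5200, 5100), ("Tue", 5500, 5400), ("Wed", 5300, 5200),
    ("Thu", 5100, 5000), ("Fri", 5400, 5600),
    ("Sat", 4240, 4500), ("Sun", 4160, 4300),
]

# Signed error: positive = model overestimates, negative = model underestimates
errors = {day: predicted - observed for day, predicted, observed in week}

for day, err in errors.items():
    direction = "over" if err > 0 else "under"
    print(f"{day}: {err:+d} MW ({direction}estimate)")
```

Keeping the sign of each error makes the pattern visible: the errors from Monday to Thursday all point one way, while the errors from Friday to Sunday point the other.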
Discussion of Correctness:
- The model is partially correct: it captures the general relationship between temperature and demand, and the difference between weekdays and weekends.
- However, it does not account for factors that may explain the larger errors:
  - A Friday evening peak (people arriving home earlier ahead of the weekend).
  - Weekend activities (shopping centres, entertainment) that drive higher electricity use than expected.
- The data covers only one week, so unusual events (heatwaves, holidays) are not represented.
Refinement:
- Add variables for time of day, special events/holidays, and economic activity.
- Collect data for several months to improve reliability.
- Use statistical error measures (e.g., Mean Absolute Error) to quantify correctness more formally.
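As a sketch of the last point, Mean Absolute Error (MAE) for the one-week data can be computed in a few lines of Python. The function names are illustrative. The mean signed error (bias) is not mentioned in the text and is included only for contrast.

```python
# Predicted and observed demand in MW, Monday to Sunday, from the one-week table
predicted = [5200, 5500, 5300, 5100, 5400, 4240, 4160]
observed = [5100, 5400, 5200, 5000, 5600, 4500, 4300]


def mean_absolute_error(predicted, observed):
    """Average size of the error, ignoring its direction."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def mean_error(predicted, observed):
    """Average signed error (bias): negative means the model underestimates overall."""
    return sum(p - o for p, o in zip(predicted, observed)) / len(observed)


print(f"MAE  = {mean_absolute_error(predicted, observed):.1f} MW")  # MAE  = 142.9 MW
print(f"Bias = {mean_error(predicted, observed):.1f} MW")           # Bias = -28.6 MW
```

The bias is much smaller than the MAE because the weekday overestimates partly cancel the Friday and weekend underestimates. That cancellation is why an absolute measure such as MAE gives a fairer picture of correctness here than a simple average of the differences.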