Practice 3.1 Data with authentic IB Digital Society (DS) exam questions for both SL and HL students. This question bank mirrors the structure of Papers 1, 2, and 3, covering key topics such as systems and structures, human behavior and interaction, and digital technologies in society. Get instant solutions and detailed explanations, and build exam confidence with questions in the style of IB examiners.
Sentencing criminals using artificial intelligence (AI)
In 10 states in the United States, artificial intelligence (AI) software is used for sentencing criminals. Once criminals are found guilty, judges need to determine the lengths of their prison sentences. One factor used by judges is the likelihood of the criminal re-offending*.
The AI software uses machine learning to determine how likely it is that a criminal will re-offend. This result is presented as a percentage; for example, the criminal has a 90% chance of re-offending. Research has indicated that AI software is often, but not always, more reliable than human judges in predicting who is likely to re-offend.
There is general support for identifying people who are unlikely to re-offend, as they do not need to be sent to prisons that are already overcrowded.
Recently, Eric Loomis was sentenced by the state of Wisconsin using proprietary AI software. Eric had to answer over 100 questions to provide the AI software with enough information for it to decide the length of his sentence. When Eric was given a six-year sentence, he appealed and wanted to see the algorithms that led to this sentence. Eric lost the appeal.
On the other hand, the European Union (EU) has passed a law that allows citizens to challenge decisions made by algorithms in the criminal justice system.
* re-offending: committing another crime in the future
Identify two characteristics of artificial intelligence (AI) systems.
Outline one problem that may arise if proprietary software rather than open-source software is used to develop algorithms.
The developers of the AI software decided to use supervised machine learning to develop the algorithms in the sentencing software.
Identify two advantages of using supervised learning.
The developers of the AI software used visualizations as part of the development process.
Explain one reason why visualizations would be used as part of the development process.
Explain two problems the developers of the AI system could encounter when gathering the data that will be input into the AI system.
To what extent should the decisions of judges be based on algorithms rather than their knowledge and experience?
Cloud networks allow for data storage and access over the internet, making data accessible from anywhere. This accessibility supports remote work, file sharing, and collaboration but also raises concerns about data security and control over personal information.
Evaluate the impact of cloud networks on data accessibility, considering the benefits for remote work and the potential security risks.
The types of data collected in modern digital societies are diverse and can be classified into several categories. Quantitative data, such as statistical or financial records, provide numerical insights, while qualitative data, such as user reviews or interviews, offer context and understanding. From geographical and meteorological to medical data, these different types serve various purposes.
For example, data collected in scientific research might include both statistical results (quantitative) and patient experiences (qualitative). This comprehensive view helps in drawing conclusions that are both statistically valid and contextually rich.
Metadata is another critical type of data that describes other data, aiding in its categorization and retrieval. For instance, a photograph's metadata might include the time it was taken, the camera model, and the geolocation, which aids in organizing vast image collections.
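As a rough illustration of how metadata helps organize a collection, the sketch below groups photos by the month they were taken and filters them by camera. The file names, field names (`taken`, `camera`, `geo`), and values are all invented for the example and are not taken from any source.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical photo records: "file" points at the data itself,
# while the other fields are metadata that describe it.
photos = [
    {"file": "IMG_001.jpg", "taken": "2024-03-01T09:15:00", "camera": "Camera A", "geo": (48.86, 2.35)},
    {"file": "IMG_002.jpg", "taken": "2024-03-01T17:40:00", "camera": "Camera B", "geo": (48.85, 2.29)},
    {"file": "IMG_003.jpg", "taken": "2024-04-12T11:05:00", "camera": "Camera A", "geo": (51.51, -0.13)},
]


def group_by_month(records):
    """Group file names by the year-month found in their 'taken' metadata."""
    albums = defaultdict(list)
    for record in records:
        month = datetime.fromisoformat(record["taken"]).strftime("%Y-%m")
        albums[month].append(record["file"])
    return dict(albums)


def filter_by_camera(records, camera):
    """Return the file names of photos taken with the given camera model."""
    return [record["file"] for record in records if record["camera"] == camera]


albums = group_by_month(photos)
print(albums)
print(filter_by_camera(photos, "Camera A"))
```

Neither function opens the image files; categorization and retrieval rely only on the metadata.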
Data analytics involves extracting meaningful insights by identifying trends, patterns, and relationships within large datasets. For instance, companies analyze customer purchase data to model and predict future consumer behavior. This has applications ranging from personalized marketing strategies to more accurate forecasting of demand for products.
Moreover, the increasing availability of big data has enabled researchers to analyze complex relationships between different types of data, such as correlating cultural, financial, and meteorological data to predict economic impacts of climate change. The ability to organize measurable facts about both people and systems allows for a more comprehensive understanding of digital society.
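One very simple way to turn past purchase data into a demand forecast is a moving average. The sketch below is only an illustration under that assumption: the weekly sales figures are invented, and real companies may use far more sophisticated models.

```python
from statistics import mean

# Hypothetical weekly unit sales for one product (oldest first).
weekly_units_sold = [120, 132, 128, 141, 150, 158]


def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if len(history) < window:
        raise ValueError("not enough history for the chosen window")
    return mean(history[-window:])


forecast = moving_average_forecast(weekly_units_sold)
print(f"Forecast for next week: {forecast:.1f} units")
```

The forecast follows the recent upward trend in the data but cannot anticipate sudden changes in behaviour, which is one limitation of predictions built purely on past patterns.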
The data life cycle describes the stages through which data passes, from its creation or collection to its reuse. Initially, data is either collected or extracted through primary methods like surveys or through secondary sources such as previously existing databases. The data is then stored in databases, where it can be processed and analyzed to extract insights.
For example, medical research data might undergo multiple stages of this cycle. After being collected, it is stored securely, processed to anonymize patient information, and then analyzed to identify health trends. Data also needs to be preserved for future research and can be reused in subsequent studies, ensuring that its value extends beyond the initial analysis.
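The stages described above can be sketched in a few lines. This is a minimal illustration, assuming invented patient records and an in-memory list in place of a secure database. It replaces names with a salted hash, which is pseudonymization, a weaker stand-in for the anonymization mentioned in the text.

```python
import hashlib
from statistics import mean

# Stage 1 - collection: hypothetical raw records (all values are invented).
raw_records = [
    {"name": "Patient One", "age": 34, "systolic_bp": 118},
    {"name": "Patient Two", "age": 61, "systolic_bp": 142},
    {"name": "Patient Three", "age": 58, "systolic_bp": 136},
]

# Stage 2 - storage: a list stands in for a secure database here.
store = list(raw_records)


def pseudonymize(record, salt="example-salt"):
    """Stage 3 - processing: swap the name for a salted hash token."""
    digest = hashlib.sha256((salt + record["name"]).encode("utf-8")).hexdigest()
    return {
        "patient_id": digest[:12],
        "age": record["age"],
        "systolic_bp": record["systolic_bp"],
    }


processed = [pseudonymize(record) for record in store]


def average_bp_over(records, min_age):
    """Stage 4 - analysis: mean systolic pressure for patients aged min_age or more."""
    return mean(r["systolic_bp"] for r in records if r["age"] >= min_age)


trend = average_bp_over(processed, 50)
print(trend)

# Stage 5 - preservation and reuse: `processed` can be kept and analyzed again
# in a later study without access to the original names.
```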
Refer to Source A. Identify two stages in the DIKW pyramid and explain their differences.
Using Source B, discuss the importance of metadata in organizing different types of data.
Refer to Source C. Explain how companies use data analytics to predict human behavior. Provide one example.
Based on Source D, describe two key stages of the data life cycle in healthcare, and explain their significance.
Compare and contrast Source B and Source D, focusing on how they address data organization and reuse.
With reference to Sources A-D and your own knowledge, discuss the opportunities and challenges presented by big data analytics in modern society.
Data can be collected in various ways, including primary and secondary methods, and is often organized into databases to ensure it is structured, accessible, and manageable. How we organize data affects its usability and relevance.
Distinguish between primary and secondary data collection, providing one example of each.
Explain how databases organize and structure data to ensure accessibility.
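As one way to picture how a database structures data for access, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `students` table, its columns, and its rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Structure: a table with named, typed fields and a primary key for each record.
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL, year INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO students (id, name, year) VALUES (?, ?, ?)",
    [(1, "Student A", 12), (2, "Student B", 11), (3, "Student C", 12)],
)

# Accessibility: an index lets the database find rows by year without scanning every record.
conn.execute("CREATE INDEX idx_students_year ON students (year)")

# Retrieval: a query describes what is wanted rather than how to find it.
rows = conn.execute(
    "SELECT name FROM students WHERE year = ? ORDER BY name", (12,)
).fetchall()
year_12 = [name for (name,) in rows]
print(year_12)

conn.close()
```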
Firewalls are critical for network security, acting as barriers between internal networks and external threats. They control incoming and outgoing traffic, protecting against unauthorized access and cyber attacks. However, configuring firewalls effectively can be challenging, especially in large organizations.
Evaluate the role of firewalls in securing organizational networks, considering their effectiveness and potential challenges in implementation.
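To make the idea of controlling traffic concrete, here is a minimal sketch of rule-based packet filtering, assuming a "first matching rule wins, otherwise deny" policy. The `Rule` class, the rule list, and the addresses (drawn from documentation and private ranges) are hypothetical; real firewalls inspect much more than a source address and a port.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Rule:
    action: str           # "allow" or "deny"
    source: str           # source network in CIDR notation
    port: Optional[int]   # destination port, or None for any port


# Hypothetical rule set, checked from top to bottom.
RULES = [
    Rule("deny", "203.0.113.0/24", None),   # block a known-bad network entirely
    Rule("allow", "0.0.0.0/0", 443),        # allow HTTPS from anywhere else
    Rule("allow", "10.0.0.0/8", 22),        # allow SSH only from the internal network
]


def decide(rules, source_ip, port):
    """Return the action of the first matching rule, or "deny" if none match."""
    addr = ip_address(source_ip)
    for rule in rules:
        if addr in ip_network(rule.source) and rule.port in (None, port):
            return rule.action
    return "deny"


print(decide(RULES, "198.51.100.4", 443))
print(decide(RULES, "198.51.100.4", 22))
```

Because rule order matters, swapping the first two rules would let the blocked network reach port 443. Ordering mistakes of this kind are one reason configuration becomes harder as rule sets grow in large organizations.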
Malicious software (malware) is a significant threat to users of personal devices, as it can steal sensitive information, disrupt services, or even cause financial losses. With increased connectivity, devices are more vulnerable to these attacks, raising ethical questions about responsibility in cybersecurity.
Evaluate the ethical responsibilities of software developers and users in preventing the spread of malicious software on personal devices.
Cameras in school
The principal at Flynn School has received requests from parents saying that they would like to monitor their children’s performance in school more closely. He is considering extending the school’s IT system by installing cameras linked to facial recognition software that can record student behaviour in lessons.
The facial recognition software can determine a student’s attention level and behaviour, such as identifying if they are listening, answering questions, talking with other students, or sleeping. The software uses machine learning to analyse each student’s behaviour and gives them a weekly score that is automatically emailed to their parents.
The principal claims that monitoring students’ behaviour more closely will improve the teaching and learning that takes place.
Discuss whether Flynn School should introduce a facial recognition system that uses machine learning to analyse each student’s behaviour and give them a score that is automatically emailed to their parents.
Moore’s Law has driven rapid advancements in technology by predicting that the number of transistors on a chip doubles approximately every two years. This trend has influenced the affordability, size, and power of devices like smartphones and laptops, though some predict Moore’s Law may be slowing down.
Discuss the significance of Moore’s Law in shaping the development of personal computing devices, including potential consequences if the law’s trend no longer holds true.
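The doubling described in the passage is simple exponential growth, and the arithmetic can be sketched as follows. The starting count of one million transistors and the three-year alternative period are arbitrary illustrative assumptions, not historical figures.

```python
def projected_transistors(initial_count, years, doubling_period=2.0):
    """Project a transistor count that doubles once every `doubling_period` years."""
    return initial_count * 2 ** (years / doubling_period)


# Ten years at a two-year doubling period means five doublings: a factor of 32.
two_year = projected_transistors(1_000_000, 10)
print(two_year)

# If the doubling period stretched to three years, growth over the same decade
# would be roughly a factor of 10 instead.
three_year = projected_transistors(1_000_000, 10, doubling_period=3.0)
print(round(three_year))
```

The gap between the two results shows why even a modest slowdown in the trend would noticeably change expectations for device power and cost over a decade.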
Source A
Source B (Google review)
FreshPerks helps customers earn points and receive discounts that match their shopping habits. To provide these services, FreshPerks stores purchase history linked to an account identifier and uses analytics to infer product preferences (for example, dietary choices or likely household needs). Customers can turn off “personalized offers” in settings; if they do, they will still receive general promotions. FreshPerks states it does not sell individual purchase histories, but it may share aggregated insights with partner brands, such as “how many shoppers bought a product after seeing a coupon.” FreshPerks retains transaction logs for up to two years to support returns, fraud prevention, and service improvement. The company says it uses “de-identification techniques” when producing reports and that customers can request a copy of their stored data.
Source C (origin unknown)
Personalization is enabled for 87% of active accounts; 13% opt out.
Customers with personalization enabled redeem coupons at 2.4× the rate of those who opt out.
31% of customer data-access requests are made after a disputed charge or return.
Internal testing found that a “budget shopper” segment was assigned disproportionately to customers in lower-income postal codes, even when basket size was similar.
A small sample audit found 5% of accounts had mixed-household data (shared phone number or reused login), producing misleading inferences.
Source D (excerpt from FreshPerks’ competitor’s website)
FreshPerks frames loyalty data as harmless convenience: “discounts you’ll like.” But the data is doing more than counting groceries. It is building probabilistic stories about households - who is “budget,” who is “premium,” who is “at risk of switching” - and then acting on those stories through prices and offers. The company emphasizes “de-identification,” yet shoppers are still being shaped as individuals by a system that remembers them for years. Opt-out exists, but the infographic’s own logic is revealing: redemption is higher when personalization is on, meaning the program is designed to make opting out feel irrational. Data also leaks across social reality. Shared logins, family phones, and multi-person households mean the “you” in the dataset may not be a single person at all. If FreshPerks shares insights with brands, it is exporting behavioural knowledge without giving the public meaningful control over how that knowledge is used. The question is not whether data is collected; rather, it is who gets to benefit from it, and who bears the risks.
Outline two categories of data collected by FreshPerks shown in Source A.
Explain how FreshPerks can use the data described in Source B to create “segments” and personalized offers.
Compare and contrast what Source C and Source D suggest about the opportunities and dilemmas created by FreshPerks’ data practices.
Discuss whether FreshPerks demonstrates responsible data use. Answer with reference to all the sources (A–D) and your own knowledge of the Digital Society course.
Facial recognition algorithms, used for security in airports, rely on large datasets and are sometimes criticized for algorithmic bias. For instance, these algorithms have been known to misidentify individuals of certain racial backgrounds, raising fairness and transparency issues.
Identify two issues related to algorithmic bias in facial recognition software.
Explain why transparency is essential for accountability in facial recognition algorithms used in security.
Discuss one risk associated with “black box” algorithms in facial recognition systems.
Evaluate the impact of algorithmic bias on fairness in facial recognition, particularly concerning racial and ethnic disparities.
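One common way to check for the kind of bias described above is to compare error rates across groups. The sketch below computes a false match rate per group from a small, entirely invented evaluation log; the group labels and outcomes are hypothetical and do not describe any real system.

```python
from collections import Counter

# Hypothetical evaluation log: (group, system reported a match, it really was a match).
results = [
    ("group_a", True, True), ("group_a", False, False),
    ("group_a", False, False), ("group_a", True, False),
    ("group_b", True, False), ("group_b", True, False),
    ("group_b", False, False), ("group_b", True, True),
]


def false_match_rates(records):
    """False match rate per group: false matches divided by all true non-matches."""
    non_matches = Counter()
    false_matches = Counter()
    for group, predicted_match, actual_match in records:
        if not actual_match:
            non_matches[group] += 1
            if predicted_match:
                false_matches[group] += 1
    return {group: false_matches[group] / non_matches[group] for group in non_matches}


rates = false_match_rates(results)
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.2f}")
```

In this invented data the system wrongly matches people in `group_b` twice as often as people in `group_a`. A per-group breakdown like this is only possible when evaluation results are available for inspection, which is one reason transparency matters for accountability.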