Ocean Acidification: a Real Data Analysis With Free Software

Oceans play a central role in regulating the concentration of carbon dioxide in the atmosphere, since they absorb a significant portion of the atmospheric CO2. When seawater absorbs carbon dioxide, a series of chemical reactions occurs and, as a result, the pH of the water decreases, meaning that the ocean becomes more acidic. This phenomenon is known as ocean acidification. The acidification of seawater makes it difficult for calcifying organisms such as oysters, clams or corals to build and maintain their structures, and it can also affect the behavior of other marine organisms. This also has negative consequences for people whose economies or food depend on fish and shellfish.
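Because pH is a logarithmic scale, a small drop in pH means a noticeable increase in acidity. The following R snippet is a minimal sketch of that relationship, using the standard definition of pH as the negative base-10 logarithm of the hydrogen ion concentration. The function name and the two pH values are hypothetical, chosen only for illustration; they are not measurements from any data set.

```r
# pH is defined as the negative base-10 logarithm of the hydrogen ion
# concentration (in mol/L), so the concentration is 10^(-pH).
h_concentration <- function(ph) {
  10^(-ph)
}

# Hypothetical pH values, chosen only to illustrate the scale.
ph_before <- 8.2
ph_after  <- 8.1

# Relative increase in hydrogen ion concentration, in percent.
increase <- (h_concentration(ph_after) / h_concentration(ph_before) - 1) * 100
print(round(increase, 1))
```

With these values, a drop of 0.1 pH units corresponds to roughly 26% more hydrogen ions in the water.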

It is well known that emissions of carbon dioxide to the atmosphere have increased considerably since the industrial revolution. But how has this affected the oceans? Long time-series observations in the ocean, although extremely important, are rare. However, the Hawaii Ocean Time-series (HOT) project has been conducting observations in the North Pacific, at the ALOHA station located 100 km north of the island of Oahu (Hawaii), since October 1988. The project measures the hydrography, chemistry, and biology of the seawater approximately once a month, and the measured data are free and open access.

Thanks to the Hawaii Ocean Time-series project, our students can work with real data to analyze how the concentration of CO2 in the Pacific Ocean has changed in the last 30 years and how this has affected the acidity of seawater.

Secondary education (15-18 years)

Learning objectives

The main objective of this activity is for students to learn about ocean acidification by working with real data. In addition, they will learn a programming language that they can later use to analyze other kinds of scientific data. Specifically, the students will learn:

  • How ocean acidification has increased in recent years.
  • How the concentration of CO2 dissolved in water has changed in recent years.
  • How pH is related to the amount of CO2 dissolved in water.
  • How to create a .csv data file using a text editor.
  • How to use a spreadsheet program to edit a data table.
  • How to extract information from real data using a programming language.

Given these learning objectives, there are two possible contexts in which this activity can be carried out:

  • In Earth and Environmental Sciences: while learning about ocean acidification and environmental issues, students learn to use free software and a programming language for data analysis and graphing.
  • In Computer Science: while learning about spreadsheets and data analysis software, working with real data makes students more aware of the environmental consequences of their carbon footprint.

Step 1: The Software

All the programs used in this activity are available as free software.

The text editor: Atom

We will obtain the data from a web page, so the first step is to turn them into a .csv file. For this, we need a text editor that lets us create and edit plain text files. Do not use a word processor, because word processors add hidden formatting to the text. The text editor we have chosen is Atom, a free and open-source text and source code editor available for macOS, Linux, and Microsoft Windows. If you don’t want to (or can’t) install it on your computer, you can also work with the default text editor of your operating system (Notepad in Windows, TextEdit in macOS, or Vim in Linux).
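To give an idea of what a .csv file is, the sketch below writes a few comma-separated lines to a plain text file and reads them back in R. The column names and values are invented placeholders, not data from the HOT project, and the real file may be laid out differently.

```r
# Hypothetical file contents: the column names and values are invented
# placeholders, not data from the HOT project.
csv_lines <- c(
  "date,temperature,ph",
  "2001-01-15,24.1,8.10",
  "2001-02-12,23.8,8.11",
  "2001-03-14,24.0,8.09"
)

# Write the lines to a temporary plain text file, as a text editor would.
csv_path <- tempfile(fileext = ".csv")
writeLines(csv_lines, csv_path)

# Read the file back: the first line becomes the column names.
measurements <- read.csv(csv_path)
str(measurements)
```

Each line of the file is one row of the table, and the commas separate the columns. This is why hidden formatting from a word processor would break the file.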

The spreadsheet: OpenOffice Calc

Once we have the data file, we’ll have to prepare the raw data for our purposes. A spreadsheet program lets us easily modify the .csv data file. We’ll work with the open-source, multiplatform Apache OpenOffice Calc spreadsheet, but any other spreadsheet program will work similarly.

The R programming language

Once the data are ready, we have to analyze them so we can draw conclusions. We’ll use R, a programming language and environment for statistical computing that is available as free software under the terms of the GNU General Public License. R is widely used in data analysis and scientific research to analyze, visualize and present data.
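As a rough idea of what such an analysis can look like, here is a minimal sketch that fits a straight line to a time series and plots it. It is not necessarily the method used later in this activity. The series is synthetic: it is built so that pH falls by exactly 0.002 units per year, and it does not represent HOT measurements. The variable names are also assumptions made for this example.

```r
# Synthetic placeholder series, NOT HOT measurements: the pH is built to
# fall by exactly 0.002 units per year so the fitted slope is predictable.
year <- 1990:1999
ph <- 8.12 - 0.002 * (year - 1990)
series <- data.frame(year = year, ph = ph)

# Fit a straight line ph = a + b * year by least squares.
fit <- lm(ph ~ year, data = series)
slope <- coef(fit)[["year"]]
print(slope)

# Plot the points and overlay the fitted line.
plot(series$year, series$ph, xlab = "Year", ylab = "pH")
abline(fit)
```

The fitted slope should come out very close to -0.002, the value built into the synthetic series. With real data the points would scatter around the line instead of lying exactly on it.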

Although R has a command-line interface, several graphical user interfaces make it easier to work with. The most specialized one, the open-source RStudio IDE, runs on the desktop in Windows, macOS, and Linux. But because R is so popular, there are also many free online environments that let you run R programs without installing anything. Very handy, isn’t it?

We’ll focus on one of these online IDEs. According to its website, its mission is to make programming more accessible for educators, learners, and developers, or, in its own words, “to get people to code as soon as possible. What if teachers who want to teach programming don't have to also work as IT administrators? What if students can just code their homework without having to set up the development environment on every computer they wanted to code on?”. And it really does make it easy to start coding: you don’t even need to sign up to start working with it! Sure, the desktop RStudio environment is more versatile, and some R functionality doesn’t work in the online version, but for simple programs it’s more than enough. So choosing the desktop or the online environment is up to you.

Giuliacci, 1 month ago
Hi, thanks for sharing this lesson. I'm wondering why you haven't decided to run all the data manipulation and cleansing in R. Is this a personal choice?
bpadin (author), in reply to Giuliacci, 1 month ago
Well, actually it was more of a "teaching" choice. I'm assuming the students have no prior knowledge of R, so I wanted to simplify the programs as much as possible. I think that doing all the data manipulation in R might be overwhelming for them, so I preferred to do it this way. Besides, I've noticed that most of them are not completely fluent in using a spreadsheet, so I thought this would be a good way to practice!
This is really interesting! Thanks for sharing.