1  Open science basics

1.1 Goals and motivation

This is the first module in our workshop on open science. This module describes the need for open science, how it can improve research applications, and exposes you to common ideas and terminology that we’ll be using throughout the day. Consider this your 30,000 foot view of open science. Our later modules will provide more detail on specific topics in open science that you can use for continued learning.

  • Goal: get comfortable with key ideas and concepts for understanding open science
  • Motivation: this is the first step in your open science journey!

1.2 Why open science?

Let’s start with revisiting the scientific process. I’m sure this looks familiar to all of you. This is geared towards an applied research question.

Our basic scientific approach to discovery is motivated by a question or research goal, developing a hypothesis for the question, collecting data based on the hypothesis, developing a tool that can be used for decision-making, and summarizing the results in a conventional format.

Many scientists, especially early career researchers (my past self included), may assume that this is sufficient to affect change. We write the report, send it out into the world, and move on to the next project. This is a common mentality:

“This 500-page report will answer all of their questions!”

From the other side, such as the manager or policy-maker, the report may be received like this:

“This 500-page report answers none of my questions!”

It’s dense, inaccessible, and there are probably questions about the underlying data and methods used to achieve the results. More importantly, it doesn’t present the information in an easily digestible format to quickly make the right decision. Sometimes, if you think you’re doing applied science, it may just be implied science that falls short of application.

Why is this conventional approach to science ineffective at seeding change?

The environmental management community is often siloed with each branch doing their own thing and speaking their own language. Between the research (typically academic) and management community, we call this the research-management divide.

A distinct gap exists between how scientific products are developed and how they can be used to meet management needs. This is often the result of communication barriers, irreproducible results, information loss with poor documentation, inaccessible data, and opaque workflows known only to the analyst.

These barriers can occur at any stage of the research process. This compelling graphic from Michener et al. (1997) describes the atrophy of information in a closed approach to creating science.

The last part is especially morbid. Sometimes, this is called the bus factor. What would happen to your important work and life achievements if you were hit by a bus? Would others be able to pick it up? Research products with a high bus factor are at risk of being lost if critical team members are no longer available. This is a very real problem for continuity of science.

So how do we make changes to our workflows to ensure we can achieve truly applied science using open tools and philosophies?

1.3 Learning and speaking the language of open science

The tools and broader philosophy behind open science can help us bridge the research-management divide. It involves a fundamental shift in how we approach the scientific process, both for your own internal workflows and how you can engage others in the process. By others, we mean not just researchers, but specifically those that need the information to make informed decisions. This also includes your future self.

Now, let’s settle on a definition for open science (from Open Knowledge International, http://opendefinition.org/, https://creativecommons.org/):

“The practice of science in such a way that others can collaborate and contribute, where research data, lab notes and other research processes are freely available, under terms that enable reuse, redistribution and reproduction of the research and its underlying data and methods.”

Key words from this definition are italicized. There are very specific tools in the open science toolbox that enable each of these key words. We’ll cover some of these later.

Similarly, the current administration has declared 2023 the Year of Open Science. Their definition is:

“The principle and practice of making research products and processes available to all, while respecting diverse cultures, maintaining security and privacy, and fostering collaborations, reproducibility, and equity.”

We can can breakdown these definitions into key principles.

  1. Open data

    • Public availability of data
    • Reusability and transparent workflows
    • Data provenance and metadata
  2. Open process

    • Iterative methods using reproducible workflows
    • Collaboration with colleagues using web-based tools
    • Leveraging external, open-source applications
  3. Open products

    • Interactive web products for communication
    • Dynamic documents with source code
    • Integration with external networks for discoverability

You’ll notice that web-based tools and open science are often discussed at the same time. Science existed before the internet. Open science often focuses on how the two can leverage and support one another despite the latter being a relatively new addition to society.

Advocates of open science also use the FAIR principles (Wilkinson et al. 2016) as guidelines. The FAIR acronym stands for Findable, Accessible, Interoperable, and Reusable. Anybody should be able to find your science, access it once it’s found, use it in different environments, and reproduce it for additional analysis.

1.4 Schools of thought

Finally, it’s useful to make a distinction of how different people may talk about open science. This can help you better navigate conversations to become an advocate for open science.

A useful paradigm is provided by Fecher and Friesike (2014) that describes open science as five distinct schools of thought:

These are of course only conceptual boxes and there’s considerable overlap when open science is used in practice. For our purposes, we’ll mostly be talking about ideas and tools from the pragmatic, infrastructure, and democratic schools of thought. The end goal is to provide you with the means to create more efficient and impactful science that can more readily be used by others in a collaborative setting.