Housekeeping rules

  • Please mute your microphone when you are not speaking
  • Please turn off your camera during presentations
  • Post your questions in the chat or raise your hand using the “raise your hand” function in Zoom (“Reactions” button)
  • Tell us how we did in the survey (more about that later)
  • The presentations will be shared afterwards

About us

Dr. Franz Eder

Assoc. Prof. of International Relations

University of Innsbruck


Research focus: Foreign and Security Policy; (Counter-)Terrorism; USA, Europe, Austria; social science research methods (especially QTA and DNA); academic writing and presentation; open and reproducible science


Foreign Policy Lab

AFP3

Anita Bodlos, MA

Digital Science Services

University of Innsbruck


Expertise: Research Data Management


AUSSDA

Structure

Thursday

  • What is Open Science and why should we do Reproducible Research?
  • Research Data Management (RDM) in the Social Sciences: A brief summary (Anita Bodlos)

Homework

  • Install R, RStudio, LaTeX, Git, and Quarto
  • Create a Google Colab and a GitHub account

Friday

  • Jupyter Notebook/Colab
  • Git(hub)
  • RStudio and Quarto
  • The renv package

What is Open Science and why should we do Reproducible Research?

Open Science

Definition

Open Science as a “collaborative culture enabled by technology that empowers the open sharing of data, information, and knowledge within the scientific community and the wider public to accelerate scientific research and understanding” (Ramachandran, Bugbee, and Murphy 2021).

The different facets of Open Science (Source: https://www.earthdata.nasa.gov/)

The nature of science according to Merton (1973)

  1. universalism
  2. common ownership of goods
  3. “disinterestedness”
  4. “organized skepticism”

Drivers of OS I

technological innovations

Drivers of OS II

ethical considerations

Advantages of OS

Demands of the OS movement

The different facets of Open Science (Source: https://www.earthdata.nasa.gov/)

Reproducible Research

Kitzes, Turek, and Deniz (2017)

Kitzes, Justin, Daniel Turek, and Fatma Deniz, eds. 2017. The Practice of Reproducible Research: Case Studies and Lessons from the Data-Intensive Sciences. Oakland, CA: University of California Press. https://doi.org/10.1525/9780520967779.


Online version: http://www.practicereproducibleresearch.org/

Definitions

same data…

Reproducibility

“the ability of a researcher to duplicate the results of a prior study using the same materials as were used by the original investigator. That is, a second researcher might use the same raw data to build the same analysis files and implement the same statistical analysis in an attempt to yield the same results” (Rokem, Marwick, and Staneva 2017, 4)


… vs. different data

Replicability

“refers to the ability of a researcher to duplicate the results of a prior study if the same procedures are followed but new data are collected” (Rokem, Marwick, and Staneva 2017, 4).

Trust me vs. show me

“[O]bservability – visibility into the process of generating results – provides the evidence that the scientific claim is true. It helps ensure we are not fooling ourselves or each other, accidentally or deliberately. It is a safeguard against error and fraud, and a springboard for progress, enabling others to replicate the experiment, to refine or improve the experiment, and to leverage the techniques to answer new questions. It generates and promulgates scientific knowledge and the means of generating scientific knowledge” (Stark 2017, xvii).


  • Key question: How do we bring this basic idea to your computer?

Computational Reproducibility

Definition

“the ability of a second researcher to receive a set of files, including data, code, and documentation, and to recreate or recover the outputs of a research project, including figures, tables, and other key quantitative and qualitative results” (Kitzes 2017, 19).

Key practices

  1. Clearly separate, label, and document all data, files, and operations that occur on data and files.
  2. Document all operations fully, automating them as much as possible and avoiding manual intervention in the workflow when feasible (see the sketch after this list).
  3. Design a workflow as a sequence of small steps that are glued together, with intermediate outputs from one step feeding into the next step as inputs.
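
In practice, these three ideas come together in a single top-level script that glues the steps of the pipeline. A minimal sketch in R (the file names are hypothetical; any consistent numbering scheme works):

    # run_all.R: recreates the entire analysis with one command, source("run_all.R")
    # Each step reads the previous step's outputs and writes its own.
    source("01_acquire_data.R")   # download/copy raw data into data/raw/
    source("02_process_data.R")   # clean the raw data, write data/processed/
    source("03_analyze.R")        # fit models, write results/
    source("04_figures.R")        # build the figures and tables for the report

Because each step is a small, self-contained script, a second researcher can rerun one stage or the whole chain without manual intervention.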

Key stages of reproducible workflow

  • Stage 1: Data acquisition

    • CSV file(s)
    • documentation/metadata file
    • project folder structure
  • Stage 2: Data processing

    • “encode instructions for data processing as computer code” (Kitzes 2017, 25); see the sketch after this list
  • Stage 3: Data analysis

    • reproduce the “thought process”
    • describe why and how you do things (Kitzes 2017, 28)
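
As an illustration of Stage 2, here is a minimal processing script in R (folder, file, and variable names are invented for the example): every change to the data is encoded as code and justified in a comment, and the raw file itself is never edited by hand.

    # 02_process_data.R: hypothetical Stage 2 script
    raw <- read.csv("data/raw/survey.csv")    # raw data is read, never overwritten
    # Why: negative ages are data-entry errors, so we treat them as missing
    raw$age[raw$age < 0] <- NA
    processed <- raw[!is.na(raw$age), ]       # keep only cases with a valid age
    write.csv(processed, "data/processed/survey_clean.csv", row.names = FALSE)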

Challenges and solutions

  • “multiple inconsistent versions of code, data, or both” (Peikert, van Lissa, and Brandmaier 2021, 838)

    • solution: version control (Git)
  • missing documentation and copy-and-paste errors in final reports

    • solution: dynamic documents (Quarto)
  • undocumented or ambiguous order of operations

    • solution: file and folder management; comments; Makefiles or the targets package (see the Appendix)
  • software dependencies

    • solution: dependency management with the renv package

Research Data Management (RDM) in the Social Sciences: A brief summary

Homework

Download, install and register

R, RStudio, LaTeX, Quarto, and Git

  1. Install R from CRAN

  2. Install RStudio Desktop from Posit

  3. Install LaTeX from the LaTeX Project, or install TinyTeX from within R: install.packages('tinytex') followed by tinytex::install_tinytex()

  4. Install Quarto

  5. Install Git
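
Once everything is installed, it is worth checking that R can see the full toolchain. A minimal check to run in the R console (note that install.packages('tinytex') only installs the R package; tinytex::install_tinytex() then downloads the actual TeX distribution):

    R.version.string        # which R version is running
    Sys.which("git")        # path to the git executable ("" if not found)
    Sys.which("quarto")     # path to the quarto CLI ("" if not found)
    tinytex::is_tinytex()   # TRUE once TinyTeX is the active LaTeX distribution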

Workshop

Jupyter Notebook (Google Colab)

Git & Github

Git as a distributed version control system

Chacon and Straub (2024)

Bryan (2018):

Bryan, Jennifer. 2018. “Excuse Me, Do You Have a Moment to Talk about Version Control?” The American Statistician 72 (1): 20–27. https://doi.org/10.1080/00031305.2017.1399928.


Vuorre and Curley (2018):

Vuorre, Matti, and James P. Curley. 2018. “Curating Research Assets: A Tutorial on the Git Version Control System.” Advances in Methods and Practices in Psychological Science 1 (2): 219–36. https://doi.org/10.1177/2515245918754826.
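
The basic Git cycle (initialize a repository, stage changes, commit a snapshot) can be driven from the command line, from the RStudio Git pane, or from R itself. A minimal sketch using the gert package, with the equivalent CLI commands in the comments:

    library(gert)                       # R bindings to Git
    git_init(".")                       # git init: turn this folder into a repository
    git_add("analysis.R")               # git add analysis.R: stage the file
    git_commit("Add analysis script")   # git commit -m "Add analysis script"
    git_log(max = 5)                    # git log: inspect the last five snapshots

Because Git is distributed, every clone carries the full history; GitHub simply hosts one copy that collaborators synchronize with via push and pull.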

RStudio and Quarto
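
A Quarto document is a plain-text file that mixes markdown prose with executable R chunks; rendering reruns all the code, so the numbers and figures in the output cannot drift away from the data. A minimal sketch of a .qmd file (the contents are illustrative):

    ---
    title: "A reproducible report"
    format: html
    ---

    The histogram below is recomputed from the data on every render.

    ```{r}
    data <- read.csv("data/processed/survey_clean.csv")
    hist(data$age)
    ```

Render it with quarto render report.qmd in a terminal, or with the Render button in RStudio.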

The renv Package
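
The renv package pins the exact package versions a project uses, so the code runs against the same dependencies on another machine or years later. The core cycle is three functions; a minimal sketch:

    install.packages("renv")  # one-time setup
    renv::init()              # create a project-local library and an renv.lock file
    # ... work as usual, installing packages into the project library ...
    renv::snapshot()          # record the exact versions currently in use in renv.lock
    renv::restore()           # on another machine: reinstall exactly those versions

Committing renv.lock to Git ties the dependency record to the project's version history.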

Bibliography

Anderson, Samantha F., and Scott E. Maxwell. 2017. “Addressing the ‘Replication Crisis’: Using Original Studies to Design Replication Studies with Appropriate Statistical Power.” Multivariate Behavioral Research 52 (3): 305–24. https://doi.org/10.1080/00273171.2017.1289361.
BOAI. 2002. Budapest Open Access Initiative. Budapest: budapestopenaccessinitiative.org. https://www.budapestopenaccessinitiative.org/read/.
Bourne, Philip E., Jon R. Lorsch, and Eric D. Green. 2015. “Sustaining the Big-Data Ecosystem.” Nature 527: S16–17. https://doi.org/10.1038/527S16a.
Breznau, Nate, Eike Mark Rinke, Alexander Wuttke, Hung H. V. Nguyen, Muna Adem, Jule Adriaans, Amalia Alvarez-Benjumea, et al. 2022. “Observing Many Researchers Using the Same Data and Hypothesis Reveals a Hidden Universe of Uncertainty.” Proceedings of the National Academy of Sciences 119 (44): 1–8. https://doi.org/10.1073/pnas.2203150119.
Bryan, Jennifer. 2018. “Excuse Me, Do You Have a Moment to Talk about Version Control?” The American Statistician 72 (1): 20–27. https://doi.org/10.1080/00031305.2017.1399928.
Bullinger, Hans-Jörg et al. 2003. Berliner Erklärung über den offenen Zugang zu wissenschaftlichem Wissen. Berlin: Max-Planck-Gesellschaft. https://openaccess.mpg.de/68053/Berliner_Erklaerung_dt_Version_07-2006.pdf (accessed 20.02.2019).
Camerer, Colin F., Anna Dreber, Eskil Forsell, Teck-Hua Ho, Jürgen Huber, Magnus Johannesson, Michael Kirchler, et al. 2016. “Evaluating Replicability of Laboratory Experiments in Economics.” Science 351 (6280): 1433–36. https://doi.org/10.1126/science.aaf0918.
Camerer, Colin F., Anna Dreber, Felix Holzmeister, Teck-Hua Ho, Jürgen Huber, Magnus Johannesson, Michael Kirchler, et al. 2018. “Evaluating the Replicability of Social Science Experiments in Nature and Science Between 2010 and 2015.” Nature Human Behaviour 2 (9): 637–44. https://doi.org/10.1038/s41562-018-0399-z.
Chacon, Scott, and Ben Straub. 2024. Pro Git: Everything You Need to Know about Git. 2nd ed. New York, NY: Apress. https://github.com/progit/progit2/releases/download/2.1.438/progit.pdf.
Edwards, Aled. 2016. “Perspective: Science Is Still Too Closed.” Nature 533 (7602): S70–70. https://doi.org/10.1038/533S70a.
English, James F., and Ted Underwood. 2016. “Shifting Scales: Between Literature and Social Science.” Modern Language Quarterly 77 (3): 277–95. https://doi.org/10.1215/00267929-3570612.
Engzell, Per, and Julia M. Rohrer. 2021. “Improving Social Science: Lessons from the Open Science Movement.” PS: Political Science & Politics 54 (2): 297–300. https://doi.org/10.1017/S1049096520000967.
EOSC. 2017. EOSC Declaration. Brussels: European Open Science Cloud. https://eosc-portal.eu/sites/default/files/eosc_declaration.pdf.
Graham, Matthew H., Gregory A. Huber, Neil Malhotra, and Cecilia Hyunjung Mo. 2023. “Irrelevant Events and Voting Behavior: Replications Using Principles from Open Science.” The Journal of Politics 85 (1): 296–303. https://doi.org/10.1086/714761.
Grimmer, Justin. 2015. “We Are All Social Scientists Now: How Big Data, Machine Learning, and Causal Inference Work Together.” PS: Political Science & Politics 48 (1): 80–83. https://doi.org/10.1017/S104909651400178.
Holmes, David. 2016. “A New Chapter in Innovation.” Nature 533: S54–55. https://doi.org/10.1038/533S54a.
Ioannidis, John P. 2005. “Why Most Published Research Findings Are False.” PLOS Medicine 2 (8): 0696–0701. https://doi.org/10.1371/journal.pmed.0020124.
Jagadish, H. V. 2015. “Big Data and Science: Myths and Reality.” Big Data Research 2 (2): 49–52. https://doi.org/10.1016/j.bdr.2015.01.005.
Janz, Nicole, and Jeremy Freese. 2021. “Replicate Others as You Would Like to Be Replicated Yourself.” PS: Political Science & Politics 54 (2): 305–8. https://doi.org/10.1017/S1049096520000943.
Kitzes, Justin. 2017. “The Basic Reproducible Workflow Template.” In The Practice of Reproducible Research: Case Studies and Lessons from the Data-Intensive Sciences, edited by Justin Kitzes, Daniel Turek, and Fatma Deniz, 19–30. Oakland, CA: University of California Press.
Kitzes, Justin, Daniel Turek, and Fatma Deniz, eds. 2017. The Practice of Reproducible Research: Case Studies and Lessons from the Data-Intensive Sciences. Oakland, CA: University of California Press. https://doi.org/10.1525/9780520967779.
Korku Avenyo, Elvis et al. 2016. “A More Developmental Approach to Science.” In UNESCO Science Report: Towards 2030, edited by Flavia Schlegel, 57–83. Paris: United Nations Educational, Scientific and Cultural Organization.
Lazer, David et al. 2009. “Computational Social Science.” Science 323 (5915): 721–22. https://doi.org/10.1126/science.1167742.
Lilienfeld, Scott O. 2017. “Psychology’s Replication Crisis and the Grant Culture: Righting the Ship.” Perspectives on Psychological Science 12 (4): 660–64. https://doi.org/10.1177/1745691616687745.
Liu, Hexuan, and Guang Guo. 2016. “Opportunities and Challenges of Big Data for the Social Sciences: The Case of Genomic Data.” Social Science Research 59: 13–22. https://doi.org/10.1016/j.ssresearch.2016.04.016.
Lupia, Arthur. 2021. “Practical and Ethical Reasons for Pursuing a More Open Science.” PS: Political Science & Politics 54 (2): 301–4. https://doi.org/10.1017/S1049096520000979.
Maxwell, Scott E., Michael Y. Lau, and George S. Howard. 2015. “Is Psychology Suffering from a Replication Crisis? What Does ‘Failure to Replicate’ Really Mean?” American Psychologist 70 (6): 487–98. https://doi.org/10.1037/a0039400.
Merton, Robert K. 1973. “The Normative Structure of Science.” In The Sociology of Science: Theoretical and Empirical Investigations, edited by Norman W. Storer, 267–78. Chicago and London: The University of Chicago Press.
Neupane, Bhanu. 2016. “A More Developmental Approach to Science.” In UNESCO Science Report: Towards 2030, edited by Flavia Schlegel, 6–8. Paris: United Nations Educational, Scientific and Cultural Organization.
Open Science Collaboration. 2015. “Estimating the Reproducibility of Psychological Science.” Science 349 (6251): 943. https://doi.org/10.1126/science.aac4716.
Peikert, Aaron, Caspar J. van Lissa, and Andreas M. Brandmaier. 2021. “Reproducible Research in R: A Tutorial on How to Do the Same Thing More Than Once.” Psych 3 (4): 836–67. https://doi.org/10.3390/psych3040053.
Prinz, Florian, Thomas Schlange, and Khusru Asadullah. 2011. “Believe It or Not: How Much Can We Rely on Published Data on Potential Drug Targets?” Nature Reviews Drug Discovery 10: 712. https://doi.org/10.1038/nrd3439-c1.
Purcell, Andrew. 2019. Big Data in the Humanities and Social Sciences. https://sciencenode.org/feature/big-data-humanities-and-social-sciences.php (accessed 20.02.2019).
Qiu, Lin, Sarah Hian May Chan, and David Chan. 2018. “Big Data in Social and Psychological Science: Theoretical and Methodological Issues.” Journal of Computational Social Science 1 (1): 59–66. https://doi.org/10.1007/s42001-017-0013-6.
Ramachandran, Rahul, Kaylin Bugbee, and Kevin Murphy. 2021. “From Open Data to Open Science.” Earth and Space Science 8 (5): e2020EA001562. https://doi.org/10.1029/2020EA001562.
Rinke, Eike Mark, and Alexander Wuttke. 2021. “Open Minds, Open Methods: Transparency and Inclusion in Pursuit of Better Scholarship.” PS: Political Science & Politics 54 (2): 281–84. https://doi.org/10.1017/S1049096520001729.
Rokem, Ariel, Ben Marwick, and Valentina Staneva. 2017. “Assessing Reproducibility.” In The Practice of Reproducible Research: Case Studies and Lessons from the Data-Intensive Sciences, edited by Justin Kitzes, Daniel Turek, and Fatma Deniz, 3–18. Oakland, CA: University of California Press.
Salganik, Matthew J. 2018. Bit by Bit: Social Research in the Digital Age. Princeton & Oxford: Princeton University Press.
Schooler, Jonathan W. 2014. “Metascience Could Rescue the ‘Replication Crisis’.” Nature 515 (7525): 9. https://doi.org/10.1038/515009a.
Shrout, Patrick E., and Joseph L. Rodgers. 2018. “Psychology, Science, and Knowledge Construction: Broadening Perspectives from the Replication Crisis.” Annual Review of Psychology 69 (1): 487–510. https://doi.org/10.1146/annurev-psych-122216-011845.
Stall, Shelley. 2019. “Make All Scientific Data FAIR.” Nature 570: 27–29.
Stark, Philip B. 2017. “Preface.” In The Practice of Reproducible Research: Case Studies and Lessons from the Data-Intensive Sciences, edited by Justin Kitzes, Daniel Turek, and Fatma Deniz, xvii–xx. Oakland, CA: University of California Press.
Stratmann, Martin. 2003. Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities. Berlin: Max-Planck-Gesellschaft. https://openaccess.mpg.de/Berlin-Declaration.
The Declaration on Research Assessment. 2012. San Francisco Declaration on Research Assessment. San Francisco: DORA. https://sfdora.org/read/.
Van Atteveldt, Wouter, Scott Althaus, and Hartmut Wessler. 2021. “The Trouble with Sharing Your Privates: Pursuing Ethical Open Science and Collaborative Research Across National Jurisdictions Using Sensitive Data.” Political Communication 38 (1-2): 192–98. https://doi.org/10.1080/10584609.2020.1744780.
Van Noorden, Richard. 2013. “Open Access: The True Cost of Science Publishing.” Nature 495 (7442): 426–29. https://doi.org/10.1038/495426a.
Vuorre, Matti, and James P. Curley. 2018. “Curating Research Assets: A Tutorial on the Git Version Control System.” Advances in Methods and Practices in Psychological Science 1 (2): 219–36. https://doi.org/10.1177/2515245918754826.
Wagner, Caroline S., Travis A. Whetsell, and Loet Leydesdorff. 2017. “Growth of International Collaboration in Science: Revisiting Six Specialities.” Scientometrics 110 (3): 1633–52. https://doi.org/10.1007/s11192-016-2230-9.
Wilkinson, Mark D., Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg, et al. 2016. “The FAIR Guiding Principles for Scientific Data Management and Stewardship.” Scientific Data 3 (160018): 1–9. https://doi.org/10.1038/sdata.2016.18.

Q & A

Feedback

Upcoming Events

Appendix

The targets Package
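
The targets package expresses a workflow as a pipeline of named targets with explicit dependencies and reruns only the steps whose code or upstream data have changed. A minimal sketch of a _targets.R file (the data file and model formula are invented for the example):

    # _targets.R: pipeline definition, executed with targets::tar_make()
    library(targets)
    list(
      tar_target(raw_file, "data/raw/survey.csv", format = "file"),  # watch the raw file for changes
      tar_target(raw, read.csv(raw_file)),                           # load the raw data
      tar_target(model, lm(income ~ age, data = raw)),               # fit a simple model
      tar_target(results, summary(model))                            # summarize the fit
    )

After editing only the modeling step, targets::tar_make() rebuilds the model and its summary but reuses the cached data.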