Research documentation

Breakout rooms 1/3 - Visualizing data and results & Research documentation

Markdown CheatSheet
Our Github Repo
Google slides

We’ll cover the following steps:

Step 1: Presentation of the breakout room topic
Step 2: Brief tutorial and time for individual practice
Step 3: Discussion and summary of challenges
Step 4: Return of participants and organizers to the main workshop room

Examples / references that display research documentation, data, and results

Displaying notebooks as experiments on GitHub

Working on a team using GitHub Organizations

Hosting a website to work on research publicly

Documentation of data standards

Archival of code and results for papers

Learning GitHub

Discussion questions:

  1. What other tools have you used for this purpose?
    • Sharing files back and forth
    • Microsoft Office with track changes
    • OSF R package
  2. How does GitHub compare to other tools you’ve used for this purpose?
    • Harder to get some collaborators on board
    • Smaller file size limits
    • Undergraduates can be the best adopters because they have no established habits to unlearn; they can help with data tidying after one workshop at the beginning
    • Microsoft owns GitHub
    • Size limits are actually quite generous on OSF
  3. What are the main challenges in using GitHub for this purpose?
    • Making sure everyone on the team is willing to use it
    • Size limits
    • Certain file types produce cryptic ‘diff’ output in GitHub
    • A large corporation owns the platform that hosts the code
    • Terminology is not consistent across programs
    • Working with a team with varying skill levels, and the challenges that come with that
    • It seems like a big step relative to track changes in MS Word; some collaborators are still reluctant to even use Google Drive
  4. What are the main benefits of using GitHub over some other tool?
    • Making sure all collaborators are using the same version
    • Example: collaborating on a Microsoft Access database with several students, including undergraduates
  5. What could be done to improve GitHub for this purpose?
    • A glossary of terms put into user-friendly language
    • Teaching Git starting with 5 commands, not overwhelming learners with all the vocabulary
    • Teasing apart which software engineering principles researchers need, versus what should be delegated to a software engineering role within a research team
    • Explain to students that it’s very hard to ‘break’ it
    • Starting with low-stakes projects
    • Make learning project-based
    • You don’t have to use GitHub if you use Git: GitLab can be used instead, but it has the same drawbacks. GitLab offered free private repos
    • When teaching Git/GitHub, it’s important not to transfer hesitancy to students; it is helpful to model the behaviour of not needing to be an expert and of making mistakes

Misc discussion

  • Some members see a big gap between their own willingness to adopt new technology and that of their PIs: for example, writing manuscripts as Markdown documents, converting them to Word, getting track changes back, and implementing those changes manually
  • GitHub Copilot: helps autocomplete code; it took everyone’s code, across all license types, and applied machine learning to it
  • People such as software developers experience more top-down support and pressure to adopt, whereas researchers are usually working from the bottom up
  • Dealing with big files: splitting big files into parts and binding them back together can help
  • Adding releases helps get around repository file size restrictions: files attached to a release have a much higher per-file limit, and there is no cap on the total size of a release
  • The OSF R package is an alternative for the written components of projects, while scripts go onto GitHub
  • As a minimum, for situations where no new software should need to be learned: build a timestamp into all documentation (including data downloads)
  • Most software engineers don’t store data in their repositories, either locally or online, which is why there is a size limit
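
Two of the points above, splitting and binding big files and building a timestamp into file names, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not something shown in the session: the function names (`split_file`, `bind_files`, `timestamped_name`), the `.partNNN` suffix, and the timestamp layout are all hypothetical choices. The split is byte-based, so the individual parts are not readable on their own (a CSV row may be cut in half); they are only meant to be bound back together.

```python
from datetime import datetime, timezone
from pathlib import Path


def split_file(path, chunk_bytes):
    """Split `path` into numbered part files of at most `chunk_bytes` bytes each."""
    path = Path(path)
    parts = []
    with path.open("rb") as src:
        index = 0
        while True:
            chunk = src.read(chunk_bytes)
            if not chunk:
                break
            # Hypothetical naming scheme: data.csv.part000, data.csv.part001, ...
            part = path.with_name(f"{path.name}.part{index:03d}")
            part.write_bytes(chunk)
            parts.append(part)
            index += 1
    return parts


def bind_files(parts, out_path):
    """Concatenate part files, in the order given, back into a single file."""
    out_path = Path(out_path)
    with out_path.open("wb") as dst:
        for part in parts:
            dst.write(Path(part).read_bytes())
    return out_path


def timestamped_name(filename, when=None):
    """Return `filename` with a UTC timestamp inserted before its extension.

    `when` should be a timezone-aware datetime; it defaults to the current time.
    """
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    p = Path(filename)
    return f"{p.stem}_{when.strftime('%Y%m%dT%H%M%SZ')}{p.suffix}"
```

For example, `split_file("survey.csv", 50_000_000)` would write parts of at most 50 MB next to the original, `bind_files(parts, "survey_rebuilt.csv")` would reassemble them, and `timestamped_name("survey.csv")` would give a name that records when the data was downloaded. The 50 MB figure is an arbitrary placeholder; pick a chunk size that fits whatever limit the hosting service imposes.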