Breakout rooms 1/3 - Visualizing data and Results & Research documentation
Markdown CheatSheet
Our GitHub Repo
Google slides
We’ll cover the following topics:
Step 1: Presentation of the breakout room topic
Step 2: Brief tutorial and time for individual practice
Step 3: Discussion and summary of challenges
Step 4: Return of participants and organizers to the main workshop room
Examples / References that display research documentation, data, and results
Displaying notebooks as experiments on GitHub
- Team working on optimization of a neural network: https://github.com/DiscoveryDNA/team_neural_network/tree/master/code/experiments
- “Toolchain walkthrough” for organising data for systematic review: https://github.com/softloud/sysrevdata (https://softloud.github.io/sysrevdata/)
Working on a team using GitHub Organizations
- Overarching research team that works on multiple projects, separating projects that range from experiments to software: https://github.com/DiscoveryDNA
- An ugly example of team working: https://github.com/LivingNorway
Hosting website to work on research publicly
- Organization that shares tutorials and research. Each project will evolve into a “post” on website https://github.com/cabinetofcuriosity
- Website: https://github.com/cabinetofcuriosity/cabinetofcuriosity_site
Documentation of data standards
Archival of code and results for papers
- Largely non-computational project. Code was only used to analyze results and create graphs: https://github.com/iamciera/sister-of-pin1-material/blob/master/cosegregation/readme.md
- Largely computational project: https://github.com/iamciera/lcmProject
Learning GitHub
- https://github.com/hlowman/TidyTuesday/blob/master/resources/Git_GitHub_Guide.pdf
- https://afredston.github.io/learn-git/learn-git.html
Discussion questions:
- What other tools have you used for this purpose?
  - Sharing files back and forth
  - Microsoft Office with track changes
  - OSF R package
- How does GitHub compare to other tools you’ve used for this purpose?
  - Harder to get some collaborators on board
  - Smaller file size limits
  - Undergrads can be the best adopters, helping with data tidying (they have no established habits), with one workshop at the beginning
  - Microsoft owns GitHub
  - Size limits are actually quite large on OSF
- What are the main challenges in using GitHub for this purpose?
  - Making sure everyone on the team is willing to use it
  - Size limits
  - Certain file types have cryptic ‘diff’ output on GitHub
  - A large corporation owning the code
  - Terminology is not consistent across programs
  - Working with a team with varying skill levels and the associated challenges
  - Seems like a big step relative to track changes in MS Word - some collaborators are still reluctant to even use Google Drive
- What are the main benefits of using GitHub over some other tool?
  - Making sure all collaborators are using the same version
  - Example of a Microsoft Access database - collaborating with several students, including undergrads
- What could be done to improve GitHub for this purpose?
  - Glossary of terms put into user-friendly language
  - Teaching Git starting with 5 commands, not overwhelming with all the vocabulary
  - Teasing apart which software engineering principles are needed for researchers vs. what should be delegated to a software engineering role within a research team
  - Explain to students that it’s very hard to ‘break’ it
  - Starting with low-stakes projects
  - Make learning project-based
  - You don’t have to use GitHub if you use Git: GitLab can be used instead, but with the same drawbacks. GitLab offered free private repos
  - When teaching Git/GitHub, it’s important not to transfer hesitancy to students – helpful to model the behaviour of not needing to be an expert and of making mistakes
Misc discussion
- Some members see a big gap between their own willingness to adopt new technology and that of their PIs - e.g. writing manuscripts as markdown documents, converting them into Word, getting back tracked changes and implementing them manually
- GitHub Copilot - helps autocomplete code; it was trained with ML on everyone’s code, including all license types
- People such as software developers experience more top-down support/pressure to adopt, whereas researchers are usually working from the bottom up
- dealing with big files - splitting and binding big files can help
- adding releases helps get around file size restrictions - no limit on the total size of a release, though each individual file must be under 2 GiB
- OSF R package as an alternative for written components of projects; scripts go onto GitHub
- As a minimum situation where no new software needs to be learned - build timestamp into all documentation (including data download)
- most software engineers don’t store data either locally or online – this is why there is a limit
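The suggestions above about splitting and binding big files, and about building timestamps into file names, can be sketched in Python. This is only a minimal illustration and was not presented in the session: the function names (`split_file`, `bind_files`, `sha256`, `timestamped_name`), the `.partNNNN` naming scheme, and the timestamp format are all assumptions chosen for the example.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def split_file(path, chunk_bytes):
    """Split `path` into numbered part files of at most `chunk_bytes` bytes."""
    path = Path(path)
    parts = []
    with path.open("rb") as src:
        index = 0
        while True:
            chunk = src.read(chunk_bytes)
            if not chunk:  # end of file reached
                break
            # Hypothetical naming scheme: data.csv.part0000, data.csv.part0001, ...
            part = path.with_name(f"{path.name}.part{index:04d}")
            part.write_bytes(chunk)
            parts.append(part)
            index += 1
    return parts


def bind_files(parts, out_path):
    """Concatenate part files, in sorted name order, into `out_path`."""
    out_path = Path(out_path)
    with out_path.open("wb") as dst:
        for part in sorted(parts):
            dst.write(Path(part).read_bytes())
    return out_path


def sha256(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB blocks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            digest.update(block)
    return digest.hexdigest()


def timestamped_name(stem, suffix, when=None):
    """Build a file name with a UTC timestamp, e.g. for a data download."""
    when = when or datetime.now(timezone.utc)
    return f"{stem}_{when.strftime('%Y%m%dT%H%M%SZ')}{suffix}"
```

A possible workflow: call `split_file("big_data.csv", chunk_bytes)` with a chunk size below the host's per-file limit, commit the parts, and have collaborators run `bind_files` and compare `sha256` digests to confirm the rebuilt file matches the original. `timestamped_name("download", ".csv")` gives each data download a name that records when it was made.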