Version control with the OSF

All right, good afternoon everyone. Welcome to this webinar on version control with the OSF. My name is Courtney Soderberg from the Center for Open Science, and I'm going to be leading this webinar today.

Just to go over a couple of quick logistical things before we get started with the actual content: please feel free to ask questions during the webinar. I'll also make sure to leave time for questions at the end, but I just wanted to quickly go over how you can ask questions. There's a Q&A function, which you should be seeing on your Zoom dashboard. If you click on that you can ask questions, and I will be able to see when you ask questions and mark when I'm answering them. If anything goes wrong during the webinar (you think I'm not sharing the right screen, my audio goes out, something like that), you can also notify me by using the chat functionality. You can send a chat, I'll see that, and we'll hopefully be able to fix the problem.

You can also ask questions after the webinar. If I go over something and you have a question afterwards, you can always feel free to contact us, and there are a couple of options for that. For anything with the OSF that I go over, there are a couple of links to help resources: the first three are the link to the OSF, the link to our basic help documentation, and specifically a link to the help documentation about version control, which we're going to go over today. If you have a specific question that isn't covered in the help documentation, you can always send us an email at contact@cos.io.

All right, so now that we have the logistics worked out, I'm going to go ahead and start talking about version control and how the OSF can help you out with that. For those of you who are less familiar with the OSF, OSF stands for Open Science Framework. The Open Science Framework is a free, open-source, collaborative platform that helps researchers track, document, and potentially share the entirety of
their research workflow. What I'm going to go through today is what version control is, and what some of the features are that are built into the OSF to help researchers have better version control and better documentation. I'm going to be showing an example project that I have from before, called "Gender and Political Affiliation".

When we talk about version control, there are many different ways to go about it, but oftentimes what this means is just: how do I keep track of and manage the different versions there are of a file? How do I keep track of what's changed? How do I keep track of which version of the file is later than the other version? How do I keep track of who is making those changes? Different people will have different systems for this. One thing that I know I used to do, and that is pretty common, is a kind of homegrown version control where you append something to the end of a file name, oftentimes a number or someone's initials or a date or something like that.

Sometimes this will go pretty well for the first one or two versions of the file, but you may eventually end up with something like this, where I have "analyses", "analyses 2", "analyses 3", and we're kind of going, okay, maybe this is going well. Then all of a sudden we get "final analyses 2", "final analyses", "final analyses 2", then "analyses 4" (no idea where 3 went), and then you get something called "actual final analyses" and "actual final analyses 2". These are actually files with names very similar to the names of some analysis script files from way back when I was a first-year graduate student. So this is pretty indicative of what I used to do, and what I think many people will often do: they'll start out with a naming convention that is supposed to take care of the file versions, and for one reason or another that naming convention will quickly get out of hand, and they're left with all of these files with different
names, like "final" and non-final versions and numbers, and it's really unclear who is making those changes and what the actual order is. So it gets complicated to try and recreate how those files changed over time and to figure out what the current version is at any moment in time.

So I'm going to talk about the ways that the OSF helps us with this version control. Rather than uploading files with different names when we have to upload a new version of a file, what are some other ways we could do this? I'm actually going to delete all these other files, and we'll just start from scratch with one analyses file.

There are a couple of ways to deal with versioning on the OSF. Files that can be edited in a text editor (for example an R file, but this could be an R file, a CSV file, a .txt file, or things like SPSS scripts, SAS scripts, or Stata scripts) I can edit directly on the OSF if I want to. If I open this R script, you'll see I have an edit button. This will bring up an editor, and I can actually make some changes. For example, I could say "these are the required libraries", and when I save those changes (there we go) you'll see that the view of the file has been updated. If I click on this revisions tab, a new version has automatically been created for me. It has the date stamp and it has the name of the user who changed the file.

Now, this second version of the file is the one that will automatically show up when I go to view the document, but through the revision history, by clicking on those old version IDs, I can go back and view those old versions, and from the revisions tab I can also download those previous versions. By having these versions automatically created, I have one clean line of versions going forward. I know that this one was created after the previous one, and I know that I created it, but I can always go back and look at those old versions.

Now, when we think about a project, if I am the only one working on this
file, I may just have one linear line. However, if I'm working on this in collaboration with somebody, I might worry: what if I'm making changes and they're also making changes at the same time? Could I accidentally get two versions that are out of sync with each other? To guard against that, there's this checkout button. If I click on this checkout button, it means that the other contributors to this project, the other people who have read/write or administrator access to this section of the project, will be able to view this file but will not be able to edit it. They won't be able to upload a new version, and they won't be able to use the Edit tab on it. The file is basically locked for them, and that's a way to make sure that versions cannot get out of sync. So one of my collaborators could be viewing this R script, but they wouldn't be able to make edits until I had checked the file back in.

So we have two questions. Mallika asked: can the versions have comments, like commits in Git? That's a great question. Currently you cannot append a comment directly to a particular version of a file. That is something that we've had a couple of user requests about, so it is something we're looking into. However, you can comment on a file in general. For example, there's this commenting pane right here, so I can click on comments and I could say "second version has updated documentation" and make that comment. Those comments are stored with the files, but there's no way to connect that comment specifically to a particular version of the file.

I can also show you how this would work with a Word document, for example, or a different R script that has another change. If I go into another section of my project, for example the questionnaire section, I have a questionnaire file. This is a Word .docx file. You can see that I have some text here that says "make some changes", and it has two versions currently. It has a checkout function, so I can
keep somebody from interacting with this if I know I need to make some changes to that questionnaire file, but it doesn't have an edit button. What that means is I can't edit it natively on the OSF. In order to deal with that file, what I want to do is open it up on my own computer, and I can make whatever changes I need to make to that document there. I'm just going to change some basic font colors and resave it on my personal machine with that same name, so resave it as "questionnaire". Then, as I did with the R script, I want to go ahead and upload that file with the same name.

When I go and open that file, you'll see that it now says version 3 and that a new version has been created. Now the text of the header is in green, and I have highlighted the background in dark blue. Not a very important change, but a change that's really easy to pick out. And just like with the R script that I edited natively on the OSF, you'll see that even though this was not edited on the OSF, when I uploaded that new version with the same name, the system automatically created a new version with a new timestamp and the user name.

So if I was collaborating on this with somebody, if for example my boss Brian went and looked at the questionnaire document, decided that he wanted to make some changes, and uploaded a file called questionnaire.docx that was a newer version, this would have a new timestamp and a new version number, and it would say Brian. So Brian's and my changes are all going in one stream forward, and when we view the page we're each going to be seeing the most recent version. Brian doesn't accidentally edit the old version of the questionnaire document that he happens to pick up from his email, because he'll be looking at the OSF, seeing what version is currently up there, and thinking: I know that has to be the most current version, and I'm going to work with that one. So that's how the OSF deals with versioning of documents that have been uploaded,
but some people may already have version control systems that work well for them. I mentioned the kind of homegrown version control of appending a number or initials or a timestamp to the end of a document name, but one popular thing that some people will use to version things, especially code or analysis scripts, is GitHub. GitHub is actually what we use at the Center for the development of all the code related to the OSF.

So let's say that my analysis scripts didn't actually exist within the OSF; they existed in a GitHub repository. But since they're related to my project (they're related to this data, they're related to these methods and materials), maybe I want to have those analysis scripts appear in my OSF project, just so I can link up the different pieces and parts of my workflow. Rather than having to download them from GitHub and upload them to the OSF, I can do something a little bit fancier. If I go into the analysis scripts component and click on settings, I can connect certain add-ons to the OSF. Some of them are storage add-ons like Amazon S3, Dataverse, Box, Dropbox, or figshare, but GitHub is one of the options.

So I can check GitHub, and then it's going to ask me to basically import my access token. If I'd never done this before, it would normally ask me to input my password. I have done this before, so it knows I am who I say I am. Then I get an option of which repository I want to connect to this project. I want to connect my test repository. If I save that and look at the project now, you'll see that the contents of this GitHub repository are appearing in my project. When I click on any of these files, the most current version in GitHub is what appears, and if I click on the revisions tab, it's showing me all the different version IDs that are stored in GitHub, so I can get that information or I can download them.

How this works is as a two-way door. I can still interact with this file through GitHub, and any change I
make to the file on GitHub will show up in the OSF. I can view the file in GitHub if I want, but maybe the collaborator on this project is somebody who doesn't use GitHub. Maybe they don't really want to interact with GitHub or view the file through there. This allows them to look at the file through the OSF even though it's actually coming from GitHub.

If you want to, you can give people the ability to interact with this file from the OSF, and it will push commits to GitHub. You can keep them from doing that, if you want to, by adding them as read-only contributors to this section of the project. That would mean that they would be able to view the contents of that GitHub repository but would not be able to make any changes. If you add them as read/write, technically they have the ability to push commits to GitHub. There is this edit functionality where I can, you know, make even more changes, since this is a text-editable document, and then the change will be pushed to GitHub. But again, you can keep people from doing that by adding them as read-only.

Then if I look back at the GitHub history of this file, you can see how that commit is put into GitHub. It'll say it was updated via the Open Science Framework and that the commit was made by me. If the commit is made by another collaborator on the project, it will say their name. So, I believe... right, so this was a commit made via the Open Science Framework, but it was made by a contributor on the project, Jolene Esposito.

All right, so we have a question from Jade: does the OSF track versions in the same way for documents that have been registered on the OSF? Jade is asking about, I believe (and please correct me if I'm wrong about this), the registration functionality of the OSF. What that means is: this is a living project. I can make as many changes to these files as I want, but there might be certain points in the life of a project that I think are important to keep a read-only kind of
version of: for example, right before I start my data collection, or what my study looked like when I submitted it for publication, or what my project looked like at a certain point during data collection. I can create a snapshot of the project at that point in time. It'll be read-only, it can never be changed, and that is what a registration is.

So if I go to my project, I have this registration option. I'm not going to register this, but I'll show you what a registration looks like. All right, so this is a registration of a project I actually was working on. You can see there's this read-only watermark going along it. Here's an R script, for example. If I look at this R script, I don't have an edit button anymore. It will show me the versions that existed up to that point in time, so I can go back and look at those versions, but none of these can be edited, because the registration is view-only. However, this registration is connected to the living project, so it says this is a registration of this project. If I go into the living project and click on the registrations tab, I can see how many registrations were made and when they were made, and if I wanted to, I could continue to edit these files.

So a registration will have the versions of the files that are in OSF Storage, and the current version of the files in the add-on, at the time when the registration was made. But since the registration is read-only, versions that are made after the registration will not continue to be added to the registration; those will be made in the project. Does that make sense, Jade? Hopefully. Yeah, you can think of the project as one flow and the registration as just taking snapshots that are saved separately but connected to the project. There's always a pointer, but the registration is what the project looked like at this point in time, so it's not going to track what happens in the future, but it will track what had happened in the past.

All right, so I
quickly wanted to go over the last way that the OSF deals with version control. We've talked about version control of files, both text-editable files and other files like Word documents, and we've talked about adding in other tools like GitHub that may have internal version control, and how you can still see those previous versions in the OSF. But I did want to mention really quickly one other place on the OSF that has version control that I haven't talked about, which is the wiki.

So this is the wiki. It's a real-time collaborative editor, which different people use differently. I tend to use it as a way to keep a current description of the state of my project. Some people use it like a notepad; other people use it for abstracts and things. Just like with that R script, there's an edit button and I can make changes. For example, I might at some point change my hypothesis and say "women will be more conservative than men". You can see that it's automatically updating over here, and I save that wiki. Now if I click on Compare here, the wiki will actually give me a diff of different versions of my wiki. So just like files, it will tell me who made the change and when it was made, but it has the additional feature of allowing me to look at different versions and say, okay, what's different between the current version and version 4, for example. It will actually highlight things that have changed between the two versions.

All right, so those were the different places and the different ways that the OSF deals with version control, both allowing you to connect your own version control systems that you may already use, and automatically version-controlling files so you don't have to have a bunch of files with different names trying to track those versions. We have a couple of minutes, so if any of you have questions about anything I went over, we do have time to answer them inside the webinar, or, as I mentioned, you can always email us with questions
after the fact by writing to contact@cos.io. All right, so it looks like nobody has any more questions, so thank you so much for attending the webinar this afternoon, and have a good day.
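
Editor's sketch: the versioning behaviour described in the webinar (re-uploading a file under the same name creates a new numbered version with a user name and timestamp, checkout locks the file for other contributors, and two versions can be compared) can be modelled in a few lines of Python. This is only an illustration of the idea under stated assumptions. It is not the OSF API or the OSF implementation, and the names `Version`, `VersionedFile`, `upload`, `check_out`, `check_in`, and `diff` are hypothetical.

```python
import difflib
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Version:
    number: int          # 1, 2, 3, ... in upload order
    user: str            # who uploaded this version
    timestamp: datetime  # when it was uploaded
    content: str


@dataclass
class VersionedFile:
    """Hypothetical model of one file name with a single linear version history."""
    name: str
    versions: List[Version] = field(default_factory=list)
    checked_out_by: Optional[str] = None

    def upload(self, user: str, content: str) -> Version:
        # Uploading under the same name never overwrites: it appends a version.
        if self.checked_out_by not in (None, user):
            raise PermissionError(f"{self.name} is checked out by {self.checked_out_by}")
        version = Version(len(self.versions) + 1, user,
                          datetime.now(timezone.utc), content)
        self.versions.append(version)
        return version

    def check_out(self, user: str) -> None:
        # While checked out, only this user may add new versions.
        if self.checked_out_by not in (None, user):
            raise PermissionError(f"{self.name} is checked out by {self.checked_out_by}")
        self.checked_out_by = user

    def check_in(self, user: str) -> None:
        if self.checked_out_by != user:
            raise PermissionError(f"{user} does not hold the checkout")
        self.checked_out_by = None

    def current(self) -> Version:
        # The most recent version is what a viewer sees by default.
        return self.versions[-1]

    def diff(self, old: int, new: int) -> List[str]:
        # Line-based comparison of two versions, loosely like a compare view.
        a = self.versions[old - 1].content.splitlines()
        b = self.versions[new - 1].content.splitlines()
        return list(difflib.unified_diff(a, b, f"v{old}", f"v{new}", lineterm=""))


if __name__ == "__main__":
    script = VersionedFile("analyses.R")
    script.upload("Courtney", "library(lme4)")
    script.check_out("Courtney")
    try:
        script.upload("Brian", "library(lme4)\nlibrary(ggplot2)")
    except PermissionError as err:
        print(err)  # the file is locked for Brian while Courtney has it checked out
    script.check_in("Courtney")
    script.upload("Brian", "# required libraries\nlibrary(lme4)")
    print(script.current().number, script.current().user)
    print("\n".join(script.diff(1, 2)))
```

The design point mirrors the talk: there is one file name and one forward-moving list of versions, so nobody has to encode order or authorship in names like "actual final analyses 2".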
