Version control in the classroom: how teachers use GitHub in their courses

How Harvard, San Francisco State, and The College of New Jersey use GitHub in their courses

As a teacher, you juggle the endless stream of student emails, faculty responsibilities, and an ever-evolving field of scholarship. Is it worth switching to version control for your courses? What do students get out of it?

Three teachers candidly reflected on the benefits GitHub offers their classroom practice at the recent Special Interest Group on Computer Science Education conference (SIGCSE) in Seattle.

Skip to a specific section:

  • 2m24s Omar Shaikh (SFSU): GitHub + Travis CI for automated grading
  • 11m50s S. Monisha Pulimood (TCNJ): Collaborating on real-world projects
  • 23m28s David J. Malan (Harvard): Custom tools for CS50
  • 37m25s Questions and answers

Monisha Pulimood, Professor of Computer Science and Chair at The College of New Jersey, shifted her Databases class to a project-based model, where students collaborate in teams with a service learning component.

My concern was that, because of all the collaborative activities, I was losing class time for the students to actually master the content. But in fact, they are. It turns out they are really doing well with that; there’s a good increase in understanding the content. Students can answer deeper questions than they could before.

David J. Malan from Harvard University walked through the nuts and bolts of implementing Git and GitHub for CS50 in this deck.


Why version control is required for Comp 20 at Tufts University

Ming Chow of Tufts gives advice on teaching Git and GitHub

Students don't leave Ming Chow’s course until they are prepared to hold their own on an engineering team. Full stop.

What’s required for those rigorous engineering roles? “Experience making tangible things, mastery of how the web works, and communication,” Ming affirms.

In 2016 he received both the School of Engineering Teacher of the Year and Excellence in Technology Education awards at Tufts, success he reached using GitHub:

I’ve used Git and GitHub for well over five years now. When you take a course with me, it is simply expected that you’re going to learn Git and GitHub.

Ming was inspired to put version control front and center in his curriculum by a colleague, Norman Ramsey, who posted the idea to Stack Overflow. “Norman is a man of many great ideas, and this was one of them. I took his idea and executed it.”

To collaborate, you must communicate

It’s rare that developers build something entirely on their own: whether it’s taking over legacy code, reading documentation or submitting bug reports, we’re always in dialogue with others.

The benefit of version control is that it takes snapshots of progress over time, and small commits with clear progress markers give context to your work. Training students to write clear, frequent commit messages pays off in the future, when other developers are able to chip in and help.

Just to make sure that students master collaboration, Ming designs assignments and group projects so students must rely on one another to distribute the workload. He also asks students to give context to their work with GitHub’s documentation features:

In every assignment and every lab, students have to write a README. And a lot of them ask, “Why the heck do we have to write that?” Well, the difference between a good engineer and a great engineer boils down to writing skill and communication. So that’s the whole point of having a README for every project.

Priming collaboration: GitHub users are invited to create a README for every repository.

Collaboration matters because the web is relationships

To demonstrate how web technologies are interrelated, Ming has students evaluate and use third-party tools like APIs, Heroku, and GitHub to build their projects.

As part of that process, students become wise about which APIs to rely on:

One of the nice things about this course is it reveals the ugliness of web development, especially if you’re using other people’s platforms and services. For instance, Instagram recently changed its policy on rate limits, which students had to adjust for in their projects. When you’re dealing with APIs, you’re at the mercy of that third party. In any job, you’ll need to learn how to deal with that kind of risk.

All that said, Ming recommends the third-party tool Heroku for student projects because students can use Git to push changes to a live product.


Screenshot from a COMP 20 project hosted on Heroku. With these end-of-semester projects, students demonstrate their ability to work in teams, apply everything they’ve learned in the class, and have fun.


Screenshot of another COMP 20 project.

Last, Ming points to GitHub itself to show the moving parts of the web in action:

Git and the web are tightly integrated. One of the main points of using GitHub is to show it’s all based on HTTP and HTTPS. That also reinforces the learning of how the web works, as well.

Why is mastery of the fundamental pieces of the web important? Once students master the web conceptually, they will be able to integrate new technologies into this framework, building upon their understanding in the future.

Careers are the sum of our contributions

At the end of the semester, Ming celebrates the end of the course with perhaps one of the best subject lines of all time:

gift image

Ming frames student work as something valuable, and in moving a repo to their stewardship, he’s giving students the gift of a portfolio. In the future, students can point to this project as proof of how they communicate with other developers.

As their last assignment, he asks students to frame their repository with one last README: a self-assessment of what they’ve learned in the course. From a student:

The most important thing I learned has been the importance of communication in teamwork. For example, dealing with merge conflicts on GitHub was a big problem in my group, and helped us learn how to discuss our changes with each other, even as we were working on separate parts of the project. Another time we were struggling with making our database queries, and even those group members not working on the server side had to learn about it and help.

A duty to train future leaders

Git and GitHub reinforce the ability to communicate with others around real-life projects, which is why they are required for COMP 20.

Ming believes that teaching computer science is not simply about ones and zeroes but about leadership. The technology of the future relies on the habits he’s building in the classroom.


Announcing updates to the GitHub Developer Program


For over three years, the GitHub Developer Program has been a launchpad for developers—from testing their newest applications to growing their biggest businesses. Now, we're excited to build on what's made the program successful for members and make it even more accessible.

Welcome, all developers

We're opening the program up to all developers, even those who don't have paid GitHub accounts. That means you can join the program no matter which stage of development you're in.

New levels and benefits

We're also introducing participation levels that come with existing program perks from us and our partners, like development licenses for GitHub Enterprise, and a new category of benefits that help you build and scale even faster. 17,000 developers around the world are already aboard—if you're kicking around ideas for applications that integrate with GitHub, now's the time to get started!

Here's how it works: Depending on the size of your user base, you'll be placed into one of three levels. For each group, we've made a set of benefits, resources, and tools available to help you advance to the next stage of development. If you're already a member of the GitHub Developer Program, you'll get an email with information about your level and available benefits.

We're so excited to see the applications you're building grow, and cheers to the thousands that have already seen success through the GitHub Developer Program: CircleCI, SRC:CLR, and GitPrime just to name a few.

The GitHub Satellite schedule is here: save your seat

There's still time to register for GitHub Satellite, and now you can buy a ticket knowing more about what's in store.

Private emails, now more private

GitHub has supported using an alternate "noreply" email address to author web-based commits for a while now. Starting today, there's another way to ensure you don't inadvertently publish your email address when pushing commits to GitHub via the command line.

Git uses your email address to associate your name to any commits you author. Once you push your commits to a public repository on GitHub, the authorship metadata is published as well.

If you'd like to ensure you don't accidentally publish your email address, simply check the "Keep my email address private" and "Block command line pushes that expose my email" options in your email settings.


You'll also want to configure Git to use your "noreply" email address. Don't worry: this won't affect your contribution graph. All commits will still be associated with your account.

Once you configure Git, commits will use your alternate "noreply" email address, and any pushes that don't will be rejected.
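As a rough illustration of the configuration step, the sketch below builds the `git config` command that sets the author email and checks whether an address has the shape of a GitHub "noreply" address. The helper names (`is_noreply`, `git_config_command`, `emails_that_would_expose`) are hypothetical, and the local check is a simplification for illustration only; it is not the check GitHub runs when a push arrives.

```python
import re
from typing import List

# Shape of GitHub-provided "noreply" addresses, for example
# "octocat@users.noreply.github.com" or "1234567+octocat@users.noreply.github.com".
NOREPLY_PATTERN = re.compile(r"^(\d+\+)?[A-Za-z0-9-]+@users\.noreply\.github\.com$")


def is_noreply(email: str) -> bool:
    """Return True if the address looks like a GitHub noreply address."""
    return NOREPLY_PATTERN.match(email) is not None


def git_config_command(email: str) -> List[str]:
    """Build the git command that sets the author email for all repositories."""
    return ["git", "config", "--global", "user.email", email]


def emails_that_would_expose(author_emails: List[str]) -> List[str]:
    """Hypothetical local pre-push check: author emails that are not noreply addresses."""
    return [email for email in author_emails if not is_noreply(email)]


if __name__ == "__main__":
    # "octocat" is a placeholder username, not a value from the article.
    print(" ".join(git_config_command("octocat@users.noreply.github.com")))
    print(emails_that_would_expose(
        ["octocat@users.noreply.github.com", "octocat@example.com"]
    ))
```

Running the printed `git config` command once is enough for new commits; commits already authored with another address keep that address in their metadata.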


If you already have a private email address and would like to use this feature, check your email settings to make sure it's enabled. New private emails will have the option enabled by default.

For more information on keeping your email address private, check out the GitHub Help documentation.

Stay safe!

GitHub Issues and user testing as authentic assessment


A course organized around users, not exams

In his Startup Programming course at the University of Victoria, Alexey Zagalsky asks students to design products based on user needs.

Working together in teams of four to six, students deliver pieces of the project at key milestones:

He ties the course to the software industry by inviting experienced mentors from local startups to evaluate student work. Alexey says:

While the end-products are terrific, the larger goal is understanding the process of collaborative software development. Students learn how to listen to users and incorporate their feedback in a thoughtful way.

User testing as assessment

After students ship a working prototype, the next milestone requires user testing with their target audience. And of all the challenges over the semester, students wrestle the most with addressing user feedback:

The most frequent point of failure is not understanding their users. And they wouldn’t see where they’ve failed until they try to get people to use their product.

But being able to listen to feedback, and implement it as part of the design process, is quite important. First, to learn, but also to get a job, because it’s not about writing code but actually understanding what needs to be built and how. One student now works at Amazon. Two or three work at Microsoft. One has gone on to become a UX designer. So many students really benefitted from this approach.

Feedback through GitHub issues

Alexey admits:

You’d be surprised how often students get stuck and never ask for help.

So occasionally he pops into student repositories to see what’s going on, test the code himself, and spot mistakes before it’s too late.

If he spots a bug, he’ll open an issue, outline what’s amiss and upload screenshots of the behavior.

Alexey finds a bug in a student project
From the fall 2016 student project DayTomato.

Next, he works with the team to think about potential solutions:

One team wasn’t sure which metrics they should track using Mixpanel. I suggested they track certain metrics at the prototyping, release, and iteration phases of their project. I gave them some perspective on how to prioritize and implement.

Alexey comments on a student project
From another student group, who made a borrowing and lending application called Bümerang.

Iteration for intrinsic motivation

In an Agile classroom, the goal is not the right answer to the problem, but knowing which problem to take on first, and how to solve it in the right increments.

This course isn’t about assessing a final product and saying, ‘You did that wrong.’ Our in-progress ‘checks’ show the students we care about their work; it’s not just some assignment they need to submit for a grade. The way we care makes them more motivated in turn.


Another student project, SmirkSpace, reflecting on its user feedback for Milestone 3.

A collaborative classroom practice

Alexey’s research focuses on how to use industry tools to build software together, to help his students develop the social ties, trust and curiosity to sustain a successful software career.

So he uses GitHub to enable discovery, design, and collaboration:

It’s about changing the way people work: students and educators, students among themselves, and education’s relationship to industry.

I am working from the hypothesis that software built collaboratively, with many voices and opinions, will improve the collective good of future software, period.

How to implement this classroom practice

Alexey documents all of his course designs and publishes the results of his research on student experiences with GitHub, Slack, Stack Overflow and other real-world tools.

Here’s a recent talk on his course design that discusses the benefits (and drawbacks) of using social tools in the classroom:

Student reflections on tools ecosystem

Welcoming CodePlex projects to GitHub


Earlier today, Microsoft announced the shutdown plans for CodePlex. We're working with the CodePlex team to streamline the experience of importing projects to GitHub for CodePlex users. As always, we will continue to support SVN clients for those who'd prefer to stick with SVN over Git.

Microsoft has made significant contributions to open source on GitHub over the years. With more than 16,000 open source contributors, these contributions continue to positively impact both the Microsoft ecosystem and the open source community.

We welcome CodePlex projects to their new GitHub home!

Disabling projects

Not every team manages their work on GitHub in the same way. Now you can disable repository and organization-wide Projects if you're not using them.


Users with admin privileges on a repository can disable Projects by navigating to that repository's settings and unchecking the "Projects" box. Similarly, organization owners can disable Projects by navigating to an organization's settings and clicking "Projects" in the sidebar. On this page, unchecking the "Enable Projects for the organization" box will disable organization-wide Projects, and unchecking the "Enable Projects for all repositories" box will disable Projects for all repositories in the organization.

Disabling Projects hides the Projects tab from the repository and organization navigation, removes Projects from Issue and Pull Request sidebars, and hides Project-related events from Issue timelines. Disabled Projects are also inaccessible via API requests.

Projects can be re-enabled at any time, at which point all previously-disabled projects will be restored exactly as you left them.

Check out the help documentation and the Projects API page to learn more.

Work/life balance in employee intellectual property agreements

At GitHub, we recognize that running a great business over the long term requires a measure of "work/life balance" – and that includes recognizing that developers and other knowledge workers have creative lives outside of work. Whether that free-time creativity involves contributing to open source projects, art, or activism, we want to encourage our employees, not put up legal barriers. We've codified this approach in our employee intellectual property (IP) agreement. We've made this agreement reusable and have open sourced it as the Balanced Employee Intellectual Property Agreement.

By making the agreement an open source project, we hope to lower barriers to and learn more about innovation in this space. The project FAQ includes further background on related law, policies, and projects. Pull requests are welcome.

If you're in the tech industry, you've probably come across some version of an employee IP agreement before. Typically they assign control over your creativity to your employer, to the extent law allows – sometimes even after you've left a job, through non-compete covenants. These agreements and underlying laws impact worker mobility, innovation, and regional competitiveness. Most non-compete covenants are not enforceable in California, which researchers have long cited as a key reason the computer industry took off in California instead of another contender such as Massachusetts.

GitHub's employment agreement goes a bit further than the California default (and applies to employees outside of California). If you're a GitHub employee, you maintain control over your creation unless it is something "you create, or help create as its employee or contractor" and it is "related to an existing or prospective Company product or service at the time you developed, invented, or created it" or "developed for use by the Company" or "developed or promoted with existing Company IP or with the Company's endorsement." It doesn't matter whether you've used company equipment or not.

Sound interesting? We're always hiring, or we would love to see you start taking the same approach at your company. Check out the repository to learn more.

SHA-1 collision detection on GitHub

A few weeks ago, researchers announced SHAttered, the first collision of the SHA-1 hash function. Starting today, all SHA-1 computations on GitHub will detect and reject any Git content that shows evidence of being part of a collision attack. This ensures that GitHub cannot be used as a platform for performing collision attacks against our users.

This fix will also be included in the next patch releases for the supported versions of GitHub Enterprise.

Why does SHA-1 matter to Git?

Git stores all data in "objects." Each object is named after the SHA-1 hash of its contents, and objects refer to each other by their SHA-1 hashes. If two distinct objects have the same hash, this is known as a collision. Git can only store one half of the colliding pair, and when following a link from one object to the colliding hash name, it can't know which object the name was meant to point to.

Two objects colliding accidentally is exceedingly unlikely. If you had five million programmers each generating one commit per second, your chance of generating a single accidental collision before the Sun turns into a red giant and engulfs the Earth would be about 50%.
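The order of magnitude behind that claim can be checked with the standard birthday approximation. The sketch below assumes SHA-1 outputs behave like uniformly random 160-bit values and reuses the commit rate from the paragraph above; the function names are illustrative. It estimates how long that rate would have to be sustained before an accidental collision becomes as likely as not.

```python
import math

HASH_BITS = 160                       # SHA-1 output size in bits
COMMITS_PER_SECOND = 5_000_000        # five million programmers, one commit each per second
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def objects_for_collision_probability(p: float, bits: int = HASH_BITS) -> float:
    """Birthday approximation: number of random hashes needed so that
    the probability of at least one collision is roughly p."""
    space = 2.0 ** bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))


def years_until(p: float) -> float:
    """Years of committing at COMMITS_PER_SECOND to reach collision probability p."""
    return objects_for_collision_probability(p) / COMMITS_PER_SECOND / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"{years_until(0.5):.1e} years until a 50% chance of an accidental collision")
```

Under these assumptions the answer comes out in the billions of years, which is the same astronomical timescale the comparison with the Sun is pointing at.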

Why do collisions matter for Git's security?

If a Git fetch or push tries to send a colliding object to a repository that already contains the other half of the collision, the receiver can compare the bytes of each object, notice the problem, and reject the new object. Git has implemented this detection since its inception.
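The byte comparison described above can be sketched with a toy content-addressed store. This is not Git's implementation: `ToyObjectStore` and `CollisionError` are hypothetical names, the store hashes raw bytes without Git's object header, and the demonstration swaps in a deliberately weak naming function (the length of the data) so that a "collision" is easy to produce.

```python
import hashlib
from typing import Callable, Dict


class CollisionError(Exception):
    """Raised when two different byte strings map to the same object name."""


def sha1_name(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


class ToyObjectStore:
    """Stores objects under a name derived from their contents."""

    def __init__(self, name_fn: Callable[[bytes], str] = sha1_name) -> None:
        self._name_fn = name_fn
        self._objects: Dict[str, bytes] = {}

    def add(self, data: bytes) -> str:
        name = self._name_fn(data)
        existing = self._objects.get(name)
        # Same name but different bytes: the receiver already holds the other
        # half of a colliding pair, so the new object is rejected.
        if existing is not None and existing != data:
            raise CollisionError(f"object {name} already exists with different contents")
        self._objects[name] = data
        return name


if __name__ == "__main__":
    # Weak naming function for demonstration only: name objects by their length.
    weak_store = ToyObjectStore(name_fn=lambda data: str(len(data)))
    weak_store.add(b"aa")
    try:
        weak_store.add(b"bb")
    except CollisionError as err:
        print("rejected:", err)
```

Note that this check only helps when the receiver already has one half of the pair, which is why the signing scenario in the next paragraph needs a different defense.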

However, SHA-1 names can be assigned trust through various mechanisms. For instance, Git allows you to cryptographically sign a commit or tag. Doing so signs only the commit or tag object itself, which in turn points to other objects containing the actual file data by using their SHA-1 names. A collision in those objects could produce a signature which appears valid, but which points to different data than the signer intended. In such an attack the signer only sees one half of the collision, and the victim sees the other half.

What would a collision attack against Git look like?

The recent attack cannot generate a collision against an existing object. It can only generate a colliding pair from scratch, where the two halves of the pair are similar but contain a small section of carefully-selected random data that differs.

An attack therefore would look something like this:

  1. Generate a colliding pair, where one half looks innocent and the other does something malicious. This is best done with binary files where humans are unlikely to notice the difference between the two halves (the recent attack used PDFs for this purpose).

  2. Convince a project to accept your innocent half, and wait for them to sign a tag or commit that contains it.

  3. Distribute a copy of the repository with the malicious half (either by breaking into a hosting server and replacing the innocent object on disk, or hosting it elsewhere and asking people to verify its integrity based on the signatures). Anybody verifying the signature will think the contents match what the project owners signed.

How is GitHub protecting against collision attacks?

Generating a collision via brute-force is computationally too expensive, and will remain so for the foreseeable future. The recent attack uses special techniques that exploit weaknesses in the SHA-1 algorithm to find a collision in much less time. These techniques leave a pattern in the bytes which can be detected when computing the SHA-1 of either half of a colliding pair. GitHub now performs this detection for each SHA-1 it computes, and aborts the operation if there is evidence that the object is half of a colliding pair. That prevents attackers from using GitHub to convince a project to accept the "innocent" half of their collision, as well as preventing them from hosting the malicious half.

The actual detection code is open-source and was written by Marc Stevens (whose work is the basis of the SHAttered attack) and Dan Shumow. We are grateful for their work on that project.

Are there Git collisions?

Not yet. Git's object names take into account not only the raw bytes of the files, but also some Git-specific header information. The PDFs provided by the SHAttered researchers collide in their raw bytes, but not when added to a Git repository. The same technique could be used to generate a Git object collision, but like the generation of the original SHAttered PDFs, it would require spending hundreds of thousands of dollars in computation.
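To make the header point concrete, here is a minimal sketch of how a Git blob name is derived: the hash covers the string `blob`, a space, the content length in decimal, a NUL byte, and then the file bytes. The function names `raw_sha1` and `git_blob_id` are chosen for this example.

```python
import hashlib


def raw_sha1(data: bytes) -> str:
    """SHA-1 of the file bytes alone."""
    return hashlib.sha1(data).hexdigest()


def git_blob_id(data: bytes) -> str:
    """SHA-1 of the Git blob object: header 'blob <size>' + NUL, followed by the content."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


if __name__ == "__main__":
    content = b"hello world\n"
    print("raw SHA-1:  ", raw_sha1(content))
    print("Git blob id:", git_blob_id(content))
```

Because the header is hashed first, two files whose raw bytes collide do not automatically collide as Git objects; an attacker would have to compute the collision with the header already in place. The second value can be compared with the output of `git hash-object` on a file with the same contents.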

What future work is there?

Blocking collisions that pass through GitHub is only the first step. We've already been working with the Git project to include the collision detection library upstream. Future versions of Git will be able to detect and reject colliding halves no matter how they reach the developer: fetching from other hosting sites, applying patches, or generating objects from local data.

The Git project is also developing a plan to transition away from SHA-1 to another, more secure hash algorithm, while minimizing the disruption to existing repository data. As that work matures, we plan to support it on GitHub.

Invest in tools students can grow with: GitHub and RStudio for data science at Duke University


Data science is a melting pot of disciplines: students from Anthropology to Political Science to Education all sign up for the same course. It’s a challenge to keep the material engaging for everyone.

At the same time, teachers want the next generation of data scientists to be able to analyze any dataset they come across in the future with the same level of rigor used in the classroom. That outcome requires a consistent level of training, across diverse datasets.

Mine Çetinkaya-Rundel, Director of Undergraduate Studies and an Associate Professor at Duke University, tackles these problems head-on. She quite literally wrote the book on the subject, and her open, online course Master Statistics with R certifies thousands of students a year. She edits the Citizen Statistician blog and offers the annual Duke DataFest, a hackathon for working with data.

Her data science course emphasizes “reproducible computation”, a documented process of how data is treated so that the results can be replicated in the future. She meets that learning goal using R Markdown and RStudio, as well as by requiring students to incrementally commit their changes using Git and GitHub.

Students learn the concept and the real-world tool, at the same time

Some instructional products have a steep learning curve in and of themselves. Çetinkaya-Rundel would rather students learn a language in the discipline that they can later build upon:

They learn a limited bit of R syntax that allows them to analyze and visualize data in R. While the goal of the course is not to make each student a proficient R programmer, they learn enough R to be able to build on in their second or third classes. Teaching software designed to be used only in an introductory class would not have this benefit.


Editing an R Markdown file in RStudio by Chester Ismay, MIT License

She wants her students to ship their first visualization project quickly and get started “like a knife through butter” without errors. To bootstrap their efforts quickly, she has them access RStudio Pro using her department’s server.

Best practices for scaffolding collaboration

Her course, Sta112FS, introduces Git on day one to help students learn about version control gradually and in increments. Her GitHub structure is one organization for the course, one repository per team, and one per assignment. She recommends starting out with an individual assignment first, so students can get used to Git’s mental model before adding the complexity of collaboration.


Mine has students access their projects through RStudio, which she integrates with Git. These slides are from her talk, A first-year undergraduate data science course.

On group projects, Mine creates one repository per team, and everyone on the team can push. Knowing that students will encounter merge conflicts when they’re working asynchronously, she provokes the error messages early on in the semester.

In the first exercise, students clone a demo repository and are asked to edit the README in two tools, first on GitHub and then again in the RStudio editor. The conflicting changes throw an error, which she teaches students to resolve calmly by reading the messages.

I show them how I would resolve [the conflict], and then they do the same thing... I tried to create a situation in the classroom that could be frustrating for them later, and say, "This is about to happen. It's important to read the message that it sends you, and this is how you would resolve it."
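For readers who have not seen a conflict yet, the sketch below shows what Git leaves in a conflicted file and how the two competing versions can be read out of it. It assumes Git's default conflict-marker style; the sample README text, the branch name in the closing marker, and the `split_conflicts` helper are made up for illustration.

```python
from typing import List, Tuple

# A README as Git might leave it after conflicting edits (sample content).
SAMPLE = """# Demo repository
<<<<<<< HEAD
Edited in the RStudio editor
=======
Edited on GitHub
>>>>>>> origin/master
See the course site for details.
"""


def split_conflicts(text: str) -> List[Tuple[str, str]]:
    """Return one (local_version, incoming_version) pair per conflict region."""
    conflicts: List[Tuple[str, str]] = []
    ours: List[str] = []
    theirs: List[str] = []
    state = "outside"
    for line in text.splitlines():
        if line.startswith("<<<<<<< "):
            state, ours, theirs = "ours", [], []
        elif line.startswith("=======") and state == "ours":
            state = "theirs"
        elif line.startswith(">>>>>>> ") and state == "theirs":
            conflicts.append(("\n".join(ours), "\n".join(theirs)))
            state = "outside"
        elif state == "ours":
            ours.append(line)
        elif state == "theirs":
            theirs.append(line)
    return conflicts


if __name__ == "__main__":
    for local, incoming in split_conflicts(SAMPLE):
        print("local version:   ", local)
        print("incoming version:", incoming)
```

Resolving the conflict means editing the file so that only the intended text remains, deleting the three marker lines, and committing the result.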

The setup pays off in engagement and curiosity

For Mine’s course, using a computational tool prompts students to engage with data directly, an important pedagogical choice that makes their analysis both personal and real. There’s a freedom in exploring and tinkering with the real-world tool that professionals use.

Using a computational tool means that you can give the dataset to the students, and they can play around with it. They might be playing around with it in a way that you have designed the assignment or the assessment, but still it gives them the flexibility to look at another variable, make a different plot, than what was assigned for them to do.

Git and GitHub expertise after students leave the classroom

While she keeps student repositories private, merely using GitHub gives her students an edge later in their careers. They can also choose to make their repositories public after they are done with the course.

A lot of students have said to me later, even first-year undergraduates, that using GitHub has helped them a lot when they went for an internship or a research position interview.

They are able to say, "Oh, I already have worked with GitHub. I'm familiar with it. I know how it works." So I think they are at least able to put that on their CVs and go into a situation where there's a research or data analysis team and say, "Yeah, sure. I am actually familiar with the same tools that you use."

Creating RStudio projects from GitHub repositories

To see an example of Git and RStudio in action, check out this tutorial from Dr. Nicholas Reich at UMass Amherst:

