Saturday 3 May 2014 — This is close to 11 years old. Be careful.
I continue to notice an unsettling trend: the rise of the GitHub monoculture. More and more, people seem to believe that GitHub is the center of the programming universe.
Don’t get me wrong, I love GitHub. It succeeded at capturing and promoting the social aspect of development better than any other site. And git, despite its flaws, is a great version control system.
And just to be clear, I am not talking about the recent turmoil about GitHub’s internal culture. That’s a problem, but not the one I’m talking about.
Someone said to me, “I couldn’t find coverage.py on GitHub.” Right, because it’s hosted on Bitbucket. When a developer thinks, “I want to find the source for package XYZ,” why do they go to the GitHub search bar instead of Google? Do people really so believe that GitHub is the only place for code that it has supplanted Google as the way to find things?
(Yes, Google has a monopoly on search. But searching with Google today does not lock me in to continuing to search with Google tomorrow. When a new search engine appears, I can switch with no downside whatsoever.)
Another example: I’m contributing a chapter to the 500 lines book (irony: the link is to GitHub). Here in the README, to summarize authors, we are asked to provide a GitHub username and a Twitter handle. I suggested that a homepage URL is a more powerful and flexible way for authors to invite the curious to learn more about them. This suggestion was readily adopted (in a pending pull request), but the fact that the first thing to come to mind was GitHub+Twitter is another sign of people’s mindset that these sites are the only places, not just some places.
Don’t get me started on the irony of shops whose workflow is interrupted when GitHub is down. Git is a distributed version control system, right?
Some people go so far as to say, as Brandon Weiss has, GitHub is your resume. I would hope they do not mean it literally, but instead as a shorthand for, “your public code will be more useful to potential employers than your list of previous jobs.” But reading Brandon’s post, he means it literally, going so far as to recommend that you carefully garden your public repos to be sure that only the best work is visible. So much for collaboration.
There’s even a site that will read information from GitHub and produce a GitHub resume for you; here’s mine. It’s cute, but does it really tell you about me? No.
There is power in everyone using the same tools. GitHub succeeds because it makes it simple for code to flow from developer to developer, and for people to find each other and work together. Still, other tools do some things better. Gerrit is a better code review workflow. Mercurial is easier for people to get started with.
GitHub has done a good job providing an API that makes it possible for other tools to integrate with them. But if Travis only works with GitHub, that just reinforces the monoculture. Eventually someone will have a better idea than GitHub, or even git. But the more everyone believes that GitHub is the only game in town, the higher the barrier will be to adopting the next great idea.
I love git and GitHub, but they should be a choice, not the only choice.
Comments
In general, I'm against the monoculture. Just that I have an extra reason to be against it.
A newer open source contributor spoke about her first year of contributions last year at Open Source Bridge and said that when she is trying to find out whether a piece of software is open source, or trying to find open source projects on a particular topic, she goes to GitHub and does a search. I think this reality is one to accommodate by, for instance, having a placeholder GitHub presence that perhaps mirrors the canonical Git repo and turning off pull requests (or syncing pull requests into Gerrit or Bitbucket requests). That is what Wikimedia does.
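A placeholder mirror of this kind could be scripted roughly as follows. This is only a sketch: the function names, URLs, and working directory are hypothetical, it does not describe how Wikimedia actually does it, and turning off pull requests is a hosting-site setting that is not shown. It relies on `git clone --mirror` and `git push --mirror`, which copy all refs.

```python
import subprocess


def mirror_commands(canonical_url, mirror_url, workdir):
    """Build the git commands that copy every ref from one host to another."""
    return [
        # Bare clone that tracks all branches and tags of the canonical repo.
        ["git", "clone", "--mirror", canonical_url, workdir],
        # Push every ref to the placeholder repo, deleting refs that are gone.
        ["git", "-C", workdir, "push", "--mirror", mirror_url],
    ]


def run_mirror(canonical_url, mirror_url, workdir):
    """Run the commands in order; raises CalledProcessError if git fails."""
    for command in mirror_commands(canonical_url, mirror_url, workdir):
        subprocess.run(command, check=True)
```

As written this is a one-time copy; a recurring sync would fetch into the existing mirror clone instead of cloning again.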
But it's just not a good idea to assume that all of the open source community's development is on GitHub. Wikimedia uses a Gerrit installation (and may move to a Phabricator instance soon), and of course there are a bunch of projects on BitBucket, Gitlab, Gitorious, and what have you. I agree with you - the "GitHub is the only place" assumption is harmful.
(As a regular user of GitHub for both personal and professional projects.)
Imagine a developer in their early 40s, who had children before github even existed. They're busy with their work, their family, and they really don't have time for side projects. Nor should they; spending all your spare time programming if you have a family and a job makes you an unbalanced individual.
Sure, such people may do the occasional small thing to learn new technologies, but you know what... when I do that, it's usually only useful to me, so I don't put it on github. There are squillions of utterly useless 'projects' on github.
The whole thing reeks of the twenty-something childless tunnel vision that badly afflicts the tech sector.
As for twitter, aside from being able to follow food trucks, I seriously don't get it.
But I do agree with your idea that it could imply business monopolization. As Noufal pointed out, Free Software Needs Free Tools.
Another GitHub-based CV is available: http://osrc.dfm.io/ with even more details than you ever expected.
What a silly article.
Facebook : social network :: Github : social distributed version control
Amazon : ebooks :: Github : social distributed version control
WordPress : blogging :: Github : social distributed version control
Excel : spreadsheets :: Github : social distributed version control
When the Ada Initiative left their partnership with Github, I decided to migrate my projects as well. I already had an account on Bitbucket; I moved my private repos there to save seven bucks a month long ago. And I already had an account on Gitorious; I forked one of Zed Shaw's projects a long time ago and it's there. So I created a GitLab account and started doing a feature comparison.
First of all, nearly all of my projects are one-person projects and most of them are just free cloud backup. Moving those to any of the Github alternatives is trivial. I settled on GitLab because it's open source (Rails) and the GitLove project is there. I've moved most of the small projects.
A few of the larger projects were next. There were about four of them with a significant number of watchers but really only one is active. The others were functionally superseded by my main project, CompJournoStick, so I made the hard decision to abandon/archive them and break all the links they've built up over the past five years.
That leaves CompJournoStick and my Octopress blog on Github Pages at znmeb.github.io. The blog is inactive; I use it mostly for CompJournoStick release announcements. I could probably port it to a wiki on one of the other hosting sites, or go back to a self-hosted WordPress blog. I tried copying it to Bitbucket's "almost-Github-pages" scheme and there were too many gotchas. For now, I'm leaving it on Github.
I want to move CompJournoStick but:
1. Broken links - whatever search positioning, social media links, etc. I have in the journalism community go out the window.
2. Broken trust - I look like a flake. Suddenly anyone who thought about engaging me on the project has a reason to avoid it. I haven't actively sought contributors, but I'd like to at least have the option. Saying "if you want to work with me on CompJournoStick, you have to move to Gitlab" is not an option.
3. I need an account on Github anyway to contribute to all of the other Github projects I use.
New projects will not be on Github. I'll most likely construct them on GitLab (or GitLove, if it gets moving) with backups on Bitbucket. I don't know about the blog; blogging is a chore (as is maintaining a LinkedIn profile). I'll probably drop the Gitorious account; it's just an addition to my online attack surface at this point.
The problem is that Github is good at what it does, and it will be hard to replace it with something better, the way Github replaced Sourceforge and Sourceforge replaced all those single CVS repositories.
But yes, Github has to die.
1. GitHub
2. Bitbucket
3. Gerrit (OpenStack)
4. Launchpad
5. Gitorious
6. Google Code
7. Sourceforge
8. Patches via email
This is probably the rough order of preference too. Pretty subjective of course. I'll Google (ahem, web search, Google is another monoculture :-)) and contribute via whatever the maintainer has. I am probably somewhat more likely to do it if it's one that I'm comfortable with, though I don't think about it consciously. I suspect that GitHub attracts the most contributors, because it's well-known and easy. There are lots of choices though.
I was hoping for something like Diaspora (https://joindiaspora.com/) so we could use our choice of repo or self-hosting but still interact with a global community.
But wait a minute... Aren't there other search engines? What made you use Google as if "Googling" is search? Because they won search just like Github won version control and code hosting.
A while ago I started trying to contribute to some projects on github, and in my experience it is far from making it easier to contribute to such projects. There are just so many ways in which a git(-hub) workflow can be organized, and very few projects take the time to explain, step by step, what you need to do to make your first pull request. They just assume, "hey, we're on github so everybody should know how this works, right?". Wrong.
Apart from that, I've seen so many internet services come and go in the last twenty years that I've grown to be _very_ wary of relying on third parties to host my stuff, be it email, calendars, or my code.
Github is much worse: there are no company accounts, you're required to create and use a personal account which the companies then whitelist for access to their private repos. The Github TOS requires your real name and limits you to a single personal account. In other words they track your private work associations, since your personal account gets connected with the repos of companies X and Y even if you never say anything public about that. This strikes me as extremely invasive: if you're a consultant and your clients use Github, then Github now knows your client list. There is no valid business justification I can see for that disclosure: I'd have no problem at all having each client buy me a separate, paid account under the clients' names, similar to the gmail accounts.
Mostly because of the above privacy issue but also because of the monoculture, I've been boycotting Github (along with Google, Facebook, etc.) to the extent that I can.
While we haven't contributed to github, we've used plenty of ruby gems from it, and our experiences have been poor. With the influx of new devs learning that github is where you throw the code you think will help everyone out, every dev and his mother throws their poorly-built junky code up there. We get new programmers all of the time including 20 unnecessary, buggy gems on our projects, and even the gems we actually needed are buggy and slow, so we end up having to build the functionality from scratch anyway or switch them out for another ridiculously-named gem and rebuild the interface. It ends up actually costing us significant amounts of time supporting these crappy gems.
So you can understand why it makes me irate when employers begin considering github involvement a heavy factor in determining if a developer is worthy of a high position in their Ruby shop. The best Ruby devs I know aren't on Github, started with languages such as C and Perl, and don't have time to jump into every hip community that pops up every few years. Most of our github-fanclub devs haven't touched anything other than Ruby, don't know how to use other repositories, and keep on including gems to do work for them that we end up having to remove because the gem sucks.
Thanks for the article, which eloquently addresses this trend.
Can we start a "Pro Diversity - Choose Bitbucket" campaign? ;)
https://twitter.com/RandySyringPro/status/463014829406838784
#ProDiversityChooseBitbucket
Also, for what it's worth, I am about 100x less likely to contribute to a project that isn't on github. I think you're only hurting yourself by going against the flow.
Depends on how you use Google, really. If you don't have an account and only use the search functionality, perhaps this would be true.
But I have an account, I use Gmail, GDrive, Chrome (until some weeks ago, at least), I have a Blogspot blog, and I use my Google account to log in to several other websites. The amount of context and targeting that goes into every search I do is certainly very high, and, although some could argue that this is a privacy issue (and I might agree with them), the fact is that I get much better search results with Google than with any other search engine that doesn't "know me" so well.
So, yes, I could change search engines tomorrow, but it wouldn't be without a price.
I use it. Also, I prefer Mercurial to Git, who needs staging?
You say you are 100x less likely to contribute to a project if it isn't on GitHub, but you don't say why. Is it really so hard to use other sites? Isn't most of the work the actual coding, debugging, etc. that happens on your own machine?
With GitHub, I know the whole process on how to contribute code. I clone the repo, fork it, make some changes, push it up, and make a pull request. I've done this thousands of times. The entry barrier is low for a few reasons. First, I already know all the tools (git, GitHub, etc.). A project in hg is extremely hard for me to contribute to, because I do not know it. Second, these tools are actually powerful enough to make certain things possible. git's branching model and easy merging make contributing easier than submitting literal patches. GitHub makes both opening and merging a pull request a single click. In fact, GitHub makes it possible to contribute to a project without ever leaving the browser. It's designed for people who don't want to use git, but I've done it myself. I am definitely savvy enough to clone a repo, fire up emacs, fix some typo, and push up a branch, and do it all pretty quickly, but even this is way slower than just correcting the typo in GitHub and pressing "fork and pull request".
We recently moved all SymPy issues from Google Code to GitHub. This was a huge downgrade in terms of issue features. Google Code lets you do nice things like automatically apply labels, and has very powerful searching (in GitHub, I can't even figure out how to do a negative label search). But this was a downgrade for a few (basically me and maybe a couple other core devs), and an upgrade for the community (all the people who want to report issues).
If you hate drive-by contributions, or if you want to develop your code in the cathedral, then go ahead and pick whatever works the best for you, and you alone. But if you're like me and you love drive-by contributions, and you think that the bazaar is a much better place to develop code, then GitHub is really the only place that you can do it, because GitHub has the one thing that no other site has, which is momentum.
> ... I suggested that a homepage URL is a more powerful and flexible way for authors to invite the curious to learn more about them. This suggestion was readily adopted (in a pending pull request), but the fact that the first thing to come to mind was GitHub+Twitter is another sign of people's mindset that these sites are the only places, not just some places.
So let's see, the fact that the 500 lines book project is hosted on GitHub, and the contributors are asked for their GitHub usernames as the first point of contact, strikes you as strange? And to be fair, everyone did readily accept the suggestion to use a URL instead, but guess what, my thought is that most everyone is going to just submit their GitHub profile URL. So at best you've added a few more characters to what everyone has to type out.
> Don't get me started on the irony of shops whose workflow is interrupted when GitHub is down. Git is a distributed version control system, right?
Yes, _git_ is a DVCS, but _GitHub_ is a full-featured code hosting, issue tracking, release and deployment solution. So let's not set up a straw man here, please.
> ... But reading Brandon's post, he means it literally, going so far as to recommend that you carefully garden your public repos to be sure that only the best work is visible. So much for collaboration.
Let's face it, whenever a system gives rise to any kind of incentive whatsoever, there will always be some trying to game that system. So, some people will think that GitHub seems like a good proxy for coding mojo and then they'll take that to the logical extreme. Doesn't mean that's actually true. See e.g. GitHub's famous 'meritocracy rug' tweet.
> ... Still, other tools do some things better....
Yes, some tools are better at some tasks and others are better at others. With this kind of argument, no one ever gets anywhere :-)
> ... But if Travis only works with GitHub, that just reinforces the monoculture.
Travis is hardly the only game in town. I'm no CI expert and even I know about Jenkins and CruiseControl. The kind of monoculture that Travis creates by targeting GitHub as a code source is simply a trade-off to get simplicity in exchange for flexibility, and is standard practice in software engineering.
> ... But the more everyone believes that GitHub is the only game in town, the higher the barrier will be to adopting the next great idea.
This is the old vendor lock-in bugaboo. Let's break it down: GitHub locks you in with its pull requests, its commit- and line-level comments, issues, social features etc. That's again a trade-off that you need to choose to make. E.g., Linus Torvalds explicitly doesn't use the 'lock-in' features of GH despite mirroring some of his projects there. Others embrace all the lock-in. The fact is, there will be lock-in to some extent no matter what tool you use, GitHub or no GitHub. Do you use Bugzilla today and decide to migrate to Phabricator tomorrow? Well, you'll need to massage all your issue data out of BZ and import it somehow.
With git, there's no question of lock-in--even at the most primitive level, you can always get all your commits out as plain text patches. Generally, anyone worried about lock-in with a VCS doesn't understand what a VCS _is._
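For example, here is a minimal sketch of that export, assuming git is installed; the helper names and paths are hypothetical. `git format-patch --root` writes one plain-text patch file per commit reachable from the given revision. Merge commits are not emitted by format-patch, so this is not a complete backup of history.

```python
import subprocess


def format_patch_command(repo_dir, out_dir, revision="HEAD"):
    """Build a git command that writes each commit as a plain-text patch."""
    # --root includes the very first commit; -o selects the output directory.
    return ["git", "-C", repo_dir, "format-patch", "--root", "-o", out_dir, revision]


def export_patches(repo_dir, out_dir, revision="HEAD"):
    """Run the export and return the patch file paths that git prints."""
    result = subprocess.run(
        format_patch_command(repo_dir, out_dir, revision),
        check=True, capture_output=True, text=True,
    )
    return result.stdout.splitlines()
```

Because the command runs with `-C repo_dir`, a relative `out_dir` is resolved inside the repository directory.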
As for Github (as distinct from git) it is like any major city in the real world. People from rural areas migrate to cities because... other people migrate to cities, and so on. Network effects are intrinsically beneficial. This is not sinister. It's just topology and feedback.
I use bitbucket and github interchangeably (unless I want a private repository for some reason: bitbucket!). I also use codeplex occasionally, and at work (sadly) I have to use TFS. I seamlessly mirror change sets between TFS and git repositories for convenient offline working.
If you define a VCS as something that uses a working directory in the file system, and transactional commits with comment+timestamp+username, then that forms a very open protocol. You can very easily shunt changes between such systems by getting them to look at the same working directory. There is really no lock-in at all.
Who cares. You probably don't know any "real" languages anyway.
Good article. For a community that generally consists of a population fluent in multiple programming languages (and in some cases multiple natural languages) this concept of consistent user experience seems contrived and useless. The experience only needs to fit the purpose. In some cases that's Github. In others it's Bitbucket. Sometimes it isn't even git based. Some people and projects prefer Mercurial. I find those people insane but I won't not contribute because their filing system doesn't fit my personal sensibilities. The quality of the project itself is more important.
One encounters this all the time. Heroku's tooling comes to mind.