Saturday, August 15, 2009

StackOverflow Experiment Results

Thanks to the many users who pointed me to the official data dump, available here, I was able to complete my experiment.

I measured, using the number of questions asked containing a specific tag, the activity of various programming languages throughout the week. My hypothesis is: Newer dynamic languages like Ruby and Python will see a rise in questions asked on the weekend while more corporate languages like C# and Java will see a dropoff in activity on the weekend.

My theory is that programmers choose to use languages like Python and Ruby for their personal projects, despite their weaknesses, because these languages are more fun to program in. Since programmers tend to work on these projects at night and on the weekends, they will probably be asking questions related to their projects during these times.

Fortunately, the results supported my hypothesis. A plot (made using Python) of the relative number of questions asked per day of the week is shown below. The values were computed by calculating the percentage of questions asked for each topic relative to the total number of questions being asked. This controls for the overall drop in traffic on the weekends.
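
For anyone who wants to try the same normalization, here is a minimal sketch of the calculation described above. It is not the script used for the plot. It assumes the questions have already been pulled out of the data dump as (date asked, set of tags) pairs, and the function name relative_activity and the sample data are made up for illustration.

```python
from collections import Counter, defaultdict
from datetime import date


def relative_activity(questions, tags):
    """Return {tag: [percent for Mon..Sun]}.

    Each value is the share of that weekday's questions that carry the tag,
    which controls for the overall change in traffic between days.
    """
    totals = Counter()             # all questions asked on each weekday
    tagged = defaultdict(Counter)  # questions per tag on each weekday
    for asked_on, question_tags in questions:
        day = asked_on.weekday()   # Monday is 0, Sunday is 6
        totals[day] += 1
        for tag in tags:
            if tag in question_tags:
                tagged[tag][day] += 1
    return {
        tag: [
            100.0 * tagged[tag][d] / totals[d] if totals[d] else 0.0
            for d in range(7)
        ]
        for tag in tags
    }


# Made-up sample data, only to show the shape of the input (not real results).
sample = [
    (date(2009, 8, 10), {"c#", ".net"}),  # a Monday
    (date(2009, 8, 10), {"java"}),
    (date(2009, 8, 10), {"c#"}),
    (date(2009, 8, 10), {"python"}),
    (date(2009, 8, 15), {"python"}),      # a Saturday
    (date(2009, 8, 15), {"c#"}),
]
shares = relative_activity(sample, ["c#", "python"])
```

Dividing by the day's total rather than plotting raw counts is what keeps a quiet Saturday from making every language look like it is falling off.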

Python and Ruby both have a sharp rise on the weekend, while C# and Java both fall off. The fall of C# is quite a bit more pronounced than that of Java, but the effect is still clear. Another interesting note is that the two "workweek languages" both have a rise in activity on Mondays. Maybe programmers leave work Friday and continue to mull over problems at work during the weekend, then ask their questions early Monday morning.

Even though the relative activity of Python and Ruby rises on the weekend, it is important to note that C# still sees around three times as much activity. This shows that there are still more people using C# than Python on the weekend, just not as many as during the week.

I'm not too sure exactly what the implications of these results are. Let me know what you think.


Gian said...

(I already posted this on the thread on HackerNews, but I decided to add it here as well for the purposes of discussion):

This seems to assume a fixed number of programmers, all of whom program during the week and at the weekend.

My guess would be that it's much more likely to be two largely disjoint sets:

Professional programmers at work during the week, and amateur programmers who have other day jobs (e.g. school or non-programming jobs) who are more likely to be programming during the weekends.

If we assume that this is the case, then the data simply suggests that professional programmers are more likely to use C# and Java, whereas hobbyist/amateur programmers are more likely to use Python or Ruby.

This is just a hypothesis, but it is equally well supported by the data as the article's hypothesis:

"programmers choose to use languages like Python and Ruby for their personal projects, despite their weaknesses, because these languages are more fun to program in"

Which assumes that all professional programmers are also hobbyist programmers in the weekends, and that the numbers of amateur programmers are not significant enough to make an impact on search statistics.

Benjamin Nortier said...

I would like to disagree partly with the statement that "programmers choose to use languages like Python and Ruby for their personal projects, despite their weaknesses, because these languages are more fun to program in"

Firstly, which "weaknesses" are being referred to?

Secondly, I think weekend programmers prefer languages such as Python and Ruby because they are more powerful and more productive. If you're using your free time to program, you are more likely to use something that gets you to your goal more quickly, because you're not being paid for it and your spare time is more precious than work time.

Gian said...

"I think weekend programmers prefer languages such as Python and Ruby because they are more powerful and more productive"

Oh dear. That's rather fuzzy. By what metric?

jamesw said...

there seem to be quite a large set of assumptions built into your analysis here. But others here and on HN have commented on that. I'm just amazed at some of the data choices in the dump, Tags especially.

jocknerd said...

I think weekend programmers choose Ruby or Python because they get to choose what they want. In the "enterprise" world, programmers don't usually get to make their own choice. Managers assume Java or C# is the only answer.

Scott Bellware said...

Hi Dan,

Here's another factor:

Seed-stage and angel-funded web startups often use PHP, Python, and Ruby. The folks involved frequently work through weekends. This is also often true of VC web startups, though there are fewer of them.

Also, weekends are the time when curious developers invest personal time to learn new things. We should expect to see an uptick in "alternative" (from the perspective of corporate software work) language use.

Benjamin Nortier said...

Thanks for pointing that out.

What I was trying to say is that the users may *perceive* them to be more powerful or productive, hence part of the reason for using them in their spare time.

Perception is fuzzy by nature.

Dan Lorenc said...

Thanks for the comments, everyone. I also agree that there are many other factors to consider here; I was just pointing out a couple of possibilities.

@jamesw: What would you suggest doing differently, instead of measuring tags?

c.wrinn said...

+1 @jocknerd I agree here, it has less to do with the "strengths" of the languages than with managers and the target audiences of the projects. Hobbyists use what they prefer at home. I, for instance, work in a C# company that uses WPF because for the history of our company we've always written in the C family and primarily for MS based platforms. I however use Python for my home projects, for reasons similar to those others have given. I do not find it weaker than C#, so I am also curious about your basis for this claim.

Bo said...

The "weakness" is probably performance. Python is very slow compared to Java and is bloated, taking lots of memory at runtime. It is fun to have Python's syntactic sugar, but not fun when the final result does not scale (remembering the Zope-2 and Zope-3 nightmare)...

Saying "Python and Ruby are more powerful and more productive than Java/C#" is plain BS and I don't buy it, sorry. Java is C++ done right. If you are talking about syntax, I could agree: using VIM/Python you will be nearly as productive as someone with NetBeans/Java. However, again, the final result will be very different: apart from regular mainframes and huge clusters, Java may also run embedded in a smart card or your dishwasher, while Python looks quite ridiculous in performance, possibilities and resource consumption. I would say Java is simply misunderstood and also bastardized by stupid heavy frameworks, many of which are not fun but a nightmare.

At an enterprise I used Python for a core part, and the thing has been online for 7 years with no major problems. I think hobbyists take Python mostly because you can toss up a shoddily done result in 20% of the week's time that still satisfies 80%. When the fun is over, you do the job which brings you money: Java.

Besides, I have been a Pythoneer for many years and could probably write a book about it, having used it since the very old 1.5 version, so no holy war here, please. :-)

Benjamin Nortier said...

I don't want to start a flamewar here so please excuse me if I don't reply directly to some of your statements :)

I regret using the word powerful now, since it is very subjective. It can mean so many things in so many different situations.

But I will stick to my assertion around productivity. I believe spare time coders would choose a language that (they believe) will get things done as quickly and efficiently as possible. I don't think it would be more "fun" to use something that actually takes you longer than you would expect from your working-week experience.

Some more data would be useful.

Perhaps StackOverflow could do a survey around programming languages? It would also be quite interesting to compare results with and without weighting results according to reputation...

UberTeorist said...

A few points you might want to consider

Assuming the number of page views per day represents the number of programmers per day seems dangerous!

Secondly, there are too many factors to consider for the data to be significant in showing any correlation (much less causation) in any of them.

Some questions the data might raise:

Is the content of the site homogenous for all languages? (probably not)

Are all languages prone to the same kind of questions that find answers in the site? (probably not)

Are the programmers' programming experience similar across languages? (probably not)

Is google ranking the results to questions in the same way for all languages? (no idea)

The only thing the data tells you is that on weekends the number of opened pages concerning java and c# drops a bit and the number of opened pages concerning python and ruby goes up.

That is already interesting but any other inference from this data alone looks a lot like speculation.

Anyway, raising more questions is already pretty good!

ET said...

People who ask questions are typically the beginners. Having more questions about Ruby and Python at the weekend probably just means that people are spending their free time trying out new languages that aren't officially taught.

You can get no conclusion about fun, because most likely people who find a certain language fun are already familiar enough with it to ask few questions.

Bo said...

Yes, being also a Python hacker, I agree that Python syntax is way simpler than verbose Java. This was one of the main kicks I used 8 years ago against Java. However, times change, and now, having the awesome NetBeans IDE and Java 6, I can no longer say that Java is sluggish crap requiring tons of XML. Not anymore: you have annotations and you can do things with literally zero XML, and productivity is basically the same.

The only advantage Python has over Java is syntax. It is really neat and sweet. Thus I think Python is a perfect match to replace fugly Perl. Python is fast enough, understandable and has enough libs to do so.

But why do I reject coding web stuff in Python? Because I believe WSDL rules over REST, especially in the enterprise. Try doing WSDL in Java or C# (a piece of cake) and in Python (a true hell at its core). OK, try to write an MP3 or video player in pure Python. D'oh! Look at the performance of TTF font parsing in ReportLab! So that's why Python is completely different from Java: Python acts more like a glue between C/C++ libs (hence is hardly portable), while Java tries to do everything by itself (and is very easily portable as a result).

An example: you wrote your PyQT stuff twice as fast as I did. OK. Now look at what you've got: Py2App or Py2Exe made a 100M program!!! Meanwhile I can take GNU Classpath and CacaoVM and still end up smaller than you do!

Python is also more bloated: it is as twice as bigger at footprint by default in Mac OS X, because they simply add all the required features, like wxWindows (forget tkinter crap), Twisted etc. Don't forget that Java comes with multimedia/sound/video, 3D, 2D support, GUI (swing) and your socks washer. :-)

amoore said...

what, no perl?

Gary S. Weaver said...

job trend comparison of Java vs. C# vs. Ruby vs. Python

Ankur Kapil said...

thanks a lot.....