When I posted a graph comparing the size of the mozilla-central repository by Firefox version, my colleague gszorc was quick to point out that the filesystem’s 4k block size meant that the on-disk size of a working copy might not accurately reflect the true size of the repository. I took this into account and compared the working copy size measured with blocksize=1 against the size measured with the typical 4k block size. This is the result.
As part of my recent duties I’ve been looking at trends in Mozilla’s monolithic source code repository, mozilla-central. As we’re investigating growth patterns and scalability, I thought it would be useful to get metrics about the size of the repository over time and the ways in which it changes.
It should be noted that the sizes are for all of mozilla-central, which is Firefox and several other Mozilla products. I chose Firefox versions as they are useful historical points. As of this posting (2015-02-06) version 36 is Beta, 37 is Aurora, and 38 is tip of mozilla-central.
UPDATE: This newly generated graph shows that there was a sharp increase in the amount of code around v15 without a similarly sharp rise of working copy size. As this size was calculated with ‘du’, it will not count hardlinked files twice. Perhaps the size of the source code files is insignificant compared to other binaries in the repository. The recent (v34 to v35) increase in working copy size could be due to added assets for the developer edition (thanks hwine!)
My teammate Gregory Szorc has reminded me that since this size is based on a working copy, it does not necessarily reflect the size as stored in Mercurial. Since most of our files are smaller than 4k bytes, each of them still occupies a full 4k block in a working copy and so uses more space than its contents require.
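To illustrate the block-size effect, here is a minimal sketch that compares the sum of file lengths against the same sum with every file rounded up to whole blocks. It is not the method used for the graph (that was ‘du’). The function name `working_copy_sizes` and the 4096-byte default are assumptions made for illustration. Directory entries and filesystem metadata are ignored, so the numbers only approximate what ‘du’ reports.

```python
import os


def working_copy_sizes(root, block_size=4096):
    """Return (apparent_bytes, block_rounded_bytes) for the files under root.

    Hypothetical helper for illustration; it only models per-file rounding.
    """
    apparent = 0
    rounded = 0
    seen = set()  # (device, inode) pairs, so hardlinked files count once, as in du
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue
            seen.add(key)
            apparent += st.st_size
            # Ceiling division: a 1-byte file still occupies one whole block.
            blocks = -(-st.st_size // block_size)
            rounded += blocks * block_size
    return apparent, rounded
```

Calling `working_copy_sizes(path)` on a checkout with many small files should return a second value noticeably larger than the first, while passing `block_size=1` makes the two values equal.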
From this we can see a few things. The line count scales linearly with the size of a working copy, except at the beginning, where the ratio was about half as large until about Firefox version 18. I haven’t investigated why this is, although my initial suspicion is that it might be caused by there being more image glyphs or other binary data compared to the amount of source code.
Also interesting is that Firefox 5 is about 3.4 million lines of code while Firefox 35 is almost exactly 6.6 million lines. That’s almost a doubling in the amount of source code comprising mozilla-central. For reference, Firefox 5 was released around 2011-06-21 and Firefox 35 was released on 2015-01-13. That’s about three and a half years of development to double the codebase.
If I had graphed back to Firefox 1.5 I am confident that we would see an increasing rate at which code is being committed. You can almost begin to see it by comparing the growth between v5 and v15 to the growth between v20 and v30.
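As a rough check on that arithmetic, the sketch below works from the release dates and approximate line counts quoted above. The compound annual growth figure assumes steady exponential growth, which is a simplification introduced here and not something measured from the repository.

```python
from datetime import date

# Approximate figures quoted in the post.
loc_v5 = 3.4e6
loc_v35 = 6.6e6
released_v5 = date(2011, 6, 21)   # "around" this date, per the post
released_v35 = date(2015, 1, 13)

elapsed_days = (released_v35 - released_v5).days
years = elapsed_days / 365.25

ratio = loc_v35 / loc_v5                  # how many times larger v35 is than v5
annual_growth = ratio ** (1 / years) - 1  # assumes steady compound growth

print(f"{years:.2f} years, x{ratio:.2f} overall, {annual_growth:.1%} per year")
```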
I’d like to continue my research into how the code is evolving, where exactly the large size growth came from between v34 and v35, and some other interesting statistics about how individual files evolve in terms of size, additions/removals per version, and which areas show the greatest change between versions.
If you’re interested in the raw data collected to make this graph, feel free to take a look at this spreadsheet.
The source lines of code count was generated using David A. Wheeler’s SLOCCount.
Whenever I encounter people as I travel, they are often curious about my luggage. It seems to be invisible. They’ll often ask where my bag is, assuming that it must have gotten lost in transit. Their eyes go wide and confusion sets in when I tell them that the bag on my back is the only one.
I suspect that at least some people are curious about what gear I travel with. They ask how I’m able to pack all the necessities into such a small space. There is no great secret to traveling light. All it takes is a little research and some compromise on creature comforts. If you have browsed the postings of other nomadic hackers, there might be little to be gleaned from this post. Here’s a basic rundown, with almost every item deserving its own article.
It should go without saying that nobody paid me to write this post, and likewise nobody has sent me any products to test.