
We all know that Joomla is a tremendously popular CMS. Several attempts have been made in the past to estimate the number of websites using Joomla. While these attempts were worthy contributions to the discussion, each used its own method and therefore had its own shortcomings. This article discusses several of the earlier attempts and presents a new method for estimating the number of websites that use Joomla.

Attempts made by others

The earliest and perhaps best-known attempt to count the number of websites using Joomla was made by Barrie M. North on his blog Compass Designs in June 2007. By looking at the link data from Google and Yahoo, he estimated the number of websites using Joomla to be around 30 million.

Two years later, in 2009, JoomlaShack posted an estimate based on the number of times the Joomla package had been downloaded up to that point. They put the number of Joomla websites in the ‘tens of millions’.

Another attempt was made a couple of months later, in August 2009, by Joomla.me. Instead of looking at the total number of Joomla websites, they decided to focus on the number of successful Joomla websites. Their analysis, based on Alexa traffic rankings, showed that about 2.5% of all successful websites on the Internet were running Joomla.

Criticism of earlier approaches

The methods used in two of the earlier studies have received criticism from the community. For example, Brian Teeman pointed out that “the number of downloads is interesting but it doesn’t really give a true figure of installations, especially as you could use one download 100 times or just install it directly from the host.” Another point of criticism came from Alledia, who found the results of the Joomla.me study somewhat strange. After closer inspection of the results, they blamed the use of data from Alexa.com, which they called “a notoriously unreliable guide to website traffic numbers”.

Interestingly enough, I could not find any evidence of the results from the study by Compass Designs being challenged. For the sake of argument, let’s assume the results of this study were correct and there were in fact 30 million websites running Joomla in 2007. According to studies done by Netcraft, in June 2007 (at the time of the publication of the blog post on Compass Designs) there were about 55,800,000 active domains on the Internet. This number counts subdomains such as example.blogger.com and test.blogger.com as separate domains. Comparing this number to the one published on the Compass Designs blog, this would mean that over half of the active websites on the Internet ran on Joomla at that time. Even the bottom of the range on which the estimate was based (10 million websites running Joomla) would imply that close to one in every five websites (about 18%) ran on Joomla. One more possible caveat in the study concerns the way in which the “Powered by Joomla!” message was used.
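The comparison above is simple arithmetic, and a minimal Python sketch makes it easy to check. The figures are the ones quoted in the text; the function and variable names are only illustrative.

```python
# Sanity check of the Compass Designs estimate against Netcraft's
# count of active domains for June 2007 (both figures quoted in the text).
ACTIVE_DOMAINS_JUNE_2007 = 55_800_000


def share_of_active_domains(joomla_sites: int,
                            active_domains: int = ACTIVE_DOMAINS_JUNE_2007) -> float:
    """Return the implied share of active domains running Joomla, in percent."""
    return 100 * joomla_sites / active_domains


high = share_of_active_domains(30_000_000)  # headline estimate
low = share_of_active_domains(10_000_000)   # bottom of the range

print(f"30 million sites: {high:.1f}% of active domains")  # 53.8%
print(f"10 million sites: {low:.1f}% of active domains")   # 17.9%
```

Either value is hard to reconcile with what anyone sees when browsing the web, which is the point of the comparison.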

The bottom of the range seems to be determined by the number of pages that include “Powered by Joomla”. However, when I analyzed the first 50 results on Google.com for the phrase “Powered by Joomla!”, only 10 of the websites found actually contained the phrase “Powered by Joomla!” with the word Joomla! linking back to the joomla.org website. Even though some of the other websites are in fact built on Joomla and contain a link back to joomla.org, they do not use the original “powered by” message. The point here is that Barrie (by using the “Powered by Joomla!” message as an indicator) might have overestimated the bottom of his range by as much as 80%.

Getting a sense of scale

The size of the Internet is enormous. To give you an impression of what these numbers mean, imagine the following. You are standing on a football field in a block of 1 x 1 meter that is marked in red. This football field is about 8,300 m long and 10,000 m wide. If the marked block is your domain name, then the football field is the Internet. Some numbers:

  • The total number of web pages indexed by search engines on 2 June 2010, according to WorldWideWebSize, is about 21,590,000,000
  • The total number of domains in June 2010 (including subdomains), according to Netcraft's monthly survey, is about 200,000,000
  • The total number of domains registered (with all subdomains counting as one domain), according to DomainTools, is about 120,000,000
  • The total number of root domains included in the Linkscape index by SEOmoz is about 86,000,000
  • The total number of active domains in June 2010 (including subdomains), according to Netcraft's monthly survey, is about 83,000,000

As we are looking at websites in this study and not web pages, the number WorldWideWebSize gives us is not the one we need. As the studies by Netcraft show that over half of the domains crawled are in fact inactive (and therefore will not contain any Joomla installations), we will disregard the total number of domains as well. Similar reasoning can be applied to the number given by DomainTools. Since the studies by Netcraft show that many domains are left unused, we can assume that the number given by DomainTools also overestimates the actual number of websites on the Internet. Seeing that the number of root domains in the SEOmoz Linkscape index is still rising with each update, we can assume that this number will rise to at least 100 million root domains. This would explain why the number of active domains reported by Netcraft is only slightly lower than the number of root domains included in the Linkscape index, despite the Netcraft data including subdomains.

Considering how long the Netcraft survey has been running, we can assume that its data gives the most reliable number. We therefore define the number of ‘websites’ on the Internet for the purpose of this study to be 83 million.

Current approach

In the current approach we will try to integrate several of the techniques used by other researchers to make a better estimate of the number of websites using Joomla. First we will have a look at two websites that aim to determine the technologies that drive websites on the Internet.

W3Techs uses the Alexa rankings to determine the top 1 million websites. Both subdomains and redirected domains are excluded from the surveys. W3Techs state that “as of today, 19.15% of the top 1 million sites use any of the CMS that we monitor”. Applied to 83,000,000 websites, 19.15% means that 15,894,500 are using a CMS (including WordPress). If 11.2% of these websites are using Joomla, as stated in their findings, then 1,780,184 websites (2.14% of 83,000,000) are built on Joomla. This is very similar to the figure found by Joomla.me (2.5% of the top 1 million websites sampled, which extrapolates to 2,075,000 websites in total).
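The extrapolation from the W3Techs percentages can be reproduced with a few lines of Python. This is a minimal sketch using only the figures quoted above; the function name is illustrative, and it assumes (as the text does) that the shares measured in the top 1 million sites hold for all 83 million websites.

```python
TOTAL_WEBSITES = 83_000_000  # Netcraft active domains, June 2010


def extrapolate(total: int, cms_share_pct: float,
                joomla_share_of_cms_pct: float) -> tuple[int, int]:
    """Scale sample percentages up to the whole population.

    Returns (websites using any monitored CMS, websites using Joomla).
    """
    cms_sites = total * cms_share_pct / 100
    joomla_sites = cms_sites * joomla_share_of_cms_pct / 100
    return round(cms_sites), round(joomla_sites)


w3techs_cms, w3techs_joomla = extrapolate(TOTAL_WEBSITES, 19.15, 11.2)
joomla_me = round(TOTAL_WEBSITES * 2.5 / 100)  # Joomla.me: 2.5% of all sites

print(w3techs_cms)     # 15894500
print(w3techs_joomla)  # 1780184
print(f"{100 * w3techs_joomla / TOTAL_WEBSITES:.2f}%")  # 2.14%
print(joomla_me)       # 2075000
```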

BuiltWith uses a cross section of website domains provided by URLs entered at BuiltWith.com and the Quantcast Top Million to calculate the distribution of CMSs. Analytics penetration is based on what BuiltWith can find and is typically based on popular third-party tools. BuiltWith shows that 4.71% of the websites sampled were using a CMS that they monitor. If 9.14% of these websites are using Joomla, as stated in their findings, then 357,310 websites (0.43% of 83,000,000) are built on Joomla. A possible reason why this number is so much lower than the figures from Joomla.me and W3Techs is that BuiltWith does not count WordPress as a CMS in its analysis.
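The same two-step calculation applied to the BuiltWith percentages, as a small Python sketch. The figures are the ones quoted above; the variable names are only illustrative.

```python
TOTAL_WEBSITES = 83_000_000   # Netcraft active domains, June 2010

CMS_SHARE_PCT = 4.71          # sampled sites using any CMS BuiltWith monitors
JOOMLA_SHARE_OF_CMS_PCT = 9.14  # of those, the share using Joomla

# Step 1: websites using any monitored CMS.
cms_sites = TOTAL_WEBSITES * CMS_SHARE_PCT / 100
# Step 2: websites using Joomla within that group.
builtwith_joomla = cms_sites * JOOMLA_SHARE_OF_CMS_PCT / 100
# Joomla as a share of all websites, in percent.
joomla_pct_of_all = 100 * builtwith_joomla / TOTAL_WEBSITES

print(round(cms_sites))            # 3909300
print(round(builtwith_joomla))     # 357310
print(f"{joomla_pct_of_all:.2f}%")  # 0.43%
```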

Link data as reported by OpenSiteExplorer shows that there are about 264,789 root domains linking to Joomla.org (link data as reported by Google using the ‘link:’ command is not included in this analysis, since it seems seriously flawed according to the people over at SEOmoz). This number is far more interesting than the number of web pages linking to Joomla.org, as a Joomla website containing 10,000 pages would generate 10,000 links back to Joomla.org. However, this number is far lower than the approximately 1,900,000 websites reported by W3Techs and Joomla.me. A likely explanation is the number of people using commercial templates, which do not necessarily include a link back to Joomla.org, and the number of people removing the link from their original template. Assuming the numbers reported by SEOmoz, W3Techs and Joomla.me to be representative, this would mean that roughly 86% of all Joomla websites (264,789 out of about 1,900,000 still link back) do not carry the link back to Joomla.org in one way or another.
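As a rough check on the share of Joomla websites without a link back, the sketch below compares the number of linking root domains with the estimated number of Joomla websites. It uses only figures quoted in this article; the function name is illustrative, and it assumes every linking root domain is itself a Joomla website, which will not be exactly true.

```python
LINKING_ROOT_DOMAINS = 264_789  # OpenSiteExplorer: root domains linking to Joomla.org


def share_without_link(joomla_sites: int,
                       linking_domains: int = LINKING_ROOT_DOMAINS) -> float:
    """Return the percentage of Joomla sites that do not link back to Joomla.org."""
    return 100 * (1 - linking_domains / joomla_sites)


approx = share_without_link(1_900_000)   # rounded figure used in the text
w3techs = share_without_link(1_780_184)  # W3Techs extrapolation
joomla_me = share_without_link(2_075_000)  # Joomla.me extrapolation

print(f"{approx:.1f}%")     # 86.1%
print(f"{w3techs:.1f}%")    # 85.1%
print(f"{joomla_me:.1f}%")  # 87.2%
```

Whichever estimate is used as the denominator, the share of sites without the link lands between about 85% and 87%.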

Conclusion

Taking all the evidence reported above into consideration, I would estimate that there are between 1.5 and 2 million websites using Joomla on the Internet. I would love to hear about any flaws in reasoning, statistics, calculations or general methods, so I can correct them in the main article and possibly improve the estimate.