POSIX Threads Don’t Scale Past 100K Concurrent Web Service Requests (Part 1/2)

Hard times are upon our financial sector. The US financial markets are in turmoil. Many companies will be cutting spending as operating budgets are squeezed over the next couple of months, if not years. This is usually good news for the technology sector, as most cost-cutting measures depend on technology to keep productivity at the levels it had before the sky (and stocks) started to fall. These are exciting times in the IT sector as well. We are seeing a shift in the way we compute: from centralized IT to cloud computing, from one core per processor to many cores per processor, from closed data storage to open data portability, from a web of documents to a web of meaning.

At the heart of this transformation is the concept of a Web Service. Web services are used to perform operations on the cloud. They are used to read data from one place on the Web, process and transform that data in another location, and then send the data to yet another location on the Web. It is through this method that we get mash-ups like Google Maps, Facebook apps, Flickr albums and Twitter streams.

These web services are the workhorse of the current Web. They are highly available, highly concurrent, and usually have tens if not hundreds of thousands of people slamming them at a time. This can lead to heartache for software developers. The fine folks at Twitter have had scaling issues over the past two years that required painful changes to their service to avoid continued downtime.

This is a two-part blog post about how traditional software development does not prepare you for the realities of writing scalable web services. Our company focuses a significant portion of our R&D efforts on scalability. One of the lessons that we have learned over the past three years is that pure POSIX threads do not scale for web services.

Check out the graph below and note how the red line (the pure POSIX threads approach) does a very abrupt nose-dive while attempting to reach 400 concurrent web service requests.

You do not want to be in this position, EVER. Most software developers will inevitably choose to use pure POSIX threads for their application servers in order to scale their web services. They do this because most educational institutions and websites drill it into our heads that to have concurrency, you must use threads. “I would never make that mistake!” you exclaim. However, if you use a standard Apache 2 configuration (which uses MPM_prefork) and PHP for your web services (each PHP instance is run in a separate process), you have already made that mistake: every in-flight request is still tied to its own operating-system-scheduled worker, only with processes standing in for threads.
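
For readers who have not seen the pattern spelled out, here is a minimal sketch of the one-thread-per-request design described above. It is written in Python for brevity (CPython's threading module creates native threads on POSIX systems) and is not the code behind the graph. The names handle_request and serve_thread_per_request are hypothetical, and the "request" is just an integer id standing in for a real web service call.

```python
import threading


def handle_request(request_id, results, lock):
    """Hypothetical stand-in for real request handling."""
    response = f"response-{request_id}"
    # Guard the shared dictionary so concurrent writers do not interleave.
    with lock:
        results[request_id] = response


def serve_thread_per_request(num_requests):
    """Start one OS thread per simulated request, then wait for all of them."""
    results = {}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=handle_request, args=(i, results, lock))
        for i in range(num_requests)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    responses = serve_thread_per_request(50)
    print(len(responses), "requests handled, one thread each")
```

The point of the sketch is the shape of the design rather than its speed: the number of threads grows in lockstep with the number of concurrent requests, so every extra request costs another thread stack and another entry for the kernel scheduler to juggle.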

Read on to find out how to scale past 400 concurrent requests…