Scaling is an inescapable necessity for any growing project, and a developer should think about it from the very beginning. As an online game or application becomes more popular and gains functionality, it inevitably consumes more resources. If you do not plan for that, you risk losing performance, and eventually nobody will be able to play your game. So, let’s talk about ways to scale a project effectively.
A project's architecture should be designed for scaling from the start. Even if you think your app will never become very popular, you will still want to improve its performance and adapt it to new tasks. Scaling can be vertical or horizontal. For a web application, vertical scaling means improving the hardware of a single host: adding more RAM, more drives, and more CPU. It is a very simple method that requires no changes to the application itself, but it is also extremely limited: a single machine can only be upgraded so far, so the efficiency gains are modest.
Horizontal scaling means increasing the number of hosts, which lets you add a practically unlimited amount of resources. That is the approach we will focus on. Since the basic principles of scaling apply to many areas, let’s look at them using the example of a web application written in PHP. Note: these methods are best suited to developers of small online games that work as simple web applications.
So, where should we start?
Scaling must start with finding the bottleneck in your system. Some developers make the big mistake of scaling their app without first answering the question: “Which element of the system is the least productive?” There are no general rules for horizontal scaling; it all depends on your situation. Maybe you just need to upgrade PHP on your server or add an index to the database (surprisingly, this alone can sometimes improve performance by 40% to 60%), and then you will not have to worry about scaling for a few more years. Of course, this is not a long-term solution, especially if you have already done that optimization.
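To make the search for the bottleneck concrete, here is a minimal sketch in Python (not PHP, chosen only for brevity) that times each stage of a request and reports the slowest one. The stage names and the `time.sleep` calls are hypothetical stand-ins for real work such as a database query or template rendering; in a real project you would more likely rely on a profiler or on your server's slow-query log.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(stage, timings):
    """Add the wall-clock time spent inside the block to timings[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[stage] = timings.get(stage, 0.0) + elapsed


def slowest_stage(timings):
    """Return the name of the stage with the largest accumulated time."""
    return max(timings, key=timings.get)


def handle_request(timings):
    # Hypothetical stages: the sleeps only simulate work of different cost.
    with timed("database", timings):
        time.sleep(0.05)
    with timed("rendering", timings):
        time.sleep(0.01)


timings = {}
handle_request(timings)
print("slowest stage:", slowest_stage(timings))
```

Only once a measurement like this points at a specific component does it make sense to decide what to scale.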
Most small websites and applications run on the Apache server. At the beginning we often come across the following elementary architecture: a single Apache server processes all HTTP requests from clients. Its main disadvantages are slow serving of static files and high resource consumption. One solution is to replace it with nginx: at first you can run an Apache-nginx-PHP configuration and later get rid of Apache. But this solution is not for everyone, because Apache is compatible with almost any software and supports a wide variety of features thanks to its modular system. It is often better to use both servers together: when you separate the front-end from the back-end, nginx serves static files while Apache handles the back-end.
Once you have several back-ends, you will probably wonder how to distribute client requests between them. Here is another argument for nginx: you can use it as a load balancer.
In a typical scheme for a project with one front-end and multiple back-end servers, the load balancer becomes a single point of failure: if it goes down, the whole system becomes inoperable. An alternative is client-side load balancing, which leaves the choice of server to the client itself. But be advised: this complicates the client’s logic and makes balancing less flexible.
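The client-side variant can be sketched roughly as follows, in Python for brevity. The class name, the host names, and the `fake_send` function are all hypothetical; a real client would make an HTTP call where `fake_send` is used. The client walks through the back-ends in round-robin order and simply moves on to the next one when a connection fails.

```python
import itertools


class ClientSideBalancer:
    """Round-robin over back-ends, skipping those that refuse connections."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one back-end is required")
        self._backends = list(backends)
        self._cycle = itertools.cycle(self._backends)

    def call(self, send):
        """Try each back-end at most once; return the first successful reply."""
        last_error = None
        for _ in range(len(self._backends)):
            backend = next(self._cycle)
            try:
                return send(backend)
            except ConnectionError as exc:
                last_error = exc  # remember the failure and try the next host
        raise ConnectionError("all back-ends failed") from last_error


def fake_send(backend):
    # Hypothetical transport: pretend that the second host is down.
    if backend == "app2.example.internal":
        raise ConnectionError(backend + " is down")
    return "handled by " + backend


balancer = ClientSideBalancer(
    ["app1.example.internal", "app2.example.internal", "app3.example.internal"]
)
for _ in range(3):
    print(balancer.call(fake_send))
```

The sketch also shows the cost mentioned above: the list of servers and the retry policy now live inside every client, so changing them means updating the clients.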
As a rule, most web apps are distributed and have the following typical three-tier architecture: Data (database server, static files) / Application Server / Client.
If your application and database run on one host, the first step is to move them to separate hosts. In most cases the bottleneck of the system is either the database or the code. If the problem is PHP, you can run multiple front-ends against a single database (this can keep you going for quite a long time). But what if the bottleneck is the database? Well, think about replication.
The idea of replication is to copy data from the main server, called the Master (where all data changes take place), to one or more Slave servers. This way your app no longer relies on a single server to process all requests, and you can distribute the load across several servers. The Master database server is responsible for data manipulation, while the Slave database servers are responsible for reads. I used this method on my own website, which works as an online paper writer and plagiarism checker.
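On the application side, this division of labour usually comes down to choosing a connection per query. Below is a minimal Python sketch of such a router; the class and the host names are hypothetical, and the rule it applies (only statements starting with SELECT go to a slave, everything else goes to the master) is a deliberately cautious simplification, not how any particular library does it.

```python
import itertools


class ReplicatedRouter:
    """Send SELECT statements to slaves in turn and everything else to the master."""

    def __init__(self, master, slaves):
        self.master = master
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql):
        words = sql.split()
        verb = words[0].upper() if words else ""
        # Anything that is not clearly a read goes to the master to be safe.
        if verb == "SELECT" and self._slaves is not None:
            return next(self._slaves)
        return self.master


router = ReplicatedRouter("db-master", ["db-slave-1", "db-slave-2"])
print(router.route("INSERT INTO scores (player, points) VALUES ('ann', 10)"))
print(router.route("SELECT points FROM scores WHERE player = 'ann'"))
```

Because replication is often asynchronous, a read sent to a slave immediately after a write may not see that write yet; a real router would need a policy for such cases.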
If one of the slaves goes down, it is enough to point the application at the master, restart replication on the slave, and bring it back into service. If the master fails, you need to switch all commands and data manipulation to a slave, which thereby becomes the new master. Then you restore replication on the previous master, and it becomes a new slave. Replication can be synchronous or asynchronous. If you work with MySQL, you will most likely encounter asynchronous replication. It means that data appears on the slave servers not instantly but with some delay (replication lag).
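The role changes described here can be modelled with a few lines of bookkeeping. The Python sketch below tracks only which host plays which role; the class, method, and host names are hypothetical, and the actual work of reconfiguring replication on the database servers is out of scope. It also assumes that the first slave in the list is an acceptable candidate for promotion, which in practice you would want to verify.

```python
class Cluster:
    """Track which host is the master and which hosts are healthy slaves."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)

    def read_targets(self):
        # With no healthy slave left, reads fall back to the master.
        return self.slaves or [self.master]

    def slave_failed(self, host):
        self.slaves.remove(host)

    def master_failed(self):
        """Promote the first slave and return the name of the failed master."""
        if not self.slaves:
            raise RuntimeError("no slave available to promote")
        failed = self.master
        self.master = self.slaves.pop(0)
        return failed

    def rejoin_as_slave(self, host):
        # Called once replication has been restored on the recovered host.
        self.slaves.append(host)


cluster = Cluster("db1", ["db2", "db3"])
old_master = cluster.master_failed()   # db2 takes over as master
cluster.rejoin_as_slave(old_master)    # db1 comes back as a slave
print(cluster.master, cluster.slaves)
```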
Consequently, there is a risk of losing some data if the master fails. In addition, replication lag tends to grow as the write load increases. But synchronous replication is not a panacea for web projects either, because it slows down DML: every write has to wait for the replicas over the network. The best solution is to continue scaling and apply sharding, which is usually used together with replication.
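Sharding means splitting the data itself across several database servers, so that each server holds only part of it. One simple way to decide where a record lives is to hash its key, as in the Python sketch below. The shard names and the choice of the user ID as the key are hypothetical, and MD5 is used here only as a stable, well-distributed hash, not for security.

```python
import hashlib


def shard_for(key, shards):
    """Map a key to one shard so that the same key always lands on the same shard."""
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


shards = ["shard-a", "shard-b", "shard-c"]
for user_id in (1, 2, 3):
    print(user_id, "->", shard_for(user_id, shards))
```

A known drawback of this simple modulo scheme is that changing the number of shards moves most keys to a different shard, so larger systems often use consistent hashing or a lookup table instead. Each shard can in turn have its own master and slaves, which is how sharding and replication are combined.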
So, we have covered some basic techniques for scaling an app. Different combinations of these methods can increase your system’s performance. Also remember to monitor your resources continuously in order to identify the weak points in your architecture.