Loading of resources in the browsers requires HTTP requests, and these requests have high overhead. Depending on the browser the cost can be between 300 and 700 bytes per request, and potentially high latency. The latency will increase with the distance to the servers hosting the data, to between tens of milliseconds to hundreds of milliseconds. Browsers do load multiple resources in parallel but they have a fixed limit of how many resources they load at the same time, usually between 4 to 8 resources in parallel. For example, you can request 1000 resources at the same time but only four may be actively downloading in parallel. Data from these resources will be passed to event callbacks serially as it becomes available.
A good way to reduce latency is to generate requests to servers as close as possible to the user. This requires a network of servers distributed across the world that can serve content to users in their region, usually known as a Content Delivery Network (CDN).
Obviously, connection bandwidth will also affect load times. Average download speeds can range from 200 kilobytes per second to 5 megabytes per second depending on the country.
With these limitations there are two critical recommendations:
#1 requires traditional techniques employed when loading from an slow optical medium:
The most common compression format supported by all browsers is gzip. Your servers will need to respond to the request with special HTTP headers to indicate the compression format and the browser will automatically decompress the data before passing it to game code. Although gzip is a standard format, the size of the compressed files will vary significantly from compressor to compressor. We found that 7-Zip generates the smallest files (this tool supports several compression formats but only gzip will work natively on your browser). Remember that every byte counts, not only because when hosting data in the cloud you will pay for the volume of data stored and transferred, but also loading times will suffer with very slow connections.
The second of our recommendations (persuade the browser to cache data) requires also playing with the HTTP header that the server returns with the requested data. Basically the server needs to tell the browser for how long the data is valid and when the browser should check again for an updated version. Of course the browser will still do whatever it wants in many cases. It may decide to cache only really small files or to reserve a very small amount of disk space for the cache, constantly purging and updating it, but most browsers will try to honor the "time to live" information. There are two main ways to tell a browser for how long it should cache your data:
Servers can return both sets of headers for the same file, but we recommend using only the latter because of its simplicity.
The expire or max-age information gets stored per resource, so if the values are too aggressive the browser may not ask for a new version of the file for a long time. If you are updating data or code then your changes may not be reflected for a long time. In the case of both functionality updates and bug fixes this can conflict with the need to deploy new versions as soon as possible. To avoid this issue, resources are usually given unique names. In this way, new resources will be requested by an updated name, which will bypass the existing cache and force a reload of the new data. Unique names are generated either from an incremental version number or the hash of the contents.
At Turbulenz our resources are named with the hash of their contents, and references are translated at runtime from their logical names (e.g. mymesh.dae) to the unique physical ones (e.g. <hash>.dae.json). This allows us to tell the browser to cache the data for 10 years, which can improve loading times dramatically when playing the game for a second time. As updated resources get new unique names, we can release updates almost immediately. Obviously the resource that performs the translation from logical to physical is not cached at all because that information is dynamic.
The AppCache API is worth noting at this point. This allows developers to declare in advance the resources that will be required so the browser can download them ahead of time. The developer has control over what gets cached and what doesn't, and can use this interface to create web applications that work offline.