How do large Websites handle millions of visitors?
One of the surprising things about Web sites is that, in certain cases, a very small machine can handle a huge number of visitors. For example, imagine that you have a simple Web site containing a number of static pages (in this case, "static" means that everybody sees the same version of any page when they view it). If you took a normal 500MHz Celeron machine running Windows NT or Linux, loaded the Apache Web server on it, and connected this machine to the Internet with a T3 line (45 million bits per second), you could handle hundreds of thousands of visitors per day. Many ISPs will rent you a dedicated-machine configuration like this for $1,000 or less per month. This configuration will work great unless:

* You need to handle millions of visitors per day.
* The single machine fails (in this case, your site will be down until a new machine is installed and configured).
* The pages are extremely large or complicated.
* The pages need to change dynamically on a per-user basis.
* Any back-end processing needs to be performed to create the contents of the page or to process a request on the page.

Since most of the large Web sites meet all of these conditions, they need significantly larger infrastructures.
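As a rough check on the single-machine figure above, the T3 arithmetic can be sketched in a few lines of Python. Only the 45 million bits per second comes from the post; the average page size, pages per visitor, and usable fraction of the line are assumptions picked for illustration, so treat the result as an order-of-magnitude estimate and not a measurement.

```python
# Back-of-the-envelope capacity of a T3 line serving static pages.
T3_BITS_PER_SECOND = 45_000_000   # from the post: 45 million bits per second
SECONDS_PER_DAY = 24 * 60 * 60

# Assumed values, not from the original post.
AVG_PAGE_BYTES = 50_000           # assumed average page weight, images included
PAGES_PER_VISITOR = 10            # assumed pages viewed per visit
UTILIZATION = 0.25                # assumed usable fraction of the line (traffic is bursty)


def visitors_per_day(bits_per_second, page_bytes, pages_per_visitor, utilization):
    """Return how many visitors the line can serve per day under the given assumptions."""
    bytes_per_day = bits_per_second / 8 * SECONDS_PER_DAY * utilization
    pages_per_day = bytes_per_day / page_bytes
    return int(pages_per_day / pages_per_visitor)


if __name__ == "__main__":
    print(visitors_per_day(T3_BITS_PER_SECOND, AVG_PAGE_BYTES,
                           PAGES_PER_VISITOR, UTILIZATION))
```

With these assumed inputs the estimate lands in the hundreds of thousands of visitors per day, which is consistent with the claim in the post. Larger pages or more pages per visit shrink it quickly.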

There are three main strategies for handling the load:

   1. The site can invest in a single huge machine with lots of processing power, memory, disk space and redundancy.
   2. The site can distribute the load across a number of machines.
   3. The site can use some combination of the first two options.

When you visit a site that has a different URL every time you visit (for example www1.xyz.com, www2.xyz.com, www3.xyz.com, etc.), then you know that the site is using the second approach at the front end. Typically the site will have an array of stand-alone machines that are each running Web server software. They all have access to an identical copy of the pages for the site. The incoming requests for pages are spread across all of the machines in one of two ways:

* The site's Domain Name System (DNS) servers can distribute the load. DNS is an Internet service that translates domain names into IP addresses. Each time the Web server's name is looked up, DNS rotates through the available IP addresses in a circular way (often called round-robin DNS) to share the load. The individual servers would have common access to the same set of Web pages for the site.

* Load balancing switches can distribute the load. All requests for the Web site arrive at a machine that then passes the request to one of the available servers. The switch can find out from the servers which one is least loaded, so all of them are doing an equal amount of work. This is the approach that HowStuffWorks uses with its servers. The load balancer spreads the load among three different Web servers. One of the three can fail with no effect on the site.
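The two ways of spreading requests described above can be sketched as follows. This is a toy model, not how any particular DNS server or load-balancing switch is implemented: the IP addresses are hypothetical (taken from the documentation-only 192.0.2.0/24 range), and "least loaded" is assumed to mean "fewest active connections".

```python
from itertools import cycle

# Hypothetical addresses standing in for www1, www2 and www3.
SERVER_IPS = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]


def round_robin(ips):
    """Return an iterator that hands out addresses in a repeating circular order,
    the way rotating DNS answers would."""
    return cycle(ips)


def least_loaded(active_connections):
    """Return the server with the fewest active connections,
    the way a load-balancing switch might choose."""
    return min(active_connections, key=active_connections.get)


if __name__ == "__main__":
    dns = round_robin(SERVER_IPS)
    print([next(dns) for _ in range(5)])

    # Assumed load figures reported by each server.
    load = {"192.0.2.1": 12, "192.0.2.2": 4, "192.0.2.3": 9}
    print(least_loaded(load))
```

The round-robin version knows nothing about how busy each server is, while the least-loaded version needs the servers to report their load. That difference is the main trade-off between the two approaches.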

The advantage of this redundant approach is that the failure of any one machine does not cause a problem -- the other machines pick up the load. It is also easy to add capacity in an incremental way. The disadvantage is that these machines will still have to talk to some sort of centralized database if there is any transaction processing going on.
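The redundancy and incremental-capacity points can be illustrated with a small sketch. The `Balancer` class and the server names `web1` to `web4` are hypothetical, and marking a server as down by hand is an assumption; a real load balancer would detect failures with its own health checks.

```python
class Balancer:
    """Toy round-robin balancer that skips servers marked as failed."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(servers)
        self._next = 0  # position of the next server to try

    def mark_down(self, server):
        """Record that a server has failed; it receives no more requests."""
        self.healthy.discard(server)

    def add_server(self, server):
        """Add capacity incrementally by putting one more server in the pool."""
        self.servers.append(server)
        self.healthy.add(server)

    def pick(self):
        """Return the next healthy server, trying each server at most once."""
        for _ in range(len(self.servers)):
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


if __name__ == "__main__":
    pool = Balancer(["web1", "web2", "web3"])
    pool.mark_down("web2")                  # one of three machines fails
    print([pool.pick() for _ in range(4)])  # the other two pick up the load
    pool.add_server("web4")                 # capacity added without downtime
    print([pool.pick() for _ in range(4)])
```

Note that the sketch only covers the front-end Web servers. Any shared database behind them remains a single point that every machine still has to talk to, as the paragraph above points out.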

Microsoft's TerraServer takes the "single large machine" approach. TerraServer stores several terabytes of satellite imagery data and handles millions of requests for this information. The site uses huge enterprise-class machines to handle the load. For example, a single Digital AlphaServer 8400 used at TerraServer has eight 440 MHz 64-bit processors and 10 GB of error-checking and correcting (ECC) RAM.





