Subject: Re: Migrating Microsoft® Hotmail® from FreeBSD to Micr
To: None <netbsd-advocacy@netbsd.org>
From: Ben Collver <collver@linuxfreemail.com>
List: netbsd-advocacy
Date: 04/18/2001 08:02:42
http://www.microsoft.com/technet/migration/hotmail/default.asp?a=printable

>The reasons for converting to Windows 2000 were:
>
>Performance (and therefore cost). The cgi "one
>process-per-socket" model, under FreeBSD, is very
>inefficient. Per machine throughput can be dramatically
>improved by moving to a multi-threaded application
>model. This results in one of the following conditions:
>A fewer number of servers required to support the site
>Support for a greater number of users by using the
>servers already deployed at the site

This is not a reason to switch from FreeBSD to Windows 2000.
This is a reason to write the application in a different way.
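
To make that concrete, here is a minimal sketch of the threaded model in
Python.  The handler and all names are placeholders of my own, not anything
from the Hotmail code; the only point is that a fixed pool of threads inside
one long-lived process is an application design choice.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def handle_request(request_id):
    # Hypothetical handler: stands in for the work a CGI script would do.
    worker = threading.current_thread().name
    return "request %d handled by %s" % (request_id, worker)


def serve(request_ids, workers=4):
    # One long-lived process. The same few threads are reused for every
    # request instead of spawning a new process per request.
    with ThreadPoolExecutor(max_workers=workers,
                            thread_name_prefix="worker") as pool:
        # map() returns results in the order the requests were submitted.
        return list(pool.map(handle_request, request_ids))


if __name__ == "__main__":
    for line in serve(range(8)):
        print(line)
```

Eight requests are served by at most four threads, and no process is
created per request.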

>Globalization and foreign language support. Hotmail had
>the requirement to launch in new markets, and did not
>want to continue to invest in keeping the FreeBSD
>locale tables up to date and other maintenance activities.
>China and Japan are two important growing markets for
>MSN, so multibyte character sets had to be supported.
>FreeBSD lacked the necessary Unicode support.

Since FreeBSD and Apache both specifically support Unicode,
it would be nice if Microsoft provided more detail about the deficiencies.
Later in the paper Microsoft mentions that there were deficiencies
in the FreeBSD and Apache Unicode support, but they don't say
what they were.  I am guessing that they probably do exist; I don't use
Unicode myself.  However, the above paragraph is misleading: it makes it
sound like FreeBSD doesn't have Unicode support at all.

I am also curious about Microsoft's desire to avoid keeping
the locale tables up to date.  Why do the locale tables need
to be modified at all?

>Shorter development cycles. Better tools for
>development and debugging would allow for more rapid
>feature development and more rapid detection of
>performance bottlenecks in the code.

I can't say whether this is true or false, but I suspect this last point
is their strongest.

>Each Web server had its own local FreeBSD administrator
>account. So, essentially there were literally thousands of
>individual administrator accounts that required synchronization.

This is easily solved with the standard FreeBSD distribution, using either
NIS or LDAP.  I don't understand why [unnamed commercial unix site]
doesn't use NIS or LDAP for their Unix machines.  I know the
[unnamed college] at OSU does.

>Hardware SSL accelerators were used to offload the
>FreeBSD\Apache Servers from processing SSL transactions.

>The SSL hardware accelerators were no longer required.
>It was found that the onboard SSL processing by Microsoft
>Internet Information Services was more efficient and provided
>greater throughput than the external hardware solution with FreeBSD.

That's an interesting concept.

>All of the Hotmail web servers are dual Pentium processor servers.
>Originally, these servers were built with FreeBSD running Apache
>as the web server. Most of the Web pages were generated by
>Perl-based CGIs. The version of Apache that was being used was
>not multi-threaded so each request was handled by another Apache
>process that was spawned off by the parent process. Spawning a
>new process is costly and Perl is an interpreted language so the
>performance of these machines was not optimal.

Ah, now we get to some details.  They reveal that they were using
an old version of Apache, which is understandable.  But instead of
comparing the Windows 2000/IIS solution to the original
FreeBSD/Apache solution, wouldn't it make more sense to compare
the effort of changing to the Windows 2000/IIS solution with the
effort of upgrading to the equivalent up-to-date Unix solution?

I think basing a big site like that on Perl is foolhardy.

>Perfmon was showing a high number of context switches. Further
>investigation showed that this was due to thread blocking on
>allocating and freeing memory from the process heap. To resolve
>this a private heap was allocated for each thread. Each thread
>creates and destroys the heap on each request. This reduced the
>context switches tremendously and also eliminated the biggest
>stability problem: memory leaks. All the memory management
>calls in the code are overridden. Each time new, delete, malloc,
>or another similar call is made, the memory is allocated or freed
>from the thread's private heap. That heap is thrown away
>between each request so the memory leaks went away.

Does this mean that the replacement CGI scripts Microsoft wrote
in C++ were full of memory leaks?
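
As I read it, the technique is a per-request arena.  Here is a toy sketch
in Python; the class and function names are hypothetical, and it resets one
buffer where the paper describes creating and destroying a heap per request.
The handler never frees what it allocates, yet nothing accumulates, because
the whole arena is discarded after each request.

```python
class RequestArena:
    """Toy per-request heap: hands out slices of one buffer."""

    def __init__(self, size):
        self._buf = bytearray(size)
        self._used = 0

    def alloc(self, nbytes):
        # Bump allocation: take the next nbytes of the buffer.
        if self._used + nbytes > len(self._buf):
            raise MemoryError("arena exhausted")
        start = self._used
        self._used += nbytes
        return memoryview(self._buf)[start:start + nbytes]

    def used(self):
        return self._used

    def reset(self):
        # "Throw the heap away": every allocation is released at once.
        self._used = 0


def handle_request(arena, payload):
    # Allocates scratch space and never frees it (the "leak").
    block = arena.alloc(len(payload))
    block[:] = payload
    return bytes(block).upper()


def serve_with_arena(payloads, arena_size=1024):
    arena = RequestArena(arena_size)
    replies = []
    for payload in payloads:
        replies.append(handle_request(arena, payload))
        arena.reset()  # leaked blocks vanish between requests
    return replies, arena.used()
```

In other words, the leaks are hidden by the allocator rather than fixed in
the code that leaks.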

>In general, it must be noted that the development group is
>considerably more productive and satisfied with using a modern
>interactive development environment to create the Hotmail front-end
>code base. If you use WinDBG on the live site to trace through and
>diagnose problems as they happen in real time, this is much better
>than attaching gdb to a transient cgi process or using "printf debugging."
>Bugs can now be identified and fixed in minutes rather than days.)
>Under FreeBSD, bugs and memory leaks would often go
>undetected because of the lack of tools. With Windows 2000 and
>IIS 5, the tools exist to optimize the performance and truly understand
>exactly what the code is doing at all times.

Straw man.  GDB provides plenty of debugging functionality; it just
doesn't have the graphical interface that WinDBG does.  If the GUI
is what's important, they could have used DDD, a mature visual
front-end to debuggers including GDB and Perl's debugger.

All that said, I am impressed with what Microsoft has done, and they have
a right to brag.  Since Microsoft had to put so much work into competing
against obsolete versions of free software, I think this reveals that free
software is and will continue to be competitive.  I am not an evangelist
for using free software for everything, just a spectator.

Ben