Servers & Scripting
Web Server as Hardened File Server: A web server gets its name because it 'serves' web pages. The word 'server' comes from file server, which in networking terms is a computer dedicated to storing files used by a group and giving people access to those files. The files are available to all, and can be protected and backed up from one (hopefully) safe place. A web server is therefore a specialized server. In fact, it is frequently hardened (protected), as it is possible to attack a web server from anywhere in the world: Web Server Security FAQ
The 'web world' can be a pretty cold place: Defacement Archive Gets Defaced
When we use the word 'server' in class, we'll likely mean 'web server' from now on.
Web Server Operating Systems: Most web servers are running operating systems based on UNIX, an operating system created in the early 1970s. It is noted for being stable, quick and obscure. The source code of the operating system has been available for many years, which has led to many "versions" (flavors) of UNIX being developed.
UNIX is "command line" based, meaning that it is built to run without any graphical elements whatsoever. Companies that build their own versions of UNIX (flavors) sometimes build a graphical component as well (a GUI, or windowing environment). Linux is a popular variation of a UNIX based operating system.
Microsoft technologies such as ASP (Active Server Pages) and its successor, ASP.NET, are usually hosted on Windows servers. Windows servers are primarily visually based, and integrate well with Microsoft networks, making ASP.NET the primary choice for a customer that uses many Microsoft products internally.
Web Server Software: Just as there is a program on the client machine (the browser), there is a program on the server that handles the web page requests. The web server software used can vary, based on the operating system of the server. Two of the most common operating systems used for web servers are UNIX and Windows. For UNIX based systems, the web server software is usually Apache, whose name is a bad pun on 'a patchy' program, built of several smaller programs.
Microsoft has its own version of web server software, called Internet Information Services (IIS). IIS has been included in one version or another of Microsoft Server software since NT 4.0 (1996). The advantages of using Microsoft server products include GUI (Windows) configuration, ease of deployment and industry giant support.
3 Contexts For Client: Wherever there is a server, there is likely a client. When you read the word client, it may mean one of three overlapping contexts: the user, the user's computer or the user's browser. The user's browser is probably the most important context. For us, the client will always have a local connotation. The server we'll consider remote.
A browser is one of a larger group of web page 'clients' called user agents. There are user agents that don't even process visual information: Braille Browser
Web Page: Whenever a person clicks a link to a web page, the browser sends a page request to their Internet Service Provider (ISP), which then queries its resources to find an address that matches that request. The web address could look like this:
The ISP will send this to a special server called a Domain Name System (DNS) server that has software to translate the human friendly "example.com" to a number, as all internet addresses are really numbers like this:
Your request then bounces through the internet on special devices called routers and switches (both specialized computers) until it reaches the server that is hosting the page you requested. When it reaches the server, your page is generated and then bounced back to you, again through the internet, until the string of ones and zeros is recombined into the web page you can see.
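The name lookup step described above can be sketched in a few lines of Python. This is only an illustration: the FAKE_DNS table and the resolve function are made-up names standing in for a real DNS server, and 192.0.2.10 is an address reserved for documentation, not the real address of any site.

```python
from urllib.parse import urlsplit

# Hypothetical name table standing in for a DNS server. The address
# 192.0.2.10 is reserved for documentation and is an assumption here.
FAKE_DNS = {"example.com": "192.0.2.10"}


def resolve(url):
    """Split a web address into host and path, then look up the host's number."""
    parts = urlsplit(url)
    host = parts.hostname
    # A missing path is treated as a request for the top folder, "/".
    path = parts.path or "/"
    return FAKE_DNS[host], path


print(resolve("http://example.com/"))
```

The numeric address tells the routers where to send the request, while the path tells the server which page or folder we asked for.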
Index Files: Note the above web address ends with a 'slash', and not a file name. If no specific HTML or other file is identified, the server frequently needs to decide what to do. Most servers are configured to search for an appropriate index file, for example, index.html, or index.php. Having one of each (.html and .php file) can create a conflict, and the server itself will determine which file to show, the PHP or the HTML file! It's usually not a good idea to have two competing index files in a folder.
Directory Browsing: If we don't have an index file designated in a folder, some servers will allow directory browsing, which means the server shows all files in the folder, which is a security risk. Later we'll learn to turn off directory browsing with a UNIX access file called .htaccess, but for now, we can place an index.php file in every folder. If you have a client whose server space has directory browsing turned on, I recommend contacting the hosting company to shut off directory browsing as this is a security risk.
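The decision a server makes for a folder request (index file, directory listing, or refusal) can be sketched as below. This is a simplified model in Python, not how Apache or any particular server is implemented; the INDEX_PRIORITY order and the serve_directory function are assumptions made for illustration, since real servers make the order configurable.

```python
# Hypothetical priority order; an actual server reads this from its configuration.
INDEX_PRIORITY = ["index.html", "index.php"]


def serve_directory(files, browsing_enabled):
    """Decide what to return when a web address ends with a slash."""
    # The first index file found in priority order wins, which is why
    # two competing index files in one folder can surprise us.
    for name in INDEX_PRIORITY:
        if name in files:
            return ("file", name)
    # With no index file, the server either exposes the folder contents...
    if browsing_enabled:
        return ("listing", sorted(files))
    # ...or refuses the request.
    return ("forbidden", None)
```

With both index files present, this sketch always picks index.html, and only because of the order we assumed. A folder with no index file and browsing turned on exposes every file name in it.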
Dynamic Web Pages: Web pages written in pure HTML are considered static pages, since they are unable to change according to user input. Web pages can be made more dynamic (flexible and changeable) using one of two approaches: client side or server side programming. The differences between them are significant to web developers.
Server Side Scripting: Programs that run in the course of serving up web pages and are housed on the server use server side scripting. It has the advantage of being potentially more secure than client side scripting, and can be used to access data stores such as text files and databases.
Server side scripting originally was handled by a Common Gateway Interface (CGI). Many server side languages still use this means, including Perl. However, the CGI's ability to handle a large volume of traffic was the issue that led to specialized server programs, called pre-processors to intercept a request for a web page, and pass the request to a more capable server side program.
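To make the CGI idea concrete, here is a minimal sketch in Python of what a CGI program does: the server hands it request details in environment variables (QUERY_STRING is the standard one for the part of the address after the question mark), and the program writes back headers, a blank line, and then the page. The cgi_response function and the "name" parameter are made up for this example. Because the server traditionally starts a fresh program for every request, this approach struggles under heavy traffic.

```python
from html import escape
from urllib.parse import parse_qs


def cgi_response(environ):
    """Build the text a CGI program would write to standard output."""
    # The web server passes the query string in an environment variable.
    query = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a hypothetical form field; fall back to a default greeting.
    name = query.get("name", ["world"])[0]
    # Escape user input so it cannot inject markup into the page.
    body = "<html><body><p>Hello, " + escape(name) + "!</p></body></html>"
    # A CGI response is headers, one blank line, then the page body.
    return "Content-Type: text/html\n\n" + body


print(cgi_response({"QUERY_STRING": "name=Ada"}))
```

In a real CGI script we would pass os.environ instead of a hand-built dictionary.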
Server-side scripting allows us to provide dynamic content based on user interaction and our business logic requirements. Unlike a typical HTML page, which displays static information, a page that incorporates server-side scripting can change dynamically over time, interact with databases and other data sources, and provide content and transactions with users. In our class, we will focus on server side scripting. Server side scripting languages include Perl, PHP, VBScript, and Java.
Server Side Preprocessor: Most dynamic page environments use a special program that intercepts the request for a page by the user, and processes the page in advance of sending it to the server software. This "pre-processor" allows the server to serve up straightforward pages (as far as it knows), because the "pre-processor" filters the request and feeds the required page to the server to create the dynamic effect. The way to see this is to look at the source code generated by a dynamic processor, like the PHP pre-processor, or the ASP pre-processor (Microsoft). The source code looks like any other static page (only uglier, potentially much uglier, since the code is dynamically produced by a machine). At this point the HTML that is delivered to the browser is only OUTPUT that is generated on demand and not the CODE of the page written by the developer! This is why we must never overwrite a dynamic page with its static output.
Leaving HTM & HTML Extensions Behind: When we move into the world of server side scripting, we'll likely leave the .htm and .html extensions behind us. Web servers use extensions to determine which program is to process a page, exactly as our computers differentiate which program will process a file (.xls, .doc). Since search engines base their results on the history of pages as well as websites, it behooves us to move all our clients' files to an appropriate extension (such as .php) for enabled web servers prior to any specific need for server side processing. That way server side code can be added at any time in the future. Moving clients to PHP pages without having any code in them does not change the way the HTML is handled on the page. If no PHP is present, the file is processed as normal.
Server Side Extensions and Local Development: When we change site pages to the .php (or other) server side extension, we change forever how our local development computers are likely to handle such files. When a file has a .htm extension, our local computer may load the file by default into a browser for viewing. When a file has a .php extension, a server is assumed to be required, making local development potentially more complicated, as a web server is now required for the page.
Web Applications instead of Web Pages: Once we move into the world of server side pages, we benefit from being able to build web applications which respond dynamically to our users. Static HTML pages look great but are not built to adapt to the needs of a user. Whenever we see a form to be filled out, such as a login to a website, or a list of many items, such as the books at an online store, we are looking at HTML that is delivered as the facade to a web application. The HTML we have been building will essentially become the 'skin' over an application that is usually built upon server and client side scripting and database connectivity. Moving from static pages to dynamic web applications is the core goal of this class.
Web Databases: Web applications access data stored by a Database Management System (DBMS) designed to facilitate access to data. The abilities and limitations of the DBMS are a major concern to the developer, who must limit the number and duration of "hits" to the database in order to allow it to serve many users. Database connectivity is almost always the speed bottleneck in an application, in that data may be physically retrieved over many hard disks, compiled and returned to the web server. Surprisingly, when all manner of caching and trickery are removed, static HTML pages are almost always served more quickly than any page that involves a database!
SQL Language: The DBMS systems allow a developer to create Queries to access the data, using their own variant of a universal database language called SQL, which stands for Structured Query Language. Web database systems in use today include Oracle, SQL Server, Microsoft Access and MySQL.
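A short sketch of what a query looks like in practice follows. It uses Python's built-in sqlite3 module only because it needs no separate database server; SQLite is not one of the systems named above, and the books table, its rows, and the books_by_author function are made-up sample material. The SELECT statement itself is ordinary SQL of the kind each DBMS accepts in its own variant.

```python
import sqlite3


def books_by_author(conn, author):
    """Run one parameterized SQL query and return the matching titles."""
    rows = conn.execute(
        "SELECT title FROM books WHERE author = ? ORDER BY title",
        (author,),  # the value is passed separately, never pasted into the SQL
    )
    return [title for (title,) in rows]


# Build a small in-memory database with sample rows for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT, author TEXT)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [("Dune", "Herbert"), ("Emma", "Austen"), ("Persuasion", "Austen")],
)

print(books_by_author(conn, "Austen"))
```

Asking for exactly the rows and columns we need in a single query, rather than making several trips, is one way to keep the number of "hits" to the database down.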
The Server Side Environment: The developer will usually need to choose between programming environments that include compatible elements. It is not usually recommended to mix environments to a great degree, as there are many potential pitfalls. The most common potential web development environments are PHP, ASP, JSP, Cold Fusion and the .NET environment. Below are examples of the environments a developer may choose:
Red Hat Linux 9.0
ASP 3.0 pre-processor
C# / VB.NET
In most server side scripting environments, when we request a page from the web server, the pre-processor monitors the request. If the applicable extension exists on the file requested (the ASP classic pre-processor monitors for the .asp extension), the pre-processor scans for scripting tags and attempts to process any server side code inside the script tags.
Frequently this server side code connects to a database or queries server side resources like the system clock, per the programmer's needs. Once this is done, the resulting HTML is delivered to the browser, as if it were a static HTML page. The HTML delivered by a server side page is merely the output, not the actual code that created the page.
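The scan-and-replace behavior described above can be modeled with a toy pre-processor. This is a sketch in Python under stated assumptions, not how the PHP or ASP pre-processors actually work: the preprocess function, the resources dictionary, and the rule that a tag names a single server side resource are all invented for illustration.

```python
import re
from datetime import datetime

# Tag syntax assumed for this sketch: <?= name ?>
TAG = re.compile(r"<\?=\s*(.*?)\s*\?>")


def preprocess(filename, source, resources):
    """Return the HTML output for a page, running tags only if the extension applies."""
    # Files without the monitored extension pass through untouched.
    if not filename.endswith(".php"):
        return source
    # Replace each tag with the current value of the named server side resource.
    return TAG.sub(lambda m: str(resources[m.group(1)]()), source)


page = "<p>Generated in <?= year ?></p>"
# The system clock is the server side resource consulted here.
print(preprocess("index.php", page, {"year": lambda: datetime.now().year}))
```

The browser receives only the finished paragraph, never the tag. A .php file containing no tags comes out exactly as it went in.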
Why PHP/MySQL? For our purposes, we have chosen PHP with MySQL as our development environment. PHP is currently the most popular server side scripting language in the world and is open source, which means any changes must be approved by a diverse body of developers and businesses. PHP also has the advantage of being a good language in which to learn the fundamentals of server side scripting before relying on heavily abstracted objects to do our work for us.
PHP can run on Windows and other operating systems, but I highly recommend using a Linux/UNIX web server running Apache when using PHP as the server side development platform. This is the most common operating system/server software combination in the world. There are several things we can do with a UNIX web server that can't be done currently on a Windows host. We'll learn more about this later in the quarter.
Contrast our environment with the latest .NET environment, where development is very product dependent. As the direction of one corporation turns, so must all developers who embrace that environment. However, the advantages are ease of implementation and best practice development benefits produced by an industry giant.
Which Environment for my customer?: This is a good question to ask for every potential client. If I know a client uses Microsoft technologies internally in their business or network, I would recommend using ASP.NET, with pages written in C#, connecting to a SQL Server database. If that means the customer needs to go elsewhere for assistance, so be it. If the client is running on a UNIX server, chances are PHP/MySQL is a good bet. We can determine which environment a client is using by visiting netcraft, clicking on "what's that site running", and entering the customer's domain name. If the client is building a new site with their web presence being their primary customer interface, I would again lean toward PHP/MySQL or PHP/PostgreSQL on UNIX.
What About A Content Management System?: There have been many systems designed in the technologies above that allow users to maintain their websites without needing to resort to learning programming. These are called Content Management Systems (CMS), and they vary wildly in quality and features. On the PHP/MySQL side the strongest bets are WordPress, Joomla & Drupal, in that order (from simple to complex & capable).
Frameworks: If we decide our client's needs are specific, and decide not to use a CMS, we will either build the site ourselves (a custom/boutique site) or speed up our development process by using a web application framework.
Using a framework can speed up our development process and make it easier for other developers to understand the code. Frameworks frequently employ the MVC (model-view-controller) architecture, which separates the design from web plumbing.
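The separation that MVC aims for can be sketched without any framework. The example below is in Python and is deliberately tiny; the BookModel class, the render_list and handle_request functions, and the /books path are hypothetical names, and the sample titles are made up. Real frameworks such as the ones named in these notes organize the three parts in their own ways.

```python
from html import escape


# Model: holds the data and knows nothing about HTML.
class BookModel:
    def __init__(self):
        self._books = ["Emma", "Dune"]  # sample data for the sketch

    def all_titles(self):
        return sorted(self._books)


# View: turns data into HTML and knows nothing about where the data came from.
def render_list(titles):
    items = "".join("<li>" + escape(t) + "</li>" for t in titles)
    return "<ul>" + items + "</ul>"


# Controller: receives the request and connects the model to the view.
def handle_request(path, model):
    if path == "/books":
        return render_list(model.all_titles())
    return "<p>Not found</p>"


print(handle_request("/books", BookModel()))
```

Because the view only receives a list of titles, a designer could change the HTML without touching how the data is stored, and the model could later be switched to a database without touching the HTML.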
There are PHP frameworks such as CakePHP & CodeIgniter. In this class, however, we'll use a minimum of third party applications so we can focus on a fundamental understanding of the underlying architecture. We can build many of the capabilities of frameworks on our own. Later on I recommend studying these frameworks, particularly CodeIgniter, which appears to be the most popular framework of the day.
What do I do for my customer?: Picking a development environment & a framework (or not) is a huge decision to make for a client. Interview the client carefully, and consider all alternatives to help them make the best choice. Never do resume driven design. That means, do what's best for the client, not what makes our resume 'sparkle' with cool new technologies.
Is your client a simple ma & pa grocery store? Perhaps a WordPress site with a free theme/template would be the best move. Does your client have aspirations to be a world-wide organization involving dozens to hundreds of people uploading content? Drupal might be worth considering.
Here's a handout that may help make good decisions for clients: Server Development Decisions