Created 19 December 2002, Updated August 31, 2016

Miscellaneous Web Files

One of the things about web development is the number of pieces of arcana one picks up. These things are a bit like trivia, and a bit like core knowledge. Here are some common files you may have heard of, but perhaps you never quite knew what they were.

favicon.ico
.ico is the Windows icon file format. favicon.ico is the file that Internet Explorer originally requested to show an icon next to bookmarked ("favorite") sites. Most browsers now request it from the site root by default and also show it in tabs.
robots.txt
This is a simple text file that sits at the root of a web server and tells friendly robots (crawlers) which parts of the site they may or may not crawl, based on conditions such as URL path and user agent. If you don't want your site, or part of your site, in search engines, learn more about this file.
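As a rough illustration of how a friendly robot might read these rules, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content, the paths, and the crawler names ("AnyCrawler", "BadBot") are made up for this example and do not come from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: BadBot
Disallow: /
"""


def build_parser(text):
    """Parse robots.txt text and return a parser that can answer queries."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# An ordinary crawler may fetch public pages but not /private/.
print(parser.can_fetch("AnyCrawler", "/index.html"))
print(parser.can_fetch("AnyCrawler", "/private/notes.html"))

# The crawler named in its own section is shut out of the whole site.
print(parser.can_fetch("BadBot", "/index.html"))
```

The rules are split by user agent first and by URL path second, which matches the two conditions mentioned above. Nothing in the file is enforced by the server; a crawler has to choose to check it.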
.htaccess
On systems running Apache as the web server, an .htaccess file is a means of giving the server extra instructions on how to handle the contents of a certain directory and everything below it. These may include password authentication, mod_rewrite rules, the default index file, and so on. Depending on your system configuration, all, some, or none of the standard Apache syntax will work. Consult your system administrator for more information.
application.cfm
On sites running the ColdFusion application server, this is the "default included" file. That is, in a directory containing an application.cfm file, every file processed by ColdFusion will include and process this file by default.
global.asa
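On sites serving Active Server Pages (ASP), this is an optional file kept in the root directory of the application. Rather than being included in every page, it holds application-wide and session-wide event handlers (such as Application_OnStart and Session_OnStart) and object declarations, which ASP runs when the application or a user session starts or ends.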
cgi-bin
In some web hosting environments, server-side processing is limited to a certain directory, and that directory is often named "cgi-bin". CGI stands for "Common Gateway Interface", a reference to the earliest way that pages were made dynamic. "bin", however, is a Unix term meaning "binaries": programs that can be run. So a file in the cgi-bin directory is likely to be a small program that the web server runs to produce a page. Server-processed languages such as ASP, ColdFusion, and PHP are evolutionary steps up from CGI programs.
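To make the idea concrete, here is a minimal sketch of what a CGI program might look like in Python. The server runs the program, passes request details in environment variables such as QUERY_STRING, and sends whatever the program prints back to the browser. The "name" query parameter and the render function are hypothetical, chosen only for this example.

```python
#!/usr/bin/env python3
import html
import os
import sys
from urllib.parse import parse_qs


def render(query_string):
    """Build a complete CGI response: header, blank line, then the body."""
    params = parse_qs(query_string)
    # "name" is a hypothetical query parameter used only for this example.
    name = params.get("name", ["world"])[0]
    # Escape the value so user input cannot inject markup into the page.
    body = "<html><body><p>Hello, {}!</p></body></html>".format(html.escape(name))
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    # The web server puts the part of the URL after "?" in QUERY_STRING.
    sys.stdout.write(render(os.environ.get("QUERY_STRING", "")))
```

Assuming the host allows it, a file like this would be placed in cgi-bin, marked executable, and requested with a URL ending in something like ?name=Ada. Exact setup varies from host to host.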
crossdomain.xml
A cross-domain policy file is an XML document that grants a web client, such as Adobe Flash Player or Adobe Acrobat (though not necessarily limited to these), permission to handle data across domains. See: Setting a crossdomain.xml file for HTTP streaming
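For a sense of what such a file contains, the sketch below parses a small sample policy with Python's standard xml.etree.ElementTree module and lists the domains it grants access to. The sample policy and its domains are placeholders, and the wildcard check uses fnmatch as a simplification; it is not a reproduction of the matching rules any particular client applies.

```python
import xml.etree.ElementTree as ET
from fnmatch import fnmatch

# Hypothetical policy file; example.com and example.org are placeholders.
POLICY = """<?xml version="1.0"?>
<cross-domain-policy>
  <allow-access-from domain="*.example.com"/>
  <allow-access-from domain="www.example.org"/>
</cross-domain-policy>
"""


def allowed_domains(policy_xml):
    """Return the domain patterns listed in allow-access-from elements."""
    root = ET.fromstring(policy_xml)
    return [el.get("domain") for el in root.iter("allow-access-from")]


def is_allowed(policy_xml, domain):
    """Simplified check: shell-style wildcard match against each pattern."""
    return any(fnmatch(domain, pattern) for pattern in allowed_domains(policy_xml))


print(allowed_domains(POLICY))
print(is_allowed(POLICY, "static.example.com"))
print(is_allowed(POLICY, "evil.test"))
```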
googleXXXXXXXXXXXX.html
Seeing a file named "google" followed by a string of letters and numbers and then ".html" indicates that someone has verified the site using Google Webmaster Tools (renamed Google Search Console in 2015). Google uses that file to verify your ownership of the site.
What should I add next? .htpasswd maybe?