The Anatomy of a URL: Understanding the elements of a Web Address

The uniform resource locator - or URL - is one of the most fundamental building blocks of the web. A URL identifies a resource such as a webpage: it tells a browser what to request, how to communicate with the website, and in some instances how to treat and display the returned results. Understanding the anatomy of a URL - what a URL is, which elements it is composed of, and how those elements are used by a web browser and website - can help one navigate the internet more effectively (and in some cases more safely), and can help one create URLs for one's own website.

A URL can be broken up into several elements. Some of these elements are optional, and in many cases are absent from a typical URL; others are required to access a webpage or file on a website.

  • Transfer Protocol: this element of a URL names the protocol used to transfer data from a website to a browser or other client. Most ordinary websites use http or https, and any sensitive information (passwords, payment details) should be transferred via https, which encrypts the connection. Common transfer protocols include:
    http: Hypertext Transfer Protocol
    https: HTTP Secure protocol
    ftp: File Transfer Protocol
  • Domain Name: also known as the website name or host name, the domain name element of a URL often has three parts: www (optional), the second level name (e.g. algosome in www.algosome.com), and the top level name (.com, .org, .edu, .net). The domain name tells your browser where to go to download the requested webpage: the browser looks up the domain name to find the address of the server which hosts the website. The 'www' portion is optional, and in many cases one can load a website with or without the www and receive the same content. This is not always the case however: often using one format redirects a browser to the other (e.g. algosome.com redirects to www.algosome.com). The top level domain often hints at the type of website, and some names are restricted to certain kinds of owner. Examples include:
    .edu: an academic/educational site, typically a college or university (restricted).
    .gov: a website owned and operated by a government body in the United States (restricted).
    .org: an organization, traditionally a non-profit, although registration is open to anyone.
  • Port: the port is often omitted from a URL, and when omitted the default port for the protocol is assumed: port 80 for http and port 443 for https. When present, the port occurs after the domain name, separated from it by a colon (for example, www.mydomain.com:8080). A port is a numbered communication endpoint through which two computers exchange data; in the context of a URL it defines which port to connect to on the website host.
  • File name: the file name element of a URL (commonly called the path) is everything after the domain name and port, but before the file name ending (below). This is typically (though not always) just a path to a file located on the web server.
  • File Format: often .html, but in many other cases .php, .cgi, or an image ending such as .jpg. The file format often (but not always) indicates how the web server produces the content. For example, a .php ending suggests that the webpage is generated by PHP - a scripting language which facilitates 'dynamic' webpages (pages which are physically one file but present different information depending upon the given options). Common formats include:
    .html: Hypertext Markup Language
    .php: PHP: Hypertext Preprocessor
    .cgi: Common Gateway Interface
    .jpg/.jpeg: JPEG (pronounced 'jay-peg') is an image file format, among the most common on the internet today.
  • Query String: optional, the query string portion of a URL defines values sent to a website so that the website returns the correct information. The query string begins after the first question mark in the URL, and consists of key/value pairs, each written as key=value and separated from the next by an ampersand (for example, ?id=42&lang=en). Query strings are used to request dynamic content - content which is created by the same file on the server but differs based upon the parameters in the query string.
  • Fragment: a fragment is a number sign (#) followed by a name, placed at the very end of the URL. The fragment instructs a browser where to focus its attention within the page, often by scrolling to the named component. The fragment is handled by the browser itself and is not sent to the web server. As an example, append '#webpage-fragment' to the end of this URL and a browser will scroll down to this section.
  • Hidden values: although not part of the URL itself, browsers send additional information to a web server with each request, which is worth noting because it can play an important role in internet navigation. This information includes the browser name and version as well as 'cookies' - small pieces of data previously stored by a website and used to remember state and customize the experience (for instance, login information, shopping cart items, etc.).
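
The elements listed above can be pulled apart programmatically. The following is a minimal sketch using Python's standard urllib.parse module; the function name describe_url, the example URL on www.example.com, and the table of default ports are assumptions made for illustration, not part of any particular website.

```python
from urllib.parse import urlsplit, parse_qs

# Assumed default ports for the protocols mentioned in the list above.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}

def describe_url(url):
    """Split a URL into the elements described in the list above."""
    parts = urlsplit(url)

    # The file format is taken to be whatever follows the last dot
    # in the final segment of the path (empty if there is no dot).
    last_segment = parts.path.rsplit("/", 1)[-1]
    file_format = ""
    if "." in last_segment:
        file_format = "." + last_segment.rsplit(".", 1)[1]

    return {
        "protocol": parts.scheme,
        "domain": parts.hostname,
        # Fall back to the protocol's default port when none is given.
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "file_name": parts.path,
        "file_format": file_format,
        # parse_qs maps each key to a list, since a key may repeat.
        "query": parse_qs(parts.query),
        "fragment": parts.fragment,
    }

# Hypothetical URL containing every element.
info = describe_url(
    "https://www.example.com:8080/articles/url-anatomy.php?id=42&lang=en#port"
)
for name, value in info.items():
    print(f"{name}: {value}")
```

Note that the hidden values (browser name, cookies) do not appear here, since they travel alongside the URL in the request rather than inside it.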

The above elements of a URL instruct a browser how to locate and download a file from a website. The domain name is first used to identify the web host, and the protocol and port determine how the browser connects to it. The file name and query string (if one exists) are then sent to the web host, which uses this information to locate the appropriate file. Often the web host uses scripts - files which send different information based upon the requested content. Lastly, the website sends the content - be it a webpage, file, or image - back to the web browser, which in turn displays it.
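
To make the exchange described above more concrete, here is a rough sketch of the text an http client might send to a web host once it has connected. It only builds the request and does not open a network connection; the function name build_request, the 'example-browser/1.0' browser name, and the example URL are assumptions for illustration, and real browsers send many more header lines.

```python
from urllib.parse import urlsplit

def build_request(url):
    """Build a bare-bones HTTP/1.1 GET request for the given URL."""
    parts = urlsplit(url)

    # The file name and query string are what the web host receives.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query

    # The domain name (plus port, if given) goes in the Host header.
    host = parts.hostname
    if parts.port is not None:
        host += f":{parts.port}"

    lines = [
        f"GET {target} HTTP/1.1",
        f"Host: {host}",
        # A 'hidden value': the (hypothetical) browser name and version.
        "User-Agent: example-browser/1.0",
        "Connection: close",
        "",  # a blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)

request = build_request("http://www.example.com/page.php?id=7#top")
print(request)
```

The fragment ('#top' in this example) does not appear anywhere in the request: the browser keeps it and uses it only after the content has been returned.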





© 2008-2022 Greg Cope