Structured Query Language Origins

Edgar Frank Codd and "A Relational Model of Data for Large Shared Data Banks"

In 1970, Edgar Frank Codd, an English computer scientist working at IBM, published an article titled "A Relational Model of Data for Large Shared Data Banks." In this article, Codd describes the flaws of the database systems widely used at the time. Essentially, he explains how difficult it was for a non-programmer to interact with the available database systems. He then introduces a solution, describing relational databases in mathematical terms. Codd's relational work outlined two formal languages: relational algebra and relational calculus (the basis of his proposed Alpha language). Because the new relational model would compete with IBM's existing hierarchical database products, an architecture many customers were already using successfully, IBM delayed development of relational databases for a few years.

In 1974, work on the System R database system began at an IBM research laboratory in San Jose, California, to further study the capabilities of the relational database Codd had described in 1970. The two people most commonly credited with the bulk of the findings that came out of System R are Donald D. Chamberlin and Raymond F. Boyce. However, an article published in Communications of the ACM in 1981 acknowledges a team of nearly 45 people as having made important contributions to the project.

During the System R work, a query language built on Codd's relational algebra evolved into SQUARE (Specifying Queries as Relational Expressions), and SQUARE evolved into SEQUEL (Structured English Query Language). SEQUEL eventually became SQL (Structured Query Language) after trademark issues arose around the name SEQUEL. During this period, Chamberlin and Boyce also authored two articles describing the basic principles of relational databases. One focused on DDL (Data Definition Language), which is used to create a new database or modify the structure of an existing one. The other focused on DML (Data Manipulation Language), which uses the keywords SELECT, INSERT, UPDATE, and DELETE to manipulate data within a database. Data is stored in tables composed of columns and rows. Columns represent data attributes, and rows represent data entities, or records.
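The split between DDL and DML described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the employee table, its columns, and the sample rows are hypothetical, and SQLite's modern SQL dialect is not the SEQUEL syntax used in System R.

```python
import sqlite3

# In-memory database, so the sketch leaves no files behind.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# DDL: create a table. Each column is a data attribute.
# (Hypothetical table and column names, chosen for illustration.)
cur.execute(
    "CREATE TABLE employee ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, department TEXT)"
)

# DML: INSERT adds rows. Each row is one record (entity).
cur.executemany(
    "INSERT INTO employee (name, department) VALUES (?, ?)",
    [("Ada", "Research"), ("Grace", "Engineering"), ("Alan", "Research")],
)

# DML: UPDATE changes values in existing rows.
cur.execute(
    "UPDATE employee SET department = ? WHERE name = ?",
    ("Engineering", "Ada"),
)

# DML: DELETE removes rows that match a condition.
cur.execute("DELETE FROM employee WHERE name = ?", ("Alan",))

# DDL: ALTER TABLE modifies the structure of an existing table.
cur.execute("ALTER TABLE employee ADD COLUMN office TEXT")

# DML: SELECT reads the data back; the new column is NULL (None) for old rows.
rows = cur.execute(
    "SELECT name, department, office FROM employee ORDER BY id"
).fetchall()
for row in rows:
    print(row)

conn.close()
```

The DDL statements (CREATE TABLE, ALTER TABLE) change only the shape of the table, while the DML statements (INSERT, UPDATE, DELETE, SELECT) work with the rows stored in it.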

In 1986, ANSI (American National Standards Institute) adopted SQL as the standard language for relational databases. The next year, SQL became an ISO (International Organization for Standardization) standard.

As Edgar Codd showed, the relational model improves the data independence of database systems. Since his research, SQL has become the most widely used language for relational databases such as MySQL, MariaDB (a MySQL fork), PostgreSQL, Oracle Database, and Microsoft SQL Server.