When representing a database query, one of the alternatives is to use fake data (i.e. examples of data) that does not necessarily fill all the cells of the tables. In this approach we iteratively exclude rows according to the values of the columns mentioned in the WHERE/HAVING/ON clauses. The exclusion of rows can be shown either with different marks over a single table or by drawing a set of intermediate tables.
The distinctive feature of this kind of representation is that the fake data is generated so that the representation is as unambiguous as possible, although this is not always achieved. It is possible (though not yet proven) that the process of generating this fake data helps the programmer reason about the database query.
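To make the row-exclusion idea more concrete, here is a minimal sketch of the "set of intermediate tables" variant. Everything in it is an assumption made for illustration: the `sailors` rows, the two conditions, and the helper `exclude_iteratively` are hypothetical and not part of any existing tool.

```python
# Hypothetical fake data: a few example rows; None marks a cell left unfilled.
sailors = [
    {"sid": 1, "sname": "Ann", "age": 31},
    {"sid": 2, "sname": None, "age": 19},
    {"sid": 3, "sname": None, "age": 40},
]

# Each step pairs a label with one condition taken from a WHERE-style clause.
steps = [
    ("age > 25", lambda row: row["age"] > 25),
    ("sid <> 3", lambda row: row["sid"] != 3),
]


def exclude_iteratively(rows, steps):
    """Apply one condition at a time and keep every intermediate table."""
    current = list(rows)
    tables = [("initial", current)]
    for label, keep in steps:
        # Rows that fail the condition are excluded from the next table.
        current = [row for row in current if keep(row)]
        tables.append((label, current))
    return tables


tables = exclude_iteratively(sailors, steps)
for label, rows in tables:
    print(label, [row["sid"] for row in rows])
```

Each printed line corresponds to one intermediate table, so the reader can see which condition removed which row. Note that only the columns used in the conditions need values; `sname` can stay empty.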
One of the assignments I got from the last meeting with my supervisor was to check whether this is related to the use of the term “query by example” in the literature. Let’s see what I found.
Contrary to what the name suggests, Raghu Ramakrishnan and Johannes Gehrke’s book “Database Management Systems” associates “query by example” with a database query language developed at IBM, in which each query is represented by tables that contain only data taken directly from the query. Let’s analyze the representation of one query to see some of the main characteristics of this language:
This query finds the colors of “Interlake” boats reserved by sailors who have reserved a boat for 8/24/96 and who are older than 25. The notation uses “P.” to mark the columns that are selected, “.UNQ” to represent the DISTINCT clause, variables like “_Id” and “_B” to represent values that must be equal, and literals with comparison operators to represent the WHERE clause.
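For comparison, here is a rough sketch of how the same query might look in SQL, run through Python's `sqlite3` module. The schema (Sailors, Reserves, Boats with these column names) and all the rows are assumptions made for illustration. I also assume that the date condition and the boat join apply to the same reservation row, which is one reading of the English description.

```python
import sqlite3

# In-memory database with an assumed schema and made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL);
CREATE TABLE Boats (bid INTEGER, bname TEXT, color TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
INSERT INTO Sailors VALUES (1, 'A', 7, 30.0), (2, 'B', 5, 22.0), (3, 'C', 9, 40.0);
INSERT INTO Boats VALUES
  (101, 'Interlake', 'blue'), (102, 'Interlake', 'red'), (103, 'Clipper', 'green');
INSERT INTO Reserves VALUES
  (1, 101, '8/24/96'), (2, 102, '8/24/96'), (3, 102, '9/8/96'), (3, 103, '9/9/96');
""")

query = """
SELECT DISTINCT B.color              -- "P." with ".UNQ"
FROM Sailors S
JOIN Reserves R ON R.sid = S.sid     -- shared variable _Id
JOIN Boats B ON B.bid = R.bid        -- shared variable _B
WHERE S.age > 25                     -- literal with comparison operator
  AND R.day = '8/24/96'
  AND B.bname = 'Interlake'
"""

colors = [row[0] for row in conn.execute(query)]
print(colors)
```

The comments mark where each piece of the notation would end up: shared variables become join conditions, and literals become WHERE conditions.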
- “Query by Example” chapter of the book “Database Management Systems” by Raghu Ramakrishnan and Johannes Gehrke (Third Edition, 2002).