Getting examples: sequences of SQL statements

To create examples of real sequences of SQL statements we can use a django command that replaces runserver. This way we are able to see for example what happens when a user creates a milestone on Basie:

SELECT django_session.session_key, django_session.session_data, django_session.expire_date FROM django_session WHERE (django_session.session_key = 6c8039fc8f68a6bd34f130e6c59660cc  AND django_session.expire_date > 2010-02-09 16:45:50.814329 )

SELECT auth_user.id, auth_user.username, auth_user.first_name, auth_user.last_name, auth_user.email, auth_user.password, auth_user.is_staff, auth_user.is_active, auth_user.is_superuser, auth_user.last_login, auth_user.date_joined FROM auth_user WHERE auth_user.id = 1

SELECT a3c_userprofile.id, a3c_userprofile.user_id, a3c_userprofile.status, a3c_userprofile.language, a3c_userprofile.timezone FROM a3c_userprofile WHERE a3c_userprofile.user_id = 1

SELECT a3c_project.id, a3c_project.name, a3c_project.tagline, a3c_project.description, a3c_project.slug, a3c_project.locked, a3c_project.default_group_id, a3c_project.anonymous_group_id, a3c_project.homepage_id, a3c_project.channel_id FROM a3c_project WHERE a3c_project.slug = basie

SELECT a3c_project.id, a3c_project.name, a3c_project.tagline, a3c_project.description, a3c_project.slug, a3c_project.locked, a3c_project.default_group_id, a3c_project.anonymous_group_id, a3c_project.homepage_id, a3c_project.channel_id FROM a3c_project WHERE a3c_project.id = 1

SELECT (1) AS a FROM milestones_milestone WHERE (milestones_milestone.project_id = 1  AND milestones_milestone.name = Create different notations )

INSERT INTO milestones_milestone (name, project_id, due_date, description) VALUES (Create different notations, 1, 2010-02-28, )

SELECT django_content_type.id, django_content_type.name, django_content_type.app_label, django_content_type.model FROM django_content_type WHERE (django_content_type.model = milestone  AND django_content_type.app_label = milestones ) ORDER BY django_content_type.name ASC

INSERT INTO audit_genericauditmodel (audit_instance_name, audit_timestamp, audit_user_id, audit_project_id, audit_change_type, audit_object_id, audit_content_type_id) VALUES (Create different notations, 2010-02-09 16:45:50.873381, 1, 1, C, 1, 29)

INSERT INTO milestones_milestone_audit (genericauditmodel_ptr_id, name, project_id, due_date, description, id) VALUES (2, Create different notations, 1, 2010-02-28, , 1)

Structured Query Language

After going through SQL statements, this is a summary of the information that we should be able to annotate on ERDs:

  • Which entities are accessed?
    • SELECT … FROM table_name … — one entity (read).
    • DELETE FROM table_name WHERE … — one entity (delete).
    • INSERT INTO table_name … — one entity (add).
    • UPDATE table_name SET … — one entity (modify).
    • TRUNCATE TABLE table_name — one entity (modify).
    • SELECT … FROM table_name1 [INNER/LEFT/RIGHT/FULL/ JOIN table_name2 ON ... -- two or more entities (read).
    • SELECT ... FROM table_name1 UNION ALL SELECT ... FROM table_name2 ... -- two or more entities (read).
    • SELECT ... FROM table_name1 UNION SELECT ... FROM table_name2 ... -- two or more entities (read) + not all the rows.
    • SELECT ... INTO new_table_name ... FROM old_table_name -- two entities (read-add).
    • SELECT DISTINCT ... -- not all the rows (read).
    • SELECT TOP number|percent ... -- not all the rows (read).
  • Which attributes are accessed?
    • SELECT column_name(s) FROM ... -- read.
    • SELECT aggregate_function(column_name) FROM ... GROUP BY column_name ... -- read.
    • ... WHERE column_name operator value ... -- filter rows.
    • ... WHERE column_name BETWEEN value1 AND value2 ... -- filter rows.
    • ... WHERE column_name LIKE pattern ... - filter rows.
    • ... GROUP BY column_name HAVING aggregate_function(column_name) operator value ... -- filter rows.
    • ... WHERE column_name IN (value1,value2,..) ... -- filter rows.
    • INSERT INTO table_name VALUES (value1, value2, value3,....) -- set values.
    • INSERT INTO table_name (column1, column2, column3,...) VALUES (value1, value2, value3,....) -- set values.
    • ... ORDER BY column_name [ASC|DESC] … — change order of rows.
    • GROUP BY column_name … — change order of rows.
  • Which relationships are accessed? This one is a merge between the first two when the table names or the column names correspond to relationships.
  • Different types of access: read, add, filter, …
  • Order in which the entities, attributes, relationships are accessed: sequential, parallel, nesting, multiple access at different points in time + procedural extensions.

See:

* SQL Quick Reference. (archive)

References: A Survey of Graphical Notations for Program Design — An Update

On my first time at the Engineering & Computer Science Library, I got a photocopy of “A Survey of Graphical Notations for Program Design — An Update”.

Leonard L. Tripp starts the paper marking the importance of the design notation among the information exchanged during the design process. In general, as stated by the author, this paper is focused on graphical notations for program design (i.e. used to represent the logic of the software and implementation details) published since 1977. An exception is a box and line notation from 1972, but there is no reference to entity-relationship diagrams (1976).

In total, 18 notations are presented with parts of diagrams that correspond to basic constructs (mainly: sequences, selection from alternatives, and loops). Though I found the examples of diagram’s parts more descriptive than the syntax definition sections, It is interesting the use of BNF to describe the design notations.

I think it will be useful to go back to this paper after we define what run-time information can be included on the ERD and when the time to describe the different notation alternatives comes.

* This week I also got a book (Documenting Software Architectures) from the Faculty of Information’s Library (Inforum) and found part of another (Thinking with Diagrams) at Google books. — I will not have time to read those before I finish the two course assignments that are due next week

Entity-Relationship diagrams

An entity-relationship diagram have simple elements:

  • Entities (some of them marked as weak).
  • Relationships between two or more entities (some of them marked as identifying relationships).
  • Attributes (some of them marked as primary keys) that can be associated to entities or relationships.
  • Cardinality constrains on the relationships.

See Also:

Back-annotation of entity-relationship diagrams to represent data access

Next year I will be presenting the results of my M.Sc. research project. The topic in which I will be working during the next months will be the:

  1. Design and evaluation of of different ways to back-annotate entity-relationship diagrams with run-time information.
  2. Development of a tool to automatically produce the back-annotated entity-relationship diagrams associated to requests served by a Django site.

During the next months the design of some candidate static notations (not requiring animation, therefore easy to draw by hand) will have the higher priority. Then, those notations will be tested with users to evaluate them. Finally, after we get results of which information of the run-time execution will be useful and how it should be represent, the development will start …

Hopefully the results of this research project will be applied to:

  1. Help new developers of a project to understand the design of a database driven software.
  2. Debug (and maybe track the performance of) a database driven software.

Thanks to Greg Wilson, Jordi Cabot, Robert Clarisó Viladrosa and Mike Conley for the first talks about this topic. This will be my second time working in a thesis project and I think I will enjoy this experience a lot.

Code Readability

Today I am readying a paper that describes in detail an interesting study in which 120 computer science students rated 100 small code snippets (7.7 lines on average) from 1 to 5 according to their readability. The dataset and the tool used for the study are available. Some of the results of this study are that:

  • “… humans agree significantly on what readable code looks like, but not to an overwhelming extent.”
  • “factors like ‘average line length’ and ‘average number of identifiers per line’ are very important to readability”.
  • “… ‘average identifier length’ is not, in itself, a very predictive factor; neither are if constructs, loops, or comparison operators.”

Paper “How do we read algorithms?: A case study”

The paper “How do we read algorithms?: A case study” by Martha E. Crosby and Jan Stelovsky got my attention from the ones listed on the annotated bibliography about Code Reading and Program Comprehension. The authors monitored eye movement of 19 volunteers (randomly selected from the University of Hawaii’s computer science program) while reading an implementation of the binary search algorithm in Pascal. The subjects were divided in two groups (to make comparisons). The low-experienced group consisted of 10 subjects from the second semester, CS2, course. The high-experienced group consisted of eight graduate students and one recent PhD faculty member. Some interesting results presented on this paper are:

  • Though both code-oriented and comment-oriented subjects were found in the low-experienced and high-experienced groups, the low-experienced group spent significantly more time reading the comments than the experienced-group.
  • The high-experienced group expend more time reading the complex statements than the low-experienced group.
  • Increasing experience, subjects learn to discover and focus on key areas of information.

The authors included an error in the implementation but didn’t make any comparisons on the time expend looking at that error.

Code reading, navigation and searching (I)

I started reading papers about code reading, navigation and searching with a small paper: “Fluid Source Code Views for Just In-Time Comprehension”. The paper didn’t give me information about how developers actually read, navigate and search source code, but through it I noticed another paper of Dr. Chris Exton that maybe will contain more information. Other papers I will check are:

Catch up

This Monday I finished the second Computational Complexity problem set. It seams that I will not have assignments this week so I should try to catch up with my research. First, a little summary of what had happened since my last post:

  • I started a secret experiment (I will write a post about it ASAP).
  • I went to http://fsoss.senecac.on.ca/2009/ where I attended this three presentations.
  • Followed some links sent by my supervisor:
    • A video about subtextual. (Its FAQ).
    • A link to a post and comments about literate programming (I will expand on this at the end of this post).
    • A paper about an empirical study related to dynamic information flow analysis.
    • A paper about jGRASP — data structures visualizations intended to support teaching and learning activities in couses that include data structures — that includes a controlled experiment with students.
    • A paper about the ClassCompass System — automatic and collaborative support for software design education — that was evaluated using design exercises of undergraduate software engineering courses.

Since I didn’t read those papers completely (just abstract + conclusion), I will do that this week.

Going back to the literate programming post this are the relevant bits of it:

The author concluded with a question that also crossed my mind when I was reading about literate programming for the first time: “Can LP be used for anything other than small-to-medium programs written by a single person in a single language?”. There was an interesting comment of someone that was rewriting a large computer algebra program into literate style to open source it. Other comment presented an example of another large literate program (a compiler). Also there was a comment that pointed out another research question (What influence have literate programming during a software evolution?): “…relatively small changes made to the code can require a pretty major rewrite of your literate text.”.

There are also two links I will want to read this week and I will try to write tests in a literate programming style ;)

At StackOverflow DevDay

Greg Wilson, an assistant professor at the University of Toronto, gave a talk at the StackOverflow DevDay conference in Toronto, which was entertaining, informative, and generally just a huge hit.

From Joel on Software.