Last Wednesday I submitted the ethics forms for the two-stage study that I will be conducting between May and December. During the rest of April I will be making the final decisions about which database queries to include in the first questionnaire. Though I have already posted some examples of the queries, there is still an important issue that I have to consider for this decision: how will we analyze the data from the first stage of the study?
To answer this question I have been dusting off what I learned about Probability and Statistics over the past four years (starting with the two undergraduate courses during my 3rd year at the University of Havana and ending with the subset of it that was necessary for the Natural Language Computing course I took this term).
After going through the first two sections of an online book that caught my attention (Elementary Concepts and Basic Statistics), I am now looking deeper into “categorical data analysis”. Since most of the variables that I think can be used to describe diagrams are either nominal or ordinal, I will be studying methods that do not require interval or ratio variables.
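As a rough illustration of what such a method can look like, here is a minimal sketch of Pearson's chi-square statistic for a contingency table of two nominal variables. It is only one common example of categorical data analysis, not necessarily the method the study will use. The function name `chi_square_statistic` and the counts in `example_counts` are made up for illustration; they are not data from the study.

```python
def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for a contingency table.

    `table` is a list of rows, each a list of observed counts.
    Assumes every row and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


# Hypothetical counts: rows are two diagram categories,
# columns are two answer categories (invented numbers).
example_counts = [
    [10, 20],
    [20, 10],
]

statistic, dof = chi_square_statistic(example_counts)
print(statistic, dof)
```

The statistic only uses counts per category, so it needs no interval or ratio scale. A larger value relative to the chi-square distribution with the given degrees of freedom suggests the two variables are not independent.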