FIT5185 – IT Research Methods Week 9

The final lecture on quantitative data analysis covered several specific statistical tests:

  • Binomial – given a weighted coin, how many heads will probably result from 30 tosses?
  • Median – tests whether the medians of two populations are significantly different
  • Mood’s median test – a non-parametric test for significant differences between unrelated samples
  • Kolmogorov-Smirnov – measures the cumulative difference between distributions; are the data sets different?
  • Friedman – tests for significant differences across testing intervals on a sample population
The lecture slides included clear examples of these tests. The tutorial followed up with some practical examples using SPSS. After the 4 weeks of quantitative data analysis we now have a decent toolbox, specifically for non-parametric data analysis. Our assignment requires application of these tools; I imagine it will expose some of the ambiguities that arise when reasoning from quantitative analysis.
An example of non-parametric data (source: http://perclass.com/doc/kb/15.html)
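The binomial test from the list above is easy to sketch with SciPy. This is just an illustration: the coin weighting p = 0.6 is my own assumed value, not a figure from the lecture.

```python
from scipy.stats import binom

# Weighted coin: assume P(heads) = 0.6, tossed n = 30 times
n, p = 30, 0.6

expected_heads = binom.mean(n, p)    # n * p = 18.0 heads on average
p_exactly_18 = binom.pmf(18, n, p)   # probability of exactly 18 heads
p_20_or_more = binom.sf(19, n, p)    # probability of 20 or more heads

print(expected_heads, p_exactly_18, p_20_or_more)
```

SPSS offers the same calculation through its binomial test dialog; the point is that the expected count is simply n × p, with the pmf giving the spread around it.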

FIT5185 – IT Research Methods Week 8

Probability, hypothesis testing and regression analysis continued the topic of quantitative analysis in week 8. Our discussion of the statistical techniques we are using with the SPSS package focused on the interpretation of outputs rather than the mathematics behind them. This seems reasonable given the limited time we have assigned to such a large area.

The first points covered were definitions of probability:

  • Marginal (simple) probability – the probability of a single event, e.g. rolling a six with a fair die => 1/6
  • Joint probability – P(AB) => P(A) x P(B) for independent events, e.g. rolling three sixes in a row => (1/6) x (1/6) x (1/6)
  • Conditional probability – I would stick with Bayes’ theorem => see below
Bayes’ theorem for conditional probability
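Bayes’ theorem, P(A|B) = P(B|A) x P(A) / P(B), can be checked with a small worked example. The disease-testing numbers below are my own illustrative assumptions, not from the lecture.

```python
# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
# Assumed numbers: 1% prevalence, 90% sensitivity, 5% false-positive rate
p_disease = 0.01
p_pos_given_disease = 0.90
p_pos_given_healthy = 0.05

# Total probability of a positive test (law of total probability)
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Probability of actually having the disease given a positive test
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))  # ~0.154
```

Even with a fairly accurate test, the low prior drags the posterior down to about 15% — a good reminder of why the prior matters.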
  • Binomial distribution – the probability of the number of times an event occurs, given a true/false outcome and n trials, e.g. how many times heads will appear in 20 tosses of a coin
  • Normal (Gaussian) distribution – requires continuous random variables (e.g. age), see below
Normal distribution, showing the percentage of observations within each standard deviation interval
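Those standard-deviation percentages (the 68-95-99.7 rule) can be reproduced directly from the normal CDF:

```python
from scipy.stats import norm

# Fraction of a normal population within k standard deviations of the mean
within = {k: norm.cdf(k) - norm.cdf(-k) for k in (1, 2, 3)}
for k, frac in within.items():
    print(f"within {k} SD: {frac:.2%}")
# within 1 SD: 68.27%, within 2 SD: 95.45%, within 3 SD: 99.73%
```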

Hypothesis testing and regression analysis followed. The recurring theme is the significance value of less than 0.05 required for hypothesis support.

SPSS seems like a great tool for statistical analysis: all of the widely used statistical methods are included and it is relatively simple to use.

FIT5185 – IT Research Methods Week 7

A short week for IT research methods in terms of new material. Due to the literature review presentations we did not have a tutorial and only half a lecture. The topic of the lecture was ‘Correlation Analysis’, presented by Joze Kuzic.

Let’s start with the simple definition of correlation analysis: ‘a statistical investigation of the relationship between one factor and one or more other factors’.

One point that I needed reminding of was correlation vs regression (source: http://www.psych.utoronto.ca/courses/c1/chap9/chap9.html):

Correlation – 1) both variables are random variables, and 2) the end goal is simply to find a number that expresses the relation between the variables
Regression – 1) one of the variables is a fixed variable, and 2) the end goal is to use the measure of relation to predict values of the random variable based on values of the fixed variable
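To illustrate the regression side of that distinction, a line can be fitted to predict the random variable from the fixed one. A minimal sketch with SciPy, using made-up data points:

```python
from scipy.stats import linregress

# Fixed variable x (e.g. dosage levels set by the experimenter)
x = [1, 2, 3, 4, 5]
# Random variable y measured at each x (made-up values, roughly y = 2x + 1)
y = [3.1, 4.9, 7.2, 8.8, 11.0]

result = linregress(x, y)
print(result.slope, result.intercept)  # fitted line parameters
print(result.rvalue)                   # correlation coefficient of the fit

# Predict y at a new value of the fixed variable
x_new = 6
print(result.slope * x_new + result.intercept)
```

The same `rvalue` is what a pure correlation analysis would report; regression goes one step further and uses the fitted line for prediction.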

The topic of causality and correlation was approached quite carefully in the lecture notes, which state that correlation can be used to look for causality but does not imply causality.

Methods of correlations:

Pearson’s correlation coefficient – for parametric (randomized, normally distributed) data.

Spearman rank order correlation coefficient – for non-parametric data; returns a value in [-1.0, 1.0].

The significance of correlations was the next logical point covered; not much mathematical reasoning was given, apart from p < 0.05 being good :).
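The difference between the two coefficients shows up clearly on monotonic but non-linear data. A quick SciPy sketch with made-up values:

```python
from scipy.stats import pearsonr, spearmanr

# A perfectly monotonic but non-linear relationship
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]  # y = x**2

r, p = pearsonr(x, y)         # linear correlation: strong but below 1
rho, p_rho = spearmanr(x, y)  # rank correlation: exactly 1.0 here

print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Spearman only cares about the ordering of the values, so it scores a perfect 1.0 on any monotonic relationship, while Pearson penalises the departure from a straight line.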

 

FIT5185 – IT Research Methods Week 6

Week 6 began statistical analysis using SPSS, specifically for non-parametric tests. Non-parametric data can be described as data that does not conform to a normal distribution. A simple example is ranked data such as movie reviews (0 – 5 stars). A major limitation of non-parametric tests is the increased sample size required to gain sufficient significance to reject a null hypothesis.

A good summary of the assorted types of non-parametric tests was found at http://www.graphpad.com/www/book/choose.htm:

Goal | Measurement (Gaussian population) | Rank, score, or measurement (non-Gaussian population) | Binomial (two possible outcomes) | Survival time
Describe one group | Mean, SD | Median, interquartile range | Proportion | Kaplan-Meier survival curve
Compare one group to a hypothetical value | One-sample t test | Wilcoxon test | Chi-square or binomial test** | -
Compare two unpaired groups | Unpaired t test | Mann-Whitney test | Fisher’s test (chi-square for large samples) | Log-rank test or Mantel-Haenszel*
Compare two paired groups | Paired t test | Wilcoxon test | McNemar’s test | Conditional proportional hazards regression*
Compare three or more unmatched groups | One-way ANOVA | Kruskal-Wallis test | Chi-square test | Cox proportional hazard regression**
Compare three or more matched groups | Repeated-measures ANOVA | Friedman test | Cochrane Q** | Conditional proportional hazards regression**
Quantify association between two variables | Pearson correlation | Spearman correlation | Contingency coefficients** | -
Predict value from another measured variable | Simple linear regression or nonlinear regression | Nonparametric regression** | Simple logistic regression* | Cox proportional hazard regression*
Predict value from several measured or binomial variables | Multiple linear regression* or multiple nonlinear regression** | Multiple logistic regression* | Cox proportional hazard regression* | -

All of the tests described in the table above can be applied via SPSS. Note that “Gaussian population” refers to normally distributed data. Not featured in the table is the sign test, perhaps because it is described as lacking the statistical power of the paired t test or the Wilcoxon test.
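As an example of reading the table: comparing two unpaired groups of ranked (non-Gaussian) data calls for the Mann-Whitney test rather than the unpaired t test. A sketch in SciPy, using made-up star ratings:

```python
from scipy.stats import mannwhitneyu

# Made-up movie star ratings (0-5) from two independent groups of reviewers
group_a = [4, 5, 3, 4, 5, 4, 3, 5]
group_b = [2, 3, 1, 2, 3, 2, 4, 1]

# Two-sided Mann-Whitney U test: do the two groups' ratings differ?
stat, p = mannwhitneyu(group_a, group_b, alternative="two-sided")
print(f"U = {stat}, p = {p:.4f}")

# Reject the null hypothesis at the usual 0.05 threshold?
print("significant" if p < 0.05 else "not significant")
```

The test works purely on ranks, which is why it tolerates the ordinal star-rating scale that would violate the t test’s normality assumption.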

One question that immediately comes to mind is how normalization can be applied to allow comparison of normally distributed data with non-parametric data.

The lecture went on to describe important assumptions and the rationale behind several test methods. I will await further practical testing with SPSS before going into more detail on them.

FIT5185 – IT Research Methods Week 5

The topic of week 5’s lecture, presented by David Arnott, was ‘Communicating Research’. After establishing why it is important to publish research, we covered the paper publication process in some detail.

The first step discussed was the research proposal, aimed at the target audience of supervisors, scholarship committees and confirmation panels. Regarding tense, it was advised to write in the past tense, with the exception of the results discussion, which would be written in the present tense. Proofreading and polishing were highlighted as key characteristics of a successful paper.

Referencing came next, including an introduction to author-date and numbered referencing styles.

Planning at both a paper level and a macro (research career) level was highlighted by David as a key factor for success.

The research publication process

FIT5185 – IT Research Methods Week 4

IT research methods’ fourth week was presented by Joze Kuzic, providing a detailed introduction to surveys (or ‘super looks’ as the translation demands). First off we clarified that surveys are not limited to forms that managers and students need to fill out! There are many types of surveys, e.g.:

  • Statistical
  • Geographic
  • Earth Sciences
  • Construction
  • Deviation
  • Archaeological
  • Astronomical
These are just a few types of non-form surveys. With this broader view we can see that almost anyone conducting research will need a good understanding of how to create effective surveys. Interviews were listed as a method for conducting surveys, although I imagine this would in most cases be quite dubious if used alone. Anonymous surveys appear to be the most common form of survey for people.
After discussing some of the obvious pros and cons of mail surveys, the lecture moved into population sampling.
Considering sample sizes – source week 4 lecture notes
Likert scales were subsequently introduced, along with nominal, interval and ratio frames for question responses.
Finally the format of surveys was raised, specifically the demonstrated effect format has on results.
The test in week 5 on this subject will cover experiments and surveys.

FIT5185 – IT Research Methods Week 3

Experiments were the topic of week 3’s lecture, presented by David Arnott. We started with a classification of scientific investigation:

  • Descriptive studies
  • Correlation studies
  • Experiments

Importantly, the anchor of these investigations is the research question.

Terms and concepts were the next sub-section:

  • Subject (by law in Australia, human subjects must be referred to as participants) – the target of your experimentation
  • Variables (independent variables, dependent variables, intermediate variables, extraneous variables) – these are self-explanatory via dictionary definitions
  • Variance/factor models – aim to predict an outcome from adjustment of predictor (independent?) variables within an atomic time frame. That is my loose interpretation.
  • Process models – aim to explain how outcomes develop over time (the difference between variance and process models appears moot and, I feel, somewhat irrelevant)
  • Groups -> experimentation group, control group -> ensuring group equivalence
  • Hypothesis – a prediction about the effect of independent variable manipulation on dependent variables. One-tailed, two-tailed, null hypothesis.
  • Significance – the difference between two descriptive statistics, to an extent which cannot be attributed to chance
  • Reliability – can the research method be replicated by another researcher?
  • Internal validity – how much is the manipulation of the independent variable responsible for the results in the dependent variable?
  • External validity – can the results be generalized to entities outside of the experiment?
  • Construct validity – the extent to which the measures used in the experiment actually measure the construct

Experimental Design followed:

  • Between-subject design vs within-subject design -> are subjects manipulated in the same or differing ways?
  • After-only vs before-after design -> at which stages are the dependent variables tested?
  • Statistical tests must reflect the experimental design:

 

Statistical test to reflect the experimental design - Source week 3 lecture notes
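The design-to-test mapping can be tried directly: a before-after (within-subject) design pairs each subject’s two scores, so the paired t test applies rather than the unpaired one. A sketch with made-up scores:

```python
from scipy.stats import ttest_rel, ttest_ind

# Made-up task scores for the same 6 subjects before and after treatment
before = [12, 15, 11, 14, 13, 16]
after = [14, 17, 12, 16, 15, 19]

# Within-subject (before-after) design: paired t test
t_paired, p_paired = ttest_rel(before, after)

# If the groups were independent, the unpaired t test would apply instead
t_unpaired, p_unpaired = ttest_ind(before, after)

# Pairing removes between-subject variance, so it is usually more sensitive
print(f"paired p = {p_paired:.4f}, unpaired p = {p_unpaired:.4f}")
```

On this data the paired test detects the consistent per-subject improvement that the unpaired test dilutes with between-subject variance, which is exactly why the statistical test must reflect the design.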

When creating an experimental design it seems like a good idea just to make a checklist.

The coffee/caffeine example covered next seemed a bit odd, as it made the assumption that coffee and caffeine are the same thing. I recall a similar assumption was made regarding THC and marijuana, which was later found to be fundamentally flawed. I did not understand the decision support system example at all, so I was not really able to extrapolate much understanding from the two examples covered.

FIT5185 – IT Research Methods Week 2

Unfortunately I was absent for week 2 of IT Research Methods and the lecture delivered by Prof. David Arnott. The lecture focussed on the initial stage of any research project: the literature review.

  • Thematic Analysis – Qualitative in nature, classifying papers according to themes that are relevant to your research project.
  • Bibliographic Analysis – Quantitative in nature, using citation and/or content analysis. (rarely used in IT research)

A question was posed at the start of the lecture: what is scientific evidence? Journal and conference papers along with websites, blogs, books and trade magazines were listed as possibilities. Before reading through the lecture, I felt that any of these mediums could qualify as scientific evidence. Peer-reviewed academic articles would, however, present a much more filtered source, with blogs and websites most likely containing far more refutable contentions. It seems unwise to completely discount a source of information purely on the grounds that it is a blog or website, though.

The notes go on to present a rating system for journals (A, B and C), the A-listers being:

  • Decision Support Systems
  • European Journal of Information Systems
  • Information and Management
  • Information Systems Journal
  • Information Systems Research
  • Journal of Information Technology
  • Journal of Management Information Systems
  • Journal of the Association for Information Systems
  • MIS Quarterly

The aim of a literature review can be summarized as:

  • Synthesis of articles
  • Define and understand relevant controversies
  • Based on critical review (not notes or observations)
  • Reads like an essay (but can use tables)

It seems that the thematic method of literature review is the avenue we will be encouraged to follow, which seems quite reasonable. Thematic review can be author- and/or topic-centric. Author-centric review would only be appropriate in very limited niche topics where the published articles are by a small number of researchers. When taking on a topic-centric review, creating a table with concept categorization for articles is recommended:

Webster & Watson Concept Matrix - Source week 2 lecture notes

Some questions are presented at the close of the lecture (which I imagine were answered in the lecture):

  • How long should a lit review be?
  • How many papers should be reviewed?
  • What tense should be used?
  • Which citation methodology? APA/Harvard?

I will have to follow up on these in the coming tutorial.

Finally, there was a YouTube video listed in the review materials for the week which included some good points:

  • What is the purpose of a literature review?
  1. Summarizes what has been researched before
  2. Highlights the research gaps that you will aim to fill
  3. Explains why it is necessary to fill those gaps
  4. Sets the scope of your research
  • Scope and length? – Does it need to be everything you know? No, the current state of the theory. Length requires discussion with your supervisor, but consider that this is a summary of current research: a summary of existing knowledge and a review of current research. Look for flaws and disagreement among researchers.
  • Sources – refereed international journals, books/chapters, national journals, conference papers, non-refereed articles.
  • Review of instruments – what are you using to gather data to support your hypothesis? Are they an acceptable source, and why?

 

Basic Framework:

  1. Introduction
  2. Broader Communication Issues
  3. Likely Causes (Attack methods/motivations/scenarios)
  4. Mitigation Methods
  5. Summary of literature
  6. Research aims

Make a checklist for evaluating articles!

FIT5185 – IT Research Methods Week 1

Week 1 of IT research methods was a lecture by Dr Joze Kuzic on the nature of research. The lecture bounced between subjective opinions drawn from experience in research and a framework for conducting research:

  • Formulating Questions
  • Literature Analysis
  • Case Studies
  • Surveys
  • Qualitative data analysis
  • Quantitative data analysis
  • Communication research

Also introduced were some research paradigms:

  • Scientific research (positivist)
  • Applied research (practical)
  • Social research (interpretive)

I feel that being aware of these paradigms is valuable, but self-imposing mutual exclusivity or black-and-white generalization would be counterproductive (e.g. “oh well, that’s just a positivist view” / “I can’t do that, I am doing applied research”). A more pragmatic approach, using whatever method best answers the posed question regardless of paradigm, would be required for good research.

Induction and deduction in science (source: week 1 lecture notes)

Details of Assignments 1 and 2 were also made available on Moodle this week. Assignment 1, a literature review and presentation, seems like it will be an enjoyable assignment that will allow some synergy with other subjects.