Use Excel to Analyze ABS Table Builder Data

Use Excel to Analyze ABS Table Builder Data

Assignment One: Use Excel to Analyze ABS Table Builder Data

 

– Why are we doing this assignment?

 

            – Specific educational objectives – Learn how to:

  1. Understand and use Australian Bureau of Statistics’Table Builder data sets
  2. Use Excel to do descriptive statistics

 

            – Why do we need to know this?

  • The Australian Bureau of Statistics (ABS) is the major source of empirical evidence on Australian society.
  • Excel is the most frequently used software for basic data keeping and data analysis by Australian public, private, and not-for profit sectors.
  • Understanding of ABS data sets and familiarity with Excel will enable you to carry outbasic evidence-based quantitative analysis, which is necessary for market analysis, policy analysis, and needs assessment.

 

– What exactly are you supposed to produce?

           

            Due: Monday, 6 April, at 9 AM, at My Uni

 

            – Required file format

  1. Microsoft Word 2007 or more recentversion (available at the Campus labs)
  2. Microsoft Excel 2007 or more recentversion (available at the Campus labs)

 

            – Length:

– MS Word file

– Cover page

– 1 page, with name, student number and date of submission

– Textual Report

– 1.5 pages (about 500 words),single spaced

– 12 pt size Times New Roman, 1 inch margins on all sides

– Excel File

– pasted and formatted tables

– calculated indicators

– organized combined table

– scatterplot with a regression line

 

            – in the nutshell

– pick two variables from the ABS Table Builder Data Sets, using

Australian Census data (2006, 2011, or 2016 Censuses)

– formulate a hypothesis that they are causally related

– test that hypothesis

– write up the findings

Structure

 

– Textual Report (MS Word file) (one single-spaced page, about 500 words)

            1) Justify Variable Selection(1-2 paragraphs)

– Explain Dependent Variable

– what is it?any concerns with the way it was  measured?

– find out how it was measured in Census Dictionary

(see p.6 below)

– Explain Independent Variable

– what is it?  any concerns with the way it was  measured?

– Explain the Hypothesis

– how do you think that these two variables are related?

– direction and strength of the relationship

– Provide a causal justification (story)

            2) Univariate Analysis (1 paragraph) (covered in the Labs #1 and #2)

– Dependent Variable:

– mean, median, mode, standard deviation

– Independent Variable:

– mean, median, mode, standard deviation

            3) Bivariate Analysis (1-2 paragraphs) (covered in the Labs #2 and #3)

– write down the regression equation

– report and interpret b, r

– describe the direction and strength of the relationship

– interpret scatter plot

– any issues with outliers or non-linearity?

– if yes, deal with them and interpret results

            4) Conclusion (1-2 paragraphs)

– was your initial hypothesis supported or not?

– discuss relevance of ecological fallacy

 

-Excel File

– should contain

  1. Dependent Variablelabels and values (imported from ABS Table Builder)

– calculated (with Excel formulae, NOT in another way):

-rates, then: mean, median, standard deviation

  1. Independent Variable labels andvalues (imported from ABS Table Builder)

– calculated (with Excel formulae, NOT in another way):

– rates, then: mean, median, standard deviation

  1. Association

– combined table, correctly organized with matching years and names of                                         states and territories

– calculated (with Excel formulae, NOT in another way):

– intercept, slope, correlation

– scatterplot (with regression line and regression equation)

– if needed, outliers and non-linearity dealt with
– Format

– name your assignments file using this form:

– “A1_Text_Firstname_Lastname.doc”

– “A1_Data_Firstname_Lastname.xls”

– for example:

– “A1_Text_Djordje_Stefanovic.doc”

– “A1_Data_Djordje_Stefanovic.xls”

– text file

– use correct grammar and spell check

– sentence construction

– make sure that sentences are grammatical

– prefer short and focused sentences to very long and unfocused ones

– use good style
– do not use “can’t” or “don’t”

– avoid using “this”, “that” and “it”

– use paragraphs and full sentences

– do not use the point form

– paragraph length

– not shorter than 3 sentences

– not more than 2/3 of the page

– try to write as clearly as you can

– social science jargon is avoided

– if you have to use it then it is (briefly) defined and used correctly

– capitalize your two variables:

– wrong: university education

– correct: University Education

– data file

– only 2 decimal places should be used in calculation

 

 

 

 

Grading Key

 

1)  Format (10%)

  1. a) What will George look for?
  • Is everything in line with the instructions (see page 3 above)
  1. b) Potential Problems
  • Forgot to spell check
  • No cover page
  • More than 2 decimal points
  • Using the point form instead of full sentences and paragraphs

 

2) Variable Justification (20%)

  1. a) What will George look for?
  • level of variables correct (ratio)
  • discussion of any problems with values
  • clear and correct statement of the hypothesis
  • causalargument(“story”) makes sense
  1. b) Potential Problems
  • wrong variable level (nominal or ordinal or interval)
  • wrong variables: ABS Table Builder does not provide state/territory-specific data
  • you are using the same variables (one or both) that I used in the model assignment
  • hypothesis not stated
  • causal justification not formulated or does not make sense

 

3) Calculations and Presentation: Individual Variables (20%)

  1. a) What will George look for?
  • copy and paste in Excel done correctly
  • same Census year selected for both variables
  • conversion of absolute numbers into rates done correctly
  • correct calculation of mean, median, standard deviation
  1. b) Potential Problems
  • years used for two variables do not match (such as 2016 and 2006 Censuses)
  • using absolute numbers instead of rates
    • you have to convert absolute numbers into rates or % (explained in Lab 1)
  • excel formulae not used or wrong formulae used
  • forgot to calculate and show one or more of the descriptive statistics required
  • combined table does not contain state/territorial labels
  • instead of formulae, excel cells contain just the number, calculated by hand or calculator (frequent mistake)

 

 

4) Calculations and Presentation: Association (20%)

  1. a) What will George look for?
  • state/territorial data correctly matched (explained in Lab 1, Exercise 1)
  • a, b, r correctly calculated
  • regression equation correctly written out
  • scatterplot clear
  • correct diagnostics  on the outliers and non-linearity
  1. b) Potential Problems
  • scatterplot unreadable
  • graph mixes up X and Y axis (explained in the lab #2)
  • outliers not noticed – simply ignored
  • outliers not handled correctly
  • instead of formulae, excel cells contain just the number, calculated by hand or calculator – it must be clear where these numbers come from

 

5)  Interpretation of Results (20%)

  1. a) What will George look for?
  • requested descriptive statistics correctly reported
  • correct interpretation of strength
  • correct interpretation of direction
  • correct interpretation of scatterplot
  1. b) Potential Problems
  • calculations in Excel correct, but wrong numbers reported in the text file
  • interpretations are wrong

 

6) Conclusion (10%)

  1. a) What will George look for?
  • correct assessment whether the hypothesis was supported
  • correct discussion of ecological fallacy threat
  1. b) Potential Problems
  • claiming that the hypothesis was supported when it was not (frequent mistake)
  • no discussion of ecological fallacy
  • poor understanding of the ecological fallacy issue

                                                                                                     

 

 

 

 

 

How do I do this?

 

  1. Review the Model Assignment posted at the My Uni

– to be posted Friday this week

 

  1. Review the Cheat Sheets and your class notes from Labs

 

  1. Find two variables that meet the requirements

– must be ratio level

– must not be one of the two variables used in the model assignment

– you must be able to come up with a story (causal argument) that explains the                                possible relationship between the two variables (but no lit review)

– two variables must have data for all states/territories and for the same year

– review how the variables were measured at Census dictionary:

https://guest.censusdata.abs.gov.au/webapi/jsf/dataCatalogueExplorer.xhtml

– if they are absolute numbers, you will need to calculate percentages or rates

(such as the homicide rate per 10,000 people in Australian states/territories)

– some trial and error is a normal part of the analytical work

 

  1. Use Cheat Sheets from Labs to do Excel analysis

– paste and format tables

– do calculations

– produce a scatterplot with a regression line

– look at the model assignment, Excel file, to make sure you did it right

 

  1. Analyze the Excel results

– determine whether you have problems with skewness

– decide which indicators of central tendency and spread are

the best to report

– determine whether you have problems with outliers or non-linearity

– take measures as appropriate (see Lab #2 and #3)

– determine strength and direction of the relationship

 

  1. Draft the text file

– use the Excel results and your analysis

– look into the model assignment (text file) to make sure you are doing it right

– do not worry about the grammar, style, etc at this point

 

  1. Revise and edit the text file

– fix paragraph structure, transitions between paragraphs

– fix grammar, style, spelling

 

  1. Proofread

 

 

            9. Submit the Assignment One at