Using SPSS – A Brief Starter Guide
AJK Autumn 2015
Page 1
SECTION 1: Getting Started
1.1. Reading in Data
Data can be entered directly into SPSS or imported from a variety of file types, e.g. MS Excel, MS Access and various text files.
SPSS data files are structured by cases (rows) and variables (columns). For example, cases represent individual respondents to a survey (rows) and variables record the responses to each question asked in the survey (columns).
1.1.1. Opening an SPSS data file
The data file will have a .sav extension to its name.
To open it, choose from the menus: File > Open > Data…
Browse the list of files and select the file you need and click Open. The data should now be displayed in the Data Editor window.
1.1.2. Reading In Data from Spreadsheets (e.g. MS Excel)
Data can be read directly into SPSS from applications such as MS Excel, and the Excel column headings can be used as variable names.
From the menus choose: File > Open > Data… Select Excel (*.xls, *.xlsx, *.xlsm) in Files of type, then select the file you want and click Open.
The Opening Excel Data Source dialog box is displayed. It allows you to specify whether variable names are included in the spreadsheet, and which cells and/or worksheets you want to import. If you only want to import a portion of a spreadsheet, specify the range of cells to be imported in the Range text box.
Make sure that “Read variable names from the first row of data” is selected, as this reads the column headings as variable names. Click OK to read in the Excel file.
Note: if the Excel column headings do not conform to SPSS variable-naming rules then they will be converted into a valid form and the original column headings will be saved as variable labels so no information will be lost.
The data now appear in the Data Editor with the column headings used as variable names.
As variable names can’t contain spaces, the spaces from the original column headings have been removed. E.g. Surgery Volume in the Excel file has become the variable SurgeryVolume, but the original column heading is retained as the variable label.
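The renaming described above can be pictured with a short Python sketch. This is only an illustration of the idea, not how SPSS itself is implemented: the helper name `to_variable_name` is hypothetical, and the real SPSS naming rules cover more cases than the removal of spaces and punctuation shown here.

```python
import re

def to_variable_name(heading: str) -> str:
    """Hypothetical helper: keep only letters, digits and underscores."""
    return re.sub(r"[^A-Za-z0-9_]", "", heading)

def import_headings(headings):
    # Map each cleaned variable name to its original heading,
    # which is kept as the variable label so nothing is lost.
    return {to_variable_name(h): h for h in headings}

# Illustrative column headings, as they might appear in a spreadsheet
variables = import_headings(["Surgery Volume", "Age", "Admission Type"])
for name, label in variables.items():
    print(name, "->", label)
```

The dictionary keeps both pieces of information side by side: the short name used in analyses and the original heading used as the descriptive label.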
1.1.3. Reading In Data from a Text File
Text files are a common source of data, with most programs and databases being able to save their contents in a text file format. In comma- or tab-delimited files, each row of data uses commas or tabs to separate the variables.
From the menus choose: File > Open > Data… Select Text (*.txt, *.dat, *.csv) in Files of type, then select the file you want and click Open.
The Text Import Wizard opens and will guide you through importing your data.
Step 1 – Choose a predefined format or create a new format in the Wizard. Select No to indicate that a new format should be created.
Click Next to continue.
In this example the file uses comma-delimited formatting, and the variable names are defined on the top line of this file.
Step 2 – Select Delimited to indicate that the data use a delimited formatting structure.
Select Yes to indicate that variable names should be read from the top of the file. Click Next to continue.
Step 3 – Type 2 in the top selection box to indicate that the first case of data begins on the second line of the text file. Keep the default values for the remainder of this dialog box. Click Next to continue.
Step 4 – The data preview allows you to see if your data are being read properly. Select the relevant delimiter options (e.g. comma) and deselect the other options. Click Next to continue.
Step 5 – This dialog box gives you the opportunity to edit any variable names. This may be necessary if they have been truncated to fit formatting requirements. Data types can be defined here too, e.g. changing a numeric variable into Date/time format, etc.
Here it can be seen that the original variable name contained a space between the two words: Surgery Volume. SPSS has formatted this to become SurgeryVolume and has kept the original name as a variable label. Click Next to continue.
Step 6 – Leave the default selections on in this dialog box.
Click Finish to import the data.
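For readers who want to see what the Wizard's choices amount to, here is a minimal Python sketch of reading a comma-delimited file whose first line holds the variable names and whose data start on line 2. The file contents are invented for illustration, and Python's `csv` module is used only as an analogy; it is not part of SPSS.

```python
import csv
import io

# Hypothetical file contents: variable names on line 1, data from line 2
raw_text = "Surgery Volume,Age,Gender\n120,54,female\n95,61,male\n"

# delimiter="," plays the role of ticking Comma in Step 4
reader = csv.DictReader(io.StringIO(raw_text), delimiter=",")
cases = list(reader)                 # each remaining line becomes one case (row)
variable_names = reader.fieldnames   # taken from the first line of the file

print(variable_names)
for case in cases:
    print(case)
```

Note that every value is read as text here; converting a column to a numeric or date type is a separate step, much like setting the data format in Step 5.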
1.1.4. Reading Data from a Database
Data from database sources can be imported using the Database Wizard. Any database that uses ODBC (Open Database Connectivity) drivers can be read directly, e.g. MS Access. This will not be covered in this course. However, if you wish to use this function, please refer to the SPSS Tutorial found under Help in the menus.
1.2. Saving a Data Set
Once the data set has been read into SPSS, save it in SPSS format (it will have the extension .sav).
Click on the disc icon in the top menu bar, and the following window opens.
- Choose the location where you want the data file to be saved
- In File name: enter the name that you want the file to be saved as
- In Save as type: select SPSS Statistics (*.sav)
- Click Save
1.3. Using the Data Editor
The Data Editor displays the contents of the data file you have selected. There are two views:
● Data View – columns represent variables and rows represent observations (cases)
● Variable View – each row is a variable and each column is an attribute associated with it
1.3.1. Manually Entering Numeric Data
The Data Editor window can be used to enter data directly into SPSS or to edit existing data files.
- Click on Variable View (the tab is at the bottom of the Data Editor window)
- Define the variables you need, e.g. SurgeryVolume, age, gender, etc. Each variable is entered on a separate row, and new variables are automatically given a Numeric data type
- Click the Data View tab to continue entering data
NOTE: If variable names are not entered then a unique name will be automatically assigned - these names will not be descriptive and are not recommended for large data sets.
The names entered in the Variable View are now the column headings in Data View. Data can be entered as in a normal spreadsheet. The numeric columns display decimal places, even when numbers are entered as integers. To hide the decimal places, click on Variable View and type 0 in the Decimals column.
1.3.2. Manually Entering String Data
Non-numeric data, such as strings of text, can be entered manually into the Data Editor.
- Click on the Variable View tab
- Type the variable name in the first column, e.g. gender
- Click the Type cell next to the entry
- Click the button on the right side of the Type cell to open the Variable Type dialog box
- Select String to specify the variable type
- Click OK to save the selection and return to the Data Editor
1.4. Defining Data
1.4.1. Adding Labels
SPSS allows you to define descriptive variable labels for variable names and value labels for data values. These descriptive labels are used in statistical analysis reports and graphs. Variable labels describe the variable’s data and are often longer versions of the actual variable names.
Click on Variable View. In the Label column, type in the description of your variable, e.g. in the admistype variable row, the description is “Admission type”.
1.4.2. Changing Variable Type & Format
The Type column displays the current data type for each variable. The most common data types are numeric and string, but other formats are supported.
To change the format, click the Type cell for the variable you wish to change, click the button on the right side of the cell to open the Variable Type dialog box, and select the format.
1.4.3. Adding Value Labels for Numeric Variables
Value labels provide a method for mapping your variable values to a string label. They are set in the Variable View window: click the Values cell, then click the button on the right side of the cell to open the Value Labels dialog box.
The Value is the actual numeric value. The Label is the string label that is applied to the specified numeric value. After you type the value and label you require into the boxes, click Add to add your information to the list in the bottom window, then click OK to save your changes and return to the Data Editor.
Use the Change or Remove buttons as required by selecting the value label in the bottom window and then clicking the required button.
1.4.4. Adding Value Labels to String Variables
- Click Variable View at the bottom of the Data Editor window
- Click the Values cell in the row of the variable you want to edit
- Open the Value Labels dialog box by clicking the button on the right side of the cell
- Type your value into the Value field (this is the word that is specified in your data set), e.g. female
- Type your label into the Label field, e.g. Female
- Click Add to add this label to the data file, then click OK to return to the Data Editor
NB. As string values are case sensitive, be consistent – a lowercase m is not the same as an uppercase M.
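A value label is essentially a lookup from the stored value to a display label. The Python sketch below illustrates this, and why case matters for string values. The codes `f` and `m`, and the helper `label_for`, are hypothetical and chosen only for the example.

```python
# Hypothetical value labels for a string variable called gender
gender_labels = {"f": "Female", "m": "Male"}

def label_for(value, labels):
    # Exact, case-sensitive lookup; a value without a label is shown unchanged
    return labels.get(value, value)

data = ["f", "m", "M"]
displayed = [label_for(v, gender_labels) for v in data]
print(displayed)
```

The uppercase `M` does not match the lowercase `m` in the label list, so it is displayed unlabelled. This is the kind of inconsistency the note above warns about.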
1.4.5. Changing the Measure Description for String Variables
Once the data Type, Labels and Values have been entered, go to the Measure column and select the type of variable you have: Scale = numeric data, Nominal = unordered categorical data, and Ordinal = categorical data whose order matters (e.g. the beds variable is ordinal, consisting of 3 categories: 100-500, 500-1000 or >1000, which have a logical order).
1.5. Displaying Variable and Value Labels
By default the actual data values are displayed. To display the labels instead, choose from the menus: View > Value Labels. Alternatively, use the Value Labels button on the toolbar.
Descriptive value labels are now displayed, making the data easier to understand.
1.6. Changing the Number of Decimal Places
When data sets are imported, SPSS applies one decimal place to the data formatting. This may be unwanted (e.g. your data may be a binary variable), or you may need more than one decimal place for your data.
To change the number of decimal places:
- Click on the Decimals column heading to highlight the column. (This column contains cells that can be formatted depending upon the number of decimal places you want)
- To remove the decimal places, click on the cell and use the scroll arrows to reduce the value to 0
- To change the number of decimal places, click on the cell and use the scroll arrows to change the number
SECTION 2: Descriptive Statistics
2.0. Obtaining Descriptive Statistics
The Descriptive Statistics command can be used to obtain measures of location (mean) and variation (range, standard deviation, variance, minimum and maximum).
Click on Analyze > Descriptive Statistics > Descriptives.
Select the variable(s) to analyse by clicking on them in the left-hand pane, then click the arrow button.
The variable then moves into the right-hand pane.
Click the Options button to select which descriptive statistics you require.
The Descriptives: Options window opens.
- Tick the statistics you require (or untick those that you do not want)
- Click Continue; this returns you to the Descriptives window
- Click OK
The following table is returned, containing the requested descriptive statistics. Do not copy and paste this table directly into your own work. It looks more professional if you produce tables yourself, and they can then be customised to fit in with your report and writing style.
The median cannot be obtained using the Descriptives option. To obtain this value, along with other statistics, use the Frequencies command:
Open the Frequencies window by clicking Analyze > Descriptive Statistics > Frequencies. Select your variable(s) using the same process as in the Descriptives window.
Click on the Statistics button to choose which statistics you require.
Tick the relevant boxes, then click Continue, and then OK in the Frequencies window.
An output table of the requested statistics is produced, as well as a frequency table (not shown here as it is too large).
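To make the statistics in these tables concrete, the following Python sketch computes the same kinds of quantities for a small, invented set of ages. It assumes the usual sample (n - 1) formulas for variance and standard deviation, and uses Python's standard `statistics` module purely for illustration.

```python
import statistics

# Hypothetical patient ages, invented for this illustration
ages = [34, 45, 52, 61, 47, 58, 39]

summary = {
    "mean": statistics.mean(ages),
    "median": statistics.median(ages),
    "minimum": min(ages),
    "maximum": max(ages),
    "range": max(ages) - min(ages),
    "variance": statistics.variance(ages),     # sample variance (divides by n - 1)
    "std_deviation": statistics.stdev(ages),   # square root of the sample variance
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Working through a small example like this by hand is a useful check that you understand what each entry in the SPSS output table means.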
The Frequencies window also contains an option to produce plots, using the Charts button.
Select which plot you want by clicking on the appropriate button, click Continue, then click OK.
In this example, a histogram with a superimposed normal curve was produced.
SECTION 3: Graphs
3.0. Producing Graphs
There are many different graph types that can be made in SPSS. The following guide shows just a few to get you started.
3.1. A Bar Chart
Bar charts are useful to illustrate the distribution of a quantitative variable by a categorical variable, e.g. we may want to compare the ages of male and female patients.
In the Data View window, go to Graphs > Chart Builder and the Chart Builder screen will open. In the Gallery, click on the graph picture icon entitled Simple Bar (the name will appear if you leave your cursor over the icon), and drag it into the Chart Preview window above it.
Click on the quantitative variable (e.g. Age) in the window on the left and drag it into the “Y-axis?” box in the plot window. Similarly, drag your categorical variable (e.g. Gender) into the “X-axis?” box.
The default graph will change, giving you an idea of what the plot will look like.
Add a title by highlighting Titles/Footnotes, ticking the box next to Title 1 and then clicking on Element Properties. Type the title in the Content box, then click Apply.
The “Element Properties” box enables you to change many things on the graph, including what the bars represent: SPSS assumes that you want means, but this can be changed to show medians, modes, etc.
To add an error bar, highlight Bar1 in the Edit Properties of: window, then tick the Display error bars box and choose which error bar you want to show. Select the relevant Level (%) for the Confidence intervals, or the appropriate Multiplier for the Standard error or Standard deviation, i.e. a multiplier of 1 will give 1 standard error or 1 standard deviation, etc.
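The three error bar choices differ only in how the half-width around the mean is calculated. The Python sketch below shows one way to compute them for a single bar. The function `error_bar` and the data are hypothetical, and the confidence interval uses a normal approximation as a simplifying assumption, so its limits may differ slightly from those SPSS draws, particularly for small samples.

```python
import math
import statistics

def error_bar(values, kind="se", multiplier=1.0, level=95.0):
    """Return (lower, upper) limits around the mean of values.

    kind: "sd" = standard deviation, "se" = standard error,
          "ci" = confidence interval (normal approximation, an assumption).
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)            # sample standard deviation
    se = sd / math.sqrt(len(values))         # standard error of the mean
    if kind == "sd":
        half_width = multiplier * sd
    elif kind == "se":
        half_width = multiplier * se
    elif kind == "ci":
        # e.g. level=95 gives the 97.5th percentile of the normal distribution
        z = statistics.NormalDist().inv_cdf(0.5 + level / 200.0)
        half_width = z * se
    else:
        raise ValueError("kind must be 'sd', 'se' or 'ci'")
    return mean - half_width, mean + half_width

ages = [34, 45, 52, 61, 47, 58, 39]  # hypothetical ages for one bar
print(error_bar(ages, kind="se", multiplier=1))
print(error_bar(ages, kind="sd", multiplier=1))
print(error_bar(ages, kind="ci", level=95))
```

A multiplier of 2 simply doubles the half-width, which is why bars showing 2 standard errors look similar to, but are not the same as, 95% confidence intervals.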
To add or amend the axis labels, click on the appropriate axis in the Edit Properties of: box. In the Axis Label: box type in your label, click on Apply and then OK to produce the graph.
The graph produced can be edited by double-clicking on it. This opens it in a new window called the Chart Editor.
Use the menus at the top of this window to change the widths and colours of the bars, convert the graph from bars to lines or vice versa, alter the label font size or the labels themselves, etc. When you are happy with the graph, click on File > Close and the graph will be produced.
To produce another plot or to clear your data, click on Reset in the Chart Builder window. Most graphs can be created following a similar method as above.
3.2. Clustered Bar Chart
Sometimes you may want to show the effects of two independent variables. In this case a clustered bar chart may be useful.
In the Data View window, click Graphs > Chart Builder, select Bar in the Choose from: list and select the Clustered Bar icon in the Gallery window, then drag and drop this into the Chart preview uses example data window. Enter your data, titles, labels, etc. as previously done for the simple bar chart, except that now one of your independent variables goes into the X axis? box and the other into the Cluster on X: set colour box. Don’t forget to set all the various options in the Element Properties box (e.g. confidence intervals).
Click on Apply and then OK in the Chart Builder box and it should produce a graph similar to the following:
The following graph shows the same data but with the two independent variables switched, so that Admission Type is in X-axis? and Gender is in Cluster on X: set colour:
The way you display your variables depends upon the specific research question that you want to answer. The first plot shows the age pattern quite well, but the second plot makes it easier to directly compare males and females within each admission type category.
3.3. Line Graphs
These are straightforward to produce. Click on Graphs > Line and then on the icon picture for Simple Line for a one-line graph, or Multiple Line for a plot with more than one line on it.
For a single line plot, click, drag and drop the Simple Line plot icon.
For a multiple line plot, click, drag and drop the Multiple Line plot icon.
Below is an example of a single line plot showing the relationship between mean patient age and primary disease indication, with 95% confidence limits:
To produce a multiple line plot, in this example showing age and gender differences in primary disease indication, click on Graphs > Chart Builder > Line, then click and drag the Multiple Line icon to the chart preview area. Put the dependent variable (e.g. Age) into the Y axis? box, put your independent variable (e.g. Primary Indication) into the X axis? box, and then put the variable for which you want separate lines on the graph into the Set colour: box (e.g. Admission Type).
3.4. Bar Charts of Frequency Data
When you have categorical data it makes no sense to average the data and display them as we have done in the last few graphs. We can only plot the frequency with which each of the categorical responses has been observed. For example, we have a categorical variable Admission Type, which is sub-divided into Acute, Elective and Delay. We would want to show how many patients underwent acute, elective or delayed surgery, so we need to plot the frequencies of these observations. Initially we produce a table of frequencies (click on Analyze > Descriptive Statistics > Frequencies) and a simple bar chart (as before):
To present these frequencies graphically, click on Graphs > Chart Builder, choose Simple Bar as before, but put your variable (here it would be Admission Type) into the X axis? box in the Chart preview box. SPSS will automatically change the Y axis? box to “count”, identifying this as a categorical variable. Click on OK and a frequency plot will be produced:
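The heights of the bars in a frequency plot are simply the number of cases in each category. As a rough illustration outside SPSS, the Python sketch below counts a small, invented list of admission types; the data are hypothetical.

```python
from collections import Counter

# Hypothetical admission types for six patients
admission_type = ["Acute", "Elective", "Acute", "Delay", "Elective", "Acute"]

counts = Counter(admission_type)   # frequency of each category

# Each category and its count corresponds to one bar and its height
for category, count in counts.most_common():
    print(category, count)
```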
The frequency plot could be broken down according to another variable (e.g. Gender). First we obtain the relevant frequencies by clicking on Analyze > Descriptive Statistics > Crosstabs:
To obtain the cross tabulation frequencies: place the variable of interest (e.g. Admission Type) in the Row(s): box by highlighting and dragging the variable across. Then put the variable you wish to separate your data by in the Column(s): box (e.g. Gender). Then click OK and a cross-tabulation table will be produced:
To graph these data, click on Graphs > Chart Builder and drag the Clustered Bar icon into the Chart Preview box; then put your variable of interest into X axis? (e.g. Admission Type) and the variable you wish to separate your data by into Cluster on X: set colour (e.g. Gender):
The same plots can be produced presenting the relative frequencies instead of the raw frequencies (i.e. with each column showing the percentage of males and females who fell into each category of admission type). To do this, click on count in the Element Properties box and change it to Percentage. Click on Set Parameters and change to Grand Total.
The plots show the same variables but displayed differently. Which way you present your data depends upon the question you want to answer.
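The difference between raw and relative frequencies can be seen in a short Python sketch. It builds a cross-tabulation from a handful of invented cases and then expresses each cell as a percentage of the grand total; the data and variable names are hypothetical.

```python
from collections import Counter

# Hypothetical cases: (admission type, gender) for five patients
cases = [
    ("Acute", "Male"),
    ("Acute", "Female"),
    ("Elective", "Female"),
    ("Delay", "Male"),
    ("Acute", "Male"),
]

crosstab = Counter(cases)                  # raw frequency of each cell
grand_total = sum(crosstab.values())       # total number of cases

# Each cell as a percentage of all cases (the "Grand Total" setting)
percentages = {cell: 100.0 * n / grand_total for cell, n in crosstab.items()}

for (admission, gender), pct in sorted(percentages.items()):
    print(admission, gender, crosstab[(admission, gender)], f"{pct:.1f}%")
```

Because every cell is divided by the same grand total, the percentages across the whole table add up to 100.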
3.5. Importing Graphs into Word
You will want to embed your SPSS graphs into a Word document as part of your lab reports, etc. To do this, click once on the graph in SPSS to highlight it; in the menu at the top of the screen choose Edit and then click on Copy. Switch to your Word document and place the graph in it by clicking on Edit, then Paste Special and then Bitmap. If you click and drag on the little black dots around the box that contains the graph, you can resize it. Note: do all of the editing before you paste the graph into Word, as it will be a picture image rather than a graph.