DAT - Data Manipulation and Analysis (Lesson)
What Can You Do with Data?
CONCEPT |
DEFINITION |
---|---|
Algorithm |
A set of one or more procedures to solve a problem. |
Average or Mean |
The sum of all of the numeric data in a row or column followed by the division of the sum by the number of cells of data added. |
Calculate |
Use of mathematical operations with cell values in a spreadsheet to compute solutions. |
Function |
An organized set of steps, as in a math calculation, to provide a solution, given specific data. Usually solve for the missing data item. |
Interpret |
Analyzing the data given to draw conclusions. |
Median |
The middle number in the row or column of data. If there is no middle number, the two numbers creating the middle are added, then the sum is divided by 2. |
Mode |
The number or item that occurs the most often or the numbers or items that are tied for the most often occurrence. |
Range |
The spread of numeric data between the smallest and the largest number; high - low. |
Sum |
A sum of a column or row of numeric data in a spreadsheet or a sum of a particular section of data or cells in a row or column. |
Let's use the classroom teacher problem with students in the computer science class to create a spreadsheet of the data for the teacher.
Here is the problem again with the student and teacher answers.
A classroom teacher gives each of her students in computer science a test of data. The teacher grades each individual question and provides a grade to each student based on the questions answered correctly. What data could be derived from this test?
Let's go back to our previous example
What data will the student receive when the test is graded?
- What data will the student receive when the test is graded?
- Numeric grade on the test
- Alphabetic grade on the test
- The number of problems that were correct
- The number of problems that were incorrect
- The correct answers
- What data will the teacher be able to create to analyze the learning in the classroom in addition to what the student received
- Number of answers correct
- Number of answers incorrect
- Class average or mean numeric grade
- Class median numeric grade
- Class range of scores
- Class mode of scores
Looking at the spreadsheet above, the labels are important. At the top horizontally across the page is the column heading. We will reference the columns with the labels and the rows by the data row number if needed. For example, cells are labeled by (row, column). So Smasky is located in row 5 column B or (5, B) or 5B for short.
Then names and student ID's were entered for rows 2 through 6. Note that the names are in alphabetical order, sorted in the spreadsheet by the custom filter function that puts the data in last name, first name order. I did not originally put them in in alphabetical order. A spreadsheet is set up to allow for the insertion or deletion of data without having to redo the existing data.
Data is entered under each column for each question for the multiple choice answer that the students entered. So how did I determine each of the following?
- How are the correct answers determined?
- In the column for question 1 (Q1) through question 10 Q(10), All of the multiple choice answers that the students provided were entered.
- Entered all correct answers in row 10.
- In row 8, all of the countif functions were used to examine the cells to see if the cell had the correct answer. This can then be done for each of the cells going across the row and the spreadsheet will populate correctly when the answers are adjusted for each column.
- How are the incorrect answers calculated?
- The incorrect answers are calculated in row 9, by using a calculation.
- 10 - the number correct in row 8 for all of the question columns.
- Once one function calculation is created, the formula may be dragged across the rest of the question columns to provide the number of incorrect answers.
- How is the number right calculated in the column for the #right for each student?
- Using the same formulas that were needed to calculate the number of correct answers, use the countif function for each column going horizontally across the page.
- countif(D2, "answer") + countif(D3, "answer") + countif(D4, "answer") + . . . for all of the 10 questions. This will count only if the answer is correct.
- Drag this downward for all of the students.
- How is the grade calculated?
- Write a function that takes the number correct and divides by 10 and then multiplies by 100.
- Grade = #Right/10 * 100
- How is the mean calculated?
- The average or mean is the sum of all the data divided by the number of items.
- For this use the average function having it take the average of the cells in the Grade % column.
- How is the median calculated?
- The median is the middle of the numbers
- For this use the median function of the cells in the Grade % column.
- How is the mode calculated?
- The mode is the number that occurs the most.
- Use the mode function for the cells in the Grade % column.
- How is the range calculated?
- The range is the difference between the maximum and minimum numbers.
- For this use the max and min functions subtracting the difference.
- Create the max and min functions to find the maximum and minimum of the Grade % column.
So, how can this data be represented in graphs?
So how can this data be represented in graphs?
Notice that charts, graphics allow you to look at the information differently. Some of the statistics that are on the charts can be seen visually in an easy manner for interpretation.
Data may be gathered in a variety of ways, including observation, surveys, using data gathered from others.
IMAGE CREATED BY GAVS AND USED ACCORDING TO TERMS OF USE.