Data categories
1. Variations in data values
Data values can be categorized as varying in either a discrete or a continuous way.

2. Analog signals
An analog signal varies continuously and is measured using a real number approximation (i.e., a number with a decimal point).

3. Digital signals
Digital signals vary discretely. That is, digital information has discrete levels and can be measured using an integer count (i.e., a number without a decimal point).
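
As a rough illustration of the difference, the sketch below converts a continuous reading into one of a fixed number of discrete levels. The 0 to 5 volt range, the 256 levels, and the function name quantize are assumptions made only for this example.

```python
# Hypothetical setup: an analog reading between 0.0 and 5.0 volts
# is converted into one of 256 discrete levels (0 through 255).
MAX_VOLTS = 5.0
LEVELS = 256


def quantize(volts: float) -> int:
    """Map a continuous reading to the nearest discrete level."""
    clamped = min(max(volts, 0.0), MAX_VOLTS)  # keep the reading in range
    return round(clamped / MAX_VOLTS * (LEVELS - 1))


analog_reading = 3.3                      # a real number approximation
digital_level = quantize(analog_reading)  # an integer level
print(analog_reading, digital_level)
```

The reading is a number with a decimal point, while the level is a whole number, which mirrors the analog and digital descriptions above.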

5. Counts and measures
Example: 6 pack of Coke
What is the difference between a count and a measure? Give a specific example that illustrates the difference.

6. Integers
Integers are positive and negative whole numbers that can be used for counts.

Math integers:
..., -2, -1, 0, 1, 2, 3, ...

Computer integers:
minInt, ..., -2, -1, 0, 1, 2, 3, ..., maxInt

For example, the minimum value (minInt) for a signed byte is -128, while the maximum value (maxInt) is 127.
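
The limits of a signed byte can be checked with a short sketch. It assumes an 8-bit two's complement byte and uses the struct module from the Python standard library. The helper name fits_in_signed_byte is made up for this example.

```python
import struct

BITS = 8  # assumption: a signed byte has 8 bits (two's complement)

min_int = -(2 ** (BITS - 1))    # smallest value a signed byte can hold
max_int = 2 ** (BITS - 1) - 1   # largest value a signed byte can hold


def fits_in_signed_byte(value: int) -> bool:
    """Return True if value can be stored in one signed byte."""
    try:
        struct.pack("b", value)  # "b" is the signed char (1 byte) format
        return True
    except struct.error:
        return False


print(min_int, max_int)
print(fits_in_signed_byte(max_int), fits_in_signed_byte(max_int + 1))
```

Python's own integers are not limited in this way, which is why the sketch uses struct to imitate a fixed-size computer integer.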

7. Real number approximations
Real number approximations are approximations of measures.

Approximations usually include a certain number of significant digits.
0.3333... 0.6666... 1.0000... 3.1415...
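
A small sketch of such approximations, assuming four digits after the decimal point are kept. The function name approximate is chosen only for this example.

```python
import math


def approximate(x: float, places: int = 4) -> str:
    """Format x with a fixed number of digits after the decimal point."""
    return f"{x:.{places}f}"


for value in (1 / 3, 2 / 3, 1.0, math.pi):
    print(approximate(value))

# Stored values are approximations too, so compare them with a tolerance.
total = 0.1 + 0.2
print(total == 0.3)
print(math.isclose(total, 0.3))
```

Note that formatting rounds the last digit instead of cutting it off, so 2/3 prints as 0.6667 and pi prints as 3.1416.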

8. Baseball counts and measures
When a baseball player gets 2 hits in 6 at bats, what is the batting average?

The number of hits, 2, and the number of at bats, 6, are counts.

The batting average is 2/6 ≈ 0.333 (about 33.3%), which is a measure.
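
The same calculation as a short sketch: the two counts are stored as integers, and dividing them produces a real number approximation. The variable names are chosen only for this example.

```python
hits = 2       # a count: an integer
at_bats = 6    # a count: an integer

batting_average = hits / at_bats   # a measure: a real number approximation

print(type(hits).__name__, type(batting_average).__name__)
print(f"{batting_average:.3f}")    # three digits after the decimal point
```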

9. Football
Is the number of yards gained by a running back or receiver in a play of American football a count or a measure?

10. American football
Yes. A measure can be converted into a count.

For example, we measure distances. The playing area of an American football field is 100 yards from goal line to goal line. What is the longest possible run from scrimmage (that is, from goal line to goal line)? It is not 100 yards.

And it is not 99.9 yards.

Why? Even though the distance is measured by the officials on the field, football conventions dictate that, for record purposes, each ball placement is rounded off appropriately to the nearest yard. So the longest possible run is 99 yards. And for the players who have done such a run, it is a record that cannot be broken (why?).
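
One way to picture the conversion from a measure to a count is the sketch below. It is not an official scoring rule. It assumes that the ball placement is rounded to the nearest whole yard and that a play from scrimmage is never recorded as starting on the goal line itself, so the smallest recorded placement is the 1-yard line. The function name recorded_run_yards is made up for this example.

```python
FIELD_LENGTH_YARDS = 100  # goal line to goal line


def recorded_run_yards(start_from_own_goal: float) -> int:
    """Length of a run to the far goal line, as a whole number of yards.

    start_from_own_goal is the measured distance (in yards) from the
    offense's own goal line to the spot of the ball.
    """
    # Assumption: round to the nearest yard, but never record the
    # placement as 0, because the ball is not on the goal line.
    placement = max(1, round(start_from_own_goal))
    return FIELD_LENGTH_YARDS - placement


print(recorded_run_yards(0.1))    # ball a few inches from the goal line
print(recorded_run_yards(20.4))   # ball measured near the 20-yard line
```

Under these assumptions, a run that starts only inches from the goal line is still recorded as 99 yards.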

13. Data categories
Four possible ways to measure data are at the nominal, ordinal, interval, and ratio levels. The type of data determines how one processes this data and how one chooses ways to visualize this data.

14. Nominal level data
Nominal level data is data that can be classified and counted, but that otherwise has no meaningful order.

Is there an order to the colors in a package of M&M's? We can pick an order, but the order is no more meaningful than any other order.

Nominal level data is often presented by arranging the values in alphabetical order. Example:

Company
-------
Apple
Google
IBM
Microsoft
Oracle
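
A minimal sketch of working with nominal level data: the values can be counted and listed alphabetically, but nothing else about their order carries meaning. The list of candy colors is invented for this example. The company names are the ones listed above.

```python
from collections import Counter

# Hypothetical contents of one package (made-up data for illustration).
colors = ["red", "blue", "green", "red", "yellow", "blue", "red"]
color_counts = Counter(colors)   # classify and count
print(color_counts)

# Alphabetical order is a presentation choice, not a property of the data.
companies = ["Oracle", "IBM", "Apple", "Microsoft", "Google"]
alphabetical = sorted(companies)
print(alphabetical)
```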

Can this create problems?

What was one factor in Apple Computer choosing "Apple" as the name of the company in 1976?

15. Apple computer
Since ratings of computer companies (i.e., nominal level data) are usually published in alphabetical order, Steve Jobs and Steve Wozniak, founders of Apple Computer, wanted to appear near the top of the list.

Note that Google eventually reorganized under a parent company named Alphabet.

16. Phone books
Look in any (historical) phone book. Many companies would start their company name with "AAA" in order to be easy to find in the phone book.

17. Political candidates
What about listing candidates for election in alphabetical order?

Note: The same could happen in high school elections, etc.

Is the United States a democracy? Do the people elect the President based on popular vote?

18. Ordinal level data
Ordinal level data is data that can be classified, counted, and rank ordered, but the order is more qualitative than quantitative.

What is the temperature like in this room?

Temperature
-----------
very hot
hot
warm
nice
cool
cold
very cold

Notice that the categories are qualitative in that the idea of warm may vary from person to person.
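
A short sketch of ordinal level data, using the temperature categories above. The categories can be ranked and sorted, but the rank numbers say nothing about how far apart two categories are. The sample readings and the helper names rank and is_warmer are made up for this example.

```python
# Categories listed from coldest to hottest.
TEMPERATURE_SCALE = [
    "very cold", "cold", "cool", "nice", "warm", "hot", "very hot",
]


def rank(label: str) -> int:
    """Position of a category in the ordering (0 is the coldest)."""
    return TEMPERATURE_SCALE.index(label)


def is_warmer(a: str, b: str) -> bool:
    """True if category a ranks above category b."""
    return rank(a) > rank(b)


readings = ["warm", "cold", "nice", "hot"]   # hypothetical answers
ordered = sorted(readings, key=rank)
print(ordered)
print(is_warmer("hot", "warm"))
```

Subtracting two ranks is deliberately left out, because the distance between, say, warm and hot is not defined for this kind of data.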

I tend to have a wide range of temperatures that I consider comfortable (i.e., not uncomfortable). If one person calls the room warm and another calls it nice, who is right? Both are right. It is just that describing the same thing in different words can make it sound better or worse.

19. Temperature colors
How might we assign colors to temperatures?

20. Interval level data
Interval level data is data that can be classified, counted, rank ordered, and where differences between data items have meaningful significance.

Note: Some people consider temperature or calendar year to be interval level data.
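
The sketch below suggests why temperature is often treated as interval level data. A difference between two temperatures converts consistently from one scale to another, while a ratio changes when the scale changes. The two sample temperatures are made up for this example.

```python
def celsius_to_fahrenheit(c: float) -> float:
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32


morning_c, afternoon_c = 10.0, 20.0               # hypothetical readings
morning_f = celsius_to_fahrenheit(morning_c)      # 50.0
afternoon_f = celsius_to_fahrenheit(afternoon_c)  # 68.0

# Differences are meaningful: 10 Celsius degrees is always 18 Fahrenheit degrees.
diff_c = afternoon_c - morning_c
diff_f = afternoon_f - morning_f
print(diff_c, diff_f)

# Ratios are not: "twice as hot" in Celsius is not twice as hot in Fahrenheit.
ratio_c = afternoon_c / morning_c
ratio_f = afternoon_f / morning_f
print(ratio_c, ratio_f)
```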

21. Ratio level data
Ratio level data is data that can be classified, counted, rank ordered, and where differences and ratios between data items have meaningful significance.

22. Data sets

A data set in the form of a table is a collection of data organized into named columns where each row, called a record, contains related information.

Data sets in the form of tables are ideally represented using spreadsheets.
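
A table of this kind can also be sketched in code. The example below reads a small table with the csv module from the Python standard library. The column names and the three rows are invented for illustration.

```python
import csv
import io

# Hypothetical table: the first line names the columns,
# and each following line is one record.
TABLE_TEXT = """player,hits,at_bats
Player A,2,6
Player B,3,5
Player C,0,4
"""

records = list(csv.DictReader(io.StringIO(TABLE_TEXT)))
column_names = list(records[0].keys())

print(column_names)
for record in records:
    print(record["player"], record["hits"], record["at_bats"])

# Values are read as text, so counts must be converted before arithmetic.
total_hits = sum(int(record["hits"]) for record in records)
print(total_hits)
```

Each record is a dictionary keyed by column name, which matches the description of rows containing related information under named columns.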

