Hello, this is Dr. Glasbrenner from George

Mason University, and in this video we will learn about the general components of a data

visualization. Data visualizations follow enough of a pattern

that we can devise a taxonomy for them. Any given data visualization

can be broken down into four basic elements: visual cues, coordinate systems, scale, and

context. Visual cues form the building blocks of any

given visualization. In general, each one can be categorized as one

of nine separate types. In any given plot, only a subset of these

cues may be present. It is not required that all nine show up in

one plot. The nine cues and their data types are as

follows. The position visual cue represents

numerical data. This is where data is physically located on

the plot, and it shows us how data sits in relation to other things. The length visual cue represents

numerical data, and shows how big something is along one dimension. The angle visual cue represents numerical

data, and shows how wide something is and whether or not it is parallel to something

else. The direction visual cue represents numerical

data, and can indicate slope in a dataset or, in a time series dataset, whether the

trend is going up or down. The shape visual cue represents categorical

data, and shows what group different data points belong to. The area visual cue represents numerical data,

and shows how big something is along two dimensions. The volume visual cue represents numerical

data, and shows how big something is in three dimensions. The shade visual cue can represent either

numerical or categorical data, and can show the extent to which a certain feature is present

or how severe it is. Last is the color visual cue, which can also

represent either numerical or categorical data, and also can show to what extent or

how severely a certain feature is present. Keep in mind that, when you are using color,

you should be mindful of red/green color blindness. The second basic element of a data visualization

is the coordinate system, of which there are three types. There is the familiar Cartesian coordinate

system, where we describe points using x and y coordinates measured relative to two perpendicular

axes. There is the polar coordinate system, which

is the radial analog of the Cartesian coordinate system. In this coordinate system, data points are

identified by a radius rho and an angle theta. The third type of coordinate system is the

geographic coordinate system, which corresponds to locations on the curved surface of the

Earth that need to be represented on a flat two-dimensional plane. The third basic element of a data visualization

is scale, of which there are three types that will seem familiar from our earlier discussions. The first is the numeric scale. There are several kinds of numeric scales,

with the most common ones being the linear scale, which is the default scale you will

see the most often, the logarithmic scale, which counts along an axis in powers of a

base number such as 10, and the percentage scale, where you consider the relative fraction

of observations instead of their absolute total. The second kind of scale is the categorical

scale. In the categorical scale, the variables may

have no particular ordering, or they may be ordinal, where the position in a series has

meaning. Finally, there is the time scale, which is

a numeric quantity with special properties. Because of the calendar, it can be specified

using a series of units (year, month, day). It can also be considered cyclically, for

example, each year resets back to January, or a spring oscillates around a central position. The fourth basic element of a data visualization

is context. This refers to annotations and labels that

draw attention to specific parts of a visualization. Examples of this include titles and subtitles

on a plot, labels along the axes that depict scale and indicate the name of the variable,

reference points or lines drawn on the plot, and other markups such as arrows, text boxes,

and so on. It is possible to overdo the markups, so always

try to keep them minimal and focus on the most important things you want to highlight. Let’s now apply these ideas to this example

plot showing the trend between the engine size and gas mileage on the highway for several

different types of automobiles. For visual cues, we see position and color

with the data points, showing us the gas mileage versus engine size trend across several classes

of cars. The coordinate system shown here is Cartesian. The scale is linear along both the horizontal

and vertical axes. For context, we have the plot title, the axis

labels, and the legend on the right. As you’re getting familiar with data visualizations,

it’s worth seeing if you can successfully break down and categorize the elements of

other visualizations that you see.
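
One way to practice this kind of breakdown is to write the taxonomy down as a small data structure and record each plot against it. The following is a minimal Python sketch under stated assumptions: the names `VISUAL_CUES`, `PlotBreakdown`, and `problems` are hypothetical and chosen only for illustration, and treating the color cue in the example plot as categorical (one color per class of car) is an assumption based on the description above.

```python
from dataclasses import dataclass, field

# The nine visual cues and the data types each one can represent.
VISUAL_CUES = {
    "position": {"numerical"},
    "length": {"numerical"},
    "angle": {"numerical"},
    "direction": {"numerical"},
    "shape": {"categorical"},
    "area": {"numerical"},
    "volume": {"numerical"},
    "shade": {"numerical", "categorical"},
    "color": {"numerical", "categorical"},
}

# The three coordinate systems and the three kinds of scale.
COORDINATE_SYSTEMS = {"cartesian", "polar", "geographic"}
SCALES = {"numeric", "categorical", "time"}


@dataclass
class PlotBreakdown:
    """Hypothetical record of the four basic elements of one visualization."""

    visual_cues: dict        # cue name -> data type it encodes in this plot
    coordinate_system: str   # one of COORDINATE_SYSTEMS
    scales: dict             # axis name -> kind of scale
    context: list = field(default_factory=list)  # titles, labels, legends, ...

    def problems(self):
        """Return descriptions of entries that fall outside the taxonomy."""
        issues = []
        for cue, data_type in self.visual_cues.items():
            if cue not in VISUAL_CUES:
                issues.append(f"unknown visual cue: {cue}")
            elif data_type not in VISUAL_CUES[cue]:
                issues.append(f"{cue} does not represent {data_type} data")
        if self.coordinate_system not in COORDINATE_SYSTEMS:
            issues.append(f"unknown coordinate system: {self.coordinate_system}")
        for axis, scale in self.scales.items():
            if scale not in SCALES:
                issues.append(f"unknown scale on {axis} axis: {scale}")
        return issues


# Breakdown of the engine size versus highway gas mileage plot.
# A linear scale is one kind of numeric scale, so both axes are "numeric".
example = PlotBreakdown(
    visual_cues={"position": "numerical", "color": "categorical"},
    coordinate_system="cartesian",
    scales={"horizontal": "numeric", "vertical": "numeric"},
    context=["plot title", "axis labels", "legend"],
)

print(example.problems())  # prints [] because every entry fits the taxonomy
```

Filling in a `PlotBreakdown` for other plots you come across is a quick check that you have accounted for all four elements; anything returned by `problems()` points to an entry that does not match the taxonomy as described here.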