PV251 Visualization
Autumn 2024
Study material
Lecture 1: Introduction to visualization
This course aims to present the general rules of visualization and techniques for their design
and implementation. The first lecture is focused on basic definitions and understanding of
the complexity of the visualization field. Then it presents a brief history of visualization, its
relation to other research fields, and the visualization pipeline. The last part contains basic
information about human perception and its relation to visualization.
Definition
There are several possible definitions of visualization. The general one can be:
Displaying given information using a graphical representation.
Other possible definitions:
• „Transformation of symbolic into geometric“ [McCormick et al., 1987]
• „… finding the artificial memory that best supports our natural means of perception.“
[Bertin, 1967]
• „The use of computer-generated, interactive, visual representations of data to
amplify cognition.“ [Card, Mackinlay, Shneiderman, 1999]
• „The purpose of computing is insight, not numbers“ [R. Hamming, 1962]
• „…to form a mental vision, image, or picture of something not visible or present to
the sight, or of an abstraction; to make visible to the mind or imagination“ [Oxford
Engl. Dict., 1989]
• Tool to enable a User insight into Data
The goal is to convey the given information in the most informative and intuitive way.
Visualization surrounds us everywhere, on a daily basis, so we mostly perceive it as
something natural. Perceiving visualization depends largely on individual experience and
knowledge. Nevertheless, visualization design should follow some basic rules, which form the
content of this course. We need to understand what is meant by visualization and how
to use it efficiently.
Why create visualizations
There are many reasons. We will focus only on those closely related to our field.
Visualization enables us to:
• Enhance the decision process
• View data in a broader context
• Interpret the data
• Present ideas and results, attract attention
• Inspire others
• Entertain, educate, …
There are three main functions of visualization:
• Information storage – recording given data (e.g., photos, images, paintings,
blueprints, …)
• Analysis of information – data processing and evaluation, interaction quality
evaluation
• Conveying information – sharing data between communicating parties, their mutual
cooperation, highlighting important aspects of the data
Visualization is very important mainly because it uses sight, one of our main senses, to
understand the conveyed information. Visualization is everywhere – on streets, in public
transportation, on television, in newspapers. We can play the role of a passive observer, or we
can actively search for and interact with visualization – maps, weather forecasts, stock market
data, etc. Visualization helps to improve decision processes and to understand the content and
context of the data more precisely, correctly, and quickly.
Examples of the importance of visualization
We can find a huge number of examples. One of the most typical is the following:
In a classical table representation of individual data items, it is hard to see the trends in the
data. If the table is much larger, it becomes practically impossible. But when we plot the data as
a graph, the trends can be interpreted almost instantly. Moreover, growth of the data size
does not substantially influence the interpretation of the data.
Another example is the management structure of a large corporation. This information can
be conveyed using a textual description, but understanding it would be complicated and
time-consuming, with many possible mistakes in interpretation. When the same information is
displayed as a connected graph, orientation in the structure of the company
management is easy and straightforward.
Big data nowadays
One of the main reasons for studying visualization is that the amount of data grows every year.
Such an amount of data has to be processed and analysed, otherwise there is no reason
for generating and keeping it. For illustration, in 2002 there were 5 exabytes (1 exabyte =
10^18 bytes) of new data generated; in 2006 it was already 161 exabytes.
According to a research study from the University of Southern California, published in 2011 in
Science, 2002 was the first year in which the amount of digital data exceeded the amount of
analog data. In 2007, 94% of all data on the planet was in digital form.
Goals of visualization research
The main goals are:
• To understand how a person perceives a visualization and how this is related to
his or her mindset.
• To design and create principles and techniques corresponding to the understanding.
This helps us to create “efficient” visualizations targeting the processes in the human
brain and so increasing the speed of perceiving and understanding of the conveyed
information.
Wrong data interpretation
A poorly chosen data representation can lead to a wrong perception of the information. Here
is an example:
All four graphs show the same information; only the scales of the x and y axes change.
Graph (a) uses the same uniform scale on both axes, but the scale is poorly chosen,
as it does not correspond to the range of the displayed data. Therefore, the data
items overlap heavily, and the user cannot interpret the content of the dataset
correctly. If we adapt the scale to the data range on only one axis (graphs (b)
and (c)), the graph is misleading and the interpretation will be completely wrong.
Finally, graph (d) shows the correct representation of this dataset, with a reasonable scale
with respect to the data range on both axes. This example shows that a simple change of display
parameters for the same data can lead to completely different interpretations.
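The effect of the axis scale can be illustrated with a minimal sketch. The helper names
axis_limits and fill_ratio and the sample values are assumptions made for this illustration only;
they are not taken from the figure. The sketch compares how much of an axis the data occupy
when a fixed scale is forced on them and when the scale is derived from the data range.

```python
def axis_limits(values, margin=0.05):
    """Return (low, high) axis limits covering the data range plus a small margin."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # All values are identical: open a window of width 1 around them.
        return lo - 0.5, hi + 0.5
    pad = span * margin
    return lo - pad, hi + pad


def fill_ratio(values, limits):
    """Fraction of the axis length that the data range actually occupies (0..1)."""
    lo, hi = limits
    return (max(values) - min(values)) / (hi - lo)


# Hypothetical data: the x values lie in a very narrow range.
xs = [10.1, 10.4, 10.2, 10.8, 10.6]

fixed = (0.0, 1000.0)          # a fixed scale that ignores the data range
fitted = axis_limits(xs)       # a scale derived from the data range

print("fixed scale: ", fill_ratio(xs, fixed))   # data squeezed into a tiny part of the axis
print("fitted scale:", fill_ratio(xs, fitted))  # data use most of the axis
```

With the fixed scale, all items collapse into a sliver of the axis and overlap; with the fitted
scale they spread over most of it. Applying the fitted scale on one axis only reproduces the
misleading situation of graphs (b) and (c).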
History of visualization
Visualization is a very old discipline. However, it was established as a separate research
discipline only a little over 30 years ago, and the first visualization conferences appeared in 1990.
The first evidence of visualization (based on intuition) can be dated to the period 15,000 –
13,000 B.C., when cave paintings were created in the Lascaux cave in France.
The advantage of image representation is that it does not need any formalization, unlike
the written representation, where we must have some preliminary set of rules. Visualization
stems from natural human perception. The participants in visual communication do
not have to agree on the rules at the beginning; it is completely intuitive. Of course, this does
not hold for the abstract paintings of the modern art era. ☺
Images found their way into the first writing systems as well. The oldest written document is
considered to be the Kish limestone tablet from Mesopotamia (3500 B.C.). Among
the most famous image-based writing systems are hieroglyphs (3000 B.C.).
The main reasons for creating visualizations were mostly practical – travel routes, religion,
communication. One piece of evidence is the Peutinger map of the Roman
Empire:
In 1137 in China, the first geographic map using a Cartesian coordinate system was created.
Its lines represent longitude and latitude.
One of the most famous examples of the successful use of visualization is the cholera
epidemic in London in 1854. John Snow created the following map, where each rectangle
stands for one victim in a given house on the street. This map helped to reveal that the
highest number of victims was located close to the public water pump on Broad Street. Closing
the pump helped to end the epidemic, which caused the death of more than 500 people.
Details can be found in John Snow's book On the Mode of Communication of Cholera
(available online at
http://books.google.cz/books?id=-N0_AAAAcAAJ&printsec=frontcover&hl=cs&source=gbs_ge_summary_r&cad=0#v=onepage&q&f=false).
The appendix of this book also contains the names and addresses of all victims,
including a description of the progress of their disease. This confirms the importance of
using the map representation for interpreting such data.
This story was so catchy that in 2011 a movie was made about it
(http://www.imdb.com/title/tt2061801/, http://www.snowthemovie.com/crew.html).
There is also another book related to this story:
http://en.wikipedia.org/wiki/The_Ghost_Map.
(Information from Tomáš Marek)
One of the most typical application areas of visualization has been astronomy. Observers
visualized the phases of the Moon or the movements of planets:
Visualization was successfully used to convey the progress of Napoleon's troops during the
march on Moscow. The map shows the advance of the army towards Moscow and its losses on
the way. The color represents the direction of the march, and the bottom part of the
visualization contains important information about the temperature at given stages of the
march – in fact, low temperatures were the main cause of death of French soldiers, as they
were not prepared for such frost.
Another interesting example is the graph produced by Florence Nightingale (1820 – 1910),
an English social reformer and statistician, and the founder of modern nursing. Her graph shows
the mortality in the army within one year (April 1854 – May 1855), along with the causes of
death, marked by colors (blue = sickness, red = injury, black = other). Nightingale based her
work on the graphs designed by Playfair. The blue parts represent deaths caused by diseases,
which could have been prevented by improving healthcare. This graph was presented to
Queen Victoria, and Nightingale became a pioneer in using visualization to convince others of
the necessity of change.
Visualization today
Nowadays, visualization serves mainly as a practical tool for conveying desired information.
This requires different levels of abstraction in the data representation, from both
a qualitative and a quantitative point of view. A typical example is the Tokyo metro
network map:
Even though this map is highly abstracted, it serves its purpose well, and any additional
information (e.g., highlighting of streets) would be misleading.
On the other hand, if we are planning a walking route from site A to site B, a classical map
representation is more suitable. Such maps, showing individual streets and their names,
crossings, rivers, parks, etc., help us to understand the terrain and plan the route
correctly. Here we should be aware that maps are a special case of visual representation
with a certain degree of inaccuracy that depends on their scale. This
is caused by the spherical shape of the planet and its projection onto a plane. The smaller
the area a map covers, the smaller its distortion.
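The relation between the covered area and the distortion can be sketched numerically. The text
does not name a particular projection, so the following example assumes the Mercator
projection of a spherical Earth, whose local scale factor is 1 / cos(latitude). The function
names and the latitude bands are hypothetical and serve only as an illustration.

```python
import math


def mercator_scale(lat_deg):
    """Local scale factor of the spherical Mercator projection: 1 / cos(latitude)."""
    return 1.0 / math.cos(math.radians(lat_deg))


def scale_variation(lat_min_deg, lat_max_deg):
    """Ratio of the largest to the smallest scale factor inside a latitude band."""
    k_max = max(mercator_scale(lat_min_deg), mercator_scale(lat_max_deg))
    if lat_min_deg <= 0.0 <= lat_max_deg:
        k_min = 1.0  # the band contains the equator, where the scale factor is 1
    else:
        k_min = min(mercator_scale(lat_min_deg), mercator_scale(lat_max_deg))
    return k_max / k_min


# Hypothetical latitude bands: a city map versus a map of a whole continent.
city = scale_variation(49.1, 49.3)
continent = scale_variation(35.0, 70.0)
print(f"city map:      scale varies by a factor of {city:.4f}")
print(f"continent map: scale varies by a factor of {continent:.4f}")
```

Under these assumptions, the scale is almost constant across the city map, while on the
continent map it more than doubles between the southern and the northern edge.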
Data can be visualized very precisely, as in the following example:
One can argue that this cannot be considered a visualization. On the other hand, text
and numbers can be taken as visual representations as well, similarly to tables and graphs. In
fact, they represent given data. This particular “image” shows the US national debt on
January 22nd, 2006.
Another very useful example of visualization is the record of heartbeats
(electrocardiogram).
On the left is the record of a healthy adult; on the right is the record of an
83-year-old man with high blood pressure.
Nowadays, visualization is used in a variety of areas. It makes it possible to show different
types of objects, such as datasets, algorithms, results of computations, processes,
etc. More and more often, visualizations are interactive: the user can react to the
displayed information and navigate the scene individually. This
interaction is most often performed directly in the graphical interface of the
application instead of through a traditional menu.
Visualization plays a crucial role in the following fields:
• Medical data (VolVis)
• Flow data (FlowVis)
• Abstract data (InfoVis)
• GIS data
• Historical data (archeology)
• Microscopic data (molecular physics)
• Macroscopic data (astronomy)
• Big data
Relationship between visualization and computer graphics
Originally, visualization was considered to be a subfield of computer graphics, because it
uses the CG principles to display the information. Computer graphics here serves as the
communication channel.
This relationship can be viewed from the other side as well. In all types of visualization, we
can find basic graphical primitives, such as points, lines, polygons, or volumes. Computer
graphics focuses solely on processing these primitives, but visualization goes beyond – it
takes into account the content of the data visualized and their properties, such as spatial
position, physical properties, etc. This leads to the definition of visualization as the
application of computer graphics to data representation, in which we map data to
graphical primitives and render the resulting images. On top of that, visualization integrates
many other research disciplines, such as human-computer interaction, perceptual
psychology, databases, statistics, data mining, machine learning, etc.
To summarize, we can claim the following:
Computer graphics focuses primarily on creating interactive images and 3D objects, and its
primary goal is to get a realistic result. Typical CG fields are art and entertainment (games,
movies, advertising, etc.).
Visualization focuses on effective communication of information rather than on a realistic
view of the data.
Computer graphics and visualization share a variety of concepts, tools, and techniques, but
differ in the basic model (the information to be displayed) and in particular in the goal (what
the user expects as the output).
The process of visualization
The basis of a new visualization design is an analysis of the available input data and of the
user's expectations from the resulting visualization (output requirements analysis). The goal of
the result may be to explore the data, confirm a hypothesis, present results (e.g., at a
conference), etc. Interesting findings are usually anomalies occurring in the data, clusters of
data (defined by their similarity), or trends (predictive models).
To display the data, it is necessary to define its mapping to the screen.
One important aspect is the possibility of interactive manipulation at all stages of the
process. This is especially important because the perception of a visualization
and of its "quality" is subjective. There is no definition that guarantees that a rendering is
"effective". It is
therefore important to allow the user to influence the outcome of the process whenever
possible.
CG Pipeline
The classic pipeline in computer graphics consists of the following phases:
Modeling – in the first phase, a 3D model consisting of graphic primitives is created and
placed in the global coordinate system.
Viewing – defines the position, direction, and orientation of the virtual camera in the global
coordinate system. All vertices of the 3D model are then converted into the coordinate
system given by the camera parameters.
Clipping – here the boundaries of the intended image are specified, and objects beyond
these boundaries can be removed. Objects that cross the border can be trimmed.
Additionally, objects can be converted to normalized view coordinates, which greatly
simplifies the clipping process.
Hidden surface removal – removing hidden parts (polygons) that are not visible from the
viewpoint of the camera (back faces, polygons hidden behind other ones).
Projection – in the projection phase, 3D polygons are projected onto the 2D projection
plane using, for example, a perspective transformation. The result is displayed in a
normalized 2D coordinate system of the screen.
Rendering – the rendering phase assigns the corresponding color to each pixel, depending
on the color of the polygons, their transparency, luminosity, position, etc. This can be done,
for example, by ray tracing.
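The viewing, clipping, and projection phases can be sketched for individual vertices as follows.
This is a strongly simplified illustration, not a complete pipeline: it assumes a camera that is
only translated (not rotated) and looks along the negative z axis, clipping against the near
plane only, and a simple perspective division. All function names and values are hypothetical.

```python
def to_camera_space(point, camera_pos):
    """Viewing: express a world-space point relative to the camera.

    Simplification: the camera is only translated and looks along -z.
    """
    return tuple(p - c for p, c in zip(point, camera_pos))


def project(point_cam, focal_length=1.0, near=0.1):
    """Clipping and projection of one point given in camera coordinates.

    Returns None for points closer than the near plane or behind the camera,
    otherwise the 2D coordinates on the projection plane.
    """
    x, y, z = point_cam
    depth = -z  # distance in front of the camera
    if depth < near:
        return None
    return (focal_length * x / depth, focal_length * y / depth)


# Modeling: a hypothetical triangle in the global coordinate system.
triangle = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 3.0)]
camera = (0.0, 0.0, 5.0)

projected = [project(to_camera_space(v, camera)) for v in triangle]
print(projected)

# A point behind the camera is rejected by the near-plane test.
print(project(to_camera_space((0.0, 0.0, 6.0), camera)))
```

The third vertex is closer to the camera than the other two, so the perspective division
moves it further from the image centre than a vertex at the same height but greater depth
would be.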
Data entering the visualization process can be obtained in various ways, for example from
CT / NMR scanning, various types of simulation (e.g., flow simulation), modeling, and other
methods. This data is then processed (filtering, oversampling, selecting a specific part,
derivation, interpolation, ...). The data is then mapped into a viewable form, such as a
geometric model.
In the last phase the principles of computer graphics are used, and the result is displayed on
the screen.
Visualization pipeline
The visualization pipeline is similar to the graphics pipeline, but at a higher level of
abstraction, and it has its own specifics. The phases are as follows:
Data modeling – preparing data (from file, database, ...) for visualization. This means, for
example, preparing data in a format that allows quick access to those data.
Data selection – the data selection is similar to the CG pipeline clipping phase: we
select a subset of the data that should be visualized. This phase can be controlled
automatically, can be left fully to the user, or these approaches can be combined.
Data to visual mappings – the most important phase is mapping data to graphical entities or
their attributes. For example, some parts of the data can control the size of an object, while
others can define its position or color. This phase often integrates additional
pre-processing of the data that precedes the mapping itself, such as scaling, shifting, filtering,
interpolation, etc.
Scene parameter settings (view transformations) – here we can set scene parameters such
as color scheme selection, lighting, or sound. These parameters are relatively independent of
the data.
Rendering or generation of the visualization – in the final stage, the visualization itself is
created. The selected projection depends on the mapping performed and may include, for
example, shading or texture mapping. For most visualization techniques, drawing lines and
uniformly shaded polygons is sufficient. In addition to displaying the data itself, most
visualizations provide a variety of additional information that enhances the interpretation of
the data, such as graphs or general annotations.
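The phases above can be sketched on a tiny example. The records, the attribute names, the
chosen mapping (area to x, population to y, growth to circle radius), and the SVG output are
all assumptions made for this illustration; a real application would choose them according
to the data and the user's task.

```python
# Data modeling: hypothetical records prepared in a directly accessible form.
records = [
    {"name": "A", "area": 30.0, "population": 120.0, "growth": 0.8},
    {"name": "B", "area": 80.0, "population": 40.0, "growth": 0.2},
    {"name": "C", "area": 55.0, "population": 300.0, "growth": 1.5},
    {"name": "D", "area": 10.0, "population": 15.0, "growth": 0.1},
]


def select(data, min_population):
    """Data selection: keep only the subset that should be visualized."""
    return [r for r in data if r["population"] >= min_population]


def scale(value, lo, hi, out_lo, out_hi):
    """Linear scaling, a typical pre-processing step before the mapping itself."""
    if hi == lo:
        return (out_lo + out_hi) / 2.0
    return out_lo + (value - lo) / (hi - lo) * (out_hi - out_lo)


def map_to_marks(data, width, height):
    """Data to visual mapping: area -> x, population -> y, growth -> radius."""
    areas = [r["area"] for r in data]
    pops = [r["population"] for r in data]
    growths = [r["growth"] for r in data]
    marks = []
    for r in data:
        marks.append({
            "x": scale(r["area"], min(areas), max(areas), 20, width - 20),
            # SVG y grows downwards, so larger populations get smaller y values.
            "y": scale(r["population"], min(pops), max(pops), height - 20, 20),
            "r": scale(r["growth"], min(growths), max(growths), 3, 12),
        })
    return marks


def render_svg(marks, width, height, fill="steelblue"):
    """Scene parameters (canvas size, fill color) and rendering as SVG text."""
    circles = "".join(
        f'<circle cx="{m["x"]:.1f}" cy="{m["y"]:.1f}" r="{m["r"]:.1f}" fill="{fill}"/>'
        for m in marks
    )
    return (f'<svg xmlns="http://www.w3.org/2000/svg" '
            f'width="{width}" height="{height}">{circles}</svg>')


selected = select(records, min_population=30)
marks = map_to_marks(selected, width=200, height=100)
svg = render_svg(marks, width=200, height=100)
print(svg)
```

Note that the scene parameters (canvas size, fill color) can be changed without touching the
data or the mapping, which corresponds to their relative independence of the data.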
Human perception
A proper understanding of human perception is the foundation of every good visualization
design. Early studies of human perception focused on the visual system and its capabilities
and limitations. Further research has focused on cognition and the ability
to recognize (that is, the involvement of psychology in the whole process of visualization).
One definition of human perception:
The process of interpreting the surrounding world and shaping its internal representation.
This internal representation is the source of many inaccuracies and misinterpretations.
These can have two origins: unintentional misperception or deliberately induced
misinterpretation. The latter leads to the popular optical illusions.
Optical illusions are basically incorrect or confusing perceptions of reality. They arise from
a faulty interpretation by the brain, when one sees something that is not in the picture at all.
The rectangle in the middle has the same shade of grey in its entire width.
When you move your head towards and away from the picture, the circles appear to rotate.
In fact, boxes A and B have the same color.
Users interact with a visualization based on what they see and how they interpret
it. Therefore, a proper understanding of the vision process helps to produce better
visualizations. About half of the brain is involved in visual perception, which is mostly
processed in parallel and continuously – e.g., color, texture, movement. Approximately 8
percent of men are color blind (daltonism) or have a similar visual defect, which suggests that
high-quality visualization software should allow the user to change the colors of the displayed
data.
The image of the monkey on the left shows how it appears with red-green color blindness;
the image on the right shows normal color vision.
One of the major problems that visualization faces is the limited ability of the human eye,
which must be taken into account in the visualization process. A high-quality image
can be stimulating, but if it contains ambiguities, it is almost useless. The main finding is that
it is not worth mapping data values to graphical attributes that the human eye cannot
properly process and quantify (unless we are deliberately visualizing an optical illusion ☺).
Perception in the context of visualization
We will now focus on the influence of color, texture and movement on the visualization
process.
Color
Color is one of the most common elements of visualization design. More sophisticated
visualization methods allow the user to control the differences between the individual
colors according to their subjective perception. This involves:
o Color balance - uniform color distribution throughout the scale used.
o Distinction - in a discrete color collection, each color is equally well
distinguishable from the others (no color is "easier" or "harder" to identify).
o Flexibility – colors can be selected from anywhere in the color space (i.e., the
technique is not limited to selecting green or red shades only).
There are several basic color spaces that are widely known – e.g., RGB, RGBA, CMY,
CMYK, HSV (hue, saturation, value), HLS. In the CIE LUV space, distances correspond to the
subjectively perceived differences in intensity between shades of color. In CIE Lab, the
perceived color is likewise determined by coordinates in a 3D color space.
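The distinction property can be checked numerically by measuring distances between colors in
CIE Lab. The following sketch converts 8-bit sRGB colors to CIE Lab (D65 reference white) and
uses the simple CIE76 difference, i.e. the Euclidean distance in Lab. The choice of CIE76 and
the sample palette are assumptions of this example; other color-difference formulas exist.

```python
import math
from itertools import combinations


def srgb_to_linear(c):
    """Undo the sRGB transfer function for one channel in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def rgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIE L*a*b* (D65 reference white)."""
    r, g, b = (srgb_to_linear(v / 255.0) for v in rgb)
    # Linear sRGB -> CIE XYZ (D65).
    x = 0.4124 * r + 0.3576 * g + 0.1805 * b
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    z = 0.0193 * r + 0.1192 * g + 0.9505 * b
    xn, yn, zn = 0.95047, 1.0, 1.08883  # D65 white point

    def f(t):
        delta = 6.0 / 29.0
        if t > delta ** 3:
            return t ** (1.0 / 3.0)
        return t / (3.0 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return (116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz))


def delta_e(rgb1, rgb2):
    """CIE76 color difference: Euclidean distance in L*a*b*."""
    return math.dist(rgb_to_lab(rgb1), rgb_to_lab(rgb2))


def least_distinct_pair(palette):
    """Return the pair of palette colors with the smallest color difference."""
    return min(combinations(palette, 2), key=lambda pair: delta_e(*pair))


# Hypothetical palette containing two almost identical reds.
palette = [(255, 0, 0), (250, 0, 0), (0, 0, 255)]
pair = least_distinct_pair(palette)
print(pair, delta_e(*pair))
```

A palette designer could use such a check to replace the least distinguishable pair until all
pairwise differences are roughly equal.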
Let us now focus on a less common and less familiar approach to color selection. Healey and
Enns have shown that it is important to control color distance, linear separation, and color
categories. An example of use is shown in the figure showing historical climate records
over the eastern part of the United States, where the color represents temperature (blue
and green = cold, red and pink = hot). Luminance represents wind speed (lighter =
stronger wind), orientation is mapped to precipitation (greater tilt corresponds to
heavier precipitation), size indicates cloudiness (larger = more clouds), and frost
frequency is mapped to density (denser = more frequent frost).
Texture
Texture is often perceived as a single feature of visualized data. However, similarly
to color, texture can be decomposed into several dimensions perceived by users.
Computer vision distinguishes texture properties such as regularity, directionality,
contrast, size, and roughness.
Texture can be used in many interesting and unconventional ways. One technique is
to map individual data attributes to individual perceptual dimensions of a texture. The
visual appearance of such a texture then depends on the input data.
Examples:
Grinstein et al. visualized multidimensional data using simple stick figures
whose limbs encode the values of the attributes stored in the data elements.
When these figures are placed across the entire display, they create texture patterns
whose spatial arrangement, clustering, and boundaries reflect the
relationships between attributes. The so-called Chernoff faces share a similar
concept
(http://graphics8.nytimes.com/images/2008/04/01/science/0401-sci-PROFILE.lg.jpg).
Ware and Knight used Gabor filters that change their orientation, size, and contrast
based on three independent data attributes.
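The idea of driving a Gabor texture by three attributes can be sketched as follows. A Gabor
function is a sinusoidal carrier multiplied by a Gaussian envelope. The concrete parameter
ranges, the coupling of wavelength to envelope size, and the function names are assumptions
of this sketch and are not taken from the work of Ware and Knight.

```python
import math


def gabor_value(x, y, theta, sigma, wavelength, contrast):
    """Value of a Gabor function at (x, y): Gaussian envelope times cosine carrier."""
    # Rotate the coordinates by the orientation angle theta.
    xr = x * math.cos(theta) + y * math.sin(theta)
    yr = -x * math.sin(theta) + y * math.cos(theta)
    envelope = math.exp(-(xr * xr + yr * yr) / (2.0 * sigma * sigma))
    carrier = math.cos(2.0 * math.pi * xr / wavelength)
    return contrast * envelope * carrier


def gabor_patch(a, b, c, half_size=8):
    """Map three data attributes in [0, 1] to orientation, size, and contrast.

    Returns a square grid of values in [-contrast, contrast].
    """
    theta = a * math.pi          # attribute a -> orientation (0 to 180 degrees)
    sigma = 1.5 + 3.0 * b        # attribute b -> size of the envelope (hypothetical range)
    contrast = 0.2 + 0.8 * c     # attribute c -> contrast (hypothetical range)
    wavelength = 2.0 * sigma     # assumption: stripe width grows with the patch size
    coords = range(-half_size, half_size + 1)
    return [[gabor_value(x, y, theta, sigma, wavelength, contrast) for x in coords]
            for y in coords]


# Two hypothetical data items that differ only in the third attribute.
strong = gabor_patch(0.25, 0.5, 1.0)
weak = gabor_patch(0.25, 0.5, 0.0)
print(strong[8][8], weak[8][8])  # centre values of both patches
```

Placing one such patch per data item on the display produces a texture whose local
orientation, grain size, and contrast vary with the underlying attributes.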
Movement
Movement is the third visual feature that can be perceived very well. Motion is used in many
areas of visualization, such as particle animation, color change animation, or pictograms that
display the direction and magnitude of vector fields. As with color and texture, we are
interested in identifying the perceptual dimensions of motion and in using them effectively. The
following four motion features have been extensively studied in psychophysics: vibration,
flickering, direction, and speed of motion.
From the visualization point of view, we are interested in flickering at frequencies F
that are perceived by the observer as discrete flashes.
Many studies have been conducted on the usefulness of motion in visualization, such as those
by Nakayama and Silverman, Driver et al., and many others. In general, studies have shown that
various changes in the image attract attention and improve the perception process. Of
course, the use of motion in visualization must follow certain rules to fulfil its
function. For example, changes in shape, color, and speed are used to draw the observer's
attention to a notable fact they should notice. The position of the animated object
in the scene is also important: we perceive an object in the centre of interest differently from
an object seen with peripheral vision. Some of the studies assessed how disturbing
so-called "secondary" movements in the scene are. It was found that blinking is
the least disturbing, followed by oscillating motion and the divergence of objects; the most
disturbing is the movement of objects over long distances.