CSC 444 Data Visualization

November 15, 2022

Agenda

Strategies for multidimensional data visualization:
- direct visualization (covered last class)
- projections (dimensionality reduction)
  - Principal Component Analysis (PCA) visualization

Dimensionality Reduction

Methods that map multidimensional data from a high-dimensional space to a low-dimensional space (also called projections).
The projection space can be called a display, embedding, or image space.

Examples of linear projection methods:
- principal component analysis (PCA)
- linear discriminant analysis (LDA)

Example of a nonlinear projection method:
- isometric feature mapping (Isomap)

Principal Component Analysis (PCA)

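Invented in 1901 by Karl Pearson and still widely used today
The main idea is to reduce dimensionality by re-expressing the data along new axes (the principal components), each a linear combination of the original variables
The first principal component explains the most variance
The principal components are uncorrelated and ordered by decreasing variance
Limitation: because PCA is linear, it does not capture nonlinear structure well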
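
The properties above can be checked numerically. The following is a minimal sketch, assuming NumPy is available and using synthetic data made up for illustration (it is not the McDonald's data). It computes PCA from the eigendecomposition of the covariance matrix, which is one common way to do it, and the variable names are hypothetical.

```python
import numpy as np

# Synthetic data for illustration only: 200 observations, 3 variables,
# two of which are strongly correlated.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
X = np.hstack([
    latent + 0.1 * rng.normal(size=(200, 1)),
    2.0 * latent + 0.1 * rng.normal(size=(200, 1)),
    rng.normal(size=(200, 1)),
])

# Center each variable, then eigendecompose the covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order

# Reorder so that PC1 carries the largest variance.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
loadings = eigvecs[:, order]   # columns are the principal directions
scores = Xc @ loadings         # principal components per observation

# Share of total variance explained by each component.
explained_ratio = eigvals / eigvals.sum()

# Covariance of the scores: diagonal, so the components are uncorrelated.
score_cov = np.cov(scores, rowvar=False)
```

Here `explained_ratio` is in decreasing order, and the off-diagonal entries of `score_cov` are zero up to floating-point error.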

Case Study: McDonald’s Menu Data PCA

McDonald’s Menu Data

PCA results

And here is the data set with the principal component scores per observation

Download the starter project

Build a scatter plot mapping PC1 to x, PC2 to y, and category to fill.
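
One possible shape for this scatter plot is sketched below as a Python dictionary that serializes to a Vega specification. This is not the starter project's code. The data URL `data/pca_scores.json`, the data set name `scores`, and the chart size are assumptions; only the field names `PC1`, `PC2` and `category` come from the exercise.

```python
import json

# Hypothetical location of the per-observation PCA scores (assumed to be JSON).
SCORES_URL = "data/pca_scores.json"

scatter = {
    "$schema": "https://vega.github.io/schema/vega/v5.json",
    "width": 400,
    "height": 400,
    "data": [{"name": "scores", "url": SCORES_URL}],
    "scales": [
        # Position scales take their domains from the data.
        {"name": "x", "type": "linear", "nice": True,
         "domain": {"data": "scores", "field": "PC1"}, "range": "width"},
        {"name": "y", "type": "linear", "nice": True,
         "domain": {"data": "scores", "field": "PC2"}, "range": "height"},
        # One color per menu category.
        {"name": "color", "type": "ordinal",
         "domain": {"data": "scores", "field": "category"}, "range": "category"},
    ],
    "axes": [
        {"orient": "bottom", "scale": "x", "title": "PC1"},
        {"orient": "left", "scale": "y", "title": "PC2"},
    ],
    "marks": [{
        "type": "symbol",
        "from": {"data": "scores"},
        "encode": {"enter": {
            "x": {"scale": "x", "field": "PC1"},
            "y": {"scale": "y", "field": "PC2"},
            "fill": {"scale": "color", "field": "category"},
        }},
    }],
}

spec_json = json.dumps(scatter, indent=2)  # paste into the Vega editor to try it
```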

Adding rule marks

Check the rule mark documentation for Vega.

Add rule marks for the loadings data (you will need to add scales for it as well), mapping x and y to zero, x2 to PC1, and y2 to PC2, so that each loading is drawn as a line from the origin.
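
A minimal sketch of this step, again as a Python dictionary that serializes to Vega JSON. The helper `add_loadings`, the URL `data/pca_loadings.json`, the data set name `loadings`, and the scale names `lx` and `ly` are all hypothetical. The loadings get their own scales here on the assumption that their value range differs from that of the scores.

```python
import copy

def add_loadings(spec, url="data/pca_loadings.json"):
    """Return a copy of a Vega spec with a loadings data set, scales and rule marks added."""
    out = copy.deepcopy(spec)
    out.setdefault("data", []).append({"name": "loadings", "url": url})
    out.setdefault("scales", []).extend([
        {"name": "lx", "type": "linear", "zero": True,
         "domain": {"data": "loadings", "field": "PC1"}, "range": "width"},
        {"name": "ly", "type": "linear", "zero": True,
         "domain": {"data": "loadings", "field": "PC2"}, "range": "height"},
    ])
    out.setdefault("marks", []).append({
        "type": "rule",
        "from": {"data": "loadings"},
        "encode": {"enter": {
            # Each rule starts at the origin (data value 0 passed through the scale)...
            "x": {"scale": "lx", "value": 0},
            "y": {"scale": "ly", "value": 0},
            # ...and ends at the variable's loading on PC1 and PC2.
            "x2": {"scale": "lx", "field": "PC1"},
            "y2": {"scale": "ly", "field": "PC2"},
            "stroke": {"value": "black"},
        }},
    })
    return out

# Usage with a bare placeholder spec (a real spec would already hold the scatter plot).
base = {"width": 400, "height": 400, "data": [], "scales": [], "marks": []}
with_rules = add_loadings(base)
```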

Add the other PCs

How can you replicate the plot you created to show PC3 vs. PC4, and PC5 vs. PC6?
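
One way to approach this, sketched under assumptions: treat the PC1 vs. PC2 spec as a template and swap the field names. The helper `remap_fields` and the trimmed-down `template` below are hypothetical, and the template is only a fragment of a full Vega spec.

```python
def remap_fields(node, mapping):
    """Return a copy of a spec in which every string equal to a key of mapping is replaced."""
    if isinstance(node, dict):
        return {key: remap_fields(value, mapping) for key, value in node.items()}
    if isinstance(node, list):
        return [remap_fields(value, mapping) for value in node]
    if isinstance(node, str):
        return mapping.get(node, node)
    return node

# Trimmed template standing in for the PC1 vs. PC2 plot.
template = {
    "scales": [
        {"name": "x", "domain": {"data": "scores", "field": "PC1"}, "range": "width"},
        {"name": "y", "domain": {"data": "scores", "field": "PC2"}, "range": "height"},
    ],
    "axes": [
        {"orient": "bottom", "scale": "x", "title": "PC1"},
        {"orient": "left", "scale": "y", "title": "PC2"},
    ],
    "marks": [{
        "type": "symbol",
        "from": {"data": "scores"},
        "encode": {"enter": {
            "x": {"scale": "x", "field": "PC1"},
            "y": {"scale": "y", "field": "PC2"},
            "fill": {"scale": "color", "field": "category"},
        }},
    }],
}

# One spec per additional pair of components.
pairs = [("PC3", "PC4"), ("PC5", "PC6")]
specs = [remap_fields(template, {"PC1": a, "PC2": b}) for a, b in pairs]
```

Because the replacement matches whole strings only, scale names such as `x` and the `category` field are left untouched.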

Vega-Lite

What is Vega-Lite?

Vega-Lite is a higher-level language built on top of Vega that automates common constructions (such as scales, axes, and legends) and makes the JSON specification significantly shorter.

Vega-Lite makes it fast to create common plots.

“Compared to Vega, Vega-Lite provides a more concise and convenient form to author common visualizations. As Vega-Lite can compile its specifications to Vega specifications, users may use Vega-Lite as the primary visualization tool and, if needed, transition to use the lower-level Vega for advanced use cases.”
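
To give a feel for the difference in length, here is a sketch of a PC1 vs. PC2 scatter plot as a Vega-Lite specification, written as a Python dictionary. The data URL is a hypothetical placeholder. Scales, axes and the legend are not declared because Vega-Lite derives them from the encoding.

```python
import json

vl_scatter = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"url": "data/pca_scores.json"},  # hypothetical file location
    "mark": "point",
    "encoding": {
        "x": {"field": "PC1", "type": "quantitative"},
        "y": {"field": "PC2", "type": "quantitative"},
        "color": {"field": "category", "type": "nominal"},
    },
}

vl_scatter_json = json.dumps(vl_scatter, indent=2)
```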

Examples of Vega-Lite

Here are two examples of how to use Vega-Lite with the data we've been working with, the McDonald's menu data:

Vega-Lite Aggregated Bar Plot
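
A rough sketch of what an aggregated bar plot can look like in Vega-Lite, written as a Python dictionary. It is not the linked example. The data URL and the field name `calories` are assumptions about the menu data, and the mean is just one possible aggregate.

```python
import json

vl_bars = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"url": "data/menu.json"},  # hypothetical file location
    "mark": "bar",
    "encoding": {
        # One bar per menu category.
        "x": {"field": "category", "type": "nominal"},
        # Bar height is the mean of an assumed numeric field.
        "y": {"aggregate": "mean", "field": "calories", "type": "quantitative"},
    },
}

vl_bars_json = json.dumps(vl_bars, indent=2)
```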

Vega-Lite Scatter Plot Matrix
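
A scatter plot matrix can be sketched with Vega-Lite's `repeat` operator, which draws the same inner spec once per combination of row and column field. The version below is an illustration, not the linked example. The data URL and the field names in `fields` are assumptions about the menu data.

```python
import json

# Hypothetical numeric columns to compare pairwise.
fields = ["calories", "protein", "sugars"]

vl_splom = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"url": "data/menu.json"},  # hypothetical file location
    "repeat": {"row": fields, "column": fields},
    "spec": {
        "mark": "point",
        "encoding": {
            # Each cell plots the current column field against the current row field.
            "x": {"field": {"repeat": "column"}, "type": "quantitative"},
            "y": {"field": {"repeat": "row"}, "type": "quantitative"},
            "color": {"field": "category", "type": "nominal"},
        },
    },
}

vl_splom_json = json.dumps(vl_splom, indent=2)
```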