November 15, 2022

## Agenda

• Strategies for multidimensional data visualization:
  • direct visualization (covered last class)
  • projections (dimensionality reduction)
• Principal Component Analysis (PCA) visualization

## Dimensionality Reduction

• Methods that map multidimensional data from a high-dimensional space to a low-dimensional one (also called projections)
• The projection space can be called a display, embedding, or image space.
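As a minimal sketch of what a projection does, the snippet below maps 3-D points to a 2-D display space by taking dot products with two direction vectors. The function name `project`, the sample points, and the basis are all made up for illustration; real methods differ mainly in how they choose the directions.

```python
def project(points, basis):
    """Project each high-dimensional point onto the given direction vectors.

    points: list of n-dimensional points
    basis:  list of k direction vectors (k < n), one per display axis
    Returns the k-dimensional coordinates of every point.
    """
    return [
        [sum(p * b for p, b in zip(point, axis)) for axis in basis]
        for point in points
    ]


# Illustrative data: three 3-D points and a basis that keeps only x and z.
points_3d = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
keep_x_and_z = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

points_2d = project(points_3d, keep_x_and_z)
print(points_2d)
```

Dropping a coordinate is the crudest possible projection; PCA and similar methods instead pick directions that preserve as much of the data's structure as possible.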

Examples of linear projection methods:

• principal component analysis
• linear discriminant analysis

Example of a nonlinear projection method:

• isometric feature mapping (Isomap)

## Principal Component Analysis (PCA)

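• Introduced by Karl Pearson in 1901 and still widely used today
• The main idea is to reduce dimensionality by projecting the data onto a few orthogonal directions that retain as much variance as possible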
• The first principal component explains the most variance
• The principal components are uncorrelated and ordered by decreasing variance
• Limitation: as a linear method, PCA captures nonlinear structure poorly
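The properties listed above can be checked on a small example. The following is a rough sketch, not a production implementation: it finds the leading eigenvectors of the covariance matrix with power iteration and deflation (a technique chosen here for simplicity; libraries typically use SVD). The toy data and all function names are assumptions made for illustration.

```python
import math


def covariance(rows):
    """Return (sample covariance matrix, mean-centered rows)."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    return cov, centered


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def dominant_eigenpair(matrix, iterations=500):
    """Power iteration: largest eigenvalue and its unit-length eigenvector."""
    start = [1.0 / (i + 1) for i in range(len(matrix))]  # arbitrary nonzero start
    length = math.sqrt(sum(x * x for x in start))
    vector = [x / length for x in start]
    for _ in range(iterations):
        product = mat_vec(matrix, vector)
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            break
        vector = [x / norm for x in product]
    eigenvalue = sum(v * p for v, p in zip(vector, mat_vec(matrix, vector)))
    return eigenvalue, vector


def pca(rows, n_components):
    """Return (variances, components, scores) for the leading components."""
    cov, centered = covariance(rows)
    variances, components = [], []
    for _ in range(n_components):
        value, axis = dominant_eigenpair(cov)
        variances.append(value)
        components.append(axis)
        # Deflation: subtract the direction just found so the next pass
        # converges to the next-largest eigenvalue.
        cov = [
            [cov[i][j] - value * axis[i] * axis[j] for j in range(len(cov))]
            for i in range(len(cov))
        ]
    # Scores are the centered data expressed in the new coordinate system.
    scores = [mat_vec(components, row) for row in centered]
    return variances, components, scores


# Made-up 3-D toy data: most of the spread lies along the x = y diagonal.
toy_rows = [
    [4.0, 2.0, 0.5], [4.0, 2.0, -0.5], [2.0, 4.0, 0.5], [2.0, 4.0, -0.5],
    [-2.0, -4.0, 0.5], [-2.0, -4.0, -0.5], [-4.0, -2.0, 0.5], [-4.0, -2.0, -0.5],
]

variances, components, scores = pca(toy_rows, n_components=2)
print("variance per component:", variances)
print("first component:", components[0])
print("first 2-D score:", scores[0])
```

The two score columns are what you would plot as PC1 vs. PC2; their variances come out in decreasing order and their covariance is numerically zero.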

## PCA results

Check the rule mark documentation for Vega.

Add rule marks for the loadings data (you will need to add scales as well), mapping x and y to zero, x2 to PC1, and y2 to PC2, so that each loading is drawn as a line from the origin.
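One possible shape for these additions is sketched below as Python dicts that serialize to Vega JSON. The dataset name `loadings` and the fields `PC1` and `PC2` follow the exercise text; the scale names `loadX` and `loadY` and the stroke color are assumptions, so adapt them to your own specification.

```python
import json

# Scales for the loadings. The names "loadX" and "loadY" are hypothetical.
loading_scales = [
    {
        "name": "loadX",
        "type": "linear",
        "domain": {"data": "loadings", "field": "PC1"},
        "zero": True,  # keep the origin inside the domain
        "nice": True,
        "range": "width",
    },
    {
        "name": "loadY",
        "type": "linear",
        "domain": {"data": "loadings", "field": "PC2"},
        "zero": True,
        "nice": True,
        "range": "height",
    },
]

# One rule per row of the loadings data, drawn from (0, 0) to (PC1, PC2).
loading_rules = {
    "type": "rule",
    "from": {"data": "loadings"},
    "encode": {
        "enter": {
            "x": {"scale": "loadX", "value": 0},
            "y": {"scale": "loadY", "value": 0},
            "x2": {"scale": "loadX", "field": "PC1"},
            "y2": {"scale": "loadY", "field": "PC2"},
            "stroke": {"value": "firebrick"},  # arbitrary color choice
        }
    },
}

# These entries would be appended to the "scales" and "marks" arrays
# of an existing Vega specification.
fragment = {"scales": loading_scales, "marks": [loading_rules]}
print(json.dumps(fragment, indent=2))
```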

How can you replicate the plot you created to show PC3 vs. PC4, and PC5 vs. PC6?
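One way to approach this question, offered as a sketch rather than the expected answer, is to treat the specification as a template and swap the field names. The helper `swap_fields` and the tiny `base_spec` below are hypothetical stand-ins; in practice you would load your own PC1 vs. PC2 specification instead.

```python
import json


def swap_fields(node, mapping):
    """Return a copy of a JSON-like structure with string values replaced.

    Only strings that match a key of `mapping` exactly are replaced, so a
    field reference inside an expression such as "datum.PC1" is left alone.
    """
    if isinstance(node, dict):
        return {key: swap_fields(value, mapping) for key, value in node.items()}
    if isinstance(node, list):
        return [swap_fields(item, mapping) for item in node]
    if isinstance(node, str):
        return mapping.get(node, node)
    return node


# Stand-in for the specification built in class (not a complete Vega spec).
base_spec = {
    "marks": [
        {
            "type": "rule",
            "from": {"data": "loadings"},
            "encode": {"enter": {"x2": {"field": "PC1"}, "y2": {"field": "PC2"}}},
        }
    ]
}

# Generate one variant per pair of components.
pairs = [("PC3", "PC4"), ("PC5", "PC6")]
variants = [swap_fields(base_spec, {"PC1": x, "PC2": y}) for x, y in pairs]

for (x, y), spec in zip(pairs, variants):
    print(f"{x} vs. {y}:")
    print(json.dumps(spec, indent=2))
```

Because `swap_fields` builds new structures, `base_spec` itself stays unchanged and can be reused for every pair.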

## What is Vega-Lite?

Vega-Lite is a higher-level language built on top of Vega that automates some constructions, such as scales, axes, and legends, and makes the JSON specification significantly shorter.

Vega-Lite makes it fast to create common plots.

“Compared to Vega, Vega-Lite provides a more concise and convenient form to author common visualizations. As Vega-Lite can compile its specifications to Vega specifications, users may use Vega-Lite as the primary visualization tool and, if needed, transition to use the lower-level Vega for advanced use cases.”
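To give a sense of how short a Vega-Lite specification can be, here is a sketch of a PC1 vs. PC2 scatter plot, again written as a Python dict that serializes to JSON. The inline data values are invented for illustration; the field names `PC1` and `PC2` are assumed to match the PCA scores used earlier.

```python
import json

vega_lite_spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    # Made-up scores, standing in for real PCA output.
    "data": {
        "values": [
            {"PC1": -1.2, "PC2": 0.4},
            {"PC1": 0.3, "PC2": -0.8},
            {"PC1": 1.5, "PC2": 0.1},
        ]
    },
    "mark": "point",
    "encoding": {
        "x": {"field": "PC1", "type": "quantitative"},
        "y": {"field": "PC2", "type": "quantitative"},
    },
}

print(json.dumps(vega_lite_spec, indent=2))
```

No scales or axes are declared here: Vega-Lite derives them from the encoding, whereas a Vega specification would need to state them explicitly.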