Abstract—To verify cluster separation in high-dimensional data, analysts often reduce the data with a dimension reduction (DR) technique and then visualize it with 2D scatterplots, interactive 3D scatterplots, or scatterplot matrices (SPLOMs). To provide guidance on choosing among these visual encodings, we conducted an empirical data study in which two human coders manually inspected a broad set of 816 scatterplots derived from 75 datasets, 4 DR techniques, and the 3 scatterplot techniques mentioned above. Each coder scored every color-coded class in each scatterplot in terms of its separability from the other classes. We analyze the resulting quantitative data with a heatmap approach, and qualitatively discuss interesting...
In line with technological developments, current data tends to be multidimensional and high ...
Understanding relations in hyper-dimensional data is a prevalent problem, which is often approached ...
Many people interact with scientific data by means of 2D or 3D representations such as scatterplots....
Visualization of high-dimensional data requires a mapping to a visual space. Whenever the goal is to...
Extracting meaningful information out of vast amounts of high-dimensional data is very difficult. Pr...
Applying dimensionality reduction (DR) to large, high-dimensional data sets can be challenging when ...
Many graphical methods for displaying multivariate data consist of arrangements of multiple displays...
Subspace-based analysis has increasingly become the preferred method for clustering high-dimensional...
A scatterplot displays a relation between a pair of variables. Given a set of v variables, there are...
Due to the technological progress over the last decades, today’s scientific and commercial applicati...
Scatterplots are among the most widely used visualization techniques. Compelling scatterplot visuali...
Dimensionality reduction is the transformation of data from a high-dimensional space into a low-dime...
Subspace clustering is a popular method for clustering unlabelled data. However, the computational c...
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Comput...
Class separation is an important concept in machine learning and visual analytics. We address the vi...