Information Visualization

Research

Our research explores the foundations of visual analysis, supporting the discovery of insights from complex data that encompasses multiple dimensions, sources, data forms, time variants, and languages. We take into consideration the application of human judgment to make the best possible use of incomplete, inconsistent, and potentially deceptive information in the face of rapidly changing situations.

We invite you to explore the research highlights below in the areas of text and multimedia analytics, mobile networking, graph analytics, active products, natural user interfaces, and visual analytics evaluation.

Text and Multimedia Analytics

Keyword Extraction and Themes with RAKE and CAST

The goal of this research is to provide more descriptive cues so that users have better insight into the features of a text collection. It also allows them to explore with greater precision and identify or evaluate more specific relationships. In order to accomplish this, the algorithm Computation and Analysis of Significant Themes (CAST) was created. CAST computes a set of themes for a collection of documents based on automatically extracted keyword information.

The Rapid Automatic Keyword Extraction (RAKE) algorithm provides this keyword information. RAKE automatically extracts single- and multi-word keywords from individual documents and then provides a set of high-value keywords to CAST, which clusters them into themes. Each computed theme comprises a set of highly associated keywords and a set of documents that are highly associated with the theme’s keywords.

Whereas many text analysis methods focus on what distinguishes documents, RAKE and CAST focus on what describes documents, ideally characterizing what each document is essentially about. Keywords provide an advantage over other types of signatures as they are readily accessible to a user and can be easily applied to search other information spaces. The value of any particular keyword can be readily evaluated by a user for their particular interests and applied in or adapted to multiple contexts.
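
To make the keyword scoring concrete, the following is a minimal sketch of RAKE-style extraction for a single document, using the degree-to-frequency word score described in the chapter cited below. The stoplist, sample sentence, and parameters are illustrative only; the full algorithm supports corpus-specific stoplists and additional configuration.

```python
import re
from collections import defaultdict

# A tiny illustrative stoplist; real use would rely on a fuller,
# possibly corpus-specific list as described in the RAKE chapter.
STOPWORDS = {"a", "an", "and", "are", "as", "at", "be", "by", "for", "from",
             "in", "is", "it", "of", "on", "or", "over", "that", "the", "to", "with"}

def rake_keywords(text, top_n=5):
    """Return the top-scoring candidate keywords from one document."""
    # 1. Split the text into candidate phrases at stopwords and punctuation.
    words = re.findall(r"[a-z']+", text.lower())
    phrases, current = [], []
    for w in words:
        if w in STOPWORDS:
            if current:
                phrases.append(current)
            current = []
        else:
            current.append(w)
    if current:
        phrases.append(current)

    # 2. Score each content word as degree(w) / frequency(w), where degree
    #    counts the word's co-occurrences inside candidate phrases.
    freq, degree = defaultdict(int), defaultdict(int)
    for phrase in phrases:
        for w in phrase:
            freq[w] += 1
            degree[w] += len(phrase)
    word_score = {w: degree[w] / freq[w] for w in freq}

    # 3. A candidate keyword's score is the sum of its member word scores.
    scored = {" ".join(p): sum(word_score[w] for w in p) for p in phrases}
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

print(rake_keywords("Compatibility of systems of linear constraints "
                    "over the set of natural numbers"))
```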

Related Items

RAKE and CAST are implemented in several products, including IN-SPIRE, Canopy, SRS, and Analytic Widgets.

Automatic Keyword Extraction from Individual Documents

Rose SJ, DW Engel, NO Cramer, and WE Cowley. 2010. Automatic Keyword Extraction from Individual Documents. Chapter 1 in Text Mining: Applications and Theory, vol. 1, ed. MW Berry and J Kogan, pp. 3-20. John Wiley & Sons, Chichester, United Kingdom.

Abstract

This paper introduces a novel and domain-independent method for automatically extracting keywords, as sequences of one or more words, from individual documents. We describe the method's configuration parameters and algorithm, and present an evaluation on a benchmark corpus of technical abstracts. We also present a method for generating lists of stop words for specific corpora and domains, and evaluate its ability to improve keyword extraction on the benchmark corpus. Finally, we apply our method of automatic keyword extraction to a corpus of news articles and define metrics for characterizing the exclusivity, essentiality, and generality of extracted keywords within a corpus.

Describing Story Evolution from Dynamic Information Streams

Rose SJ, RS Butner, WE Cowley, ML Gregory, and J Walker. 2009. Describing Story Evolution from Dynamic Information Streams. In IEEE Symposium on Visual Analytics Science and Technology (VAST 2009), Oct. 12-13, 2009, Atlantic City, NJ, pp. 99-106. IEEE, Piscataway, NJ.

Abstract

Sources of streaming information, such as news syndicates, publish information continuously. Information portals and news aggregators list the latest information from around the world, enabling information consumers to easily identify events in the past 24 hours. The volume and velocity of these streams cause information from prior days to quickly vanish despite its utility in providing an informative context for interpreting new information. Few capabilities exist to support an individual attempting to identify or understand trends and changes from streaming information over time. The burden of retaining prior information and integrating it with the new is left to the skills, determination, and discipline of each individual. In this paper we present a visual analytics system for linking essential content from information streams over time into dynamic stories that develop and change over multiple days. We describe particular challenges to the analysis of streaming information and explore visual representations for showing story change and evolution over time.


Story Flow


The Story Flow visualization shows, for a set of time intervals, the themes computed for each interval, with size indicating the number of documents in a theme, and links related themes over time into stories. The visualization places days across the horizontal axis and orders daily themes along the vertical axis. Themes are consistently ordered within each interval by their number of assigned documents, so the theme order for each day is unaffected by future days. This preserves the organization of themes in the story flow visualization across days and supports information consumers’ extended interaction over days and weeks. An individual or team can therefore print out each day’s story flow column, with document titles and lines, and post it next to the previous days’ columns.
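
The ordering and linking steps can be sketched as follows. This is an illustrative reconstruction, not the system's actual code: each day's themes are sorted by document count only (so earlier days never re-flow), and a theme is linked to the previous day's theme with the largest keyword overlap, a simple stand-in for whatever association measure the implementation uses.

```python
def order_and_link(days):
    """Order each day's themes and link them into stories.

    days: list of daily theme lists; each theme is a dict with
    'keywords' (a set of strings) and 'docs' (number of assigned documents).
    """
    stories = []          # each story is a list of (day_index, theme) pairs
    prev = []             # the previous day's themes, already ordered
    for day_idx, themes in enumerate(days):
        # Order the day's themes by document count alone, so the layout
        # for a day never changes when later days arrive.
        ordered = sorted(themes, key=lambda t: t["docs"], reverse=True)
        for theme in ordered:
            # Link to the previous day's theme with the largest keyword overlap
            # (an illustrative rule, not the system's actual association measure).
            best, best_overlap = None, 0
            for prev_theme in prev:
                overlap = len(theme["keywords"] & prev_theme["keywords"])
                if overlap > best_overlap:
                    best, best_overlap = prev_theme, overlap
            if best is None:
                theme["story"] = len(stories)
                stories.append([])
            else:
                theme["story"] = best["story"]
            stories[theme["story"]].append((day_idx, theme))
        prev = ordered
    return stories

# Two hypothetical days of themes, producing two stories that span both days.
days = [
    [{"keywords": {"flood", "river"}, "docs": 12},
     {"keywords": {"election"}, "docs": 7}],
    [{"keywords": {"flood", "levee"}, "docs": 9},
     {"keywords": {"election", "debate"}, "docs": 15}],
]
print(len(order_and_link(days)))   # -> 2
```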

Related Items

The Story Flow visualization has been implemented in several applications, including IN-SPIRE™ and SRS.

Describing Story Evolution from Dynamic Information Streams

Rose SJ, RS Butner, WE Cowley, ML Gregory, and J Walker. 2009. Describing Story Evolution from Dynamic Information Streams. In IEEE Symposium on Visual Analytics Science and Technology (VAST 2009), Oct. 12-13, 2009, Atlantic City, NJ, pp. 99-106. IEEE, Piscataway, NJ.

Abstract

Sources of streaming information, such as news syndicates, publish information continuously. Information portals and news aggregators list the latest information from around the world, enabling information consumers to easily identify events in the past 24 hours. The volume and velocity of these streams cause information from prior days to quickly vanish despite its utility in providing an informative context for interpreting new information. Few capabilities exist to support an individual attempting to identify or understand trends and changes from streaming information over time. The burden of retaining prior information and integrating it with the new is left to the skills, determination, and discipline of each individual. In this paper we present a visual analytics system for linking essential content from information streams over time into dynamic stories that develop and change over multiple days. We describe particular challenges to the analysis of streaming information and explore visual representations for showing story change and evolution over time.


ThemeRiver™


The ThemeRiver™ visualization helps users identify time-related patterns, trends, and relationships across a large collection of documents. The themes in the collection are represented by a "river" that flows left to right through time. The river widens or narrows to depict changes in the collective strength of selected themes in the underlying documents. Individual themes are represented as colored "currents" flowing within the river. The theme currents narrow or widen to indicate changes in individual theme strength at any point in time.
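
A rough analogue of the river layout can be produced with a centered stacked-area ("streamgraph") plot, as in the sketch below. The data are synthetic, and matplotlib's symmetric baseline is only an approximation of ThemeRiver's layout; it is meant to show how individual theme strengths map to current widths and how their sum gives the river's overall width.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic theme strengths: one row per theme, one column per time step.
rng = np.random.default_rng(0)
days = np.arange(60)
strengths = rng.random((4, days.size))
kernel = np.ones(7) / 7                      # smooth so currents vary gradually
strengths = np.array([np.convolve(s, kernel, mode="same") for s in strengths])

fig, ax = plt.subplots(figsize=(8, 3))
# baseline="sym" centers the stack around zero: the total width at any time
# is the collective strength of the themes, and each colored band is a
# "current" whose width tracks an individual theme's strength.
ax.stackplot(days, strengths, baseline="sym",
             labels=[f"theme {i}" for i in range(strengths.shape[0])])
ax.set_xlabel("time")
ax.set_yticks([])                            # only relative width matters
ax.legend(loc="upper left", ncol=4, frameon=False)
plt.tight_layout()
plt.show()
```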

Related Items

ThemeRiver Video (2001) - (h.264, 20MB)

ThemeRiver™: Visualizing Thematic Changes in Large Document Collections

Havre S, E Hetzler, P Whitney, and L Nowell. "ThemeRiver: Visualizing Thematic Changes in Large Document Collections." IEEE Transactions on Visualization and Computer Graphics, Vol. 8, No. 1, January-March 2002.

Abstract

The ThemeRiver visualization depicts thematic variations over time within a large collection of documents. The thematic changes are shown in the context of a timeline and corresponding external events. The focus on temporal thematic change within a context framework allows a user to discern patterns that suggest relationships or trends. For example, the sudden change of thematic strength following an external event may indicate a causal relationship. Such patterns are not readily accessible in other visualizations of the data. We use a river metaphor to convey several key notions. The document collection's time line, selected thematic content, and thematic strength are indicated by the river's directed flow, composition, and changing width, respectively. The directed flow from left to right is interpreted as movement through time, and the horizontal distance between two points on the river defines a time interval. At any point in time, the vertical distance, or width, of the river indicates the collective strength of the selected themes. Colored "currents" flowing within the river represent individual themes. A current's vertical width narrows or broadens to indicate decreases or increases in the strength of the individual theme.

ThemeRiver™: Visualizing Theme Changes over Time

Havre S, B Hetzler, and L Nowell. 2000. "ThemeRiver: Visualizing Theme Changes over Time." In Proceedings of the IEEE Symposium on Information Visualization (InfoVis 2000), pp. 115-123.

Abstract

ThemeRiver™ is a prototype system that visualizes thematic variations over time within a large collection of documents. The "river" flows from left to right through time, changing width to depict changes in thematic strength of temporally associated documents. Colored "currents" flowing within the river narrow or widen to indicate decreases or increases in the strength of an individual topic or a group of topics in the associated documents. The river is shown within the context of a timeline and a corresponding textual presentation of external events.

ThemeRiver™: In Search of Trends, Patterns, and Relationships

Havre S, B Hetzler, and L Nowell. 1999. ThemeRiver™: In Search of Trends, Patterns, and Relationships. In Proceedings of the IEEE Symposium on Information Visualization (InfoVis '99), October 25-26, San Francisco, CA.

Abstract

ThemeRiver™ is a prototype system that visualizes thematic variations over time across a collection of documents. The "river" flows through time, changing width to depict changes in the thematic strength of documents temporally collocated. Themes or topics are represented as colored "currents" flowing within the river that narrow or widen to indicate decreases or increases in the strength of a topic in associated documents at a specific point in time. The river is shown within the context of a timeline and a corresponding textual presentation of external events.


Lighthouse

Lighthouse is a content analysis system that facilitates the analysis, synthesis, and retrieval of multimedia content. It accomplishes this through an ensemble of state-of-the-art characterization and decision support processes that provide search, classification, summarization, and temporal analysis. The system gives our customers the ability to find image and video duplicates, including both exact matches and fuzzy or partial matches, and includes classifiers for object recognition and face detection. Lighthouse provides temporal analysis of videos, including shot boundary detection, video summaries, and event detection, and it summarizes the image and video content of a collection by clustering similar content. Lighthouse can also be used to identify relationships in collections of data. For example, a user may suspect that a video has been created from a portion of another video; to learn more, the user can explore the similarity measures provided by Lighthouse to trace the repurposed video back to the video of origin. Lighthouse matches industry standards, as demonstrated through an extensive set of benchmarks on collections commonly used within the image processing community.
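
Lighthouse's own matchers and classifiers are not described here, but the general idea of fuzzy image matching can be illustrated with a small perceptual-hash sketch: images are reduced to compact binary signatures, and pairs whose signatures differ by only a few bits are flagged as likely exact or near duplicates. The hash size and distance threshold below are arbitrary choices for illustration.

```python
import numpy as np
from PIL import Image

def average_hash(path, hash_size=8):
    """Tiny perceptual hash: shrink, convert to grayscale, threshold at the mean."""
    img = Image.open(path).convert("L").resize((hash_size, hash_size))
    pixels = np.asarray(img, dtype=np.float32)
    return (pixels > pixels.mean()).flatten()

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return int(np.count_nonzero(a != b))

def find_near_duplicates(paths, max_distance=5):
    """Return pairs of images whose hashes differ by at most max_distance bits."""
    hashes = {p: average_hash(p) for p in paths}
    pairs = []
    for i, p in enumerate(paths):
        for q in paths[i + 1:]:
            if hamming(hashes[p], hashes[q]) <= max_distance:
                pairs.append((p, q))
    return pairs

# Example with hypothetical file names:
# print(find_near_duplicates(["frame_001.png", "frame_002.png", "logo.png"]))
```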

Related Items

Lighthouse is the content analysis system powering Canopy.

Multimedia Analysis + Visual Analytics = Multimedia Analytics

Chinchor N, JJ Thomas, PC Wong, M Christel, and MW Ribarsky. 2010. Multimedia Analysis plus Visual Analytics = Multimedia Analytics. IEEE Computer Graphics and Applications 30(5):52-60.

Abstract

Multimedia analysis has focused on images, video, and to some extent audio and has made progress in single channels excluding text. Visual analytics has focused on the user interaction with data during the analytic process plus the fundamental mathematics and has continued to treat text as did its precursor, information visualization. The general problem we address in this tutorial is the combining of multimedia analysis and visual analytics to deal with multimedia information gathered from different sources, with different goals or objectives, and containing all media types and combinations in common usage.


Arc Weld


Arc Weld is a new analytic visualization for displaying and understanding large, diverse sets of multimedia data by abstracting, segmenting, and displaying relationships through interchangeable attributes. The visualization is further enhanced through novel interaction techniques that allow the user to pivot, browse, search, and drill deeper for further insight.

Arc Weld gives an analyst a quick summary of a diverse data set and supports the discovery of further insight by arranging content by attribute, visualizing relationships, and drilling down into the underlying data.
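
The notion of pivoting a mixed-media collection on interchangeable attributes can be illustrated with a small sketch. The attribute names and items below are hypothetical; the point is only that the same collection can be regrouped on whichever attribute the analyst selects.

```python
import pandas as pd

# Hypothetical metadata for a mixed-media collection.
items = pd.DataFrame([
    {"id": 1, "type": "image", "source": "blog-a", "topic": "flood"},
    {"id": 2, "type": "video", "source": "news-b", "topic": "flood"},
    {"id": 3, "type": "image", "source": "news-b", "topic": "wildfire"},
    {"id": 4, "type": "text",  "source": "blog-a", "topic": "wildfire"},
])

def pivot(collection, attribute):
    """Regroup the collection by whichever attribute the user selects."""
    return collection.groupby(attribute)["id"].apply(list)

print(pivot(items, "source"))   # group by source ...
print(pivot(items, "topic"))    # ... or swap in another attribute
```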

Related Items

Arc Weld is a visualization in the Canopy suite of tools.

Interactive Visual Comparison of Multimedia Data through Type-specific Views

Burtner, R., Bohn, S., & Payne, D. (2013, February). Interactive Visual Comparison of Multimedia Data through Type-specific Views. In IS&T/SPIE Electronic Imaging (pp. 86540M-86540M). International Society for Optics and Photonics.

Abstract

Analysts who work with collections of multimedia to perform information foraging understand how difficult it is to connect information across diverse sets of mixed media. The wealth of information from blogs, social media, and news sites often can provide actionable intelligence; however, many of the tools used on these sources of content are not capable of multimedia analysis because they only analyze a single media type. As such, analysts are taxed to keep a mental model of the relationships among each of the media types when generating the broader content picture. To address this need, we have developed Canopy, a novel visual analytic tool for analyzing multimedia. Canopy provides insight into the multimedia data relationships by exploiting the linkages found in text, images, and video co-occurring in the same document and across the collection. Canopy connects derived and explicit linkages and relationships through multiple connected visualizations to aid analysts in quickly summarizing, searching, and browsing collected information to explore relationships and align content. In this paper, we will discuss the features and capabilities of the Canopy system and walk through a scenario illustrating how this system might be used in an operational environment.

Multimedia Analysis + Visual Analytics = Multimedia Analytics

Chinchor N, JJ Thomas, PC Wong, M Christel, and MW Ribarsky. 2010. Multimedia Analysis plus Visual Analytics = Multimedia Analytics. IEEE Computer Graphics and Applications 30(5):52-60.

Abstract

Multimedia analysis has focused on images, video, and to some extent audio and has made progress in single channels excluding text. Visual analytics has focused on the user interaction with data during the analytic process plus the fundamental mathematics and has continued to treat text as did its precursor, information visualization. The general problem we address in this tutorial is the combining of multimedia analysis and visual analytics to deal with multimedia information gathered from different sources, with different goals or objectives, and containing all media types and combinations in common usage.


Mobile Networking

Edge Computing


Edge Computing pushes the frontier of computing applications, data, and services away from centralized nodes to the logical extremes of a network. It enables analytics and knowledge generation to occur at the source of the data. This approach requires leveraging resources that may not be continuously connected to a network, such as laptops, smartphones, tablets, and sensors.

An Edge Computing reference platform named Kaval, running on the Android operating system, has been developed at PNNL to help provide Edge Computing solutions. Kaval is a common platform for clients who need the ability to collect and analyze data on a mobile device while potentially sharing that data with other devices in the field.
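
Kaval itself is an Android platform, but the collect-locally, analyze-locally, sync-when-connected pattern it embodies can be sketched in a few lines. The endpoint URL, schema, and summary statistic below are hypothetical; the sketch only shows data being stored and analyzed at the edge and forwarded opportunistically.

```python
import json
import sqlite3
import urllib.request

db = sqlite3.connect("readings.db")
db.execute("CREATE TABLE IF NOT EXISTS readings "
           "(ts REAL, value REAL, synced INTEGER DEFAULT 0)")

def record(ts, value):
    """Store a reading at the point of collection (works offline)."""
    db.execute("INSERT INTO readings (ts, value) VALUES (?, ?)", (ts, value))
    db.commit()

def local_summary():
    """Analytics run on the device itself, with or without connectivity."""
    count, mean = db.execute("SELECT COUNT(*), AVG(value) FROM readings").fetchone()
    return {"count": count, "mean": mean}

def sync(url):
    """Push unsynced readings to a peer or server when a connection is available."""
    rows = db.execute("SELECT rowid, ts, value FROM readings WHERE synced = 0").fetchall()
    if not rows:
        return 0
    payload = json.dumps([{"ts": ts, "value": v} for _, ts, v in rows]).encode()
    req = urllib.request.Request(url, data=payload,
                                 headers={"Content-Type": "application/json"})
    try:
        urllib.request.urlopen(req, timeout=5)
    except OSError:
        return 0                      # still offline; keep data buffered locally
    db.executemany("UPDATE readings SET synced = 1 WHERE rowid = ?",
                   [(r[0],) for r in rows])
    db.commit()
    return len(rows)
```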


Graph Analytics

Graph Analytics


Graph analytics is the study and analysis of data that can be transformed into a graph representation consisting of nodes and links. Scientists at the Pacific Northwest National Laboratory (PNNL) have been actively involved in graph analytics R&D for social network, cyber communication, electric power grid, critical infrastructure, bioinformatics, and earth sciences applications. The mission of graph analytics research at PNNL is not research for its own sake; it has the essential and enduring purpose of producing pragmatic, working solutions that meet real-life challenges.

We have developed a series of cutting-edge graph analytics technologies to explore and analyze graphs of different sizes and complexities. For the exploration of small-world graphs such as social networks, we developed the concept of a graph signature, which extracts the local features of a graph node or set of nodes, and used it to supplement the exploration of a complicated graph filled with hidden features. For larger graphs with about one million nodes, we further developed the concept of a multi-resolution, middle-out, cross-zooming technique that allows users to interactively explore their graphs on a common desktop computer. Currently, we are developing the concept of an extreme-scale graph analytics pipeline designed to handle graphs with hundreds of millions of nodes and tens of billions of links. Much of the work developed at PNNL has been loosely integrated into a graph analytics library known as Have Green.
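
The graph-signature idea can be sketched with a small local-topology feature vector per node, projected to two dimensions for a scatterplot in which structurally similar nodes land near one another. The four features and the example small-world graph below are illustrative; the published technique defines its own signature vector and applies multidimensional scaling to much richer graphs.

```python
import networkx as nx
import numpy as np
from sklearn.manifold import MDS

def node_signature(G, n):
    """A small local-topology feature vector for node n (illustrative features)."""
    neighbors = list(G.neighbors(n))
    two_hop = set(neighbors)
    for m in neighbors:
        two_hop.update(G.neighbors(m))
    two_hop.discard(n)
    return [
        G.degree(n),                              # local connectivity
        nx.clustering(G, n),                      # neighborhood density
        np.mean([G.degree(m) for m in neighbors]) if neighbors else 0.0,
        len(two_hop),                             # 2-hop reach
    ]

G = nx.watts_strogatz_graph(200, 6, 0.1, seed=2)  # a small-world example graph
signatures = np.array([node_signature(G, n) for n in G.nodes()])

# Scale the signature vectors down to 2-D; nodes with similar local structure
# cluster together in the resulting scatterplot.
coords = MDS(n_components=2, random_state=0).fit_transform(signatures)
```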

Related Items

A Space-Filling Visualization Technique for Multivariate Small World Graphs

Wong PC, HP Foote, PS Mackey, G Chin, Jr, Z Huang, and JJ Thomas. 2012. "A Space-Filling Visualization Technique for Multivariate Small World Graphs." IEEE Transactions on Visualization and Computer Graphics 18(5):797-809. doi:10.1109/TVCG.2011.99

Abstract

We introduce an information visualization technique, known as GreenCurve, for large multivariate sparse graphs that exhibit small-world properties. Our fractal-based design approach uses spatial cues to approximate the node connections and thus eliminates the links between the nodes in the visualization. The paper describes a robust algorithm to order the neighboring nodes of a large sparse graph by solving the Fiedler vector of its graph Laplacian, and then fold the graph nodes into a space-filling fractal curve based on the Fiedler vector. The result is a highly compact visualization that gives a succinct overview of the graph with guaranteed visibility of every graph node. GreenCurve is designed with the power grid infrastructure in mind. It is intended for use in conjunction with other visualization techniques to support electric power grid operations. The research and development of GreenCurve was conducted in collaboration with domain experts who understand the challenges and possibilities intrinsic to the power grid infrastructure. The paper reports a case study on applying GreenCurve to a power grid problem and presents a usability study to evaluate the design claims that we set forth.
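
The node-ordering step of this approach can be sketched directly with NetworkX, which exposes the Fiedler vector of the graph Laplacian. GreenCurve folds the ordered nodes into a fractal space-filling curve; the boustrophedon ("snake") grid path below is only a simple stand-in with the same every-node-visible property.

```python
import numpy as np
import networkx as nx

G = nx.connected_watts_strogatz_graph(256, 6, 0.1, seed=1)

# 1. Order nodes by the Fiedler vector (the eigenvector of the graph Laplacian
#    for the second-smallest eigenvalue); well-connected nodes end up nearby.
fiedler = nx.fiedler_vector(G, seed=1)
order = [n for _, n in sorted(zip(fiedler, G.nodes()))]

# 2. Fold the ordered nodes onto a grid along a space-filling path.
side = int(np.ceil(np.sqrt(len(order))))
positions = {}
for i, n in enumerate(order):
    row, col = divmod(i, side)
    if row % 2 == 1:
        col = side - 1 - col          # reverse every other row ("snake" path)
    positions[n] = (col, row)

# `positions` can now be handed to a layout/plotting routine, e.g.
# nx.draw(G, pos=positions, node_size=10, with_labels=False).
```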

Graph Analytics—Lessons Learned and Challenges Ahead

Wong PC, C Chen, C Gorg, B Shneiderman, J Stasko, and J Thomas. 2011. "Graph Analytics—Lessons Learned and Challenges Ahead." IEEE Computer Graphics and Applications 31(5):18-29. doi:10.1109/MCG.2011.72

Abstract

Graph analytics is one of the most influential and important R&D topics in the visual analytics community. Researchers with diverse backgrounds from information visualization, human-computer interaction, computer graphics, graph drawing, and data mining have pursued graph analytics from scientific, technical, and social approaches. These studies have addressed both distinct and common challenges. Past successes and mistakes can provide valuable lessons for revising the research agenda. In this article, six researchers from four academic and research institutes identify graph analytics' fundamental challenges and present both insightful lessons learned from their experience and good practices in graph analytics research. The goal is to critically assess those lessons and shed light on how they can stimulate research and draw attention to grand challenges for graph analytics. The article also establishes principles that could lead to measurable standards and criteria for research.

A Novel Application of Parallel Betweenness Centrality to Power Grid Contingency Analysis

Jin S, Z Huang, Y Chen, D Chavarria-Miranda, JT Feo, and PC Wong. 2010. A Novel Application of Parallel Betweenness Centrality to Power Grid Contingency Analysis. In IEEE International Symposium on Parallel & Distributed Processing (IPDPS 2010), pp. 1-7. Institute of Electrical and Electronics Engineers, Piscataway, NJ.

Abstract

In Energy Management Systems, contingency analysis is commonly performed for identifying and mitigating potentially harmful power grid component failures. The exponentially increasing combinatorial number of failure modes imposes a significant computational burden for massive contingency analysis. It is critical to select a limited set of high-impact contingency cases within the constraint of computing power and time requirements to make it possible for real-time power system vulnerability assessment. In this paper, we present a novel application of parallel betweenness centrality to power grid contingency selection. We cross-validate the proposed method using the model and data of the western US power grid, and implement it on a Cray XMT system - a massively multithreaded architecture - leveraging its advantages for parallel execution of irregular algorithms, such as graph analysis. We achieve a speedup of 55 times (on 64 processors) compared against the single-processor version of the same code running on the Cray XMT. We also compare an OpenMP-based version of the same code running on an HP Superdome shared-memory machine. The performance of the Cray XMT code shows better scalability and resource utilization, and shorter execution time for large-scale power grids. This proposed approach has been evaluated in PNNL's Electricity Infrastructure Operations Center (EIOC). It is expected to provide a quick and efficient solution to massive contingency selection problems to help power grid operators to identify and mitigate potential widespread cascading power grid failures in real time.
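
A serial NetworkX stand-in for the contingency-selection idea is shown below: model the grid as a graph and rank lines by edge betweenness centrality, so lines carrying many shortest paths are analyzed first. The toy topology is hypothetical, and the paper's contribution is the massively parallel (Cray XMT) implementation of this computation, not the serial sketch here.

```python
import networkx as nx

# A toy grid topology; a real study would load bus/branch data for the grid.
grid = nx.Graph()
grid.add_edges_from([
    ("bus1", "bus2"), ("bus2", "bus3"), ("bus3", "bus4"),
    ("bus2", "bus5"), ("bus5", "bus6"), ("bus4", "bus6"),
    ("bus6", "bus7"),
])

# Rank transmission lines by edge betweenness centrality: lines on many
# shortest paths are candidate high-impact contingencies to simulate first.
ranking = sorted(nx.edge_betweenness_centrality(grid).items(),
                 key=lambda kv: kv[1], reverse=True)
top_contingencies = [edge for edge, score in ranking[:3]]
print(top_contingencies)
```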

A Novel Visualization Technique for Electric Power Grid Analytics

Wong PC, K Schneider, P Mackey, H Foote, G Chin, R Guttromson, and J Thomas. 2009. "A Novel Visualization Technique for Electric Power Grid Analytics." IEEE Transactions on Visualization and Computer Graphics 15(3):410-423.

Abstract

The application of information visualization holds tremendous promise for the electric power industry, but its potential has so far not been sufficiently exploited by the visualization community. Prior work on visualizing electric power systems has been limited to depicting raw or processed information on top of a geographic layout. Little effort has been devoted to visualizing the physics of the power grids, which ultimately determines the condition and stability of the electricity infrastructure. Based on this assessment, we developed a novel visualization system prototype, GreenGrid, to explore the planning and monitoring of the North American Electricity Infrastructure. The paper discusses the rationale underlying the GreenGrid design, describes its implementation and performance details, and assesses its strengths and weaknesses against the current geographic-based power grid visualization. We also present a case study using GreenGrid to analyze the information collected moments before the last major electric blackout in the Western United States and Canada, and a usability study to evaluate the practical significance of our design in simulated real-life situations. Our result indicates that many of the disturbance characteristics can be readily identified with the proper form of visualization.

A Multi-Level Middle-Out Cross-Zooming Approach for Large Graph Analytics

Wong PC, PS Mackey, KA Cook, RM Rohrer, HP Foote, and MA Whiting. 2009. A Multi-Level Middle-Out Cross-Zooming Approach for Large Graph Analytics. In IEEE Symposium on Visual Analytics Science and Technology (VAST 2009), ed. J Stasko and JJ van Wijk, pp. 147-154. IEEE, Piscataway, NJ.

Abstract

This paper presents a working graph analytics model that embraces the strengths of the traditional top-down and bottom-up approaches with a resilient crossover concept to exploit the vast middle-ground information overlooked by the two extreme analytical approaches. Our graph analytics model is developed in collaboration with researchers and users, who carefully studied the functional requirements that reflect the critical thinking and interaction pattern of a real-life intelligence analyst. To evaluate the model, we implement a system prototype, known as GreenHornet, which allows our analysts to test the theory in practice, identify the technological and usage-related gaps in the model, and then adapt the new technology in their work space. The paper describes the implementation of GreenHornet and compares its strengths and weaknesses against the other prevailing models and tools.

Geometry-Based Edge Clustering for Graph Visualization

Cui WW, H Zhou, H Qu, PC Wong, and XM Li. 2008. "Geometry-Based Edge Clustering for Graph Visualization." IEEE Transactions on Visualization and Computer Graphics 14(6):1277-1284. doi:10.1109/TVCG.2008.135

Abstract

Graphs have been widely used to model relationships among data. For large graphs, excessive edge crossings make the display visually cluttered and thus difficult to explore. In this paper, we propose a novel geometry-based edge-clustering framework that can group edges into bundles to reduce the overall edge crossings. Our method uses a control mesh to guide the edge-clustering process; edge bundles can be formed by forcing all edges to pass through some control points on the mesh. The control mesh can be generated at different levels of detail either manually or automatically based on underlying graph patterns. Users can further interact with the edge-clustering results through several advanced visualization techniques such as color and opacity enhancement. Compared with other edge-clustering methods, our approach is intuitive, flexible, and efficient. The experiments on some large graphs demonstrate the effectiveness of our method.

A Dynamic Multiscale Magnifying Tool for Exploring Large Sparse Graphs

Wong PC, HP Foote, PS Mackey, G Chin, Jr, HJ Sofia, and JJ Thomas. 2008. "A Dynamic Multiscale Magnifying Tool for Exploring Large Sparse Graphs." Information Visualization 7:105-117.

Abstract

We present an information visualization tool, known as GreenMax, to visually explore large small-world graphs with up to a million graph nodes on a desktop computer. A major motivation for scanning a small-world graph in such a dynamic fashion is the demanding goal of identifying not just the well-known features but also the unknown–known and unknown–unknown features of the graph. GreenMax uses a highly effective multilevel graph drawing approach to pre-process a large graph by generating a hierarchy of increasingly coarse layouts that later support the dynamic zooming of the graph. This paper describes the graph visualization challenges, elaborates our solution, and evaluates the contributions of GreenMax in the larger context of visual analytics on large small-world graphs. We report the results of two case studies using GreenMax and the results support our claim that we can use GreenMax to locate unexpected features or structures behind a graph.

Have Green - A Visual Analytics Framework for Large Semantic Graphs

Wong PC, G Chin, Jr, HP Foote, PS Mackey, and JJ Thomas. 2006. "Have Green - A Visual Analytics Framework for Large Semantic Graphs." In IEEE Symposium on Visual Analytics Science and Technology, pp. 67-74. Baltimore, Maryland, October 31-November 2, 2006.

Abstract

A semantic graph is a network of heterogeneous nodes and links annotated with a domain ontology. In intelligence analysis, investigators use semantic graphs to organize concepts and relationships as graph nodes and links in hopes of discovering key trends, patterns, and insights. However, as new information continues to arrive from a multitude of sources, the size and complexity of the semantic graphs will soon overwhelm an investigator's cognitive capacity to carry out significant analyses. We introduce a powerful visual analytics framework designed to enhance investigators' natural analytical capabilities to comprehend and analyze large semantic graphs. The paper describes the overall framework design, presents major development accomplishments to date, and discusses future directions of a new visual analytics system known as Have Green.

Graph Signatures for Visual Analytics

Wong PC, HP Foote, G Chin, Jr, PS Mackey, and KA Perrine. 2006. "Graph Signatures for Visual Analytics." IEEE Transactions on Visualization and Computer Graphics 12(6).

Abstract

We present a visual analytics technique to explore graphs using the concept of a data signature. A data signature, in our context, is a multidimensional vector that captures the local topology information surrounding each graph node. Signature vectors extracted from a graph are projected onto a low-dimensional scatterplot through the use of scaling. The resultant scatterplot, which reflects the similarities of the vectors, allows analysts to examine the graph structures and their corresponding real-life interpretations through repeated use of brushing and linking between the two visualizations. The interpretation of the graph structures is based on the outcomes of multiple participatory analysis sessions with intelligence analysts conducted by the authors at the Pacific Northwest National Laboratory. The paper first uses three public domain datasets with either well-known or obvious features to explain the rationale of our design and illustrate its results. More advanced examples are then used in a customized usability study to evaluate the effectiveness and efficiency of our approach. The study results reveal not only the limitations and weaknesses of the traditional approach based solely on graph visualization but also the advantages and strengths of our signature-guided approach presented in the paper.

Generating Graphs for Visual Analytics through Interactive Sketching

Wong PC, HP Foote, PS Mackey, KA Perrine, and G Chin, Jr. 2006. "Generating Graphs for Visual Analytics through Interactive Sketching." IEEE Transactions on Visualization and Computer Graphics 12(6).

Abstract

We introduce an interactive graph generator, GreenSketch, designed to facilitate the creation of descriptive graphs required for different visual analytics tasks. The human-centric design approach of GreenSketch enables users to master the creation process without specific training or prior knowledge of graph model theory. The customized user interface encourages users to gain insight into the connection between the compact matrix representation and the topology of a graph layout when they sketch their graphs. Both the human-enforced and machine-generated randomnesses supported by GreenSketch provide the flexibility needed to address the uncertainty factor in many analytical tasks. This paper describes over two dozen examples that cover a wide variety of graph creations from a single line of nodes to a real-life small-world network that describes a snapshot of telephone connections. While the discussion focuses mainly on the design of GreenSketch, we include a case study that applies the technology in a visual analytics environment and a usability study that evaluates the strengths and weaknesses of our design approach.


Active Products


Active Products are a new kind of “smart report” that can support both rapid authoring and new modes of presentation. Instead of static reports that are difficult to update and that must be composed manually, Active Products are context-sensitive, automatically taking the form appropriate for their audience.

NVAC’s Active Products research is developing a suite of tools that supports the end-to-end process of information product authoring and dissemination. This tool suite includes components in three areas: analytic snippet collection, report composition, and dynamic presentation.

Active Products introduce new reporting paradigms, allowing user communities to move beyond static reports to dynamic products tightly coupled to data and reasoning.

Related Items

The Science of Analytic Reporting

Chinchor N, and WA Pike. 2009. The Science of Analytic Reporting. Information Visualization 8(4):286-293.

Abstract

The challenge of visually communicating analysis results is central to the ability of visual analytics tools to support decision making and knowledge construction. The benefit of emerging visual methods will be improved through more effective exchange of the insights generated through the use of visual analytics. This paper outlines the major requirements for next-generation reporting systems in terms of eight major research needs: the development of best practices, design automation, visual rhetoric, context and audience, connecting analysis to presentation, evidence and argument, collaborative environments, and interactive and dynamic documents. It also describes an emerging technology called Active Products that introduces new techniques for analytic process capture and dissemination.


Natural User Interfaces

NUI Interface Frameworks


Traditionally, Natural User Interactions (NUI) have existed only in the dreams of science fiction writers. NUI has not been pervasive in general-purpose computing due to a number of limiting factors, including the high cost of hardware, inadequate computing power, and a lack of development frameworks. Researchers at PNNL are actively developing next-generation user interfaces, including contributing to Multi-Touch For Java (MT4j), an open-source NUI framework. MT4j allows developers to create immersive 2D and 3D multi-touch applications on any computing platform that supports a Java Virtual Machine (JVM). Advanced features include hardware acceleration via OpenGL and backwards compatibility with existing Java GUI toolkits.


Precision Information Environments


The Precision Information Environments (PIE) project at PNNL is leading the effort to combine NUI with real-time collaborative environments, allowing users to naturally interact with both computer systems and each other.

PIE will provide access to information and decision support capabilities in a multi-platform system that supports multiple user roles, contexts, and phases of emergency management, planning, and response. These analytic and simulation capabilities will be provided through novel interactions that transform the way users engage with each other and with information.


Interactive Power Wall - MURAL


The Pacific Northwest National Laboratory’s Interactive Power Wall provides a unique capability to explore the boundaries of collaborative scientific and information visualization research on large, high-density displays. It is a multi-projector, multi-touch interactive display system with a total resolution of 15.4 million pixels (7.4 times 1080p HD) on a 7-ft-high by 16-ft-wide continuous high-quality glass display screen.

The Interactive Power Wall is located in the Multidisciplinary Research and Analysis Lab, or MURAL Room, of PNNL’s Computational Sciences Facility. MURAL is a general-purpose space that seats up to 60 people for research projects, meetings, working groups, and presentations. The Interactive Power Wall serves as an investigation and research tool supporting PNNL contributions to our clients’ critical mission areas.

Related Items

  • Affinity+: Semi-Structured Brainstorming on Large Displays (YouTube)
  • Affinity+: Semi-Structured Brainstorming on Large Displays

    Burtner, E. R., May, R. A., Scarberry, R. E., LaMothe, R. R., & Endert, A. (2013). Affinity+: Semi-Structured Brainstorming on Large Displays. (No. PNNL-SA-93014). Pacific Northwest National Laboratory (PNNL), Richland, WA (US).

    Abstract

    Affinity diagraming is a powerful method for encouraging and capturing lateral thinking in a group environment. The Affinity+ Concept was designed to improve the collaborative brainstorm process through the use of large display surfaces in conjunction with mobile devices like smart phones and tablets. The system works by capturing the ideas digitally and allowing users to sort and group them on a large touch screen manually. Additionally, Affinity+ incorporates theme detection, topic clustering, and other processing algorithms that help bring structured analytic techniques to the process without requiring explicit leadership roles and other overhead typically involved in these activities.


Visual Analytics Evaluation

Threat Stream Generator - Synthetic datasets


Pacific Northwest National Laboratory’s Threat Stream Generator (TSG) gives developers of information analytics software a new approach and better evaluation tools than ever before. Originally developed for government use, and refined with feedback from a worldwide beta user community, TSG’s complete set of synthetic scenarios, analytic tasks, and complex datasets is available at no cost to users in commercial and educational domains. The key ingredients that enable unbiased and insightful evaluation of analytic software before operational use are scenarios and tasks with known ground truth and realistic supporting data. The Threat Stream suite of tools provides information of a quality that otherwise could only be found in national security or law enforcement agencies.

Related Items

VAST Contest Dataset Use in Education

Whiting MA, C North, A Endert, J Scholtz, JN Haack, CF Varley, and JJ Thomas. 2009. VAST Contest Dataset Use in Education. In IEEE Symposium on Visual Analytics Science and Technology (VAST 2009), ed. J Stasko and JJ van Wijk, pp. 115-122. IEEE, Piscataway, NJ.

Abstract

The IEEE Visual Analytics Science and Technology (VAST) Symposium has held a contest each year since its inception in 2006. These events are designed to provide visual analytics researchers and developers with analytic challenges similar to those encountered by professional information analysts. The VAST contest has had an extended life outside of the symposium, however, as materials are being used in universities and other educational settings, either to help teachers of visual analytics-related classes or for student projects. We describe how we develop VAST contest datasets that result in products that can be used in different settings and review some specific examples of the adoption of the VAST contest materials in the classroom. The examples are drawn from graduate and undergraduate courses at Virginia Tech and from the Visual Analytics "Summer Camp" run by the National Visualization and Analytics Center in 2008. We finish with a brief discussion on evaluation metrics for education.

Understanding the Dynamics of Collaborative Multi-Party Discourse

Cowell AJ, ML Gregory, JR Bruce, JN Haack, DV Love, SJ Rose, and AH Andrew. 2006. "Understanding the Dynamics of Collaborative Multi-Party Discourse." Information Visualization 5(4):250-259. doi:10.1057/palgrave.ivs.9500139

Abstract

In this paper, we discuss the efforts underway at the Pacific Northwest National Laboratory in understanding the dynamics of multi-party discourse across a number of communication modalities, such as email, instant messaging traffic and meeting data. Two prototype systems are discussed. The Conversation Analysis Tool (ChAT) is an experimental test-bed for the development of computational linguistic components and enables users to easily identify topics or persons of interest within multi-party conversations, including who talked to whom, when, the entities that were discussed, etc. The Retrospective Analysis of Communication Events (RACE) prototype, leveraging many of the ChAT components, is an application built specifically for knowledge workers and focuses on merging different types of communication data so that the underlying message can be discovered in an efficient, timely fashion.


User-centered evaluation of visual analytics environments

We have approached user-centered evaluations from several perspectives. First, we have developed a reviewing system for the VAST Challenges that combines reviews from end-users and visual analytics researchers to gain different perspectives on the capabilities that are important in these environments. VAST Challenge participants receive these comments and are able to determine if changes should be made to their work.

We took one year of reviews for all the mini-challenges (VAST 2009), combined them with additional input from professional analysts, and analyzed the comments to identify the most important aspects of visual analytics environments. We used this information, along with additional papers from the literature, to construct an initial set of guidelines for visual analytics environments. Some of these guidelines are borrowed from related domains such as human-computer interaction (HCI), situation awareness (SA), and human-automation interaction.

Related Items

Developing Guidelines for Assessing Visual Analytics Environments

Scholtz J. 2011. "Developing Guidelines for Assessing Visual Analytics Environments." Information Visualization 10(3):212-231. doi:10.1177/1473871611407399

Abstract

In this article, we develop guidelines for evaluating visual analytics environments based on a synthesis of reviews for the entries to the 2009 Visual Analytics Science and Technology (VAST) Symposium Challenge and from a user study with professional intelligence analysts. By analyzing the 2009 VAST Challenge reviews, we gained a better understanding of what is important to our reviewers, both visualization researchers and professional analysts. We also report on a small user study with professional analysts to determine the important factors that they use in evaluating visual analysis systems. We also looked at guidelines developed by researchers in various domains and synthesized the results from these three efforts into an initial set for use by others in the community. One challenge for future visual analytics systems is to help in the generation of reports. In our user study, we also worked with analysts to understand the criteria they used to evaluate the quality of analytic reports. We propose that this knowledge will be useful as researchers look at systems to automate some of the report generation. From these two efforts, we produced some initial guidelines for evaluating visual analytics environments and for the evaluation of analytic reports. It is important to understand that these guidelines are initial drafts and are limited in scope, as the visual analytics systems we evaluated were used in specific tasks. We propose these guidelines as a starting point for the Visual Analytics Community.

Developing Qualitative Metrics for Visual Analytic Environments

Scholtz J. 2010. Developing Qualitative Metrics for Visual Analytic Environments. In BELIV '10: Beyond time and errors: novel evaluation methods for Information Visualization, A Workshop of the ACM CHI Conference, April 10-11, 2010, Atlanta, Georgia. Association for Computing Machinery, New York, NY.

Abstract

In this paper, we examine reviews for the entries to the 2009 Visual Analytics Science and Technology (VAST) Challenge. By analyzing these reviews we gained a better understanding of what is important to our reviewers, both visualization researchers and professional analysts. This is a bottom up approach to the development of heuristics to use in the evaluation of visual analytic environments. The meta-analysis and the results are presented in this paper.

Questionnaires for eliciting evaluation data from users of interactive question answering

Kelly D, PB Kantor, E Morse, J Scholtz, and Y Sun. 2009. "Questionnaires for eliciting evaluation data from users of interactive question answering." Natural Language Engineering 15(1):119-141. doi:10.1017/S1351324908004932

Abstract

Evaluating interactive question answering (QA) systems with real users can be challenging because traditional evaluation measures based on the relevance of items returned are difficult to employ since relevance judgments can be unstable in multi-user evaluations. The work reported in this paper evaluates, in distinguishing among a set of interactive QA systems, the effectiveness of three questionnaires: a Cognitive Workload Questionnaire (NASA TLX), and Task and System Questionnaires customized to a specific interactive QA application. These Questionnaires were evaluated with four systems, seven analysts, and eight scenarios during a 2-week workshop. Overall, results demonstrate that all three Questionnaires are effective at distinguishing among systems, with the Task Questionnaire being the most sensitive. Results also provide initial support for the validity and reliability of the Questionnaires.

Application and Evaluation of Analytic Gaming

Riensche RM, LM Martucci, J Scholtz, and MA Whiting. 2009. Application and Evaluation of Analytic Gaming. In 2009 International Conference on Computational Science and Engineering, August 29-31, 2009, Vancouver, Canada, 4:1169-1173. IEEE Computer Society, Los Alamitos, CA.

Abstract

We describe an "analytic gaming" framework and methodology, and introduce formal methods for evaluation of the analytic gaming process. This process involves conception, development, and playing of games that are informed by predictive models and driven by players. Evaluation of analytic gaming examines both the process of game development and the results of game play exercises.

Advancing user-centered evaluation of visual analytic environments through contests.

Costello L, G Grinstein, C Plaisant, and J Scholtz. 2009. "Advancing user-centered evaluation of visual analytic environments through contests." Information Visualization 8(3):230-238.

Abstract

In this paper, the authors describe the Visual Analytics Science and Technology (VAST) Symposium contests run in 2006 and 2007 and the VAST 2008 and 2009 challenges. These contests were designed to provide researchers with a better understanding of the tasks and data that face potential end users. Access to these end users is limited because of time constraints and the classified nature of the tasks and data. In that respect, the contests serve as an intermediary, with the metrics and feedback serving as measures of utility to the end users. The authors summarize the lessons learned and the future directions for VAST Challenges.

User-Centered Evaluation of Technosocial Predictive Analysis

Scholtz J, and M Whiting. 2009. User-Centered Evaluation of Technosocial Predictive Analysis. Association for the Advancement of Artificial Intelligence, 2009.

Abstract

In today's technology filled world, it is absolutely essential to show the utility of new software, especially software that brings entirely new capabilities to potential users. In the case of technosocial predictive analytics, researchers are developing software capabilities to augment human reasoning and cognition. Getting acceptance and buy-in from analysts and decision makers will not be an easy task. In this position paper, we discuss an approach we are taking for user-centered evaluation that we believe will result in facilitating the adoption of technosocial predictive software by the intelligence community.

VAST Contest Dataset Use in Education

Whiting MA, C North, A Endert, J Scholtz, JN Haack, CF Varley, and JJ Thomas. 2009. VAST Contest Dataset Use in Education. In IEEE Symposium on Visual Analytics Science and Technology (VAST 2009), ed. J Stasko and JJ van Wijk, pp. 115-122. IEEE, Piscataway, NJ.

Abstract

The IEEE Visual Analytics Science and Technology (VAST) Symposium has held a contest each year since its inception in 2006. These events are designed to provide visual analytics researchers and developers with analytic challenges similar to those encountered by professional information analysts. The VAST contest has had an extended life outside of the symposium, however, as materials are being used in universities and other educational settings, either to help teachers of visual analytics-related classes or for student projects. We describe how we develop VAST contest datasets that result in products that can be used in different settings and review some specific examples of the adoption of the VAST contest materials in the classroom. The examples are drawn from graduate and undergraduate courses at Virginia Tech and from the Visual Analytics "Summer Camp" run by the National Visualization and Analytics Center in 2008. We finish with a brief discussion on evaluation metrics for education.

Visual-Analytics Evaluation

Plaisant C, G Grinstein, and J Scholtz. 2009. "Visual-Analytics Evaluation." IEEE Computer Graphics and Applications 29(3):16-17. doi:10.1109/MCG.2009.56

Abstract

Visual analytics (VA) is the science of analytical reasoning facilitated by interactive visual interfaces. Assessing VA technology's effectiveness is challenging because VA tools combine several disparate components, both low and high level, integrated in complex interactive systems used by analysts, emergency responders, and others. These components include analytical reasoning, visual representations, computer-human interaction techniques, data representations and transformations, collaboration tools, and especially tools for communicating the results of their use. VA tool users' activities can be exploratory and can take place over days, weeks, or months. Users might not follow a predefined or even linear work flow. They might work alone or in groups. To understand these complex behaviors, an evaluation can target the component level, the system level, or the work environment level, and requires realistic data and tasks. Traditional evaluation metrics such as task completion time, number of errors, or recall and precision are insufficient to quantify the utility of VA tools, and new research is needed to improve our VA evaluation methodology.

Evaluating Visual Analytics at the 2007 VAST Symposium Contest

Plaisant C, G Grinstein, J Scholtz, MA Whiting, T O'Connell, S Laskowski, L Chien, A Tat, W Wright, C Gorg, Z Lui, N Parekh, K Singhal, and JT Stasko. 2008. "Evaluating Visual Analytics at the 2007 VAST Symposium Contest." IEEE Computer Graphics and Applications 28(2):12-21. doi:10.1109/MCG.2008.27

Abstract

In this article, we report on the contest's data set and tasks, the judging criteria, the winning tools, and the overall lessons learned in the competition. We believe that by organizing these contests, we're creating useful resources for researchers and are beginning to understand how to better evaluate VA tools. Competitions encourage the community to work on difficult problems, improve their tools, and develop baselines for others to build or improve upon. We continue to evolve a collection of data sets, scenarios, and evaluation methodologies that reflect the richness of the many VA tasks and applications.
