CS520 Knowledge Graphs: What Should AI Know?

How do Knowledge Graphs Relate to AI?


1. Introduction

In this concluding chapter, we examine several ways in which knowledge graphs intersect with Artificial Intelligence (AI). As noted in the opening chapter, labeled directed graphs have been used for knowledge representation since the early days of AI research. Our focus here is on the role of knowledge graphs in more recent developments. Accordingly, we organize the discussion around three themes: knowledge graphs as a test bed for AI algorithms; the emergence of graph data science as a distinct area of study; and the role of knowledge graphs in the broader pursuit of AI's long-term goals.

2. Knowledge Graphs as a Test-Bed for Current Generation AI Algorithms

Knowledge graphs have a two-way relationship with AI algorithms. On one hand, knowledge graphs enable many current AI applications, and on the other, many current AI algorithms are used in creating knowledge graphs. We consider this symbiotic relationship in both directions.

Personal assistants, recommender systems, and search engines are applications that exhibit intelligent behavior and have billions of users. It is now widely accepted that these applications perform better when they can leverage knowledge graphs. A personal assistant using a knowledge graph can complete more tasks. A recommender system with a knowledge graph can make better recommendations. Similarly, a search engine can return better results when it has access to a knowledge graph. These applications therefore provide a compelling context and a concrete set of requirements for knowledge graphs to have an impact on real-world product offerings.

To create a knowledge graph, we must absorb knowledge from multiple information sources, align that information, distill key pieces of knowledge from a large volume of data, and mine that knowledge to extract insights that influence intelligent behavior. AI techniques play an important role at each stage of knowledge graph creation and exploitation. For extracting information from sources, we use entity and relation extraction techniques. For aligning information across multiple sources, we use techniques such as schema mapping and entity linking. To distill the extracted information, we apply techniques such as data cleaning and anomaly detection. Finally, to extract actionable insights from the graph, we use techniques such as inference algorithms and natural language question answering.
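The stages above can be sketched as a miniature pipeline. This is an illustrative toy, not any real system: the function names, the `is_a` pattern matcher, and the alias map are all assumptions standing in for full extraction, linking, cleaning, and inference components.

```python
# A minimal sketch of the knowledge graph creation stages described above.
import re

def extract_triples(text):
    """Entity/relation extraction: a toy matcher for 'X is a Y' sentences."""
    return [(m.group(1), "is_a", m.group(2))
            for m in re.finditer(r"(\w+) is a (\w+)", text)]

def align(triples, alias_map):
    """Entity linking: map surface forms to canonical entity identifiers."""
    return [(alias_map.get(s, s), p, alias_map.get(o, o)) for s, p, o in triples]

def distill(triples):
    """Data cleaning: drop duplicates and self-referential triples."""
    return sorted({t for t in triples if t[0] != t[2]})

def query(triples, predicate):
    """Insight extraction: a simple lookup standing in for inference/QA."""
    return [t for t in triples if t[1] == predicate]

text = "Fido is a dog. A dog is a mammal. Rex is a dog."
kg = distill(align(extract_triples(text), {"Fido": "dog:fido", "Rex": "dog:rex"}))
print(kg)
```

Real pipelines replace each of these four functions with substantial machine learning components, but the flow from raw sources to queryable triples follows the same shape.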

In summary, knowledge graphs enable AI systems by providing both motivation and concrete requirements for their development. At the same time, AI techniques fuel our ability to create knowledge graphs economically and at scale.

3. Knowledge Graphs and Graph Data Science

Graph data science is an emerging discipline that aims to derive knowledge by leveraging the structure inherent in data. Organizations typically have access to large volumes of data, but their ability to extract value from this data has often been limited to a fixed set of predefined reports. Graph data science is transforming this experience by combining graph algorithms, graph queries, and visualizations into tools and products that significantly accelerate the process of gaining insights.

Figure 1. Emerging new discipline of Graph Data Science (Figure Credit: Neo4j)

As we saw in the analytics-oriented use cases for the financial industry, organizations are increasingly interested in exploiting the relational structure of their data to make predictions about risk, new market opportunities, and related outcomes. Predictive tasks are commonly addressed using machine learning algorithms, which typically rely on carefully engineered features. As machine learning algorithms have become more accessible and are often available as off-the-shelf solutions, feature engineering has emerged as a distinct and critical skill. Effective feature engineering requires both a deep understanding of the application domain and a solid grasp of how machine learning algorithms operate.
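To make the idea of feature engineering over relational structure concrete, here is a small sketch that derives two classic structural features, degree and local clustering coefficient, from a toy graph. The edge list and the features chosen are illustrative assumptions; in practice such features would be computed over a real transaction or customer graph and fed into a machine learning classifier.

```python
# A sketch of graph feature engineering: hand-derived structural features.
from collections import defaultdict

edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("e", "a")]

# Build an undirected adjacency structure.
adj = defaultdict(set)
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

def features(node):
    """Structural features: degree and local clustering coefficient."""
    nbrs = adj[node]
    degree = len(nbrs)
    # Count edges among the node's neighbours (triangle participation).
    links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
    possible = degree * (degree - 1) / 2
    clustering = links / possible if possible else 0.0
    return {"degree": degree, "clustering": clustering}

for node in sorted(adj):
    print(node, features(node))
```

Choosing which structural properties to expose to the learning algorithm is precisely the feature-engineering skill the paragraph above describes: it requires knowing both what the domain's relationships mean and what signals the downstream model can exploit.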

This synergy between traditional graph-based systems and machine learning techniques for identifying and predicting relational properties in data has catalyzed the emergence of graph data science as a distinct sub-discipline. Because of the high-impact use cases enabled by graph data science, it is increasingly regarded as a valuable and in-demand skill in industry.

4. Knowledge Graphs and the Longer-Term Objectives of AI

Early AI research focused on explicitly representing knowledge, giving rise to the first knowledge graphs in the form of semantic networks. Over time, semantic networks were formalized, leading to successive generations of representation languages, including description logics, logic programs, and graphical models. Alongside the development of these languages, researchers also addressed the challenge of authoring knowledge. Techniques for knowledge authoring have ranged from knowledge engineering and inductive learning to more recent methods based on deep learning.

To achieve the broader vision of AI, it is essential to have knowledge representations that align with human understanding and support reasoning. While some AI tasks, such as search, recommendation, or translation, do not require perfect alignment with human understanding, many domains demand precise, interpretable knowledge. Examples include legal knowledge for tax calculations, subject-matter expertise for teaching, or contract knowledge for automated execution. Even in domains where explicit knowledge is not strictly required, the resulting AI behavior must often be explainable, allowing humans to understand its reasoning. For these reasons, explicit knowledge representation remains a critical component of AI.

Some argue that traditional knowledge engineering does not scale, whereas machine learning and natural language processing (NLP) methods do. This argument, however, often overlooks the limitations of these scalable approaches. For example, language models can compute word similarity but cannot explain why two words are considered similar. In contrast, resources like WordNet provide an interpretable basis for such similarity. Scalable methods depend heavily on human input, such as hyperlinks, click data, or explicit feedback. Combining automated methods with knowledge graphs to produce human-understandable representations is therefore essential for AI systems that truly reason and explain their conclusions.
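The contrast can be illustrated in a few lines. The three-dimensional vectors and the miniature taxonomy below are made up for the example; they stand in for a learned embedding table and a WordNet-like resource, respectively.

```python
# Embedding similarity yields a score; a taxonomy yields a reason.
import math

embeddings = {  # assumed toy vectors; real models use hundreds of dimensions
    "cat": [0.9, 0.1, 0.3],
    "dog": [0.8, 0.2, 0.3],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# The embedding answer: a number, with no account of *why* the words are similar.
print(round(cosine(embeddings["cat"], embeddings["dog"]), 3))

# A taxonomy answer: the shared hypernym itself is the explanation.
hypernyms = {"cat": "carnivore", "dog": "carnivore", "carnivore": "mammal"}

def common_ancestor(a, b):
    chain = set()
    while a in hypernyms:
        a = hypernyms[a]
        chain.add(a)
    while b in hypernyms:
        b = hypernyms[b]
        if b in chain:
            return b
    return None

print(common_ancestor("cat", "dog"))  # both are carnivores: an interpretable basis
```

The cosine score tells us the two words are close in the learned space, but nothing more; the shared hypernym "carnivore" is a human-understandable justification of the same judgment.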

While knowledge graphs of triples are valuable, they are insufficient for many AI tasks that require more expressive representations. Advanced formalisms, though less commonly used due to cost and complexity, address challenges that current deep learning and NLP methods cannot fully solve. These challenges include self-awareness, commonsense reasoning, model-based reasoning, and experimental design. For instance, a self-aware AI could recognize the limits of its knowledge, while commonsense reasoning allows a system to detect impossible scenarios, such as a coin dated 1800 B.C. Current language models can generate coherent short texts, but they lack a global narrative model for long-term coherence. Developing AI systems that can master a domain, formulate hypotheses, design experiments, and analyze results remains beyond the reach of current technologies.
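In the spirit of the "coin dated 1800 B.C." example, a commonsense check can be sketched as a hand-written rule. The rule below is an assumption for illustration only; genuine commonsense reasoning would require a vastly richer body of background knowledge than any single rule encodes.

```python
# A toy commonsense plausibility check for artifact descriptions.
def plausible_coin_date(date_str):
    """A coin cannot bear a 'B.C.' inscription: that dating scheme did not
    exist before the common era, so such an artifact is impossible."""
    return "B.C." not in date_str

print(plausible_coin_date("1800 B.C."))  # the impossible coin
print(plausible_coin_date("1800 A.D."))  # a plausible one
```

The point is not the rule itself but what it presupposes: detecting the impossibility requires background knowledge about when dating conventions arose, which is exactly the kind of knowledge current statistical models do not represent explicitly.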

5. Summary

We have considered three ways in which knowledge graphs intersect with AI: as a test-bed for evaluating machine learning and NLP algorithms, as an enabler of the emerging discipline of graph data science, and as a core component in realizing the long-term vision of AI. We hope this volume inspires readers to explore the potential of scalable knowledge graph creation and use. At the same time, it is important to keep sight of the longer-term goal: developing expressive, human-understandable representations that can also be created at scale.

6. Further Reading

Using machine learning with graphs is an important aspect of knowledge graphs that received only a brief introduction in the first chapter. A good starting point for a deeper exploration of this topic is a survey paper on learning embeddings from graphs [Hamilton et al. 2017]. For a practical resource on graph analytics and algorithm implementations useful for graph data science experiments and hands-on learning, the resources provided by Neo4j are an excellent place to begin [Neo4j]. Future architectures for AI are a hotly contested topic, but two notable voices in this space are [Marcus 2018] and [Lecun 2022].

[Hamilton et al. 2017] Hamilton, Ying & Leskovec (2017), Representation Learning on Graphs: Methods and Applications. https://arxiv.org/abs/1709.05584

[Neo4j] Neo4j Graph Algorithms Documentation.

[Marcus 2018] Marcus, Gary (2018), Deep Learning: A Critical Appraisal.

[Lecun 2022] LeCun, Y. (2022), A Path Towards Autonomous Machine Intelligence (Version 0.9.2). https://openreview.net/pdf?id=BZ5a1r-kVsf

Exercises

Exercise 10.1. Which of the following statements is false?
(a) Knowledge graphs have been an essential ingredient to the success of personal assistants.
(b) Machine learning is indispensable for creating large-scale modern knowledge graphs.
(c) Knowledge graphs will eventually be unnecessary for the success of AI applications.
(d) Knowledge graphs significantly expand the inferences possible using natural language.
(e) Technology is now available to create rudimentary knowledge graphs from images.

Exercise 10.2. Which of the following is out of scope of graph data science?
(a) Transaction management
(b) Feature engineering
(c) Visual Analytics
(d) Blockchain
(e) Semantic transformation of logical expressions

Exercise 10.3. Which of the following is true about how knowledge graphs might relate to the future of AI?
(a) Property graphs provide a sufficient representation for us to build future intelligent applications.
(b) Expressive logic-based representations are what we need for future intelligent applications.
(c) Some explicit representation similar to knowledge graphs is required, but exactly what is needed is open for future research.
(d) Future AI techniques will rely less and less on explicit representation.
(e) The goal of AI should be to eliminate the need for any representation.