Chapter 4

The Information Flow

AI development is essentially the process of collecting and organizing information. Data is collected, its meaning is extracted as pieces of information, and those pieces are then structured into a format from which the knowledge they represent can later be learned.

We define three types of data:

  • Structured data – think of well-defined tables
  • Semi-structured data – configuration files, XML, or JSON
  • Unstructured data – human-consumable data such as images, videos, or documents
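
To make these three categories concrete, here is a minimal Python sketch (the field names and the image file are made up for illustration):

import json

# Structured data: a fixed schema, like a row in a database table.
patient_row = {"id": 42, "age": 37, "blood_pressure": 120}

# Semi-structured data: self-describing but flexible, like JSON or XML.
config = json.loads('{"model": "cat-detector", "threshold": 0.8}')

# Unstructured data: human-consumable bytes with no machine-readable schema.
with open("cat.jpg", "rb") as f:  # hypothetical image file
    raw_pixels = f.read()         # just bytes until someone interprets them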

We will dive into the details of these definitions later, but it is already evident that the structure of the data is critical for AI. We structure the data to build a skill or to answer a set of questions.

This data structuring process is often referred to as “unstructured data management”. However, the name is misleading, because what you are really doing is structuring information to represent knowledge or a skill. We call this structure of properties and their relations a knowledge base.

Figure 9 Information sampling

The Knowledge Base

Developing AI is essentially collecting different pieces of information (data pieces with some meaningful interpretation) into a Knowledge Base (KB). The KB stores the different pieces of information along with their correlations and relationships. Creating a digital (machine- and AI-friendly) KB is very hard. Some might argue it is even impossible, since many of the things around us are very hard to define in a formal way that can be stored in modern databases.
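
One common way to hold such pieces and their relations is as subject-relation-object facts. Here is a minimal sketch (the facts and the helper function are illustrative, not a real KB engine):

# A tiny KB: facts stored as (subject, relation, object) triples.
kb = [
    ("cat", "is_a", "mammal"),
    ("cat", "has", "fur"),
    ("mammal", "is_a", "animal"),
]

def related(subject, relation):
    """Return every object linked to subject by relation."""
    return [o for s, r, o in kb if s == subject and r == relation]

print(related("cat", "has"))  # ['fur']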

Can you define a cat using a list of yes/no questions? 

Try doing this exercise in your head. You will quickly gain an intuition for the challenges of KBs, and for why these inherently incomplete systems are hard to develop and maintain.
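
To feel the problem in code form, here is a toy attempt (the questions and the example are made up) at defining a cat with yes/no rules:

# Each rule is a (question, expected_answer) pair; an entity counts as a cat
# only if every answer matches.
cat_rules = [
    ("has_fur", True),
    ("has_four_legs", True),
    ("meows", True),
]

def is_cat(entity):
    return all(entity.get(q) == a for q, a in cat_rules)

# A counterexample shows why the definition leaks:
sphynx = {"has_fur": False, "has_four_legs": True, "meows": True}
print(is_cat(sphynx))  # False -- a real (hairless) cat that our rules reject

Every rule you add to patch one counterexample tends to break on another, which is exactly why these systems stay incomplete.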

You might think KBs are a new concept, but we have had KBs for thousands of years in the form of books and libraries; we just used human language to represent them and paper to store them.

Let us look at a very simplified flow of information as it aggregates into the knowledge base of medical treatment:

We have been collecting medical information for thousands of years and have already established a medical information flow that constantly updates the human health knowledge base:

Figure 10 The human knowledge distribution flow

When you talk to your family doctor, the doctor represents all human medical knowledge relevant to your questions. Naturally, a single human cannot hold all of that information, so after a few questions and exams (which are just other forms of questions), your family doctor will either give you the answer or redirect you, based on your answers and results, to the next level of expertise: an expert doctor with more specific knowledge about your condition.

Expert doctors might in turn consult the literature and other experts, in some cases going all the way up to a global specialist.

Even for medical information, a domain that is a clear priority for us, we do not maintain a single knowledge base. Rather, the knowledge is distributed all around us. Whether a single KB for everything is even possible is an open question.

The above information flow between different knowledge levels exists in every domain. Every human topic has an information flow with knowledge aggregation, storage, and maintenance. AI is about capturing this information in a way that allows us to teach machines with this knowledge.

As of today, this knowledge is accessible only to humans.  

AI development is about capturing the complex information around us in a structured format, in the context of the cognitive tasks we wish to automate.

Rule #4: An AI model is an information container

It is now time to gain a better sense of this information before we decide how to structure and manage it.

The AI Signal, Noise, and Channel

Once we understand that AI is information captured from our human knowledge base into machines, the next logical piece follows. Every exchange of information is a communication process. Therefore, AI learning is a communication process – a process in which several parties exchange information with each other.


Every communication process is defined by:

  • Signal: A measurable change that carries some meaning (information). The meaning is what separates the signal from the noise, and it is subjectively defined by the signal's observer.
  • Noise: Random, unwanted modifications to the signal. These can happen while capturing, storing, transmitting, processing, or converting the signal.
  • Channel: The medium over which the information travels from the sender to the receiver.

Let us map these onto our cat detection task:

  • Signal: The change in the image pixels that represents a cat, marked by the cat annotation.
  • Noise: The non-cat parts of the image.
  • Channel: The image file.

Who are the parties in the communication process?  

  • The sender: The human, who marks the cat and sends this message (the labeled cat image) to the AI model.
  • The receiver: The model, which decodes the message and uses it to “learn” a bit more about a cat's properties.
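
As a minimal sketch of what one such message might look like in code (the class and field names are hypothetical):

from dataclasses import dataclass

@dataclass
class LabeledImage:
    pixels: bytes      # the channel payload: the raw image file
    cat_box: tuple     # the signal: (x, y, width, height) marking the cat
    # for this task, everything in pixels outside cat_box is noise

# The sender (a human labeler) encodes their knowledge as the annotation...
message = LabeledImage(pixels=b"...", cat_box=(12, 30, 64, 48))
# ...and the receiver (the training process) decodes it, adjusting the model
# so that similar pixel patterns map to "cat".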


Rule #5: AI development is a communication process 

Labeling As Information Transmission

Labeling is the process by which a human data labeler encodes information on top of the data in the form of annotations. Later, during the training process, this information is automatically decoded back into the neural network. The human encodes the message using their existing knowledge about the world, summarizing (compressing) that knowledge into data annotations.

The knowledge and experience humans acquire over years of learning (through formal or informal education) is transferred to the AI bot via the labeled data examples. Examples without labels are simply meaningless data points. Once data is labeled, it turns into context-based information units.

In machine learning development, the people holding this knowledge are often referred to as “domain experts”. Some domain expertise, like identifying a cat, is common; other expertise, like identifying sick orange leaves or breast cancer cells, is acquired by farmers or doctors over a lifetime of professional experience. These are the experts from which our expert system learns.

The above is the fulfillment of the fundamental business promise of AI.

Domain experts can transfer their skills to bots, creating labor automation across verticals and markets. Expert knowledge transfer is a communication process over labeled data, where each example carries a bit more information about the learned skill.
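
The phrase “a bit more information” can be read almost literally. As a rough sketch (assuming a simple binary cat/no-cat label), Shannon entropy bounds how much information a single label can carry:

import math

def entropy(p):
    """Entropy in bits of a yes/no label, where p is P(cat)."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

print(entropy(0.5))   # 1.0   -- a balanced label carries a full bit
print(entropy(0.99))  # ~0.08 -- a nearly certain label tells the model little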


To summarize, neural networks are a great tool for transferring information from humans to machines, yet do not let the “buzz” make you think anything intelligent happens inside a neural network. It is more like a “monkey see, monkey do” tool. Intelligence has been, and still is, a biological quality, found ONLY in living things.

I still find myself surprised that after close to a decade with these technologies, the core process of AI, domain knowledge transfer from human experts to machines through labels, is ignored by an entire industry that spends most of its time today on cooler algorithms, faster processors, or labeling-free AI. There is no human-labeling-free AI, nor is there expected to be anytime soon.

ML Pointer 

Shannon’s noisy channel model is a useful template for a deeper analysis of the labeling process as a communication process. Here is a chart that I find useful in understanding this:

Think of your model as a decoder trying to reconstruct a lossily compressed signal that humans encoded as labels. If that sounds right, then a world of research opportunities opens up: from sampling-method analysis, to the mapping function, to defining the source and destination spaces.
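
As a toy illustration of that framing (the flip rate and function names are made up), here is a minimal simulation in which labels pass through a noisy channel, and redundancy across many examples lets the decoder recover the signal:

import random

def encode(truth):
    """The human labeler encodes the ground truth as a label."""
    return truth

def channel(label, flip_rate=0.05):
    """The channel corrupts some labels at random."""
    return (not label) if random.random() < flip_rate else label

def decode(received):
    """The model, acting as decoder, recovers the signal by majority vote."""
    return sum(received) > len(received) / 2

# Redundancy (many labeled examples) lets the decoder beat the noise:
sent = [channel(encode(True)) for _ in range(101)]
print(decode(sent))  # almost always True, despite the 5% flip rate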

This book is based on these ideas and their implications.

Next Chapters

Chapter 5

The training process is the process in which we take our training set, i.e., the collection of data examples we have gathered, and create a model that learns from these examples. We call this “training” and not “coding” since the model is created automatically from our data, with no coding involved. The result of a training session is code we can then run to predict the learned properties on new data.

Chapter 6

While bias and variance are usually discussed by data scientists and ML experts, understanding them requires no technical skills and is critical for anyone working on data-driven products. After all, these are the data-modeling bugs that will hurt our users' experience and our product's competitiveness. Time to gain a deeper intuition for these concepts. No worries, you will understand them without a single equation involved.

Chapter 8

It is very popular to talk about machine learning these days while ignoring the teachers in this learning process. Time to discuss the machine learning process from its less common perspective: as a teaching process.

Chapter 9

We are preparing to launch our AI app. We have basic models that are functional, we have agreed with the pilot customer on a calibration period that allows our models to adjust to fresh data, and the data interfaces (APIs) with customers have been defined. In this chapter, we will dive into the preparation and planning needed for launching and scaling our app deployment.