Neural networks are a family of machine learning algorithms (methods and software). As such, they differ from traditional expert systems in that they derive their own knowledge from examples, with no need to hand-code ad hoc rules for them to learn.

Neural networks are composed, basically, of two elements: neurons (computational units) and the connections between them. Depending on the problem to be solved, we have to choose a type of neuron and a connection layout. We can understand this process as the equivalent of choosing, according to their characteristics, the individual with the most potential to solve our problem.
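To make the two building blocks concrete, here is a minimal Python sketch of a neuron and of a small set of connections. It is an illustration only: the sigmoid activation, the layer sizes and every weight value are made-up assumptions, not something taken from a real system.

```python
import math

def neuron(inputs, weights, bias):
    """One computational unit: a weighted sum of its inputs passed through a sigmoid."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def layer(inputs, weight_rows, biases):
    """Several neurons reading the same inputs; each row of weights holds
    the connections that feed one neuron."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Illustrative (made-up) weights: 2 inputs -> 2 hidden neurons -> 1 output neuron.
hidden = layer([0.5, -1.0], [[1.0, 2.0], [-1.0, 0.5]], [0.0, 0.1])
output = neuron(hidden, [1.5, -2.0], 0.2)
```

Choosing a different activation function changes the type of neuron; changing which outputs feed which neurons changes the connection layout.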

 

With our candidate chosen and its neurons designed and interconnected, the next step is to exploit its potential, i.e. to teach it. To do this we have to choose a learning and correction method. This stage also bears some similarity to how a cognitive being is taught. In general, we teach a neural network empirically: we show it examples, ask for its verdict on each one, and finally show it the response we consider correct so that it can correct itself in case of any discrepancy (unlike expert systems, which must be loaded with knowledge beforehand if we want them to execute actions or give verdicts).
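The show, ask and correct cycle described above can be sketched with the classic perceptron learning rule applied to a single neuron. This is just one simple learning method picked for illustration; the function names, the learning rate and the toy data set (the logical AND of two inputs) are assumptions of this sketch.

```python
def predict(weights, bias, x):
    """Ask the neuron for its verdict: 1 if the weighted sum is positive, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

def train(examples, epochs=20, lr=1.0):
    """Perceptron rule: after each example, nudge the weights by the discrepancy
    between the expected answer and the verdict the neuron gave."""
    weights, bias = [0.0] * len(examples[0][0]), 0.0
    for _ in range(epochs):
        for x, target in examples:
            error = target - predict(weights, bias, x)  # 0 when the verdict was right
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Toy examples (inputs, expected verdict): the logical AND of two inputs.
examples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train(examples)
verdicts = [predict(weights, bias, x) for x, _ in examples]
```

No rule about AND is ever written down: the weights that reproduce the expected verdicts emerge from the corrections alone.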

 

If during its learning we send the neural network confusing signals or do not teach it properly how to correct its mistakes, we will obtain a hesitant and error-prone individual. If, on the other hand, we show it an unambiguous way of correcting each mistake and we maintain a minimum level of consistency in the verdicts we expect from it, we will have a robust system capable of dealing with uncertainty and giving accurate answers.

 

So, in addition to considering which neurons we will use and how we connect them, we must consider how we will instruct the system. By defining these three aspects we will obtain a fully functional network, able to solve problems as varied as steering a drone around obstacles or transcribing conversations automatically.

 

 

 

 

 

Especially in highly complex environments, the ability to develop its own rules becomes indispensable. Reviewing the list of examples above, we can imagine how unfeasible it would be to write specific rules covering the random obstacles that a drone might find in its path. The number of scenarios in which the device might find itself, the ways of negotiating each situation, and the weather conditions it would have to deal with would require a huge set of rules to guide its movement effectively. The same thing happens when we consider transcribing conversations automatically. The variety of intonations, timbres, noises and cadences that we can find is such that it would not be feasible to consciously construct a set of specific rules to govern each situation, and it would be even less realistic to expect the result to be effective.

At Serimag Media we know the advantages of these types of tools and algorithms. Their implementation in recent years in different processes of the Automatic Document Processing (TAAD) platform has allowed a significant increase in the automation, reliability and accuracy of our production processes.

A clear case is the localization of relevant tax data. Here, under optimal digitization conditions, traditional expert systems are able to produce reasonable results, but at the expense of certain disadvantages. For example:

 

[Image: tax form excerpt showing the declarant's NIF]

 

In this case the aim is to capture the NIF (tax ID) of the declarant. A traditional expert system could search for the word “NIF” and then move to certain coordinates relative to it, where the desired data is expected to be.

In this other case we want to find the tax year. Another approach for a conventional expert system would be to look for a visually distinctive box (in this case, one with a blue outline) and, once the box is located, to move from it to another position to find the year:

 

[Image: tax form excerpt showing the blue-outlined box used to locate the tax year]

 

These methods, although effective in theory, conflict with current digitization practice, where decentralization prevails and quality control is therefore harder to enforce. Expert systems depend heavily on homogeneous scan quality. If the quality is low or heterogeneous, the word “NIF” in the first example could be illegible, jeopardizing the capture of the corresponding data. Similarly, in the second example, the box used as an anchor for the capture could be affected by shadows, rotation, rescaling or brightness changes that would hinder localization of the box and of the data that depends on it.
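The fragility of an anchor-based rule can be seen in a small Python sketch. Everything in it is hypothetical: the Token structure stands in for the output of some OCR step, the coordinates and tolerances are invented, and the NIF value is made up. It is not the rule used by any particular expert system.

```python
from typing import List, NamedTuple, Optional

class Token(NamedTuple):
    """Hypothetical OCR output: a recognized word and its position on the page."""
    text: str
    x: float
    y: float

def capture_nif(tokens: List[Token], max_dx: float = 200.0,
                max_dy: float = 10.0) -> Optional[str]:
    """Hand-written rule: find the keyword 'NIF', then take the nearest token
    to its right on roughly the same line."""
    anchor = next((t for t in tokens if t.text.upper() == "NIF"), None)
    if anchor is None:
        return None  # the anchor was not recognized, so the rule has nothing to go on
    candidates = [t for t in tokens
                  if 0 < t.x - anchor.x <= max_dx and abs(t.y - anchor.y) <= max_dy]
    if not candidates:
        return None
    return min(candidates, key=lambda t: t.x - anchor.x).text

# Made-up pages: the same form scanned cleanly, and scanned badly enough
# that the OCR step reads "NIF" as "N1F".
clean_scan = [Token("NIF", 40, 100), Token("12345678Z", 120, 101), Token("2016", 400, 30)]
poor_scan = [Token("N1F", 40, 100), Token("12345678Z", 120, 101), Token("2016", 400, 30)]

print(capture_nif(clean_scan))  # 12345678Z
print(capture_nif(poor_scan))   # None
```

In the second scan the value itself is perfectly legible, yet the capture fails because the rule depends entirely on recognizing the anchor first.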

Neural networks, by contrast, learn from real documentation, with its imperfections and deviations, so they adapt to the customer's digitization scenario and react in a much more satisfactory and robust way to the disturbances that scanning may cause. Neural networks can learn to localize the data without relying on another specific capture, region or box. Instead, they analyse the appearance of the entire page and autonomously determine which elements are most relevant for deducing the location of the data to be captured. Thus, the same neural network is able to work with several types of pages, applying different strategies depending on the page in question, without needing to be told which type of page it is looking at. All this ultimately results in flexible, highly reliable systems and a greater competitive advantage for the client.

We must also take into account that, in many cases, the documentation we are dealing with is generated not by the client but by third parties not involved in the operation to be evaluated. This makes it increasingly likely that the documents will change, and keeping track of these variations requires even closer monitoring of the process. Faced with a change of this type, an expert system would require an analyst to review the defined anchor points and keywords in order to capture the data again. With TAAD it is sufficient to re-train the system with new, manually processed examples so that the rules regenerate themselves, making updates much simpler and more efficient.

At Serimag Media we have acquired the necessary know-how for the implementation of techniques based on machine learning with which to satisfy all the expectations and needs of our customers: efficient implementation, precision and adaptability to the particularities of each process.

 

Author: Daniel Garrido, Project Manager at Serimag

www.serimag.com