Basics of Machine Learning

22 Mar 2022

1. Artificial Intelligence vs Machine Learning vs Deep Learning


Machine learning, a subset of Artificial Intelligence

Artificial intelligence (AI), machine learning, deep learning: these are terms we inevitably run into in this era of technology. They may seem unrelated, but in fact they are nested in the form ‘Artificial intelligence ⊃ Machine learning ⊃ Deep learning’. AI refers broadly to computers that can learn and make decisions by themselves, without step-by-step human help. Machine learning, a subset of AI, is a methodology that makes such self-training possible.

To elaborate, machine learning lets computers make decisions without explicitly coded rules, by giving them data and algorithms as training material. No specific manual for each decision needs to be written. However, because the steps before and after training still involve a certain amount of coding, machine learning is best understood not as a synonym for AI in the broad sense, but as a methodology designed to achieve it.

Deep learning is a subset of machine learning and a more advanced form of it. Its defining characteristic is that it learns from data through many layers of neural networks, rather than merely following a path laid out in advance by hand-designed rules.


Time Series Analysis, one of DAVinCI LABS modules

DAVinCI LABS is a solution that maximizes convenience by automating machine learning. It offers machine learning algorithms, including deep learning, in a form that requires minimal user intervention. Because most training steps apart from data preparation are automated, users can run the solution with just a few clicks.

2. Algorithm and Model

As mentioned earlier, machine learning is a methodology in which computers are given data and algorithms to learn from. However, the concept most commonly associated with machine learning is probably the ‘model’ rather than the ‘algorithm’. What exactly is the difference between the two? Let us clarify both concepts.

An algorithm is a set of calculation methods and rules that specifies, in order, the steps needed to solve a problem. In machine learning, then, the algorithm is the procedure the computer follows to learn from data and arrive at a conclusion called a ‘prediction’.

So what exactly is a ‘model’? It can be expressed as the intuitive equation below.

Data + Algorithm = Model

In other words, a model is an algorithm that has completed training on data. The algorithm and the data are the two components that combine to produce the model, and the predictions generated by that model are what we ultimately want to get out of machine learning.
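The equation above can be illustrated with a minimal Python sketch. It assumes a deliberately simple algorithm (a least-squares line fit) and a tiny made-up dataset; the names `fit_line`, `xs`, and `ys` are hypothetical and chosen only for illustration.

```python
from typing import Callable, List


def fit_line(xs: List[float], ys: List[float]) -> Callable[[float], float]:
    """Algorithm: least-squares fit of a straight line to one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    def model(x: float) -> float:
        # The model is the algorithm's result after it has seen the data.
        return slope * x + intercept

    return model


# Data: hypothetical values that follow y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

# Data + Algorithm = Model
model = fit_line(xs, ys)

# The prediction is what we ultimately want from the model.
prediction = model(5.0)
print(prediction)
```

The function `fit_line` plays the role of the algorithm, the two lists are the data, and the returned `model` is what remains once training is finished.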

Let us apply this idea to DAVinCI LABS: once the user uploads the prepared data, a variety of machine learning algorithms are trained on it to produce models, and the user then picks the most promising model to use for prediction.
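The idea of training several algorithms on the same data and keeping the most promising model can be sketched as follows. This is not how DAVinCI LABS works internally; it is a minimal illustration that assumes two toy algorithms (`fit_mean` and `fit_line`, both hypothetical), made-up data, and mean squared error on held-out data as the selection criterion.

```python
from typing import Callable, Dict, List

Model = Callable[[float], float]


def fit_mean(xs: List[float], ys: List[float]) -> Model:
    """Baseline algorithm: always predict the average training target."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs: List[float], ys: List[float]) -> Model:
    """Least-squares fit of a straight line to one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mean_squared_error(model: Model, xs: List[float], ys: List[float]) -> float:
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data, split into a training part and a held-out validation part.
train_x, train_y = [1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8]
valid_x, valid_y = [5.0, 6.0], [10.1, 11.9]

algorithms: Dict[str, Callable[[List[float], List[float]], Model]] = {
    "mean_baseline": fit_mean,
    "linear_fit": fit_line,
}

# Train every algorithm on the same data to obtain one model per algorithm.
models = {name: fit(train_x, train_y) for name, fit in algorithms.items()}

# Score each model on data it has not seen, then keep the best one.
scores = {name: mean_squared_error(m, valid_x, valid_y) for name, m in models.items()}
best_name = min(scores, key=scores.get)
best_model = models[best_name]
print(best_name, scores)
```

Scoring on held-out data rather than on the training data is a common design choice, since it rewards models that generalize instead of ones that merely memorize.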


Which algorithms should be combined to create the best model?

3. Supervised Learning vs Unsupervised Learning

The two previous sections showed that machine learning can be defined as training a computer on data. Just as humans learn in different ways, the ways computers are trained can be categorized as well. In general, they are divided into supervised and unsupervised learning. Let’s take a look at each of them.

Consider how we humans learn. There are many ways for us to learn things, but in most cases our actions are guided by past experiences and records. If there is no precedent, we learn from the trial and error of the moment and form a new record of our own.

An experience, once undergone, is recorded along with its result. Once that record has been learned, the outcome of future cases that resemble the experience can be predicted from it. However, knowledge cannot be based solely on past references. A completely new piece of knowledge comes with no right or wrong answer attached, so sifting through the newly given data is itself a means of learning.

The same applies when a computer is trained. Under the assumption ‘knowledge = data’, if the data contains a history of past outcomes, the computer can learn from that record and produce a sensible conclusion in a similar situation. This is called ‘Supervised Learning’. As the name suggests, the recorded outcomes supervise the training, so that future outcomes can be predicted from previous records.

Conversely, what if the data has no record of past outcomes? Since the computer does not have the correct answers, the goal becomes to find patterns in the data itself and group similar samples into clusters. This is what we call ‘Unsupervised Learning’.
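A minimal sketch of supervised learning, assuming a one-nearest-neighbor rule and a small made-up history of labeled records. The feature values, the labels, and the function name `predict_from_history` are all hypothetical.

```python
from typing import List, Sequence, Tuple

Record = Tuple[Sequence[float], str]  # (features, known outcome)


def predict_from_history(history: List[Record], query: Sequence[float]) -> str:
    """Predict the outcome of a new case from the most similar past record."""

    def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((p - q) ** 2 for p, q in zip(a, b))

    _, label = min(history, key=lambda record: squared_distance(record[0], query))
    return label


# Hypothetical history: (normalized income, normalized debt) with known outcomes.
history: List[Record] = [
    ((0.9, 0.1), "repaid"),
    ((0.8, 0.2), "repaid"),
    ((0.2, 0.8), "defaulted"),
    ((0.3, 0.9), "defaulted"),
]

# New cases whose outcomes are not yet known.
first_prediction = predict_from_history(history, (0.7, 0.3))
second_prediction = predict_from_history(history, (0.25, 0.7))
print(first_prediction, second_prediction)
```

The known outcomes stored in `history` are what make this supervised: every prediction is copied from a past record whose answer was already recorded.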
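A minimal sketch of unsupervised learning, assuming k-means on one-dimensional, made-up values. No outcomes are provided; the function `kmeans_1d` (a hypothetical name) only groups values that lie close together. It uses a simple deterministic initialization and assumes `k` is at least 2.

```python
from typing import List, Tuple


def kmeans_1d(values: List[float], k: int = 2, iterations: int = 10) -> Tuple[List[float], List[List[float]]]:
    """Group one-dimensional values into k clusters (requires k >= 2)."""
    lowest, highest = min(values), max(values)
    # Deterministic start: spread the initial centers evenly over the value range.
    centers = [lowest + (highest - lowest) * i / (k - 1) for i in range(k)]
    clusters: List[List[float]] = [[] for _ in range(k)]

    for _ in range(iterations):
        # Assignment step: each value joins the cluster with the nearest center.
        clusters = [[] for _ in range(k)]
        for value in values:
            nearest = min(range(k), key=lambda i: abs(value - centers[i]))
            clusters[nearest].append(value)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        centers = [
            sum(cluster) / len(cluster) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]

    return centers, clusters


# Hypothetical unlabeled data with two visible groups.
values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, clusters = kmeans_1d(values, k=2)
print(centers, clusters)
```

The output is a grouping of the data rather than a predicted value, which is the key difference from the supervised case.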


Concept of Supervised and Unsupervised Learning

In machine learning, this process of grouping data with similar characteristics is called ‘clustering’. The model creation process described earlier is referred to as modeling. Whereas modeling designates a target and makes an algorithm-based prediction of it, the output of clustering is not a predicted value but a grouping of the data.

In machine learning generally, clustering tends to be limited to unsupervised learning. That is, it is used to group data and identify the characteristics of each cluster, without any specific criterion or target. DAVinCI LABS implements this through a module called ‘Auto Clustering’. Supervised learning clustering, on the other hand, first sets a criterion for grouping the data and starts the whole procedure from there. For example, we could define the criterion as whether a client is likely to repay a loan in full, group the clients with high repayment rates into a cluster, and then examine what characteristics that cluster has in common.
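The loan repayment example can be sketched in a few lines of Python. This is only an illustration of grouping by a chosen criterion and then summarizing each group; it does not reproduce any DAVinCI LABS module, and the client records, field names, and the helper `describe_group` are hypothetical.

```python
from typing import Dict, List, Union

Client = Dict[str, Union[float, bool]]


def describe_group(rows: List[Client]) -> Dict[str, float]:
    """Summarize the size and average characteristics of one group."""
    size = len(rows)
    return {
        "size": size,
        "avg_income": sum(row["income"] for row in rows) / size,
        "avg_debt_ratio": sum(row["debt_ratio"] for row in rows) / size,
    }


# Hypothetical client records; "repaid" is the criterion chosen for grouping.
clients: List[Client] = [
    {"income": 60.0, "debt_ratio": 0.2, "repaid": True},
    {"income": 80.0, "debt_ratio": 0.1, "repaid": True},
    {"income": 70.0, "debt_ratio": 0.3, "repaid": True},
    {"income": 30.0, "debt_ratio": 0.7, "repaid": False},
    {"income": 40.0, "debt_ratio": 0.6, "repaid": False},
]

# Group by the criterion first, then look at what each group has in common.
repaid_group = [client for client in clients if client["repaid"]]
other_group = [client for client in clients if not client["repaid"]]

repaid_summary = describe_group(repaid_group)
other_summary = describe_group(other_group)
print(repaid_summary)
print(other_summary)
```

In this toy data the repaying group shows a higher average income and a lower average debt ratio, which is the kind of shared characteristic the grouping is meant to surface.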



Supervised Learning Clustering: Rule Generation

DAVinCI LABS implements this form of supervised learning clustering through a module named “Rule Generation”. The module lays out the size and characteristics of each cluster clearly. In future posts, we will see how this module can be put to use in real business settings.