Hello

My name is Thales Sehn Körting, and I will present very briefly how the kNN algorithm works.

kNN means k nearest neighbors.

It is a very simple algorithm. Given N training vectors, suppose we have all these ‘a’ and ‘o’ letters as training vectors in this bidimensional feature space.

The kNN algorithm identifies the k nearest neighbors of ‘c’, which is another feature vector whose class we want to estimate.

In this case, it identifies the nearest neighbors regardless of their labels.

So, suppose in this example we have k equal to 3, and we have the classes ‘a’ and ‘o’.

And the aim of the algorithm is to find the class for ‘c’

If k is 3 we have to find the 3 nearest neighbors of ‘c’

So, we can see that in this case the 3 nearest neighbors of ‘c’ are these 3 elements here.

We have 1 nearest neighbor of class ‘a’, and 2 elements of class ‘o’ which are near to ‘c’. That is 2 votes for ‘o’ and 1 vote for ‘a’.

In this case, the class of the element ‘c’ is going to be ‘o’.

This is, very simply, how the k nearest neighbors algorithm works.
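The voting procedure just described can be sketched in a few lines of Python. The training points and the query point below are made-up illustrative coordinates, not the actual positions from the slides:

```python
import math
from collections import Counter

def knn_classify(training, query, k=3):
    """Classify `query` by majority vote among its k nearest training vectors.

    `training` is a list of (feature_vector, label) pairs.
    """
    # Sort the training vectors by Euclidean distance to the query point
    neighbors = sorted(training, key=lambda item: math.dist(item[0], query))
    # Take the labels of the k closest ones and count the votes
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]

# The scenario from the presentation: classes 'a' and 'o', k = 3
training = [((1.0, 1.0), 'a'), ((2.0, 1.5), 'a'), ((6.0, 6.0), 'o'),
            ((5.5, 5.0), 'o'), ((6.5, 5.5), 'o')]
print(knn_classify(training, (4.0, 3.5), k=3))  # 2 votes for 'o', 1 for 'a' -> 'o'
```

With these coordinates the query’s three nearest neighbors split 2 to 1 in favor of ‘o’, matching the vote in the example above.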

Now, there is a special case of the kNN algorithm, which is when k is equal to 1.

In that case, we simply find the single nearest neighbor of the element, and that neighbor defines the class.

And to represent this feature space, each training vector x_i will define a region in the feature space.

A property that we have is that each region R_i is defined by the rule that the distance between an element x and x_i has to be smaller than the distance between x and every other training vector: R_i = { x : d(x, x_i) ≤ d(x, x_j) for all j ≠ i }.

In this case, it defines a Voronoi partition of the space. For example, this element ‘c’ and these elements ‘b’, ‘e’ and ‘a’ will define these very specific regions.

This is a property of the kNN algorithm when k is equal to 1: we define regions 1, 2, 3 and 4 based on the nearest neighbor rule.

Each element that falls inside this area will be classified as ‘a’, just as each element inside this area will be classified as ‘c’, and the same for region 2 and region 3, for classes ‘e’ and ‘b’ as well.
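The nearest neighbor rule for k = 1 can be sketched directly: every point of the space is labeled by the prototype whose Voronoi cell it falls in, which is simply the prototype at minimum distance. The four prototype positions below are hypothetical stand-ins for the ‘a’, ‘b’, ‘c’, ‘e’ training vectors on the slide:

```python
import math

# Hypothetical positions for the four training vectors of the example
prototypes = {'a': (0.0, 0.0), 'b': (4.0, 0.0), 'c': (0.0, 4.0), 'e': (4.0, 4.0)}

def nearest_neighbor_rule(x):
    """1-NN: label x with the prototype minimizing d(x, x_i),
    i.e. the prototype whose Voronoi region contains x."""
    return min(prototypes, key=lambda label: math.dist(prototypes[label], x))

print(nearest_neighbor_rule((0.5, 0.2)))  # inside the region of 'a'
print(nearest_neighbor_rule((3.9, 3.0)))  # inside the region of 'e'
```

Note that the Voronoi cells never need to be constructed explicitly; computing the minimum distance for each query is equivalent to looking up which cell contains it.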

Now I have just some remarks about kNN. We have to choose an odd value of k if we have a 2-class problem.

This happens because, with 2 classes, if we set k equal to 2, for example, we can have a tie.

What would the class be then? The majority class among the nearest neighbors?

So, we always have to set odd values of k for a 2-class problem.
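The tie that an even k can produce in a 2-class problem is easy to show with a small sketch (assuming a plain majority vote over the neighbor labels; the point coordinates are illustrative):

```python
import math
from collections import Counter

# Two classes, 'a' and 'o', arranged symmetrically around the query
training = [((0.0, 0.0), 'a'), ((2.0, 0.0), 'o'),
            ((0.0, 2.0), 'a'), ((2.0, 2.0), 'o')]

def vote(query, k):
    """Return the vote counts among the k nearest neighbors of `query`."""
    neighbors = sorted(training, key=lambda item: math.dist(item[0], query))
    return Counter(label for _, label in neighbors[:k])

# With k = 2 the two nearest neighbors can disagree: one vote each, a tie
print(vote((1.0, 0.1), k=2))  # Counter({'a': 1, 'o': 1}) -- no majority
# With k = 3 (odd, 2 classes) a strict 2-to-1 majority is guaranteed
print(vote((1.0, 0.1), k=3))
```

With an odd k and two classes, the vote can never split evenly, which is exactly the remark above.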

And also, the value of k must not be a multiple of the number of classes; this too is to avoid ties.

And we have to remember that the main drawback of this algorithm is the complexity of searching the nearest neighbors for each sample.

The complexity is a problem because, in the case of a big dataset, we will have lots of elements, and we will have to compute the distance from each element to the element that we want to classify.

So, for a large dataset, this can be a problem. Anyhow, the kNN algorithm produces good results.
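The cost just mentioned is easy to see in code: a brute-force search must compute all N distances for every query, so classifying one sample is linear in the size of the training set. This is a sketch with a synthetic random dataset (the sizes and labels are arbitrary):

```python
import math
import random

random.seed(0)
# A larger synthetic training set: N labeled points in 2-D
N = 10_000
training = [((random.random(), random.random()), random.choice('ao'))
            for _ in range(N)]

def brute_force_neighbors(query, k):
    """Exhaustive search: computes all N distances for every single query.
    This per-sample cost, linear in N (plus the sort), is the main drawback
    for large datasets; space-partitioning structures such as k-d trees
    can reduce the average query time."""
    distances = [(math.dist(vec, query), label) for vec, label in training]
    distances.sort()
    return distances[:k]

print(brute_force_neighbors((0.5, 0.5), k=3))
```

For repeated queries against a fixed training set, building an index such as a k-d tree once and querying it many times is the usual way to mitigate this cost.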

So, this is the reference I have used to prepare this presentation

Thanks for your attention, and this is, very briefly, how the kNN algorithm works.
