Classification with KNN
KNN (k nearest neighbours) is a simple supervised machine learning model, used mainly for classification and regression problems.
In supervised learning, a large amount of labelled data determines the result. To understand this, imagine you are a doctor who wants to identify a cracked bone. During medical training you first study many X-rays of broken bones; only then can you tell whether a new X-ray shows a cracked bone or not. KNN is similar: you train the model on previous data and then test it on separate test data. The accuracy on the test data tells you how well you can expect the model to classify future data.
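The train-then-test idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 2-D points and the "ok"/"cracked" labels are made up (real X-rays are not 2-D points), and the helper names `knn_predict` and `accuracy` are hypothetical, not from any library.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # train is a list of (features, label) pairs
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

def accuracy(train, test, k=3):
    """Fraction of test points whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(test)

# Made-up data: two well-separated groups (an assumption for illustration).
train = [((1.0, 1.0), "ok"), ((1.2, 0.8), "ok"), ((0.9, 1.3), "ok"),
         ((4.0, 4.2), "cracked"), ((4.1, 3.9), "cracked"), ((3.8, 4.0), "cracked")]
# Held-out points with known labels, never used for training.
test = [((1.1, 1.1), "ok"), ((4.0, 4.0), "cracked")]

print(accuracy(train, test, k=3))  # 1.0 on this toy data
```

The test points are kept out of the training list on purpose; measuring accuracy on data the model has already seen would overstate how well it handles future data.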
How classification works:
We label data according to its attributes, or features. For example, suppose you have to decide whether the environment is hot or cold. You need an attribute such as temperature: if temp < 10 it is cool, if 15 < temp < 25 it is warm, and if temp > 25 it is hot.
The feeling is therefore a function of the temperature: y = f(x), where y is the class label and x is the feature. The whole job of the model is to find this function.
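Written by hand, the function y = f(x) for the temperature example might look like the sketch below. The function name `feeling` is hypothetical, and the thresholds are copied from the text as given; the text assigns no label to values such as 10 to 15, so the sketch returns None there rather than inventing one.

```python
def feeling(temp):
    """Map a temperature x to a class label y, using the thresholds from the text."""
    if temp < 10:
        return "cool"
    if 15 < temp < 25:
        return "warm"
    if temp > 25:
        return "hot"
    # No label is given in the text for the remaining values (e.g. 10 to 15).
    return None

for t in (5, 20, 30, 12):
    print(t, feeling(t))
```

Here the rule is hard-coded by a person. A classifier's task is to recover a rule like this from labelled examples instead.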
Classification vs Regression:
The difference is simple. Classification means the answer comes from a limited set of class labels. For example, if the question is whether I like a pizza store, the answer is either "I like it" or "I don't like it"; there is no third option. Another example is the MNIST data set, where every image is labelled with one of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 and nothing outside this set.
In regression there is no finite set of class labels. The target is a quantity such as height, weight, or temperature, so each y_i is a real number.
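The contrast can be made concrete with the nearest-neighbour rule this article is about: for classification the neighbours vote on a label, while for regression a common choice is to average the neighbours' numeric targets. The sketch below assumes that averaging variant; the data and the helper names (`nearest`, `knn_classify`, `knn_regress`) are made up for illustration.

```python
import math
from collections import Counter

def nearest(train, query, k):
    """Return the targets of the k training points closest to `query`."""
    ranked = sorted(train, key=lambda item: math.dist(item[0], query))
    return [target for _, target in ranked[:k]]

def knn_classify(train, query, k=3):
    # Classification: the answer is one label from a finite set.
    return Counter(nearest(train, query, k)).most_common(1)[0][0]

def knn_regress(train, query, k=3):
    # Regression: the answer is a real number, here the mean of the neighbours' targets.
    values = nearest(train, query, k)
    return sum(values) / len(values)

# Made-up 1-D data: the same feature, once with class labels and once with numeric targets.
labelled = [((1.0,), "like"), ((1.5,), "like"), ((2.0,), "like"),
            ((8.0,), "dislike"), ((9.0,), "dislike")]
numeric = [((1.0,), 10.0), ((1.5,), 12.0), ((2.0,), 14.0),
           ((8.0,), 40.0), ((9.0,), 44.0)]

print(knn_classify(labelled, (1.2,)))  # like
print(knn_regress(numeric, (1.2,)))    # 12.0
```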
Classification technique:
KNN means k nearest neighbours:
Put simply, KNN takes the k nearest neighbours of a query point, lets them vote, and declares the majority label as the prediction.
For example:
As shown in the figure, consider two groups, green and red; the black point is the query xq. To predict the label of the black point we use 5-NN (with two classes, k should be an odd number so that the vote cannot end in a tie). Among the 5 nearest neighbours we have 4 green and 1 red, so by majority vote xq is green.
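A minimal sketch of this 5-NN vote in plain Python follows. The coordinates are made up and arranged so that the 5 nearest points are 4 green and 1 red, mirroring the example; Euclidean distance is an assumption, since the text does not name a distance measure, and `knn_vote` is a hypothetical helper name.

```python
import math
from collections import Counter

def knn_vote(points, query, k=5):
    """Return (predicted label, vote counts) from the k points nearest to `query`."""
    # Rank all labelled points by Euclidean distance to the query.
    ranked = sorted(points, key=lambda p: math.dist(p[0], query))
    # Count the labels of the k closest points and pick the most frequent one.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0], dict(votes)

# Made-up coordinates for the two groups.
points = [
    ((1.0, 1.0), "green"), ((1.5, 1.2), "green"), ((0.8, 1.6), "green"), ((1.4, 0.6), "green"),
    ((2.2, 1.8), "red"), ((5.0, 5.0), "red"), ((5.5, 4.6), "red"), ((4.8, 5.4), "red"),
]
xq = (1.3, 1.2)  # the query point

label, votes = knn_vote(points, xq, k=5)
print(label, votes)  # green {'green': 4, 'red': 1}
```

Sorting every point is the simplest way to find the neighbours and is fine for small data; larger data sets usually call for a spatial index, which is outside the scope of this sketch.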
Failure cases of KNN:
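As shown in fig. (a), the data is well separated, but the query point xq is far away from both groups, so the prediction may be wrong.
As shown in fig. (b), if the data is jumbled, with the classes mixed together, KNN will also fail, because the nearest neighbours no longer point to one clear label.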
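One way to notice both situations in code is to report, next to the prediction, how far away the nearest neighbour is and how large the winning share of the vote was. This is a sketch of one possible diagnostic, not part of the basic KNN rule; the data, the helper name `knn_with_diagnostics`, and any threshold you would apply to its outputs are assumptions.

```python
import math
from collections import Counter

def knn_with_diagnostics(points, query, k=5):
    """Return the predicted label, the winning vote share, and the nearest-neighbour distance."""
    ranked = sorted(points, key=lambda p: math.dist(p[0], query))[:k]
    votes = Counter(label for _, label in ranked)
    label, count = votes.most_common(1)[0]
    nearest_dist = math.dist(ranked[0][0], query)
    return label, count / len(ranked), nearest_dist

# Case (a): two clean groups, but the query is far from both of them.
separated = [((0, 0), "green"), ((0, 1), "green"), ((1, 0), "green"),
             ((10, 10), "red"), ((10, 11), "red"), ((11, 10), "red")]
print(knn_with_diagnostics(separated, (50, -40), k=3))

# Case (b): the labels are mixed together, so the vote is close to a coin flip.
jumbled = [((0, 0), "green"), ((0, 1), "red"), ((1, 0), "green"),
           ((1, 1), "red"), ((0.5, 0.5), "green"), ((2, 2), "red")]
print(knn_with_diagnostics(jumbled, (0.5, 0.6), k=5))
```

In case (a) the nearest neighbour is more than 60 units away while points within a group are about 1 unit apart, and in case (b) the winning label takes only 3 of 5 votes. Either signal suggests the prediction should not be trusted much.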