Thursday, August 1, 2019

Introducing Machine Learning!

             My dad had me do another free Python course on Udacity this month, and I've found it very enlightening. It opened my eyes to so many more Python possibilities through machine learning and the math behind it, and it showed me how to teach a computer program to learn from and adapt to data.
             One of the many things I learned from this course was the bias-variance tradeoff. A high-bias program mostly ignores its data, while a high-variance program sticks very strictly to its data. Neither property works when it's extreme. A completely biased program doesn't work because, when the computer doesn't use its data, its output has nothing to do with the patterns in that data. A program with extreme variance doesn't work because it follows its training data so strictly that it can't adapt to new situations. That's why a machine learning program needs a balance between bias and variance.
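             To see the idea for myself, here's a small sketch. It isn't from the course: the data is made up, and the names make_data, fit_mean, fit_nearest, fit_line, and mse are just ones I picked. It compares a model that ignores x (high bias), one that memorizes the training points (high variance), and a straight line in between.

```python
import random

random.seed(0)  # fixed seed so every run uses the same made-up data


def make_data(n):
    """Noisy points scattered around the line y = 2x + 1 (invented for this demo)."""
    points = []
    for _ in range(n):
        x = random.uniform(0, 10)
        y = 2 * x + 1 + random.gauss(0, 2)
        points.append((x, y))
    return points


def fit_mean(data):
    """High bias: always predict the average y and ignore x completely."""
    mean_y = sum(y for _, y in data) / len(data)
    return lambda x: mean_y


def fit_nearest(data):
    """High variance: copy the y value of the closest training point."""
    return lambda x: min(data, key=lambda p: abs(p[0] - x))[1]


def fit_line(data):
    """In between: an ordinary least-squares straight line."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mse(model, data):
    """Mean squared error of a model on a list of (x, y) points."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


train = make_data(30)
test = make_data(30)

for name, fit in [("mean", fit_mean), ("nearest", fit_nearest), ("line", fit_line)]:
    model = fit(train)
    print(name, "train error:", round(mse(model, train), 2),
          "test error:", round(mse(model, test), 2))
```

             The nearest-point model gets zero error on its own training data because it memorized it, but that doesn't carry over to the test points. The mean model is badly off on both sets because it never looks at x at all.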



              Another part of machine learning I learned about was decision trees. As the name suggests, the model branches out like a tree: each branch point asks a question about the data, and each path leads to a different outcome. It's useful for mapping out how a decision gets made, and it helps me consider different possibilities.
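             Here's a tiny decision tree I built by hand to show the branching. This is only a sketch: the questions, the answers, and the names tree and predict are made up. In real machine learning the tree is learned from data (for example with scikit-learn's DecisionTreeClassifier) instead of being typed in.

```python
# Each inner node asks a question about one feature; each leaf is a final answer.
# All features and answers here are invented for illustration.
tree = {
    "question": ("outlook", "sunny"),
    "yes": {
        "question": ("humidity", "high"),
        "yes": "stay in",
        "no": "play",
    },
    "no": {
        "question": ("windy", True),
        "yes": "stay in",
        "no": "play",
    },
}


def predict(node, sample):
    """Follow the branches until reaching a leaf (a plain string)."""
    while isinstance(node, dict):
        feature, value = node["question"]
        node = node["yes"] if sample[feature] == value else node["no"]
    return node


print(predict(tree, {"outlook": "sunny", "humidity": "normal", "windy": False}))  # play
print(predict(tree, {"outlook": "rainy", "humidity": "high", "windy": True}))     # stay in
```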

             I also learned the difference between continuous and discrete data. Discrete data is limited to certain separate values, and it is therefore countable. Continuous data can take on any value in a certain range. Because the number of values in that range is infinite, this data isn't countable. For example, the number of students in a class is discrete, while a person's height is continuous.
             I also learned about overfitting, which is when a model is matched too closely to the data it was trained on, making it inaccurate on anything new. This is because when a line fits too well with its data, it follows the random noise in that data too, so it will be inaccurate on new data.
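             A quick sketch of what overfitting can look like, with made-up numbers. The six training points sit close to the line y = x. The wiggly curve here is a polynomial forced through every point (Lagrange interpolation, which I chose only for this demo), and fit_wiggly_curve and fit_line are my own names.

```python
# Six made-up training points that lie close to y = x, plus one new point.
train = [(0, 0.1), (1, 0.9), (2, 2.2), (3, 2.8), (4, 4.1), (5, 4.9)]
new_x, new_y = 6, 6.1


def fit_wiggly_curve(data):
    """Overfit model: the polynomial that passes through every training point."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(data):
            term = yi
            for j, (xj, _) in enumerate(data):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


def fit_line(data):
    """Simpler model: a least-squares straight line."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    slope = sxy / sxx
    return lambda x: slope * x + (mean_y - slope * mean_x)


curve = fit_wiggly_curve(train)
line = fit_line(train)

# The curve matches the training data perfectly, the line only roughly.
print("curve on training point (2, 2.2):", curve(2))
print("line  on training point (2, 2.2):", round(line(2), 2))

# On the new point, compare how far off each prediction is.
print("curve error on new point:", round(abs(curve(new_x) - new_y), 2))
print("line  error on new point:", round(abs(line(new_x) - new_y), 2))
```

             The curve hits every training point exactly, but on the new point it ends up much further off than the plain line does.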
             I learned about confusion matrices. A confusion matrix is a type of table I can use to check the performance of a classifying model. It summarizes the predicted results against the actual results, showing how many predictions were right and what kinds of mistakes the model made.
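             Here's a minimal sketch of building a confusion matrix for a two-class problem. The labels are invented, and confusion_matrix is my own helper here, not the function of the same name in scikit-learn.

```python
# Made-up actual labels and model predictions for eight emails.
actual    = ["spam", "spam", "ham", "ham",  "spam", "ham", "ham", "spam"]
predicted = ["spam", "ham",  "ham", "spam", "spam", "ham", "ham", "spam"]


def confusion_matrix(actual, predicted, positive):
    """Count true/false positives and negatives, treating `positive` as the target class."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1
        else:
            counts["FN" if a == positive else "TN"] += 1
    return counts


cm = confusion_matrix(actual, predicted, positive="spam")

print("              predicted spam  predicted ham")
print(f"actual spam   {cm['TP']:>14}  {cm['FN']:>13}")
print(f"actual ham    {cm['FP']:>14}  {cm['TN']:>13}")

# Accuracy is the share of predictions on the diagonal (the correct ones).
accuracy = (cm["TP"] + cm["TN"]) / len(actual)
print("accuracy:", accuracy)  # 0.75
```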
             Another thing I learned was cross validation, which is a method to test the performance of a model. In its simplest form, it's done by dividing the set of data into two parts, training and testing. The model is trained with the training set and tested with the testing set to check that the model works on data it hasn't seen. Full cross validation repeats this with several different splits and averages the results.
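             Below is a rough sketch of k-fold cross validation, where the data is cut into k parts and each part takes one turn as the testing set. Everything in it is an assumption for illustration: the data is made up, the model is a very simple one that predicts whichever class average is closer, and the names k_fold_splits, fit_centroids, predict, and accuracy are mine.

```python
import random

# Made-up one-dimensional data: a number and its label.
data = [(1.0, "small"), (1.5, "small"), (2.0, "small"), (2.2, "small"), (4.4, "small"),
        (4.8, "big"), (5.5, "big"), (6.0, "big"), (6.4, "big"), (7.2, "big")]


def k_fold_splits(data, k, seed=0):
    """Shuffle a copy of the data, cut it into k folds, and yield (train, test) pairs."""
    items = data[:]
    random.Random(seed).shuffle(items)
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        yield train, test


def fit_centroids(train):
    """Training step: compute the average x for each label."""
    sums, counts = {}, {}
    for x, label in train:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, x):
    """Pick the label whose average is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def accuracy(centroids, test):
    """Fraction of test rows the model labels correctly."""
    correct = sum(predict(centroids, x) == label for x, label in test)
    return correct / len(test)


scores = [accuracy(fit_centroids(train), test)
          for train, test in k_fold_splits(data, k=5)]
print("score per fold:", scores)
print("average score:", sum(scores) / len(scores))
```

             Every row gets used for testing exactly once, so the average score doesn't depend on one lucky or unlucky split.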
             I learned lots of math in this course, which made it unique. While other courses taught me how to write in Python and use that knowledge to make programs, this course taught me some of the math behind the programming.