CS5783: Machine Learning
Assignment 4

1 Gaussian process regression

Use the crash test dataset from assignment 3 again. To make numerical instability less of an issue, scale the x and t values of your dataset to values between 0 and 1, i.e. normalize each value by dividing it by the maximum value in that dimension.

We will be generating Gaussian processes using two different kernels:

- Squared exponential: k(x, x') = exp{-(x - x')^2 / (2 sigma^2)}
- Exponential: k(x, x') = exp{-|x - x'| / sigma}

For each of these kernel families, construct your Gram matrix K and add diagonal noise to form C. In the last assignment, we estimated the beta precision parameter for the noise as 0.0025 (because we eyeballed the standard deviation sigma = 20, and beta = 1/sigma^2). If you scale sigma by the same magnitude as you scaled all of the t values, you can compute the appropriate beta for C.

You can now use C, t, and the kernel function distances between x* and each x to predict y* values at x*. First, figure out an appropriate order of magnitude for the sigma parameter (this is the sigma parameter for the kernels, not the standard deviation of the noise, as in the previous paragraph!). Look at the output of your Gaussian process (perhaps by plotting using evenly spaced x values) and look for values that seem to be relatively well-behaved (poorly chosen ones might look nothing like the data, or might crash your evaluator).

Once you have found a reasonable value of sigma, perform five-fold cross-validation on 100 values of sigma of the same order of magnitude as your rough calculation found, computing average MSE and determining a best-fit hyperparameter value.

For each of the kernel functions, plot the training data and the output of the Gaussian process with the best-fit hyperparameter (by plotting 100 evenly spaced x values and their corresponding GP outputs).

2 K-means clustering

Use the MNIST test set rather than the training set, simply because 10000 examples will be a little easier to work with than 60000, and we're doing unsupervised learning anyhow.
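The GP prediction step described above can be sketched as follows. This is a minimal illustration on hypothetical toy data rather than the crash test set; the function names (`sq_exp_kernel`, `gp_predict`) and the particular sigma and beta values are my own choices, not part of the assignment.

```python
import numpy as np

def sq_exp_kernel(a, b, sigma):
    # Squared exponential: k(x, x') = exp(-(x - x')^2 / (2 sigma^2))
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * sigma ** 2))

def gp_predict(x, t, x_star, kernel, sigma, beta):
    # C = K + (1/beta) I: Gram matrix plus diagonal observation noise.
    C = kernel(x, x, sigma) + np.eye(len(x)) / beta
    k_star = kernel(x, x_star, sigma)          # N x M cross-covariances
    # Posterior mean at each x*: k*^T C^{-1} t
    return k_star.T @ np.linalg.solve(C, t)

# Hypothetical stand-in data, already scaled to [0, 1]:
x = np.linspace(0, 1, 20)
t = np.sin(2 * np.pi * x) + 0.05 * np.random.randn(20)
beta = 1 / 0.05 ** 2                           # noise precision, assumed known here
y_star = gp_predict(x, t, np.linspace(0, 1, 100), sq_exp_kernel, 0.1, beta)
```

Using `np.linalg.solve` rather than explicitly inverting C is both faster and numerically safer, which matters once sigma gets small and C becomes ill-conditioned.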
We wish to minimize the K-means objective function

J(z, mu) = sum_{n=1}^{N} sum_{k=1}^{K} z_nk ||x_n - mu_k||^2,

where z_nk is 1 if example n is in cluster k and 0 otherwise.

Implement a K-means algorithm function that takes a value for the number of clusters to be found (K), a set of training examples, and a K-dimensional vector mu_k^0 that serves as an initial mean vector. This function should return the n-dimensional cluster assignment (presumably as an n x k one-hot matrix, since that is most convenient), as well as the converged mu_k vector. At each iteration, print a dot as a progress indicator. Once J has converged, print out its value, as well as the number of iterations it took.

Run your algorithm with K=10 (the true number of clusters) on the following initializations mu_k^0:

1. Ten data points chosen uniformly at random
2. Ten data points found using the K-means++ assignment algorithm
3. A data point drawn from each labeled class (found by looking at the test set labels -- and yes, this is cheating)

Visualize the 28x28-pixel images corresponding to each cluster mean found by your algorithm, for each of these initializations.

Cluster the data using K=3, initialized using K-means++. Plot the cluster mean images and a few randomly chosen representatives from the data for each class.

3 Hidden Markov Models

Construct a state machine that mimics the "occasionally dishonest casino" used as an example in lecture. This machine has two states, "Loaded" and "Fair". When in the "Fair" state, it outputs a value between 1 and 6, chosen uniformly at random. When in the "Loaded" state, it also outputs a value between 1 and 6, but this time the odds of emitting 1-5 are 1/10 each, while the odds of emitting a 6 are 5/10.
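The K-means loop from Section 2 can be sketched as below. This is a minimal sketch assuming random stand-in data instead of MNIST; the function name `kmeans` and the convergence tolerance are my own choices.

```python
import numpy as np

def kmeans(X, mu):
    # X: (n, d) data; mu: (K, d) initial means. Returns one-hot z, mu, J, iters.
    K = mu.shape[0]
    J_prev = np.inf
    iters = 0
    while True:
        # Assignment step: each point goes to its nearest mean.
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)   # (n, K)
        labels = d2.argmin(axis=1)
        z = np.eye(K)[labels]                                      # n x K one-hot
        J = d2[np.arange(len(X)), labels].sum()
        iters += 1
        if J_prev - J < 1e-9:          # converged: J stopped decreasing
            return z, mu, J, iters
        J_prev = J
        # Update step: each mean becomes the centroid of its assigned points.
        for k in range(K):
            if z[:, k].any():
                mu[k] = X[z[:, k] == 1].mean(axis=0)

# Hypothetical stand-in for the MNIST test set: random 784-dim points,
# with the 10 initial means drawn uniformly from the data (initialization 1).
rng = np.random.default_rng(0)
X = rng.random((500, 784))
mu0 = X[rng.choice(len(X), 10, replace=False)].copy()
z, mu, J, iters = kmeans(X, mu0)
```

Note that J is evaluated before the mean update, so the stopping test compares objective values computed under consistent assignments; on real MNIST you would also print the progress dots the assignment asks for.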
This can be represented in a table:

p(xt | zt):

  xt    zt = F     zt = L
  1     0.16667    0.1
  2     0.16667    0.1
  3     0.16666    0.1
  4     0.16667    0.1
  5     0.16667    0.1
  6     0.16666    0.5

Furthermore, the transition matrix A between hidden variables is the following:

p(zt | zt-1):

            zt-1 = F   zt-1 = L
  zt = F    0.95       0.10
  zt = L    0.05       0.90

The process should start in the "Fair" state. Capture the output of this process for 1000 steps in a vector x, and record the true state of the hidden variable z for each step, as well.

Use the forward-backward algorithm on your vector of outputs, as well as the true probabilities contained in the transition and emission matrices, to construct the MAP estimate of the state distribution at each time point. Produce two plots of the estimate ẑ of the probability of a loaded die at time t, compared to the actual state which you saved when you generated the process in the first place. In other words, one line on the graph will be a probability somewhere between 0 and 1, while the other will be a step function that transitions between exactly 0 and exactly 1. One of your plots should be your estimate after performing your forward pass but before computing the backward pass, and the other should be your estimate of ẑ when the entire inference process is complete.

4 Turning in

Your code must run on Prof. Crick's Python3 interpreter. He has the numpy, matplotlib and scipy libraries installed, as well as the standard Python libraries such as random and math. You should not need any others to do this assignment, and if you use any others, he will not be able to execute it.

You must have a file named 'assn4.py', and within it, three functions named 'problem1()', 'problem2()', and 'problem3()'.

You may have any number of .py files in your submission, which your assn4.py will import as necessary.
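The casino simulation and forward-backward smoothing from Section 3 can be sketched as follows. This is a minimal sketch, assuming state 0 = Fair and state 1 = Loaded; the function names are my own, and the forward/backward messages are rescaled at each step to avoid underflow over 1000 observations.

```python
import numpy as np

# Emission matrix B[state, face] and transition matrix A[prev, next],
# taken from the tables above (state 0 = Fair, state 1 = Loaded).
B = np.array([[1/6] * 6,
              [0.1] * 5 + [0.5]])
A = np.array([[0.95, 0.05],
              [0.10, 0.90]])

def sample_casino(T, rng):
    z = np.zeros(T, dtype=int)           # true hidden states; starts Fair
    x = np.zeros(T, dtype=int)           # observed die faces, coded 0..5
    for t in range(T):
        if t > 0:
            z[t] = rng.choice(2, p=A[z[t-1]])
        x[t] = rng.choice(6, p=B[z[t]])
    return x, z

def forward_backward(x):
    T = len(x)
    alpha = np.zeros((T, 2))             # normalized forward (filtered) messages
    alpha[0] = np.array([1.0, 0.0]) * B[:, x[0]]   # process starts in Fair
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t-1] @ A) * B[:, x[t]]
        alpha[t] /= alpha[t].sum()       # rescale against underflow
    beta = np.ones((T, 2))               # backward messages
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, x[t+1]] * beta[t+1])
        beta[t] /= beta[t].sum()
    gamma = alpha * beta                 # smoothed posterior p(z_t | x_1..T)
    gamma /= gamma.sum(axis=1, keepdims=True)
    return alpha, gamma

rng = np.random.default_rng(0)
x, z = sample_casino(1000, rng)
alpha, gamma = forward_backward(x)
```

Here `alpha[:, 1]` is the forward-pass-only estimate of a loaded die and `gamma[:, 1]` the estimate after the full forward-backward sweep; plotting each against the saved `z` gives the two figures the assignment asks for.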
You do not have to include 'crash.txt' or 't10k-images-idx3-ubyte' with your submission, but you should assume that I will put files with those names into the working directory along with your code.

If I execute the Python commands below, I am expecting to see something like the following. Note that your program's output should be qualitatively similar, but will not likely be identical, since both you and the random number generator will make different choices than I did.

>>> import assn4
>>> assn4.problem1()
Squared Exponential
Best sigma = 0.11
See plot.
Exponential
Best sigma = 0.15
See plot.
>>> assn4.problem2()
Random initialization
................................................................
64 iterations, J = 25647803615.36019
See plot.
k-means++ initialization
.....................................................
53 iterations, J = 25491276527.472775
See plot.
Cheating initialization
.............................
29 iterations, J = 25409428225.92401
See plot.
........................................
40 iterations, J = 30394791469.14684
See plot.
See plot.
>>> assn4.problem3()
Best alpha: 0.003126
See plot.

Figure 1: Output of assn4.problem1(), part 1
Figure 2: Output of assn4.problem1(), part 2
Figure 3: Output of assn4.problem2(), part 1
Figure 4: Output of assn4.problem2(), part 2
Figure 5: Output of assn4.problem2(), part 3
Figure 6: Output of assn4.problem2(), part 4
Figure 7: Output of assn4.problem2(), part 4
Figure 8: Output of assn4.problem3()