data310

Using the "Classify structured data with feature columns" example you prepared for today’s class as a model, train, validate, and test a model that has wealth class as the target, as follows.

Import the dataset city_persons.csv into your PyCharm environment. Initially set the target to the least wealthy class (2 in this case) and set all other wealth class outcomes (3, 4, and 5) to 0. Train, validate, and test your model. Interpret and analyze your results. Did the model performance exhibit a particular trend?
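The target setup described above can be sketched as follows. This is a minimal illustration, not the exact code used for the assignment: the helper name binarize_target and the sample labels are hypothetical, and encoding the chosen class as 1 is an assumption (the prompt only says the other classes become 0).

```python
def binarize_target(wealth_classes, positive_class):
    """Return 1 where the label equals positive_class, else 0.

    Encoding the positive class as 1 is an assumption; the assignment
    only states that every other class is set to 0.
    """
    return [1 if c == positive_class else 0 for c in wealth_classes]


# Hypothetical sample of wealth class labels (not taken from city_persons.csv).
sample = [2, 3, 5, 4, 2, 3]

# One-vs-rest target for wealth class 2.
targets = binarize_target(sample, positive_class=2)
print(targets)  # [1, 0, 0, 0, 1, 0]
```

Calling the same helper with positive_class set to 3, 4, or 5 gives the targets for the other models. With a pandas DataFrame, the same mapping is usually written as an equality comparison on the label column followed by a cast to int.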

For the first set of models, I used only numeric feature columns for all the features. The approximate accuracies were:

Wealth class 2, 10 epochs: training 0.9875, testing 0.9854
Wealth class 2, 20 epochs: training 0.9884, testing 0.9863
Wealth class 3, 10 epochs: training 0.8905, testing 0.8995
Wealth class 3, 20 epochs: training 0.8935, testing 0.8868
Wealth class 4, 10 epochs: training 0.6498, testing 0.6517
Wealth class 4, 20 epochs: training 0.6534, testing 0.6468
Wealth class 5, 10 epochs: training 0.4735, testing 0.4917
Wealth class 5, 20 epochs: training 0.4747, testing 0.4741

For the second set of models, I used both numeric and bucketized feature columns for all the features. The approximate accuracies were:

Wealth class 2, 10 epochs: training 0.9869, testing 0.9893
Wealth class 2, 20 epochs: training 0.9887, testing 0.9834
Wealth class 3, 10 epochs: training 0.8890, testing 0.9034
Wealth class 3, 20 epochs: training 0.8905, testing 0.8907
Wealth class 4, 10 epochs: training 0.6415, testing 0.6673
Wealth class 4, 20 epochs: training 0.6544, testing 0.6576
Wealth class 5, 10 epochs: training 0.4841, testing 0.4585
Wealth class 5, 20 epochs: training 0.6034, testing 0.6068

For the third set of models, I used numeric feature columns for all the features and a bucketized feature column for the age feature. The approximate accuracies were:

Wealth class 2, 10 epochs: training 0.9875, testing 0.9883
Wealth class 2, 20 epochs: training 0.9881, testing 0.9834
Wealth class 3, 10 epochs: training 0.8935, testing 0.8888
Wealth class 3, 20 epochs: training 0.8905, testing 0.9005
Wealth class 4, 10 epochs: training 0.6480, testing 0.6546
Wealth class 4, 20 epochs: training 0.6480, testing 0.6439
Wealth class 5, 10 epochs: training 0.4741, testing 0.4702
Wealth class 5, 20 epochs: training 0.4750, testing 0.4829
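To make the difference between the numeric and bucketized feature columns concrete, the sketch below shows what bucketizing a value such as age does: the number is replaced by the index of the range it falls in, which is then one-hot encoded. It follows the convention of TensorFlow's bucketized feature columns, where each bucket includes its left boundary and excludes its right one. The boundaries and helper names are hypothetical and chosen only for illustration; they are not the ones used in these models.

```python
from bisect import bisect_right

# Hypothetical age boundaries, chosen only for illustration.
AGE_BOUNDARIES = [18, 25, 35, 50, 65]


def bucket_index(value, boundaries):
    """Index of the bucket containing value (left-inclusive, right-exclusive)."""
    return bisect_right(boundaries, value)


def one_hot(index, size):
    """One-hot vector of length size with a 1 at position index."""
    return [1 if i == index else 0 for i in range(size)]


# N boundaries produce N + 1 buckets.
num_buckets = len(AGE_BOUNDARIES) + 1

for age in (17, 18, 40, 65):
    idx = bucket_index(age, AGE_BOUNDARIES)
    print(age, idx, one_hot(idx, num_buckets))
```

With these boundaries, an age of 17 falls in bucket 0, 18 in bucket 1, 40 in bucket 3, and 65 in bucket 5, so the model sees a category instead of a raw number.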

Overall, for all three sets of models, the accuracy values decrease as the wealth class number increases, from roughly 0.99 for class 2 to roughly 0.47 for class 5 in most runs. Changing epochs from 10 to 20 mostly caused training accuracy to increase slightly and testing accuracy to decrease slightly, with some exceptions, most notably the class 5 model in the second set, where both values rose to about 0.60. However, I am not sure whether this pattern holds for increasing epochs in general. To check, I would increase epochs from 10 to 100 or an even larger number.
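The effect of going from 10 to 20 epochs can be checked directly from the reported numbers. The sketch below does this for the first set of models only, using the accuracies listed above; the dictionary layout and the helper name epoch_deltas are my own and are just one way to organize the comparison.

```python
# (wealth class, epochs) -> (training accuracy, testing accuracy),
# copied from the first set of models reported above.
first_set = {
    (2, 10): (0.9875, 0.9854),
    (2, 20): (0.9884, 0.9863),
    (3, 10): (0.8905, 0.8995),
    (3, 20): (0.8935, 0.8868),
    (4, 10): (0.6498, 0.6517),
    (4, 20): (0.6534, 0.6468),
    (5, 10): (0.4735, 0.4917),
    (5, 20): (0.4747, 0.4741),
}


def epoch_deltas(results):
    """Change in (training, testing) accuracy from 10 to 20 epochs per class."""
    deltas = {}
    for wealth_class in sorted({cls for cls, _ in results}):
        train_10, test_10 = results[(wealth_class, 10)]
        train_20, test_20 = results[(wealth_class, 20)]
        deltas[wealth_class] = (
            round(train_20 - train_10, 4),
            round(test_20 - test_10, 4),
        )
    return deltas


for wealth_class, (d_train, d_test) in epoch_deltas(first_set).items():
    print(f"class {wealth_class}: training {d_train:+.4f}, testing {d_test:+.4f}")
```

For this set, training accuracy rises for every class, while testing accuracy falls for classes 3, 4, and 5 and rises slightly for class 2. All of the changes are small (under 0.02), so a single run per setting is not enough to tell a real effect from run-to-run variation.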