Diabetic Nutritionist

Git:

git

Team:

Sarita Bhateja, Kavya Guruprasad and Nandini Goswami

Technologies:

Python, MySQL, scikit-learn

Problem Statement

According to statistics shared by the CDC (Centers for Disease Control and Prevention), more than 29 million Americans have diabetes, and one in four of them does not know it. Eighty-six million adults, more than one in three U.S. adults, have prediabetes. These numbers are alarming and call for increased focus on the dietary needs of these patients. We are trying to develop a system that helps patients monitor their diet so that their blood sugar does not shoot up after they eat a particular food item. If required, the system can also suggest a better substitute.

Dataset Description

To implement this system, we merged two datasets: one contained the nutrient values and the other contained the glycemic index for each food name. Our merged dataset contains food names, their nutrition values, and their glycemic index values. A higher glycemic index indicates that a food raises blood sugar more quickly, so such foods should be avoided by patients. The glycemic index is therefore the class label for our dataset.
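As a rough illustration of the merge described above, the sketch below joins two small tables on a normalized food name using pandas. The column names (`food_name`, `carbs_g`, `fiber_g`, `glycemic_index`), the sample rows, and the choice of pandas are assumptions made for this example; the project's actual schema and values are not shown here.

```python
import pandas as pd

# Hypothetical toy tables; values are illustrative, not real measurements.
nutrients = pd.DataFrame({
    "food_name": ["White Bread", "apple ", "Lentils", "Butter"],
    "carbs_g": [49.0, 14.0, 20.0, 0.1],
    "fiber_g": [2.7, 2.4, 7.9, 0.0],
})
glycemic = pd.DataFrame({
    "food_name": ["white bread", "Apple", "lentils"],
    "glycemic_index": [75, 36, 32],
})


def normalize_names(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with food names stripped and lower-cased so join keys match."""
    out = df.copy()
    out["food_name"] = out["food_name"].str.strip().str.lower()
    return out


# Inner join keeps only foods present in both tables
# (here "butter" is dropped because it has no glycemic index entry).
merged = normalize_names(nutrients).merge(
    normalize_names(glycemic), on="food_name", how="inner"
)

X = merged.drop(columns=["food_name", "glycemic_index"])  # nutrient features
y = merged["glycemic_index"]                               # label to predict
```

An inner join is used here so that every remaining row has both nutrient values and a label; a different join type would leave missing values that need separate handling.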

System Design

We have used the following machine learning and artificial intelligence techniques to develop the system.

Evaluation and Results

Before settling on Linear Regression and K-means, we tried other ML algorithms, such as Decision Tree, but the results were not satisfactory. When Linear Regression was run on the complete dataset, the results were not good enough to proceed. We therefore ran the Random Forest algorithm to estimate the importance of every feature and removed the features with the least importance. We repeated this process to identify when to stop removing features from the dataset, and it improved the performance of Linear Regression. The results are still below par, but they are good enough to get started with the project.

The next algorithm, K-means, supports the adaptation step and provides substitutes to the user. In this algorithm, the choice of the number of clusters is an important factor in how well it performs. To identify the best number of clusters, we ran K-means with the number of clusters ranging from 1 to 10 and used the Elbow method to choose the best k (number of clusters). The results were best with the number of clusters set to 3. The limited number of data points is a weakness, because K-means performs well with large datasets.

K-means with CBR (case-based reasoning) for adaptation works better than expected, and the results are quite interesting. When we tested it against test data points in various scenarios, the results were fair. We also shared the results with a domain expert to get her insight. In her assessment, the results are average and there is scope for improvement. The project has a lot of potential if given enough data points to analyse in depth, and the domain expert has a large role to play in it.
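The feature-pruning and cluster-selection steps described in this section can be sketched with scikit-learn as follows. This is a minimal sketch on synthetic data, not the project's code: the stopping rule (stop when cross-validated R² of Linear Regression drops), the helper names `prune_features`, `elbow_inertias` and `suggest_substitute`, and the "lowest glycemic index in the same cluster" substitute rule are all assumptions standing in for details the text does not give, including how CBR adaptation is actually done.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): 120 foods, 6 nutrient features,
# of which only the first two actually drive the glycemic index.
X = rng.normal(size=(120, 6))
gi = 50 + 12 * X[:, 0] - 8 * X[:, 1] + rng.normal(scale=2.0, size=120)


def prune_features(X, y, min_features=2):
    """Repeatedly drop the feature the random forest ranks least important,
    as long as cross-validated R^2 of linear regression does not get worse."""
    keep = list(range(X.shape[1]))
    best = cross_val_score(LinearRegression(), X[:, keep], y, cv=5).mean()
    while len(keep) > min_features:
        forest = RandomForestRegressor(n_estimators=100, random_state=0)
        forest.fit(X[:, keep], y)
        weakest = keep[int(np.argmin(forest.feature_importances_))]
        trial = [f for f in keep if f != weakest]
        score = cross_val_score(LinearRegression(), X[:, trial], y, cv=5).mean()
        if score < best:
            break  # removing this feature hurts the regression: stop here
        keep, best = trial, score
    return keep, best


def elbow_inertias(X, k_values=range(1, 11)):
    """Within-cluster sum of squares per k; the elbow is read off this curve."""
    return {
        k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(X).inertia_
        for k in k_values
    }


def suggest_substitute(gi, labels, idx):
    """Index of the lowest-GI food in the same cluster as food `idx`,
    or None if nothing in that cluster has a lower GI (hypothetical rule)."""
    same = np.flatnonzero(labels == labels[idx])
    best = same[np.argmin(gi[same])]
    return int(best) if gi[best] < gi[idx] else None


keep, r2 = prune_features(X, gi)
inertias = elbow_inertias(X[:, keep])

# k = 3 follows the text; on other data it should be chosen from the elbow curve.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X[:, keep])
worst = int(np.argmax(gi))  # food with the highest glycemic index
substitute = suggest_substitute(gi, labels, worst)
```

Plotting `inertias` against k and looking for the point where the curve flattens is the usual way to read the elbow; the dictionary is returned rather than plotted so the sketch has no plotting dependency.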

Other Projects

FoodSalsa

A website for food lovers that provides easy access to menus by cuisine and to food review posts for different food joints within the University.
