
Machine Learning Algorithms: Logistic Regression, Naive Bayes, Decision Tree, Random Forest, KNN, Linear Regression, Lasso, Ridge, SVM, Regression Tree, XGBoost, K-means
Programming Languages: Python 2.7/3.5 (NumPy, SciPy, pandas, seaborn, scikit-learn, NLTK), R 3.0, SAS 9.1, SQL
Analytic Tools: Anaconda 4.0 (Jupyter Notebook 4.X, Spyder), RStudio
Hadoop Ecosystem: Hadoop 2.X, Spark 1.6+ (PySpark, Spark SQL, MLlib), MapReduce, Hive 1.X
Databases: SQL Server 2008/2014, MongoDB 3.2
Data Visualization: Tableau 9.4, Python (matplotlib)

Implemented bagging and boosting to enhance model performance.
Experienced with machine learning algorithms such as logistic regression, random forest, XGBoost, KNN, SVM, neural networks, linear regression, lasso regression, and k-means.
Involved in all phases of the data science project life cycle, including data extraction, data cleaning, statistical modeling, and data visualization, with large sets of structured and unstructured data.
Professionally qualified Data Scientist/Data Analyst with over 6 years of experience in data science and analytics, including machine learning, data mining, and statistical analysis.
