Model Evaluation and Validation: Predicting Boston Housing Prices

Model Evaluation & Validation

Project 1: Predicting Boston Housing Prices

Machine Learning Engineer Nanodegree

Summary

In this project, I evaluate the performance and predictive power of a model that has been trained and tested on data collected from homes in suburbs of Boston, Massachusetts. A model trained on this data that is seen as a good fit more ...


Identifying Fraud from Enron Email and Financial Data

Summary

In 2000, Enron was one of the largest companies in the United States. By 2002, it had collapsed into bankruptcy due to widespread corporate fraud. In the resulting federal investigation, a significant amount of typically confidential information entered the public record, including tens of thousands of emails and detailed financial data for top executives. In this project, I put my new skills to use by building a person-of-interest identifier based on financial and email data made public as a result of the Enron scandal.

more ...


Wrangle OpenStreetMap Data of Gastein Valley Austria

Wrangle OpenStreetMap Data of Gastein Valley, Austria

Summary

The third project of the Udacity Data Analyst Nanodegree covers the main aspects of data wrangling: gathering, extracting, cleaning, and storing data. I was supposed to choose a region and wrangle its OpenStreetMap data set. I’ve chosen the place where I live, the Gastein Valley, one of the most popular tourist destinations in Austria, famous for its large skiing area and thermal water. In my project I’m going to concentrate on the accommodation data (hotels, apartments, etc.), as it’s always important for the travel industry.
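
To give a feel for the extraction step, here is a minimal sketch of how accommodation entries could be pulled out of OSM XML with Python's standard library. The sample XML, the `extract_accommodation` helper, and the chosen set of `tourism` values are assumptions made for illustration; they are not taken from the project itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical OSM extract; real data would come from an .osm export of the region.
SAMPLE_OSM = """<osm>
  <node id="1" lat="47.17" lon="13.10">
    <tag k="tourism" v="hotel"/>
    <tag k="name" v="Example Hotel"/>
  </node>
  <node id="2" lat="47.11" lon="13.13">
    <tag k="amenity" v="cafe"/>
    <tag k="name" v="Example Cafe"/>
  </node>
  <node id="3" lat="47.12" lon="13.12">
    <tag k="tourism" v="apartment"/>
    <tag k="name" v="Example Apartments"/>
  </node>
</osm>"""

# Values of the OSM "tourism" key treated as accommodation (an assumed selection).
ACCOMMODATION = {"hotel", "apartment", "guest_house", "hostel", "chalet"}


def extract_accommodation(osm_xml):
    """Return id, type and name for every node or way tagged as accommodation."""
    root = ET.fromstring(osm_xml)
    results = []
    for element in root:
        if element.tag not in ("node", "way"):
            continue
        # Collapse the child <tag k=... v=...> elements into a plain dict.
        tags = {t.get("k"): t.get("v") for t in element.findall("tag")}
        if tags.get("tourism") in ACCOMMODATION:
            results.append({
                "id": element.get("id"),
                "type": tags["tourism"],
                "name": tags.get("name"),
            })
    return results


places = extract_accommodation(SAMPLE_OSM)
for place in places:
    print(place)
```

For a full regional export, `ET.iterparse` would be a more memory-friendly choice than loading the whole document at once.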

more ...

Titanic: factors to survive

Summary

RMS Titanic was a British passenger liner that sank in the North Atlantic Ocean in 1912, after colliding with an iceberg during her maiden voyage from Southampton, UK, to New York City, US. The sinking resulted in the deaths of more than 1,500 passengers and crew, making it one of the deadliest commercial peacetime maritime disasters in modern history. Using the provided dataset and the knowledge gained in the Udacity Data Analyst Nanodegree, I’ll try to identify the factors that made people more likely to survive.
I'm curious to know whether there was any difference in survival chances between males and females, and between passengers of different classes and age groups. I'm also curious whether the rule "Women and children first" was followed on the Titanic. Let's check it in the study.
For the current study I'm using a Jupyter notebook, Python, and a number of libraries (pandas, NumPy, matplotlib, and seaborn).
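
As a rough sketch of the kind of comparison described above, survival rates per group can be computed with a pandas `groupby`. The rows below are made up for illustration and are not taken from the real dataset; the column names (`Sex`, `Pclass`, `Age`, `Survived`) and the age cut-off of 16 are assumptions.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the real Titanic dataset.
passengers = pd.DataFrame({
    "Sex": ["male", "female", "female", "male", "female", "male"],
    "Pclass": [3, 1, 3, 1, 2, 3],
    "Age": [22.0, 38.0, 8.0, 54.0, 27.0, 4.0],
    "Survived": [0, 1, 1, 0, 1, 0],
})

# Split passengers into two age groups; the cut-off of 16 years is an assumption.
passengers["AgeGroup"] = pd.cut(
    passengers["Age"], bins=[0, 16, 120], labels=["child", "adult"]
)

# The mean of a 0/1 column is the share of survivors in each group.
survival_by_sex = passengers.groupby("Sex")["Survived"].mean()
survival_by_class = passengers.groupby("Pclass")["Survived"].mean()
survival_by_age = passengers.groupby("AgeGroup", observed=True)["Survived"].mean()

print(survival_by_sex)
print(survival_by_class)
print(survival_by_age)
```

With the real data, the same three group means give a first look at the sex, class, and age questions before any plotting.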
more ...


STROOP EFFECT: Statistical investigation of a psychological phenomenon

Summary

In psychology, the Stroop effect is a demonstration of interference in the reaction time of a task. When the name of a color (e.g., "blue", "green", or "red") is printed in a color not denoted by the name (e.g., the word "red" printed in blue ink instead of red ink), naming the color of the word takes longer and is more prone to errors than when the color of the ink matches the name of the color.[1] I want to find out whether the incongruent word condition causes a longer response delay. To do so, I’m going to analyse the provided data set, which contains repeated measures of participants’ performance on the congruent and incongruent tasks.
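
Because each participant is measured under both conditions, one natural way to compare them is a paired (dependent-samples) t-test. The sketch below computes the t statistic with the standard library only; the reaction times are hypothetical values, and `paired_t_statistic` is an illustrative helper rather than the analysis actually used in the project.

```python
import math
import statistics

# Hypothetical reaction times in seconds, one pair per participant.
congruent = [12.1, 16.8, 9.6, 8.6, 14.7]
incongruent = [19.3, 18.7, 21.2, 15.7, 22.8]


def paired_t_statistic(first, second):
    """Return the paired t statistic and degrees of freedom for second - first."""
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return t_stat, n - 1


t_stat, dof = paired_t_statistic(congruent, incongruent)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof}")
```

The resulting statistic would then be compared against the critical value of the t distribution for the chosen significance level and degrees of freedom.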

more ...