Jon Nelson

(858) 208-8699 · jonnelson518@gmail.com

Data Scientist/Analyst & Product Management Specialist with an education in data science and economic analytics and a demonstrated history of working in the information technology and services industry. Pursuing opportunities in product management and data analytics roles to begin a career focused on uncovering the stories told within data through exploratory data analysis, data manipulation, and machine learning.


Projects

MLB Home Run Exit Velocities

General Assembly

One of the most recent debates in Major League Baseball centers on why more home runs were hit in the 2017 season than in any other season in the league's history. There were 6,105 home runs hit in the 2017 season, more than at the peak of the steroid era, and everyone wants to know why. One specific area of investigation is the most important item in the game: the baseball. During the 2017 season, numerous major league pitchers complained that the ball felt different, and the result was a record-breaking year for home runs.

Thanks to Baseball Savant and their StatCast technology, we can begin to understand what is happening by analyzing the data tracked on every pitch that was hit for a home run. After a high-level review of the data and some outside research, I found a specific stat known as Home Run Exit Velocity, which Baseball Savant defines as the speed at which the baseball leaves the bat after being hit. Home Run Exit Velocity will be the target variable for my investigation into what is causing the increase in home runs.

Using machine learning techniques, I will build a production-level regression model to determine which features of a home run most influence its exit velocity. The data used to train this model will come from three sources:

  1. StatCast pitch data for all home run hits
  2. Data for a sample of baseballs used in the 2015, 2016 and 2017 seasons
  3. Baseball player personal stats (height, weight, age)

July 2018 - October 2018

Horse Racing Marketing Evaluation

Personal Project

In this assignment, I was asked to analyze the total number of bets made in each of the first five months of the 2018 horse racing season. The instructions indicated that a $35,000 marketing campaign was launched in the fourth month of the season to increase the number of bets being placed, and I was tasked with evaluating the success of the campaign. After exploratory analysis, I used machine learning techniques to build a model that predicts the total number of bets placed within each wager. This allowed me to identify the features in the dataset that most influenced the number of bets being placed. Using both my analysis and my modeling, I then provided an evaluation of the campaign along with a recommendation to the marketing team on what to consider for future campaigns.

November 2018

West Nile Virus Disease Prevention

General Assembly

To prevent further outbreaks of West Nile Virus (WNV) in the city of Chicago, the Department of Public Health has been collecting historical data on the weather, mosquito traps, and the areas of the city where pesticides were sprayed. Using this data, my team will investigate the factors that contribute to the mosquito population and build models to predict which mosquito traps have the highest probability of capturing mosquitoes carrying WNV. From this analysis, my team will recommend the optimal locations and times of the year to spray pesticides to decrease the mosquito population and ultimately decrease WNV outbreaks.

July 2018 - October 2018

Subreddit Analysis using Natural Language Processing (NLP)

General Assembly

Applied Natural Language Processing (NLP) techniques to analyze text data within posts from both the SpaceX and NASA subreddits. Using the public Reddit API, I collected posts from both subreddits to obtain the text data needed for the analysis. Using CountVectorizer and TfidfVectorizer, I identified the words and phrases that held the most influence in each subreddit's posts. Armed with this information, I then applied classification modeling techniques to predict whether a post originated from the SpaceX or the NASA subreddit.

July 2018 - October 2018

Predicting Home Sale Prices in Ames, Iowa

General Assembly

Explored a dataset containing various housing features in order to create a production regression model that could predict the sale price of homes in Ames, Iowa. Used feature selection techniques (VarianceThreshold, SelectKBest) to ensure the model was making predictions based on the features that were most influential to a home’s sale price. Applied additional model tuning techniques (GridSearch) to ensure the best hyperparameters were chosen for predicting the sale price.

July 2018 - October 2018

Experience

Technical Business Systems Analyst

Advisors Asset Management

  • Technical requirements liaison between the financial asset professionals within AAM and the software development team
  • Execute queries in Microsoft SQL Server Studio on Staging, QA, and DEV database environments to analyze data in order to further define requirements, troubleshoot production issues, and test developed features
  • Encourage the software development team to align more closely with agile principles to be more efficient throughout the software development lifecycle
  • Communicate requirements, identified bugs, and product vision documents to onsite, out-of-state, and out-of-country development teams via Confluence
  • Define requirements using the following applications on a regular basis: Confluence, Cacoo (NuLab), Balsamiq, SnagIt, SQL, Excel, Word, and PowerPoint
  • Technology stack varies among applications (C#, .NET, VB), but all use Microsoft SQL Server databases, with front-end solutions ranging from Angular to Bootstrap

February 2019 - Present

Technical Product Owner

NextGen Healthcare Inc.

  • Voice of the customer in all roadmap committee meetings with Product Management and R&D stakeholders when defining the vision for the product
  • Defined project scope, goals, and deliverables for Scrum development teams to ensure consistency with company strategy and commitments
  • Conducted various forms of research, including contextual inquiries, usability studies, surveys, competitive analysis, and remote research
  • Using reporting tools within JIRA, produced executive-level status reports while managing the development status, issues, and escalations
  • Responsible and accountable for the tactical direction in executing the go-to-market strategy for new features and fixes to production defects
  • Technology stack was a Java-based back end and Oracle DB with a JavaScript, CSS, and HTML front end

January 2017 - May 2018

AVP, Senior Business Systems Analyst

Union Bank (MUFG)

  • Assigned to a project team to implement a web application (RADAR) devoted to ensuring that Union Bank’s Residential Lending division is in full compliance with the current (and future) residential banking regulatory requirements
  • Generated reports through Microsoft SQL Studio for end users and project stakeholders on both a routine and an ad hoc basis
  • Utilized JIRA to document requirements and feature ideas for improving the application, elicited from its users through scheduled meetings, one-on-one sessions, and trainings
  • Performed the manual internal testing for the entire web application from the development team's perspective and assisted the QA automation developer to ensure that all use cases were covered

August 2014 - December 2017

EHR Implementation Product Specialist

HealthFusion MediTouch

  • Worked with doctors and medical staff to train and implement a software system (MediTouch) that provides doctors an electronic patient health care documentation service and an electronic claims billing service
  • Provided overall technical support to clients with an emphasis on software issues and business practice implementation
  • Advised and instructed doctors on how to use the electronic medical record application to complete the requirements for complex government incentive programs (Meaningful Use Incentive Program)
  • Maintained the user-identified bug list, improvements list, and new-feature request list, and presented these items to the development team for release prioritization
  • Created training documents and recorded webinars for the users of the application to assist in their implementation of the application within their practice
  • Technology stack was a Java-based back end and Oracle DB with a JavaScript, CSS, and HTML front end

August 2010 - December 2012

Education

General Assembly

Data Science
Immersive Program
July 2018 - October 2018

University of California, San Diego

Bachelor of Science
Management Science
September 2012 - June 2014

Blogs

Data Analysis: Bryce Harper Home Runs

Medium

This blog post will provide a Home Run scouting report for Bryce Harper using the data from my capstone project.

Setting up Flask …from a beginner

Medium

This blog post will walk through the process of setting up Flask.

Ensemble Models and methods (averaging method and boosting)

Medium

This blog post will define ensemble modeling and the methods that are used within these modeling techniques.

The Perfect Multiple Bar Chart

Medium

This blog post will walk through the process of creating the perfect multiple bar chart within MatPlotLib.

The Mysterious Pickle

Medium

This blog post will walk through the process of pickling an object for future use within Python.

The Python/SKLearn Modeling Process

Medium

This blog post will walk through the process to instantiate and score a model using Python with SKLearn.


Skills

Programming Languages & Tools

  • Python
  • Pandas
  • NumPy
  • Seaborn
  • Matplotlib
  • scikit-learn
  • Tableau
  • Git
  • GitHub
  • SQL
    • Microsoft SQL Server
    • Oracle DB
  • Jupyter
  • JIRA
  • Confluence
  • Slack
  • Cacoo
  • Balsamiq

Workflow

Interests

I grew up a basketball player and have been a huge sports fan for my entire life. I appreciate all sports but some of my favorites are basketball, football, golf, hockey and baseball.

I love the snow and traveling up to the mountains for some snowboarding. When I am outdoors, I also love to hike in various areas of Southern California.

When I find myself inside, I am practicing my Python, reading about my favorite Data Scientists, and researching a new Data Science problem to solve.


Training & Certifications