Never Stop Learning

Saturday, June 20, 2020

Ten months ago, I was watching my three girls play in the living room while drinking my morning tea. A sudden thought crossed my mind: What do I want to be when they grow up? 

Before I became a mom, I spent a decade in the semiconductor industry. I already had experience there, but what could I do to make myself more effective in any industry? I started searching for what was trending and stumbled upon the Data Analytics & Visualization Program at UT. I have a Computer Science degree, and after going through the curriculum I knew what I wanted to do. 

It wasn't easy for sure. But, as I always teach my kids, nothing is hard... stay focused and you can achieve anything.

Today, I got a job offer. Hard work pays off. So never stop learning.



Oh Tableau!

Monday, June 15, 2020

Last week I completed both the Desktop II: Intermediate and Desktop III: Advanced e-learning training on Tableau. I am currently taking full advantage of their 90-day free access to their online training courses. They have learning paths to choose from. So far I've taken the Designer, Analyst and Data Scientist paths. Unfortunately, the visual analytics training that is part of all three paths is only offered as classroom training and costs $1,400. 
So with all the learning I've been doing, I wanted to play with a data set on first names. I am part of a Mom group on Facebook and there is always someone asking for name recommendations for their upcoming baby. The data set consists of the relative frequency of given names among U.S. births where the individual has a Social Security number, from 1910 to March 3, 2019.

Some questions I wanted answered were:
  • What is the most popular male/female name of all time?
  • What is the most popular male/female name per year?
  • What are the top 10 popular names in each state?
  • When did the name first appear in the United States?
  • What is the number of occurrences of each name per year?

After combining all the text files (the data was split by state) in JupyterLab using Python and pandas, I had a DataFrame of 6,028,151 rows × 5 columns. I then exported it to a CSV file of about 155 MB.
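A minimal sketch of how per-state text files could be stacked into one DataFrame with pandas. The folder layout, the `*.TXT` file pattern, the absence of a header row, and the five column names are all assumptions for illustration; the post only states that the data was split by state and that the combined frame had 5 columns.

```python
from pathlib import Path

import pandas as pd

# Assumed column layout (hypothetical names); the post only says there were 5 columns.
COLUMNS = ["state", "sex", "year", "name", "count"]


def combine_state_files(folder, pattern="*.TXT"):
    """Read every per-state text file in `folder` and stack them into one DataFrame."""
    frames = [
        # Assumes comma-separated files with no header row.
        pd.read_csv(path, header=None, names=COLUMNS)
        for path in sorted(Path(folder).glob(pattern))
    ]
    # ignore_index=True gives the combined frame one continuous row index.
    return pd.concat(frames, ignore_index=True)
```

The combined frame could then be written out with something like `combine_state_files("namesbystate").to_csv("all_states.csv", index=False)`, where both names are placeholders.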

I still had one day left in my Tableau Desktop trial so I was excited to start creating some visualizations. I connected to the CSV file and was ready to create a simple crosstab. Unfortunately, my Tableau fun came to a halt. The query ran for a good 4 minutes before I had to cancel it. I tried several times but ran into the same problem. Was my file too big? It shouldn't be, considering it's only 6 million rows. Or am I wrong to assume that Tableau can handle large data sets? What was I doing wrong? I now have new questions unrelated to my name project and will have to play around with Tableau Public. Hopefully I find my answers. For now, I am going back to Python to answer my initial questions.
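The questions listed earlier map naturally onto pandas group-by operations. The sketch below is one possible approach, not the code actually used for the project, and it assumes a DataFrame with hypothetical columns `state`, `sex`, `year`, `name` and `count`.

```python
import pandas as pd


def most_popular_all_time(names):
    """Top name per sex, summing counts over all years and states."""
    totals = names.groupby(["sex", "name"], as_index=False)["count"].sum()
    best = totals.groupby("sex")["count"].idxmax()  # row label of the max per sex
    return totals.loc[best].reset_index(drop=True)


def most_popular_per_year(names):
    """Top name per (year, sex), summing counts over all states."""
    totals = names.groupby(["year", "sex", "name"], as_index=False)["count"].sum()
    best = totals.groupby(["year", "sex"])["count"].idxmax()
    return totals.loc[best].reset_index(drop=True)


def top_n_per_state(names, n=10):
    """The n most frequent names in each state over all years."""
    totals = names.groupby(["state", "name"], as_index=False)["count"].sum()
    ranked = totals.sort_values(["state", "count"], ascending=[True, False])
    return ranked.groupby("state").head(n).reset_index(drop=True)


def first_appearance(names):
    """Earliest year each name shows up in the data."""
    return names.groupby("name")["year"].min()


def occurrences_per_year(names):
    """Total count of each name in each year."""
    return names.groupby(["name", "year"])["count"].sum()
```

Ties are resolved by whichever row pandas encounters first, so a real analysis may want an explicit tie-break rule.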
    



Desktop I: Fundamentals

Friday, June 5, 2020

One of the tools we learned during the bootcamp was Tableau. The first time I used it, my first thought was - "Excel on steroids!". Using CitiBike data, I created some visualizations for the year 2018 as shown below (2019 only had a half year's worth of data). You can also see the interactive dashboards on my Tableau Public account.






It was one of my favorite tools to learn during the bootcamp but we didn't have enough time to dive into all of the features. Tableau is currently offering a free 90-day eLearning (offer ends on June 30), so I am taking advantage of it.



Data Analytics and Visualization Bootcamp: A full review

Friday, May 29, 2020




My quick, right-off-the-bat review would be: it wasn't easy! Right after I finished my final presentation last Saturday, I posted about it on Facebook because I was feeling proud of what I had accomplished in six months. Five minutes later, a friend sent me a direct message asking for tips and advice because she was thinking of doing the bootcamp and was elated to see my post. Someone she personally knew had gone through it... so here is my full review.


A little background on why I took this course: I took a career break to raise my three children. I reached a point where I was searching for what I want to be when they grow up. In every job I held in the past, I always tried to find ways to make a process more efficient, not just for myself but for the team I worked with. I used the tools that were available to me at the time to accomplish this. And then I stumbled upon Nadieh Bremer. Well, now I want to be her! I came across the Data Analytics bootcamp, and the course's goals and objectives sparked my interest even more to pursue that path. Everything these days is data-driven, and as data continues to grow, the industry will need more people to handle it. So I took the first step to become the person I want to be: a Data Analyst with expertise in Data Visualization Design. 


Before I proceed, this is not a paid review from the University of Texas at Austin or Trilogy Education Services. This review is solely my opinion of the course, with some tips and advice based on my own personal experience.


Class Structure 


I enrolled in the 24-week part-time program. Classes were three times a week: Monday/Wednesday or Tuesday/Thursday from 6:30-9:30pm, and Saturdays from 10am-2pm. They covered the fundamentals of Data Analytics and Visualization. While there is a set curriculum, it can change based on current market demand. Our instructor extended the Python portion, for example, and also gave more in-depth lectures on Machine Learning than the original curriculum called for. We also skipped some topics that are no longer commonly used in companies. Again, this will depend on the instructor's industry experience. If there is a particular topic you would like to learn, ask beforehand whether it will be covered during the course. The course is very fast-paced, and unless you have experience with or knowledge of the topic being discussed, you will need to do a lot of additional research on your own. I have a computer science degree, and I still found myself adding 25-30 hours on top of the 10 hours of class time per week. It can get overwhelming because just as you are about to grasp a certain topic, you are immersed in something totally new. Be prepared to invest a lot of your time to succeed in the course. If there is one topic I would suggest learning before you start, it is Python. At least start with the basics.


Homework


There is homework every week for hands-on practice on each topic. Treat these as mini-projects that you can later add to your portfolio. Try to go above and beyond the requirements. The lectures alone will not cover everything you need to get the homework done, so use the many resources that are available to you, from the class exercises and examples to the wide variety of online resources such as Stack Overflow, geeksforgeeks.org, Coursera, Codecademy, Udacity, Udemy, etc. Google will become your best friend.


Project work


There are three major projects. Each covers the topics discussed since the previous project. You will be working in teams of 4 or 5 people. This is a great setting to practice teamwork. I volunteered to manage all three projects and to present two of them to my cohort. I used to present often to executive management, so presenting wasn't a big issue for me, and I figured it would be good practice. I was, however, unfortunate to be grouped with some individuals who did not have very good communication skills. When we shifted to online classes, it was even harder to communicate with them. Inquiries about their assigned tasks got no replies and no updates until after several follow-ups. Sometimes there was no reply at all, and sometimes the reply came at the last minute. As the project manager, it was my responsibility to see the project through. I ended up doing some of the tasks assigned to other group members on top of my own. When a task was too big to handle on my own or too much work to divide among other team members, we cut the scope. It was frustrating, but I used the opportunity to learn something new instead. In the end, the projects were completed on time. It is understandable that many students in the class have no IT background; however, there needs to be a willingness to learn and a willingness to participate in a team to make the project successful. The key takeaway here is that when you find yourself in a group like that, make sure to communicate often. It is okay if you don't know what to do or where to start. Every student there is learning. Be open about it and you will get the support you need from both your teammates and the teaching assistants.


Student support


There are two teaching assistants during the weekday classes and three during the Saturday class. All of them are very knowledgeable and very willing to help. I am the type of person who does research first to try to solve the problem, so I didn't use the teaching assistants as much. I only asked them when I failed to find the solution after many hours of trying. Some of my classmates also used the tutors, who are available for a one-hour session per week. I never signed up for those so I can't review them, but the classmates who did said the sessions were very helpful. Our instructor was also very knowledgeable on a wide range of topics. 


Career Services


The bootcamp comes with career services that will help you bring your professional portfolio up to par with current market demand. You will receive invites to tech talks at local companies and to virtual tech talks with panelists who have relevant industry experience. The program manager, the career consultants, and the career director have all been very helpful. I am still going through this process, but so far my experience with them has been awesome. 


Cost


The full course is $11,500. It is not cheap and is definitely an investment. They do have payment plans that you can ask about when you inquire. Because of COVID-19, the bootcamp is currently fully online. I started classes in November of 2019, so I had the opportunity to experience both in-person learning for four months and online classes for the remaining two months. I am not sure if they adjusted the cost because of the shift to a fully online course, but what I paid for was the quality of education from a reputable university AND the use of world-class facilities. Classes were held at UT's Red McCombs School of Business building, which, if I am not mistaken, is currently the newest building on campus. I looked forward to going to class each time and using what the building had to offer. Also, being able to just chat with your instructor, a classmate or a group mate at any time during class about homework, a project or the current lecture was not the same once classes shifted online. So if you are the type of person who prefers a classroom setting, I suggest waiting until COVID-19 is over and everything is back to normal. While the quality of the course did not change, in-person learning was just a better experience for the price.


Final thoughts


The whole system is set up efficiently, from requesting more information, getting that first phone call with an advisor, getting accepted into the program, and gathering all the requirements before the class starts, to submitting homework and projects and seeking help with both course work and career transformation. 


I highly recommend the bootcamp. You will gain a vast amount of knowledge on tools that are currently used in the world of data analytics. You will gain new friends and professional connections from different backgrounds, and you can learn from them too. There were students who dropped out midway because it became too overwhelming. It will not be easy, but if you have the mindset that nothing is hard and that you can accomplish anything, a set goal, and the willingness to learn (even when some nights mean going to sleep at 2am), then you will succeed in the course and get your money's worth. Plus you will have a certificate from a very reputable university to show for all the hard work. Good luck! 




Catching up!

Sunday, May 24, 2020

I officially graduated from the Data Analytics and Visualization Bootcamp yesterday. It has been an amazing experience. I will write a full review on the course in my next blog post.

The past five months have been really hectic. When COVID-19 hit, everything had to change almost overnight. One day I was in a physical classroom, and by the next class we had to shift to online classes. Suddenly my elementary-age kids also had to do distance learning. We all had to scramble and adjust to the new norm. But despite the sudden changes, we were able to make it to the end of the bootcamp! 

Below are some of the projects I worked on since my last post. It's amazing to see how I've progressed in just six months. Going from not knowing anything about all these tools to seeing them working is truly an accomplishment for me. It wasn't easy for sure, and there is so much more I need to learn. But with perseverance and hard work, anything is possible.


This project analyzes the weather of 500+ cities across the world of varying distance from the equator. The objective was to build a series of scatter plots to showcase the following relationships: Temperature (F) vs. Latitude, Humidity (%) vs. Latitude, Cloudiness (%) vs. Latitude, and Wind Speed (mph) vs. Latitude.

This site provides the source data and visualizations created using Pandas and Matplotlib with observable trends.
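A rough sketch of how the four latitude plots could be drawn with pandas and Matplotlib. The column names (`lat`, `temp_f`, `humidity`, `cloudiness`, `wind_mph`) are hypothetical, and the 2 × 2 layout is just one way to arrange them; the actual project may have used separate figures.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical column names mapped to the axis labels named in the post.
RELATIONSHIPS = {
    "temp_f": "Temperature (F)",
    "humidity": "Humidity (%)",
    "cloudiness": "Cloudiness (%)",
    "wind_mph": "Wind Speed (mph)",
}


def plot_vs_latitude(weather, lat_col="lat"):
    """Draw one scatter plot per weather measure, each against latitude."""
    fig, axes = plt.subplots(2, 2, figsize=(10, 8))
    for ax, (column, label) in zip(axes.flat, RELATIONSHIPS.items()):
        ax.scatter(weather[lat_col], weather[column], s=12, alpha=0.7)
        ax.set_xlabel("Latitude")
        ax.set_ylabel(label)
        ax.set_title(f"{label} vs. Latitude")
    fig.tight_layout()
    return fig
```

Calling `plot_vs_latitude(weather).savefig("latitude_plots.png")` would write the figure to disk, where `weather` is a DataFrame with the assumed columns.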






A Flask web application that scrapes various websites for data related to Mars, saves the information in a database using MongoDB and displays the information in a single HTML page.




An interactive website (using JavaScript, HTML, CSS, and D3.js) displaying UFO sightings from January 1-13, 2010, which allows users to filter the table data by the following values:
1. `date/time`
2. `city`
3. `state`
4. `country`
5. `shape`
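The filtering idea can be sketched in a few lines. The example below is written in Python rather than the JavaScript the site uses, and it assumes each sighting is a dictionary keyed by the field names listed above; matching is exact and case-insensitive, which is an assumption about how the filter behaves.

```python
def filter_sightings(sightings, criteria):
    """Keep the sightings that match every non-empty criterion.

    `sightings` is a list of dicts; `criteria` maps a field name such as
    "city" or "shape" to the value the user typed into the filter form.
    """
    # Ignore blank form fields so an empty filter returns everything.
    active = {
        field: str(value).strip().lower()
        for field, value in criteria.items()
        if str(value).strip()
    }
    return [
        row
        for row in sightings
        if all(
            str(row.get(field, "")).strip().lower() == wanted
            for field, wanted in active.items()
        )
    ]
```

Leaving a field blank simply drops it from the comparison, so users can filter on any combination of the five values.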



An interactive dashboard (using JavaScript, Flask-SQLAlchemy, Plotly.js, Flask-API, and Heroku) to explore the Belly Button Biodiversity dataset.




This project is an interactive scatter plot using D3.js.



Citi-Bike Analytics

Interactive dashboards built with Tableau, probably one of my favorite tools. I truly enjoyed creating all these visualizations.





FundWatch (my second and third major projects)

A personal expense tracker web app built with:

  • Python Flask
  • MySQL and PostgreSQL databases
  • Plotly for data visualization
  • HTML, CSS, Bootstrap, JavaScript and D3 for front-end web development
  • Scikit-Learn (Isolation Forest Algorithm) for Fraud Detection
  • ARIMA Model for Expense Prediction
The original challenge of the project was to automate the process of manually tracking expenses. This was accomplished for project two via FundWatch - a web-based expense tracking app. The second release of the app, project three, added user profiles and a login page, and focused on how artificial intelligence can give the user insights into next month's budget and flag possibly fraudulent activity in past expenses. 
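To give a feel for the fraud detection piece, here is a minimal sketch of flagging unusual expenses with scikit-learn's Isolation Forest. It is not FundWatch's actual code: the single `amounts` feature, the function name, and the `contamination` value are assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def flag_unusual_expenses(amounts, contamination=0.05, random_state=0):
    """Return a boolean array that is True where an expense looks anomalous.

    `contamination` is the assumed share of anomalies in the data and
    directly controls how many expenses get flagged.
    """
    # scikit-learn expects a 2-D array: one row per expense, one column per feature.
    features = np.asarray(amounts, dtype=float).reshape(-1, 1)
    model = IsolationForest(contamination=contamination, random_state=random_state)
    labels = model.fit_predict(features)  # -1 marks an outlier, 1 an inlier
    return labels == -1
```

A real tracker would likely add more features per expense (category, merchant, day of month) so that an amount that is normal for rent is not treated the same as that amount at a coffee shop.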

It was great teamwork between my group mate and myself to get these projects up and running. I had a vision of what I wanted this app to look like. With my group mate's experience as a developer, we were able to bring it to life! Shout out to you, Carlos, if you are reading this! It was a pleasure working with you! 











Why Oklahoma Shakes

Friday, January 24, 2020

To practice the skills we've learned since November (Python, Pandas, and Matplotlib), we were given our first project. Our team chose to work on analyzing why Oklahoma has been experiencing earthquakes more often than California. I was able to find good data sets from The University of Oklahoma Survey and the Oklahoma Corporation Commission, and a good API via the United States Geological Survey. Visit my GitHub for more details on this project.

My team's analysis shows that wastewater disposal is a big factor in the earthquakes. Below is a scatter plot on a map showing the location of disposal wells and earthquakes in Oklahoma.

Using a horizontal bar plot, we can compare the injection wells and seismic events at the county level. This data indicates that the number of wells does not correlate with the number of earthquakes.
The same data as above, shown on a scatter plot. The yellow dot represents Carter County and the red dot represents Grant County.
While there is no correlation between the number of wells and the number of seismic events in a specific location, our analysis below indicates that there is a correlation between the volume of wastewater being injected and the number of earthquakes. 
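One way to put numbers on those two comparisons is a Pearson correlation coefficient per county-level pair. The sketch below assumes a DataFrame with one row per county and hypothetical columns `well_count`, `injected_volume` and `quake_count`; it is an illustration, not the analysis code from the project.

```python
import pandas as pd


def county_correlations(counties):
    """Pearson r between earthquake counts and two candidate explanations."""
    quakes = counties["quake_count"]
    return {
        # Values near 0 mean little linear relationship; values near 1 a strong one.
        "wells_vs_quakes": counties["well_count"].corr(quakes),
        "volume_vs_quakes": counties["injected_volume"].corr(quakes),
    }
```

Correlation alone does not establish cause, so a result like this is best read alongside the maps and plots rather than as proof on its own.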









Black Friday

Sunday, December 1, 2019

I was always excited about Black Friday. It's the day I wake up really early to get in line at my favorite stores before they open to get the best deals of the day. Plus, being super early means you get the best parking spots too! I finish 95% of my Christmas shopping on this day. But this year, for the first time in ten years, I stayed home! I was turned off by stores that started opening on the evening of Thanksgiving Day. I feel sorry for the workers who have to leave their families to go to work. All for more profit.

The data below shows the potential number of shoppers over the Thanksgiving weekend according to the National Retail Federation. Black Friday still shows as the most popular day to shop. Based on NRF's analysis, the top reasons consumers are planning to shop include:

1) The deals are too good to pass up (65 percent)
2) Tradition (28 percent)
3) It's when they like to start their holiday shopping (22 percent)
4) It's something to do over the holiday (21 percent)
5) It's a group activity with friends/family (17 percent)




So why did I skip the most popular day of shopping? In my experience, retail pricing has changed over the years on the items that I purchase the most - clothing, accessories and shoes. I find the sales throughout the year better than the Black Friday sales. One perfect example is Old Navy. They offered 50% off everything on Friday. They already do this throughout the year. Not for everything all at once, but at staggered times on different categories. They may have added an additional 10% off for early birds, but is it all worth the 5am wake-up call? Driving in the dark, cold weather? Standing in long lines to check out? I don't think so. Not anymore at least.

I've noticed that the "before" prices are purposely raised, so a 50% discount (which sounds very appealing) looks like a deal when in reality you are paying an amount very close to the regular retail price. I know this because I've tracked the prices of certain items I was very interested in.

The bottom line is, shoppers are drawn to a sale. Advertising has led to the belief that Black Friday sales are the best. Perhaps for certain electronics, yes. But for regular items, you need to look at what discount you are actually getting. It really is not that great. I find the after-Christmas sales are actually better if you are shopping solely for yourself.

Cyber Monday is coming up. Maybe I will browse to see some deals in my pajamas with a nice cup of coffee. Maybe I will be one of the 68.7M online shoppers. But really, the one question you should be asking yourself before buying anything is: Do I really need this??


