A Big Ol' Page of Resources

Data and CS

Where I Started with this Stuff

Where I'm at today in my skillset/career is the result of a ton of trying, sucking, pivoting, and all around figuring it out as I went. I didn't know a damn thing about programming (hell, I barely knew how to use Excel) until my Senior year of college. To anyone reading this post to figure out how to get started, I feel that it bears benchmarking where I was when I decided to go down the Data Analyst rabbit hole. There's a whole lot of great stuff to engage with, and as far as I can tell, no best road map to navigate it with. This is how I got started, but YMMV.

  • I was studying to be an Actuary for a good while. Got through Calc 1-4, Linear Algebra, Stats, Probability, Interest Theory. Didn't have good study habits and hit a real block in my Exam progression. I definitely use all of them in one capacity or another, but wish I'd understood how absolutely fundamental solid Linear Algebra chops are.
  • Near the end of undergrad, I started dipping my toes into Computer Science, and if I could do it all over again, I'd have, for sure, focused on this route. Learned the fundamentals (data structures, abstraction, OOP, resource management) in C++, took a Discrete Math class, and then they handed me a diploma before I could take any more. Wound up graduating with a B.S. in Econ, but with a far greater micro than macro focus.
  • And so during my last semester, while I was only taking two classes, I decided to simultaneously learn R and Python. I'd read all over that they were great tools to have in the Data Analyst role I was entering, but was unclear which would be better. I read The Art of R Programming and Learn Python the Hard Way, but unfortunately didn't find myself much closer to picking one. Then I stumbled across a blog where using Python, a guy had:

I was sold.

My Python Essentials

For anyone getting started with programming, one of the biggest barriers to entry is, IMO, the frustration of getting set up properly (read: why I still don't know Scala). Save yourself the headache and download the following:

  • The Anaconda distribution of Python, which pre-configures a whole mess of tools and libraries together so you don't have to. Expanding on what comes in the box is also super easy. I still use this today. Also, unless you have a reason to do otherwise, I think you should download the latest version of Python 3.

  • Secondly, get yourself a better text editor than Notepad, where you can write all of your code. I personally love Sublime Text.
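Once both are installed, a quick sanity check can save some head-scratching later. Here's a minimal sketch, assuming you want to confirm which Python you're running and whether a few common data libraries can be found. The package shortlist and the `check_environment` name are just placeholders I picked for illustration, not anything Anaconda provides.

```python
import importlib.util
import sys

# Hypothetical shortlist; swap in whatever libraries you care about.
PACKAGES = ["numpy", "pandas", "matplotlib"]


def check_environment(packages):
    """Return a dict mapping each top-level package name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    # The first token of sys.version is the version number, e.g. "3.x.y".
    print("Python version:", sys.version.split()[0])
    for name, found in check_environment(PACKAGES).items():
        print(f"{name}: {'installed' if found else 'missing'}")
```

Save it as a `.py` file in your editor and run it from the terminal. If anything shows up as missing, that's your cue to install it before diving into a tutorial that assumes it's there.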

To reiterate, when I first learned Python, I already had a pretty fair knowledge base of the underlying concepts-- I was more reading for syntax. HOWEVER, the first two courses in the free Python for Everybody Coursera track appear to be excellent, gentle introductions to Python and Computer Science fundamentals. It's the first place I steer anybody wanting to know the basics. And honestly, once you've got these down, the scope of things that you could learn on your own just explodes.

Another excellent primer is Automate the Boring Stuff with Python, which is about as straightforward as its premise suggests. It gives you a rundown of the basics in the first few chapters and then has a real emphasis on turning your bothersome, tedious everyday tasks into scripts that you can run to free up time. I love its practical approach and have a copy of this one on my shelf.

After I got down the basics, Python for Data Analysis became my bible. This is such a great book and I'd say a must have for anybody hoping to do Data Anything.

Lastly, here's an excellent demo of the power of Jupyter Notebooks. It's absolutely the ideal that I'm striving for as an analyst. I spend a lot of my time thinking about reproducibility these days.

Honorable Mention: Grokking

Might be a bold stance, but once you get comfortable using Python, you should really think of Grokking Algorithms as required reading. It's crucial in your development as a programmer to understand how things are working and how you could improve them-- all style aside, just because something runs doesn't mean that it's good. It gave me a real intuition for what was going on under the hood and turned me on to so many practical ways to solve problems. Really can't stress enough how much I love this book. I've lifted whole chapters and taken them as topics to Knowledge Shares and had whole rooms of people following along, near-effortlessly, because of how well this book is written.
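To make the "runs vs. good" point concrete, here's a small sketch of my own (not lifted from the book) comparing two ways to find a number in a sorted list. Both functions work; they just differ wildly in how much work they do. The function names and the step counters are illustrative additions so you can see the difference.

```python
def linear_search(items, target):
    """Check every item in order; return (index, steps) or (None, steps)."""
    steps = 0
    for i, item in enumerate(items):
        steps += 1
        if item == target:
            return i, steps
    return None, steps


def binary_search(sorted_items, target):
    """Halve the search range each step; the input must already be sorted."""
    low, high = 0, len(sorted_items) - 1
    steps = 0
    while low <= high:
        steps += 1
        mid = (low + high) // 2
        guess = sorted_items[mid]
        if guess == target:
            return mid, steps
        if guess < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return None, steps


if __name__ == "__main__":
    numbers = list(range(1, 1025))  # 1..1024, already sorted
    print("linear:", linear_search(numbers, 1024))
    print("binary:", binary_search(numbers, 1024))
```

On this 1,024-item list, the linear version has to look at every single element to find the last one, while the binary version never needs more than 11 looks. Same answer, very different cost, and the gap only widens as the list grows.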

GitHub

I don't have a lot to say about this one. If you're going to be working on cool projects, it's a good idea to share them. Furthermore, if you need a tool for something, it's likely already been built. Being literate in how people work in an era of Open Source, and considering what workflows/best practices you could leverage to bang out cleaner and better-structured projects, is crucial.

  • I 100% unironically recommend watching Git for Ages 4 and Up for a clear illustration of what Git is and how it works.
  • From there I checked out think-like-a-git to reinforce what I'd learned.
  • Finally I'm using principles I got in this read to shape the way I'm architecting a Git-based Data Science workflow at QL.
  • Git is great. Here's my GitHub account where I'll post every solution that powers the things I write about.

Learning How to Learn

As I outline in the beginning of my Trivial Pursuits in Sports and Web Scraping post, things really started clicking for me after taking the free, 4-week Learning How to Learn Coursera course. There's a ton of invaluable information on the way that your brain works and how you can use that understanding to your advantage as you find yourself up against new information.

Additionally, I absolutely swear by Anki, which is free flashcard software that you can use to keep useful nuggets of information fresh for just minutes a day. A fantastic alternative to the dopamine kicks you get from checking Reddit/Facebook in those brief pockets you find in everyday life.

Blogs/Websites

I'm certain that it'd take me well over an hour to exhaustively find all of the subreddits worth checking out to grow skills relevant to Data Analysis. I've got a couple dozen subscribed on my work account, but I'd say that my big three are:

  • DataIsBeautiful: A collection of awesome/interesting data visualization projects, more often than not with the code/underlying data used to make them.
  • LearnPython: For when Google/Stack Overflow fails you, and friendlier, in my narrow-ish experience.
  • DailyProgrammer: Because practice makes perfect. There are a whole host of programming challenges, complete with test cases and a comment section full of clarifying questions and user-submitted solutions. Learned a TON of neat Python tricks seeing some of the clever stuff people come up with.

Courses

  • The Duke Data Viz in Tableau course was invaluable when I sat down to learn it. In addition to an excellent overview of the tool (and a cool "how much do data jobs actually pay" data set), this course provides some seriously fantastic insight into managing business relationships, the best ways to do the up-front due diligence that projects require, and a rough primer on visualization theory that will help you make your data stories come across crystal clear.
  • It's in Matlab/Octave (the free Matlab), but Andrew Ng's Machine Learning course is an absolute mainstay in the Data Science community. Great overview of the underpinnings of regressions, Neural Networks, model tuning, ensemble methods, and much more. Took this once a year and some change ago, but am slowly but surely going back through and trying to reimplement the whole thing in Python to solidify my understanding.
  • It wasn't very exhaustive, but Udacity had a really good Intro to Hadoop and MapReduce course that really helped me better conceptualize restructuring my serial code for parallel processing. I've applied concepts from this course to some of the datasets at work to do some Process Mining, and am excited to get my hands dirty with the whole Big Data ecosystem. This course really turned that light bulb on for me.
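For anyone who hasn't bumped into MapReduce before, here's a toy sketch of the idea using the classic word-count example. Fair warning: this runs in a single Python process and doesn't touch Hadoop at all, and the function names (`mapper`, `shuffle`, `reducer`, `word_count`) are my own placeholders rather than anything from the course. It's only meant to show the shape of the map, shuffle, and reduce steps.

```python
from collections import defaultdict
from functools import reduce


def mapper(line):
    """Map step: emit a (word, 1) pair for every word in one line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group the emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(key, values):
    """Reduce step: collapse one key's values into a single count."""
    return key, reduce(lambda a, b: a + b, values)


def word_count(lines):
    # Each mapper call is independent of the others, which is what would let
    # a real framework hand different lines to different machines.
    mapped = [pair for line in lines for pair in mapper(line)]
    return dict(reducer(key, values) for key, values in shuffle(mapped).items())


if __name__ == "__main__":
    sample = ["the quick brown fox", "the lazy dog", "The end"]
    print(word_count(sample))
```

The thing that clicked for me is that no mapper call needs to know what any other one is doing, and no reducer call needs to see any key but its own. Once your problem is phrased that way, spreading it across machines becomes the framework's job instead of yours.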

Podcasts

  • Data Skeptic: Alternates between digestible explanations of tools and methods and talks with guests within the industry.
  • Linear Digressions: Put on by Udacity, explores some interesting/fringe applications of Data Science techniques.
  • Not So Standard Deviations: Really enjoy the data-analysis-as-software-development slant that this podcast always champions.
  • Partially Derivative: Two dudes drink and banter about Data Science headlines. Very casual, but often just as interesting.
  • Talk Python to Me: Very great, in-depth conversations with various project creators in the Python community. Lot of great insight into design decisions and the cultural underpinnings of the Open Source language.

YouTube Channels

  • Haven't pored over them, but PyCon uploads the presentations from each year. I imagine there's some real gold here. Here's 2015, 2016, and 2017. Let me know if you find anything particularly cool!
  • Similarly, I attended PyData DC last year and had an absolute blast (which, in turn caused me to spin up this whole blog thing). Same deal, give some of these a watch.
  • Thrilled that The Coding Train has gotten so much more traction since I came across him a while back. A good deal of algorithms are based in concepts learned from the Natural Sciences and it's clear that Shiffman has a strong grasp on these underlying principles, which makes him extremely effective in writing good code to emulate them. He does most of his work in a Java variant, but the clarity of his explanation of his algorithms is only outmatched by the enthusiasm and goofiness that he does it with.
  • Sentdex is going to take over the world and be a benevolent dictator, I'm sure of it. His channel is a treasure trove of Python tutorials on everything from web scraping, to Raspberry Pi, to using Image Recognition/Neural Networks to teach his computer how to drive in Grand Theft Auto V, and all kinds of in between.
  • Lastly, I'd be remiss if I didn't give a shout out to the guy that got me through all 4 steps of undergraduate calculus, PatrickJMT.

Sound Mind → Well-Oiled Machine

Books

I've gotten so much out of the following that I'm making an effort to revisit them multiple times over the coming years, with the new wealth of experiences and the perspectives they bring.

  • How to Win Friends and Influence People wouldn't still be in circulation 80+ years later if it weren't rife with practical insights. I know I had a baseless presupposition of the kind of person that would pick up this book, and in that regard, I'm happy that I got over myself. It's a very conversational read that, once I seriously thought about and practiced it, helped reshape the way I carry myself and perceive the intentions of others. If you don't consider yourself an affable people person (HARD introvert here), I hope you don't sleep on this book. There's a lot of good in it. Also it taught me the phrase "you unmitigated ass", so...
  • I picked up The Phoenix Project at a time where I was thinking a ton about organizational workflows. I kept having sinking feelings that I and the others around me could be doing more impactful things with our time, but couldn't articulate precisely how, nor what was holding us back. The book did an excellent job shaping the thoughts I was having, and I'm happy I got my hands on it. If any of this sounds familiar, I'd invite you to check out the (admittedly lengthy) write-up I did on the topic.
  • One of my favorite quotes from Getting Things Done that, in my opinion, captures the whole spirit of the book is "You should never have a thought twice... unless you like having that thought." The ultimate goal of this method is to get disciplined in contextualizing, prioritizing, and confidently acting on the litany of To-Dos and disparate facts that fly your way every day. As someone who is thoroughly a creature of habit and near-superhuman levels of forgetful, this hit me square between the eyes. I adopted the GTD method into my everyday life a year and some change ago and it's been a huge game-changer for me. Don't see myself ever going back.
  • Not a book, but at the time of writing, I'm almost done with the third installment of twelve in Jordan Peterson's lecture series on meaning. I hit the escape velocity of "Get to a good school, then do well in that school, so you might be lucky enough to get a good job that you enjoy." But with those boxes checked, I'll confess I really didn't give much forethought beyond that. As an avid lover of film, and by extension stories, his framing of self-actualization and meaning using narrative devices and a background in psychology is really finding me at the right time in my life and helping me answer "What am I all about?"

Tools

  • Nirvana. I'm confident that I wouldn't have been able to keep up with GTD for as long as I have if Nirvana didn't make it so damn easy. Syncs across a slick web app to mobile devices with all kinds of customization options to figure out a system that works for you. I've got some 30 projects to categorize Actions into ranging from topic areas that I want to study to "Chores and Errands," to having a thought-repository for the latest blog post I'm working on. You can star actions to blind your focus and give yourself a more manageable ToDo list-- mine seldom exceeds 5-7 and I keep a running tally of how many consecutive days I hit all of them. Reinforce good habits, right? Of course, this only works as much as you invest time into making it work for you. I've overhauled the project/tag structure a few times since starting, will pop into each project and order each item once or twice a week to make sure I still have a good handle on the work ahead. But ultimately, what this allows for me is feeling absolutely bulletproof to that creeping, "I feel like I'm forgetting something" feeling, and gives me an avenue to set reasonable expectations for what I want to get out of a day... and a clean method for picking up the next best thing if I'm killing it for whatever reason.
  • Daylio. I've been using Daylio 5 times a day since November '16 to track my mood/mental health against all kinds of activities, behaviors, and conditions. It gives some, but not a lot of, insight into some of the wacky A/B testing you might be trying to do with yourself, so I wrote some tools to supplement it.
  • Checky. Every day at noon, I get a push notification telling me how many times I picked up my phone to rush for a dopamine kick. It's a habit I'm really trying to get better at, but it's damn tough to kick. Assuming I'll be awake for 16 hours a day, I make an active effort to have a sub-60 check count-- or about once every 16 minutes or so. My max streak is 3 days, lol

Before I trudge on to a bunch more links, real talk, getting a handle on your mental health is far more important than any rabbit hole you could fire off into as an offshoot of this page. From experience, unchecked depression, anxiety, Impostor Syndrome... they're no joke. Frankly, no amount of career success, side-hustle achievement, League of Legends ELO, 10s of blog readers, or otherwise, is an appropriate stand-in for feeling a sense of comfort and easiness being in your own skin. If you don't feel that, I'd urge you to do yourself a favor and talk to someone. You can only mitigate so much on your own. The rest will always be there. The common denominator to anything you do is you. Take care of that.


Moderately-Insightful Procrastination Fuel

These are things that are thought-provoking or interesting, but not necessarily good tools to learn new skills or use as a reference for future projects.

YouTube Channels

  • Tech
    • I'd be remiss if I didn't mention the partner channels Numberphile and Computerphile which basically make topic videos on goofy math conundrums or computer science tricks and observations. Endlessly entertaining and make me, a thorough math-in-pencil kind of guy, want to write on paper towel in sharpie.
  • Movies
    • Lessons from the Screenplay routinely does excellent genre and movie-specific deep reads to talk about what makes a good screenplay. Tons of character studies, script-to-movie changes, and general structure topics. Really great stuff.
    • You can blame Every Frame a Painting for firmly cementing Hot Fuzz as my favorite movie. Every video on this channel is pure gold and provides rich insight into elements of movie-making I'd never even considered. Unfortunately, however, posts started to slow down and there's been dead air for nearly a year since the last video. But that's crazy understandable considering how in my head I get about pumping out blog posts at a fraction of the production value, haha.
  • Video Game
    • Mark Brown is probably my favorite channel on YouTube. He does some fantastic game design analysis on everything from "what makes a fun game mechanic" to talking about the brilliance of the procedural generation that some games use. His Boss Keys series is a must-watch for any Zelda fans, IMO, and if nothing more, this guy has been my lifeline for finding cool little indie games for the past couple years. Can't say enough good things about this channel.
    • Game Score Fanfare. I'm excited for this channel to take off. I love music. I love games. Similar to the way that Every Frame a Painting turns you on to design choices that go into movies, this channel's angle is all about the way that music plays such a crucial role in game experience.
    • Core-A Gaming. This one's crazy. Guy does some super-cool topic videos on the fighting game genre. He's got a hard topic bias toward the Street Fighter series, which I'm more or less clueless about. However, I keep a mild pulse on the Smash Brothers scene and play a few times a week at work. This guy's analysis of things like Mind Games, Balance Patch Design Paradigms, and Expression through Character Selection put me on my ass. Such a well-done, if niche, channel.
    • NakeyJakey. This one's way out of left field. Jake is an absolute goof. Like a hybrid between Dr. Steve Brule and VideoGameDunkey. But he makes some seriously cogent points about games between the punchlines and hilariously awkward editing. I've lost more hours than I'd care to admit binging this whole channel. It's glorious. It's original. I'm pretty sure it's gonna take off soon.
  • Lastly, if you're not watching CGP Grey, you're missing some of the very best YouTube has to offer. He's the XKCD of info-videos in both the clarity and the ridiculous breadth of topics.

Podcasts

  • More Perfect. Looks at pivotal Supreme Court cases and picks apart the nuance of why they were so groundbreaking/important at their time. Interviews with professors and jurors. Seriously good stuff.
  • Revisionist History. Malcolm Gladwell (the 10,000 hrs guy, among other things) does a ton of digging and investigation to give a new spin on stories from history-- everything from the NBA to Renaissance-era art.
  • The Q&A. Host has some excellent interview chops. Fun, often with really great guests right AS they're releasing movies, so you get a lot of behind-the-scenes stuff.
  • I Hate my Boss. Kind of a tough recommendation. The show is half cringey (and wildly missable) "comedy" sketch, half Q&A about a wide range of workplace related topics, such as sexual harassment, feeling undervalued, incompatibility with the job, finding your strengths. I'm either consistently rolling my eyes or writing down things I want to keep thinking about with this one. Worth checking out if you don't mind doing some sifting.

The Rest

YouTube

  • Blind Covers is hands-down the most exciting/inventive thing I've seen on the Internet. They grab a band, give them the lyrics to a song they've never heard before, and then give them an hour to-- completely blind, save for the lyrics-- come up with a cover for a song that's often a wildly-different genre. The host teases out some hilarious banter with the artists, the production value is top notch, and on more than one occasion I've had to throw the covers up on repeat because they were so damn good. I really can't fathom why this channel doesn't have at least 20-30 times the viewers it does...

Comics

  • I'd recommend the Matt Fraction run of Hawkeye to anyone not terribly familiar with comics-- it's a great standalone series. It's got an entire issue dedicated to the dialogue-less adventures of his animal companion, Pizza Dog. Come on, now.
  • Invincible is equal parts hyper-violence and quality writing.
  • For anyone looking for something weirder, I loved Scud the Disposable Assassin.
  • Even weirder yet, check out God Hates Astronauts.

That's All She Wrote (For Now)