Hairy Sun

Matt's Blog on Some Geeky Topics

Kickstarter for Python and Data Science Screencasts

Down the Training Rabbit Hole

In the past few months I have been doing a bit of corporate training for MetaSnake. I have noticed a few things:

  • Programmers who have used Python in a work setting miss details that I consider “beginning” knowledge.

  • Programmers who are new to Python keep coming back to things that I ignore or don’t think about (type checking, attribute protection, compiler warnings, whitespace).

  • Most content explains how to do something, but neglects the errors. The only recourse is Stack Overflow. Newbies get stuck (and frustrated) on errors.

  • There is often more than one way to do something. (This is annoying when you have just read the output of import this with the class.)

  • What is Pythonic and “idiomatic” in (C)Python is often not idiomatic in the Data Science stack.

  • Data Scientists using Python often use poor programming practices
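To make the second point concrete, here is a minimal sketch of what tends to surprise people coming from stricter languages. It is my own illustration, not material from a training session; the Account class and its attribute names are made up. It assumes Python 3.

```python
class Account:
    """Hypothetical class, used only for illustration."""

    def __init__(self, balance):
        self._balance = balance  # single underscore: "private" by convention only
        self.__pin = "1234"      # double underscore: name-mangled, not hidden

    def deposit(self, amount):
        # No type declaration: a bad argument only fails when this line runs
        self._balance += amount
        return self._balance


acct = Account(100)

# Nothing stops outside code from changing the "protected" attribute
acct._balance = 5

# The mangled name is still reachable if you know the rule
pin = acct._Account__pin

# Type errors surface at runtime, not at compile time
try:
    acct.deposit("ten")
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__
```

Neither underscore form is enforced access control, and the bad deposit is only caught when the addition actually executes. That is exactly the kind of thing newcomers ask about and experienced Python programmers stop noticing.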

Introducing Pycast

To that end, I’m Kickstarting a new screencast site called Pycast. Pycast will provide subscriptions to two channels: a Python channel and a Data Science channel. Users can subscribe to one or both.

My aim is to publish content in both channels on a weekly basis, at a reasonable price: one month will cost about as much as a lunch (US). Each episode will cover one topic. It will generally be under 10 minutes (you can watch it while you are waiting for lunch), play nicely on mobile devices, be very code heavy, and show hints as well as hangups.

Python Channel

The Python channel will cover topics like:

  • Beginning Python - using IDLE, variables, strings, functions, classes, etc.

  • Intermediate Python - comprehensions, closures, decorators, generators, magic methods.

  • Testing - unittest, doctest, coverage, CI, linting, generative testing, and more.

  • Environment - Using virtualenv and pip. Reproducing installs, and automation for testing and clean installs.

  • Packaging and Layout - How to lay out a package, get it up on PyPI, etc.

  • Standard library - Looking at common modules, and diving into the source.

  • 3rd Party - Review popular libraries, usage, coding styles, etc.

  • 2 vs 3 - Continue stoking the fire here. Writing code that supports both.
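For a taste of the "Intermediate Python" bucket, here is a short sketch that touches a comprehension, a closure, a decorator, a generator, and two magic methods. It is an illustration I put together for this post, not an actual episode; every name in it (count_calls, evens, Playlist) is made up, and it assumes Python 3 because it uses nonlocal.

```python
import functools


def count_calls(func):
    """Decorator that counts how often func is called."""
    calls = 0

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        nonlocal calls  # closure: wrapper remembers `calls` between invocations
        calls += 1
        wrapper.calls = calls
        return func(*args, **kwargs)

    wrapper.calls = 0
    return wrapper


@count_calls
def square(x):
    return x * x


def evens(limit):
    """Generator that lazily yields even numbers below limit."""
    for n in range(0, limit, 2):
        yield n


class Playlist:
    """Magic methods let instances work with len() and for loops."""

    def __init__(self, titles):
        self.titles = list(titles)

    def __len__(self):
        return len(self.titles)

    def __iter__(self):
        return iter(self.titles)


squares = [square(n) for n in evens(7)]  # comprehension over a generator
episodes = Playlist(["closures", "decorators"])
```

Each piece is small on its own. The hangups usually show up when they are combined, for example forgetting functools.wraps and losing the name of the decorated function.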

Data Science Channel

The other channel I want to start off with is a Data Science channel. I’ve been doing “data” in Python for a while now. There is a lot of demand for Python in this arena. While data science is a somewhat ambiguous term, my content will cover:

  • Collecting data - Scraping, crawling, and automation.

  • Inspecting data - Basic summary statistics and visualizations.

  • Munging data - The fun janitorial work that takes up too much of my time.

  • Analysis and Machine Learning - Classification, regression, clustering, etc.

  • Visualization - Both presenting the data, and analyzing the analysis.

Why not one channel?

Though there is some overlap in the content (hopefully Data Scientists are idiomatic Python programmers), I think the channels will appeal to different groups. I’ve certainly seen that in my corporate training gigs. That said, I might have some videos that belong to both channels (SQL is an example).

Why Kickstarter?

First of all, I want to try running a Kickstarter just to have experience with the platform. It is a nice way to launch a product. I also want to attract a group of users who are interested in getting early access and providing feedback, with the aim of having a better product when I launch to the world.

If this is of interest to you, I encourage you to join for a month, or a year. (If you or your company need training, I’ve also put in some packages that basically give you a half day or a full day of training with a year-long subscription thrown in.)

As of this blog post, I’m at 10% funding. Please spread the word.