Saturday, July 20, 2013

On Starting

Learning to program is hard. It's an uphill battle for anyone and as soon as a programmer achieves any level of competency, they start to form opinions on how others should learn to program. Everyone has opinions from what literature, language, editor, compiler, and hairstyle is appropriate for learning to program. After chatting with a few people today on stackoverflow, I thought that I'd write down my own set of ill formed and half baked thoughts on the matter.

First things first when you set about learning to program, know what you want out of it. If you are doing this because you want to write excel macros then great! Go learn about VB and be productive, more power to you. Otherwise, read on

1. It's All About High Level Concepts

If you're getting into this because you want to make a career than you should approach this differently. Your goal isn't to learn the appropriate keys to punch to make the screen light up. Rather you're aiming to learn the concepts that drive software that you'll write.

This is inherently less productive at first. If you're trying to really understand why things are the way that they are, it's far more laborious than just accepting it blindly. However it comes with the benefit that these concepts are timeless. 

Computer Science has been around for 60+ years. In that time, we've gone from C being high level to very low level, computers have vastly increased in speed, we've gone from vacuum tubes to multicore machines. However in that time algorithmic complexity hasn't suddenly stopped mattering, data structures haven't vanished. Sure you may be writing your memoizing algorithm in Haskell instead of assembler, but you still have to know how to memoize!

The next intelligent question is, what concepts are important?

Abstraction is King

All these buzzwords people through around, Object Orientation, Lazy Evaluation, Declarative Programming, Parametric Polymorphism, it's all there for one reason. To think as little as possible about how things are done, instead focusing on what things are done. For example
  • Object Orientation - Hide implementation details
  • Declarative Programming - Don't talk about implementation details
  • Lazy Evaluation - Don't talk about how and when things are evaluated
Notice the theme?

I think this is the single most important concept for new programmers to learn. Good code is all about finding the right level of abstraction for the problem.

Algorithms are Queen

In my opinion, if you're writing code that doesn't deal at all with algorithms, you're not even programming. Doing so is like saying "I'm writing without thinking about plot".

Algorithms are the substance of your programs. Learning them comes in 2 stages.
  1. Algorithmic thinking
  2. Algorithms
Algorithmic thinking is about learning how to make algorithms. Stuff like dynamic programming, and divide and conquer techniques.  These techniques provide a nice foundation for how to approach problems and understand other algorithms. I've found that learning them is best done through example which brings me to 2.

Learning specific algorithms is a tricky business, on one hand you want to learn how to apply algorithmic thinking and techniques, but you don't want to memorize a compendium of algorithms. So don't go and memorize 20 sorting algorithms, but I think a few important ones are
  1. Some greedy algorithms (Dijkstra's is my favorite)
  2. Some divide and conquer (Merge sort is easy to grok)
  3. Some dynamic programming (Knapsack problem)
  4. Some graph/tree search algorithms (BFS & DFS)
And many many more. People write lots and lots of algorithms books, I'd encourage any new programmer to read one as a second book. Personally I favor Skiena's Algorithm Design Manual

Data Structures are Bedrock

Data Structures are up there with algorithms. I was unsure of which should come first actually. They're bedrock in the sense that your program is fundamentally based on them. If you can't represent data well, then how can you expect to operate on it?

Learning data structures is similar to algorithms, you start with some basic ideas and learn a few examples of how to apply them.

In my opinion the "basic idea"s of data structure are ADTs, Abstract Data Types. These are just a set of operations that are defined upon however you represent your data. The most common ones are
  1. Lists
  2. Sets
  3. Dictionaries (associative lists)
  4. Graphs
  5. Trees
So you learn what set of operations define a list and what a graph can do, and then you learn a few implementations like
  1. Linked lists
  2. Hash tables
  3. Balanced binary trees
  4. Association tables
And so on. Much the same as algorithms, the best way to do this is with a good book. Once again I'll recommend Skiena's.

2.  Don't Fixate on Details

I admit it, when I first picked up programming I focused way to much on unimportant stuff. I thought it was cool, "Oh did you know that <insert absurdly detailed fact> was implemented in C++98 by <insert obscure compiler> and later adopted by the standard for C++03 but dropped for C++11!!". Looking back I realize that this was a dumb thing to do.

Languages are Tools

This hurt a little to say. It's completely true, but it's still hard to accept.

In the end, the fact that you're using Python instead of Ruby couldn't matter less. That doesn't stop people (including me) from endlessly debating which is better (Haskell!1!!) but really it's like craftsman discussing power tools. Sure people have preferences, but when your starting it matters very little. You'll have plenty of time to become opinionated later :)

In fact, you should plan on learning a new language after a year. And another one a year later on. And another and another. 

I personally maintain that you should be learning a new language once a year. This will make you a better programmer and cause you to focus more on the Big Issues. Plus, if you're developing software in the Real World, you'll often have to deal with new languages and technologies.

There's just no point in fussing over what language you learn.

The important thing is to not get stuck in trivia-style details. This brings me to my next point

Complicated is Bad

I learned C++ first. C++ is complicated. I had a miserable first 2 months just getting off the ground at all. I just said that all languages are equal, well that was a bit of a fib. All simple languages are equal.

When suggesting a language for a beginner, my philosophy is: 

If a language is making you focus on the language, not abstraction, algorithms, or data structures, it's a bad language for a beginner.

Does this mean C++ is a terrible language? Not necessarily. But it means that it's a bad language to learn with. Starting with a simple, high level language is a good way to let you focus immediately on the important concepts.

After you've gotten the ball rolling, then you can take a look at more complex languages like C++. Your prior knowledge will make learning them much more manageable.

And yes I've very deliberately avoiding mentioning which high level language to start with.

Environments Are Less Than Languages

Often when I get asked how to start programming, 2 questions are asked
  1. What Language?
  2. What IDE/Compiler/OS?
I just said that the language doesn't really matter. Well the environment matters even less.

Your IDE is a place where you type. Your compiler makes 1s and 0s for you. Your OS allocates memory and lets you play videos of kittens.

Unless you've got 10 million lines of code to crawl through, you'll be just fine with any reasonable environment. Personally I don't even use an IDE, I'm just using Emacs. I fluctuate between something like 4 C compilers depending on my mood. I know people who have gone through 30 years of professional software development without using an IDE. They're just not important.

Pick one that makes you feel comfortable, than get back to focusing on what matters. You can always switch later.

3. Dip Your Toes in Many Areas

Aside from the basics, you shouldn't stagnate in one area. As I've said a few times, any one area of computer science/programming will quickly become outdated and die off.

Just think, until you get a decent overall feel for computer science, you'll have no idea what's useful and interesting. Perhaps you'll discover a hidden talent for compiler or OS design.

Now this is the part that takes a long time, think in the order of years. The good news is that you can (and should) be writing software at this point.

Learn Different Abstractions

This is the big one, learning different abstractions is critical to being a good programmer. When all you know how to do is hammer nails, you'll have a heck of time working with screws. Learning new abstractions will help you deal with the diverse set of problems you'll encounter.

Now there are 2 main ways to do this
  1. Learn new paradigms
  2. Learn different levels of abstraction
For 1. this means learning about functional programming, object oriented programming. Fiddle with different paradigms for different things. Learn what logic programming is good for and when imperative programming is ill-suited.

For 2. this means trying low and high level languages. I still advise starting with a high level language, but don't skimp on low level stuff. Learning C is pretty mind bending coming from Scheme, learning assembler is pretty mind bending coming from Haskell. Learning abstractions at different levels means that you'll develop an intuition for how some of these abstractions work. This in turn will deepen your understanding of how to use this abstraction.

Learn Different Areas of CS

Learning about different areas of CS will help in 2 ways
  1. Give you new ways to think about problems
  2. Let you discover your interests
Some of the most profound advancements come from mixing 2 divergent ideas. Fermat's last theorem was finally proven with a theorem about elliptic curves! CS is no different. By learning different areas you'll be able to apply a whole different set of tools to your problems. For example, Machine Learning offers a whole different set of approaches that my lab applies to a classic Human-Computer interaction problem. This leads to all sorts of interesting results!

Hopefully 2. is obvious; you just have no idea what'll fall in love with until you look. I had no idea I'd like compilers so much until I tried reading a book on them.

This last sentence brings me to another key point

Read Lots

This is really the best way to learn new things. Online resources are hit and miss. But books have a much higher signal to noise ratio. Personally I'm always reading a book on computer science (currently Semantics of Programming Languages).

If you're interested in the list of books I think are good, there's a link at the top of my blog. As for what book to start with, I always recommend Structure and Interpretation of Computer Programs. This one comes with the added benefit of being free online! You'd be surprised how many books are for computer science.

Note the difference between "Computer Science" and "Programming" books. One teaches a concept/theory, one teaches a tool. Both are important so make sure you read a mix of each. SICP is kind of an exception to this as it teaches both Scheme and basic CS. It's the primary reason I recommend it.

4. Apply Your Knowledge

This is the most obvious and arguably the most important. You have to practice to become a great programmer or computer scientist.

Pick a project that interests you, and work on it in tandem with your readings. Some projects I've done and enjoyed are
  1. Write a series of prime number calculators
  2. Write a calculator (Parser or GUI)
  3. Write a Chat Program
  4. Write a Chat Website
  5. Write a Scheme Interpreter
  6. Write a Scheme Interpreter in Your Scheme Interpreter (Repeat as often as you like)
  7. Implement a logic programming DSL
  8. Write a Garbage Collector
  9. Write an SML compiler (Current Project)
  10. Write an FTP server
Do whatever makes you happy!  Scratch an itch and write something you want. Make your project free and open source so people can contribute. You can even contribute to others projects to see how they program.

 Just practice and have fun

Good Luck!

No comments:

Post a Comment