## Learning Artificial Intelligence

One reason to love being a mathematician is that the mathematical universe is so staggeringly enormous that there’s never a shortage of fun new directions to go in the mathematical sciences.

Recently, I’ve been learning AI, and really enjoying digging into both the mathematical aspects of intelligence and the concrete programming skills for building state-of-the-art systems.  One big help for this has been a two-term Udacity “Nanodegree” that I just completed.

## Artificial Intelligence Nanodegree and Specializations

The whole program, which took from January through August 2018, consisted of three parts:

1. Introduction to Artificial Intelligence. This covered a bunch of topics in “classical” AI, largely following Russell and Norvig’s text, Artificial Intelligence: A Modern Approach.
2. Deep Learning and Applications.  This covered the hottest topic in AI these days—using deep neural networks to do machine learning—and a bunch of applications.
3. Specializations:
• Computer Vision.
• Natural Language Processing.
• Voice-User Interfaces.

These courses covered a lot of stuff!  For me, it was a good way to boost my skills quickly and help me catch up on developments and literature in AI.  Some people who are also interested in AI have been asking me about this program, so here’s more about what I did in it, and some evaluation of the Udacity courses.

### Projects

A good thing about the Udacity courses is that they’re project-focused.  The required projects for my Nanodegree included:

• Sudoku Solver: Built an AI agent to solve sudoku, using a combination of explicitly programmed strategies and tree search.
• Game Playing Agents: Built an adversarial search agent to play a 2-player game (Isolation, and variations) with superhuman skill.
• Implementing a Domain-Independent Planner: Agent for optimal planning using PDDL and a variety of uninformed and informed search algorithms.  Applied this to air cargo transportation problems.
• Sign Language Recognizer: Used Hidden Markov Models to build models to transcribe sign language from hand position data extracted from video of American Sign Language speakers.
• Neural Dog-Breed Classifier: Used Convolutional Neural Networks (CNN) and transfer learning to build an app that can recognize dog breeds from pictures of dogs, with human-level accuracy.
• Time Series and Automatic Text Generation: Used Recurrent Neural Networks (RNN) for modeling time series such as financial data, and also to generate text automatically, by training an RNN on a Sherlock Holmes novel.
• Machine Translation: The “capstone” project for the Natural Language Processing specialization, this involved building multiple RNN architectures, ultimately leading up to a bidirectional encoder/decoder (seq2seq) model with word embedding, to translate between English and French.

I was only required to do one of the three possible capstone projects, and chose to do the one associated with NLP.

Most of the projects were well designed, and introduced a variety of tools.  The programming was all in Python.  Some projects were stand-alone Python code built in an IDE, while others were done using Jupyter notebooks.  Some of the code was written for the student, leaving the key pieces for the student to do, and there were sometimes optional parts covering more advanced ideas.  A few projects also involved written reports, explaining the design choices in the project and/or reviewing some relevant related literature.

My main complaints with the projects: (1) In a few of the projects, I felt like some of the interesting coding that should have been left to the student was pre-coded, and (2) I could sometimes see better methods than what was specified in the project, but had to do it their way to meet specifications and/or pass the automated grader.  Still, overall, the projects helped me learn what I wanted to, and I could make my own private versions where I could do things my own way.

The course was also filled with other chances for practical experience: multiple optional (self-evaluated) projects, “mini-projects”, “labs”, and other coding exercises and guidance for implementing a range of AI tools.
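
For readers who haven't met adversarial search, the technique behind the game-playing project listed above, here is a minimal sketch of minimax with alpha-beta pruning. It is not the course's project code. The toy game (a pile of stones, each player removes 1 to 3, whoever takes the last stone wins) and the names `MOVES`, `alphabeta`, and `best_move` are assumptions chosen to keep the example short. Isolation needs a board representation and a heuristic evaluation function on top of this.

```python
import math

# Hypothetical toy rules: on each turn a player removes 1, 2, or 3 stones.
MOVES = (1, 2, 3)


def alphabeta(stones, maximizing, alpha=-math.inf, beta=math.inf):
    """Game value with best play: +1 if the maximizing player wins, else -1."""
    if stones == 0:
        # The previous player took the last stone, so the player to move lost.
        return -1 if maximizing else 1
    if maximizing:
        value = -math.inf
        for take in MOVES:
            if take > stones:
                break
            value = max(value, alphabeta(stones - take, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimizing player will never allow this branch
        return value
    value = math.inf
    for take in MOVES:
        if take > stones:
            break
        value = min(value, alphabeta(stones - take, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the maximizing player already has a better option
    return value


def best_move(stones):
    """Return the number of stones to take that maximizes the game value."""
    best, best_value = None, -math.inf
    for take in MOVES:
        if take > stones:
            break
        value = alphabeta(stones - take, False)
        if value > best_value:
            best, best_value = take, value
    return best


if __name__ == "__main__":
    for pile in (5, 7, 10):
        print(pile, "stones: take", best_move(pile))
```

In this toy game the search recovers the known strategy of leaving the opponent a multiple of four stones. Pruning doesn't change the answer; it only skips branches that cannot affect it, which is what makes deeper search affordable in a larger game.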

### The Math

The mathematics in the nanodegree was sometimes covered a bit superficially.  That’s probably to be expected, since most of the people who take these courses come from a software engineering background, not from the mathematical sciences.  Some mathematical friends have asked me whether I can recommend the program for people with more math background, so I’d like to say a bit about the mathematical level.

The first thing to realize is that AI is a rather mathematically intensive area of computer science. To do well in the course, you need at least a solid grasp of linear algebra, logic, and basic probability and statistics, or sufficient mathematical sophistication to learn these subjects quickly.  These are the mathematical workhorses of machine learning and AI, and mathophiles will be glad to know Udacity’s courses don’t shy away from expecting students to know this stuff.

However, the more math you know, the more you can get out of the course.  One can approach AI at a wide range of mathematical levels.  Though the course is targeted at an undergraduate math level, this was fine: rather than teaching the more mathematical aspects, which I would surely have been too critical of anyway, the courses mostly pointed me to resources where I could learn what I wanted to, and gave enough of the mathematical ideas that I could fill in the details myself.  I had a lot of fun seeing unexpected connections to areas of math I’ve worked on before.

## Overall Impressions and Recommendations

If you’re in the mathematical sciences and interested in getting into AI, the Udacity Nanodegrees are, in my opinion, a reasonable way to do it.  You can’t do exactly the program I did, because Udacity has done some restructuring of their Nanodegree programs since I started, and no longer offers the original AI Nanodegree.  However, you can do these:

• Artificial Intelligence Nanodegree (new version) — This is essentially the first term of my Nanodegree program, covering classical AI.
• Deep Learning Nanodegree — This has been available for some time, but it looks like they’ve recently converted this from TensorFlow to PyTorch.  I think this is a smart move. My Nanodegree was based on TensorFlow and Keras, but I’ve done a bit of PyTorch, and like it a lot.
• Machine Learning Engineer Nanodegree — While deep learning is really hot right now, and has some really amazing applications, it’s not the right tool for everything. If you don’t already know more standard machine learning algorithms, or if you just want a more well-rounded approach to machine learning, this might be a good place to start.  I haven’t seen any of this material.
• Natural Language Processing Nanodegree — This is a whole Nanodegree that grew out of the “NLP Specialization” part of mine.  There’s a bunch of new material that wasn’t part of my program, and it’s good — Udacity kindly granted students in my program access to the new material, so I’ve also done most of this.  This also includes the material from the “Voice User Interfaces” specialization in my Nanodegree.
• Computer Vision Nanodegree — This grew out of the “Computer Vision Specialization” part of my Nanodegree. There’s some great material here as well, which has been considerably expanded in the full CV Nanodegree.  It also includes work in C++, especially since many computer vision applications are embedded in systems where Python-level performance isn’t enough.

Since I had 4 degrees before (BS, BA, MS, and PhD), and since “nano” is the prefix for $10^{-9}$, I guess this new “Nanodegree” brings me up to 4.000000001 degrees. But, actually, I think it was worth more than that.  It’s at least a Centidegree. :-)