Recently, I’ve been learning AI, and really enjoying digging into both the mathematical aspects of intelligence and the concrete programming skills for building state-of-the-art systems. One big help for this has been a two-term Udacity “Nanodegree” that I just completed.
The whole program, which ran from January through August 2018, consisted of three parts:
These courses covered a lot of stuff! For me, it was a good way to boost my skills quickly and help me catch up on developments and literature in AI. Some people who are also interested in AI have been asking me about this program, so here’s more about what I did in it, and some evaluation of the Udacity courses.
A good thing about the Udacity courses is that they’re project-focused. The required projects for my Nanodegree included:
I was only required to do one of the three possible capstone projects, and chose the one associated with NLP.
Most of the projects were well designed, and introduced a variety of tools. The programming was all in Python. Some projects were stand-alone Python code built in an IDE, while others were done using Jupyter notebooks. Some of the code was written for the student, leaving the key pieces for the student to do, and there were sometimes optional parts covering more advanced ideas. A few projects also involved written reports, explaining the design choices in the project and/or reviewing some relevant related literature.
My main complaints with the projects: (1) In a few of the projects, I felt like some of the interesting coding that should have been left to the student was pre-coded, and (2) I could sometimes see better methods than what was specified in the project, but had to do it their way to meet specifications and/or pass the automated grader. Still, overall, the projects helped me learn what I wanted to, and I could make my own private versions where I could do things my own way.
The course was also filled with other chances for practical experience: multiple optional (self-evaluated) projects, “mini-projects”, “labs”, and other coding exercises and guidance for implementing a range of AI tools.
The mathematics in the nanodegree was sometimes covered a bit superficially. That’s probably to be expected, since most of the people who take these courses come from a software engineering background, not from the mathematical sciences. Some mathematical friends have asked me whether I can recommend the program for people with more math background, so I’d like to say a bit about the mathematical level.
The first thing to realize is that AI is a rather mathematically intensive area of computer science. To do well in the course, you need at least a solid grasp of linear algebra, logic, and basic probability and statistics, or sufficient mathematical sophistication to learn these subjects quickly. These are the mathematical workhorses of machine learning and AI, and mathophiles will be glad to know Udacity’s courses don’t shy away from expecting students to know this stuff.
However, the more math you know, the more you can get out of the course. One can approach AI at a wide range of mathematical levels. Though the course is targeted at an undergraduate math level, this was fine: rather than teaching the more mathematical aspects, which I would surely have been too critical of anyway, the courses mostly pointed me to resources where I could learn what I wanted to, and gave enough of the mathematical ideas that I could fill in the details myself. I had a lot of fun seeing unexpected connections to areas of math I’ve worked on before.
If you’re in the mathematical sciences and interested in getting into AI, the Udacity Nanodegrees are, in my opinion, a reasonable way to do it. You can’t do exactly the program I did, because Udacity has done some restructuring of their Nanodegree programs since I started, and no longer offers the original AI Nanodegree. However, you can do these:
Since I had 4 degrees before (BS, BA, MS, and PhD), and since “nano” is the prefix for 10⁻⁹, I guess this new “Nanodegree” brings me up to 4.000000001 degrees. But, actually, I think it was worth more than that. It’s at least a Centidegree. :-)
Can I convert a gaming laptop into a deep learning machine?
The answer is Yes, and I’m writing this post in case it’s helpful to someone who wants to do something similar. Also, I’m writing it so I can remember what I did.
If you want to train neural networks, GPUs are an order of magnitude faster than CPUs. I needed a GPU system for my current work, but I also like being mobile, running computations on the go without having to rely on pay-by-the-hour cloud services like AWS. I needed a laptop with a decent GPU.
But I also had some concerns. If I’d had the money, I would have gotten a System 76 machine, or some other laptop designed for machine learning, but I had a budget of around $1000. I also suspected that GPUs in many standard laptops are probably severely underclocked for heat reasons. A gaming laptop seemed like a potentially good solution, since gamers need performance.
Unfortunately, I had a hard time finding reliable-sounding information online about whether most gaming laptops play nicely with Linux. The ones I looked at were mostly not on Ubuntu’s certified hardware list, and I had already been burned once: a recent bad experience with hardware that crashed frequently under Ubuntu left me slightly nervous about trying to convert a gaming laptop into a deep learning machine.
Eventually, I came up with a solution I’m very happy with. My Dell G7 15 gaming laptop works great for what I needed:
I’ve now been using this machine for a couple of months, and it’s been great for my purposes! I’m happily and quickly training neural networks, even in places where I’ve got no connection to the cloud. True, it’s a bit bulky compared to most laptops, and the power draw means the battery typically holds only a couple of hours of charge, but those are really the only downsides.
Below, you can see the steps I used. I wrote these mostly so I can refer back to them easily, but I’d be happy if they’re helpful to anyone else who wants to do the same…
Here’s what I did. To figure out these steps, two articles I found helpful were ones by Sanyam Bhutani and Taylor Denouden.
- In Windows, go to Troubleshoot > Advanced Options > UEFI Firmware Settings, and restart again.
- When booting the Ubuntu installer, add `nouveau.modeset=0` to the end of the line starting with `linux`, then press F10 to boot.
- On the first boot after installing, again add `nouveau.modeset=0` to the end of the line starting with `linux`.
- `sudo apt-get update`
- `ubuntu-drivers devices` to show the recommended drivers for the hardware
- `sudo ubuntu-drivers autoinstall` to get the recommended drivers
- `sudo reboot`
- (`cd /usr/local/cuda-9.2/bin` and `sudo ./uninstall_cuda_9.2.pl` to uninstall CUDA 9.2 if necessary)

I may end up doing this again eventually, which is one reason I wanted to record all this while I still remember what I did. For one thing, it will be nice when NVIDIA adds Ubuntu 18 support. For another, in retrospect, I perhaps shouldn’t have done the dual boot thing — more than two months later, I’ve never booted into Windows since installing Ubuntu. But I could have anticipated that.
Here’s how it works. Start with an equilateral triangle:
Mark off each edge into thirds, and connect those points with lines using this pattern:
This divides up the triangle into seven smaller triangles, four equilateral and three isosceles. Remove the three isosceles triangles to get this:
But now, notice that the three triangles you removed can each be cut in half and then taped back together along an edge, so that you get three equilateral triangles. Do that, and place the new equilateral triangles so that they stick out from the sides of the original triangle, and you will get this:
That’s it for step 1!
Now, for step 2, repeat this whole process for each of the 7 equilateral triangles obtained in step 1. Step 3: Do the same for each of the 49 triangles obtained in step 2. And so on. My original picture, at the top of this post, is what you get after step 5.
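The counting here is easy to sketch in code, assuming nothing beyond the 7-for-1 replacement rule described above:

```python
# Each step replaces every equilateral triangle with 7 smaller
# equilateral triangles, so counts grow as powers of 7.
def triangle_count(step):
    """Number of triangles present after `step` rounds of the construction."""
    return 7 ** step

counts = [triangle_count(n) for n in range(4)]
print(counts)  # [1, 7, 49, 343]
```

So by step 5, the picture at the top of the post contains 7⁵ = 16807 triangles.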
Notice that each step is area-preserving, so in particular, the total area of all of the black triangles in my original picture is the same as the area of the triangle I started with.
Here’s an animation showing the first five steps in the sequence, and then those same steps backwards, going back to the original triangle:
The reason the picture seems to get darker in the latter steps is that the triangles are drawn with black edges, and eventually there are a lot of edges. Since there’s a limit to how thin the edges can be drawn, eventually, the picture is practically all edges.
The outline of the entire picture is clearly a Koch curve, so we have generated a Koch curve from a triangle. But, what I really love about this construction is that every triangle that occurs at any step in the recursive process also spawns a Koch curve! That’s a lot of Koch curves.
To make this precise, we can assume that triangles at each step are closed subsets of the plane. Admittedly, the “cutting” analogy falls apart slightly here, since two pieces resulting from a “cut” each contain a copy of the edge the cut was made along, but that’s OK. With this closure assumption, each of the Koch curves, one for each triangle formed at any stage in the process, is a subset of the intersection over all steps.
I’ll avoid explaining this one for now, except to say that I generated it starting from a single triangle, and iteratively replacing each triangle by seven new triangles. This is the sixth generation. The construction differs only slightly from my previous Koch snowflake fractal, in which each triangle had six descendants. I really like this new version, because you can see Koch snowflakes showing up in even more (infinitely more!) places than before.
There are also analogs of this for squares and pentagons!
That’s his three-dimensional interpretation of the first few iterations of this design of mine:
What’s fun about Colton’s version is that each new layer of squares is printed a bit taller than the previous layer. I had only ever imagined these as two-dimensional objects, so it’s really fun to have 3-dimensional models of them to hold and play with! Colton’s idea of adding some depth really adds another … er … dimension to the overall effect:
His work also gives a nice way to illustrate some of the features of these fractals. For example, visually proving that the “inside” and “outside” in my fractals converge to the same shape can be done by printing the same model at different scales. Here are three copies of the same fractal at different scales, each printed with the same number of iterations:
Not only do these nest tightly inside each other, the thickness is also scaled down by the same ratio, so that the upper surfaces of each layer are completely flush.
Colton has been doing this work partly because designing fractals is a great way to learn 3d printing, and he’s now getting some impressively accurate prints. But, I also like some of his earlier rough drafts. For example, in his first attempt with this fractal based on triangles:
there were small gaps between the triangles, which Colton hadn’t intended. But, this gave the piece a sort of rough, edgy look that I like, and it casts shadows like castle battlements:
Colton is still doing nice new work, and we’ll eventually post some more pictures here. But I couldn’t wait to post a preview of some of his stuff!
(Designs and photos © 2018 Colton Baumler and Derek Wise)
This workshop was a lot of fun! I learned a lot, had the chance to talk to people I’ve known for a long time, and meet others I hadn’t managed to connect with before. I was especially excited to find out about some lines of work in progress that build on my work with Catherine Meusburger on Hopf algebra gauge theory.
In fact, our work on this seems to have been an impetus for the workshop, and it was really gratifying to see how other people are beginning to apply our theory, and also work out some interesting examples of it for particular Hopf algebras! I’m anticipating some interesting work coming out in the near future.
Here’s the conference photo; I’m farthest right, and my coauthor, Catherine, is the 11th head from the left, peeking out from the second row:
I gave an introductory talk on the subject of Hopf algebra gauge theory, and you can download the slides from my talk, or even watch the video. Catherine’s talk followed mine, and she showed how Kitaev models are related to Hopf algebra gauge theory in the same way that Turaev-Viro TQFTs are related to Reshetikhin-Turaev TQFTs. Video of her talk is also available. Of course, for more detail on Hopf algebra gauge theory, you can also check out our paper: Hopf algebra gauge theory on a ribbon graph.
I can also recommend watching other talks from the conference, available from the webpage linked to above. This was just the kind of conference I like best, since it brought people from multiple research communities together, in this case including mathematicians and physicists of various sorts as well as mathematical computer scientists. Kitaev models have been a hot topic the past few years, and one reason I think they’re fun is precisely that people from several areas—quantum computation, Hopf algebras, category theory, quantum gravity, quantum foundations, topological quantum field theory, condensed matter physics, and more—are working together. Of course, this probably also helps explain the rather long conference title.
generates a bunch of copies of the Koch snowflake at different scales:
Similarly, I’ve shown (2) how letting squares reproduce like this:
generates a bunch of copies of a fractal related to the Koch snowflake, but with 8-fold symmetry:
So what about letting pentagons reproduce? For pentagons, an analog of the replication rules above is this:
Each of the 10 cute little pentagon children here is smaller than its parent by a fixed scaling factor involving φ, the golden ratio.
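Since the golden ratio keeps showing up, here is a quick numerical sanity check of two standard facts we'll lean on (a sketch of well-known properties of φ, not the construction's specific scaling factor): φ = (1 + √5)/2 satisfies φ² = φ + 1, and the diagonal of a regular pentagon is φ times its side.

```python
import math

phi = (1 + math.sqrt(5)) / 2  # the golden ratio

# Defining identity of the golden ratio: phi^2 = phi + 1
assert abs(phi**2 - (phi + 1)) < 1e-12

# In a regular pentagon, diagonal / side = phi.
# Vertices of a regular pentagon inscribed in the unit circle:
verts = [complex(math.cos(2 * math.pi * k / 5), math.sin(2 * math.pi * k / 5))
         for k in range(5)]
side = abs(verts[1] - verts[0])      # adjacent vertices
diagonal = abs(verts[2] - verts[0])  # vertices two apart
assert abs(diagonal / side - phi) < 1e-12
```

The diagonal-to-side ratio is why φ governs how scaled-down pentagon children fit against their parent's edges.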
However, something interesting happens here that didn’t happen with the triangle and square rules. While triangles and squares overlap with their ancestors, pentagons overlap with both their ancestors and their cousins. The trouble is that certain siblings already share faces (I know, the accidental metaphors here are getting troublesome too!), and so siblings’ children have to fight over territory:
In this three-generation pentagon family portrait, you can see that each second generation pentagon has two children that overlap with a cousin.
As we carry this process further, we get additional collisions between second cousins, third cousins, and so on. At five generations of pentagons, we start seeing some interestingly complex behavior develop from these collisions:
There’s a lot of fascinating structure here, and much of it is directly analogous to the 6-fold and 8-fold cases above, but there are also some differences, stemming from the “cousin rivalry” that goes on in pentagon society.
Let’s zoom in to see some collisions near where the two ‘wreaths’ meet on the right side of the picture:
I find the complicated behavior at the collisions quite pretty, but the ordering issues (i.e. which members of a given generation to draw first when they overlap) annoy me somewhat, since they break the otherwise perfect decagonal symmetry of the picture.
If I were doing this for purely artistic purposes, I’d try resolving the drawing order issues to restore as much symmetry as possible. Of course, I could also cheat and restore symmetry completely by not filling in the pentagons, so that you can’t tell which ones I drew first:
It’s cool seeing all the layers at once in this way, and it shows just how complex the overlaps can start getting after a few generations.
Anyway, because of these collisions, we don’t seem to get a fractal tiling of the plane—at least, not like we got in the previous cases, where the plane simply keeps getting further subdivided into regions that converge to tiles of the same shape at different scales.
Actually, though, we still might get a fractal tiling of the plane, if the total area of overlap of nth generation pentagons shrinks to zero as n goes to infinity! That would be cool. But, I don’t know yet.
In any case, the picture generated by pentagons is in many ways very similar to the pictures generated by triangles and squares. Most importantly, all of the similar-looking decagonal flower-shaped regions we see in this picture, including the outer perimeter, the inner light-blue region, and tons of smaller ones:
really are converging to the same shape, my proposal for the 10-fold rotationally symmetric analog of the Koch snowflake:
How do we know that all of these shapes are converging to the same fractal, up to rescaling? We can get a nice visual proof by starting with two pentagons, one rotated and scaled down from the other, and then setting our replication algorithm loose on both of them:
Proof:
We see that the area between the two fractal curves in the middle shrinks closer to zero with each generation.
Puzzle for Golden Ratio Fans: What is the exact value of the scaling factor relating the two initial pentagons?
Next up in this infinite series of articles: hexagons! …
I’m joking! But, it’s fairly clear we can keep ascending this ladder to get analogs of the Koch snowflake generated by n-gons, with (2n)-fold rotational symmetry. More nice features might be sacrificed as we go up; in the case generated by hexagons, we’d have collisions not only between cousins, but already between siblings.
In the Sierpinski triangle, each triangle yields three new, scaled-down triangles, attached at the midpoints of sides of the original, like this:
These triangles are usually thought of as “holes” cut out of a big triangle, but all I care about here is the pattern of the triangles. As I explained last time, the Koch snowflake can be built in a similar way, where each triangle produces six new ones, like this:
You might say this bends the usual rules for making fractals since some of the triangles overlap with their ancestors. But, it makes me happy because it lets me think of the Sierpinski triangle and the Koch snowflake as essentially the same kind of thing, just with different self-replication rules.
What other fractals can we build in this way? The Sierpinski carpet is very similar to the Sierpinski triangle, where we now start with squares and a rule for how a square produces 8 smaller ones:
This made me wonder if I could generalize my construction of the Koch snowflake using triangles to generate some other fractal using squares. In other words, is there some Koch snowflake-like fractal that is analogous to the ordinary Koch snowflake in the same way that the Sierpinski carpet is analogous to the Sierpinski triangle?
There is! Taken to the 5th generation, it looks like this:
The outline of this fractal is an analog of the Koch snowflake, but with 8-fold symmetry, rather than 6-fold. Compare the original Koch snowflake with this one:
Just as I explained last time for the Koch snowflake (left), the blue picture above actually provides a proof that the plane can be tiled with copies of tiles like the one on the right, with various sizes—though in this case, you can’t do it with just two sizes of tiles; it takes infinitely many different sizes! In fact, this tiling of the plane is also given in Aidan Burns’ paper I referenced in the previous post.
But, my construction is built entirely out of self-replicating squares. What’s the rule for how squares replicate?
Before I tell you, I’ll give two hints:
First, each square produces 8 new squares, just like in the Sierpinski carpet. (So, we could actually make a cool animation of this fractal morphing into the Sierpinski carpet!)
Second, you can more easily see some of the bigger squares in the picture if you make the colors of the layers more distinct. While I like the subtle effect of making each successive generation a slightly darker shade of blue, playing with the color scheme on this picture is fun. And I learned something interesting when my 7-year-old (who is more fond of bold color statements) designed this scheme:
The colors here are not all chosen independently; the only choice is the color of each generation of squares. And this lets us see bits of earlier-generation squares peeking through in places I hadn’t noticed with my more conservative color choices.
For example, surrounding the big light blue flower in the middle, there are 8 small light blue flowers, and 16 even smaller ones (which just look like octagons here, since I’ve only gone to the 5th generation); these are really all part of the same light-blue square that’s hiding behind everything else.
The same thing happens with the pink squares, and so on. If you stare at this picture, you can start seeing the outlines of the squares.
So what’s the rule? Here it is:
The 8 small squares are all the same size, and the side of the big square is two sides plus a diagonal of the small squares, so the squares are scaled down by a factor of 1/(2 + √2).
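The scaling factor is easy to double-check numerically: if the big square has side 1 and each small square has side r, the "two sides plus a diagonal" relation reads 2r + √2·r = 1. A quick sketch:

```python
import math

# Side of the big square = two sides plus a diagonal of a small square:
#   1 = 2*r + sqrt(2)*r   =>   r = 1 / (2 + sqrt(2))
r = 1 / (2 + math.sqrt(2))

# Verify the geometric relation holds:
assert abs(2 * r + math.sqrt(2) * r - 1) < 1e-12

print(round(r, 4))  # 0.2929
```

So each generation of squares is a bit less than a third the size of the previous one.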
Up next: Triangles and squares were fun. What fun can we have with pentagons?
(All images in this post copyright 2017, Derek Wise)
The Koch snowflake is usually constructed starting with an equilateral triangle by replacing the middle third of each side with an equilateral triangular protrusion, doing this again to the resulting polygon, and so on. The first seven steps are shown in this animation:
and the Koch snowflake is the “limit” of this process as the number of steps goes to infinity.
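The bookkeeping behind this standard construction is simple to sketch in code: each step replaces every side with 4 sides, each a third as long, so the perimeter grows by a factor of 4/3 per step and diverges in the limit. (This is a sketch of standard Koch-snowflake facts, not code from the post.)

```python
# Standard Koch snowflake bookkeeping, starting from a unit-side triangle.
def sides(n):
    """Number of boundary sides after n steps: each step turns 1 side into 4."""
    return 3 * 4 ** n

def side_length(n):
    """Length of each side after n steps: each step shrinks sides by 1/3."""
    return 3.0 ** -n

def perimeter(n):
    """Total perimeter: 3 * (4/3)**n, growing without bound."""
    return sides(n) * side_length(n)
```

For example, after 7 steps (the last frame of the animation) the boundary already has 3·4⁷ = 49152 tiny sides.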
In the alternative construction we use only self-replicating triangles. We again start with a triangle:
But now, rather than modifying this triangle, we let it “reproduce,” yielding six new triangles, three at the corners of the original, and three sticking out from the sides of the original. I’ll make the six offspring a bit darker than the original so that you can see them all:
Notice that three of the children hide the corners of their parent triangle, so it looks like we now have a hexagon in the middle, but really we’ve got one big triangle behind six smaller ones. Now we do the same thing again, letting each of the six smaller triangles reproduce in the same way:
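Here is a minimal computational sketch of this reproduction rule, representing triangles as triples of complex-number vertices listed counterclockwise. The exact child placement is my reading of the description above (three corner children, three outward protrusions, all scaled by 1/3), so treat the coordinates as illustrative:

```python
import cmath
import math

def children(tri):
    """The six offspring of an equilateral triangle (A, B, C), with
    vertices listed counterclockwise; each child is scaled down by 1/3."""
    A, B, C = tri
    kids = []
    # Three children at the corners of the parent:
    for P, Q, R in ((A, B, C), (B, C, A), (C, A, B)):
        kids.append((P, P + (Q - P) / 3, P + (R - P) / 3))
    # Three children sticking outward from the middle third of each side:
    w = cmath.exp(-1j * math.pi / 3)  # rotation by -60 degrees
    for P, Q in ((A, B), (B, C), (C, A)):
        p = P + (Q - P) / 3
        q = P + 2 * (Q - P) / 3
        apex = p + (q - p) * w  # apex of the outward-pointing child
        kids.append((q, p, apex))  # ordered counterclockwise
    return kids

# Two generations from a unit equilateral triangle:
start = (0, 1, cmath.exp(1j * math.pi / 3))
gen1 = children(start)                                  # 6 children
gen2 = [kid for tri in gen1 for kid in children(tri)]   # 36 grandchildren
```

Iterating `children` over each generation reproduces the pictures in this post; at generation n there are 6ⁿ triangles of side 3⁻ⁿ.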
The 36 small triangles are the “grandchildren” of the original triangle; if each of these has six children of its own, we get:
Repeating this again:
And again:
At this stage, it starts getting hard to see the new triangles, so I’ll stop and rely on your imagination of this process continuing indefinitely. We can now see some interesting features emerging. Here are some of the main ones:
and so on: we have Koch snowflakes repeating at smaller and smaller scales.
All this self similarity shows in particular that Koch snowflakes can be assembled out of Koch snowflakes. This is nothing new; it’s related to Aidan Burns’ nice demonstration that Koch snowflakes can be used to tile the plane, but only if we use snowflakes of at least two different sizes:
Aidan Burns, Fractal tilings. Mathematical Gazette 78 No. 482 (1994) 193–196.
These tilings are already visible in the above construction using triangles, but we can make them even more evident by just playing with the color scheme.
First, if we draw the previous picture again but make all of the triangles the same color, we just get a region whose perimeter is the usual Koch snowflake:
On the other hand, if we make the original triangle white, but all of its descendants the same color of blue, we get this:
I hope you see how this forms part of a wallpaper pattern that could be extended in all directions, adding more blue snowflakes that bound larger white snowflake-shaped gaps. This gives the tiling of the plane by Koch snowflakes of two different sizes.
Taking this further, if we make the original triangle and all of its children white, but all of their further descendants the same color of blue, we get this:
The pattern seen here can be extended in a hopefully obvious way to tile the whole plane with Koch snowflakes of three different sizes.
Going further, if we make the first three generations white, but all blue after that, we get:
and so on.
The previous four pictures are generated with exactly the same code—we’re drawing exactly the same triangles, and only changing the color scheme. If we keep repeating this process, we get a tiling with arbitrarily small Koch snowflakes!
But we can also go the other way, continuing up in scale to get a tiling that includes arbitrarily large Koch snowflakes. To do this, we just need to view the above series of pictures in a different way!
The way I described it, the scale is the same in all of the previous four pictures. Making successive generations white, one generation at a time, makes it look as if we’re cutting out a snowflake from the middle of a big snowflake, leaving six similar snowflakes behind, and then iterating this:
On the other hand, we can alternatively view these pictures as a process of zooming out: each picture is built from six copies of the previous one, and we can imagine zooming out so that each picture becomes just one small piece of the next.
If you’re careful about how you do this, you get a tiling of the whole plane, with arbitrarily large Koch-snowflake shaped tiles! I say you have to be careful because it won’t cover the whole plane if, for example, each picture becomes the top middle part of the next picture. But, if you zoom out in a “spiral,” rotating by a fixed angle at each step, you’ll get a tessellation of the plane.
Someone should make an animation that shows how this works. Maybe I’ll get a student to do it.
There are some other fun variations on this theme—including a similar construction that leads to the other “fractal tiling” described by Aidan Burns—which I should explain in another post.
In case anyone wants it, here is a 1-page visual explanation of the construction described in this post: snowflake.pdf
(All images in this post, except for the first, copyright 2017 Derek Wise.)
Last time I suggested using a deck of 12 cards like this:
But instead, we used four solid colors, three cards of each. So, our “star” permutes the colors red, white, black, and silver:
You can get any permutation of these colors in our Star by exactly one symmetry taking outer vertices to outer vertices. The “exactly one” in this isomorphism is what makes the set of outer vertices a 4!-torsor rather than just a 4!-set.
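The difference between a mere 24-element set and a 4!-torsor can be sketched computationally (hypothetical code, just illustrating the definition): a torsor is a set on which the group acts freely and transitively, meaning for any two elements there is exactly one group element carrying the first to the second.

```python
from itertools import permutations

colors = ("red", "white", "black", "silver")

# The "states": all orderings of the four colors
# (what a labeling of outer vertices can look like). There are 4! = 24.
states = list(permutations(colors))

# The group S_4 acts by permuting positions:
def act(p, state):
    return tuple(state[i] for i in p)

group = list(permutations(range(4)))

# Torsor property: for every pair of states there is EXACTLY ONE
# group element carrying the first to the second.
for s in states:
    for t in states:
        assert sum(act(p, s) == t for p in group) == 1
```

The "exactly one" assertion is what fails for a general 24-element set with an S₄-action; it holds here because the action is free and transitive.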
Here’s what it looks like when you put three pieces together, from both sides: