Commit 29de83d

docs: add content about various learning flavours
Signed-off-by: Akshay Mestry <[email protected]>
1 parent 491a4af commit 29de83d

2 files changed: +55 additions, −9 deletions

docs/source/learning-out-loud/ml-explained/index.rst

Lines changed: 6 additions & 1 deletion
@@ -1,6 +1,6 @@
 .. Author: Akshay Mestry <[email protected]>
 .. Created on: Friday, April 25 2025
-.. Last updated on: Sunday, May 04 2025
+.. Last updated on: Monday, May 05 2025

 :og:title: ML Explained
 :og:description: A narrative series that walks through the foundations of
@@ -22,6 +22,11 @@ ML Explained
 :linkedin: https://linkedin.com/in/xames3
 :timestamp: May 04, 2025

+.. rst-class:: lead
+
+This isn't a crash course. There's no "ultimate guide" here, no promise to
+make you an expert over a weekend.
+
 This corner of the internet is the place where I attempt to teach Machine
 Learning the way I wish I'd first encountered it... slowly, clearly, and with
 context that sticks. If you've ever googled "machine learning" and landed on a

docs/source/learning-out-loud/ml-explained/ml101.rst

Lines changed: 49 additions & 8 deletions
@@ -1,6 +1,6 @@
 .. Author: Akshay Mestry <[email protected]>
 .. Created on: Friday, April 25 2025
-.. Last updated on: Sunday, May 04 2025
+.. Last updated on: Tuesday, May 06 2025

 :og:title: ML101
 :og:description: Understanding learning as function approximation, not magic.
@@ -19,12 +19,7 @@ ML101
 :avatar: https://avatars.githubusercontent.com/u/90549089?v=4
 :github: https://github.com/xames3
 :linkedin: https://linkedin.com/in/xames3
-:timestamp: May 03, 2025
-
-.. rst-class:: lead
-
-This isn't a crash course. There's no "ultimate guide" here, no promise to
-make you an expert over a weekend
+:timestamp: May 04, 2025

 To be fair, this doesn't really need explaining. If you're here, chances are
 you already have some sense of what Machine Learning is, or at least you feel
@@ -76,7 +71,7 @@ In the classical approach, we often write explicit instructions, handcrafted
 rules, conditional logic, or, to keep it precise, programs. The machine doesn't
 think. It simply obeys or follows those rules or instructions.

-Machine Learning flips this paradigm.
+Machine Learning flips this paradigm...

 Instead of coding or programming the logic ourselves, we supply the machine
 (computer) with examples. And I mean a lot of them. By the way, these examples
@@ -92,3 +87,49 @@ teaching a child to ride a bicycle. You don't explain Newtonian mechanics or
 angular momentum. You run alongside them, steady the seat, and let them wobble.
 The learning comes along through doing. Like I said, it's a process. The rules
 emerge from experience.
+
+.. _learning-has-three-flavours:
+
+-------------------------------------------------------------------------------
+Learning has Three Flavours
+-------------------------------------------------------------------------------
+
+This goes back to 2018, when I started delving deeper and deeper into
+Machine Learning concepts. I noticed the same three paradigms cropping up
+repeatedly: supervised, unsupervised, and reinforcement learning. They sound
+like taxonomies from textbooks, but they're really just different approaches
+to learning, not unlike the ones we use ourselves.
+
+Supervised learning is by far the most common and intuitive. Think of it as
+"learning by example with feedback." You supply the algorithm with labelled
+data, say, images of cats and dogs, each tagged accordingly, and it learns to
+map inputs to outputs. It's akin to a student learning from an answer key.
+Spam detection, fraud recognition, voice transcription... these are its bread
+and butter.
+
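[Editor's note] The "map inputs to outputs" idea can be sketched with a toy 1-nearest-neighbour classifier. This is purely illustrative and not part of the commit; the data and function names are made up:

```python
# A toy supervised learner: 1-nearest-neighbour classification.
# Labelled examples (the "answer key") are (input, label) pairs; the
# "learning" here is simply memorising them, and prediction maps a new
# input to the label of its closest remembered example.

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def predict(examples, point):
    """Return the label of the training example nearest to `point`."""
    nearest = min(examples, key=lambda ex: euclidean(ex[0], point))
    return nearest[1]

# Tiny labelled dataset: 2-D features tagged "cat" or "dog".
training = [
    ((1.0, 1.0), "cat"),
    ((1.2, 0.8), "cat"),
    ((4.0, 4.2), "dog"),
    ((3.8, 4.0), "dog"),
]

print(predict(training, (1.1, 0.9)))  # a point near the "cat" examples
print(predict(training, (4.1, 3.9)))  # a point near the "dog" examples
```

Real systems swap the memorised table for a fitted model, but the shape of the task is the same: labelled examples in, an input-to-output mapping out.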
+Unsupervised learning is a wee bit murkier. Here, the data comes unlabelled.
+The machine's task is to organise it, to find structure, clusters, or
+compressed representations. It's like giving someone a pile of puzzle pieces
+from various sets and asking them to sort them without knowing what the final
+pictures look like. We use it for market segmentation, topic modelling, and
+anomaly detection.
+
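[Editor's note] The structure-finding idea can be sketched with a minimal k-means clusterer, again illustrative only (not from the commit): no labels go in, and the two groups fall out on their own.

```python
# A toy unsupervised learner: k-means clustering with k=2.
# No labels anywhere -- the algorithm only sees points and must
# discover the grouping itself.

def mean(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def kmeans(points, centroids, steps=10):
    for _ in range(steps):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid (squared distance).
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Move each centroid to the mean of its assigned points.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, clusters

points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (4.0, 4.2), (3.8, 4.0), (4.1, 4.1)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (5.0, 5.0)])
print([len(c) for c in clusters])  # → [3, 3]: two groups discovered, no labels needed
```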
+Reinforcement learning, though, is where things get truly interesting.
+Inspired by how animals (and babies) learn, Reinforcement Learning (RL)
+involves an agent interacting with an environment, making choices, and
+receiving feedback as rewards or penalties. Over time, the agent learns a
+policy that maximises the cumulative reward. This is the technique behind
+DeepMind's AlphaGo, robotic locomotion, and even certain kinds of
+recommendation engines.
+
+In my last quarter at uni, I wrote and trained a reinforcement learning
+`Snake game`_ as part of an assignment. The game was quite simple: the agent
+had to find its way to a goal (the fruit) while avoiding eating itself or
+hitting the walls. For hours, it kept spinning like a bloody Beyblade! Turns
+out my reward function was misaligned; I'd inadvertently taught the agent
+that if it was about to collide with a wall, it should take a left or right
+turn and, in doing so, it would not die. A profitable proposition, right?
+But no... that's the thing with Reinforcement Learning: you're not merely
+teaching the agent what to do, but what to value. And that distinction
+changes everything!
+
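[Editor's note] The misalignment story can be shown in miniature. This is a hypothetical 1-D "corridor" stand-in, not the actual Snake assignment code: a greedy agent chasing a survival-only reward spins in place forever, while one also rewarded for approaching the fruit walks straight to it.

```python
# Reward misalignment in miniature: an agent on positions 1..8 of a
# corridor with walls at 0 and 9, and a fruit at position 5. The greedy
# agent simply picks whichever move has the highest immediate reward.

FRUIT, WALLS = 5, (0, 9)

def survives(pos, move):
    return WALLS[0] < pos + move < WALLS[1]

def misaligned_reward(pos, move):
    # "Don't die" is all that's rewarded -- every surviving move scores 1.
    return 1 if survives(pos, move) else -10

def aligned_reward(pos, move):
    # Surviving still matters, but closing in on the fruit pays more.
    if not survives(pos, move):
        return -10
    return 10 if pos + move == FRUIT else -abs(FRUIT - (pos + move))

def rollout(reward, start=2, steps=8):
    pos, path = start, [start]
    for _ in range(steps):
        move = max((-1, 1), key=lambda m: reward(pos, m))
        pos += move
        path.append(pos)
        if pos == FRUIT:
            break
    return path

print(rollout(misaligned_reward))  # → [2, 1, 2, 1, 2, 1, 2, 1, 2]: oscillates, never seeks the fruit
print(rollout(aligned_reward))     # → [2, 3, 4, 5]: heads straight for the fruit
```

Under the survival-only reward, both moves score identically everywhere away from the walls, so the agent has no reason to go anywhere: the corridor version of the Beyblade spin. Changing what is valued, not what is done, fixes the behaviour.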
+.. _Snake game: https://gist.github.com/xames3/563c99598c2aa1dd84e3c9494b648063
