.. Author: Akshay Mestry <[email protected]>
.. Created on: Friday, April 25 2025
.. Last updated on: Tuesday, May 06 2025
:og:title: ML101
:og:description: Understanding learning as function approximation, not magic.
:avatar: https://avatars.githubusercontent.com/u/90549089?v=4
:github: https://github.com/xames3
:linkedin: https://linkedin.com/in/xames3
:timestamp: May 04, 2025

To be fair, this doesn't really need explaining. If you're here, chances are
you already have some sense of what Machine Learning is, or at least you feel
you do.

In the classical approach, we often write explicit instructions, handcrafted
rules, conditional logic, or, to keep it precise, programs. The machine doesn't
think. It simply obeys or follows those rules or instructions.

Machine Learning flips this paradigm...

Instead of coding or programming the logic ourselves, we supply the machine
(computer) with examples. And I mean a lot of them. By the way, these examples

It's like teaching a child to ride a bicycle. You don't explain Newtonian
mechanics or
angular momentum. You run alongside them, steady the seat, and let them wobble.
The learning comes through doing. Like I said, it's a process. The rules
emerge from experience.
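
To make the contrast concrete, here's a toy sketch in plain Python: a
handcrafted rule next to a rule derived from labelled examples. Everything in
it (the messages, the words, the helper names) is invented for illustration,
not a real spam filter:

```python
# Classical approach: we handcraft the rule ourselves, and the machine obeys.
def is_spam_by_rule(message: str) -> bool:
    banned = {"winner", "free", "prize"}  # rules we wrote by hand
    return any(word in banned for word in message.lower().split())


# Learning approach: we supply labelled examples and let the machine derive
# the rule, here by noting which words appear in spam but never in ham.
def learn_spam_words(examples: list[tuple[str, bool]]) -> set[str]:
    spam_words: set[str] = set()
    ham_words: set[str] = set()
    for message, is_spam in examples:
        (spam_words if is_spam else ham_words).update(message.lower().split())
    return spam_words - ham_words  # the "rule" emerges from the data


examples = [
    ("claim your free prize now", True),
    ("you are a winner", True),
    ("meeting moved to friday", False),
    ("lunch now or later", False),
]
learned = learn_spam_words(examples)
print(is_spam_by_rule("claim your free prize now"))  # the rule we authored
print(sorted(learned))                               # the rule the data taught
```

Note how "now" never makes it into the learned set: it appears in both spam
and ham, so the examples themselves rule it out, with no one having to think
of that edge case up front.
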

.. _learning-has-three-flavours:

-------------------------------------------------------------------------------
Learning has Three Flavours
-------------------------------------------------------------------------------

Back in 2018, when I started delving deeper and deeper into Machine Learning
concepts, I noticed the same three paradigms cropping up repeatedly:
supervised, unsupervised, and reinforcement learning. They sound like
taxonomies from textbooks, but they're really just different approaches to
learning, not unlike the ones we use ourselves.

Supervised learning is by far the most common and intuitive. Think of it as
"learning by example with feedback." You supply the algorithm with labelled
data, say, images of cats and dogs, each tagged accordingly, and it learns to
map inputs to outputs. It's akin to a student learning from an answer key.
Spam detection, fraud recognition, voice transcription, these are its bread
and butter.

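
Here's that idea in miniature: a 1-nearest-neighbour classifier in plain
Python on invented cat-and-dog measurements. The features and numbers are
made up for illustration; real supervised models are far more sophisticated,
but the input-to-output mapping is the same in spirit:

```python
# Supervised learning in miniature: 1-nearest-neighbour. "Training" is just
# remembering the labelled examples; prediction copies the label of the
# closest one, the answer key doing the teaching.
def predict(train: list[tuple[tuple[float, float], str]],
            point: tuple[float, float]) -> str:
    def dist2(a: tuple[float, float], b: tuple[float, float]) -> float:
        return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

    _, label = min(train, key=lambda example: dist2(example[0], point))
    return label


# Hypothetical features: (weight in kg, ear length in cm).
train = [
    ((4.0, 7.0), "cat"),
    ((5.0, 6.5), "cat"),
    ((20.0, 12.0), "dog"),
    ((25.0, 11.0), "dog"),
]
print(predict(train, (4.5, 7.2)))    # lands near the cat cluster
print(predict(train, (22.0, 11.5)))  # lands near the dog cluster
```
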
Unsupervised learning is a wee bit murkier. Here, the data comes unlabelled.
The machine's task is to organise it, to find structure, clusters, or
compressed representations. It's like giving someone a pile of puzzle pieces
from various sets and asking them to sort them without knowing what the final
pictures look like. We use it for market segmentation, topic modelling, and
anomaly detection.

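
And here's what "finding structure without labels" looks like in its smallest
form: a bare-bones k-means on invented one-dimensional data. This is a sketch,
not a production clusterer; real implementations initialise and terminate far
more carefully:

```python
# Unsupervised learning in miniature: k-means on unlabelled 1-D points.
# No answer key exists; the algorithm alternately assigns each point to its
# nearest centre, then moves each centre to the mean of its cluster.
def kmeans_1d(points: list[float], k: int, iters: int = 10) -> list[float]:
    centres = points[:k]  # naive initialisation: the first k points
    for _ in range(iters):
        clusters: list[list[float]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centres[i]))
            clusters[nearest].append(p)
        # Move each centre to its cluster mean (or leave it if empty).
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return centres


data = [1.0, 1.2, 0.8, 9.9, 10.1, 10.0]  # two obvious clumps, no labels
print(sorted(kmeans_1d(data, k=2)))
```

The two centres settle near the two clumps without ever being told that two
groups exist, only that ``k=2`` of them should be found.
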
Reinforcement learning, though, is where things get truly interesting.
Inspired by how animals (and babies) learn, Reinforcement Learning (RL)
involves an agent interacting with an environment, making choices, and
receiving feedback, rewards or penalties. Over time, the agent learns a
policy that maximises the cumulative reward. This is the technique behind
DeepMind's AlphaGo, robotic locomotion, and even certain kinds of
recommendation engines.

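
To show the choose-act-receive-feedback loop at its simplest, here's a
hypothetical tabular Q-learning sketch on a five-state corridor. The
environment, constants, and names are all invented for illustration; real RL
problems are vastly richer:

```python
import random

random.seed(0)  # deterministic, purely for the sake of the example

# A corridor of states 0..4; the goal (reward +1) sits at state 4.
# Actions: 0 = step left, 1 = step right.
N_STATES, GOAL = 5, 4
q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q-table: q[state][action]
alpha, gamma, epsilon = 0.5, 0.9, 0.2      # learning rate, discount, exploration

for _ in range(500):  # episodes
    state = 0
    while state != GOAL:
        # Epsilon-greedy: mostly exploit current estimates, sometimes explore.
        if random.random() < epsilon:
            action = random.randint(0, 1)
        else:
            action = 0 if q[state][0] > q[state][1] else 1
        next_state = max(0, state - 1) if action == 0 else state + 1
        reward = 1.0 if next_state == GOAL else 0.0
        # Q-learning update: nudge towards reward + discounted best next value.
        q[state][action] += alpha * (
            reward + gamma * max(q[next_state]) - q[state][action]
        )
        state = next_state

# The learned policy: the greedy action in each non-goal state.
policy = ["left" if q[s][0] > q[s][1] else "right" for s in range(GOAL)]
print(policy)
```

Nobody tells the agent "go right"; the policy falls out of nothing but the
reward signal accumulating through the Q-table.
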
In my last quarter of uni, I wrote and trained a reinforcement learning
`Snake game`_ as part of an assignment. The game was quite simple: the agent
had to find its way to a goal (the fruit) while avoiding eating itself and
hitting the walls. For hours, it kept spinning like a bloody Beyblade! Turns
out my reward function was misaligned; I'd inadvertently taught the agent
that if it was about to collide with a wall, it should take a left or right
turn, and in doing so, it would not die. A profitable proposition for the
agent, right? But no... that's the thing with Reinforcement Learning: you're
not merely teaching what to do, but what to value. And that distinction
changes everything!

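
The bug, in spirit, looked something like this. This is a hypothetical
reconstruction, the names and numbers are invented; the point is what each
reward function teaches the agent to value:

```python
# Misaligned: survival alone is rewarded, so spinning in a safe circle
# forever scores just as well as ever reaching the fruit.
def misaligned_reward(died: bool, ate_fruit: bool) -> float:
    if died:
        return -10.0
    return 1.0  # paid merely for not dying this step


# Better aligned: survival earns nothing by itself; progress and the
# fruit are what pay, and dawdling costs a little.
def aligned_reward(died: bool, ate_fruit: bool, moved_closer: bool) -> float:
    if died:
        return -10.0
    if ate_fruit:
        return 10.0
    return 0.1 if moved_closer else -0.1


# Ten steps of safe, fruitless spinning under each scheme:
spin_misaligned = sum(misaligned_reward(False, False) for _ in range(10))
spin_aligned = sum(aligned_reward(False, False, False) for _ in range(10))
print(spin_misaligned, spin_aligned)  # spinning pays under the first scheme only
```

Under the first scheme, the Beyblade strategy is genuinely optimal; the agent
did exactly what I valued, not what I wanted.
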
.. _Snake game: https://gist.github.com/xames3/563c99598c2aa1dd84e3c9494b648063