Skip to content

Commit 8c73935

Browse files
committed
docs: add more information about building xsnumpy
Signed-off-by: Akshay Mestry <[email protected]>
1 parent 3544b61 commit 8c73935

File tree

3 files changed

+197
-100
lines changed

3 files changed

+197
-100
lines changed
547 KB
Loading

docs/source/assets/sigh-meme.jpg

22.2 KB
Loading

docs/source/projects/xsnumpy.rst

Lines changed: 197 additions & 100 deletions
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,10 @@
11
.. Author: Akshay Mestry <[email protected]>
22
.. Created on: Saturday, March 01 2025
3-
.. Last updated on: Sunday, March 02 2025
3+
.. Last updated on: Monday, March 03 2025
44
55
:og:title: Building xsNumpy
6-
:og:description: A lightweight, pure-python implementation of NumPy's core
7-
features
6+
:og:description: Journey of building a lightweight, pure-python implementation
7+
of NumPy's core features
88
:og:type: article
99

1010
.. _project-building-xsnumpy:
@@ -29,7 +29,7 @@ Building xsNumpy
2929

3030
So it all started in mid-November of 2024. Like usual, I was knee-deep in a
3131
Machine Learning project, working with `NumPy`_ literally every single day. I
32-
was slicing and dicing arrays, multiplying matrices, and running complex
32+
was slicing and dicing arrays, `multiplying matrices`_, and running complex
3333
mathematical operations with just a line or two of code. Everything was
3434
working great, like magic. But like all magic tricks, I had no idea whatsoever
3535
how in the world it actually worked!
@@ -69,27 +69,50 @@ The why behind xsNumPy
6969
Now that I was motivated to learn the nitty-gritty implementation details of
7070
NumPy, I was ready to build my own version of it. I mean, I didn't set out to
7171
create something to rival NumPy. Because let's be real, NumPy is a bloody
72-
powerhouse, built over and backed by decades of work by incredible minds in
72+
powerhouse, built over and backed by decades of work by incredible chaps in
7373
math and science, plus tons of optimizations in place. I possibly can't
7474
compete with that!
7575

76-
But still... I realized something quite important. Just because its dense and
76+
Nonetheless, I realized something quite important. Just because its dense and
7777
complicated, it doesn't mean I can't try to understand it. I was literally
7878
relying on it like a black box. I was using its functions and underlying APIs
79-
without truly understanding how they worked!
80-
81-
So, I challenged myself, *Could I build a dinky version of NumPy from
82-
scratch?* Again, not to replace it but to learn from it.
79+
without truly understanding how they worked! I was gutted and felt an enormous
80+
gap in my understanding. Like I mentioned, I was writing models, manipulating
81+
arrays, tensors, and performing all sorts of operations effortlessly. For
82+
instance, whenever I was using functions like :func:`numpy.linalg.inv` or
83+
:func:`numpy.einsum`, I couldn't shake the feeling that I was just
84+
**"trusting"** the library to work, without understanding why it worked in the
85+
first place.
86+
87+
This realization hit me so hard, I challenged myself, *Could I build a dinky
88+
version of NumPy from scratch?* Again, not to replace it but to learn from it.
89+
If I really want to ace at teaching these stuff to my students, I had to go
90+
deeper.
91+
92+
.. image:: ../assets/need-to-go-deeper-meme.png
93+
:alt: We need to go deeper meme from Inception
94+
95+
There were a few other reasons for writing xsNumPy besides my lack of
96+
understanding about the NumPy internals. I essentially wanted to break free
97+
from the *"Oh, Neural Networks are like black box"* mindset. While teaching
98+
Machine Learning and Neural Networks, I often compare these scientific
99+
computing libraries to a car. You can go places, sure, but what happens when
100+
something breaks? What do you do then? So to get around this situation, I
101+
thought of actually learning it by building.
102+
103+
xsNumPy isn't just for me, it's for anyone and everyone who's ever asked,
104+
*"How in the god's name this thing bloody works?"*
83105

84106
.. _building-process:
85107

86108
-------------------------------------------------------------------------------
87109
Building Process
88110
-------------------------------------------------------------------------------
89111

90-
I was ready to build my small version of NumPy, but I didn't know where to
91-
start. I began scrutinizing and poking at various NumPy functions and methods.
92-
Soon I realized that most of NumPy APIs rely on one core construct, the
112+
So with the "whys" being explained, I'll explain the "hows". I was ready to
113+
build my small version of NumPy, but I didn't know where to start. I began
114+
scrutinizing and poking at various NumPy functions and methods. Soon I
115+
realized that most of NumPy APIs rely on one core construct, the
93116
:func:`numpy.array` function, which is a cheeky little wrapper for
94117
:class:`numpy.ndarray`. That's where I decided to start, implementing my
95118
primary ``xsnumpy.ndarray`` data structure.
@@ -200,121 +223,195 @@ A few weeks in, I somehow got around implementing a barebones version of
200223
69c302ccdd594f1d8f0c51dbe16346232c39047f/xsnumpy/_core.py#L183>`_ on
201224
GitHub.
202225

203-
.. dropdown:: Code explanation
204-
:animate: fade-in
226+
.. _deconstructing-ndarray:
205227

206-
Alright, let me break this down in a way that makes sense. First, I start
207-
with checking if the shape is an :py:class:`collections.abc.Iterable`
208-
(a sequence like a :py:class:`tuple` or :py:class:`list`). If it's not,
209-
I'm wrapping it into a tuple to ensure that the shape is always
210-
represented as a tuple. The shape is then converted into a tuple of
211-
integers, ensuring the dimensions are valid.
228+
Deconstructing ndarray
229+
===============================================================================
212230

213-
.. code-block:: python
214-
:linenos:
231+
Alright, let me break this down in a way that makes sense. First, I start with
232+
checking if the shape is an :py:class:`collections.abc.Iterable` (a sequence
233+
like a :py:class:`tuple` or :py:class:`list`). If it's not, I'm wrapping it
234+
into a tuple to ensure that the shape is always represented as a tuple. The
235+
shape is then converted into a tuple of integers, ensuring the dimensions are
236+
valid.
215237

216-
if not isinstance(shape, Iterable):
217-
shape = (shape,)
218-
self._shape = tuple(int(dim) for dim in shape)
238+
.. code-block:: python
239+
:linenos:
219240
220-
Next up, the ``dtype`` (data type). If ``dtype`` is not provided, the
221-
constructor sets the default data type to ``None``. If a :py:class:`type`
222-
(such as :py:class:`int`, :py:class:`float`, etc.) is provided, it
223-
dynamically retrieves the appropriate data type from the global namespace
224-
using :func:`globals`. This allows flexibility in handling various types.
225-
Finally, the resolved data type is assigned to ``self._dtype``.
241+
if not isinstance(shape, Iterable):
242+
shape = (shape,)
243+
self._shape = tuple(int(dim) for dim in shape)
226244
227-
.. code-block:: python
228-
:linenos:
245+
Next up, the ``dtype`` (data type). If ``dtype`` is not provided, the
246+
constructor sets the default data type to ``None``. If a :py:class:`type`
247+
(such as :py:class:`int`, :py:class:`float`, etc.) is provided, it dynamically
248+
retrieves the appropriate data type from the global namespace using
249+
:func:`globals`. This allows flexibility in handling various types. Finally,
250+
the resolved data type is assigned to ``self._dtype``.
229251

230-
if dtype is None:
231-
dtype = float64
232-
elif isinstance(dtype, type):
233-
dtype = globals()[
234-
f"{dtype.__name__}{'32' if dtype != builtins.bool else ''}"
235-
]
236-
else:
237-
dtype = globals()[dtype]
238-
self._dtype = dtype
252+
.. code-block:: python
253+
:linenos:
239254
240-
The size of each element in the array is calculated based on the provided
241-
data type. I wrote a handy function, ``_convert_dtype`` to fetch the
242-
appropriate size of the data type (in a ``short`` format), and the last
243-
value is used to determine the item size.
255+
if dtype is None:
256+
dtype = float64
257+
elif isinstance(dtype, type):
258+
dtype = globals()[
259+
f"{dtype.__name__}{'32' if dtype != builtins.bool else ''}"
260+
]
261+
else:
262+
dtype = globals()[dtype]
263+
self._dtype = dtype
244264
245-
This is super important for calculating memory layout and strides!
265+
The size of each element in the array is calculated based on the provided data
266+
type. I wrote a handy function, ``_convert_dtype`` to fetch the appropriate
267+
size of the data type (in a ``short`` format), and the last value is used to
268+
determine the item size.
246269

247-
.. code-block:: python
248-
:linenos:
270+
This is super important for calculating memory layout and strides!
249271

250-
self._itemsize = int(_convert_dtype(dtype, "short")[-1])
272+
.. code-block:: python
273+
:linenos:
251274
252-
Now, if ``buffer`` is ``None``, the array is initialized without an
253-
external memory buffer. In this case:
275+
self._itemsize = int(_convert_dtype(dtype, "short")[-1])
254276
255-
- The offset must be zero
256-
- Strides must also be ``None``
277+
Now, if ``buffer`` is ``None``, the array is initialized without an external
278+
memory buffer. In this case:
257279

258-
The constructor calculates the strides. The strides is nothing but steps
259-
between consecutive elements in memory.
280+
- The offset must be zero
281+
- Strides must also be ``None``
260282

261-
.. code-block:: python
262-
:linenos:
283+
The constructor calculates the strides. The strides is nothing but steps
284+
between consecutive elements in memory.
263285

264-
if buffer is None:
265-
self._base = None
266-
if self._offset != 0:
267-
raise ValueError("Offset must be 0 when buffer is None")
268-
if strides is not None:
269-
raise ValueError("Buffer is None; strides must be None")
270-
self._strides = calc_strides(self._shape, self.itemsize)
286+
.. code-block:: python
287+
:linenos:
271288
272-
If a ``buffer`` is provided, the constructor handles it by checking if
273-
it's another ``ndarray``. If the ``ndarray`` has a base buffer, it uses
274-
that. The buffer is assigned to ``self._base``, and strides are either
275-
provided or calculated.
289+
if buffer is None:
290+
self._base = None
291+
if self._offset != 0:
292+
raise ValueError("Offset must be 0 when buffer is None")
293+
if strides is not None:
294+
raise ValueError("Buffer is None; strides must be None")
295+
self._strides = calc_strides(self._shape, self.itemsize)
276296
277-
The constructor validates the offset (it must be non-negative) and the
278-
strides (it must be a tuple of integers matching the shape's dimensions).
297+
If a ``buffer`` is provided, the constructor handles it by checking if it's
298+
another ``ndarray``. If the ``ndarray`` has a base buffer, it uses that. The
299+
buffer is assigned to ``self._base``, and strides are either provided or
300+
calculated.
279301

280-
.. code-block:: python
281-
:linenos:
302+
The constructor validates the offset (it must be non-negative) and the strides
303+
(it must be a tuple of integers matching the shape's dimensions).
282304

283-
else:
284-
if isinstance(buffer, ndarray) and buffer.base is not None:
285-
buffer = buffer.base
286-
self._base = buffer
287-
if isinstance(buffer, ndarray):
288-
buffer = buffer.data
289-
if self._offset < 0:
290-
raise ValueError("Offset must be non-negative")
291-
if strides is None:
292-
strides = calc_strides(self._shape, self.itemsize)
293-
elif not (
294-
isinstance(strides, tuple)
295-
and all(isinstance(stride, int) for stride in strides)
296-
and len(strides) == len(self._shape)
297-
):
298-
raise ValueError("Invalid strides provided")
299-
self._strides = tuple(strides)
305+
.. code-block:: python
306+
:linenos:
307+
:emphasize-lines: 7-10
308+
309+
else:
310+
if isinstance(buffer, ndarray) and buffer.base is not None:
311+
buffer = buffer.base
312+
self._base = buffer
313+
if isinstance(buffer, ndarray):
314+
buffer = buffer.data
315+
if self._offset < 0:
316+
raise ValueError("Offset must be non-negative")
317+
if strides is None:
318+
strides = calc_strides(self._shape, self.itemsize)
319+
elif not (
320+
isinstance(strides, tuple)
321+
and all(isinstance(stride, int) for stride in strides)
322+
and len(strides) == len(self._shape)
323+
):
324+
raise ValueError("Invalid strides provided")
325+
self._strides = tuple(strides)
326+
327+
Finally, the constructor calculates the total buffer size based on the strides,
328+
shape, and item size. The ``Buffer`` is a type derived from the data type
329+
(dtype) and its size. Depending on whether the buffer is provided or not, it
330+
initializes ``self._data`` using different methods:
331+
332+
- If no buffer is provided, a new buffer is created
333+
- If the buffer is a :py:class:`ctypes.Array`, the address of the buffer
334+
is used to initialize the data. Basically, we use its address like a map
335+
- If it's any other type of buffer, the buffer is used directly
336+
337+
Phew, that was a lot, but now you can see how it's all orchestrated!
338+
339+
.. _the-easy-peasy-stuff:
340+
341+
The "easy peasy" stuff
342+
===============================================================================
343+
344+
Like I said before, I wanted to build a tiny version of NumPy. It was my clear
345+
and straightforward goal. Start small, build arrays, and then add the fancy
346+
operations like matrix multiplication, `broadcasting`_, and so on. What took me
347+
by surprise was the fact that how challenging things were, which I thought to
348+
be **"easy peasy"**. Things like writing a :py:func:`repr` or overriding the
349+
built-in methods.
350+
351+
I remember talking to myself one morning, *"let's start with something bloody
352+
easy, perhaps just display the array."* That couldn't be hard, right? All I
353+
need to do is print the content of my array in a readable format how NumPy
354+
does. Little did I know I was shooting myself in the foot. At its core, a
355+
``repr`` is just an object's internal data representation. I started with
356+
something like...
357+
358+
.. code-block:: python
359+
:linenos:
360+
361+
def __repr__(self) -> str:
362+
return f"array({self._data}, dtype={self.dtype.__str__()})"
363+
364+
Sure, it worked for a scalar. But what about vectors? With some adjustments, I
365+
got it working for 1D arrays. Being chuffed, I tried a 2D array. Suddenly, it
366+
printed everything as a flat list. I realized that I had not accounted my
367+
implementation for rows and columns. No problem, I updated the code slightly
368+
to make it work and after some initial struggles, I got it working... barely!
369+
370+
Then the 3D arrays... and it broke again.
371+
372+
That's when it struck me, this wasn't merely about formatting strings. I needed
373+
a generic solution that would work with **any** number of dimensions. A few
374+
days later, I found myself deep into recursive logic and multi-dimensional
375+
indexing, all for what I believed was a **"easy peasy"** print function. Now
376+
the problem wasn't just getting this thing to work but rather make sure it
377+
worked consistently across all the possible array shapes. What I thought would
378+
take an hour or two took days.
379+
380+
But finally, I got it working!
381+
382+
.. note::
383+
384+
You can read about the `complete <https://github.com/xames3/xsnumpy/blob/
385+
69c302ccdd594f1d8f0c51dbe16346232c39047f/xsnumpy/_core.py#L275C1-L327C27>`_
386+
implementation of the ``xsnumpy.ndarray.__repr__`` on GitHub.
387+
388+
Just when I thought the hard part was over, I moved on to array indexing which
389+
is perhaps one of the superpowers of NumPy. At first, I assumed this would be
390+
easy too and it worked, partly.
391+
392+
.. code-block:: python
393+
:linenos:
300394
301-
Finally, the constructor calculates the total buffer size based on the
302-
strides, shape, and item size. The ``Buffer`` is a type derived from the
303-
data type (dtype) and its size. Depending on whether the buffer is provided
304-
or not, it initializes ``self._data`` using different methods:
395+
def __getitem__(self, index) -> t.Any:
396+
row, column = index
397+
flat = row * self.shape[1] + column
398+
return self.data[flat]
305399
306-
- If no buffer is provided, a new buffer is created
307-
- If the buffer is a :py:class:`ctypes.Array`, the address of the buffer
308-
is used to initialize the data. Basically, we use its address like a map
309-
- If it's any other type of buffer, the buffer is used directly
400+
When I tried a slice like ``array[:, 1]``, it broke. When I tried with higher
401+
dimensional arrays, it collapsed! With each new test case, it was pretty
402+
evident that there were significant flows in my logic.
310403

311-
Phew, that was a lot, but now you can see how it's all orchestrated!
404+
.. image:: ../assets/sigh-meme.jpg
405+
:alt: Deep sigh meme
312406

313407
.. _NumPy: https://numpy.org/
408+
.. _multiplying matrices: https://www.mathsisfun.com/algebra/
409+
matrix-multiplying.html
314410
.. _xsNumPy: https://github.com/xames3/xsnumpy
315411
.. _memory allocation and management: https://numpy.org/doc/stable/reference/
316412
c-api/data_memory.html
317413
.. _shape: https://numpy.org/doc/stable/reference/generated/numpy.ndarray.
318414
shape.html
319415
.. _strides: https://numpy.org/doc/stable/reference/generated/numpy.ndarray.
320416
strides.html
417+
.. _broadcasting: https://numpy.org/doc/stable/user/basics.broadcasting.html

0 commit comments

Comments
 (0)