1
1
.. Author: Akshay Mestry <[email protected] >
2
2
.. Created on: Saturday, March 01 2025
3
- .. Last updated on: Sunday , March 02 2025
3
+ .. Last updated on: Monday , March 03 2025
4
4
5
5
:og: title: Building xsNumpy
6
- :og: description: A lightweight, pure-python implementation of NumPy's core
7
- features
6
+ :og: description: Journey of building a lightweight, pure-python implementation
7
+ of NumPy's core features
8
8
:og: type: article
9
9
10
10
.. _project-building-xsnumpy :
@@ -29,7 +29,7 @@ Building xsNumpy
29
29
30
30
So it all started in mid-November of 2024. Like usual, I was knee-deep in a
31
31
Machine Learning project, working with `NumPy `_ literally every single day. I
32
- was slicing and dicing arrays, multiplying matrices, and running complex
32
+ was slicing and dicing arrays, ` multiplying matrices `_ , and running complex
33
33
mathematical operations with just a line or two of code. Everything was
34
34
working great, like magic. But like all magic tricks, I had no idea whatsoever
35
35
how in the world it actually worked!
@@ -69,27 +69,50 @@ The why behind xsNumPy
69
69
Now that I was motivated to learn the nitty-gritty implementation details of
70
70
NumPy, I was ready to build my own version of it. I mean, I didn't set out to
71
71
create something to rival NumPy. Because let's be real, NumPy is a bloody
72
- powerhouse, built over and backed by decades of work by incredible minds in
72
+ powerhouse, built over and backed by decades of work by incredible chaps in
73
73
math and science, plus tons of optimizations in place. I possibly can't
74
74
compete with that!
75
75
76
- But still... I realized something quite important. Just because its dense and
76
+ Nonetheless, I realized something quite important. Just because its dense and
77
77
complicated, it doesn't mean I can't try to understand it. I was literally
78
78
relying on it like a black box. I was using its functions and underlying APIs
79
- without truly understanding how they worked!
80
-
81
- So, I challenged myself, *Could I build a dinky version of NumPy from
82
- scratch? * Again, not to replace it but to learn from it.
79
+ without truly understanding how they worked! I was gutted and felt an enormous
80
+ gap in my understanding. Like I mentioned, I was writing models, manipulating
81
+ arrays, tensors, and performing all sorts of operations effortlessly. For
82
+ instance, whenever I was using functions like :func: `numpy.linalg.inv ` or
83
+ :func: `numpy.einsum `, I couldn't shake the feeling that I was just
84
+ **"trusting" ** the library to work, without understanding why it worked in the
85
+ first place.
86
+
87
+ This realization hit me so hard, I challenged myself, *Could I build a dinky
88
+ version of NumPy from scratch? * Again, not to replace it but to learn from it.
89
+ If I really want to ace at teaching these stuff to my students, I had to go
90
+ deeper.
91
+
92
+ .. image :: ../assets/need-to-go-deeper-meme.png
93
+ :alt: We need to go deeper meme from Inception
94
+
95
+ There were a few other reasons for writing xsNumPy besides my lack of
96
+ understanding about the NumPy internals. I essentially wanted to break free
97
+ from the *"Oh, Neural Networks are like black box" * mindset. While teaching
98
+ Machine Learning and Neural Networks, I often compare these scientific
99
+ computing libraries to a car. You can go places, sure, but what happens when
100
+ something breaks? What do you do then? So to get around this situation, I
101
+ thought of actually learning it by building.
102
+
103
+ xsNumPy isn't just for me, it's for anyone and everyone who's ever asked,
104
+ *"How in the god's name this thing bloody works?" *
83
105
84
106
.. _building-process :
85
107
86
108
-------------------------------------------------------------------------------
87
109
Building Process
88
110
-------------------------------------------------------------------------------
89
111
90
- I was ready to build my small version of NumPy, but I didn't know where to
91
- start. I began scrutinizing and poking at various NumPy functions and methods.
92
- Soon I realized that most of NumPy APIs rely on one core construct, the
112
+ So with the "whys" being explained, I'll explain the "hows". I was ready to
113
+ build my small version of NumPy, but I didn't know where to start. I began
114
+ scrutinizing and poking at various NumPy functions and methods. Soon I
115
+ realized that most of NumPy APIs rely on one core construct, the
93
116
:func: `numpy.array ` function, which is a cheeky little wrapper for
94
117
:class: `numpy.ndarray `. That's where I decided to start, implementing my
95
118
primary ``xsnumpy.ndarray `` data structure.
@@ -200,121 +223,195 @@ A few weeks in, I somehow got around implementing a barebones version of
200
223
69c302ccdd594f1d8f0c51dbe16346232c39047f/xsnumpy/_core.py#L183> `_ on
201
224
GitHub.
202
225
203
- .. dropdown :: Code explanation
204
- :animate: fade-in
226
+ .. _deconstructing-ndarray :
205
227
206
- Alright, let me break this down in a way that makes sense. First, I start
207
- with checking if the shape is an :py:class: `collections.abc.Iterable `
208
- (a sequence like a :py:class: `tuple ` or :py:class: `list `). If it's not,
209
- I'm wrapping it into a tuple to ensure that the shape is always
210
- represented as a tuple. The shape is then converted into a tuple of
211
- integers, ensuring the dimensions are valid.
228
+ Deconstructing ndarray
229
+ ===============================================================================
212
230
213
- .. code-block :: python
214
- :linenos:
231
+ Alright, let me break this down in a way that makes sense. First, I start with
232
+ checking if the shape is an :py:class: `collections.abc.Iterable ` (a sequence
233
+ like a :py:class: `tuple ` or :py:class: `list `). If it's not, I'm wrapping it
234
+ into a tuple to ensure that the shape is always represented as a tuple. The
235
+ shape is then converted into a tuple of integers, ensuring the dimensions are
236
+ valid.
215
237
216
- if not isinstance (shape, Iterable):
217
- shape = (shape,)
218
- self ._shape = tuple (int (dim) for dim in shape)
238
+ .. code-block :: python
239
+ :linenos:
219
240
220
- Next up, the ``dtype `` (data type). If ``dtype `` is not provided, the
221
- constructor sets the default data type to ``None ``. If a :py:class: `type `
222
- (such as :py:class: `int `, :py:class: `float `, etc.) is provided, it
223
- dynamically retrieves the appropriate data type from the global namespace
224
- using :func: `globals `. This allows flexibility in handling various types.
225
- Finally, the resolved data type is assigned to ``self._dtype ``.
241
+ if not isinstance (shape, Iterable):
242
+ shape = (shape,)
243
+ self ._shape = tuple (int (dim) for dim in shape)
226
244
227
- .. code-block :: python
228
- :linenos:
245
+ Next up, the ``dtype `` (data type). If ``dtype `` is not provided, the
246
+ constructor sets the default data type to ``None ``. If a :py:class: `type `
247
+ (such as :py:class: `int `, :py:class: `float `, etc.) is provided, it dynamically
248
+ retrieves the appropriate data type from the global namespace using
249
+ :func: `globals `. This allows flexibility in handling various types. Finally,
250
+ the resolved data type is assigned to ``self._dtype ``.
229
251
230
- if dtype is None :
231
- dtype = float64
232
- elif isinstance (dtype, type ):
233
- dtype = globals ()[
234
- f " { dtype.__name__ }{ ' 32' if dtype != builtins.bool else ' ' } "
235
- ]
236
- else :
237
- dtype = globals ()[dtype]
238
- self ._dtype = dtype
252
+ .. code-block :: python
253
+ :linenos:
239
254
240
- The size of each element in the array is calculated based on the provided
241
- data type. I wrote a handy function, ``_convert_dtype `` to fetch the
242
- appropriate size of the data type (in a ``short `` format), and the last
243
- value is used to determine the item size.
255
+ if dtype is None :
256
+ dtype = float64
257
+ elif isinstance (dtype, type ):
258
+ dtype = globals ()[
259
+ f " { dtype.__name__ }{ ' 32' if dtype != builtins.bool else ' ' } "
260
+ ]
261
+ else :
262
+ dtype = globals ()[dtype]
263
+ self ._dtype = dtype
244
264
245
- This is super important for calculating memory layout and strides!
265
+ The size of each element in the array is calculated based on the provided data
266
+ type. I wrote a handy function, ``_convert_dtype `` to fetch the appropriate
267
+ size of the data type (in a ``short `` format), and the last value is used to
268
+ determine the item size.
246
269
247
- .. code-block :: python
248
- :linenos:
270
+ This is super important for calculating memory layout and strides!
249
271
250
- self ._itemsize = int (_convert_dtype(dtype, " short" )[- 1 ])
272
+ .. code-block :: python
273
+ :linenos:
251
274
252
- Now, if ``buffer `` is ``None ``, the array is initialized without an
253
- external memory buffer. In this case:
275
+ self ._itemsize = int (_convert_dtype(dtype, " short" )[- 1 ])
254
276
255
- - The offset must be zero
256
- - Strides must also be `` None ``
277
+ Now, if `` buffer `` is `` None ``, the array is initialized without an external
278
+ memory buffer. In this case:
257
279
258
- The constructor calculates the strides. The strides is nothing but steps
259
- between consecutive elements in memory.
280
+ - The offset must be zero
281
+ - Strides must also be `` None ``
260
282
261
- .. code-block :: python
262
- :linenos:
283
+ The constructor calculates the strides. The strides is nothing but steps
284
+ between consecutive elements in memory.
263
285
264
- if buffer is None :
265
- self ._base = None
266
- if self ._offset != 0 :
267
- raise ValueError (" Offset must be 0 when buffer is None" )
268
- if strides is not None :
269
- raise ValueError (" Buffer is None; strides must be None" )
270
- self ._strides = calc_strides(self ._shape, self .itemsize)
286
+ .. code-block :: python
287
+ :linenos:
271
288
272
- If a ``buffer `` is provided, the constructor handles it by checking if
273
- it's another ``ndarray ``. If the ``ndarray `` has a base buffer, it uses
274
- that. The buffer is assigned to ``self._base ``, and strides are either
275
- provided or calculated.
289
+ if buffer is None :
290
+ self ._base = None
291
+ if self ._offset != 0 :
292
+ raise ValueError (" Offset must be 0 when buffer is None" )
293
+ if strides is not None :
294
+ raise ValueError (" Buffer is None; strides must be None" )
295
+ self ._strides = calc_strides(self ._shape, self .itemsize)
276
296
277
- The constructor validates the offset (it must be non-negative) and the
278
- strides (it must be a tuple of integers matching the shape's dimensions).
297
+ If a ``buffer `` is provided, the constructor handles it by checking if it's
298
+ another ``ndarray ``. If the ``ndarray `` has a base buffer, it uses that. The
299
+ buffer is assigned to ``self._base ``, and strides are either provided or
300
+ calculated.
279
301
280
- .. code-block :: python
281
- :linenos:
302
+ The constructor validates the offset (it must be non-negative) and the strides
303
+ (it must be a tuple of integers matching the shape's dimensions).
282
304
283
- else :
284
- if isinstance (buffer, ndarray) and buffer.base is not None :
285
- buffer = buffer.base
286
- self ._base = buffer
287
- if isinstance (buffer, ndarray):
288
- buffer = buffer.data
289
- if self ._offset < 0 :
290
- raise ValueError (" Offset must be non-negative" )
291
- if strides is None :
292
- strides = calc_strides(self ._shape, self .itemsize)
293
- elif not (
294
- isinstance (strides, tuple )
295
- and all (isinstance (stride, int ) for stride in strides)
296
- and len (strides) == len (self ._shape)
297
- ):
298
- raise ValueError (" Invalid strides provided" )
299
- self ._strides = tuple (strides)
305
+ .. code-block :: python
306
+ :linenos:
307
+ :emphasize- lines: 7 - 10
308
+
309
+ else :
310
+ if isinstance (buffer, ndarray) and buffer.base is not None :
311
+ buffer = buffer.base
312
+ self ._base = buffer
313
+ if isinstance (buffer, ndarray):
314
+ buffer = buffer.data
315
+ if self ._offset < 0 :
316
+ raise ValueError (" Offset must be non-negative" )
317
+ if strides is None :
318
+ strides = calc_strides(self ._shape, self .itemsize)
319
+ elif not (
320
+ isinstance (strides, tuple )
321
+ and all (isinstance (stride, int ) for stride in strides)
322
+ and len (strides) == len (self ._shape)
323
+ ):
324
+ raise ValueError (" Invalid strides provided" )
325
+ self ._strides = tuple (strides)
326
+
327
+ Finally, the constructor calculates the total buffer size based on the strides,
328
+ shape, and item size. The ``Buffer `` is a type derived from the data type
329
+ (dtype) and its size. Depending on whether the buffer is provided or not, it
330
+ initializes ``self._data `` using different methods:
331
+
332
+ - If no buffer is provided, a new buffer is created
333
+ - If the buffer is a :py:class: `ctypes.Array `, the address of the buffer
334
+ is used to initialize the data. Basically, we use its address like a map
335
+ - If it's any other type of buffer, the buffer is used directly
336
+
337
+ Phew, that was a lot, but now you can see how it's all orchestrated!
338
+
339
+ .. _the-easy-peasy-stuff :
340
+
341
+ The "easy peasy" stuff
342
+ ===============================================================================
343
+
344
+ Like I said before, I wanted to build a tiny version of NumPy. It was my clear
345
+ and straightforward goal. Start small, build arrays, and then add the fancy
346
+ operations like matrix multiplication, `broadcasting `_, and so on. What took me
347
+ by surprise was the fact that how challenging things were, which I thought to
348
+ be **"easy peasy" **. Things like writing a :py:func: `repr ` or overriding the
349
+ built-in methods.
350
+
351
+ I remember talking to myself one morning, *"let's start with something bloody
352
+ easy, perhaps just display the array." * That couldn't be hard, right? All I
353
+ need to do is print the content of my array in a readable format how NumPy
354
+ does. Little did I know I was shooting myself in the foot. At its core, a
355
+ ``repr `` is just an object's internal data representation. I started with
356
+ something like...
357
+
358
+ .. code-block :: python
359
+ :linenos:
360
+
361
+ def __repr__ (self ) -> str :
362
+ return f " array( { self ._data} , dtype= { self .dtype.__str__ ()} ) "
363
+
364
+ Sure, it worked for a scalar. But what about vectors? With some adjustments, I
365
+ got it working for 1D arrays. Being chuffed, I tried a 2D array. Suddenly, it
366
+ printed everything as a flat list. I realized that I had not accounted my
367
+ implementation for rows and columns. No problem, I updated the code slightly
368
+ to make it work and after some initial struggles, I got it working... barely!
369
+
370
+ Then the 3D arrays... and it broke again.
371
+
372
+ That's when it struck me, this wasn't merely about formatting strings. I needed
373
+ a generic solution that would work with **any ** number of dimensions. A few
374
+ days later, I found myself deep into recursive logic and multi-dimensional
375
+ indexing, all for what I believed was a **"easy peasy" ** print function. Now
376
+ the problem wasn't just getting this thing to work but rather make sure it
377
+ worked consistently across all the possible array shapes. What I thought would
378
+ take an hour or two took days.
379
+
380
+ But finally, I got it working!
381
+
382
+ .. note ::
383
+
384
+ You can read about the `complete <https://github.com/xames3/xsnumpy/blob/
385
+ 69c302ccdd594f1d8f0c51dbe16346232c39047f/xsnumpy/_core.py#L275C1-L327C27> `_
386
+ implementation of the ``xsnumpy.ndarray.__repr__ `` on GitHub.
387
+
388
+ Just when I thought the hard part was over, I moved on to array indexing which
389
+ is perhaps one of the superpowers of NumPy. At first, I assumed this would be
390
+ easy too and it worked, partly.
391
+
392
+ .. code-block :: python
393
+ :linenos:
300
394
301
- Finally, the constructor calculates the total buffer size based on the
302
- strides, shape, and item size. The `` Buffer `` is a type derived from the
303
- data type (dtype) and its size. Depending on whether the buffer is provided
304
- or not, it initializes `` self._data `` using different methods:
395
+ def __getitem__ ( self , index ) -> t.Any:
396
+ row, column = index
397
+ flat = row * self .shape[ 1 ] + column
398
+ return self .data[flat]
305
399
306
- - If no buffer is provided, a new buffer is created
307
- - If the buffer is a :py:class: `ctypes.Array `, the address of the buffer
308
- is used to initialize the data. Basically, we use its address like a map
309
- - If it's any other type of buffer, the buffer is used directly
400
+ When I tried a slice like ``array[:, 1] ``, it broke. When I tried with higher
401
+ dimensional arrays, it collapsed! With each new test case, it was pretty
402
+ evident that there were significant flows in my logic.
310
403
311
- Phew, that was a lot, but now you can see how it's all orchestrated!
404
+ .. image :: ../assets/sigh-meme.jpg
405
+ :alt: Deep sigh meme
312
406
313
407
.. _NumPy : https://numpy.org/
408
+ .. _multiplying matrices : https://www.mathsisfun.com/algebra/
409
+ matrix-multiplying.html
314
410
.. _xsNumPy : https://github.com/xames3/xsnumpy
315
411
.. _memory allocation and management : https://numpy.org/doc/stable/reference/
316
412
c-api/data_memory.html
317
413
.. _shape : https://numpy.org/doc/stable/reference/generated/numpy.ndarray.
318
414
shape.html
319
415
.. _strides : https://numpy.org/doc/stable/reference/generated/numpy.ndarray.
320
416
strides.html
417
+ .. _broadcasting : https://numpy.org/doc/stable/user/basics.broadcasting.html
0 commit comments