Question about unused instance variable in CausalAttention class #637
Closed
cihankarabulut started this conversation in General
Replies: 1 comment 1 reply
That's funny, because I remember seeing this one when reading the book but never opened a PR for consistency with the notebook. I can't speak for Sebastian, but I'm pretty sure it's your points 2 and 3. You'll notice that for MHA, …
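The reply is cut off, but the point about MHA can be illustrated: in a multi-head variant, storing d_out is no longer redundant, because forward() needs it to merge the per-head outputs back into one dimension. The sketch below is illustrative only (causal masking and dropout omitted for brevity), not the book's exact code:

```python
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    """Illustrative sketch: here self.d_out IS read again in forward()."""

    def __init__(self, d_in, d_out, num_heads, qkv_bias=False):
        super().__init__()
        assert d_out % num_heads == 0, "d_out must be divisible by num_heads"
        self.d_out = d_out
        self.num_heads = num_heads
        self.head_dim = d_out // num_heads
        self.W_query = nn.Linear(d_in, d_out, bias=qkv_bias)
        self.W_key = nn.Linear(d_in, d_out, bias=qkv_bias)
        self.W_value = nn.Linear(d_in, d_out, bias=qkv_bias)

    def forward(self, x):
        b, num_tokens, _ = x.shape
        # Split projections into heads: (b, num_heads, num_tokens, head_dim)
        q = self.W_query(x).view(b, num_tokens, self.num_heads, self.head_dim).transpose(1, 2)
        k = self.W_key(x).view(b, num_tokens, self.num_heads, self.head_dim).transpose(1, 2)
        v = self.W_value(x).view(b, num_tokens, self.num_heads, self.head_dim).transpose(1, 2)
        attn = torch.softmax(q @ k.transpose(2, 3) / self.head_dim ** 0.5, dim=-1)
        ctx = (attn @ v).transpose(1, 2)
        # self.d_out is needed here to flatten the heads back together
        return ctx.contiguous().view(b, num_tokens, self.d_out)
```

So one plausible reading is that the assignment in CausalAttention exists to keep the constructor signature and body parallel with the multi-head class introduced later.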
In the CausalAttention class implementation on p. 81, I noticed that self.d_out = d_out is assigned in the __init__ method, but this instance variable doesn't seem to be used anywhere else in the class. The output dimension is already passed directly to the linear layers (nn.Linear(d_in, d_out, bias=qkv_bias)), so I'm curious:
1. Is there a specific reason for storing d_out as an instance variable?
2. Is this intended for future extensions of the class?
3. Or is this simply a remnant from a previous version that could be safely removed?
I'm trying to understand best practices for transformer implementations and when it makes sense to store parameters as instance variables versus just using them directly.
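For context, here is a minimal sketch of the kind of class being discussed (an assumption, not the book's verbatim code). Note that self.d_out is assigned in __init__ but never read again; forward() gets its output dimension implicitly from W_value:

```python
import torch
import torch.nn as nn

class CausalAttention(nn.Module):
    """Minimal sketch of the class under discussion (not the book's exact code)."""

    def __init__(self, d_in, d_out, context_length, dropout, qkv_bias=False):
        super().__init__()
        self.d_out = d_out  # stored, but never read anywhere below
        self.W_query = nn.Linear(d_in, d_out, bias=qkv_bias)
        self.W_key = nn.Linear(d_in, d_out, bias=qkv_bias)
        self.W_value = nn.Linear(d_in, d_out, bias=qkv_bias)
        self.dropout = nn.Dropout(dropout)
        self.register_buffer(
            "mask", torch.triu(torch.ones(context_length, context_length), diagonal=1)
        )

    def forward(self, x):
        b, num_tokens, d_in = x.shape
        queries = self.W_query(x)
        keys = self.W_key(x)
        values = self.W_value(x)
        attn_scores = queries @ keys.transpose(1, 2)
        # Mask out future positions so each token attends only to the past
        attn_scores.masked_fill_(self.mask.bool()[:num_tokens, :num_tokens], -torch.inf)
        attn_weights = torch.softmax(attn_scores / keys.shape[-1] ** 0.5, dim=-1)
        attn_weights = self.dropout(attn_weights)
        # Output dimension comes from W_value's d_out, not from self.d_out
        return attn_weights @ values
```

As a general guideline, storing a parameter as an instance variable pays off when a method other than __init__ needs it later (for example, a reshape in forward()); if every consumer is a layer constructed in __init__, passing the value directly is enough.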