5 Comments
Vaibhav

The architecture section reminds me of autoencoders, where the bottleneck is there so that the model learns general rules instead of only pattern matching.

Vaibhav

I was also wondering about the legal reasoning setup. Do you use the same model for reward and generation, or is there a learned function in between that transforms vectors from the source model to the target reward model?

Devansh

Different models, but there’s no principled reason for that choice. We need to test a lot more to confirm which is best.

Samuel

Is there any open-source code for TRM?

Whispering Pirate

Not too long ago, I wrote a blog post on making Substack’s built-in text-to-speech reader work. And I’m secretly feeling like an asshole because I still can’t get it to work. When I try to figure it out, it seems as though it’s not fully offered to everyone or something like that. I almost wonder if I’m getting guardrailed for cursing or something like that, but I have a feeling that you don’t. Have you tried to make it work? There are a lot of things I really want to read that are very cumbersome without text-to-speech.
