this post was submitted on 01 Nov 2023
        
      1 points (100.0% liked)
      Machine Learning
    1 readers
  
      
      1 users here now
      Community Rules:
- Be nice. No offensive behavior, insults or attacks: we encourage a diverse community in which members feel safe and have a voice.
- Make your post clear and comprehensive: posts that lack insight or effort will be removed. (ex: questions which are easily googled)
- Beginner or career related questions go elsewhere. This community is focused in discussion of research and new projects that advance the state-of-the-art.
- Limit self-promotion. Comments and posts should be first and foremost about topics of interest to ML observers and practitioners. Limited self-promotion is tolerated, but the sub is not here as merely a source for free advertisement. Such posts will be removed at the discretion of the mods.
        founded 2 years ago
      
      MODERATORS
      
    you are viewing a single comment's thread
view the rest of the comments
    view the rest of the comments
The ideas in the paper seem to be using a very similar concept from a NMT paper called Deep Encoders, Shallow Encoders. Surprised to see no citation in the related work.