There are some important advantages
to taking the more difficult road in the beginning:

at a later stage when comparing the original model to the Hugging Face implementation, you can verify automatically
  for each component individually that the corresponding component of the 🤗 Transformers implementation matches instead
  of relying on visual comparison via print statements
it can give you some rope to decompose the big problem of porting a model into smaller problems of just porting
  individual components and thus structure your work better
separating the model into logical meaningful components will help you to get a better overview of the model's design
  and thus to better understand the model
at a later stage those component-by-component tests help you to ensure that no regression occurs as you continue
  changing your code

Lysandre's integration checks for ELECTRA
gives a nice example of how this can be done.