vfx Highlight/emphasize the spoken words in captions

Hi everyone, I'm trying to highlight each spoken word (similar to karaoke) of a caption (SRT file), and it's been extremely tedious work. I'm wondering if anyone has any tips on a workflow or tool that would automate or speed up the process. This is what I'm trying to do:
basically, have the word that's being spoken highlight or change color.

I'm using FCPX but I'm open to anything that will help speed up the process of creating this text in the video.

Would appreciate any advice or tips!
 