Comment by vrighter
As for the first: Everything points towards the problem NOT being able to be overcome with current architectures.
For the 2nd: We are completely 100% sure this cannot be solved. This is not a training issue. This is the case for any statistical machine. No amount of training can ever solve this.