Michael Nielsen, for Quanta Magazine:
Using this statistical model, the computer could take a new French sentence — one it had never seen before — and figure out the most likely corresponding English sentence. And that would be the program’s translation.
When I first heard about this approach, it sounded ludicrous. This statistical model throws away nearly everything we know about language. There’s no concept of subjects, predicates or objects, none of what we usually think of as the structure of language. And the models don’t try to figure out anything about the meaning (whatever that is) of the sentence either.
Despite all this, the IBM team found this approach worked much better than systems based on sophisticated linguistic concepts. Indeed, their system was so successful that the best modern systems for language translation — systems like Google Translate — are based on similar ideas.
The difference between knowing how to model something and understanding why it works is worth pondering. Is knowing how less valuable than understanding why? In most applications, probably not. Either way, you can complete the task at hand, and a statistical model may even extrapolate beyond the bounds of the original question; it may even be an aid to understanding.
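The core idea Nielsen describes — pick the English sentence that is statistically most likely given the French one — can be sketched as a tiny toy model. This is not IBM's actual system; all the candidates and probabilities below are made up for illustration, following the classic noisy-channel formulation where the best translation maximizes P(english) × P(french | english):

```python
# Toy sketch of the statistical-translation idea (noisy-channel style).
# All sentences and probabilities here are hypothetical, for illustration only.

# A stand-in "language model": how plausible is each English sentence on its own?
p_english = {
    "the cat": 0.50,
    "cat the": 0.01,
    "the dog": 0.49,
}

# A stand-in "translation model": P(french sentence | english sentence).
p_french_given_english = {
    ("le chat", "the cat"): 0.90,
    ("le chat", "cat the"): 0.90,
    ("le chat", "the dog"): 0.05,
}

def translate(french):
    """Return the English candidate maximizing P(e) * P(f | e)."""
    return max(
        p_english,
        key=lambda e: p_english[e] * p_french_given_english.get((french, e), 0.0),
    )

print(translate("le chat"))  # picks "the cat"
```

Note that no grammar is consulted anywhere: the word-salad candidate "cat the" loses not because the model knows it is ungrammatical, but simply because sentences like it are rare in the data.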