In the same vein as The End of Theory: The Data Deluge Makes the Scientific Method Obsolete, the article The Unreasonable Effectiveness of Data mainly highlights how large amounts of data can help in computational linguistics.

From the article:

Eugene Wigner’s article “The Unreasonable Effectiveness of Mathematics in the Natural Sciences” examines why so much of physics can be neatly explained with simple mathematical formulas such as F = ma or E = mc². Meanwhile, sciences that involve human beings rather than elementary particles have proven more resistant to elegant mathematics. Economists suffer from physics envy over their inability to neatly model human behavior. An informal, incomplete grammar of the English language runs over 1,700 pages. Perhaps when it comes to natural language processing and related fields, we’re doomed to complex theories that will never have the elegance of physics equations. But if that’s so, we should stop acting as if our goal is to author extremely elegant theories, and instead embrace complexity and make use of the best ally we have: the unreasonable effectiveness of data.


The first lesson of Web-scale learning is to use available large-scale data rather than hoping for annotated data that isn’t available.

Thanks for the tip, Daniel!

