Text Processing in Python Review by James Stroud
Instant Classic
TPIP is an instant classic in that all you need to do is add a solid understanding of Python and you can instantly appreciate its classic nature. Text processing is more fundamental to programming than programming itself. For instance, most of the programs a programmer writes will be written as text. So gaining proficiency in dealing with text is key not only to programming but probably to every facet of one's experience with a computer.
In TPIP, David Mertz provides the reader with a set of tools for manipulating text in Python. The book is organized by type of text processing activity. For example, filters are presented from a functional perspective, searching text is presented in terms of regular expressions, and so on. Relevant modules are presented with each type of processing task in a reference format.
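To give a flavor of the two styles mentioned above, here is a minimal sketch of a functional-style filter combined with a regular-expression search. It is not taken from the book; the helper names `line_filter` and `contains_pattern` and the sample data are hypothetical, chosen only for illustration, and the sketch uses nothing beyond the standard `re` module and the built-in `filter`.

```python
import re


def line_filter(predicate, lines):
    """Lazily yield only the lines for which predicate(line) is true."""
    # Hypothetical helper: a thin wrapper over the built-in filter().
    return filter(predicate, lines)


def contains_pattern(pattern):
    """Build a predicate that tests a line against a regular expression."""
    # Hypothetical helper: compile once, reuse for every line.
    compiled = re.compile(pattern)
    return lambda line: compiled.search(line) is not None


# Illustrative sample data, not from the book.
sample = ["alpha 1", "beta", "gamma 22", ""]

# Keep only the lines that contain at least one digit.
with_digits = list(line_filter(contains_pattern(r"\d+"), sample))
print(with_digits)
```

The predicate is built separately from the filter, so the same `line_filter` can be reused with any test, which is roughly the appeal of treating filters functionally.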
The greatest value of the book is that it tackles a fundamental and important programming topic that most books treat sparingly or dismiss outright. TPIP might be in a league with Friedl's Mastering Regular Expressions in that it takes outwardly uninspiring topics, makes them interesting, and teaches them with pedagogical finesse. Somehow, Mertz inspires the reader to feel intelligent while presenting the topics in an accessible way. Even mxTextTools becomes comprehensible in TPIP.
TPIP, though, is not without its shortcomings, especially in organization. The reviews of Python and of functional programming are put in appendices, and the reference material is interleaved with the text, giving the reader a somewhat disjointed feeling as he makes his way through the book. It would have been better to build the book up from a solid review of the Python language, proceed to a thorough treatment of functional programming in Python, and then present the meat of the book, text processing, as a well-organized whole with sensible segues between the chapters. The reference material should be moved to the appendices for easy access.
Even if these organizational problems are never fixed, one would be well served to study this fine volume.