Programming and Algorithms: Week 20


File Analysis

What are we doing this week?


This week we are going to look at how the FILE ANALYSIS part of GOOGLE SEARCH can be written in PYTHON. We'll start with CHARACTER COUNT, WORD COUNT, and LINE COUNT, then look at how to measure WORD FREQUENCY, and finish with a full FILE ANALYSIS program.
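As a quick illustration of the three counts, here is a minimal sketch, assuming Python 3 and that the whole file fits in memory as one string. The function name `file_statistics` and the sample text are made up for this example and are not taken from the course sample code.

```python
def file_statistics(text):
    """Return (characters, words, lines) for a string of text.

    Hypothetical helper for illustration only.
    """
    characters = len(text)           # every character, including spaces and newlines
    words = len(text.split())        # split() with no argument splits on any whitespace
    lines = len(text.splitlines())   # one entry per line of text
    return characters, words, lines


# A tiny in-memory sample, so the sketch runs without any file on disk.
sample = "Use the Force, Luke.\nI have a bad feeling about this.\n"
chars, words, lines = file_statistics(sample)
print("Characters:", chars)
print("Words:", words)
print("Lines:", lines)
```

To analyse a real file, read it first with `with open(filename) as f: text = f.read()` and pass `text` to the function. Note that this simple word count treats "Force," (with the comma) as one word, which is why some pre-processing is usually done before counting word frequency.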
 

PowerPoint: File Analysis


Total running time of videos is 30 minutes.

Matt Cutts: How Search Works


TED 
Eli Pariser: Beware online "filter bubbles"



Links
Think Python: Word Frequency Analysis
http://greenteapress.com/thinkpython/html/thinkpython014.html

Learn Python the Hard Way: Dictionaries, Oh Lovely Dictionaries
http://learnpythonthehardway.org/book/ex39.html

Python Docs: Brief Tour of the Standard Library
https://docs.python.org/2/tutorial/stdlib.html


Sample Code:
String Pre-Processing * File Statistics * Word Frequency * Full File Analysis
Sample Files:
StarWarsScript.txt * CompleteShakespeare.txt
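
The string pre-processing and word frequency steps listed above might look something like the sketch below. It assumes Python 3 and uses a plain dictionary to hold the counts. The names `preprocess`, `word_frequency`, and `top_words` are invented for this example, and the course sample code may well do things differently.

```python
import string


def preprocess(text):
    """Lower-case the text, strip ASCII punctuation, and split into words."""
    lowered = text.lower()
    # Build a translation table that deletes every ASCII punctuation character.
    table = str.maketrans("", "", string.punctuation)
    return lowered.translate(table).split()


def word_frequency(text):
    """Return a dictionary mapping each word to the number of times it occurs."""
    frequency = {}
    for word in preprocess(text):
        # get() returns 0 the first time a word is seen.
        frequency[word] = frequency.get(word, 0) + 1
    return frequency


def top_words(frequency, n):
    """Return the n most frequent (word, count) pairs, ties in alphabetical order."""
    ordered = sorted(frequency.items(), key=lambda pair: (-pair[1], pair[0]))
    return ordered[:n]


sample = "The cat sat. The dog sat! The END."
counts = word_frequency(sample)
for word, count in top_words(counts, 3):
    print(word, count)
```

The same functions can be pointed at one of the sample files by reading it into a string first. Removing punctuation before splitting is a design choice: it means "sat." and "sat!" are counted as the same word, at the cost of also turning "don't" into "dont".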

Lab #20
Lab #20 is about adding options to the FULL FILE ANALYSIS program.



If you have any suggestions, corrections, or comments, please feel free to e-mail me at:
Damian.Gordon(a)dit.ie